FOSS Academic

(Almost) De-Googled Research

Big news on the search front! The Brave Browser is now offering a search engine. According to their announcement, Brave Search promises an “independent option for search which gives them unmatched privacy” and will use an independent index, rather than relying on Google or Microsoft. I’ve already started playing with Brave Search and am curious to see how it develops.

However, any time someone offers a search engine and their corporate parent isn’t named Alphabet, there’s a pretty predictable reaction: “Good on them for trying, but you just can’t beat Google.” I’ve also heard academics say the same thing: “I can’t find anything without Google.”

Search is, of course, a very common approach to conducting research, and it’s especially important to my work on Internet cultures. One key way to study the Internet is by searching it. And it seems as though that means all my work ought to flow through Google.

I hadn’t really thought much about this situation until just recently. Along with Sean Lawson, I co-authored a book, Social Engineering: How Crowdmasters, Phreaks, Hackers, and Trolls Created a New Form of Manipulative Communication. It will come out from MIT Press in a few months.

The book is the result of four years of research on hacker social engineering practices as well as early 20th century propaganda and public relations (a field that also referred to its activity as “social engineering”). After finishing this latest book, I realized I’m a living testament to the fact that yes, you can de-Google your search and still conduct research. Because that’s what I did.

Mostly.

Here’s how I conducted mostly de-Googled searches.

Startpage

The bulk of my research – especially my research on the history of hacker social engineers – relied on Startpage searches, particularly the advanced search features. I used those to find obscure documents and websites.

Ok, I can already hear many of you: “Startpage uses Google’s index!” Indeed, it does. One way to think of it is as using Google through a proxy. But that’s not entirely the case. First of all, Startpage uses its own algorithms to sort Google’s index, so it differs in that regard.

But more importantly, Startpage does not customize search results based on my previous searches. I find that to be extremely important, because as a scholar focusing on Internet cultures, I don’t want my previous search biases affecting what I find. I want to find unusual stuff that surprises me, not the most optimized result for me. I’m not looking for a donut shop near me that has the type of donuts I prefer; I’m looking for obscure text files, PDFs, and websites – things I haven’t found before.

Archive.org

The Internet Archive is… I can’t think of good superlatives. Awesome? Invaluable? The coolest thing ever? In any case, all Internet Studies work should engage with the IA at some point.

The Internet Archive has a massive database – roughly 90 petabytes. Despite its size, the IA also offers its own search engine, mostly returning text-based results. Those results usually get me in the vicinity of what I’m looking for – often it’s a matter of searching within PDFs, which the IA also allows for.

Worldcat.org

Worldcat is a metasearch for holdings in libraries around the world. I often use it to get very good metadata on books for Zotero. Google Books and Amazon metadata are usually awful, missing fields like publisher, publisher city, and sometimes the author’s name. Worldcat, however, has been consistently good for metadata.

But beyond that, Worldcat’s metasearch of libraries reveals a wealth of historical insight. Knowing the rhythm of book publications – for example, Walter Lippmann’s Public Opinion appeared in 1922, and Edward Bernays’s Crystallizing Public Opinion appeared in 1923 – can tell me a lot about the “dialog” happening among authors. Plus, Worldcat also returns results for scholarly articles and other media.

Unfortunately: Google Scholar

I said “mostly de-Googled,” and by that I did not mean the fact that Startpage uses Google’s index.

No, by that I mean that there’s no substitute for Google Scholar, and over the course of writing Social Engineering, I relied on Google Scholar very heavily.

I am always watching for alternatives to Google Scholar, but they consistently come up short. I often use a vanity search for my own work to test a given database – mainly because I know what I’ve published over the years. Google Scholar finds my publications right away. Microsoft’s academic search seems to miss quite a bit, though it’s better than it used to be. Scinapse – which claims to be “better than Google Scholar” – also fails to find many of my papers. I’m not sure what Refseek is trying to do.
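For anyone who wants to make the vanity-search test a bit less impressionistic, here is a minimal sketch of how it could be scored. It assumes you paste in two lists by hand: the titles you know you have published, and the titles a given search engine returned. The function names (`normalize`, `vanity_recall`) and the placeholder titles are my own inventions for illustration; none of the services above provide this, and I did not use a script like this for the book.

```python
import re


def normalize(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace,
    so that small formatting differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def vanity_recall(known_titles, returned_titles):
    """Return (recall, missing) for one search engine.

    recall  -- fraction of known publications the engine found
    missing -- normalized titles the engine did not return
    """
    known = {normalize(t) for t in known_titles}
    returned = {normalize(t) for t in returned_titles}
    found = known & returned
    missing = sorted(known - found)
    recall = len(found) / len(known) if known else 0.0
    return recall, missing


# Placeholder data (hypothetical, not real publications or real results).
known = ["Paper A: A Study", "Paper B", "Paper C", "Paper D"]
returned = ["paper a - a study", "PAPER C", "Unrelated Result"]

recall, missing = vanity_recall(known, returned)
print(f"recall: {recall:.2f}")
print("missing:", missing)
```

Exact matching on normalized titles is a deliberately crude choice: it will miss retitled preprints or translated titles, so treat a low score as a prompt to look closer rather than as a verdict.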

The one I had high hopes for is the Bielefeld Academic Search Engine (BASE), mainly because it is run by a university library (Bielefeld University Library). But the results there – for my admittedly vain search for my own work – are also underwhelming. I believe I can “claim” my profile and add papers manually, but who has time for that? After years of building profiles in Academia.edu, ResearchGate, or SSRN, I’m done with all that. I could spend my life building profiles of myself instead of, you know, doing research. Instead, I just put my papers on my own website.

The upside of Google Scholar is that, for better or worse, it finds my papers automatically. And because of that, I’m pretty confident that I can find other people’s papers in Google Scholar – which is, of course, what I actually use it for.

Keep Searching

So, there you have it. There is search beyond Google. It can be done. At the very least, thanks to Startpage, you don’t have to use Google proper – and this is not even to mention DuckDuckGo or, of course, Brave Search.

Yes, Google Scholar still seems indispensable, but I can even imagine doing a project without using Google Scholar, relying on a mix of Worldcat.org and maybe Microsoft Academic.

Have I missed anything? Have any of you done research without having it flow through Google? Let me know in the comments!
