Friday, January 19, 2007

No more searching for you: Google drops the SOAP

In case you were asleep at the helm like I was, Google has pulled the plug on their SOAP-based web search API. On Dec 5, 2006, Google stopped giving users new API keys. They claim the API service will continue to run, but without a method for obtaining new keys, it essentially becomes worthless (API keys can't be shared since they are tied to a specific individual's Google account, and I can't let you run my application unless you supply it with your own key).

Google has decided their AJAX search API is the wave of the future. But why the "odd move"? I think Jason Lefkowitz summed it up best:
Today, though, Google isn’t about search. It’s about displaying ads. And in that context, an open API makes no sense — the developer can reformat the search results, and even show them (gasp) without ads!

Hence the “AJAX API”, which forces you to take the ads along with the search results. You can’t really do much with it, but it does create a new place for Google to show ads on — your blog/site/Web app.

I don’t have a problem with Google focusing on their AJAX search API... I’m sure it’s very useful in many contexts, but I do have a problem with them abandoning their SOAP search. Not only is Google putting the smack down on the SEO business (one of their intended victims, in my opinion), they are hurting us web researchers who depend on automated methods of querying Google.

I can point to a huge stack of academic papers that, without an effective method of automatically querying Google, are un-reproducible (Google, do you really want everyone to go back to page-scraping?). And it’s really hurting my research: Warrick will not work for new users without API keys. I’ve spent lots of time writing wrappers around the SOAP API code, and now I’ll have to redo most of it when I find an effective method of accessing Google’s cache. Until then, you can kiss your lost website goodbye if Google is the only one who has cached it.

It sometimes appears that I have a love/hate relationship with Google. Yesterday I was singing its praises, today not so much. In honor of the SOAP API, I’ve put together a brief timeline for us all to reflect upon:
  • Pre 2002 - Page-scraping is the norm, and there is great frustration.
  • 2002 – Google launches the first search engine API, and there is great rejoicing.
  • 2002-2005 – Researchers use the API for all sorts of interesting experiments, SEOs do their best to reverse engineer PageRank, new services are built, books are written, and, despite many technical difficulties along the way, there is much satisfaction.
  • 2006 – Google tightens the lid on extra queries per key, and there is much displeasure.
  • Late 2006 – Google refuses to give new API keys, and there is much sadness and anger.
  • Late 2007 (My prediction) - Google’s SOAP API breaks, no one fixes it, and there is no surprise. RIP

Update on July 27, 2007:

Google has just released an academic API for researchers: University Research Program for Google Search. Now that's more like it.

Update on Sept 30, 2009:

Google has finally killed its SOAP Search API.