Thursday, May 25, 2006

Google limiting researchers to 1000 queries

I recently read a poster from ISSI 2005 entitled “Google Web APIs - an Instrument for Webometric Analyses?” The poster was written by Philipp Mayr and Fabio Tosques to introduce the Google API to webometric researchers. They ran several experiments to demonstrate that the API was useful. One experiment queried both Google’s web interface and the API with the term “webometrics” over 240 days. Their results showed a huge difference between the two, which made me wonder how an API can be considered useful if its responses differ so much from what the rest of the world is seeing.

In their conclusion, Mayr and Tosques reported a limit of 10,000 requests per day. Google only allows 1000, so I emailed Mayr to ask why they reported 10,000. He replied that Google would give researchers more queries, but when I emailed Google asking for a higher limit, they replied with this:
Due to overwhelming demand, we are no longer accepting requests for additional queries or for commercial use permission.
So researchers are in a quandary. They can use Google’s public web interface to perform searches, which in my experience frequently leads to being blacklisted for hours at a time, even when fewer than 1000 queries are made per day. Or they can use the API, which is buggy (502 errors are common), is limited to 1000 queries per day, and returns very different results from those obtained through the web interface.
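
For anyone stuck with the API, one way to cope with the 1000-query limit and the 502 errors is to count queries locally and retry failed ones with a pause. Here is a minimal sketch of that idea. Everything in it is hypothetical: the `BudgetedSearcher` class, the `fetch` callable (which you would write yourself to call whatever search API you use and return a status code and body), and the assumption that every attempt, including a failed one, counts against the daily quota.

```python
import time
from datetime import date


class QuotaExhausted(Exception):
    """Raised when the local daily query allowance has been used up."""


class BudgetedSearcher:
    """Wraps a user-supplied fetch(query) -> (status, body) callable."""

    def __init__(self, fetch, daily_limit=1000, max_retries=3,
                 backoff_seconds=1.0, today=date.today, sleep=time.sleep):
        self.fetch = fetch
        self.daily_limit = daily_limit
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.today = today
        self.sleep = sleep
        self._day = today()
        self._used = 0

    def remaining(self):
        """Queries left today; the counter resets when the date changes."""
        current = self.today()
        if current != self._day:
            self._day = current
            self._used = 0
        return self.daily_limit - self._used

    def search(self, query):
        """Run one query, retrying on HTTP 502 with exponential backoff."""
        for attempt in range(self.max_retries + 1):
            if self.remaining() <= 0:
                raise QuotaExhausted("daily query limit reached")
            # Assumption: a failed attempt still counts against the quota.
            self._used += 1
            status, body = self.fetch(query)
            if status != 502:
                return status, body
            if attempt < self.max_retries:
                self.sleep(self.backoff_seconds * 2 ** attempt)
        # Every attempt returned 502; hand the last response to the caller.
        return status, body
```

Counting failed attempts is the cautious choice here. If the service does not charge for them, the local counter just runs out a little early, which is better than running over.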

Inspired by this dilemma, I have decided to put the APIs from Google, MSN, and Yahoo to the test. I am running a series of experiments comparing what the APIs return with what the web interfaces return. I’m hoping the results will give researchers a better idea of how to use search engines in their experiments and what to expect from the APIs. Now if I can just find a free server that I can use to make requests for a few months…
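
To make "comparing what the APIs return to what the web interfaces return" a bit more concrete, here is a rough sketch of two simple measures one could compute for a single query. These are my assumptions for illustration, not a description of the actual experiments: the function names are hypothetical, and the inputs are taken to be the ranked result URLs and the estimated hit counts from each source.

```python
def top_k_overlap(api_urls, web_urls, k=10):
    """Jaccard overlap of the top-k URLs from two ranked result lists.

    Returns 1.0 when both top-k sets are identical (or both empty)
    and 0.0 when they share no URLs.
    """
    api_top = set(api_urls[:k])
    web_top = set(web_urls[:k])
    if not api_top and not web_top:
        return 1.0
    return len(api_top & web_top) / len(api_top | web_top)


def relative_count_difference(api_count, web_count):
    """Difference between two hit-count estimates, scaled to 0..1.

    0.0 means the estimates agree; values near 1.0 mean one source
    reports far more results than the other.
    """
    largest = max(api_count, web_count)
    if largest == 0:
        return 0.0
    return abs(api_count - web_count) / largest
```

For example, `top_k_overlap(["a", "b", "c"], ["b", "c", "d"], k=3)` gives 0.5, since two of the four distinct URLs appear in both lists. Note that this measure ignores ranking order, so two lists with the same URLs in a different order still score 1.0.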