I’m not sure when it first started, but over the last few months the Google API has been bombing out whenever a query returns 2^31 (2,147,483,648) or more results. It has bombed out almost every day in June when my script searches for “database” and “list”, each of which returns several billion results. Apparently Google’s SOAP interface uses a 32-bit integer for the total number of results, which tops out at 2,147,483,647; it needs to be a 64-bit long integer.
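To make the arithmetic concrete, here is a minimal Python sketch of why a count in the billions cannot live in a signed 32-bit field. It is only an illustration of the limit, not Google’s code: the helper names `fits_in_int32` and `wrap_to_int32` are made up for this example, and I’m assuming the field is a signed two’s-complement integer.

```python
import struct

# Largest and smallest values a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1   # 2,147,483,647
INT32_MIN = -(2**31)    # -2,147,483,648


def fits_in_int32(n: int) -> bool:
    """Return True if n can be stored in a signed 32-bit integer."""
    return INT32_MIN <= n <= INT32_MAX


def wrap_to_int32(n: int) -> int:
    """Simulate two's-complement wraparound into a signed 32-bit field.

    Hypothetical helper: keeps the low 32 bits of n and reinterprets
    them as a signed value, which is what silent overflow looks like.
    """
    low_bits = n & 0xFFFFFFFF
    return struct.unpack("<i", struct.pack("<I", low_bits))[0]


if __name__ == "__main__":
    for count in (INT32_MAX, 2**31, 20_700_000_000):
        print(count, fits_in_int32(count), wrap_to_int32(count))
```

A strict parser would typically refuse the out-of-range value instead of wrapping it, which would look a lot like the API “bombing out” rather than returning a nonsense negative count. A 64-bit integer holds values up to 2^63 - 1, so it has plenty of headroom for result counts in the tens of billions.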
Michael Freidgeim made note of the problem on his blog a few weeks ago, and others have noticed it going back to April 2006. Who knows when Google will fix it. If it's not one thing, it's something else... ;)
While searching to find out when Google started reporting totals this large, I came across a posting by Danny Sullivan showing how he used a “trick” to reveal how many pages Google has indexed. Danny suggested issuing a query that says, “give me all the pages that don’t have the word asdkjlkjasd.” I just tried -asdkjlkjasd on Google, and it gave me back 20.7 billion results. MSN gives around 5.2 billion results, but Yahoo and Ask won’t accept the query. Interesting…