Monday, July 28, 2008

How cool is Cuil?

Today a new competitor enters the world of Web search: Cuil (pronounced "cool"). What's notable about this newcomer is that its president and founder, Anna Patterson, is an ex-Googler, as are several of Cuil's VPs.

In 2004, Patterson developed a search engine called Recall that was used to search the Internet Archive's massive corpus (apparently the search engine didn't last long... the Archive is only searchable by URL today). Shortly thereafter, she was hired by Google, only to leave in 2006 to start up her own Google competitor. How much of Google's intellectual property went with her? That's a tough one to answer.

So why does Cuil think it can compete with Google?
  1. Cuil supposedly indexes three times as much content as Google.
  2. Cuil presents results in a magazine-like, multi-column format with more snippet text than Google, including embedded images.
  3. Cuil has an "Explore by Categories" widget that attempts to categorize pages.
Considering Google doesn't index every page they know about, it's hard to argue that the size difference is really significant. What will make or break their search engine is the quality of results and the interface. Some have already done some testing and named Google the winner. I did a little test querying for my name, and the results were not quite up to par.

Here's how I scored it:
  • Result number 1 (top-left) links to my old website at Old Dominion University instead of my current site at Harding (next result to the right). -1 point
  • The photo of me in result 1 comes from a different website entirely, so I'm impressed they made the connection. +1 point
  • The photo in the Harding result is not me (wish I was that tan). -1 point
  • The result at the bottom-left is from DBLP which indexes academic papers. It's certainly relevant. +1 point
  • The photo in the DBLP result is not me (I'm a lot more buff); it's the actor Frank McCown, better known as Rory Calhoun. -1 point
  • The next result to the right points to a celebrity entry for Frank McCown on AOL's Television website. This is a website that does its own web mining and erroneously marked my blog and Harding website as belonging to Frank McCown the actor. (BTW, this is a really tough problem to solve.) -1 point
  • The first categorization labels in the upper-right under Digital Libraries were somewhat descriptive of my research interests or projects I've been involved with: Digital preservation, Open Archives Initiative, and LOCKSS. +1 point
  • But when I click on National Science Digital Library, I get 0 results. -1 point
So my overall score: -2 points. Using the same query at Google shows 8 of the top 10 results are about me (result #1 points to my blog, #3 to my Harding website), but Google is less ambitious and doesn't mix in photos or categories. Still, I'd have to give Google a higher score than -2.
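
For anyone who wants to check my arithmetic, the tally above can be written as a quick Python sketch. The labels are just my own shorthand for each bullet (they are not anything Cuil reports), and the +1/-1 values are copied straight from the list.

```python
# Each entry is (shorthand label for a bullet above, points awarded).
# The labels are hypothetical names chosen for illustration only.
scores = [
    ("Result 1 links to my old ODU site", -1),
    ("Photo in result 1 really is me", +1),
    ("Photo in Harding result is not me", -1),
    ("DBLP result is relevant", +1),
    ("Photo in DBLP result is the actor", -1),
    ("AOL Television entry confuses me with the actor", -1),
    ("Category labels match my research", +1),
    ("National Science Digital Library category returns 0 results", -1),
]

# Sum the points across all eight observations.
total = sum(points for _, points in scores)
print(f"Overall score: {total:+d} points")  # prints "Overall score: -2 points"
```

Three pluses against five minuses gives the -2 above.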

Does anyone else have any thoughts on Cuil?

Update on 7/30/2008:

Someone at Java Rants has created a parody of Cuil using Yahoo's new BOSS Search API: Yuil.