Saturday, November 11, 2006

CIKM days 2 and 3


The keynote speaker this morning was Gary Flake of Microsoft Labs, who titled his talk How I Learned to Stop Worrying and Love the Imminent Internet Singularity. (I guess he really liked the title of my blog.) Some of the topics included power laws, long tails, network effects, and the Innovator’s Dilemma. Essentially, the talk was about how human knowledge, the ability to analyze the online world, and the ability to create digital artifacts are all converging to create an Internet singularity that is going to take over the world (or at least seriously change the way we do things). During the Q&A session after the talk, Gary briefly spoke about the “parasitic relationship” between publishers and academia and how we should throw the bums out and publish only in online journals. Stevan Harnad would have been proud.

I sat in on several presentations that mostly focused on database enhancements, which is not really my thing. One of the few papers that I did find interesting, though, was Xiaoguang Qi’s paper entitled Knowing a Web Page by the Company that it Keeps. Xiaoguang presented an interesting way to learn more about what a web page is about by examining the parents, siblings, and children of the page. The authors also used the Yahoo web search API to discover parents.
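To make the parent/sibling/child idea a bit more concrete, here is a minimal sketch of how those neighbors could be pulled out of a link graph. This is my own illustration, not code from the paper: the `neighbors` function, the `links` list of (source, target) pairs, and the page names are all made up, and I am assuming "siblings" means other pages that share at least one parent with the page in question.

```python
from collections import defaultdict

def neighbors(links, page):
    """Return (parents, children, siblings) of `page`.

    `links` is an iterable of (source, target) pairs, a hypothetical
    stand-in for a crawled link graph.
    """
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for src, dst in links:
        out_links[src].add(dst)
        in_links[dst].add(src)

    parents = set(in_links[page])    # pages that link to `page`
    children = set(out_links[page])  # pages that `page` links to

    # Assumption: a sibling is any other page linked from one of the parents.
    siblings = set()
    for parent in parents:
        siblings |= out_links[parent]
    siblings.discard(page)
    return parents, children, siblings

# Toy link graph with invented page names.
links = [("hub", "a"), ("hub", "b"), ("a", "c"), ("other", "a"), ("other", "d")]
parents, children, siblings = neighbors(links, "a")
```

In a real setting the in-links are the hard part, since a page does not know who links to it, which is presumably why a search API was needed to discover parents.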

The banquet Wednesday evening was OK. I didn’t know anyone, but I had a decent chat with a fellow from Jordan who worked in the database area. He told me he was somewhat disappointed with the conference and suggested VLDB was much more interesting. I guess I was a little disappointed too, since the focus of most of the research was only peripherally related to my own interests, but I probably should have expected that coming into the conference.


The keynote speaker this morning was Joseph Kielman from the Department of Homeland Security. Basically, HS would like to model the behavior of the entire world and have the computer say, “Hey, I think Joe Mohammad is about to go jihad on us.” Kielman gave some indication that the bureaucracy at HS made getting things done very difficult.

The one presentation I really liked today was written by a group from Yahoo and Stanford and entitled Estimating Corpus Size via Queries. They showed a method that could be used to answer the question: How many pages in Chinese from US-registered servers are indexed by Yahoo? Their method requires several assumptions to hold. For example, each query must produce fewer than 1,000 results, since search engines do not give access to more than 1,000 results.
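For the curious, here is a rough sketch of one way a query-based size estimate can work. I am not claiming this is exactly the estimator from the paper; it is a simple weighted-sampling illustration under my own assumptions. The function `estimate_covered`, the callbacks `search` and `terms_of`, and the toy corpus are all hypothetical. The idea: each document returned by a sampled query gets a weight of 1 divided by the number of pool queries it matches, so a document matched by many queries is not overcounted. It also shows why the result cap matters, because the sketch needs the full result list for every sampled query.

```python
import random

def estimate_covered(pool, search, terms_of, sample_size, seed=0):
    """Estimate how many documents match at least one query in `pool`.

    `search(query)` must return every matching document (hence the need
    for queries with few enough results), and `terms_of(doc)` returns the
    set of terms in a document. Both are hypothetical callbacks.
    """
    rng = random.Random(seed)
    pool_set = set(pool)
    sampled = rng.sample(sorted(pool_set), sample_size)
    total = 0.0
    for query in sampled:
        for doc in search(query):
            # Number of pool queries this document would be returned for.
            matches = len(pool_set & terms_of(doc))
            total += 1.0 / matches
    # Scale the average per-query weight up to the whole pool.
    return len(pool_set) * total / sample_size

# Toy corpus: single-word queries match documents containing that word.
corpus = {1: {"cat", "dog"}, 2: {"dog"}, 3: {"fish"}, 4: {"bird"}}
pool = ["cat", "dog", "fish"]
search = lambda q: [d for d, terms in corpus.items() if q in terms]
terms_of = lambda d: corpus[d]

estimate = estimate_covered(pool, search, terms_of, sample_size=len(pool))
```

When every query in the pool is sampled, the estimate equals the exact number of documents the pool covers (document 4 is never counted because no pool query matches it). With a smaller sample it is only an estimate.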

I skipped out on the last session of the conference so I could catch a matinee showing of The Prestige. It’s a movie about two magicians who are obsessed with discovering each other’s secrets (excellent movie, by the way). It got me thinking… if CIKM would introduce a couple of magic tricks between presentations, maybe get the session chair to make boring speakers suddenly disappear in a flash of smoke, this might turn into one of the “can’t miss” conferences of the year. As it currently stands, I have to admit that librarians know how to have more fun (see JCDL).