I presented my paper “Lazy Preservation: Reconstructing Websites by Crawling the Crawlers” today at the Workshop on Web Information and Data Management (WIDM). I was also the session chair for the Web Organization session. Joan was able to fight through her cough and present her mod_oai paper as well.
This was a competitive workshop (only 11 of 51 submitted papers were accepted), but I was a little disappointed with the small number of attendees (only a dozen or so). The presentations, though, were quite good. My favorite was “Coarse-grained Classification of Web Sites by Their Structural Properties,” in which the authors looked at website characteristics like the number of slashes in a URL and the average URL length to determine whether a website was a blog, a personal site, a commercial site, etc. Who would have thought you could guess which category a website fell into just by looking at URL properties?
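To get a feel for the kind of features they mean, here is a rough sketch in Python. It is my own illustration, not the authors' method: the function name `url_features`, the choice to count slashes in the path only, and the sample URLs are all assumptions, and there is no actual classifier here, just the feature extraction.

```python
from statistics import mean
from urllib.parse import urlsplit


def url_features(urls):
    """Compute two simple structural features for the URLs of one site.

    Assumption: slashes are counted in the path portion only, so the
    "//" after the scheme is not included.
    """
    paths = [urlsplit(u).path for u in urls]
    return {
        "avg_slashes": mean(p.count("/") for p in paths),
        "avg_url_length": mean(len(u) for u in urls),
    }


# Hypothetical sample URLs, for illustration only.
sample = ["http://example.com/a/b", "http://example.com/c"]
features = url_features(sample)
print(features)
```

Feature values like these, computed per site, could then be fed to any off-the-shelf classifier along with known labels (blog, personal, commercial, and so on).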
I also really enjoyed the keynote speaker, Sihem Amer-Yahia from Yahoo Research, who talked about a project at Yahoo where they are trying to personalize web search based on the community interests of the searcher.
Next year WIDM is going to be in Portugal along with CIKM. Hmm…