Tuesday, June 19, 2007

JCDL 2007 - day 1

I'm in Vancouver at JCDL 2007. I flew in yesterday with Martin, and the rest of the ODU posse flew in today. Among the three of us Ph.D. students, we'll be giving six presentations at JCDL and IWAW this week.

This is my first trip to Vancouver, and I must say it's one of the most beautiful cities I've been to. Just wish I had Becky and the Bean here with me to enjoy the beauty. The photo on the right shows what the view looks like from my room at the Westin Bayshore hotel.

This morning I joined Kris Carpenter and Brad Tofel of the Internet Archive in a tutorial about researching the Internet Archive. I shared some of my Warrick research and a recent study analyzing the IA's overlap with search engine caches. (You can see my slides here.)

I learned a lot about IA from Kris and Brad. Here are just a few items of note:
  • The IA has currently archived around 96 billion resources (HTML, PDF, images, etc.), or about 1.9 petabytes of data, 51% of which is unique.

  • Although the Archive's holdings are currently 6-12 months out of date, beginning July 1 the Archive will receive updates on the first of each month and will be only 2 months out of date.

  • The Archive is currently working on adding full-text search to its contents from 1996-2000 in a project called 20th Century Find. (No URL yet.)
Tomorrow morning I'll be making a presentation about a Warrick experiment, and Joan will be presenting a short paper after me. Looks like I'll be burning the midnight oil getting prepared for tomorrow...