On that note, I have updated Warrick with the most recent changes:
- Warrick now uses the Google API for accessing cached pages.
- Warrick issues lister queries (queries using the “site:” parameter) to Google via page scraping.
- Yahoo API libraries were updated due to a March 2006 change.
- Several minor bugs were corrected.
I also received several emails from the Internet Archive last week about Warrick. Apparently the guys who do backups for people with missing websites are excited about the tool, and IA will start pointing users to it:
If you are tech-savvy and know how to use command-line utilities, you can also refer to the Warrick tool here: http://www.cs.odu.edu/~fmccown/research/lazy/warrick.html and be sure to email the makers as they track who is using the tool. For this tool, a third party has put it together and we cannot guarantee the results. If you have questions about this tool, please refer your questions to the makers themselves.
One of the IA employees told me she has performed at least 200 recoveries for individuals in the past year. That’s a lot of people using “lazy preservation” and sure does support the need for research in this area.
Speaking of the Internet Archive, I don't have a clue how they store all that information. I don't know how they raise the funds to do so or what their motivation is. BUT, it's impressive.