Friday, July 13, 2007

Warrick queue is growing

Last week I wrote about the new web interface for Warrick. Well, apparently the interface has made Warrick a very popular tool: there are now 18 jobs in the queue, 5 jobs running, and 16 completed jobs. Unfortunately, the jobs currently being run are huge, most of them with more than 10K resources that need to be recovered. One job has over 136,000 URLs needing to be recovered!

I upped the daily queries to the Internet Archive from 1000 to 2000, but the change only gets read the next time Warrick is run, so the current jobs can't be sped up. I'm not sure what to do except hope some of these jobs finish soon... for the jobs waiting in the queue, numerous web pages may be falling out of search engine caches each day!
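To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each URL costs exactly one Internet Archive query and that the daily query cap is the only bottleneck; the real cost per resource may well differ, and `days_to_finish` is a made-up helper, not part of Warrick.

```python
def days_to_finish(urls, queries_per_day, queries_per_url=1):
    """Estimate the days needed when a daily query cap is the only limit.

    queries_per_url=1 is an assumption for illustration, not a Warrick fact.
    """
    total_queries = urls * queries_per_url
    # Integer ceiling division: a partially used day still counts as a day.
    return -(-total_queries // queries_per_day)


# The largest job mentioned above, under the old and the new daily limit.
print(days_to_finish(136_000, 1000))  # 136
print(days_to_finish(136_000, 2000))  # 68
```

Under those assumptions, doubling the limit halves the wait, but only for jobs that start after the change is picked up.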

I guess it's better to be too popular than to be the shy kid in the corner of the room. ;)

