Showing posts with label publishing. Show all posts

Saturday, October 24, 2009

Article in CACM

Check out my article "Why web sites are lost (and how they're sometimes found)" in the November edition of the Communications of the ACM. My co-authors were Cathy Marshall (Microsoft Research) and Michael Nelson (Old Dominion University).

If you don't have an ACM Digital Library subscription, you can access the pre-print here.

Abstract:
We have surveyed individuals who have lost their websites (through hard drive crashes, ISP bankruptcies, etc.) or have tried to recover websites that once belonged to others. We investigate why these websites were lost and how individuals reconstructed them, including how they recovered data from search engine caches and web archives. The findings suggest that digital data loss is likely to continue since backups are frequently neglected or performed incorrectly; furthermore, respondents perceive that loss is uncommon and that data safety is the responsibility of others. Finally we suggest that this benign neglect be countered by lazy preservation techniques.

Monday, January 23, 2006

Paper Rejection

I recently received a “sorry but your paper was not accepted” notice for one of my papers. This paper was probably the best one I’ve written to date. The conference that rejected my paper is a top-notch, international conference that is really competitive.

According to the rejection letter, they accepted only 11% of the papers (about 80) and therefore rejected around 700 papers. If each paper had on average just 2 authors (that’s probably a little low), then around 1,400 people received the same rejection notice. If each paper took on average 100 hours to write (collecting data, preparing, writing, etc., and again that’s got to be too low an estimate), then 70,000 hours have been completely wasted, not to mention the time the reviewers spent reading all these rejected papers.

Now these rejected individuals (most with PhDs) get to re-craft and re-package their same results for a new conference that has different requirements (fewer pages, new format, etc.), adding another 5 hours minimum per paper, which results in another 3,500 hours spent on the same set of results. Meanwhile, these re-formulated papers will compete with a new batch of papers prepared by others. The results are also getting stale. Unless the new paper gets accepted at the next conference, the cycle will continue.
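For anyone who wants to redo the back-of-the-envelope math with their own guesses, here is a minimal sketch in Python. It uses only the rounded figures from this post; the variable names are mine, and every input is a rough assumption rather than a number from the conference.

```python
# Rough estimates from the post; all inputs are guesses, not conference data.
rejected_papers = 700          # "around 700" rejected papers
authors_per_paper = 2          # assumed average, probably low
hours_per_paper = 100          # assumed average writing effort, probably low
rework_hours_per_paper = 5     # assumed minimum effort to re-target a paper

# People who received the same rejection notice.
people_rejected = rejected_papers * authors_per_paper

# Hours spent on papers that were rejected in this round.
hours_first_round = rejected_papers * hours_per_paper

# Extra hours spent re-packaging the same results for the next conference.
hours_resubmission = rejected_papers * rework_hours_per_paper

print(people_rejected, hours_first_round, hours_resubmission)
```

Changing any of the four inputs at the top shows how quickly the totals grow if the averages really are higher than my guesses.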

This seems like a formula guaranteed to produce madness.