Wednesday, July 25, 2007

7 days left in Virginia

A week from today, Becky, Ethan, and I will be heading back west to Searcy, Arkansas, so I can start teaching again at Harding University. The condo is sold, the boxes are starting to pile up, and we're getting excited about returning to Searcy.

At the same time, we are really going to miss Norfolk, especially our friends and church, which have been such a great blessing to us. We'll also miss living in a neighborhood of beautiful, older homes surrounded by water.

God has really blessed us while we've been here these past three years. Becky was able to find a great job at Regent University, and my Ph.D. studies have gone very well working with Michael. And now God has given us a beautiful son whose smile will brighten anyone's day. Like they say, Virginia is for lovers. :)

Sunday, July 22, 2007

Citizendium or Wikipedia?

A few weeks ago I applied to be an editor on Citizendium, a new wiki project intended to be a more accurate and reputable Wikipedia. Citizendium does not allow anonymous postings, and they put new articles under a review process. My application was accepted a few weeks later, and a user account was created for me along with a user page listing my brief CV.

I decided to warm up with an article on digital preservation, a topic which I feel qualified to write about. Rather than start from scratch, I imported the Wikipedia article on digital preservation. Although Citizendium frowns upon importing articles from Wikipedia, I had previously written a large portion of the Wikipedia article, so I didn't feel too bad about doing it. I cleaned up the article by focusing the definition, cleaning up the references, and removing the numerous external links.

I didn't spend a whole lot of time editing the article because I got to thinking... can Citizendium really compete with Wikipedia? Larry Sanger (Citizendium's founder and co-founder of Wikipedia) seems to think so. Although I agree that Citizendium's policies would in theory result in more reputable articles, I don't think Citizendium can possibly scale to Wikipedia's size. First, there are a number of people who want to make a minor contribution to a Wikipedia article, a slight correction or clarification, for example. And since they don't have to register, the barrier to entry is low enough for them to contribute.

Second, there are a number of people who, for whatever reason, want to remain anonymous or known by some alias. They are not likely to sign on with Citizendium and convey to the world why they are qualified to write about XYZ.

Third, if someone wants to make a contribution to an article on a particular subject, they now have to decide whether to make the contribution just on Wikipedia, just on Citizendium, or on both. Do they monitor both sites for changes to an article that is important to them? If a Citizendium article is actually better than the Wikipedia article, what is stopping Wikipedia from just importing the entire Citizendium article?

Fourth, the name "Citizendium" was a poor choice. It doesn't exactly roll off the tongue.

I really do hope Citizendium takes off, but I think it's going to take a very, very long time before they have anything near the number of articles that Wikipedia has to offer. In the meantime, I would much rather put my efforts into something I know is going to rank high in search results and get far more page views. I'll be watching Google to see if my Citizendium article will ever beat out Wikipedia's (currently ranked number 8) in the search results for "digital preservation". If it ever does, I'll defect to Citizendium.

Saturday, July 21, 2007


My pick of the week's top 5 events, articles, or items of interest:
  1. Researchers at Harvard have developed a robotic fly which could some day be used for spying, detecting harmful chemicals, or just annoying families out picnicking.

  2. What comes next after Vista? Windows 7.

  3. Here's an interesting article about Luis von Ahn's latest foray into utilizing humans for AI. The reCAPTCHA project is brilliant, but what caught my eye was a new game he is developing called Matchin' which enlists humans to rate the attractiveness of people's photos. The results could be used to make image archives searchable by attractiveness. Sorry von Ahn, but I think this has already been done.

  4. A new search engine called Lexxe is using natural language technology to answer users' queries (think Ask Jeeves). I tried the query "why should I use lexxe instead of google?" and got my answer on the first result. Not bad.

  5. Becky finally joined Facebook this week. I'm a little jealous though because I've been on there for several months, but she already has more friends than I do. Just a thought, but why does Facebook limit you to just "friends"? What about "casual acquaintances" or "person I saw on the bus"? I guess it's up to me to invent the next Web 6.0 social networking site.

Friday, July 20, 2007

Note to Acer

To the Acer Desktop Design Team:

Guess what happens when you put a power button that activates at the slightest touch on a tower which sits by the operator's knee? You get one very irritated user who feels like hurling your tower out the window.


Frustrated in Norfolk

Thursday, July 19, 2007

Fourth wedding anniversary

Becky and I were married four years ago today in Searcy, Arkansas. We celebrated today by hiring a babysitter and having a nice lunch at The Trellis in Williamsburg (and their famous Death by Chocolate cake) and then walking around and enjoying our time away.

I cannot thank God enough for providing such an encouraging and loving spouse. I honestly don't think I could have made it through the Ph.D. process without her support. Although I could go on and on about Becky and embarrass her endlessly, I'll just say this: someday I pray our son will find a wife as incredible as his mother. :)

Wednesday, July 18, 2007

Cool visualization of the day

I came across this today. It's a figure from The Political Blogosphere and the 2004 U.S. Election: Divided They Blog (2005) by Lada Adamic and Natalie Glance showing a citation network of the liberal and conservative blog communities.

There are 676 liberal (blue) and 659 conservative (red) blogs represented, and the size of each blog point reflects the number of other blogs that link to it. Blue links connect liberal blogs, red connect conservative blogs, yellow links go from liberal to conservative, and purple from conservative to liberal.
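
The coloring and sizing rules are simple enough to sketch in a few lines of Python. This is only an illustration, not the authors' code: the blog names and links below are made up, and it assumes "the number of other blogs that link to it" means distinct linking blogs.

```python
from collections import Counter

# Hypothetical toy data (not from the paper): blog name -> political leaning.
LEANING = {
    "lib_a": "liberal",
    "lib_b": "liberal",
    "con_a": "conservative",
    "con_b": "conservative",
}

# Hypothetical directed links as (source blog, target blog).
LINKS = [
    ("lib_a", "lib_b"),
    ("lib_b", "lib_a"),
    ("con_a", "con_b"),
    ("lib_a", "con_a"),
    ("con_b", "lib_a"),
]

# Link color depends on the leaning of the source and the target,
# following the legend described in the post.
EDGE_COLORS = {
    ("liberal", "liberal"): "blue",
    ("conservative", "conservative"): "red",
    ("liberal", "conservative"): "yellow",
    ("conservative", "liberal"): "purple",
}


def edge_color(source, target, leaning=LEANING):
    """Return the color of a link from source to target."""
    return EDGE_COLORS[(leaning[source], leaning[target])]


def in_degrees(links):
    """Count the distinct blogs linking to each blog (this drives point size)."""
    return Counter(target for _, target in set(links))
```

Here `edge_color("lib_a", "con_a")` gives `"yellow"`, and `in_degrees(LINKS)["lib_a"]` is 2, so `lib_a` would be drawn as the largest point in this toy network.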

Saturday, July 14, 2007


My pick of the week's top 5 events, articles, or stories:
  1. The CACM has several excellent articles on the science of gaming. One of the articles discusses Carnegie Mellon's master's degree in video games, which sounds like something I would leap at in a heartbeat if I were a graduating senior. I also enjoyed Kelleher and Pausch's article on Storytelling Alice, which I would seriously like to investigate using at Harding in the near future.

  2. While on the subject of gaming, who is preserving our video games for future generations? An interesting article in the Guardian laments the fact that, besides a few game archives (like the new one at Univ of Texas), there is no industry initiative to preserve video games and consoles like the Amiga CD32, Pioneer LaserActive, and Bandai Playdia. The article concludes with a statement that may have been tongue-in-cheek, but there's some truth to it:
    The Virtual PC thing is worthy, but are you really telling me that the preservation of obscure census data is more important than saving the software catalogue of the Sega Dreamcast?
  3. I stumbled across a link to Pagefactor while scanning the Wikipedia article on link rot this week. Pagefactor is attempting to use humans to locate pages that disappear. I'm not sure if the approach is going to work (hmmm... I can't find the web page I'm looking for, oh now I found it, and now I'll go do some work to let others know the new location), but it's interesting nonetheless.

  4. Six months into Vista, and users are still griping. So am I. I got a new PC in June loaded with Vista, and although I have enjoyed the much improved graphics and speed, I've been annoyed by two blue screens of death, video that won't play smoothly in QuickTime, and constant pop-ups asking if I really meant to do what I asked the computer to do. So why don't I give up and buy a Mac? To quote For Your Consideration:
    You can't throw the baby out with the bathwater because then all you have is a wet, critically injured baby.
    Actually there are two main reasons: 1) I've invested an incredible amount of time learning Windows and don't want to start over, and 2) I want to be running what most people are running. It's embarrassing to be asked a computer question (by family, friends, or students) that I know nothing about. And as a CS faculty member, I should be very familiar with how the majority of the world is interacting with their computers.

  5. This week some masked bandits cracked a safe and stole $12,000 with the help of Google. After struggling to open the safe for over an hour, the burglars went to a nearby PC and Googled "safe-cracking", which led them to an article on "How to Open Safes". Soon they were in and left with the cash.

Friday, July 13, 2007

Warrick queue is growing

Last week I wrote about the new web interface for Warrick. Well, apparently the interface has made Warrick a very popular tool, because there are currently 18 jobs in the queue, 5 jobs running, and 16 completed jobs. Unfortunately, the jobs currently being run are huge, most of them with more than 10K resources that need to be recovered. One job has over 136,000 URLs needing to be recovered!

I upped the daily queries to the Internet Archive from 1000 to 2000, but the change only gets read the next time Warrick is run, so the current jobs can't be sped up. Not sure what to do but hope some of these jobs finish soon... there are possibly numerous web pages falling out of search engine caches each day for those jobs waiting in the queue!
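
A back-of-the-envelope sketch of why the big jobs take so long, under two loud assumptions that are not stated above: each URL costs one Internet Archive query, and a single job gets the whole daily budget to itself. The function name and its parameters are hypothetical, just for illustration.

```python
import math


def days_to_finish(urls_remaining, daily_query_limit, queries_per_url=1):
    """Estimate the days needed to query every remaining URL.

    ASSUMPTION: queries_per_url=1 and the job does not share the daily
    limit with other jobs. Neither is confirmed by the post.
    """
    if daily_query_limit <= 0:
        raise ValueError("daily_query_limit must be positive")
    total_queries = urls_remaining * queries_per_url
    # Round up: a partly used day still counts as a day of waiting.
    return math.ceil(total_queries / daily_query_limit)
```

Under those assumptions, `days_to_finish(136000, 1000)` comes to 136 days and `days_to_finish(136000, 2000)` to 68 days, so doubling the limit halves the wait but only for jobs that start after the change.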

I guess it's better to be too popular than the shy kid in the corner of the room. ;)

Oh... Ethan slept 8 hours straight last night! Not bad for a 3.5-month-old!

Saturday, July 07, 2007

Ethan update

It's 7-7-07, so I had to post something on this statistical anomaly of a day. Here are a few pics of Ethan from the past month.

Looking good in my golf outfit!

With grandma and mom at church

With dad and grandpa

I'm not amused

Happy with Mom

Friday, July 06, 2007


TGIF! My pick of the week's top 5 events, articles, or items of interest:
  1. The Internet Archive just increased their web archive (the Wayback Machine) by 25%. Brad Tofel, an archivist at IA, told me that from here on out the Wayback Machine should be 3 months out-of-date rather than 6-12 months as it has been in the past.

  2. It's a great time to be a talented college student looking for an internship.

  3. Some brilliant blogging advice from one of the most prolific bloggers at Google:
    Don't post when you're angry.
    These and other wise sayings from Matt Cutts were in response to a Google employee's blogging about her dislike of the new movie Sicko. I'll add to this...
    Don't post a response to someone else's blog posting when you're angry.
    I've learned this lesson the hard way.

  4. Dell is warning their customers about moving from XP to Vista. I made the move a few weeks ago when I purchased a new desktop for home use. It hasn't exactly been the smoothest transition, but I'll blog more about that later. I would certainly advise anyone making the leap to be on their guard.

  5. If you haven't played the Wii yet, find a friend who owns one and give it a shot! On the 4th our friends Brad and Katie threw my family a going away party (we're heading back to Arkansas in Aug). Brad owns a Wii, and it must have been the most popular activity at the party! No wonder the Wii is currently out-selling the PS3 six-to-one.

Thursday, July 05, 2007

Link rot irony

What do you get when you do a search on Google Scholar for the paper titled

Runaway Train: Problems of Permanence, Accessibility, and Stability in the Use of Web Sources in Law Review Citations

(a paper about how problematic link rot is) and then try to access the link? Irony at its best. :)

Monday, July 02, 2007

Finally, a web interface for Warrick

Warrick finally has a web interface. Actually it's more than just an interface; it's a queueing system that we call Brass. You can read more about Brass in a paper I presented at IWAW 2 weeks ago. We're hoping the interface is intuitive enough that those who are not very technically inclined can recover their lost website (or someone else's lost website) without too much difficulty.

We had a master's student (Amine) working on Brass for quite some time, and over the last few months I've been overhauling the interface and tightening up the security. (I'm sure there are still some holes in it, but it's much better than it used to be!)

Below is a screen shot of the opening interface.

After submitting the requested info, the user is sent a confirmation email (to avoid recovering websites for bots). Once the user confirms the request, the job is queued and later dispatched to a free machine. The user is emailed when the job completes, and then the user can download the recovered set of files from the lost website.
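
The life cycle described above (submit, confirm by email, queue, dispatch to a free machine, notify on completion) can be sketched as a small state machine. To be clear, this is a toy model and not Brass's actual code: the class, method, and status names are all made up for illustration, and a plain list stands in for sending real email.

```python
from collections import deque
from enum import Enum


class Status(Enum):
    PENDING = "awaiting email confirmation"
    QUEUED = "confirmed, waiting for a free machine"
    RUNNING = "recovering the website"
    DONE = "files ready for download"


class JobQueue:
    """Hypothetical stand-in for a recovery job queue (not the real Brass)."""

    def __init__(self, machines):
        self.free_machines = list(machines)
        self.jobs = {}          # job id -> job record
        self.waiting = deque()  # confirmed job ids, first in, first out
        self.outbox = []        # (address, message) pairs instead of real email
        self._next_id = 1

    def submit(self, email, site_url):
        """Record the request and send a confirmation email (keeps bots out)."""
        job_id = self._next_id
        self._next_id += 1
        self.jobs[job_id] = {"email": email, "url": site_url,
                             "status": Status.PENDING, "machine": None}
        self.outbox.append((email, f"Please confirm job {job_id}"))
        return job_id

    def confirm(self, job_id):
        """Queue the job only once the user has confirmed the request."""
        job = self.jobs[job_id]
        if job["status"] is Status.PENDING:
            job["status"] = Status.QUEUED
            self.waiting.append(job_id)

    def dispatch(self):
        """Hand queued jobs to free machines; return the job ids started."""
        started = []
        while self.waiting and self.free_machines:
            job_id = self.waiting.popleft()
            job = self.jobs[job_id]
            job["machine"] = self.free_machines.pop()
            job["status"] = Status.RUNNING
            started.append(job_id)
        return started

    def complete(self, job_id):
        """Free the machine and email the user that the files are ready."""
        job = self.jobs[job_id]
        if job["status"] is not Status.RUNNING:
            raise ValueError(f"job {job_id} is not running")
        self.free_machines.append(job["machine"])
        job["machine"] = None
        job["status"] = Status.DONE
        self.outbox.append((job["email"], f"Job {job_id} is ready to download"))
```

With a single machine and two confirmed jobs, the first `dispatch()` starts only the first job; the second stays queued until `complete()` frees the machine, which is roughly the backlog situation a popular queue runs into.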

Go ahead and give it a spin, and let me know if you have any problems.

*By the way, the really cool Warrick logo was created by two graphic design majors at Harding: Andrew Murray and Luke Jones.