Sunday, May 31, 2009

Thousands of websites about to bite the dust...

Yahoo announced a month ago that it was pulling the plug on GeoCities, one of the Web's first free web-hosting services. There doesn't appear to be any plan to migrate the thousands (millions?) of websites this will affect to other services. If you don't act by the end of the summer, your GeoCities website will disappear.

That is, unless the Internet Archive has grabbed a copy, but they aren't likely to have many pages from each GeoCities website archived. I've been conversing with someone who lost a backup of her GeoCities website years ago, and IA only had a handful of pages archived. This is likely going to be a recurring story in the years ahead.

My first website was on GeoCities. In fact, that's how I first learned how to use HTML in 1997. I'm so embarrassed by that first website that I'm keeping the address a secret. I fear the day the Internet Archive's Wayback Machine has full-text search, because someone's going to pull it up and post it on Facebook or something. That's one stream of bites I'm not afraid of losing.

Tuesday, May 26, 2009

Flight simulator site AVSIM destroyed by hackers

This morning I got a call from an individual who alerted me to the AVSIM tragedy. Apparently this popular flight simulator website with 13 years of articles, forum posts, etc. was not being backed up properly, and a hacker took them out.

Tom Allensworth, the website's founder, stated:
"The method of the hack makes recovery difficult, if not impossible, to recover from. AVSIM is totally offline at this time and we expect to be so for some time to come. We are not able to predict when we will be back online, if we can come back at all."

It's possible Warrick could recover a significant amount of lost content, but I have not heard from anyone at AVSIM about it. Perhaps they are using it now as we speak.

Thursday, May 21, 2009

Braden William McCown has arrived!

Braden made his appearance at 2:38 pm this afternoon. He was 8 pounds, 4 ounces, and 22" long. He had a basketball in one hand and a tennis racket in the other, which made the birth very painful ;-), but we were very thankful he decided to come during the day instead of the middle of the night. Ethan was excited to meet Braden and even gave him a couple of kisses. Let's hope they remain good buddies!

Becky and I are very appreciative of all the calls, emails, and Facebook messages we've received. We are excited to introduce you all to the little guy.

God is good!

Wednesday, May 20, 2009

Java Sitemap Parser

I've just released the Java Sitemap Parser. The software is capable of reading Sitemaps in XML, Atom, RSS, and text format. As far as I can tell, this is the first open source Sitemap-parsing software available on the Web.

The Java Sitemap Parser was the final project for my Search Engine Development class. I talked about the project a few weeks ago and how prevalent Sitemaps are becoming. Originally we wanted to add Sitemap support to Nutch, but developing just the parser proved to be quite a task. By releasing it as an independent project, I'm hoping Nutch, Heritrix, and other open-source crawlers will integrate it into their systems.
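To give a rough idea of what reading a Sitemap involves, here is a minimal sketch in Python (not the Java Sitemap Parser's own code or API) that handles just two of the formats mentioned above: XML and plain text. The function names and the sample data are made up for illustration, and the sketch assumes a well-formed Sitemap that uses the standard sitemaps.org namespace. It ignores Atom, RSS, Sitemap index files, and compression.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_xml_sitemap(xml_text):
    """Return the URLs found in the <loc> elements of an XML Sitemap."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


def parse_text_sitemap(text):
    """Return the URLs in a plain-text Sitemap (one URL per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical sample data, for illustration only.
SAMPLE_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

SAMPLE_TEXT = "http://www.example.com/\n\nhttp://www.example.com/about.html\n"

if __name__ == "__main__":
    print(parse_xml_sitemap(SAMPLE_XML))
    print(parse_text_sitemap(SAMPLE_TEXT))
```

Even this toy version hints at why a complete parser is a bigger job: a real one also has to cope with the other formats, malformed input, and the protocol's size and location rules.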

Tuesday, May 12, 2009

I love my teacher evaluations

Every semester I usually get evaluated by my students (I just got my results back today). They answer questions like, "How effective has the instructor been in this course?" and "Rate the instructor's command of the subject matter." All responses are anonymous.

This is common practice at most universities, and it often creates terror in the hearts of many faculty. I've known colleagues who have never read their teacher evaluations for fear of what their students might say, and I've known others who can still recite word-for-word some of the cruelest comments made by students over 20 years ago.

I've received my share of poor evaluations, especially when I was a new teacher. It took me a few semesters to get the hang of teaching, and now my evaluations are generally good (not great, but typically a little higher than the average Harding professor).

What I've found over my 10+ years of teaching is that some students give really helpful comments that can help you improve your class next time around. "I wish we could have spent some time discussing how to apply some of the new principles we learned to our project." Some students are going to really like you and let you know it. "The professor had good teaching skills, was responsive and helpful to questions, and was very knowledgeable."

Other students... well... you have to take their comments with a grain of salt. You have to realize that some students are not going to like it if you require them to work hard (many students think they should receive a B just for attending every lecture). Some students are just poor at evaluating others' performance. Others have yet to realize that they are responsible for their own learning. Occasionally a student is going to be having a bad day, and your anonymous evaluation is going to be the perfect target.

What really helped me was learning how to properly interpret students' remarks and judge whether the criticism has merit or not. I think learning this skill is important to any new faculty member, otherwise you'll be crying yourself to sleep after reading your evaluations.

Here are a few comments I've received over the past couple of years, along with my interpretation of each comment and my response. :-)

  1. Student 1: The projects expected a lot from the students.
    Student 2: Smaller, less-brutal projects would not be a bad idea.

    Interpretation: I thought this class was supposed to be easy!

    Response: If computer science was easy, we wouldn't be getting paid like we are, and everyone would be doing it. The projects are tough because I'm preparing you for the far more difficult and complex projects you'll encounter when you enter the workforce. You'll thank me later.

  2. Have different projects that we can choose from instead of making everyone do the same project.

    Interpretation: I like my classes like my Burger King - my way!

    Response: I always entertain ideas for new projects, but it's unreasonable for any teacher to spend hours coming up with a menu of project choices to cater to every whim. In a software development job, you are unlikely to have a boss ask you which project you'd like to work on... you'll work on what needs to be completed.

  3. Instead of making us use the programming language you want us to use, let us use one we are already familiar with.

    Interpretation: Learning something new is highly overrated.

    Response: If you graduate from Harding being comfortable with only one or two languages, you should get your money back, because we haven't adequately prepared you. You'll need to learn new languages all the time as a working professional.

  4. Disable the Internet on the classroom computers so that we can only access web sites are necessary for class. Remove Solitaire, Minesweeper, Hearts, etc. from the computers.

    Interpretation: Save me from myself!

    Response: I appreciate this student's honesty. I asked our lab administrator today to remove all games. There are going to be some very disappointed students next fall. ;-)

  5. Student 1: The fast pace of the class made it difficult to fully learn concepts.
    Student 2: It felt like sometimes you paced the classes very slowly.

    Interpretation: The pace of the class is perfect!

    Response: If roughly the same number of students complain that the pace of the course is too fast and too slow, I know I'm covering it at just the right pace.

  6. You try to cover too much material for a semester. Your previous classes didn't have to learn as much as we've had to. :-(

    Interpretation: Curse you ever-evolving technology!

    Response: One of the enigmas of higher education is that the consumers (the students) are often happier to receive less for what they are paying for (education). Can you imagine the same student being upset if McDonald's gave him a large order of fries for the price of a medium? Harding should fire me if I quit trying to keep my classes current and just teach the exact same stuff every semester.

  7. Don't give us really hard assignments, and don't expect us to have them done by the next class period... we do have other classes and lives!

    Interpretation: I'm serious about "me" time.

    Response: You should schedule 2-3 hours of outside-class time for each hour you are in class. (This is a universal rule that applies to all your major courses, not just mine.) So if I give a homework assignment on Mon and expect it due Wed, you should have already allocated 2-3 hours (at least) to getting the assignment finished. If your assignments are taking much longer than that to complete on a regular basis, that's a sign that you need to start getting some extra help and adjust your schedule accordingly. Remember that half of the class thinks we're going too slowly (see #5 above).

  8. Weaknesses of the instructor: Calvinism

    Interpretation: ???

    Response: "Isms in my opinion are not good. A person should not believe in an ism - he should believe in himself. I quote John Lennon: 'I don't believe in Beatles - I just believe in me.' A good point there. Of course, he was the Walrus. I could be the Walrus - I'd still have to bum rides off of people." - Ferris Bueller


Inspired by Jordan's comments, I have added a little to my original post.

Friday, May 08, 2009

Spring semester is over

I wrapped up all my grading today. We have a senior reception tonight and the graduation ceremony tomorrow.

Below is the grade distribution for my Intro to Programming, Internet Development, and Search Engine courses. The average was 79.0, and the median 85.4. If I had time I'd compare this to my past semesters, but I don't think much has changed.

I guess I'm a little guilty of grade creep... the average student is supposed to get a C, right? I think my students would argue with that conclusion. A recent survey found that 30% of college students agree with the statement: "If I show up to every class, I deserve at least a B." Surely that percentage isn't nearly as high at Harding. ;-)

Wednesday, May 06, 2009

Team Digital Preservation

In an effort to bring digital preservation to the masses, DigitalPreservationEurope (DPE) is developing an entertaining series of short animations introducing and explaining digital preservation problems and solutions. Below is their first video. It's a throw-back to animated cartoons of the 1960s, and it is fantastic. Watch as Team Digital Preservation thwarts Team Chaos' plans to disrupt digital information from a nuclear power plant.
"You fiend! It's essential to have long term stable and trusted information on how nuclear power plants are built and what's inside them!" - DigiMan

Future cartoons will be made available on DPE's YouTube channel.

Monday, May 04, 2009

Improving movie recommendations

If you haven't yet checked out the new CACM blogs, you need to soon. One of the posts that caught my attention was Greg Linden's What is a Good Recommendation Algorithm? Linden wonders if Netflix's one million dollar reward for a better recommendation engine is a little short-sighted. The goal for their recommendation system is to only show people how much they might like a movie. But Linden points out:
However, this might not be what we want. Even in a feature that shows people how much they might like any particular movie, people care a lot more about misses at the extremes. For example, it could be much worse to say that you will be lukewarm (a prediction of 3 1/2 stars) on a movie you love (an actual of 4 1/2 stars) than to say you will be slightly less lukewarm (a prediction of 2 1/2 stars) on a movie you are lukewarm about (an actual of 3 1/2 stars). Moreover, what we often want is not to make a prediction for any movie, but find the best movies. (emphasis mine)
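Linden's two examples can be put into numbers with a small Python sketch. Both predictions miss by exactly one star, so a plain squared error scores them the same. The weighted version below is purely hypothetical (it is not Netflix's metric or anything Linden proposes): it assumes a miss should count more the further the actual rating sits from a lukewarm midpoint of 3 stars.

```python
def squared_error(predicted, actual):
    """Plain squared error: every one-star miss costs the same."""
    return (predicted - actual) ** 2


def weighted_squared_error(predicted, actual, midpoint=3.0):
    """Hypothetical variant: misses on strongly felt movies cost more.

    The weight grows with the distance between the actual rating and
    the midpoint. Both the formula and the midpoint are assumptions.
    """
    weight = 1.0 + abs(actual - midpoint)
    return weight * (predicted - actual) ** 2


# (predicted stars, actual stars) taken from the two cases in the quote.
loved_movie = (3.5, 4.5)
lukewarm_movie = (2.5, 3.5)

if __name__ == "__main__":
    for name, (pred, act) in [("loved", loved_movie), ("lukewarm", lukewarm_movie)]:
        print(name, squared_error(pred, act), weighted_squared_error(pred, act))
```

Under the plain measure the two misses are indistinguishable, while the weighted one ranks the miss on the loved movie as the worse of the two, which is the distinction the quote is after.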

Shifting gears a little, I want to talk about a couple of small fixes to an existing movie recommendation system that could make customers a lot happier.

I haven't used Netflix, but I've been using Blockbuster Online for over a year, and I've played with their recommendation feature a lot. I would assume their recommender is on par with Netflix (hint: someone needs to compare the two).

One feature Blockbuster offers allows you to select "Do not show me this movie again", a little icon on the side of each movie's ratings. I've clicked this icon a lot (is it just me, or is there a lot of garbage out there?), hoping Blockbuster would stop recommending these specific movies to me and others like them. However, the screen shot below is what I saw this morning when I logged into my account:

Note how I was recommended "Zack" and "Quarantine" despite having clicked on the no-show icon weeks ago. They also recommend "Changeling", a movie I've already rated (and therefore obviously seen). But since I didn't rent "Changeling" directly from Blockbuster, they still offer it as a movie I "might have missed."

These movies do not appear in my formal set of recommendations (the screen that results from clicking on the Recommendations link), so my guess is Blockbuster is using a different set of algorithms to populate their might-have-missed list from their formal recommendation list. However, I suggest that the might-have-missed list should take advantage of previous ratings to improve overall customer satisfaction.

This should be common sense: Do not suggest a movie that a user has already marked "do not show me this movie again". Especially not on the first page the user sees when logging into your site.
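The rule is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: nothing here reflects how Blockbuster's site actually works, the function name is invented, and "Some Other Movie" is a placeholder title. The idea is to drop any candidate the user has hidden or already rated before the list is shown.

```python
def filter_suggestions(candidates, hidden, rated):
    """Remove movies the user has hidden or already rated.

    candidates: titles proposed by the recommender, in display order
    hidden:     titles marked "do not show me this movie again"
    rated:      titles the user has already rated (and so has seen)
    """
    excluded = set(hidden) | set(rated)
    return [title for title in candidates if title not in excluded]


# Titles from the post, plus one placeholder title.
might_have_missed = ["Zack", "Quarantine", "Changeling", "Some Other Movie"]
hidden_by_user = {"Zack", "Quarantine"}
rated_by_user = {"Changeling"}

if __name__ == "__main__":
    print(filter_suggestions(might_have_missed, hidden_by_user, rated_by_user))
```

The filter keeps the recommender's original ordering and only removes entries, so it could sit in front of any list the site builds, whichever algorithm produced it.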

One more point. Below is a screen shot from the first page of recommendations made by Blockbuster. None of the movies below appeal to me, but I can see how they might have been recommended based on my viewing history and ratings.

But one movie really stands out as a bad recommendation: "Swing" (bottom-left). Note how it has only received two stars on average, equivalent to "I didn't like this movie".

Why would Blockbuster think I would like this movie when most people don't?

I know my taste in movies is probably not typical, but I don't think I've ever given a movie with an average rating of two stars a rating better than two stars. Even if Blockbuster thinks this movie matches my tastes, it would make much more sense to put movies with higher overall ratings on the first result page and bump lower rated movies back a few pages.
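As a rough sketch of that suggestion, the Python below re-orders a list of recommendations by overall average rating and then cuts it into pages, so a two-star movie like "Swing" lands on a later page. The function names, the page size, and every title and rating other than "Swing" at two stars are hypothetical. A real system would presumably blend the average rating with the personalized score rather than sort on the average alone.

```python
def rank_by_average(recommendations):
    """Sort (title, average_stars) pairs from highest to lowest average.

    Python's sort is stable, so movies with equal averages keep the
    order the recommender originally gave them.
    """
    return sorted(recommendations, key=lambda item: item[1], reverse=True)


def paginate(items, page_size):
    """Split a list into consecutive pages of at most page_size items."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


# "Swing" and its two-star average come from the post; the rest is made up.
recommended = [
    ("Swing", 2.0),
    ("Movie A", 4.0),
    ("Movie B", 3.5),
    ("Movie C", 4.0),
]

if __name__ == "__main__":
    for number, page in enumerate(paginate(rank_by_average(recommended), 2), start=1):
        print(number, page)
```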

My experience in general has been that Blockbuster's recommendations don't really work. I've found one recommended movie in the past year that I thought looked interesting. Then again, I don't often try iffy movie recommendations because I'm not ready to gamble on two hours of a nice evening.

I'm looking forward to a time when the recommendation system really works well, but until then, I'll be consulting with my friends and family who have a much better idea of what I really like to see.

Saturday, May 02, 2009

Micah Pate has been found

If you haven't already heard, Micah Pate's body has been found. Micah's husband Thomas is being charged this morning with the killing.

Micah Rine Pate was a Harding University graduate and Searcy native. Her parents are employees of Harding and Harding Academy. As you can imagine, the Searcy community has been rocked with this story. Our prayers go out to the Rine family and to Thomas' family.

The photo on the right is a screen shot of Micah's Facebook page. Many of her friends are posting sad farewells to her and telling her family how much they loved her. Her account will likely remain active as long as Facebook is around. I imagine her family is going to "capture" her Facebook account as well, as an artifact of remembrance. I'm presenting a paper on this subject in June at JCDL 2009.


Two vigils in Searcy were held for Micah and the Pates, one at Harding. KARK 4 News had a news story about it last night. One thing that comes across in the story and interviews is Micah's faith and the positive influence she has had on others.