Questio Verum: 2010

Thursday, December 23, 2010

Merry Christmas from the McCowns

Merry Christmas!

Photo by Stacy Schoen

"For unto you is born this day
in the city of David a Savior,
who is Christ the Lord." - Luke 2:11

Friday, December 17, 2010

Tron's legacy

There's probably no one in Searcy, Arkansas, who is more excited to see TRON: Legacy tonight than myself. If you question my passion for the film, take a look at what is hanging on the most prominent wall in my office:

I saw the first Tron movie when I was about 9 years old. It presented a fictional world inside of computers where programs fought for survival by hurling light discs at each other and racing motorcycle-like vehicles that left a deadly trail of light behind them.

To say that it left an impression on me is an understatement. If you were to ask any NASA scientist why they entered their field, many would point to Star Trek. If you were to ask many computer scientists of a certain age why they got into computing, many would point to Tron (and perhaps War Games).

I'm going into the movie with middle-of-the-road expectations. It's got to be tough making a film that pleases the original Tron fanatics and a younger mainstream audience at the same time. I'll report back later after seeing the film.

In the meantime, check out this Tron music video which features an "alternative ending" to Tron.

End of line.

Update

I really enjoyed the film... 3 out of 4 stars. It had a semi-original plot, but the most interesting thing was just the inner world of Tron. The CG that produced a young Flynn was pretty good, but there is still obvious room for improvement. The 3D was pretty good and was well suited for this type of movie. I think my favorite scene was in the End of the Line Club where Daft Punk had a cameo appearance. Wish the main character, Sam, had his father's zany sense of humor... he's a bit too cool. I'm definitely going to see it again while it's on the big screen.

Tuesday, December 07, 2010

Android Workshop at SIGCSE 2011

I'm offering a workshop on Android application development at SIGCSE 2011 in March 2011: Audacious Android Application Programming. (The workshop name was shamelessly stolen from Michael Rogers' iPhone workshop from last year's SIGCSE.)

Here's a brief description:

As smartphones and mobile devices become ubiquitous, many CS departments are adding mobile computing electives to their curriculum. Google’s Android OS is a freely available and popular smartphone platform with applications programmed in Java. Workshop participants will be introduced to mobile app development and the Android SDK. We will write some simple Android apps with Eclipse and run them on an emulator. For those interested in teaching an upper-level Android course, reusable programming labs and projects will be distributed, and we will discuss some teaching strategies. Participants should be capable of writing Java programs in Eclipse and should bring their own laptop preloaded with Eclipse and the Android SDK.

The workshop will be held Wednesday, March 9, from 7:00 - 10:00 pm at the Sheraton Dallas Hotel in Dallas, Texas. Cost is $65. More details will be made available soon on the workshop website.

Thursday, December 02, 2010

Memento wins Digital Preservation Award 2010

Congratulations to Herbert Van De Sompel and Michael Nelson for being awarded the Digital Preservation Coalition's Digital Preservation Award 2010 for the development of Memento.

"‘Memento offers an elegant and easily deployed method that reunites web archives with their home on the live web,’ explained Richard Ovenden, chair of the Digital Preservation Coalition. ‘It opens web archives to tens of millions of new users and signals a dramatic change in the way we use and perceive digital archives.’"

I've been working with Herbert and Michael on the development of the Memento Browser for Android. It's great to see these guys being recognized for their hard work.

Tuesday, November 30, 2010

Thoughts on cheating

I've been thinking a lot about cheating these past few weeks. It was triggered by a cheating incident that occurred in my CS1 course where a student had copied source code found on the web. Interestingly enough, this coincided with news that Oracle had amended its patent infringement lawsuit against Google to include a line-by-line comparison of code it claimed Google illegally copied.

About the same time, a massive cheating scandal was uncovered at the University of Central Florida involving 200 students who cheated on their midterm exam (approximately one-third of the class). And just a few weeks before, I had read that cheating in CS accounted for 23% of all honor code violations at Stanford University where the students involved in the cheating make up only 6.5% of the student body. Oh, and did I mention the article in the Chronicle of Higher Education about the "Shadow Scholar" who has cashed in by writing innumerable papers on behalf of college students?

With so much cheating going on, one can't help but wonder whether students today value honesty less than previous generations did, or whether it is just easier to cheat (and to catch cheating) today. Is there something we could do as CS educators to reduce the amount of cheating going on in CS?

On Monday my chairman placed an article in my mailbox entitled Cheating in Computer Science by William Murray, a faculty member at the Naval Postgraduate School who is well-known in the computer security field. Murray thinks current CS teaching practices in which students must write original programs actually promote cheating by creating an artificial problem for which cheating is often the easiest strategy. Instead, Murray suggests we should employ practices that de-incentivise cheating, perhaps by promoting skills which are highly valued in the workplace like code re-use and teamwork.

I certainly agree that code re-use and teamwork can have positive benefits when learning programming. Pair programming is something I've been using for several years with positive results. I also promote limited code-reuse in my upper level courses when it's code that can augment my students' final projects, as long as the reuse is documented thoroughly and the re-user can adequately explain the code she's reused.

However, I don't see code re-use or pair programming as a panacea for reducing cheating. When I questioned my student who was caught copying code from the Web, it was clear that he didn't really understand what he had copied. He didn't understand it enough to even be able to fashion it into a solution resembling the program specification I had given. How did code re-use help him learn anything? I'm not saying he could not have potentially learned something, just that code re-use is not an immediate fix for cheating.

Teamwork is also not an immediate fix. I still remember vividly during my undergraduate years working on a team with someone who was understanding the class material significantly better than the rest of us. We relied on him heavily to get good grades on our projects, and many times we failed to fully understand our (his) solutions to the problems. And in pair programming, there are times when the weaker partner is just not going to "get" what is so obvious to the stronger partner, and it's easier to just turn in the finished assignment rather than struggle with the solution individually.

Murray goes on to suggest how teaching programming skills using existing programs will somehow remove the incentive to cheat:

"I no longer teach programming by teaching the features of the language and asking the students for original compositions in the language. Instead I give them programs that work and ask them to change their behavior. I give them programs that do not work and ask them to repair them. I give them programs and ask them to decompose them. I give them executables and ask them for source, un-commented source and ask for the comments, description, or specification. I let them learn the language the same way that they learned their first language. All tools, tactics and strategies are legitimate." (emphasis mine)

Let me get this straight: a student can use the strategy of copying someone else's solution, cite the person who did all the work developing the solution, and get credit for the work? Surely this is not what Murray is advocating. These are certainly worthwhile approaches that students could learn a lot from, but Murray does not make it clear how these approaches are any less susceptible to dishonest practices.

Murray later states that "Nice people do not put others in difficult ethical dilemmas," suggesting that I am somehow a mean guy for putting my students in a difficult situation when I ask them to write original code. I'm sure some of my students would agree when they start working on their assignments just hours before they are due. Perhaps every department on campus is guilty of the same thing since we all create "artificial" situations where students must come up with original solutions instead of borrowing others'.

My goal is not necessarily just to be nice, but to hold my students to a level of rigor where, if they take it seriously and put in the time and honest effort, they will be well prepared to enter the job market and have a basis of knowledge from which they can learn new skills. Many times this will require original solutions to problems that others may have already solved. However, having students solve these problems on their own or in pairs will put their brains through a mental workout that will prepare them to be productive members of a development team in the future.

So what can CS instructors do to make cheating less appealing? Coming up with new assignments that are engaging, changing exam questions, and all those other time-consuming tasks are certainly beneficial. But I think a more successful approach is to simply make the case for academic integrity as a relationship between teacher and student, a relationship that can be harmed when deceit is allowed to enter the picture. Deception will potentially harm the student's self-image more than anything and cause serious regrets in the long-term. For those of us that seek to follow God, the relationship is three-way, and deception in a relationship with God is a non-starter.

I think we have to realize that many college students are still quite young and lack the maturity to take the high road. Our job as faculty is to help those who mess up learn from their mistakes and exhort them to exercise integrity in the small things and the big things. This is something I'm still learning to apply to myself.

Friday, November 26, 2010

Firefox... you're killing me!

I'm trying to get caught up on my grading before the students return from the Thanksgiving break. Unfortunately, Firefox is driving me nuts, so pardon the short rant.

My web development class is mainly composed of freshmen and sophomores, many of whom have only been writing programs for a semester or two, and occasionally they will write CGI programs with infinite loops. Accessing these CGI programs causes an endless amount of data to be sent to the browser. Using Firefox to access these URLs often causes the entire browser to lock up as shown below.

So I'm stuck using the Task Manager to kill Firefox and then restart the browser, just because one tab has gone haywire. This adds several minutes to my grading time each time I have to restart. Internet Explorer doesn't do much better; only Chrome lets me kill the offending tab without restarting the entire browser.

I suppose this wouldn't bother me so much if it wasn't such an obvious pitfall that a seasoned browser can't accommodate.

Listen up GUI students... this is a prime example of when threading is necessary so your UI thread can continue to respond to the user!

Saturday, November 20, 2010

Get flag images from Wikipedia

I needed a large number of national flag images at a certain resolution for a project my Web Development course was working on. By examining some flag images I saw on Wikipedia, I noticed that they were creating flag images on the fly.

For example, to create the United Kingdom's flag that is 100 pixels wide, you can access this URL:

http://upload.wikimedia.org/wikipedia/commons/thumb/4/45/Flag_of_the_United_Kingdom.svg/100px-the_United_Kingdom.png

which produces this flag:

So I developed a Perl script that would automate this process for me. I've included it below for anyone else who might need flag images. Note that I had to set the user agent string, or Wikipedia would not respond properly to the HTTP request. If you use this script to download a lot of images, please be nice and throttle your requests with the sleep() command.


#!/usr/bin/perl

# This script will attempt to download the national flag
# produced by Wikipedia using the $flag country name and 
# $image_size as the image width.  By Frank McCown.

use strict;
use warnings;
use LWP::UserAgent;   # the script calls LWP::UserAgent->new directly

# Width of the image
my $image_size = 100;

# Country's name
my $flag = 'the United Kingdom';

my $filename = lc $flag;
$filename =~ s/\s/_/g;
$filename = $filename . "_" . $image_size . ".png";

my $url_filename = $flag;
$url_filename =~ s/\s/_/g;

my $img_url = "http://upload.wikimedia.org/wikipedia/commons/thumb/4/45/Flag_of_" .
$url_filename . ".svg/" . $image_size . "px-" . $url_filename . ".png";

print "Getting $img_url\n";

my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 Firefox 5.6');
$ua->from('your@email.com');

my $response = $ua->get($img_url);

if ($response->is_success) {
   print "Writing to $filename\n";

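   # Three-argument open with a lexical filehandle; stop if the file can't be written
   open(my $img, '>', $filename) or die "Cannot write $filename: $!";
   binmode($img);
   print {$img} $response->content;
   close $img;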
}
else {
   print "ERROR: Could not download.\n";
}
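One caveat about the script above: the "4/45" part of the URL is hard-coded. As far as I know, MediaWiki's hashed upload directories take those two components from the MD5 hash of the file name, so they would differ from flag to flag. Below is a minimal sketch of how the path could be computed when fetching several flags. The helper name thumb_url is made up for illustration. The "Flag_of_<country>.svg" naming and the "<width>px-<file name>.png" thumbnail name are assumptions, and whether a file with that exact name exists on Commons is not checked.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a thumbnail URL for a flag (hypothetical helper, not from the
# original script).
# ASSUMPTION: the two directory levels are the first one and the first
# two hex digits of the MD5 of the file name (underscores, not spaces).
sub thumb_url {
    my ($country, $width) = @_;

    (my $name = "Flag_of_$country.svg") =~ s/\s/_/g;
    my $md5 = md5_hex($name);
    my $dir = substr($md5, 0, 1) . '/' . substr($md5, 0, 2);

    return "http://upload.wikimedia.org/wikipedia/commons/thumb/"
         . "$dir/$name/${width}px-$name.png";
}

# Print the URLs for a few countries. A real download loop would call
# $ua->get() on each URL and pause between requests.
for my $country ('the United Kingdom', 'France', 'Japan') {
    print thumb_url($country, 100), "\n";
    # sleep 1;   # uncomment when actually downloading, to be polite
}
```

Each URL could then be passed to $ua->get() exactly as in the script above, with a sleep() between requests.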

Thursday, November 04, 2010

New Web Science course offered in Spring 2011

I'll be offering an Introduction to Web Science course this Spring. Web Science is an emerging field of study which encompasses computer science, law, economics, and a number of other disciplines. This course is for upper-level CS majors and will therefore focus mainly on computing aspects of Web Science. Below is a description of the course. If you are a Harding CS major looking for a challenging and enlightening elective, I hope you'll consider taking it.

The Web has fundamentally changed how we learn, play, communicate, and work. Its influence has become so monumental that it has given birth to a new science: Web Science, or the science of decentralized information structures. Although Web Science is interdisciplinary by nature, this course will be focusing mainly on the computing aspects of the Web: how it works, how it is used, and how it can be analyzed. We will examine a number of topics including: web architecture, web characterization and analysis, web archiving, Web 2.0, social networks, collaborative intelligence, search engines, web mining, information diffusion on the web, cloud computing, and the Semantic Web.

Programming projects will use Python, HTML & JavaScript, some Google APIs, and the Facebook API.

Prerequisites: COMP 245 & 250

Friday, October 08, 2010

Facebook adds ability to download your Facebook data

A few years ago I thought it would be really helpful to create a tool that would allow anyone to archive their Facebook account, just in case something happened to it. Think about it... 20 years from now, wouldn't it be interesting to see what was going on in your day-to-day life? And what if Facebook were to start charging fees to access your account or, Lord forbid, to disappear?

Last year we finally released the ArchiveFacebook Firefox add-on which allows you to save your Facebook account to your hard drive, just as it appears in your web browser.

My hope was that this tool would have a limited life span. I wanted it to nudge Facebook into providing a method to download and even transport user data to other social networks. Finally, it looks like Facebook has caved in.

Coming soon, you will have the option to download a zip file from Facebook that contains all your wall posts, photos, messages, etc. You can browse the contents of the zip file in your browser.

The video below shows how this will work.

I have not yet been given access to the feature, but I will report back later once I've had a chance to use it. I'm not sure if it will be possible to upload the archived data into another social network. My guess is that someone will need to write a program that converts the zip file into an open format that can then be transported.

Thank you to Carlton Northern, Hany SalahEldeen, and others who have put a lot of time into making the numerous and painful modifications needed to keep ArchiveFacebook working as Facebook made website changes. It may finally be time for it to retire.

Update on 10-20-2010

I was able to download my entire Facebook account today. Only a few minutes after I requested the archive, Facebook made it available to me as a 6MB zip file. As you can see below, it's a spartan set of pages with all your Wall posts, photos, messages, etc.:

I scrolled down the very long Wall page and found my very first Wall post dated September 28, 2006 at 9:03 pm from my friend Stacey: "Welcome to the ridiculous! How's Bean? How are you?" According to the Facebook Wikipedia article, this was two days after Facebook had opened to the general public. I guess that makes me an early adopter (for once). ;-)

One technical problem I ran across: Facebook has mangled the image src attribute (src="../photos%2FProfile%20Pictures%2F514544861521.jpg" should be src="../photos/Profile%20Pictures/514544861521.jpg"), so I couldn't see my Photos in Firefox. I had no problem seeing them in Chrome.
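For anyone hitting the same problem, here is a minimal sketch of how the mangled src attributes could be repaired after unzipping the archive. It only rewrites %2F inside src="..." values and leaves everything else alone. The helper name fix_src and the surrounding img tag in the sample are made up for illustration; only the attribute value comes from my archive.

```perl
use strict;
use warnings;

# Decode %2F back to "/" inside src attributes only (hypothetical helper).
sub fix_src {
    my ($html) = @_;
    $html =~ s{(src=")([^"]*)(")}{
        my ($open, $path, $close) = ($1, $2, $3);
        $path =~ s|%2F|/|gi;          # leave %20 and other escapes untouched
        $open . $path . $close;
    }gex;
    return $html;
}

my $broken = '<img src="../photos%2FProfile%20Pictures%2F514544861521.jpg" />';
print fix_src($broken), "\n";
# prints: <img src="../photos/Profile%20Pictures/514544861521.jpg" />
```

The same substitution could be applied to each HTML file in the archive by reading it in, passing the contents through fix_src, and writing the result back out.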

Saturday, September 18, 2010

Memento Browser for Android is available

I've just created a home for the Memento Browser for Android, a project I started working on this past summer. The free Android app allows you to view older versions of web pages by merely selecting a date. The browser uses the Memento protocol to find archived versions of the page and displays whatever page is closest to the requested date.

For example, the screenshot below shows the browser viewing cnn.com:

If you wanted to see what this page looked like on Sept 7, 2007, you could select that date, and in a few seconds be looking at this archived page from WebCite:

Note that the page displayed is actually one day later than the requested date. That's because the browser was not able to find an archived copy on the exact date requested. The browser is only displaying archived copies from Internet Archive, WebCite, and a few other archives. While they have a huge amount of the web archived, they certainly don't have everything archived.
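To illustrate the "closest date" idea, here is a minimal sketch that picks the capture date nearest to a requested date from a list. This is not the app's actual code, and it is not how the Memento protocol itself negotiates dates. The helper name closest_copy and the capture dates are made up for illustration.

```perl
use strict;
use warnings;
use Time::Piece;

# Return the capture date nearest to the requested date (hypothetical helper).
# All dates are "YYYY-MM-DD" strings.
sub closest_copy {
    my ($requested, @captures) = @_;
    my $target = Time::Piece->strptime($requested, '%Y-%m-%d')->epoch;

    my ($best, $best_gap);
    for my $date (@captures) {
        my $gap = abs(Time::Piece->strptime($date, '%Y-%m-%d')->epoch - $target);
        ($best, $best_gap) = ($date, $gap)
            if !defined $best_gap || $gap < $best_gap;
    }
    return $best;
}

# Made-up capture dates, for illustration only.
my @captures = ('2007-08-30', '2007-09-08', '2007-09-20');
print closest_copy('2007-09-07', @captures), "\n";
# prints: 2007-09-08
```

With these sample dates, a request for Sept 7, 2007 returns the copy from one day later, which mirrors the situation described above.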

You can download Memento Browser here. I am working on an iPhone version of the app with a colleague of mine, but I don't have an ETA for it yet.

If you don't have an Android device, you can still download the MementoFox add-on for the Firefox browser which does the same thing.

Finally, you can watch a demo of the browser in action here.

Friday, September 03, 2010

Loving my Droid X

I've had it less than a week, and I'm hooked. The Droid X's screen is large (4.3"), sharp, and bright. The keypad is easy to type on, the touch interface is extremely accurate and responsive, and reaction times are quick. The video playback is fantastic. It's quick to connect to the Verizon 3G network or local wifi, and I've gone nearly 3 days without having to recharge. It's running Android 2.1, but Froyo (2.2) is supposedly coming out soon.

Below is a photo I took from chapel this morning using the Droid X's 8-megapixel camera. Not bad considering the lighting. I'm standing in the pit at the front of the auditorium with my fellow faculty members. (The singing, as you could imagine, is awesome with a packed auditorium.)

One small negative: When I first entered my Facebook account credentials, it sucked up all my "friends" and put them in my list of contacts. Now my 500+ contact list is full of people I haven't seen in years and certainly don't contact on a day-to-day basis. If I want to remove an individual from my list of contacts, I'm told I have to remove them from my list of friends on Facebook. Boo.

I haven't added many apps yet (the phone comes with approximately 30 apps pre-installed). But one I did add was the Bible app from YouVersion. It allows me to simply say "Genesis chapter five verse twenty", and boom, I'm there. Won't this be fun to play with during Bible class on Sunday. ;-)

Hey Apple, when are you going to create an iTunes for Android? This is probably the only reason I will hang on to my iPod Touch for now. (Yes, I've heard of doubleTwist, and I'll give it a try soon.)

Any other apps I should install?

Tuesday, August 31, 2010

Some computing history

The fall semester is in full swing here at Harding, and I've decided to convert some of my notes on historical events in computing to slides. If you are interested, here are my slides on Internet and Web history and history of graphical user interfaces (GUIs). I'll admit the GUI slides are slanted toward Microsoft because we focus on Windows programming in my GUI course.

I'm still working on my general history of computing and will post an update later.

Tuesday, August 17, 2010

Why I left Wikipedia

An article in this week's Newsweek reports that Wikipedia has been floundering since the spring: "Thousands of volunteer editors, the loyal Wikipedians who actually write, fact-check, and update all those articles, logged off-- many for good." The WSJ first reported the fallout almost a year ago when it was discovered that 49,000 English editors left Wikipedia during the first three months of 2009 compared to a loss of 4,900 during the same period in 2008.

Update: As one of the comments below states, the WSJ article was hasty in its conclusions. It all hinges on what you call an "editor", and a more balanced definition suggests that editors are not leaving Wikipedia in droves.

As the Newsweek article points out, there are a number of reasons why Wikipedia may be stagnating. There are so many articles already present that there is little new ground to break. Some may be scared away or frustrated by overly aggressive editors. Or perhaps "most people simply don't want to work for free."

Some research at Georgia Tech shows that editing a Wikipedia article is very challenging for computing newbies; the "Editing this way will cause your IP address to be recorded publicly" message causes lots of confusion, and this certainly prevents many from joining the ranks of Wikipedia editors.

I have always been a Wikipedia fan. I first started making serious contributions in 2004 when I was beginning my PhD research and discovered that many of the new concepts I was being introduced to simply didn't exist in Wikipedia.

I wrote a number of articles from scratch like web archiving, web search query, adversarial information retrieval, and URL normalization and made a significant number of edits on other technical topics. I was motivated in part by being the first to write the articles and the fact that I would likely refer back to them as reference material as I continued my research.

However, I found that keeping vandalism at bay and fighting poor edits was quite time-consuming. Some articles that I valued quite highly like web crawler needed tons of work, and although the desire was there, I just didn't have the time... I was trying to complete my PhD, and maintaining Wikipedia articles was not paying the bills.

I had an ah-ha moment at a conference a few years ago when someone quoted from Wikipedia's article on digital preservation, and I could have sworn I had been the sole author of the quoted piece. Wikipedia was given credit as the source, not me. That didn't bother me all that much, but it did make me realize that contributing to Wikipedia is often not in the interests of academics who are often judged by the amount of citable material they produce. Someone citing what you wrote in Wikipedia doesn't "count" like someone citing what you wrote in a journal article.

Over the past year or so, I have just lacked the motivation necessary to put time into an anonymous forum. My time is expensive, and Wikipedia is not paying. It's hard enough just to find time to edit my blog!

I still think Wikipedia is extremely valuable, and I hope it never goes away. I regularly send my students there and encourage them to make a serious contribution.

Have you seen The Book of Eli? At the end of the movie, a group of people are attempting to restore some of the greatest literary works of mankind. They are quite happy to have nearly a complete set of Britannica encyclopedias. No mention is made about the remnants of Wikipedia. :-(

Thursday, August 05, 2010

Students needed to work the WAC

I just received word that my grant proposal with the NSF has been funded. The project is called the "Web Archive Cooperative" or WAC. It's a 3-year grant with Hector Garcia-Molina (Stanford University), Andreas Paepcke (Stanford University), Michael L. Nelson (Old Dominion University), and myself.

In short, the WAC is our attempt to provide services, tools, and data access to web scientists. We are researching methods to provide access to web data like query logs, tag annotations, blogs, profiles and Twitter messages that are often located in disparate archives. We are working on finding this data, building software tools for combining and analyzing the data, and methods to preserve the data for the long term.

What this means is that I will be looking for some highly talented/motivated CS students (currently enrolled at Harding) to work with me over the next 3 years during the summers. You will get to work closely with me and in conjunction with others at Stanford and ODU, and you will receive a stipend. If you think this is something you'd like to get involved with, please let me know.

Tuesday, July 13, 2010

CS library book analysis

For the past three years I've been the designated faculty member in charge of ordering computing books for our campus library. Although it can be tedious at times, I usually enjoy the job, especially since it gives me the chance to browse through the latest books on computing and order pretty much what I would like (and, of course, what I think the students would like). :-)

The other day I got to wondering though, how many of these books are our students actually checking out? Are paper-bound books still useful to them when you can find so much information on the Web?

So several months ago I asked our librarian to give me some usage data on our computing books. I was only able to analyze the data this week, and what I found was somewhat surprising.

First, to see the relative age of books in our library, I created a histogram of the 998 books based on publication date:

The earliest book (Computers and Society, edited by Nikolaieff) is from 1970. Almost half of the books (45%) were published between 1999 and 2003. Only three books published this year had made it into the library by the time this data was obtained.

The check-out data was from 2001 to present. Out of 998 books, 22% have never been checked out (at least since 2001). Eighteen percent have only been checked out once, and only 25% have been checked out more than five times.

Below is a histogram with log scale showing how many times our books have been checked out. The largest bar on the left is the 75% chunk of books that have been checked out 0-5 times. There are only two books that have been checked out 31-35 times and only one book that has been checked out more than 40 times.

In case you were wondering, here are the top 10 most frequently checked-out computing books, along with the book's publication date and number of times checked out. Many of these books are not surprises:

1. Introduction to Algorithms by Cormen, Leiserson, & Rivest (1990) - 41
2. C++ Primer Plus: Teach Yourself Object-Oriented Programming by Prata (1995) - 35
3. Applied cryptography: Protocols, algorithms, and source code in C by Schneier (1994) - 35
4. Design Patterns: Elements of Reusable Object-Oriented Software by Gamma et al. (1995) - 30
5. C++ How to Program: Introducing Object-Oriented Design with the UML by Deitel & Deitel (2001) - 26
6. Computer Virus Crisis by Fites, Johnston, & Kratz (1992) - 26
7. PASCAL: Programming and Problem Solving by Leestma & Nyhoff (1990) - 25
8. Mythical Man-Month: Essays on software engineering by Brooks (1995) - 25
9. C#, A Programmer's Introduction by Deitel et al. (2003) - 25
10. HTML and CGI Unleashed by December & Ginsburg (1995) - 25

So what about the books that no one checks out? Browsing through the list, I see what I assume would be very popular books like Pattern Hatching: Design Patterns Applied by Vlissides (1998), Object Oriented Perl by Conway (2000), User Interface Design for Programmers by Spolsky (2001), SQL in a Nutshell by Kline et al. (2004), and iPhone SDK 3 Programming by Ali (2009).

To get a better overall picture, I looked at the percentage of books by publication year that have been checked out (at least once since 2001) as shown below.

There is an even decline in check-out rates from 1995 on, which suggests that the longer a book is around, the more likely it is to be checked out. That certainly makes sense; however, the longer most computing books are around, the less useful they become.

For example, Designing with Web Standards by Zeldman (2007) has been checked out five times. This is arguably a relevant book, at least until HTML5 is released as a new web standard; then its value plummets. Browsing through the titles of our books, many of them fall into this category. Even among our most checked-out books, several of them are somewhat outdated (3?, 6, 7, 9, 10). This is the greatest problem I face when purchasing CS books for the library... I try to purchase books that I think will be immediately useful to our students and at the same time have a shelf-life greater than one year. It's not an easy balance to maintain.

Returning to my original question, are library books still used by our CS students? The data seems to suggest that a fair number of books are eventually checked out at least once. However, if we estimate that a book costs around $50, and 218 books have never been checked out, that means $10,900 worth of books are sitting unused on the library shelves. Ouch.
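The arithmetic behind that estimate is simple enough to check with a few lines of Perl. This is only a minimal sketch: the counts come from the library data above, the $50 price is the rough estimate already mentioned, and the helper names are made up for illustration.

```perl
use strict;
use warnings;

my $total_books   = 998;   # computing books in the library data
my $never_checked = 218;   # books with zero check-outs since 2001
my $cost_per_book = 50;    # rough estimate, in dollars

# Hypothetical helpers for the two calculations.
sub unused_cost { my ($count, $price) = @_; return $count * $price; }
sub percent     { my ($part, $whole)  = @_; return 100 * $part / $whole; }

printf "Never checked out: %.1f%% of %d books\n",
    percent($never_checked, $total_books), $total_books;
printf "Estimated unused value: \$%d\n",
    unused_cost($never_checked, $cost_per_book);
# prints:
#   Never checked out: 21.8% of 998 books
#   Estimated unused value: $10900
```

The 21.8% figure is the 22% quoted earlier before rounding.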

Of course, a more thorough analysis would involve surveying our students about their library usage. Why are they checking out a particular book? Are they actually reading what they check out? Is the information they are seeking in the book they've checked out? Are they finding what they need in the library? Are they finding equivalent information on the Web and therefore don't need the book? This would certainly make for an interesting study.

So, do you still find computing books useful? Should we be purchasing fewer books? What would be a better use for the money?

Monday, July 12, 2010

Social networking workshop wrap-up

Last week I attended the HarambeeNet Workshop on Social Networks in Education at Duke University. There were approximately 40 other academics and researchers at the NSF-funded workshop which focused on using social networks and related topics to encourage broader participation in computer science. It was good to see some old friends and make some new ones and enjoy the beautiful Duke campus.

There were a number of excellent presentations and lots of new information. I took some notes and occasionally tweeted, but what I thought was fantastic was using Ning for sharing links, slides, and other resources (sorry, but you can't access the link apparently without a password). Ning allows you to have something like your own private Facebook space.

The biggest thing I took away from the workshop was the desire to integrate some social media into my intro to computing and web development courses. There are so many neat things you can do, like analyze tweets for spam, look for Wikipedia edit wars, and build networks from blog links. I'm hoping to develop some creative CS1 assignments in this area which I'll likely talk about here in the future.

Our keynote on Friday was Jon Kleinberg who is probably best known for his HITS algorithm. Kleinberg recently taught an interdisciplinary course on networks at Cornell and wrote a book with David Easley on the topic: Networks, Crowds, and Markets. After hearing Kleinberg's presentation, I'd love to offer a similar course at Harding.

It was also interesting talking to Ben Shneiderman who worked as an expert witness in the Apple vs. Microsoft case when Apple tried to copyright the GUI. While riding from the airport to the hotel, Shneiderman shared with me that what pushed Jobs to start litigation was when Windows 2 introduced overlapping windows; Windows 1 only had tiled windows which apparently didn't upset Jobs. Shneiderman also presented at the workshop his push for Technology-Mediated Social Participation and a visualization tool for networks using Excel: NodeXL.

Friday, July 02, 2010

The problem with measuring professor quality

Professors James West and Scott Carrell published an article last month in the Journal of Political Economy: Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors. The article has actually been out for a few years, but this is the first time I came across it. Inside Higher Ed has a review of the study from 2008.

The authors looked at student scores and professor evaluations at the U.S. Air Force Academy from 1997 to 2007 and, in their own words, found this:

Results show that there are statistically significant and sizable differences in student achievement across introductory course professors in both contemporaneous and follow-on course achievement. However, our results indicate that professors who excel at promoting contemporaneous student achievement, on average, harm the subsequent performance of their students in more advanced classes. Academic rank, teaching experience, and terminal degree status of professors are negatively correlated with contemporaneous value-added but positively correlated with follow-on course value-added. Hence, students of less experienced instructors who do not possess a doctorate perform significantly better in the contemporaneous course but perform worse in the follow-on related curriculum.

Student evaluations are positively correlated with contemporaneous professor value-added and negatively correlated with follow-on student achievement. That is, students appear to reward higher grades in the introductory course but punish professors who increase deep learning (introductory course professor value-added in follow-on courses). Since many U.S. colleges and universities use student evaluations as a measurement of teaching quality for academic promotion and tenure decisions, this latter finding draws into question the value and accuracy of this practice.

To sum up, the study found:

1) Students will score better in their intro courses when taught by less experienced professors, but they will do more poorly in subsequent courses.

2) Students will rate their professors higher when they get better grades in their intro courses.

In other words, if you teach a very rigorous course and do a really good job at preparing your students for success once they leave your classroom, you are likely to be punished for it with lower teacher evaluations. And if you make your class easy and everyone gets an A, you'll be rewarded with great evaluations. I suppose you still may be punished later with angry emails from students who can't pass their subsequent courses. :-)

The study certainly draws into question how much importance we should place in rating professors based on their teacher evaluations.

Monday, June 21, 2010

Memento: Adding time capabilities to the Web

This summer I'm working on a research project adding Memento support to the Android platform. I'll talk more about my project at a later date, but first I want to provide a quick overview of Memento.

Memento is an architecture which allows a web browser to seamlessly access older versions of web pages. It allows you to "time-travel" on the Web.

The best way to explain this is with an example. If you were to access cnn.com, you would be presented with today's version of the page. But what if you wanted to see how it looked one year ago? You would need to visit the Internet Archive's Wayback Machine to find a list of old copies of the page they had archived, and you would need to click on one of the links. And if IA didn't have the page archived, you would have to search other web archives for the archived version. This is potentially a lot of work.

Memento makes access to archived versions of a page transparent to the user. Using a web browser that supports Memento, you would only need to visit the URL as you normally would and supply a desired date... the browser would automatically locate the archived page from that date and display it to you without the need to manually search through multiple archives.

You can see this in action right now by using the Memento Firefox add-on. Below is a screen shot using the add-on to browse cnn.com as it appeared on July 9, 2009. I actually told the add-on to show the June 21, 2009, version, but the July page (from the European National Archives) is the closest page that was found in any archive. This is not a failing of Memento... it's a limitation of web archiving in general.
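The "closest page" idea can be sketched in a few lines of JavaScript. This is only an illustration of the selection step, not the add-on's actual code: the function name is my own, and the list of capture dates is made up apart from the July 9, 2009, copy mentioned above.

```javascript
// Pick the archived copy whose capture date is nearest the requested date.
// Hypothetical helper; dates are ISO strings (YYYY-MM-DD).
function closestCapture(requested, captures) {
  const target = Date.parse(requested);
  let best = null;
  let bestGap = Infinity;
  for (const capture of captures) {
    const gap = Math.abs(Date.parse(capture) - target);
    if (gap < bestGap) {
      best = capture;
      bestGap = gap;
    }
  }
  return best; // null when no archive holds a copy
}

// Invented capture dates, except for July 9, 2009.
const captures = ["2008-01-15", "2009-07-09", "2010-02-01"];
console.log(closestCapture("2009-06-21", captures));
```

Asking for June 21, 2009, returns the July 9 copy because nothing nearer exists in the list, which is the same situation as in the screen shot.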

Memento uses HTTP content negotiation to add this time dimension to the Web. Rather than discuss the technical details here, I'll point you to the Memento Guide Intro if you're interested. Ideally, all web browsers and web servers in the future will support the Memento HTTP headers, and no special add-on will be necessary.
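To give a rough feel for the negotiation, here is a minimal sketch of how a client might describe the date it wants. I'm assuming the Accept-Datetime header name from the Memento specification as later published (RFC 7089); earlier implementations may have used a different name, and the function itself is hypothetical. Nothing is sent over the network here.

```javascript
// Build the pieces of a datetime-negotiated request (no request is sent).
// "Accept-Datetime" is assumed from RFC 7089; the helper name is made up.
function buildTimeTravelRequest(url, year, month, day) {
  // Date.UTC takes a zero-based month, so subtract 1.
  const when = new Date(Date.UTC(year, month - 1, day));
  return {
    url: url,
    // toUTCString() yields the HTTP-style date format the header expects.
    headers: { "Accept-Datetime": when.toUTCString() }
  };
}

const request = buildTimeTravelRequest("http://cnn.com/", 2009, 6, 21);
console.log(request.headers["Accept-Datetime"]);
// prints "Sun, 21 Jun 2009 00:00:00 GMT"
```

A server that understands the header can then answer with the archived copy closest to that date.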

Memento is the brain-child of Michael Nelson (Old Dominion Univ) and Herbert Van de Sompel (LANL). It's made quite a stir in the past year with a write-up in New Scientist, a paper at the Linked Data on the Web workshop (LDOW2010), and some significant funding from the Library of Congress. Tim Berners-Lee said this about Memento: "This is neat; there is a real need for this."

Friday, June 18, 2010

A Dislike button for yourself

I've been playing with Facebook's new Like button which you can place on any of your web pages. When someone clicks on the Like button, it shows up in their Facebook account. All you have to do is insert an iframe into your webpage. Here's what my iframe looks like for my Harding home page:

<iframe src="http://www.facebook.com/plugins/like.php?href=http%3A%2F%2Fwww.harding.edu%2Ffmccown%2F&layout=standard&show_faces=false&width=450&action=like&colorscheme=light&height=35" 
scrolling="no" frameborder="0" style="border:none; 
overflow:hidden; width:450px; height:35px;" allowTransparency="true"></iframe>

It got me wondering though, what if someone wanted to dislike me? So I created a Dislike button and rigged up a little JavaScript to display a snarky message when someone clicks on the button. You can see it on my home page, right under the Like button.

If you'd like to make your own Dislike button, just follow these steps:

Place this image on your website:

Use the following HTML to display the Dislike button and a place-holder for some text:

<a href="javascript:Dislike()"><img
src="dislike.png" width="61" height="24"
title="Click here to dislike this item" border="0" /></a>
&nbsp;
<span id="dislike_text" style="position:relative; top:-7px; 
font-family:Tahoma; font-size:8pt;"></span>

Finally, insert this JavaScript function, which randomly chooses a message to display, somewhere in your page:

function Dislike()
{
   var responses = [
      "I'm not crazy about you either.",
      "Whatever.", "Seriously?", 
      "My feelings are so hurt.",
      "I'd rather be feared than liked.", 
      "Nice try."];
  
   // Get a random number between 0 and responses.length - 1 
   var num = Math.floor(Math.random() * responses.length);
   document.getElementById("dislike_text").innerHTML = responses[num];
}

Note that this is only for fun, and it will not show up in Facebook.

If you are really interested in having a Dislike button that works in Facebook, check out this Firefox plug-in.
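If you want to tinker with the random selection in the Dislike() function above, one option is to pull it out into its own function. The sketch below is a variation of mine, not what runs on my home page: pickResponse and its optional random parameter are made-up names, and passing in the random source just makes the behavior easy to check.

```javascript
// Return one entry from responses, chosen by the supplied random source.
// random defaults to Math.random, which returns a value in [0, 1).
function pickResponse(responses, random = Math.random) {
  const index = Math.floor(random() * responses.length);
  return responses[index];
}

const messages = ["Whatever.", "Seriously?", "Nice try."];
console.log(pickResponse(messages));              // any of the three
console.log(pickResponse(messages, () => 0));     // prints "Whatever."
console.log(pickResponse(messages, () => 0.999)); // prints "Nice try."
```

Because Math.random never returns 1, the computed index always stays between 0 and responses.length - 1.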

Wednesday, June 16, 2010

Visualizing data with Protovis

In this month's Communications of the ACM, Heer, Bostock and Ogievetsky provide an eye-opening overview of the latest visualization techniques in A Tour Through the Visualization Zoo. The graphs and figures in the article were all produced with Protovis, an open-source library developed at Stanford for producing visualizations using JavaScript and SVG.

I've reproduced a few of my favorite visualizations from the article below.

The first is an index chart which shows the relative change in stock prices of several tech companies over time. The gain/loss factor is based on a fixed point in time (the red line). You can see some correlation between the stock prices, and clearly Apple is the overall winner. The online version allows you to change the location of the red line.
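The calculation behind an index chart is simple enough to sketch. This is my own illustration rather than the Protovis code, and the prices are invented: every value is expressed as a gain or loss relative to the price at the reference point (the red line).

```javascript
// Express each price as a relative change from the price at referenceIndex.
// Hypothetical helper; 0 means "same as the reference", 0.5 means +50%.
function indexSeries(prices, referenceIndex) {
  const base = prices[referenceIndex];
  return prices.map(price => (price - base) / base);
}

// Made-up prices; the reference point is the second entry.
const prices = [50, 100, 150, 80];
console.log(indexSeries(prices, 1)); // [ -0.5, 0, 0.5, -0.2 ]
```

Moving the red line just means calling the function again with a different referenceIndex.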

This is a choropleth map showing obesity rates in the US in 2008. The online version allows you to slide the date back to 1995. As you move the dates forward, you can see how the US population gets fatter, with the highest obesity rates concentrated in the South. There's a saying here in Arkansas: At least we're not Mississippi! ;-)

When you view the obesity rates with a Dorling cartogram, you see that Arkansas actually has fewer obese people than Colorado!

The next couple of graphs show how to visualize networks. Both visualize character co-occurrence in chapters from Victor Hugo's novel Les Misérables. The first is a force-directed layout. The node in the middle is, of course, Valjean. The online version allows you to pull nodes around and watch them re-arrange themselves.
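The data behind a graph like this can be sketched as a count of how often each pair of characters shows up in the same chapter. The code below is a rough illustration with made-up chapter lists, not the actual data set or the Protovis code, and the function name is my own.

```javascript
// Count, for every pair of characters, how many chapters they share.
// Returns a Map from "NameA|NameB" (names in sorted order) to a count.
function coOccurrenceEdges(chapters) {
  const weights = new Map();
  for (const characters of chapters) {
    // Drop duplicates and sort so each pair gets a single, stable key.
    const unique = [...new Set(characters)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = unique[i] + "|" + unique[j];
        weights.set(key, (weights.get(key) || 0) + 1);
      }
    }
  }
  return weights;
}

// Invented chapter casts, for illustration only.
const chapters = [
  ["Valjean", "Javert"],
  ["Valjean", "Cosette", "Marius"],
  ["Valjean", "Javert", "Cosette"]
];
const edges = coOccurrenceEdges(chapters);
console.log(edges.get("Javert|Valjean")); // 2
```

Each key becomes an edge in the network, and the count can drive the edge's thickness.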

The nodes can also be rearranged in an arc diagram as shown below.

Monday, June 07, 2010

Build your own C# screen saver

I wrote a tutorial years ago about creating a Windows screen saver using the C# language. I usually let my GUI students do it as an extra credit assignment.

The tutorial was really showing signs of age, and last week I was finally able to update it. You can find it here. If you'd prefer to just jump straight to the finished product, you can download the project here. I used Visual Studio 2010 to write the demo code.

One more thing to cross off my summer to-do list.

Wednesday, June 02, 2010

Professor's plagiarism case against colleague

Inside Higher Ed reports that a Math/CS professor at Bethel University has accused another professor in the same department of plagiarizing his CS1 labs. The accuser, Benjamin Shults, says that he gave Brian Turnquist his labs for class use which had Shults' name on top. Turnquist subsequently replaced Shults' name and other identifying marks (like swapping out images of Shults with Turnquist's). The modified labs were placed on Turnquist's website.

Turnquist was found guilty of "innocent infringement" by Bethel's Grievance Review Committee, was forced to give Shults an apology, and had to remove all the labs from his website. Apparently Shults is not happy with the committee's slap on the wrist. He has not received an apology yet (see the comment at the end of the article entitled "corrections"), and he's created a website that documents the whole affair.

When I read this article, I immediately thought of all those assignments and labs I had borrowed from my fellow faculty members and the labs/assignments I have given to others to use in their classes. I have always removed their names on the documents and thought nothing of it... I assumed they were doing the same. But I have also never posted the documents on a website for public consumption.

Our department is now discussing this issue, and I think we'll all be more diligent about giving credit when modifying another's lab/assignment for class use. However, we all agreed that the best way to handle a situation like Bethel's is to go directly to the offending party.

Since Bethel is a Christian university like Harding, I couldn't help but wonder how applying the biblical principles set forth by Jesus and Paul would have drastically changed the situation.

Matthew 18:15:

"If your brother sins against you, go and tell him his fault, between you and him alone. If he listens to you, you have gained your brother."

Matthew 5:38-39:

"You have heard that it was said, 'An eye for an eye and a tooth for a tooth.' But I say to you, Do not resist the one who is evil. But if anyone slaps you on the right cheek, turn to him the other also."

1 Corinthians 6:6-7:

"But brother goes to law against brother, and that before unbelievers? To have lawsuits at all with one another is already a defeat for you. Why not rather suffer wrong? Why not rather be defrauded?"

Friday, May 21, 2010

Braden: From 1 to 12 months

My youngest is one today. Just as we did with our oldest son, we took pictures of Braden next to marshmallows every month so you could see how he grows. I'm a little biased, but this guy is one cute kid!

Month 1

Month 2

Month 3

Month 4

Month 5

Month 6

Month 7

Month 8

Month 9

Month 10

Month 11

Month 12

Getting these photos was not easy. Here are a few out-takes:

Wednesday, May 12, 2010

Harding students release Xbox game

Two of my Game Programming students from last fall, Nathan Willingham and Seth Ringling, developed a real-time strategy game called Scatha for their final project. Nathan and Seth put some more time and effort into the project and have now released the Xbox 360 game for purchase.

I'm really proud of the effort these guys have put into their first Xbox game. Not only did they write every line of code in the game, they also created all the graphics. If you enjoy playing RTS games, I think you're going to enjoy playing Scatha.

Update on 7/22/10:

Seth and Nathan have now released a second game called Four Corners and are working on a third title. Their game company is called Living Creature Studios.

Tuesday, May 11, 2010

Good-bye CS graduates

The Spring semester is over, and our department graduated 9 students. Not everyone made it into the photo below, but most are there. This was once again a very talented group of students, and they'll be missed. Thankfully they are entering a good economy with good job prospects, and most of them have already accepted a job.

Tuesday, May 04, 2010

Apple vs. Google vs. Microsoft War Chart

This infographic by Shane Snow beautifully illustrates the epic battle that Apple, Google, and Microsoft are engaged in. From ebooks and phones to office and email, all three giants are battling for more users.

Although many of these services are free to the end users, there are billions of dollars in revenue at stake. Click on the map to zoom in.

Friday, April 30, 2010

I'm at WWW2010 in Raleigh, NC

I've been here at WWW2010 since Tuesday, but I've been using Twitter to post some of my thoughts on the conference and have been neglecting my blog (yes, I twoatted all over the place). This is only the second time I've attended this conference, and once again I've really enjoyed it. I'm leaving for Searcy in a few hours, so I'll quickly sum up my impressions.

The keynote speakers were good, but I particularly enjoyed Carl Malamud's talk on being a "rebel". Carl didn't have a PowerPoint presentation, but he didn't need one; his stories about liberating tax-funded data and other exploits were very entertaining. "When I first saw Tim Berners-Lee's Web prototype, I thought to myself, 'That's nice, but it will never scale.'"

I also enjoyed a presentation this morning by Damon Horowitz of Aardvark, a social search engine that was recently purchased by Google. While listening to the presentation, I couldn't help but wonder, "Why didn't I think of that?" As Michael Nelson reminded me at lunch, the best ideas usually result in that response.

Other highlights include the panel Search is Dead! Long Live Search! that examined the future of web search and the Media on the Web developer's track which highlighted some incredible things HTML5 can do.

I'm excited to incorporate some of the things I've seen this week into the next offering of my search engine course. I'm even more excited to see my family again.

Thursday, April 22, 2010

George W. Bush speaking tonight at Harding University

Tonight our former President will be addressing a very pro-Bush Harding audience. Around 2000 students, 700 faculty and staff, and a few hundred other guests will be packed into the Benson Auditorium to hear Bush talk about...? My guess is Bush has a finely tuned speech for addressing college students, packed with jokes about Texas, Democrats, and how even C students can someday be President.

Update

Although I wasn't able to attend last night's talk, I was told by many of my students that Bush came across as very eloquent, knowledgeable, and by some accounts "inspirational". This might be a surprise to many who are more familiar with his public gaffes. My parents (who used my tickets) were also quite impressed.

When asked what was the most difficult decision he made as president, Bush said that it was sending in additional troops to Iraq ("The Surge").

Friday, April 16, 2010

When Twitter is gone, your Tweets will live on

The Library of Congress announced on Wednesday that

"Every public tweet, ever, since Twitter’s inception in March 2006, will be archived digitally at the Library of Congress."

That's right... every thoughtless, trivial, and crass remark you ever tweeted is now going to be made available for future generations (your tax money at work ;-)).

This is actually a very positive development because this corpus of short messages will prove invaluable to researchers and historians. My guess is some research on this corpus will likely be used to improve web search. I will certainly have my web IR course in the spring do some analysis on the corpus.

My hope is that some day the LoC will also archive all of Facebook. This will prove much more problematic as Facebook data is inherently private, and access to the archive will likely need to be restricted. But losing this treasure chest of bytes would, in my estimation, be far more of a loss to society and future researchers than losing a few tweets.