Tuesday, July 13, 2010

CS library book analysis

For the past three years I've been the designated faculty member in charge of ordering computing books for our campus library. Although it can be tedious at times, I usually enjoy the job, especially since it gives me the chance to browse through the latest books on computing and order pretty much what I would like (and, of course, what I think the students would like).

The other day I got to wondering though, how many of these books are our students actually checking out? Are paper-bound books still useful to them when you can find so much information on the Web?

So several months ago I asked our librarian to give me some usage data on our computing books. I was only able to analyze the data this week, and what I found was somewhat surprising.

First, to see the relative age of books in our library, I created a histogram of the 998 books based on publication date:

The earliest book (Computers and Society, edited by Nikolaieff) is from 1970. Almost half of the books (45%) were published between 1999 and 2003. Only three books published this year had made it into the library by the time this data was obtained.

The check-out data was from 2001 to present. Out of 998 books, 22% have never been checked out (at least since 2001). Eighteen percent have only been checked out once, and only 25% have been checked out more than five times.

Below is a histogram with log scale showing how many times our books have been checked out. The largest bar on the left is the 75% chunk of books that have been checked out 0-5 times. There are only two books that have been checked out 31-35 times and only one book that has been checked out more than 40 times.
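For anyone who wants to try the same kind of binning, here is a minimal sketch in Python. It is not the script I used for the chart. The bin edges (0-5, then 6-10, 11-15, and so on) are inferred from the ranges mentioned above, and the sample counts are only the ten largest values from the top-10 list later in this post, not the full data set.

```python
import math
from collections import Counter


def bin_label(times_checked_out):
    """Return the histogram bin for a check-out count.

    Assumed bins: 0-5 first, then five-wide bins (6-10, 11-15, ...).
    """
    if times_checked_out <= 5:
        return "0-5"
    low = ((times_checked_out - 1) // 5) * 5 + 1
    return f"{low}-{low + 4}"


# Sample data: the ten highest check-out counts reported in this post.
sample_counts = [41, 35, 35, 30, 26, 26, 25, 25, 25, 25]

histogram = Counter(bin_label(n) for n in sample_counts)

# Print bins in ascending order with the log-scale bar height.
for label, books in sorted(histogram.items(), key=lambda kv: int(kv[0].split("-")[0])):
    print(f"{label:>6}: {books} book(s), log10 height {math.log10(books):.2f}")
```

A log scale helps here because the first bin holds hundreds of books while the last few bins hold only one or two, and on a linear scale those small bars would be invisible.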

In case you were wondering, here are the top 10 most frequently checked-out computing books, along with the book's publication date and number of times checked out. Many of these books are not surprises:
  1. Introduction to Algorithms by Cormen, Leiserson, & Rivest (1990) - 41
  2. C++ Primer Plus: Teach Yourself Object-Oriented Programming by Prata (1995) - 35
  3. Applied cryptography: Protocols, algorithms, and source code in C by Schneier (1994) - 35
  4. Design Patterns: Elements of Reusable Object-Oriented Software by Gamma et al. (1995) - 30
  5. C++ How to Program: Introducing Object-Oriented Design with the UML by Deitel & Deitel (2001) - 26
  6. Computer Virus Crisis by Fites, Johnston, & Kratz (1992) - 26
  7. PASCAL: Programming and Problem Solving by Leestma & Nyhoff (1990) - 25
  8. Mythical Man-Month: Essays on software engineering by Brooks (1995) - 25
  9. C#, A Programmer's Introduction by Deitel et al. (2003) - 25
  10. HTML and CGI Unleashed by December & Ginsburg (1995) - 25

So what about the books that no one checks out? Browsing through the list, I see what I assume would be very popular books like Pattern Hatching: Design Patterns Applied by Vlissides (1998), Object Oriented Perl by Conway (2000), User Interface Design for Programmers by Spolsky (2001), SQL in a Nutshell by Kline et al. (2004), and iPhone SDK 3 Programming by Ali (2009).

To get a better overall picture, I looked at the percentage of books by publication year that have been checked out (at least once since 2001) as shown below.

There is an even decline in check-out rates from 1995 on, which suggests that the longer a book is around, the more likely it is to be checked out. That certainly makes sense; however, the longer most computing books are around, the less useful they become.

For example, Designing with Web Standards by Zeldman (2007) has been checked out five times. This is arguably a relevant book, at least until HTML5 is released as a new web standard; then its value plummets. Browsing through the titles of our books, many of them fall into this category. Even among our most checked-out books, several of them are somewhat outdated (3?, 6, 7, 9, 10). This is the greatest problem I face when purchasing CS books for the library... I try to purchase books that I think will be immediately useful to our students and at the same time have a shelf-life greater than one year. It's not an easy balance to maintain.

Returning to my original question, are library books still used by our CS students? The data seems to suggest that a fair number of books are eventually checked out at least once. However, if we estimate that a book costs around $50, and 218 books have never been checked out, that means $10,900 worth of books are sitting unused on the library shelves. Ouch.
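The back-of-the-envelope estimate above is easy to reproduce. This is just a sketch of the arithmetic in Python. The $50 price is my rough guess at an average cost, not a figure from the library, and the variable and function names are only for illustration.

```python
TOTAL_BOOKS = 998
NEVER_CHECKED_OUT = 218
ASSUMED_COST_PER_BOOK = 50  # dollars; a rough estimate, not actual purchase data


def unused_summary(total, unused, cost_per_book):
    """Return (share of books never checked out, estimated dollar value)."""
    share = unused / total
    value = unused * cost_per_book
    return share, value


share, value = unused_summary(TOTAL_BOOKS, NEVER_CHECKED_OUT, ASSUMED_COST_PER_BOOK)
print(f"{share:.1%} never checked out, roughly ${value:,} sitting on the shelf")
```

Changing the assumed price scales the total linearly, so at $40 per book the unused stock would come to $8,720 instead.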

Of course, a more thorough analysis would involve surveying our students about their library usage. Why are they checking out a particular book? Are they actually reading what they check out? Is the information they are seeking in the book they've checked out? Are they finding what they need in the library? Are they finding equivalent information on the Web and therefore don't need the book? This would certainly make for an interesting study.

So, do you still find computing books useful? Should we be purchasing fewer books? What would be a better use for the money?

Monday, July 12, 2010

Social networking workshop wrap-up

Last week I attended the HarambeeNet Workshop on Social Networks in Education at Duke University. There were approximately 40 other academics and researchers at the NSF-funded workshop, which focused on using social networks and related topics to encourage broader participation in computer science. It was good to see some old friends, make some new ones, and enjoy the beautiful Duke campus.

There were a number of excellent presentations and lots of new information. I took some notes and occasionally tweeted, but what I thought was fantastic was using Ning for sharing links, slides, and other resources (sorry, but you apparently can't access the link without a password). Ning allows you to have something like your own private Facebook space.

The biggest thing I took away from the workshop was the desire to integrate some social media into my intro to computing and web development courses. There are so many neat things you can do, like analyze tweets for spam, look for Wikipedia edit wars, and build networks from blog links. I'm hoping to develop some creative CS1 assignments in this area, which I'll likely talk about here in the future.

Our keynote on Friday was Jon Kleinberg who is probably best known for his HITS algorithm. Kleinberg recently taught an interdisciplinary course on networks at Cornell and wrote a book with David Easley on the topic: Networks, Crowds, and Markets. After hearing Kleinberg's presentation, I'd love to offer a similar course at Harding.

It was also interesting talking to Ben Shneiderman, who worked as an expert witness in the Apple vs. Microsoft case when Apple tried to copyright the GUI. While riding from the airport to the hotel, Shneiderman shared with me that what pushed Jobs to start litigation was when Windows 2 introduced overlapping windows; Windows 1 only had tiled windows, which apparently didn't upset Jobs. At the workshop, Shneiderman also presented his push for Technology-Mediated Social Participation and a visualization tool for networks using Excel: NodeXL.

Friday, July 02, 2010

The problem with measuring professor quality

Professors James West and Scott Carrell published an article last month in the Journal of Political Economy: Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors. The article has actually been out for a few years, but this is the first time I came across it. Inside Higher Ed has a review of the study from 2008.

The authors looked at student scores and professor evaluations at the U.S. Air Force Academy from 1997 to 2007 and, in their own words, found this:
Results show that there are statistically significant and sizable differences in student achievement across introductory course professors in both contemporaneous and follow-on course achievement. However, our results indicate that professors who excel at promoting contemporaneous student achievement, on average, harm the subsequent performance of their students in more advanced classes. Academic rank, teaching experience, and terminal degree status of professors are negatively correlated with contemporaneous value-added but positively correlated with follow-on course value-added. Hence, students of less experienced instructors who do not possess a doctorate perform significantly better in the contemporaneous course but perform worse in the follow-on related curriculum.

Student evaluations are positively correlated with contemporaneous professor value-added and negatively correlated with follow-on student achievement. That is, students appear to reward higher grades in the introductory course but punish professors who increase deep learning (introductory course professor value-added in follow-on courses). Since many U.S. colleges and universities use student evaluations as a measurement of teaching quality for academic promotion and tenure decisions, this latter finding draws into question the value and accuracy of this practice.

To sum up, the study found:

1) Students will score better in their intro courses when taught by less experienced professors, but they will do more poorly in subsequent courses.

2) Students will rate their professors higher when they get better grades in their intro courses.

In other words, if you teach a very rigorous course and do a really good job at preparing your students for success once they leave your classroom, you are likely to be punished for it with lower teacher evaluations. And if you make your class easy and everyone gets an A, you'll be rewarded with great evaluations. I suppose you still may be punished later with angry emails from students who can't pass their subsequent courses. :-)

The study certainly draws into question how much weight we should give teacher evaluations when rating professors.