Tuesday, July 13, 2010

CS library book analysis

For the past three years I've been the designated faculty member in charge of ordering computing books for our campus library. Although it can be tedious at times, I usually enjoy the job, especially since it gives me the chance to browse through the latest books on computing and order pretty much what I would like (and, of course, what I think the students would like smile).

The other day I got to wondering though, how many of these books are our students actually checking out? Are paper-bound books still useful to them when you can find so much information on the Web?

So several months ago I asked our librarian to give me some usage data on our computing books. I was only able to analyze the data this week, and what I found was somewhat surprising.

First, to see the relative age of books in our library, I created a histogram of the 998 books based on publication date:

The earliest book (Computers and Society, edited by Nikolaieff) is from 1970. Almost half of the books (45%) were published between 1999 and 2003. Only three books published this year had made it into the library by the time this data was obtained.

The check-out data was from 2001 to present. Out of 998 books, 22% have never been checked out (at least since 2001). Eighteen percent have only been checked out once, and only 25% have been checked out more than five times.

Below is a histogram with log scale showing how many times our books have been checked out. The largest bar on the left is the 75% chunk of books that have been checked out 0-5 times. There are only two books that have been checked out 31-35 times and only one book that has been checked out more than 40 times.

In case you were wondering, here are the top 10 most frequently checked-out computing books, along with the book's publication date and number of times checked out. Many of these books are not surprises:
  1. Introduction to Algorithms by Cormen, Leiserson, & Rivest (1990) - 41
  2. C++ Primer Plus: Teach Yourself Object-Oriented Programming by Prata (1995) - 35
  3. Applied cryptography: Protocols, algorithms, and source code in C by Schneier (1994) - 35
  4. Design Patterns: Elements of Reusable Object-Oriented Software by Gamma et al. (1995) - 30
  5. C++ How to Program: Introducing Object-Oriented Design with the UML by Deitel & Deitel (2001) - 26
  6. Computer Virus Crisis by Fites, Johnston, & Kratz (1992) - 26
  7. PASCAL: Programming and Problem Solving by Leestma & Nyhoff (1990) - 25
  8. Mythical Man-Month: Essays on software engineering by Brooks (1995) - 25
  9. C#, A Programmer's Introduction by Deitel et al. (2003) - 25
  10. HTML and CGI Unleashed by December & Ginsburg (1995) - 25

So what about the books that no one checks out? Browsing through the list, I see what I assume would be very popular books like Pattern Hatching: Design Patterns Applied by Vlissides (1998), Object Oriented Perl by Conway (2000), User Interface Design for Programmers by Spolsky (2001), SQL in a Nutshell by Kline et al. (2004), and iPhone SDK 3 Programming by Ali (2009).

To get a better overall picture, I looked at the percentage of books by publication year that have been checked out (at least once since 2001) as shown below.

There is an even decline in check-out rates from 1995 on which suggests that the longer a book is around, the more likely it is to be checked out. That certainly makes sense, however the longer most computing books are around, the less useful they become.

For example, Designing with Web Standards by Zeldman (2007) has been checked out five times. This is arguably a relevant book, at least until HTML5 is released as a new web standard; then its value plummets. Browsing through the titles of our books, many of them fall into this category. Even among our most checked-out books, several of them are somewhat outdated (3?, 6, 7, 9, 10). This is the greatest problem I face when purchasing CS books for the library... I try to purchase books that I think will be immediately useful to our students and at the same time have a shelf-life greater than one year. It's not an easy balance to maintain.

Returning to my original question, are library books still used by our CS students? The data seems to suggest that a fair amount of books are eventually checked out at least once. However, if we estimate that a book costs around $50, and 218 books have never been checked out, that means $10,900 worth of books are sitting unused on the library shelves. Ouch.

Of course, a more thorough analysis would involve surveying our students about their library usage. Why are they checking out a particular book? Are they actually reading what they check out? Is the information they are seeking in the book they've checked out? Are they finding what they need in the library? Are they finding equivalent information on the Web and therefore don't need the book? This would certainly make for an interesting study.

So, do you still find computing books useful? Should we be purchasing fewer books? What would be a better use for the money?