This seminar will focus on Google and the technologies it has created or adopted to build its enterprise into what it is today. Although many of the technologies we will study apply to all search engine applications, we will favor a breadth-first discussion of Google's technologies over a depth-first examination of a single research area. We will cover three basic areas: (1) initial contributions to crawling and ranking; (2) information retrieval applications; (3) custom infrastructure for deploying web-scale applications.

Throughout the semester, students are presenting 20 papers that Michael has collected. Last night the papers "The Anatomy of a Large-Scale Hypertextual Web Search Engine" and "The PageRank Citation Ranking: Bringing Order to the Web" were presented to the class. These are the first papers on Google from founders Sergey Brin and Lawrence Page, who were graduate students at Stanford at the time. Interestingly, the second paper, on PageRank, is only a technical report: apparently it was rejected by SIGIR for not rigorously evaluating PageRank (according to a talk by Monika Henzinger). Brin and Page never did the extra work required to get it published, yet it is clearly one of the most influential papers ever written about ranking search results.
If you’d like to see the presentations made in this class, the slides will be posted regularly to the class website.