In this course, we'll learn how search engines like Google return a relevant set of search results from the Web in milliseconds. We'll cover the following topics:
- Web characterization
- History of web search
- Information retrieval (IR)
- Web crawling
- Deep web
- Content indexing
- Query processing
- Search results ranking (e.g., PageRank and HITS)
- Search engine optimization (SEO)
- Adversarial IR
- Personalization of search results
There will be several team projects that use open-source software to build a search engine.
This course will be useful for anyone interested in doing research that could lead to a presentation at a major conference, or in getting an internship at a search engine company.
The textbooks I'm currently looking at for this course include:
- Google's PageRank and Beyond: The Science of Search Engine Rankings by Amy N. Langville and Carl D. Meyer
- Mining the Web: Discovering Knowledge from Hypertext Data by Soumen Chakrabarti
If you have other suggestions, please let me know by leaving a comment.