We'll be using Java to build a web crawler, an indexer, and a results ranker. By the end of the semester, we'll have a fully functional web search engine.
Here are some of the topics we'll be covering:
- Web characterization
- History of web search
- Information retrieval (IR)
- Web crawling
- Deep web
- Content indexing
- Query processing
- Search results ranking (e.g., PageRank and HITS)
- Search engine optimization (SEO)
- Adversarial IR
- Personalization of search results
Looks like a fun course...good luck! Let me know when you are ready to come work for the evil empire ;-)
Tanton- Is that a job offer? I doubt Microsoft could ever match my Harding salary. ;-)
That is true, I doubt they could even come close to Harding's magnanimous offering. However, we do have Mt. Rainier, Mt. St. Helens, a lot of rain, and a Starbucks on every corner ;-)
Dear Assoc. Prof. Dr. Frank McCown,
I'm a master's student, and my project is on online plagiarism detection with the help of search engines. I've used your Java code that queries Google through the AJAX API. Briefly, the idea is to extract n-grams from the collected corpus and compare them with the suspected document. Unfortunately, the result returned in the JSON object is too short to be of any help. I'm wondering if there is any other option to retrieve the whole document, even using search engines other than Google.
Thank you,
Ahmed Jabr
UTM, Malaysia
Ahmed,
I'm not sure if I understand your problem. It might be easier to email me. See my home page for my email address.