CS598CXZ (CS510) Advanced Topics in Information Retrieval (Fall 2016)
Instructor: ChengXiang (“Cheng”) Zhai
Teaching Assistants: Rongda Zhu (full time), Chase Geigle (part time)
Department of Computer Science
University of Illinois, Urbana-Champaign
Course Goal
Advanced (graduate-level) introduction to the field of information retrieval (IR), broadly including text mining
Goals:
- Provide an in-depth introduction to advanced IR algorithms based on statistical language models
- Provide an opportunity for students to explore frontier topics via course projects (customized toward the interests of students)
- Give students enough training to do research in IR or to apply advanced IR techniques in applications
Tangible outcomes: research paper, open source code, and application system
Prerequisites
- Basic concepts in CS410 Text Info Systems
- Programming skills: CS225 or equivalent level
- A good knowledge of basic probability and statistics
- Knowledge of one or more of the following areas is a plus, but not required: Information Retrieval, Machine Learning, Data Mining, Natural Language Processing
- Contact the instructor if you aren’t sure
Format
- Mixture of
  - Lectures by instructor (about 50%)
  - Presentations by students (project-based) (about 50%)
- Assignments: ensure solid mastery of concepts and implementation skills
  - Written assignments + programming
- Midterm (75 min, in class): mostly to verify your mastery of main concepts and algorithms
- Course project: multiple options
  - In-depth study of a topic (outcome: publication/submission)
  - Implementation of a major algorithm (outcome: open source)
  - Development of a novel application (outcome: useful application)
Office Hours
- Instructor: Tue. 1:30pm-2:30pm; Fri. 11:00am-12pm, 2116SC
- TA (0207SC?)
  - Rongda Zhu: Mon & Thur, 10-11am
  - Chase Geigle: TBD to accommodate online students
- Email us at any time
Grading
- Assignments: 30%
- Midterm: 30%
- Project: 40%
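As a quick illustration of how the grading weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum; the slide only gives the percentages, so the function name `final_score`, the scale, and the sample scores are all hypothetical.

```python
# Weights in percent, as listed on the Grading slide.
WEIGHTS = {"assignments": 30, "midterm": 30, "project": 40}


def final_score(scores):
    """Combine per-component scores (assumed to be on a 0-100 scale)
    into a weighted total. A plain weighted sum is an assumption; the
    actual grading procedure is not specified in the slides."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for illustration only.
print(final_score({"assignments": 90, "midterm": 80, "project": 85}))  # 85.0
```

Because the project carries the largest weight, a 10-point change in the project score moves the total by 4 points, versus 3 points for the assignments or the midterm.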
Schedule
- Part I: Background, overview of IR research, and relevant math (lectures by instructors)
- Part II: IR frameworks and models (lectures by instructors)
  - Covering the major algorithms for optimizing ranking
- Part III: Text analysis: topic models & neural language models (lectures by instructors)
  - Covering topic models and word embeddings for text analysis
- Part IV: Frontier topics and applications (project-based workshop; presentations by students + discussions)
  - Covering project-related frontier topics, system implementation, and applications
Your Work Load
[Timeline figure: Aug through Dec, running from Aug 24 to Dec 6 (last day of instruction), with Thanksgiving marked. Tracks shown: Readings, Assignments, Midterm, Paper/project, Presentation & discussion, Project.]
Reference Book
ChengXiang Zhai, Chase Geigle, Statistical Language Models for Text Data Retrieval and Analysis, forthcoming.
Draft will be available online.
Other readings: mostly research papers, survey articles, and book chapters
- Synthesis Lectures Digital Library: http://www.morganclaypool.com/
- Foundations & Trends in IR: http://www.nowpublishers.com/ir/
- Recent papers from SIGIR, CIKM, WWW, WSDM, KDD, ACL, ICML,…
Questions?
- Course website: http://times.cs.uiuc.edu/course/598f16
- Piazza: https://piazza.com/class/irvce5pdgfz71d