CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign.


1 CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign

2 Course Goal
Advanced (graduate-level) introduction to the field of information retrieval (IR)
Goal
–Provide an overview of IR research in the past several decades
–Systematically review the core research topics in IR
–Discuss the most recent research progress (customized toward the interests of students)
–Give students enough training for doing research in IR or applying advanced IR techniques to applications
More in-depth treatment of topics than CS410: less emphasis on practical skills, more on understanding of principles, models, and algorithms

3 IR Research Topics (Broad View)
[Diagram labels: Search, Text, Filtering, Categorization, Summarization, Clustering, Natural Language Content Analysis, Extraction, Mining, Visualization, Retrieval Applications, Analytics Applications, Information Access, Text Mining, Information Organization, Users, Text Acquisition]

4 IR Topics (narrow view)
[Diagram labels: User, query, judgments, docs, results, Query Rep, Doc Rep, Ranking, Feedback, INDEXING, SEARCHING, QUERY MODIFICATION, LEARNING, INTERFACE]
1. Evaluation
2. Retrieval (Ranking) Models
3. Document representation/structure
4. Efficiency & scalability
5. Search result summarization/presentation
6. User interface (browsing)
7. Feedback/Learning
Our focus: 1, 2, 7

5 IR Topics covered & Related Topics
[Diagram labels: User, query, judgments, docs, results, Query Rep, Doc Rep, Ranking, Feedback, INDEXING, SEARCHING, QUERY MODIFICATION, LEARNING, INTERFACE]
1. Evaluation
2. Retrieval (Ranking) Models
3. Document representation/structure
4. Efficiency & scalability
5. Search result summarization/presentation
6. User interface (browsing)
7. Feedback/Learning
Our focus: 1, 2, 7
Related topics: HCI, Parallel Prog., NLP, ML

6 Core Knowledge that You Should Know
IR Evaluation Methodology (Cranfield Lab Test)
–Emphasis on realistic task modeling
–Test set creation/sharing
–Measures
–Comparative analysis of components
–Statistical significance test
Retrieval Models
–Vector-Space (retrieval heuristics)
–Probabilistic (language models, statistical estimation)
–Machine learning (basic idea)
Topic models
–EM algorithm
Check out the midterm topics for details
You’ll likely find these to be useful for your research in general
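As a reminder of what "Measures" can look like in a Cranfield-style evaluation, here is a minimal sketch of average precision (AP) and mean average precision (MAP). The slide does not name a specific measure, so the choice of AP/MAP is an assumption. The function names and the toy document IDs are hypothetical, and the sketch assumes binary relevance judgments and a ranking with no duplicate documents.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranking: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list against binary relevance judgments.

    Assumes `ranking` contains no duplicate document IDs. Relevant documents
    that are never retrieved contribute zero, because we divide by the total
    number of relevant documents rather than the number retrieved.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where this relevant document appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Arithmetic mean of AP over (ranking, relevant set) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical toy test set with two queries.
    toy_runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d2", "d1"], {"d1"}),
    ]
    print(mean_average_precision(toy_runs))
```

Comparing two systems would then mean computing per-query AP for each system on the same test set and applying a paired significance test to the per-query differences.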

7 Be Familiar with Some Frontier Topics
Document Representation and Content Analysis (e.g., text representation, document structure, linguistic analysis, non-English IR, cross-lingual IR, information extraction, sentiment analysis, clustering, classification, topic models, facets)
Queries and Query Analysis (e.g., query representation, query intent, query log analysis, question answering, query suggestion, query reformulation)
Users and Interactive IR (e.g., user models, user studies, user feedback, search interface, summarization, task models, personalized search)
Retrieval Models and Ranking (e.g., IR theory, language models, probabilistic retrieval models, feature-based models, learning to rank, combining searches, diversity)
Search Engine Architectures and Scalability (e.g., indexing, compression, MapReduce, distributed IR, P2P IR, mobile devices)
Filtering and Recommending (e.g., content-based filtering, collaborative filtering, recommender systems, profiles)
Evaluation (e.g., test collections, effectiveness measures, experimental design)
Web IR and Social Media Search (e.g., link analysis, query logs, social tagging, social network analysis, advertising and search, blog search, forum search, CQA, adversarial IR, vertical and local search)
IR and Structured Data (e.g., XML search, ranking in databases, desktop search, entity search)
Multimedia IR (e.g., image search, video search, speech/audio search, music IR)
Other Applications (e.g., digital libraries, enterprise search, genomics IR, legal IR, patent search, text reuse)
Learn more about these from the project presentations!

8 Beyond Information Retrieval: Take Other Related Courses
[Diagram labels: Information Retrieval, Databases, Library & Info Science, Machine Learning, Pattern Recognition, Data Mining, Natural Language Processing, Applications (Web, Bioinformatics…), Statistics, Optimization, Software engineering, Computer systems, Models, Algorithms, Applications, Systems, Human-Computer Interaction, Computer Vision]

9 Remaining Tasks for You
Work on your projects: Let me know ASAP if you need help
Present your project at 1:30-4:30pm on Friday, Dec. 12 (room 1304 SC)
Submit your reports by Dec. 19, Friday

