Published by Sarah Walker. Modified over 9 years ago.
1
Course Overview: An Introduction to Information Retrieval and Applications
J. H. Wang
Feb. 22, 2012
2
IR, Spring 2012, NTUT CSIE
Instructor & TA
Instructor
  – J. H. Wang (王正豪)
  – Assistant Professor, CSIE, NTUT
  – Office: R1534, Technology Building
  – E-mail: jhwang@csie.ntut.edu.tw
  – Tel: ext. 4238
  – Office Hour: 9:00 am-12:00 noon, every Tuesday and Wednesday
TA
  – Mr. Liu (劉瀚之)
  – R1424, Technology Building
3
Course Description
Course Web Page
  – http://www.ntut.edu.tw/~jhwang/IR/
Time: 9:10 am-12:00 noon, Thu.
Classroom: R1322, Technology Building
Textbook:
  – Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press, 2008.
      Available online
      International Student Edition, imported by Kai-Fa (開發) Publishing
Prerequisites:
  – Basic knowledge of data structures and algorithms, linear algebra, and probability theory
  – Programming experience is *required* for homework & projects
4
Additional References
References:
  – Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley, 2011.
      This is the second edition of their 1999 book Modern Information Retrieval. (華通)
  – Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
  – Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010. (全華)
5
More Books on IR
Gerard Salton, Automatic Information Organization and Retrieval, McGraw-Hill, 1968.
Gerard Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
  – Two classics, but out of print.
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
  – The classic. More than 30 years old, but still worth reading.
K. Sparck Jones, P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.
  – A collection of classical IR papers. (out of print)
I.H. Witten, A. Moffat, T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann, 1999.
  – The authority on index construction and compression.
6
Grading Policy
Homework assignments and programming exercises: 40%
Mid-term exam: 25%
Term project: 35%
  – Including the proposal and final report
7
Programming Exercises and Term Project
About 3 programming exercises
  – Team-based (at most 2 persons per team)
  – You can either write your own code or reuse existing open source code
The term project
  – Either team-based system development (the same as programming exercises)
  – Or academic paper presentation
      Only one person per team allowed
  – A proposal is required before midterm (Apr. 12, 2012)
8
About the Term Project
The score you get depends on the difficulty and quality of your project
  – For system development:
      System functions and correctness
  – For academic paper presentation:
      The quality of the paper and of your presentation
      Major methods/experimental results *must* be presented
      Papers from top conferences are strongly suggested
        – E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
Proposals are *required* for each team, and will be counted in the score
9
Online Submission
Submission instructions
  – Programs, project proposals, and project reports in electronic files must be submitted to the TA online at: http://140.124.183.39/ir/
  – Before submission:
      User name: Your student ID
      Please change your default password at your first login
10
What this Course is NOT about
This course will NOT tell you
  – The tips and tricks of using search engines, although power users might have better ideas on how to improve them
      There are plenty of books and websites on that…
  – How to find books in libraries, although it’s somewhat related to the basic IR concepts
  – How to make money on the Web, although the currently largest search engine did
11
What’s Information Retrieval
12
On Wikipedia
13
On Google Images
14
On Google Video Search
15
On Google News (TW)
16
On Google News (US)
17
On Blogs
18
On Google Translate…
19
Or More Related Keywords
NBA
New York Knicks
Linsanity
…
20
What if We Search in Chinese
21
And More…
紐約尼克 (New York Knicks)
哈佛 (Harvard)
台裔球員 (Taiwanese-American player)
…
And other languages…
And other search engines…
And social websites…
22
In Google Trends
23
And More…
24
And Other Keywords…
25
And Other Keywords…
26
Palanteer – TW Election
29
What Is Information Retrieval?
“Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)
30
Goal
Information retrieval (IR): a research field that aims at searching information in text and multimedia documents effectively and efficiently
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR
31
A Big Picture
32
[Figure: IR system architecture. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Flow labels: user need, text, query, user feedback, logical view, doc representation, inverted file, retrieved docs, ranked docs.]
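As a rough illustration of how the components named on this slide fit together, the following minimal Python sketch runs a toy document collection through text operations, indexing, Boolean retrieval, and ranking. All function names, the sample documents, and the term-count ranking score are assumptions made for illustration; they are not taken from the course materials or the textbook.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Text operations: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Indexing: map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def retrieve(index, query):
    """Retrieval: Boolean AND, i.e. documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))


def rank(docs, doc_ids, query):
    """Ranking: order retrieved docs by total query-term count (a toy score)."""
    terms = tokenize(query)

    def score(doc_id):
        tokens = tokenize(docs[doc_id])
        return sum(tokens.count(term) for term in terms)

    # Higher score first; ties broken by document ID.
    return sorted(doc_ids, key=lambda d: (-score(d), d))


# Hypothetical document collection.
docs = {
    1: "Knicks win again as Lin scores",
    2: "Lin leads the Knicks; Lin, Lin, Lin",
    3: "Harvard graduate joins the NBA",
}

index = build_inverted_index(docs)
hits = retrieve(index, "Lin Knicks")      # retrieved docs
ranked = rank(docs, hits, "Lin Knicks")   # ranked docs
print(hits, ranked)
```

Query expansion and user feedback are left out of the sketch; in a fuller system they would modify the query before `retrieve` is called.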
33
Topics
Text IR
  – Indexing and searching
  – Query languages and operations
Retrieval evaluation
Modeling
  – Boolean model
  – Vector space model
  – Probabilistic model
Applications for IR
  – Multimedia IR
  – Web search
  – Digital libraries
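To give a feel for the vector space model listed under Modeling, here is a minimal sketch that weights terms by tf × log(N/df) and ranks documents by cosine similarity to the query. This is one common formulation, chosen here as an assumption; whitespace tokenization, the function names, and the sample documents are likewise hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one {term: weight} vector per document using tf * log(N / df)."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors, df, n


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, docs):
    """Return indices of documents with non-zero similarity, best match first."""
    vectors, df, n = tf_idf_vectors(docs)
    q_tf = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: q_tf[t] * math.log(n / df[t]) for t in q_tf if t in df}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    ordered = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ordered if score > 0]


# Hypothetical document collection.
docs = [
    "web search engines crawl the web",
    "digital libraries index books",
    "multimedia retrieval of images and video",
]

results = search("web search", docs)
print(results)
```

Unlike the Boolean model, which only decides whether a document matches, this scoring yields a graded ordering, so partial matches can still be returned.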
34
Organization of the Textbook
Basics in IR (focus)
  – Inverted indexes for Boolean queries (Ch. 1-5)
  – Term weighting and vector space model (Ch. 6-7)
  – Evaluation in IR (Ch. 8)
Advanced Topics
  – Relevance feedback (Ch. 9)
  – XML retrieval (Ch. 10)
  – Probabilistic IR (Ch. 11)
  – Language models (Ch. 12)
Machine learning in IR (useful)
  – Text classification (Ch. 13-15)
  – Document clustering (Ch. 16-18)
Web Search
  – Web crawling and indexes (Ch. 19-20)
  – Link analysis (Ch. 21)
35
Pointers to Other Topics
Cross-language IR
Image, video, and multimedia IR
Speech retrieval
Music retrieval
User interfaces
Parallel, distributed, and P2P IR
Digital libraries
Information science perspective
Logic-based approaches to IR
Natural language processing techniques
36
Tentative Schedule
Before midterm
  – Boolean retrieval (1 wk)
  – Indexing (2 wks)
  – Vector space model and evaluation (2 wks)
  – Relevance feedback (1 wk)
  – Probabilistic IR (2 wks)
After midterm
  – Text classification (1-2 wks)
  – Document clustering (1-2 wks)
  – Web search (2 wks)
  – Advanced topics: CLIR, IE, … (2 wks)
  – Term Project Presentation (3 wks)
37
Generic Resources
Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html
38
Academic Resources
Journals
  – ACM TOIS: Transactions on Information Systems
  – JASIST: Journal of the American Society for Information Science and Technology
  – IP&M: Information Processing and Management
  – IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
  – ACM SIGIR: International Conference on Research and Development in Information Retrieval
  – WWW: World Wide Web Conference
  – ACM CIKM: Conference on Information and Knowledge Management
  – JCDL: ACM/IEEE Joint Conference on Digital Libraries
  – ACM WSDM: International Conference on Web Search and Data Mining
  – TREC: Text Retrieval Conference
39
Thanks for Your Attention!