Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Apr. 24, 2013.

1 Course Overview: An Introduction to Information Retrieval and Applications J. H. Wang Apr. 24, 2013

2 About me
Instructor
– J. H. Wang ( 王正豪 )
– Assistant Professor, CSIE, NTUT
– Office: R1534, Technology Building
– E-mail: jhwang@csie.ntut.edu.tw
– Tel: ext. 4238
– Office Hours: 9:10 am-12:00 pm, every Tuesday
IR, Spring 2013, ESIT

3 Course Description
Course Web Page
– http://www.ntut.edu.tw/~jhwang/IR_ESIT/
Time: 9:10 am-12:00 pm, Wed.
Classroom: R1219, Technology Building
Textbook:
– Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press, 2008.
  Available online
  International Student Edition, imported by Kai-Fa ( 開發 ) Publishing
Prerequisites:
– Basic knowledge of data structures and algorithms, linear algebra, and probability theory

4 Additional References
References:
– Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley, 2011. This is the second edition of their 1999 book Modern Information Retrieval. ( 華通 )
– Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
– Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010. ( 全華 )

5 More Books on IR
Gerard Salton, Automatic information organization and retrieval, McGraw-Hill, 1968.
Gerard Salton and M.J. McGill, Introduction to modern information retrieval, McGraw-Hill, 1983.
– Two classics, but out of print.
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
– The classic. More than 30 years old, but still worth reading.
K. Sparck Jones, P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.
– A collection of classical IR papers. (out of print)
I.H. Witten, A. Moffat, T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann, 1999.
– The authority on index construction and compression.

6 What this Course is NOT about
This course will NOT tell you
– The tips and tricks of using search engines, although power users might have better ideas on how to improve them
  There are plenty of books and websites on that…
– How to find books in libraries, although it’s somewhat related to the basic IR concepts
– How to make money on the Web, although the currently largest search engine did it

7 What’s Information Retrieval
Tasks that you have been doing all day!
– Searching for something interesting
– Asking for advice
– Discovering the changing world
– …
User interests are changing all the time…
– 2011: New Zealand Earthquake
– 2012: Jeremy Lin
– 2013: ? (On the next slide…)

8 Recent Hot Topics
– Sichuan earthquake
– Texas plant explosion
– Boston Marathon explosions
– Margaret Thatcher
– North Korea
– Gold Price
– H7N9

9 In News

11 In Wikipedia

12 In Images

14 In Blogs

15 Related Keywords
2013 Boston Marathon bombings
– Explosions, attacks, blasts, victims, suspects
Margaret Hilda Thatcher, Baroness Thatcher
– United Kingdom, Prime Minister, British politician, Iron Lady
Influenza A virus subtype H7N9
– avian influenza virus, bird flu virus
2013 North Korean crisis
– Democratic People's Republic of Korea (DPRK), North Korea nuclear threat, missile test, Kim Jong-un

16 What if We Search in Chinese

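17 More Related Keywords
波士頓馬拉松爆炸案 (Boston Marathon bombings)
– 恐怖攻擊 (terrorist attack), 嫌犯 (suspect), 受害者 (victims), …
2013 年朝鮮半島危機 (2013 Korean Peninsula crisis)
– 北韓 (North Korea), 金正恩 (Kim Jong-un), 核武 (nuclear weapons), 飛彈 (missiles), …
H7N9
– 禽流感 (avian flu), 疫苗 (vaccine), …
柴契爾夫人 (Mrs. Thatcher)
– 鐵娘子 (Iron Lady), 佘契爾, 戴卓爾 (alternative Chinese transliterations of "Thatcher"), …
四川地震 (Sichuan earthquake)
– 災民 (disaster victims), 救災 (disaster relief), 罹難 (fatalities), 捐款 (donations), 雅安 (Ya'an), 汶川 (Wenchuan), …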

18 And the Search Goes on…
Other languages: French, Spanish, Japanese, …
Other search engines: Yahoo, Bing, …
Social networking websites: Facebook, Twitter, …
…

19 In Google Trends

20 Example: Boston bombing

23 And Other Keywords…

25 And Social Search…

26 How to Find What People Care About?

28 Google HotTrends for Other Places - TW

31 What Is Information Retrieval?
“Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)

32 Goal
Information retrieval (IR): a research field that aims at effectively and efficiently searching information in text and multimedia documents
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications for IR

33 A Big Picture

34 [Figure: architecture of an IR system. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Flow labels: user need, text, query, logical view, doc representation, inverted file, retrieved docs, ranked docs, user feedback.]
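
The indexing and retrieval components named on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the course's implementation: the whitespace tokenizer, the function names (`tokenize`, `build_inverted_index`, `retrieve`), and the three-document collection are all assumptions made for the example. It builds an inverted file that maps each term to the documents containing it, then answers a query by intersecting posting lists.

```python
from collections import defaultdict


def tokenize(text):
    """Text operations: lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Indexing: map each term to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def retrieve(index, query):
    """Retrieval: ids of documents that contain every query term (Boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical document collection, made up for this example.
docs = {
    1: "Boston Marathon explosions",
    2: "Texas plant explosion",
    3: "Boston Marathon suspects",
}
index = build_inverted_index(docs)
hits = retrieve(index, "boston marathon")  # [1, 3]
```

Note that "explosions" and "explosion" are treated as different terms here; a real text-operations stage would typically add stemming or similar normalization.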

35 Topics
Text IR
– Indexing and searching
– Query languages and operations
Retrieval evaluation
Modeling
– Boolean model
– Vector space model
– Probabilistic model
Applications for IR
– Multimedia IR
– Web search
– Digital libraries
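
As a rough preview of the vector space model listed above, the sketch below ranks documents by the cosine similarity between tf-idf vectors. It is one common textbook variant, assumed here for illustration (raw term frequency times log(N/df)); the function names and the sample documents are hypothetical, and the course may use a different weighting scheme.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build a sparse tf-idf vector (term -> weight) for each document."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = {
        d: {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for d, tokens in tokenized.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query):
    """Return document ids ordered by decreasing similarity to the query."""
    vectors, idf = tf_idf_vectors(docs)
    query_tf = Counter(query.lower().split())
    query_vec = {term: tf * idf.get(term, 0.0) for term, tf in query_tf.items()}
    scores = {d: cosine(query_vec, vec) for d, vec in vectors.items()}
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical documents, made up for this example.
docs = {
    1: "bird flu virus spreads",
    2: "gold price falls",
    3: "avian influenza virus bird flu virus",
}
ranking = rank(docs, "bird flu virus")
```

Unlike the Boolean model, every document receives a graded score, so documents that match only part of the query can still be ranked.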

36 Organization of the Textbook
Basics in IR (focus)
– Inverted indexes for Boolean queries (Ch. 1-5)
– Term weighting and vector space model (Ch. 6-7)
– Evaluation in IR (Ch. 8)
Machine learning in IR (selected)
– Text classification (Ch. 13-15)
– Document clustering (Ch. 16-18)
Web Search
– Web crawling and indexes (Ch. 19-20)
– Link analysis (Ch. 21)
Advanced Topics (skipped)
– Relevance feedback (Ch. 9)
– XML retrieval (Ch. 10)
– Probabilistic IR (Ch. 11)
– Language models (Ch. 12)
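
For the evaluation topic in the outline above, the two most basic measures are precision (the fraction of retrieved documents that are relevant) and recall (the fraction of relevant documents that are retrieved). The sketch below computes both for a single query; the function name and the sample document ids are assumptions for illustration only.

```python
def precision_recall(retrieved, relevant):
    """Compute set-based precision and recall for one query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list and relevance judgments.
retrieved = [1, 3, 5, 7]
relevant = [1, 2, 3]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found.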

37 Some Overlap with Other Fields
– Text mining
– Machine Learning
– Natural Language Processing
– Social Network Analysis
– …

38 Pointers to Other Topics
– Cross-language IR
– Image, video, and multimedia IR
– Speech retrieval
– Music retrieval
– User interfaces
– Parallel, distributed, and P2P IR
– Digital libraries
– Information science perspective
– Logic-based approaches to IR
– Natural language processing techniques
– …

39 Tentative Schedule
9 weeks: (Apr. 24 – Jun. 28)
– Overview and Boolean retrieval (1 wk)
– Indexing (2 wks)
– Vector space model and evaluation (2 wks)
– Text classification & clustering (1 wk)
– Web search (2 wks)
– Advanced topics: CLIR, IE, … (1 wk)

40 Generic Resources
Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html

41 Academic Resources
Journals
– ACM TOIS: Transactions on Information Systems
– JASIST: Journal of the American Society for Information Science and Technology
– IP&M: Information Processing and Management
– IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
– ACM SIGIR: International Conference on Information Retrieval
– WWW: World Wide Web Conference
– ACM CIKM: Conference on Information and Knowledge Management
– JCDL: ACM/IEEE Joint Conference on Digital Libraries
– ACM WSDM: International Conference on Web Search and Data Mining
– TREC: Text Retrieval Conference

42 Thanks for Your Attention!

