2004.12.09 - SLIDE 1 IS 202 – FALL 2004 Lecture 29: Final Review Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00.

Presentation transcript:

SLIDE 1 IS 202 – FALL 2004
Lecture 29: Final Review
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm
Fall 2004
SIMS 202: Information Organization and Retrieval

SLIDE 2 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 3 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 4 IS 202 – FALL 2004
Final Exam Details
Date: December 14
Time: 9:30-12:30
The exam is open-book, open-note, AND open-computer
You may use your own laptop or one of the computers in the lab
The results will need to be printed
The exam can be handwritten if you wish; if so, be sure to bring pens, pencils, and erasers
It is essential that you have access to and/or bring your final facetted classification so that you can analyze it and use it

SLIDE 5 IS 202 – FALL 2004
Final Exam Details
There will be 8 questions on the exam
– Some questions have multiple parts
One of the questions will be taken from the Discussion Questions you submitted in class
Questions will be worth a specific number of points, and these will be stated on the exam itself
Partial credit will be awarded for partial answers, so we advise that you do not skip any questions
In your answers, please balance conciseness with illustration of all of the requested information
– In other words, don't write a lot of things that aren't asked for, but try to address all of what is asked for

SLIDE 6 IS 202 – FALL 2004
Final Exam Details
The exam will be comprehensive, covering both the Information Organization and Retrieval parts of the course
– The emphasis will be on the last half (Organization), with about a 70/30 bias towards it
Each person will work individually
The exam period is three hours
– You will likely need the entire time
If you use network-accessed material for any part of the exam, be sure to cite your sources

SLIDE 7 IS 202 – FALL 2004
Study Guide
Be sure you understand the material that was covered in lectures and have read and understood the assigned readings
Be sure you can do activities similar to what was done in the homework assignments

SLIDE 8 IS 202 – FALL 2004
Study Guide
We will have questions that require you to generalize from what you've learned and synthesize ideas
– So be sure you have thought about the ideas covered in lectures, readings, and homework assignments
These ideas and abilities should be at your fingertips
– There won't be time during the exam to do a lot of catch-up reading on topics you haven't studied

SLIDE 9 IS 202 – FALL 2004
Example Questions
These are available on the Class Web site
Note that these examples are NOT the exact questions that will be on the exam, but they are similar to questions that have been used in the past
If you have actively participated in the phone project assignments from the last part of the course and are familiar with the facetted classification you designed and built, this will greatly help you on at least 30% of the final exam

SLIDE 10 IS 202 – FALL 2004
Review of Course Content
We can draw on:
– All sets of slides (including this one)
– The Course Readers
– Textbooks
– Handout papers
– Assignments
– Discussion questions and issues

SLIDE 11 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 12 IS 202 – FALL 2004
Course Schedule
Organization
– Categorization
– Knowledge Representation
– Lexical Relations and WordNet
– Controlled Vocabularies Introduction
– Phone Project Introduction
– Semantic Web and RDF
– Facetted Classification
– Thesaurus Design and Construction
– Metadata Standards
– Multimedia Information Organization and Retrieval
– Metadata for Media
– Mobile and Context-Aware Multimedia Systems
– Phone Project Presentations
– Future of Information Systems
Retrieval
– Overview
– What is Information?
– History of Information Systems
– Introduction to the Search Process
– Boolean Queries and Text Processing
– Web Search Issues and Architecture
– Implementing Web Site Search Engines
– Statistical Properties of Text and Vector Representation
– Probabilistic Ranking & Relevance Feedback
– Evaluation
– Database Design

SLIDE 13 IS 202 – FALL 2004
Your Questions
What topics and/or questions would you like to discuss today?

SLIDE 14 IS 202 – FALL 2004
Information Retrieval Topics
Information
Document Representation and Statistical Properties of Text
Queries, Ranking, and the Vector Space Model
IR systems and Implementation
Evaluation of IR Systems
The Search Process and User Interfaces
Relevance Feedback
Database Design

SLIDE 15 IS 202 – FALL 2004
Information Retrieval Topics
Information
– What is the information life cycle?
– What are different ways of measuring information? What are different ways of defining information?
Document Representation and Statistical Properties of Text
– What is the significance of Zipf's law for weighting of terms in information retrieval?
– What kinds of errors can a stemming algorithm produce?
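
As a study aid for the Zipf's law question, here is a minimal sketch of a rank-frequency table. Zipf's law says that a term's frequency is roughly inversely proportional to its rank, so rank × frequency stays roughly constant in a large corpus. The function name `rank_frequency` and the sample sentence are illustrative assumptions, not course material, and a toy text this small will not show the regularity cleanly.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, term, frequency, rank * frequency) rows, most frequent first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, term, freq, rank * freq)
            for rank, (term, freq) in enumerate(ordered, start=1)]


# Hypothetical sample text; substitute any larger document collection.
sample = "the cat and the dog and the bird saw the cat"
for row in rank_frequency(sample):
    print(row)
```

On real text the top-ranked rows are usually function words, which is one reason very frequent terms are typically given low weight when indexing.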

SLIDE 16 IS 202 – FALL 2004
Information Retrieval Topics
Queries, Ranking, and the Vector Space Model
– What is the difference between a search engine that uses the vector space ranking algorithm on natural language queries and a system that uses Boolean queries?
– What is the role of coordination level ranking in a facetted Boolean system?
– Describe the following information need in terms of a faceted Boolean query. What kinds of weighting algorithms can be applied to a faceted query like this? "I would like to find articles about the effects of the passage of the independent investigator statute by Congress on how the U.S. president chooses an attorney general."
– Why do different web search engines return different sets of documents for the same query?
– Redo the computations of Assignment 3 part 3 using different values for TF.
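
The contrast between vector space ranking and Boolean matching in the first question can be sketched as follows. This is a rough illustration under stated assumptions: raw term counts for TF, `log(N / df)` for IDF, and cosine similarity for ranking. The weighting scheme used in Assignment 3 may differ, and the three tiny documents and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical three-document collection.
docs = {
    "d1": "president chooses attorney general",
    "d2": "congress passes investigator statute",
    "d3": "statute limits president",
}


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (assumed weighting: raw tf * log(N / df))."""
    n = len(docs)
    term_counts = {d: Counter(tokenize(t)) for d, t in docs.items()}
    df = Counter(term for counts in term_counts.values() for term in counts)
    idf = {term: math.log(n / df[term]) for term in df}
    vectors = {d: {t: tf * idf[t] for t, tf in counts.items()}
               for d, counts in term_counts.items()}
    return vectors, idf


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_cosine(query, docs):
    """Vector space model: every document gets a score, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = {d: cosine(q, v) for d, v in vectors.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def boolean_match(query, docs):
    """Boolean AND: a document either contains all query terms or is excluded."""
    terms = set(tokenize(query))
    return sorted(d for d, t in docs.items() if terms <= set(tokenize(t)))


print(rank_by_cosine("statute president", docs))
print(boolean_match("statute president", docs))
```

The ranked list orders all three documents by partial match, while the Boolean AND returns only the document containing both terms. Swapping the `tf` factor for another definition (for example a log-scaled count) is one way to explore the "different values for TF" question.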

SLIDE 17 IS 202 – FALL 2004
Information Retrieval Topics
IR systems and Implementation
– Draw and label a diagram that shows the major components of an IR system.
– What are the special features of the Cheshire II information access system?
– What is the purpose of an inverted index? How is it used to generate answers to Boolean queries?
– Convert the contents of a set of documents (short texts) into an inverted index representation.
Evaluation of IR Systems
– Define precision. Define recall. Define relevance. How are the three interrelated?
– Under what circumstances is high recall desirable? Under what circumstances is high precision desirable?
– What is the main purpose of TREC? How does it differ from earlier evaluation efforts?
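
The inverted index and precision/recall questions can be practiced with a small sketch like the one below. It assumes whitespace tokenization and set-based postings; the document texts, the relevance judgment, and the function names are illustrative assumptions only.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term: intersect the postings sets."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents containing at least one term: union the postings sets."""
    return set().union(*(index.get(t, set()) for t in terms))


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical short texts standing in for a document set.
docs = {
    1: "boolean queries and text processing",
    2: "web search issues and architecture",
    3: "implementing web site search engines",
}
index = build_inverted_index(docs)
retrieved = boolean_and(index, "web", "search")
print(sorted(retrieved))
# Assume, for illustration, that only document 3 is judged relevant.
print(precision_recall(retrieved, {3}))
```

The index is built once, so a Boolean query touches only the postings of its own terms rather than scanning every document.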

SLIDE 18 IS 202 – FALL 2004
Information Retrieval Topics
The Search Process and User Interfaces
– Search and retrieval is part of a larger process. Name some other components of that process.
– How/why doesn't the Bates berry-picking model fit with the standard information retrieval model?
– How (fundamentally) does search on a directory system like Yahoo differ from search on AltaVista or Google?

SLIDE 19 IS 202 – FALL 2004
Information Retrieval Topics
Relevance Feedback
– What is the main difference between relevance feedback as defined in the literature and the more current web-based notion of "more like this"?
– Given a query, three documents marked as relevant, and the Rocchio formula for relevance feedback given in class, compute the vector for the new query that results.
– The Koenemann & Belkin study found results in three conditions for relevance feedback: opaque, transparent, and penetrable. Consider the different ways people have recently implemented systems for predicting which web page to show the user next. How do the differences in these systems correspond to the different relevance feedback conditions?
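
For the Rocchio computation, the following is a minimal sketch using the commonly cited form of the formula: new query = α · query + β · centroid(relevant) − γ · centroid(non-relevant). The version given in class may use different weights or normalization, so the default values of `alpha`, `beta`, and `gamma`, the clipping of negative weights to zero, and the example vectors are all assumptions.

```python
def rocchio(query, relevant, nonrelevant=(), alpha=1.0, beta=0.75, gamma=0.15):
    """Return the reweighted query vector (all vectors are equal-length lists)."""
    dims = len(query)

    def centroid(vectors):
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    # Negative term weights are clipped to zero (an assumed, common convention).
    return [max(0.0, alpha * q + beta * r - gamma * n)
            for q, r, n in zip(query, rel_c, non_c)]


# Hypothetical 4-term vocabulary; one query and three relevant documents.
query = [1, 0, 1, 0]
relevant = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
print(rocchio(query, relevant))
```

Terms that occur in the relevant documents but not in the original query pick up a positive weight, which is how the feedback step expands the query.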

SLIDE 20 IS 202 – FALL 2004
Information Retrieval Topics
Database Design
– How is a database different than a file system?
– What are the benefits of a database system?
– What do we mean by data independence?
– What are the benefits/drawbacks of the primary database models?
– Entity-Relationship Diagrams -- what are they for, how do you create them?
– How do you normalize a relational model database?
– What is a join?
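
The join question can be illustrated with a small sketch using Python's built-in `sqlite3` module and an in-memory database. The two tables, their columns, and the rows are hypothetical; they show a normalized design in which the course title is stored once and lectures refer to it by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT
);
CREATE TABLE lecture (
    lecture_id INTEGER PRIMARY KEY,
    course_id  INTEGER REFERENCES course(course_id),
    topic      TEXT
);
INSERT INTO course VALUES (1, 'IS 202');
INSERT INTO lecture VALUES (13, 1, 'Midterm Review'), (29, 1, 'Final Review');
""")

# The join recombines rows from both tables wherever the key values match.
rows = conn.execute("""
SELECT course.title, lecture.lecture_id, lecture.topic
FROM lecture
JOIN course ON lecture.course_id = course.course_id
ORDER BY lecture.lecture_id
""").fetchall()
print(rows)
```

Because the title lives only in `course`, renaming the course means updating one row, and the join reconstructs the combined view on demand.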

SLIDE 21 IS 202 – FALL 2004
Information Organization Topics
Categorization
Knowledge Representation
Lexical Relations and WordNet
Controlled Vocabularies
Semantic Web and RDF
Facetted Classification and Thesaurus Design and Construction
Metadata Standards
Multimedia Information Organization and Retrieval
Metadata for Motion Pictures Media Streams and MPEG-7
Mobile and Context-Aware Multimedia Information Systems
Looking Backward Looking Forward Future of Information Systems
Project Presentations

SLIDE 22 IS 202 – FALL 2004
Information Organization Topics
Categorization
– What is the definition of class membership in traditional categorization? How does traditional categorization have difficulty describing certain phenomena, like games (give 1 other example besides games)?
– What is the “basic level” in categorization and how is it psychologically primary? How might the use of basic level categorization affect the design and use of information systems?
Knowledge Representation
– What limitations in standard information retrieval do knowledge representation technologies try to overcome? What challenges do they face in the attempt?
– What are the similarities and differences between commonsense knowledge representation systems like CYC and facetted metadata classifications like the Art and Architecture Thesaurus or the facetted classification you built (give three examples)?

SLIDE 23 IS 202 – FALL 2004
Information Organization Topics
Lexical Relations and WordNet
– What are three lexical relations in WordNet that would be useful in an information retrieval task (explain how and give examples)?
– Where are the meanings of the words in WordNet? How would assuming the conduit metaphor vs. the toolmakers’ paradigm of communication lead you to different answers to this question?
Controlled Vocabularies
– What does Svenonius consider to be the primary difficulties with using controlled vocabularies?
– What is the purpose of authority control? Is this a type of controlled vocabulary? Why or why not?

SLIDE 24 IS 202 – FALL 2004
Information Organization Topics
Semantic Web and RDF
– What are the different basic topological structures of XML and RDF? What benefits and problems do these respective structures offer for information organization and retrieval?
– What is the Semantic Web effort trying to accomplish? What challenges does that effort face and how might they be overcome?
Facetted Classification and Thesaurus Design and Construction
– What are the differences between classical and faceted classification and how do these differences affect the design and use of information systems?
– How is a classification scheme or a thesaurus designed?
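
One way to think about the XML versus RDF question: an XML document is a tree, where each element has exactly one parent, while RDF statements are subject-predicate-object triples that together form a directed labeled graph. The sketch below stores triples as plain tuples and queries them by pattern. All resource names and predicates are hypothetical, and this is not an RDF library, only an illustration of the graph shape.

```python
# Hypothetical triples: (subject, predicate, object).
triples = {
    ("photo1", "creator", "alice"),
    ("photo1", "depicts", "campanile"),
    ("photo2", "depicts", "campanile"),
    ("campanile", "locatedIn", "berkeley"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in triples
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# "campanile" has two incoming "depicts" edges and one outgoing edge,
# so it is a shared node in a graph rather than a child with one parent.
print(match(triples, p="depicts", o="campanile"))
print(match(triples, s="campanile"))
```

In a strict tree, the shared node would have to be duplicated under each photo or linked by reference, which is one practical difference to weigh when comparing the two structures.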

SLIDE 25 IS 202 – FALL 2004
Information Organization Topics
Metadata Standards
– What are the motivations behind creating and using metadata systems like Dublin Core, MARC, AACR II, etc.?
– How do metadata standards come about and how might their provenance affect their adoption?
Multimedia Information Organization and Retrieval
– What is the “Kuleshov Effect” and how might it affect the design of metadata for multimedia data?
– What are the “semantic gap” and the “sensory gap” and what challenges do they present for the design of information systems for multimedia data?

SLIDE 26 IS 202 – FALL 2004
Information Organization Topics
Metadata for Motion Pictures Media Streams and MPEG-7
– What limitations do keywords pose for multimedia information retrieval and how might those limitations be addressed?
– What aspects of multimedia content description is MPEG-7 attempting to standardize?
Mobile and Context-Aware Multimedia Information Systems
– How are cameraphones distinguished from traditional digital cameras in their technological capabilities and use (give 5 examples)?
– What contextual metadata could be useful in describing and retrieving information, and how (give 4 examples)?

SLIDE 27 IS 202 – FALL 2004
Information Organization Topics
Looking Backward Looking Forward Future of Information Systems
– How are Bush’s vision of the Memex and the current World Wide Web similar and different (explain two similarities and two differences)?
Project Presentations
– In revising your facetted metadata ontology, how did you increase its expressiveness and reusability (give 3 examples)?
– How well would the ontology you and your partner group designed support one of the other mobile media metadata applications presented by your classmates?

SLIDE 28 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 29 IS 202 – FALL 2004
Course Evaluations
Please take these seriously
We and your colleagues really benefit from these in many ways
– They affect our promotion and tenure
– They give us helpful feedback on what worked and what didn't, to help us for next year and beyond
– They in no way affect your grade

SLIDE 30 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 31 IS 202 – FALL 2004
Phone Details
Use over break?
Need roaming?
Want GPS unit?
Want to still get photos off the phone?
Want to switch to primary cell phone number?
Can bring in on Friday?
Can bring in on Monday?

SLIDE 32 IS 202 – FALL 2004
Lecture Overview
Final Exam
Final Review
Course Evaluations
Phone Details
Next Steps

SLIDE 33 IS 202 – FALL 2004
Study hard, and good luck!
Thank you for all the great work!