2004.10.12 - SLIDE 1IS 202 – FALL 2004 Lecture 13: Midterm Review Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2004 SIMS 202: Information Organization and Retrieval
2004.10.12 - SLIDE 2IS 202 – FALL 2004 Lecture Overview Midterm Review –The administrative details –The “Rules” for the exam –We will go through the sample questions and discuss them –Open question/answer period
2004.10.12 - SLIDE 4IS 202 – FALL 2004 Midterm Exam Details Date: 10/14/2004 Time: 10:30-12:00 The exam is open-book, open-note, AND open-computer There will be 8-10 questions on the exam You may use your own laptop, or one of the computers in the lab. The results of your work are to be printed The exam can be hand-written if you wish; if so, be sure to bring: –Pens/Pencils –Calculator –(Paper will be provided on the exam itself, but you may want to bring scratch paper)
2004.10.12 - SLIDE 5IS 202 – FALL 2004 Midterm Exam Details The exam will cover the first half of the course; that is, it will focus primarily on the Information Retrieval topics Each question will be worth a specific number of points, stated on the exam itself Partial credit will be awarded for partial answers In your answers, please balance conciseness with illustration of all of the requested information –In other words, don't write a lot of things that aren't asked for, but try to address all of what is asked for
2004.10.12 - SLIDE 7IS 202 – FALL 2004 Rules Do your own work No discussion during the exam –Yes, IM counts as discussion! –Yes, email counts as discussion! You are on your honor not to look at other students’ work (you may want to review the University policies on academic dishonesty) PROVIDE PROPER ATTRIBUTION for ideas taken from other sources (online or printed)
2004.10.12 - SLIDE 8IS 202 – FALL 2004 Rules Questions CAN and SHOULD be asked of me or the TAs Issues/Corrections/Answers for details will be put up on the screens in 202 We will also put these up on a web page for those in the Lab
2004.10.12 - SLIDE 10IS 202 – FALL 2004 Study Guide To study for the exam: Be sure you understand the material that was covered in lectures and have read and absorbed the corresponding material in the readings Be sure you can do activities similar to what was done in the homework assignments We will have questions that require you to generalize from what you've learned and synthesize ideas –So be sure you have thought about the ideas covered in lecture, readings, and homework assignments
2004.10.12 - SLIDE 11IS 202 – FALL 2004 Study Guide Alison suggests that you might want to bookmark online or printed resources so that you can quickly find the topics that you need
2004.10.12 - SLIDE 12IS 202 – FALL 2004 Example Questions These are available on the Class Web site Note that these examples are NOT the exact questions that will be on the exam but are similar to questions that have been used in the past There will be questions that ask you to do something with supplied data –For example, given some data, design an ER diagram describing the data elements and their relationships
2004.10.12 - SLIDE 13IS 202 – FALL 2004 Example Questions The example questions on the web site are organized (approximately) in the order that the topics were presented during the course: –Information –The Search process –Documents and Statistics of Text –Queries, Ranking, and the Vector Space Model –IR Systems and Implementation –Relevance Feedback –Evaluation of IR Systems –Database Design
2004.10.12 - SLIDE 14IS 202 – FALL 2004 (Approximate) Course Schedule Organization –Phone Project Introduction –Categorization –Knowledge Representation –Lexical Relations and WordNet –Metadata Introduction –Controlled Vocabularies Introduction –Facetted Classification –Thesaurus Design and Construction –Semantic Web –Multimedia Information Organization and Retrieval –Metadata for Media –Phone Project Presentations Retrieval –Overview –Introduction to the Search Process –Boolean Queries and Text Processing –Web Search Issues and Architecture –Statistical Properties of Text and Vector Representation –Probabilistic Ranking & Relevance Feedback –Evaluation –Interfaces for Information Retrieval –Database Design
2004.10.12 - SLIDE 15IS 202 – FALL 2004 Review of Course Content We can draw on: –14 sets of Slides (including this one and the Math Review slides) –Handout papers –The Reader –Textbooks –Assignments –Discussion questions and issues
2004.10.12 - SLIDE 16IS 202 – FALL 2004 Example Questions Topic: Information Example Questions: –What is the information life cycle? –What are different ways of measuring information? What are different ways of defining information?
2004.10.12 - SLIDE 17IS 202 – FALL 2004 Example Questions Topic: Document Representation and Statistical Properties of Text Example Questions: –What is the significance of Zipf's law for weighting of terms in information retrieval? –What kinds of errors can a stemming algorithm produce?
2004.10.12 - SLIDE 18IS 202 – FALL 2004 Example Questions Topic: Queries, Ranking, and the Vector Space Model Example Questions: –What is the difference between a search engine that uses the vector space ranking algorithm on natural language queries and a system that uses Boolean queries? –What is the role of coordination level ranking in a faceted Boolean system? –Describe the following information need in terms of a faceted Boolean query. What kinds of weighting algorithms can be applied to a faceted query like this? ``I would like to find articles about the effects of the passage of the independent investigator statute by Congress on how the U.S. president chooses an attorney general.'' –Why do different web search engines return different sets of documents for the same query? –Redo the computations of Assignment 3 part 3 using different values for TF.
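As a study aid for the vector space questions above, here is a minimal sketch of ranking by cosine similarity with two different TF choices. Assignment 3 is not reproduced in these slides, so the documents, the query, and the function names `tf_vector` and `cosine` are all hypothetical, and the log-scaled TF is only one common alternative to raw counts.

```python
import math
from collections import Counter

def tf_vector(text, log_tf=False):
    """Term-frequency vector as a dict; raw counts or 1 + ln(count)."""
    counts = Counter(text.lower().split())
    if log_tf:
        return {t: 1.0 + math.log(c) for t, c in counts.items()}
    return {t: float(c) for t, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy collection, not the assignment data.
docs = {
    "d1": "information retrieval information systems",
    "d2": "database design",
}
query = "information retrieval"

for use_log in (False, True):
    q = tf_vector(query, log_tf=use_log)
    scores = {d: cosine(q, tf_vector(text, log_tf=use_log))
              for d, text in docs.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    print("log TF" if use_log else "raw TF", ranking, scores)
```

Unlike a Boolean system, which returns an unordered set of matching documents, this kind of scoring orders every document that shares at least one term with the query.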
2004.10.12 - SLIDE 19IS 202 – FALL 2004 Example Questions Topic: IR systems and Implementation Example Questions: –Draw and label a diagram that shows the major components of an IR system. –What are the special features of the Cheshire II information access system? –What is the purpose of an inverted index? How is it used to generate answers to Boolean queries? –Convert the contents of a set of documents (short texts) into an inverted index representation.
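For the inverted index questions above, the following is a minimal sketch of converting a few short texts into an inverted index and answering Boolean AND/OR queries from it. The three documents and the helper names are hypothetical, tokenization is plain whitespace splitting, and no stemming or stopword removal is applied.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Documents containing ALL terms: intersect the posting sets."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, *terms):
    """Documents containing ANY term: union the posting sets."""
    return set().union(*(index.get(t, set()) for t in terms))

# Hypothetical short documents.
docs = {
    1: "information retrieval systems",
    2: "database systems design",
    3: "information organization",
}
index = build_inverted_index(docs)
print(sorted(index["systems"]))                          # posting list for one term
print(boolean_and(index, "information", "systems"))
print(boolean_or(index, "retrieval", "design"))
```

The point of the structure is that a query touches only the posting sets of its own terms instead of scanning every document.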
2004.10.12 - SLIDE 20IS 202 – FALL 2004 Example Questions Topic: Evaluation of IR Systems Example Questions: –Define precision. Define recall. Define relevance. How are the three interrelated? –Under what circumstances is high recall desirable? Under what circumstances is high precision? –What is the main purpose of TREC? How does it differ from earlier evaluation efforts?
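For the precision and recall definitions asked about above, a small worked sketch may help. It uses the standard set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant); the document IDs and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical judgments: 4 documents retrieved, 5 judged relevant, 2 in common.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5, 6, 7])
print(p, r)  # 0.5 0.4
```

Both measures depend on relevance judgments: without knowing which documents are relevant, neither can be computed.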
2004.10.12 - SLIDE 21IS 202 – FALL 2004 Example Questions Topic: The Search Process Example Questions: –Search and retrieval is part of a larger process. Name some other components of that process. –How/why doesn't the Bates berry-picking model fit with the standard information retrieval model? –How (fundamentally) does search on a directory system like Yahoo differ from search on Altavista or Google?
2004.10.12 - SLIDE 22IS 202 – FALL 2004 Example Questions Topic: Relevance Feedback Example Questions: –What is the main difference between relevance feedback as defined in the literature and the more current web-based notion of "more like this"? –Given a query, three documents marked as relevant, and the Rocchio formula for relevance feedback given in class, compute the vector for the new query that results. –The Koenemann & Belkin study found results in three conditions for relevance feedback: opaque, transparent, and penetrable. Consider the different ways people have implemented systems for predicting which web page to show the user next. How do the differences in these systems correspond to the different relevance feedback
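For the Rocchio computation question on this slide, here is a sketch of the commonly cited form of the formula, q' = alpha*q + beta*centroid(relevant) - gamma*centroid(non-relevant). The exact formula and weights given in class are not reproduced in these slides, so the values alpha = 1.0, beta = 0.75, gamma = 0.25 and the example vectors below are assumptions.

```python
def centroid(vectors, dims):
    """Component-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dims
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

def rocchio(query, relevant, nonrelevant=(), alpha=1.0, beta=0.75, gamma=0.25):
    """Standard-form Rocchio update; weights are assumed, not taken from the course."""
    dims = len(query)
    rel_c = centroid(list(relevant), dims)
    non_c = centroid(list(nonrelevant), dims)
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
            for i in range(dims)]

# Hypothetical 4-term vocabulary, one query, three documents marked relevant.
query = [1.0, 0.0, 1.0, 0.0]
relevant = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
]
new_query = rocchio(query, relevant)
print(new_query)  # approximately [1.75, 0.5, 1.5, 0.0]
```

With no non-relevant documents supplied, the gamma term contributes nothing, so the new query is the old query moved toward the centroid of the relevant documents.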
2004.10.12 - SLIDE 23IS 202 – FALL 2004 Example Questions Topic: Database Design Example Questions: –How is a database different than a file system? –What are the benefits of a database system? –What do we mean by data independence? –What are the benefits/drawbacks of the primary database models? –Entity-Relationship Diagrams -- what are they for, how do you create them? –How do you normalize a relational model database? –What is a join?
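For the "What is a join?" question above, the idea can be sketched outside of any database: an inner equijoin pairs each row of one table with every row of another that has the same value in a shared column. The tables `students` and `enrollments`, their columns, and the function `inner_join` are hypothetical examples, not course data.

```python
def inner_join(left, right, key):
    """Inner equijoin of two lists of dict rows on a shared column name."""
    by_key = {}
    for row in right:                       # index the right table by the join column
        by_key.setdefault(row[key], []).append(row)
    # Combine each left row with every matching right row; unmatched rows are dropped.
    return [{**l, **r} for l in left for r in by_key.get(l[key], [])]

# Hypothetical tables.
students = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bo"}]
enrollments = [
    {"sid": 1, "course": "IS202"},
    {"sid": 1, "course": "IS206"},
    {"sid": 3, "course": "IS255"},
]
joined = inner_join(students, enrollments, "sid")
for row in joined:
    print(row)
```

In SQL the same result would usually be written as a JOIN of the two tables on the `sid` column.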
2004.10.12 - SLIDE 25IS 202 – FALL 2004 Your Questions What other topics would you like more explanation for?
2004.10.12 - SLIDE 26IS 202 – FALL 2004 Be prepared, and good luck!