Download presentation
Presentation is loading. Please wait.
1
2004.12.09 - SLIDE 1IS 202 – FALL 2004 Lecture 29: Final Review Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2004 http://www.sims.berkeley.edu/academics/courses/is202/f04/ SIMS 202: Information Organization and Retrieval
2
2004.12.09 - SLIDE 2IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
3
2004.12.09 - SLIDE 3IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
4
2004.12.09 - SLIDE 4IS 202 – FALL 2004 Final Exam Details Date: December 14 Time: 9:30-12:30 The exam is open-book, open note AND open computer You may use your own laptop, or one of the computers in the lab The results will need to be printed It can be handwritten if you wish, if so be sure to bring pens, pencils, and erasers It is essential that you have access to and/or bring your final facetted classification so that you can analyze it and use it
5
2004.12.09 - SLIDE 5IS 202 – FALL 2004 Final Exam Details There will be 8 questions on the exam –Some questions have multiple parts One of the questions will be taken from the Discussion Questions you submitted in class Questions will be worth a specific number of points and these will be stated on the exam itself Partial credit will be awarded for partial answers, so we advise that you do not skip any questions In your answers, please balance conciseness with illustration of all of the requested information –In other words, don't write a lot of things that aren't asked for, but try to address all of what is asked for
6
2004.12.09 - SLIDE 6IS 202 – FALL 2004 Final Exam Details The exam will be comprehensive, covering both the Information Organization and Retrieval parts of the course –The emphasis will be on the last half (Organization) (about 70/30 bias towards the last half) Each person will work individually The exam period is three hours –You will likely need the entire time If you use network-accessed material for any part of the exam be sure to cite your sources
7
2004.12.09 - SLIDE 7IS 202 – FALL 2004 Study Guide Be sure you understand the material that was covered in lectures and have read and understood the assigned readings Be sure you can do activities similar to what was done in the homework assignments
8
2004.12.09 - SLIDE 8IS 202 – FALL 2004 Study Guide We will have questions that require you to generalize from what you've learned and synthesize ideas –So be sure you have thought about the ideas covered in lectures, readings, and homework assignments These ideas and abilities should be at your fingertips –There won't be time during the exam to do a lot of catch-up reading on topics you haven't studied
9
2004.12.09 - SLIDE 9IS 202 – FALL 2004 Example Questions These are available on the Class Web site Note that these examples are NOT the exact questions that will be on the exam but are similar to questions that have been used in the past If you have actively participated in the phone project assignments from the last part of the course and are familiar with the facetted classification you designed and built, this will greatly help you on at least 30% of the final exam
10
2004.12.09 - SLIDE 10IS 202 – FALL 2004 Review of Course Content We can draw on: –All sets of slides (including this one) –The Course Readers –Textbooks –Handout papers –Assignments –Discussion questions and issues
11
2004.12.09 - SLIDE 11IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
12
2004.12.09 - SLIDE 12IS 202 – FALL 2004 Course Schedule Organization –Categorization –Knowledge Representation –Lexical Relations and WordNet –Controlled Vocabularies Introduction –Phone Project Introduction –Semantic Web and RDF –Facetted Classification –Thesaurus Design and Construction –Metadata Standards –Multimedia Information Organization and Retrieval –Metadata for Media –Mobile and Context-Aware Multimedia Systems –Phone Project Presentations –Future of Information Systems Retrieval –Overview –What is Information? –History of Information Systems –Introduction to the Search Process –Boolean Queries and Text Processing –Web Search Issues and Architecture –Implementing Web Site Search Engines –Statistical Properties of Text and Vector Representation –Probabilistic Ranking & Relevance Feedback –Evaluation –Database Design
13
2004.12.09 - SLIDE 13IS 202 – FALL 2004 Your Questions What topics and/or questions would you like to discuss today?
14
2004.12.09 - SLIDE 14IS 202 – FALL 2004 Information Retrieval Topics Information Document Representation and Statistical Properties of Text Queries, Ranking, and the Vector Space Model IR systems and Implementation Evaluation of IR Systems The Search Process and User Interfaces Relevance Feedback Database Design
15
2004.12.09 - SLIDE 15IS 202 – FALL 2004 Information Retrieval Topics Information –What is the information life cycle? –What are different ways of measuring information? What are different ways of defining information? Document Representation and Statistical Properties of Text –What is the significance of Zipf's law for weighting of terms in information retrieval? –What kinds of errors can a stemming algorithm produce?
16
2004.12.09 - SLIDE 16IS 202 – FALL 2004 Information Retrieval Topics Queries, Ranking, and the Vector Space Model –What is the difference between a search engine that uses the vector space ranking algorithm on natural language queries and a system that uses Boolean queries? –What is the role of coordination level ranking in a facetted Boolean system? –Describe the following information need in terms of a faceted Boolean query. What kinds of weighting algorithms can be applied to a faceted query like this? “I would like to find articles about the effects of the passage of the independent investigator statute by Congress on how the U.S. president chooses an attorney general.'' –Why do different web search engines return different sets of documents for the same query? –Redo the computations of Assignment 3 part 3 using different values for TF.
17
2004.12.09 - SLIDE 17IS 202 – FALL 2004 Information Retrieval Topics IR systems and Implementation –Draw and label a diagram that shows the major components of an IR system. –What are the special features of the Cheshire II information access system? –What is the purpose of an inverted index? How is it used to generate answers to Boolean queries? –Convert the contents of a set of documents (short texts) into an inverted index representation. Evaluation of IR Systems –Define precision. Define recall. Define relevance. How are the three interrelated? –Under what circumstances is high recall desirable? Under what circumstances is high precision? –What is the main purpose of TREC? How does it differ from earlier evaluation efforts?
18
2004.12.09 - SLIDE 18IS 202 – FALL 2004 Information Retrieval Topics The Search Process and User Interfaces –Search and retrieval is part of a larger process. Name some other components of that process. –How/why doesn't the Bates berry-picking model fit with the standard information retrieval model? –How (fundamentally) does search on a directory system like Yahoo differ from search on Altavista or Google?
19
2004.12.09 - SLIDE 19IS 202 – FALL 2004 Information Retrieval Topics Relevance Feedback –What is main the difference between relevance feedback as defined in the literature and the more current web-based notion of "more like this"? –Given a query, three documents marked as relevant, and the Rocchio formula for relevance feedback given in class, compute the vector for the new query that results. –The Koenemann & Belkin study found results in three conditions for relevance feedback opaque, transparent, and penetrable. Consider the different ways people have recently implemented systems for predicting which web page to show the user next. How do the differences in these systems correspond to the different relevance feedback
20
2004.12.09 - SLIDE 20IS 202 – FALL 2004 Information Retrieval Topics Database Design –How is a database different than a file system? –What are the benefits of a database system? –What do we mean by data independence? –What are the benefits/drawbacks of the primary database models? –Entity-Relationship Diagrams -- what are they for, how do you create them? –How do you normalize a relational model database? –What is a join?
21
2004.12.09 - SLIDE 21IS 202 – FALL 2004 Information Organization Topics Categorization Knowledge Representation Lexical Relations and WordNet Controlled Vocabularies Semantic Web and RDF Facetted Classification and Thesaurus Design and Construction Metadata Standards Multimedia Information Organization and Retrieval Metadata for Motion Pictures Media Streams and MPEG-7 Mobile and Context-Aware Multimedia Information Systems Looking Backward Looking Forward Future of Information Systems Project Presentations
22
2004.12.09 - SLIDE 22IS 202 – FALL 2004 Information Organization Topics Categorization –What is the definition of class membership in traditional categorization? How does traditional categorization have difficulty describing certain phenomena, like games (give 1 other example besides games)? –What is the “basic level” in categorization and how is it psychologically primary? How might the use of basic level categorization affect the design and use of information systems? Knowledge Representation –What limitations in standard information retrieval do knowledge representation technologies try to overcome? What challenges do they face in the attempt? –What are the similarities and differences between commonsense knowledge representation systems like CYC and facetted metadata classifications like the Art and Architecture Thesaurus or the facetted classification you built (give three examples)?
23
2004.12.09 - SLIDE 23IS 202 – FALL 2004 Information Organization Topics Lexical Relations and WordNet –What are three lexical relations in WordNet that would be useful in an information retrieval task (explain how and give examples)? –Where are the meanings of the words in WordNet? How would assuming the conduit metaphor vs. the toolmakers’ paradigm of communication lead you to different answers to this question? Controlled Vocabularies –What does Svenonius consider to be the primary difficulties with using controlled vocabularies? –What is the purpose of authority control? Is this a type of controlled vocabulary? Why or why not?
24
2004.12.09 - SLIDE 24IS 202 – FALL 2004 Information Organization Topics Semantic Web and RDF –What are the different basic topological structures of XML and RDF? What benefits and problems do these respective structures offer for information organization and retrieval? –What is the Semantic Web effort trying to accomplish? What challenges does that effort face and how might they be overcome? Facetted Classification and Thesaurus Design and Construction –What are the differences between classical and faceted classification and how do these differences affect the design and use of information systems? –How is a classification scheme or a thesaurus designed?
25
2004.12.09 - SLIDE 25IS 202 – FALL 2004 Information Organization Topics Metadata Standards –What are the motivations behind creating and using metadata systems like Dublin Core, MARC, AACR II, etc.? –How do metadata standards come about and how might their provenance affect their adoption? Multimedia Information Organization and Retrieval –What is the “Kuleshov Effect” and how might it affect the design of metadata for multimedia data? –What are the “semantic gap” and the “sensory gap” and what challenges do they present for the design of information systems for multimedia data?
26
2004.12.09 - SLIDE 26IS 202 – FALL 2004 Information Organization Topics Metadata for Motion Pictures Media Streams and MPEG-7 –What limitations do keywords pose for multimedia information retrieval and how might those limitations be addressed? –What aspects of multimedia content description is MPEG-7 attempting to standardize? Mobile and Context-Aware Multimedia Information Systems –How are cameraphones distinguished from traditional digital cameras in their technological capabilities and use (give 5 examples)? –What and how could contextual metadata be useful in describing and retrieving information (give 4 examples)?
27
2004.12.09 - SLIDE 27IS 202 – FALL 2004 Information Organization Topics Looking Backward Looking Forward Future of Information Systems –How are Bush’s vision of the Memex and the current World Wide Web similar and different (explain two similarities and two differences)? Project Presentations –In revising your facetted metadata ontology how did you increase its expressiveness and reusability (give 3 examples)? –How well would the ontology you and your partner group designed support one of the other mobile media metadata applications presented by your classmates?
28
2004.12.09 - SLIDE 28IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
29
2004.12.09 - SLIDE 29IS 202 – FALL 2004 Course Evaluations Please take these seriously We and your colleagues really benefit from these in many ways –Affect our promotion and tenure –Give us helpful feedback on what worked and what didn't to help us for next year and beyond –They in no way affect your grade
30
2004.12.09 - SLIDE 30IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
31
2004.12.09 - SLIDE 31IS 202 – FALL 2004 Phone Details Use over break? Need roaming? Want GPS unit? Want to still get photos off the phone? Want to switch to primary cell phone number? Can bring in on Friday? Can bring in on Monday?
32
2004.12.09 - SLIDE 32IS 202 – FALL 2004 Lecture Overview Final Exam Final Review Course Evaluations Phone Details Next Steps
33
2004.12.09 - SLIDE 33IS 202 – FALL 2004 Study hard, and good luck! Thank you for all the great work!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.