1
CS580: Building Web Based Information Systems
Roger Alexander & Adele Howe
The purpose of the course is to teach theory and practice underlying the construction of Web based information systems. As such, the course will devote equal time to information retrieval and software engineering topics. The theory will be put into practice through a semester-long team programming project.
2
Information Gathering Process
[Figure: diagram with the labels Information System, WWW, Query/Feedback, and Documents]
3
User Tasks
  Searching/Retrieval: user formulates information need as a precise query. Target is a single or small number of documents.
  Surfing/Browsing: user has an interest and no clearly defined objectives.
  Filtering: on-going information interest; IR system takes the initiative (pushing model).
4
Information Sources
  Access: WWW, ftp, libraries
  Format: natural language
    representation: ASCII, PDF, RTF, Word…
    structure: ASCII, HTML, XML, DAML…
  Content: text, images, sound, …
5
Basic Tools
  Indexes: human or machine generated; accessed through search engines or catalogs
  Link Traversal: accessed through browsers or software agents
  Queries: descriptions of information need
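A machine-generated index of the kind mentioned above can be sketched as an inverted index: a map from each word to the documents that contain it. This is a minimal illustration, not the course's implementation; the tiny corpus, the document ids, and the function names `build_index` and `search` are all assumptions made up for the example, and whitespace splitting stands in for real tokenization.

```python
from collections import defaultdict


def build_index(corpus):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical three-document corpus, invented for this sketch.
corpus = {
    "d1": "web search engines build indexes",
    "d2": "browsers follow links on the web",
    "d3": "software agents follow links and build indexes",
}
index = build_index(corpus)
print(sorted(search(index, "build indexes")))
```

Here the query is treated as a conjunction of words, which is only one of several possible query models.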
6
Issues
  Size/Efficiency: Repository (i.e., WWW) is huge; users crave instant access.
  Accuracy/Relevance: Noise on WWW can be overwhelming. It is difficult to describe information need and find what you want. Semantics are lacking.
  Reliability: Behavior of systems must be predictable and robust to interference.
7
Issues (cont.)
  Duration of Information Need: one time or on-going
  Adaptability/Personalization
  Trustworthiness/Authority: Information sources may not be credible (e.g., web sites for Flat Earth Society).
  Timeliness/Dynamism: WWW changes every second; links are updated, moved or removed.
8
Issues (cont.)
  Privacy: Content and access providers can monitor users.
  Security: Information transmitted may be captured and read.
9
Basic IR Terminology
  Document: single unit of information, anything from a book to a sentence.
  Corpus: a collection of documents
  Repository: location where documents are stored
  Term: atomic semantic/syntactic unit in document, usually word or phrase
  Query: statement of information need, sometimes summary of document
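To make the terminology concrete, the sketch below treats a corpus as a list of document strings, extracts single-word terms from each document, and counts how often the terms of a query occur in each one. Everything here is an assumption for illustration: the sample sentences, the `terms` helper, the choice of single lowercase words as terms (phrases are ignored), and the simple occurrence-count score.

```python
import re
from collections import Counter


def terms(document):
    """Split a document into lowercase single-word terms."""
    return re.findall(r"[a-z0-9]+", document.lower())


# Hypothetical corpus: each string is one document.
corpus = [
    "Information retrieval finds documents.",
    "A corpus is a collection of documents.",
    "Retrieval of information from the Web.",
]

# The query is processed with the same term extraction as the documents.
query = "information retrieval"

# One term-frequency table per document.
term_counts = [Counter(terms(doc)) for doc in corpus]

# Score each document by the total occurrences of the query terms.
# A Counter returns 0 for a term that is absent from the document.
scores = [sum(counts[t] for t in terms(query)) for counts in term_counts]
print(scores)
```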
10
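Evaluation
  Recall emphasizes returning all relevant documents:
    Recall = |Retrieved ∩ Relevant| / |Relevant|
  Precision emphasizes returning only relevant documents:
    Precision = |Retrieved ∩ Relevant| / |Retrieved|
  [Figure: diagram with the labels Retrieved, Relevant, and Corpus]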
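Both measures are conventionally computed as set ratios: recall divides the number of relevant documents retrieved by the number of relevant documents, and precision divides the same count by the number of documents retrieved. The sketch below assumes documents are identified by hypothetical ids and, as a convention chosen for this example only, returns 0.0 when the denominator would be zero.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0  # assumed convention for an empty result set
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0  # assumed convention when nothing is relevant
    return len(retrieved & relevant) / len(relevant)


# Hypothetical result set and relevance judgments.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5"}

p = precision(retrieved, relevant)  # 2 of the 4 retrieved are relevant
r = recall(retrieved, relevant)     # 2 of the 3 relevant were retrieved
print(f"precision={p:.2f} recall={r:.2f}")
```

Retrieving more documents can raise recall while lowering precision, which is why the two are usually reported together.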