Web- and Multimedia-based Information Systems
Assessment Presentation Programming Assignment
Presentations Document Management Systems & OCR – Market Overview – Algorithm Introduction Video on Demand – Real Media – Technology Authoring Systems – Macromedia Products
Presentations Content Management – Functionality – Market Overview – Opencms Application Server – Functionality – Market Overview
Presentations VRML – Syntax Introduction – Exercise SMIL : Multimedia Synchronisation – Syntax Introduction – Exercise
Presentations Software / Frontend Ergonomics (HCI) Usability Navigation
What are Information Systems? Store Information Retrieval
Information Textual Audiovisual – Images – Audio – Video Multimedia Documents
Information System Classification Expert Systems Transaction Processing Systems Office Automation Systems Management/Executive Information Systems Geographic Information Systems Information Retrieval Systems
Expert Systems Problem Solving Artificial Intelligence Replace an Expert Multiple operational Implementations Often Implemented using Prolog
Transaction Processing Systems Records Events of interest to an organization Supports the operational level of the business High data volume
TPS applications Manufacturing and Production Sales and Marketing Finance and Accounting Human Resources
Office Automation Personal Productivity Groupware & Communications
Management/Executive Information Systems Analysis of TPS data Higher Level Reports Drill Down to detailed Information possible
Geographic Information Systems Different Sources Spatial Data Visualization
Information Retrieval
Information Retrieval System Manages Documents = Records of Information Presents relevant Documents on a Query
Information Retrieval System Examples POTS directory assistance Library Catalog World Wide Web Search Engine
Information Retrieval Deals with the – Representation of – Storage of – Organization of – Access to Information items
History Early Example: Book's Table of Contents Indices in libraries Only recently automatic indexing The Web – Easy & cheap access – Variety of sources – Freedom of Publication, Interactivity
Data Retrieval vs Information Retrieval Data Retrieval: Exact match – Looks for matching items – Complete Query – Data with well defined structure and semantics Information Retrieval: Best match – Looks for Relevant Items – Incomplete Query – Natural Language Documents
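The contrast between exact match and best match can be sketched in a few lines of Python. This is only an illustration: the function names, the sample records, and the word-overlap score are assumptions made for the example and do not come from the slides.

```python
def exact_match(records, field, value):
    """Data retrieval: return only the records whose field equals the value."""
    return [r for r in records if r.get(field) == value]


def best_match(documents, query):
    """Information retrieval: rank documents by the number of shared query words."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Hypothetical sample data, invented for this sketch.
records = [
    {"author": "Author A", "year": 1999},
    {"author": "Author B", "year": 2001},
]
documents = [
    "indexing text for retrieval",
    "video on demand technology",
    "text retrieval on the web",
]
```

The exact-match lookup either finds a record or returns nothing, while the best-match query still returns partially relevant documents, ordered by how well they fit.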
Information Retrieval and the Web IR originally Text Indexing and Searching Web is a highly heterogeneous System, no common data model Navigation is inefficient Information Retrieval promises to structure information and ease fulfilling information needs
Usage: Information Retrieval User has Information need User translates this need into a machine-understandable Query System retrieves relevant Information
The User Retrieval Browsing Database
Logical Views of a Document Full text Set of Index Terms – Specified by human expert – Text Operations Elimination of Stopwords Stemming Compression Intermediate Logical Views Structure Recognition
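The text operations that turn a full text into a set of index terms can be sketched as follows. The stopword list is a tiny hypothetical sample, and `naive_stem` is a deliberately crude suffix stripper chosen for brevity; it is not the Porter stemmer or any other established algorithm.

```python
import re

# Hypothetical, deliberately small stopword list for the example.
STOPWORDS = {"a", "an", "and", "of", "the", "to", "in", "is", "on"}

# Suffixes are tried in this order; the first one that fits is removed.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, tokenize, drop stopwords, then stem the remaining words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]
```

For instance, `index_terms("The indexing of documents")` yields `["index", "document"]`: two stopwords are dropped and the two remaining words are reduced to their stems.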
Retrieval Process User Interface Text Operations Text Database DB Manager Module Index Searching Ranking
Operational Modes Ad Hoc – Fixed Database, changing Queries Filtering – Fixed Queries, changing Database – User Profiles
Information Retrieval Data Structures
Data Structures Linear list Sequentially ordered file Indexed file
Linear List Unsorted list of documents Easy addition of files Traversal required for a search Author D Author E Author A Author F Author B Author G Author C
Sequentially Ordered File Sorted by the values of a Key Addition of documents more involved Binary search possible Author A Author B Author C Author D Author E Author F Author G
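The difference between searching a linear list and a sequentially ordered file can be sketched in Python. The author names are taken from the slides; the function names are assumptions for this example, and the standard `bisect` module stands in for a hand-written binary search.

```python
from bisect import bisect_left


def linear_search(items, key):
    """Unsorted list: every entry may have to be visited."""
    for position, item in enumerate(items):
        if item == key:
            return position
    return -1


def binary_search(sorted_items, key):
    """Sorted file: halve the search range at every step."""
    position = bisect_left(sorted_items, key)
    if position < len(sorted_items) and sorted_items[position] == key:
        return position
    return -1


unsorted_file = ["Author D", "Author E", "Author A", "Author F",
                 "Author B", "Author G", "Author C"]
sorted_file = sorted(unsorted_file)
```

Appending to `unsorted_file` is cheap, whereas adding to `sorted_file` requires finding the right slot first (for example with `bisect.insort`), which is the trade-off the two slides describe.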
Indices [Figure: index entries A → 1, B → 2, ... F → 6 pointing to the records Author A, Author B, Author C, Author D, Author E, Author F]
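An indexed file can be sketched as a separate lookup table that maps each key to the position of its record. The choice of the final letter as key and the 1-based positions are assumptions read off the slide's figure, and the function names are hypothetical.

```python
records = ["Author A", "Author B", "Author C",
           "Author D", "Author E", "Author F"]


def build_index(records):
    """Map each key (here: the last word of the record) to a 1-based position."""
    return {record.split()[-1]: position
            for position, record in enumerate(records, start=1)}


def lookup(index, records, key):
    """Consult the index first, then fetch the record directly."""
    position = index.get(key)
    return None if position is None else records[position - 1]


index = build_index(records)
```

A lookup such as `lookup(index, records, "F")` goes straight to the sixth record without traversing the file.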
Inverted Indices An index of all the words in the texts Vocabulary – Different Words in the text – Little Space required after Text Operations Occurrences – Positions – More Space required, ~30-40% of text size
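A minimal sketch of an inverted index, assuming that occurrences are stored as (document number, word position) pairs. Real systems may store character offsets instead, and the sample documents are invented for the example.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map every word of the vocabulary to the list of its occurrences."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        words = re.findall(r"\w+", text.lower())
        for position, word in enumerate(words):
            index[word].append((doc_id, position))
    return dict(index)


# Hypothetical two-document collection.
documents = [
    "the web is a text database",
    "text indexing and text searching",
]
index = build_inverted_index(documents)
vocabulary = sorted(index)
```

The vocabulary holds each distinct word once, while the occurrence lists grow with the text, which is why the occurrences dominate the space requirement.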
Inverted Indices Block Addressing – Smaller Pointers – References in one block are collapsed – Online Search required for exact positions – Fixed Size Blocks or Natural Cuts Fully Inverted Indices – For less readily accessible collections if exact position is required
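Block addressing can be sketched as follows, assuming fixed-size blocks measured in words. The block size of four, the sample text, and all function names are illustrative assumptions only.

```python
def split_into_blocks(words, block_size):
    """Cut the word sequence into fixed-size blocks."""
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def build_block_index(blocks):
    """Store one block number per word and block; repeats are collapsed."""
    index = {}
    for block_no, block in enumerate(blocks):
        for word in block:
            entry = index.setdefault(word, [])
            if not entry or entry[-1] != block_no:
                entry.append(block_no)
    return index


def exact_positions(word, blocks, index, block_size):
    """Scan only the candidate blocks to recover exact word positions."""
    positions = []
    for block_no in index.get(word, []):
        for offset, candidate in enumerate(blocks[block_no]):
            if candidate == word:
                positions.append(block_no * block_size + offset)
    return positions


words = "text indexing and text searching on the web text".split()
blocks = split_into_blocks(words, 4)
block_index = build_block_index(blocks)
```

The index records only that "text" occurs in blocks 0 and 2; the two occurrences inside block 0 share one entry, and the scan in `exact_positions` is the online search needed to find them individually.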
Information Retrieval Models
Classic IR Models Boolean Vector Probabilistic
Common Concepts Index Terms Weights for varying relevance
Boolean Model Pros Easy to understand Precise Semantics of a query Cons Binary Decision Difficult for users
Boolean Model Example Query Q = 1 AND (2 OR NOT 3) [Figure: query tree with nodes AND, 1, OR, 2, NOT, 3]
Boolean Model – Set Operations AND: Intersection OR: Union NOT: Complement – Seldom used on its own
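The set operations map directly onto Python's built-in sets. The sketch below evaluates the example query 1 AND (2 OR NOT 3); the document numbers and the postings for the three terms are hypothetical, and `t1`, `t2`, `t3` stand for the index terms 1, 2, 3.

```python
# Hypothetical collection of five documents and the documents
# that contain each index term.
all_docs = {1, 2, 3, 4, 5}
postings = {
    "t1": {1, 2, 3},
    "t2": {2, 4},
    "t3": {3, 4, 5},
}


def intersect(a, b):
    """AND: documents contained in both sets."""
    return a & b


def union(a, b):
    """OR: documents contained in at least one set."""
    return a | b


def complement(a, universe):
    """NOT: every document of the collection that is not in the set."""
    return universe - a


# Q = 1 AND (2 OR NOT 3)
not_t3 = complement(postings["t3"], all_docs)
result = intersect(postings["t1"], union(postings["t2"], not_t3))
```

The complement needs the whole collection as its universe, which hints at why NOT is seldom used on its own: without another term to restrict it, it returns nearly every document.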