Evidence from Metadata INST 734 Doug Oard Module 8.

Evidence from Metadata INST 734 Doug Oard Module 8

Agenda  Metadata Intentional description Incidental description Linked data

HTML Meta Tags
INST 734: Information Retrieval Systems
<META NAME="DESCRIPTION" CONTENT="Make Money Fast">
<META NAME="KEYWORDS" CONTENT="easy,money,part-time,home">
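A minimal sketch of how an indexer might read these tags, using Python's standard `html.parser` module. The class name `MetaTagParser` and the helper `extract_meta` are hypothetical names chosen for illustration; the slide does not describe any particular parser.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect NAME/CONTENT pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name is not None and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html):
    """Return a dict mapping lowercased meta names to their content."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# The two tags from the slide, with straight quotes.
PAGE = (
    '<META NAME="DESCRIPTION" CONTENT="Make Money Fast">'
    '<META NAME="KEYWORDS" CONTENT="easy,money,part-time,home">'
)

meta = extract_meta(PAGE)
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
print(meta["description"])
print(keywords)
```

Because the page author writes these values, nothing stops them from describing the page misleadingly, which is one reason a search system might weight such metadata cautiously.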

Metadata Uses
Have it
  – Preservation (e.g., PREMIS)
  – Validation
  – Disposition
Find it
  – Search/Recognize/Choose
  – Browse (“Navigation”)
Serve it
  – Persistent location
  – Structure
  – Surrogates
Use it
  – Context
  – Rights management
  – User behavior capture
  – Reasoning (“Semantic Web”)

Problems with “Free Text” Search
Homonymy
  – Terms may have many unrelated meanings
  – Polysemy (related meanings) is less of a problem
Synonymy
  – Many ways of saying (nearly) the same thing
Anaphora
  – Alternate ways of referring to the same thing

Controlled Vocabulary
Develop a concept inventory
  – Uniquely identify concepts using “descriptors”
  – Concept labels form a “controlled vocabulary”
  – Organize concepts using a “thesaurus”
Assign concept descriptors to documents
  – Also known as “indexing”
Craft queries using the controlled vocabulary
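One possible way to picture a concept inventory is as a small data structure in which every concept has exactly one descriptor, a set of non-preferred entry terms that lead to it, and an optional broader concept. The sketch below is only an illustration: the `Concept` class, the sample thesaurus entries, and the helper names are all assumptions, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Concept:
    descriptor: str                 # the single preferred label for the concept
    use_for: tuple = ()             # non-preferred entry terms that map to it
    broader: Optional[str] = None   # descriptor of the broader concept, if any


# Hypothetical sample thesaurus, invented for this example.
THESAURUS = [
    Concept("automobiles", ("car", "cars", "motorcar"), broader="vehicles"),
    Concept("vehicles"),
    Concept("volunteerism", ("volunteering", "volunteer work")),
]

BY_DESCRIPTOR = {c.descriptor: c for c in THESAURUS}

# Entry vocabulary: every descriptor and every use-for term points to a descriptor.
ENTRY = {}
for concept in THESAURUS:
    ENTRY[concept.descriptor] = concept.descriptor
    for term in concept.use_for:
        ENTRY[term] = concept.descriptor


def to_descriptor(term):
    """Map a free-text term to its descriptor, or None if it is not in the vocabulary."""
    return ENTRY.get(term.lower().strip())


def index_document(terms):
    """Assign descriptors to a document given terms an indexer picked out."""
    return {d for d in (to_descriptor(t) for t in terms) if d is not None}


def broader_terms(descriptor):
    """Walk up the thesaurus hierarchy from a descriptor."""
    chain = []
    current = BY_DESCRIPTOR.get(descriptor)
    while current is not None and current.broader is not None:
        chain.append(current.broader)
        current = BY_DESCRIPTOR.get(current.broader)
    return chain


print(to_descriptor("Car"))
print(index_document(["car", "bicycle", "Motorcar"]))
print(broader_terms("automobiles"))
```

The same `to_descriptor` lookup can serve both the indexer and the searcher, which is what lets queries and documents meet on the same labels.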

Two Ways of Searching
[Diagram: two parallel query-document matching paths, both feeding a Retrieval Status Value]
Content-Based Query-Document Matching
  – Author: write the document using terms to convey meaning (Document Terms)
  – Free-Text Searcher: construct query from terms that may appear in documents (Query Terms)
Metadata-Based Query-Document Matching
  – Indexer: choose appropriate concept descriptors (Document Descriptors)
  – Controlled Vocabulary Searcher: construct query from available concept descriptors (Query Descriptors)
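The contrast between the two matching paths can be sketched on a toy collection. Everything here is assumed for illustration: the three documents, their descriptors, and the use of a simple overlap count as a stand-in for the retrieval status value (a real system would use a proper ranking function).

```python
# Hypothetical collection: each document has author-written text and
# indexer-assigned descriptors.
DOCS = {
    "d1": {
        "text": "Residents spent the weekend cleaning the riverbank for free",
        "descriptors": {"volunteerism"},
    },
    "d2": {
        "text": "The city council debated volunteerism funding",
        "descriptors": {"volunteerism", "local government"},
    },
    "d3": {
        "text": "New car sales rose last quarter",
        "descriptors": {"automobiles"},
    },
}


def tokenize(text):
    """Lowercase whitespace tokenization; deliberately naive."""
    return set(text.lower().split())


def content_rsv(query):
    """Content-based matching: overlap between query terms and document terms."""
    q = tokenize(query)
    scores = {doc_id: len(q & tokenize(doc["text"])) for doc_id, doc in DOCS.items()}
    return {doc_id: s for doc_id, s in scores.items() if s > 0}


def metadata_rsv(query_descriptors):
    """Metadata-based matching: overlap between query and document descriptors."""
    q = set(query_descriptors)
    scores = {doc_id: len(q & doc["descriptors"]) for doc_id, doc in DOCS.items()}
    return {doc_id: s for doc_id, s in scores.items() if s > 0}


print(content_rsv("volunteerism"))
print(metadata_rsv({"volunteerism"}))
```

In this toy data the free-text query misses d1 because its author never wrote the word "volunteerism", while the descriptor query finds it because the indexer captured the implied concept.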

Controlled Vocabulary Applications
When implied concepts must be captured
  – Political action, volunteerism, …
When searchers can’t guess what was written
  – Searching foreign language materials
When no words are present
  – Photos w/o captions, videos w/o transcripts, …
When user needs are easily anticipated
  – Weather reports, yellow pages, …

Controlled Vocabulary Challenges
Changing concept inventories
  – Literary warrant and user needs are hard to predict
Accurate concept indexing is expensive
  – Machines are inaccurate, humans are inconsistent
Users and indexers may think differently
  – Diverse user populations add to the complexity
Using thesauri effectively requires training
  – Meta-knowledge and thesaurus-specific expertise

Open Archival Information System (OAIS) Reference Model

Metadata Sources
Manual
  – Professional
  – Community
  – Personal
Automated
  – Capture
  – Extraction
  – Classification

Machine-Assisted Indexing
Access Innovations system:
  //TEXT: science
  IF (all caps)
    USE research policy
    USE community program
  ENDIF
  IF (near "Technology" AND with "Development")
    USE community development
    USE development aid
  ENDIF
near: within 250 words
with: in the same sentence
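A rough Python reading of the rule above, to make the "near" and "with" conditions concrete. This is not the Access Innovations implementation: the function name `suggest_descriptors`, the regex-based word and sentence splitting, the case-insensitive matching of "Technology" and "Development", and the treatment of the two IF blocks as independent are all assumptions.

```python
import re

NEAR_WINDOW = 250  # "near: within 250 words", as stated on the slide


def words(text):
    """Alphabetic tokens, original case preserved."""
    return re.findall(r"[A-Za-z]+", text)


def sentences(text):
    """Naive sentence split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def suggest_descriptors(text, trigger="science"):
    """Suggest descriptors for one trigger term, following the slide's rule."""
    toks = words(text)
    lower = [t.lower() for t in toks]
    positions = [i for i, t in enumerate(lower) if t == trigger]
    if not positions:
        return []

    suggested = []

    # IF (all caps): some occurrence of the trigger is written in capitals.
    if any(toks[i].isupper() for i in positions):
        suggested += ["research policy", "community program"]

    # near "Technology": within NEAR_WINDOW words of the trigger.
    tech = [i for i, t in enumerate(lower) if t == "technology"]
    near_tech = any(abs(i - j) <= NEAR_WINDOW for i in positions for j in tech)

    # with "Development": in the same sentence as the trigger.
    with_dev = False
    for sentence in sentences(text):
        sent_words = {w.lower() for w in words(sentence)}
        if trigger in sent_words and "development" in sent_words:
            with_dev = True
            break

    if near_tech and with_dev:
        suggested += ["community development", "development aid"]

    return suggested


print(suggest_descriptors("SCIENCE funding was cut."))
print(suggest_descriptors("Science and Technology Development offices merged."))
```

The output is a list of candidate descriptors, which in a machine-assisted workflow a human indexer would presumably review rather than accept automatically.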

Metadata Design Issues
Balance cost and benefit
  – Complement (don’t repeat) content and behavior
Accommodate dynamic factors
  – Changing concepts, content, URLs, …
Limit adversarial behavior
  – Social authority, transparency, …
Consider the future
  – Interpretability, automated reasoning, …

Agenda
  Metadata
→ Intentional description
  Incidental description
  Linked data