Cornell CS 502 Metadata for the Web Issues and Simple Answers CS 502 – 20020221 Carl Lagoze – Cornell University.

Presentation transcript:


“Metadata is data about data”

Some untested hypotheses
Metadata is useful for…
  – People
  – Machines
More metadata is better
(semi) automated digital libraries and simple metadata

Some known facts
Number and variety of metadata vocabularies will continue to increase
The Tower of Babel is a franchise
  – There is not one common view of reality
“The one thing I know about metadata is that it is expensive”

Are metadata and data distinguishable?
Objectivity?
Intellectual property?
Structure?
Aboutness?

The fiction of classification
…there is no classification of the universe that is not fictional and conjectural.
Jorge Luis Borges

Lenses and Views
All classification does and should provide a biased lens or view of reality
Each view emphasizes certain characteristics and hides others
Views shown: Geospatial, Rights, Museum

Reality is Complex
Created by: George Castaldo; Created on: 1994
Created by: Leonardo da Vinci; Created on: 1506
Relationship?

Objects are Related
IFLA Entity Model

Entities, Events, and Agents
Photographer
Camera type
Software
Computer artist

Haven’t we done metadata already?

What’s wrong with this model?
Expensive
  – Complex (even for its original goal?)
  – Professional intervention (assumes single community of expertise)
Monolithic
  – One size fits all approach
  – Reflects its centralized system origins
Bias towards physical artifacts
  – Fixed resources
  – Incomplete handling of resource evolution and other resource relationships
Anglo-centric

Web Challenge to Traditional Cataloging
Scale
Permanence
Authenticity
Organizational Context
Custodial Control
Variety

Internet Commons includes Multiple Communities
Internet Commons: Scientific Data, Home Pages, Geo, Library, Museums, Commerce, Whatever...

Realities of Web search and discovery
Search systems are motivated by advertising
Index coverage is unpredictable and limited
Too much recall, too little precision
Index spam abounds
Resources (and their names) are volatile

Metadata: Part of a Solution
Structured data about data
  – helps to impose order on chaos
  – enables automated discovery/manipulation
Variety across various dimensions:
  – specialization
  – decentralization
  – democratization

Web Metadata Issues: Description vs. Discovery
Library cataloging motivated by describing resources
Fuzzy search buckets
  – Separate books about Sigmund Freud versus books by Sigmund Freud into different buckets
  – But, different types of data appropriate for different buckets: URLs, date strings, word strings, names
But general, fuzzy categories may not be sufficient for describing resources
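The "fuzzy search bucket" idea can be sketched in a few lines. This is only an illustration, not anything from the lecture: the record layout, the `search` helper, and the second catalogue entry are all assumptions made up for the example. It shows how keeping "by" and "about" values in separate buckets lets one query term return different results depending on the bucket searched.

```python
# Toy catalogue: each record keeps "by" data (creator) and "about" data
# (subject) in separate buckets. The second record is invented purely
# for illustration.
records = [
    {"title": "The Interpretation of Dreams",
     "creator": ["Sigmund Freud"],
     "subject": ["Dreams"]},
    {"title": "A Hypothetical Biography",
     "creator": ["A. N. Author"],
     "subject": ["Sigmund Freud"]},
]


def search(records, bucket, term):
    """Return titles of records whose named bucket contains the term.

    Matching is a case-insensitive substring test, i.e. deliberately fuzzy.
    A record that lacks the bucket simply does not match.
    """
    needle = term.lower()
    return [
        record["title"]
        for record in records
        if any(needle in value.lower() for value in record.get(bucket, []))
    ]


by_freud = search(records, "creator", "freud")      # books by Freud
about_freud = search(records, "subject", "freud")   # books about Freud
print(by_freud)
print(about_freud)
```

The substring match is intentionally crude. Typed buckets (URLs, date strings, names) would each need their own comparison rule, which is the slide's second point.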

Web Metadata Models: Drill-Down Searching Paradigm
Moving along a specificity spectrum
Inter-domain vs. intra-domain terms, models, query mechanisms
One size doesn't fit all
  – Cognitive models of searching and browsing

Metadata Takes Many Forms

Metadata: Part of the problem
[Chart with axes "cost" and "functionality", locating AACR2/MARC, Google, and Dublin Core]

Metadata Challenges
Accommodate multiple varieties of metadata
  – community-specific functionality, creation, administration, access
Tensions
  – functionality and simplicity
  – extensibility and interoperability
  – human and machine creation and use

Interoperability has many facets
Semantics
  – Meaning/classification/ontology
Models/Structure
  – Entities and relationships
Syntax
  – grammars to convey semantics and structure

Warwick Framework: Containing Chaos
Conceptual Architecture for metadata from the Warwick Metadata Workshop (DC-2)
Conceptual architecture to support the specification, collection, encoding, and exchange of modular metadata
Provide context for metadata efforts (including Dublin Core)
  – avoids the “black-hole” of comprehensive element sets
  – focuses interoperability issues at package level

Metadata Container
Container
  – Package: Dublin Core
  – Package: MARC record
  – Package: Indirect Reference
  – Package: Terms and Conditions
URI
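One way to picture the container/package arrangement is as a small data structure. The sketch below is an assumption-laden reading of the diagram, not a specification of the Warwick Framework: the class names, the `kind` labels, the `of_kind` helper, and the `example.org` URI are all hypothetical. It treats the terms-and-conditions package as reached through an indirect reference (a URI), which is one plausible interpretation of the slide.

```python
from dataclasses import dataclass, field
from typing import Any, List, Union


@dataclass
class Package:
    """A typed metadata package held directly inside the container."""
    kind: str      # hypothetical type label, e.g. "dublin-core" or "marc"
    content: Any   # opaque to the container; each community defines its own


@dataclass
class IndirectReference:
    """A package that lives elsewhere and is referenced by URI."""
    kind: str
    uri: str


@dataclass
class Container:
    """Aggregates packages without interpreting their contents."""
    packages: List[Union[Package, IndirectReference]] = field(default_factory=list)

    def add(self, package: Union[Package, IndirectReference]) -> "Container":
        self.packages.append(package)
        return self

    def of_kind(self, kind: str) -> list:
        """Select packages by type label; the container knows nothing more."""
        return [p for p in self.packages if p.kind == kind]


container = Container()
container.add(Package("dublin-core", {"Title": "Grammar of Dublin Core",
                                      "Creator": "Tom Baker"}))
container.add(Package("marc", "<opaque MARC record>"))  # placeholder content
container.add(IndirectReference("terms-and-conditions",
                                "https://example.org/terms"))  # placeholder URI

kinds = [p.kind for p in container.packages]
print(kinds)
```

Because the container only inspects the type label, each package's semantics stay with the community that defined it, and interoperability questions are confined to the package level.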

Modularization Allows Distributed Management
Communities of expertise (not software vendors) are responsible for:
  – Semantics
  – Registration
  – Administration
  – Access management
  – Authority of data
  – Sharing and Distribution

Modularization Implementation Issues
Data encoding
Semantic interaction of overlapping sets
  – between semantically-related packages
  – between semantically distinct packages
Type registry

Dublin Core Metadata Initiative
A simple set of properties to support resource discovery on the web (fuzzy search buckets)?
A cross-domain switchboard for interoperable metadata?
An extensible ontology for resource description?

The fifteen Dublin Core Elements

A Pidgin for Digital Tourists
Metadata is language
Dublin Core is a small and simple language -- a pidgin -- for finding resources across domains.
Speakers of different languages naturally "pidginize" to communicate
  – E.g., tourists using simple phrases to order beer ("zwei Bier bitte" "dva pivo" "biru o san bai"...)
We are all "tourists" on the global Internet.

A Grammar of Dublin Core
r.html
By design not as subtle as mother tongues, but easy to learn and extremely useful in practice
Pidgins: small vocabularies (Dublin Core: fifteen special nouns and lots of optional adjectives)
Simple grammars: sentences (statements) follow a simple fixed pattern...

Example Dublin Core statements
Resource has Title 'Grammar of Dublin Core'.
Resource has Creator 'Tom Baker'.
Resource has Subject 'Metadata'.
Resource has Relation

Resource          | has           | property             | X
implied subject   | implied verb  | one of 15 properties | property value (an appropriate literal)
Example properties: DC:Creator, DC:Title, DC:Subject, DC:Date...
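The fixed statement pattern (implied subject, implied verb, one of fifteen properties, a literal value) is simple enough to sketch directly. The helper name `statement` and the choice to raise `ValueError` for unknown properties are assumptions made for this illustration. The fifteen names are those of the Dublin Core element set.

```python
# The fifteen Dublin Core elements (the "special nouns" of the pidgin).
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def statement(prop, value):
    """Render one statement in the pattern: Resource has <property> '<value>'.

    The subject ("Resource") and verb ("has") are implied by the grammar, so
    the caller supplies only the property and its literal value.
    """
    if prop not in DC_ELEMENTS:
        raise ValueError(f"{prop!r} is not one of the 15 Dublin Core elements")
    return f"Resource has {prop} '{value}'."


record = [
    ("Title", "Grammar of Dublin Core"),
    ("Creator", "Tom Baker"),
    ("Subject", "Metadata"),
]
lines = [statement(prop, value) for prop, value in record]
for line in lines:
    print(line)
```

Qualifiers (the "optional adjectives") are left out here. A fuller sketch would let each property carry a refinement or encoding scheme while keeping the same sentence shape.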