Metadata: practice and practice. Lorcan Dempsey, VP Research and Chief Strategist. CLIR/DLF. Managing Digital Assets: A Primer for Library and Information Technology Administrators. Charleston, SC, February 4-6, 2005.
Overview Part 1 Part 2 Part 3 Part 4
Some themes. Consolidation: fragmentation gets in the way. Industrialization: much of our metadata creation is a cottage industry; current approaches will not scale. Cost: value. Intellectual and machine: need to work harder to programmatically create metadata. Institutions and service: move from projects to service.
Example: Metasearch/portal Metadata is everywhere ;-)
A ‘portal’ turned inside out … Layers: Common services, Content services, Application services, Presentation services. User: “I need a few references.”
Authentication
Directory: user profile
Query broker
Directory: service/collection description
Content: results list
User: “I’d like to get this book.” Request broker
Directory: ILL policy
Directory: service/collection description
Content: circ/ILL system
User: “I need this article too.” Request broker
OpenURL resolver
Directory: local knowledge base
Nearly there … Directory: service/collection description
Content: article
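The article request in this walkthrough passes through an OpenURL resolver. As a rough sketch only (not taken from the talk), an OpenURL can be thought of as citation metadata packed into a query string and sent to the user's local resolver. The resolver address and the citation are invented for illustration; the key names follow the common OpenURL 0.1 style.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address of a library's link resolver (assumption for illustration).
RESOLVER_BASE = "https://resolver.example.org/openurl"


def build_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Pack citation metadata into a query string addressed to a resolver."""
    # Drop empty fields so the resolver only sees what is actually known.
    known = {key: value for key, value in citation.items() if value}
    return f"{base}?{urlencode(known)}"


def parse_openurl(url: str) -> dict:
    """Recover the citation metadata as the resolver would receive it."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# Invented citation; keys follow OpenURL 0.1-style naming.
article = {
    "genre": "article",
    "issn": "1234-5678",
    "volume": "12",
    "issue": "3",
    "spage": "45",
    "date": "2004",
    "atitle": "An example article",
    "aulast": "",  # unknown, so it is left out of the URL
}

url = build_openurl(article)
received = parse_openurl(url)
```

The resolver would then consult its local knowledge base to decide which copy of the article this user is entitled to, which is the step the next slides describe.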
Metadata for multiple entities is required to support operations. This picture could be extended in multiple ways. Components shown: Authentication; Directory: user profile; Directory: service/collection description; Directory: ILL policy; Directory: local knowledge base; Query broker; Request broker; OpenURL resolver; Reference db; Circ/ILL system; Article db. Layers: Common services, Content services, Application services, Presentation services.
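To make the point about multiple entities concrete, here is a minimal sketch of one step: a query broker consulting a user profile and a directory of collection descriptions before it searches anything. The class names, fields, and directory entries are all hypothetical; the talk does not specify any data model.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    """A hypothetical service/collection description."""
    name: str
    subjects: set
    requires_auth: bool


@dataclass
class UserProfile:
    """A hypothetical user profile entry."""
    user_id: str
    authenticated: bool
    interests: set


# Invented directory of service/collection descriptions.
COLLECTION_DIRECTORY = [
    Collection("Reference db", {"general"}, requires_auth=False),
    Collection("Article db", {"chemistry", "physics"}, requires_auth=True),
    Collection("Local catalogue", {"general", "history"}, requires_auth=False),
]


def select_targets(user: UserProfile, directory: list) -> list:
    """Query broker step: choose which collections to search for this user."""
    targets = []
    for collection in directory:
        if collection.requires_auth and not user.authenticated:
            continue  # access metadata rules this collection out
        # General collections are always relevant; others must match interests.
        if collection.subjects & (user.interests | {"general"}):
            targets.append(collection.name)
    return targets


guest = UserProfile("guest", authenticated=False, interests={"physics"})
member = UserProfile("u123", authenticated=True, interests={"physics"})

guest_targets = select_targets(guest, COLLECTION_DIRECTORY)
member_targets = select_targets(member, COLLECTION_DIRECTORY)
```

Even this small step touches metadata about three kinds of entity (users, collections, access policy) before a single bibliographic record is retrieved.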
Metadata as intelligence … Know what resources are available; know how to play a resource; know the provenance of a resource; know what use policy governs a resource; know how to ingest a resource; know how to interact with a resource; know how to compose/decompose resources …
Metadata? … allows people and machines to work smarter …
Metadata? Schematized … … statements about … … resources
Something about Resources
Resources: everything that moves Multiple types of information Objects Collections Services People Organizations Places Terms Formats Rights Business terms License … and will support multiple operations Discovery to delivery Digital asset management Publishing interfaces: intersections between user information spaces and library information spaces
Different classes of metadata are increasingly a part of complex object models: Descriptive, Structural, Technical, Administrative, Rights, Preservation, Tracking, Provenance. Examples: SCORM/Content package, METS, MPEG 21, …
Community? EAD, MARC AMC, …; MARC, MODS, DC, RSLP, …; Onix, …; XML, RDF, OWL, …; CSDGM, DDI, NBII, IVOA, …; EGMS, AGLS, GILS, …; GEM, DC-ED, IEEE-LOM, SCORM, …; MPEG, JPEG, TIAA-CREF…
So … More than discovery … More than information objects … More than library …
Something about schematized
Simple descriptive metadata!! Application profile. Information model: FRBR, INDECS, CIDOC, … ‘Element set’: MARC21, DC, VRA Core, MODS, Onix, … Values/content: Cataloging rules, Controlled vocabs., … Encoding: XML, ISO2709, …
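The layers on this slide can be kept apart in code. The following is a minimal sketch, under stated assumptions, in which the element set, the values, and the encoding are handled by separate pieces: a small subset of Dublin Core element names, a short list of terms from the DCMI Type Vocabulary, and an XML serialization. The `record` wrapper element and the sample records are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Element set: which statements may be made (a small subset of Dublin Core).
ELEMENT_SET = {"title", "creator", "type", "date"}

# Values/content: a few terms from the DCMI Type Vocabulary for dc:type.
VOCABULARIES = {"type": {"Text", "Image", "Sound"}}


def validate(record: list) -> list:
    """Check a record against the element set and the value vocabularies."""
    problems = []
    for element, value in record:
        if element not in ELEMENT_SET:
            problems.append(f"unknown element: {element}")
        elif element in VOCABULARIES and value not in VOCABULARIES[element]:
            problems.append(f"value not in vocabulary: {element}={value}")
    return problems


def encode_xml(record: list) -> str:
    """Encoding: serialize the same statements as XML."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # wrapper element name is an assumption
    for element, value in record:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample records.
good_record = [("title", "An example title"), ("creator", "Example, A."), ("type", "Text")]
bad_record = [("title", "Untitled"), ("type", "Book"), ("colour", "red")]

good_problems = validate(good_record)
bad_problems = validate(bad_record)
xml_text = encode_xml(good_record)
```

Swapping `encode_xml` for another serializer would change the encoding without touching the element set or the vocabulary, which is the separation the slide draws.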
OAI
OAI-based mediation. Diagram: OAI Server#1, OAI Server#2 and OAI Server#3 are harvested by an OAI Harvester into a merged resource, which is viewed through a Web Browser.
OAI. A way of ‘publishing’ processable metadata on the network. A way of synchronizing databases. And … the same for resources themselves? A nice building block for other services.
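A minimal sketch of the harvesting pattern, assuming OAI-PMH with the `oai_dc` metadata format: build a `ListRecords` request (optionally with `from` for incremental synchronization), parse the responses, and merge records from several servers into one store. No network access is made here; the server addresses, identifiers, and sample responses are invented, and a real harvester would also need to handle resumption tokens, deletions, and errors.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, from_date: str = None) -> str:
    """Build a ListRecords request; 'from' limits it to recent changes."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text: str):
    """Yield identifier, datestamp and title for each harvested record."""
    root = ET.fromstring(xml_text)
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        yield {
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
        }


def merge(store: dict, records) -> dict:
    """Keep the newest version of each record, keyed by identifier."""
    for record in records:
        current = store.get(record["identifier"])
        if current is None or record["datestamp"] > current["datestamp"]:
            store[record["identifier"]] = record
    return store


def sample_response(identifier: str, datestamp: str, title: str) -> str:
    """Invented stand-in for what a server might return."""
    return f"""<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>{identifier}</identifier>
        <datestamp>{datestamp}</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


responses = [
    sample_response("oai:a.example.org:1", "2005-01-10", "First title"),
    sample_response("oai:b.example.org:7", "2005-01-12", "Another title"),
    sample_response("oai:a.example.org:1", "2005-02-01", "Revised title"),
]

merged = {}
for response in responses:
    merge(merged, parse_records(response))

request_url = list_records_url("https://a.example.org/oai", "2005-01-01")
```

Because the merge is keyed on identifier and datestamp, re-harvesting with a later `from` date only brings across what has changed, which is the database-synchronization use mentioned on the slide.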
An example. The following pages show some experimental services where OAI is used to ‘publish’ metadata. There is a wiki interface to metadata stores managed under OAI.
Front page to Jeff Young’s Wiki registries and services (OCLC internal)
(1 of 3) Wiki editing view of a METS record: the user can input or modify metadata and references. (Future enhancement planned: drop-downs for schema; we’ll keep instances of the schema in an OAI file, and build a registry of schemas.)
3 of 3: The user completes the process, and the METS record is updated automatically.
Interoperability Recombinant potential Economic and service issues Cost
Interoperability is a factor at all these levels: Encoding, Element set, Content/values. Examples: Z39.2/MARC/AACR, DC, OAI.
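One way to see the levels separately is a crosswalk, which reconciles element sets while leaving values untouched. The sketch below maps a few MARC field/subfield pairs to Dublin Core elements, following the commonly published MARC-to-DC correspondences; the mapping is a deliberately small subset and the sample record is invented.

```python
# Illustrative subset of a MARC-to-Dublin Core crosswalk: (tag, subfield) -> element.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def marc_to_dc(fields: list) -> dict:
    """Map MARC-style (tag, subfield, value) triples to Dublin Core elements."""
    dc = {}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element is None:
            continue  # no agreed mapping, so the statement is dropped
        dc.setdefault(element, []).append(value)
    return dc


# Invented record for illustration.
marc_fields = [
    ("100", "a", "Example, Anna."),
    ("245", "a", "An example title /"),
    ("260", "b", "Example Press,"),
    ("260", "c", "2004."),
    ("650", "a", "Metadata."),
    ("650", "a", "Libraries."),
    ("999", "a", "local note"),
]

dc_record = marc_to_dc(marc_fields)
```

The values come through exactly as entered, including the cataloging punctuation, so agreement at the element-set level does not by itself deliver agreement at the content/values level.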
Importance of agreements: DC profile, Vocabularies.
This gives a context for discussing … Traditional library practice: strive for consistency at all three levels in the ISO 2709/MARC/AACR model; institutionalised in standards, OCLC/RLG/LC, committees, … Dublin Core: consistency of element set; a small number of encodings; content/values subject to separate agreement. OAI: a transport for resources; no control over the transported resources. So …
Something about collections
Collections grid. Axes: Stewardship (high to low) and Uniqueness (low to high). Groups shown: Books, Journals, Newspapers, Gov. docs, CD, DVD, Maps, Scores; Freely-accessible web resources; Special collections, Archives, Rare books, Local history materials, Archives & Manuscripts, Theses & dissertations; Research and learning materials: ePrints/tech reports, Learning objects, Courseware, E-portfolios, Research data, Untransferred records.
Collections grid: disclosure. Axes run high to low and low to high, as in the collections grid. Labels shown: Publishing, Amazoogle, D2D; Reformatting; E-learning, E-research; Cultural heritage; Digital asset management.
Some observations. Grid axes: Stewardship (high to low) and Uniqueness. Quadrants: Bought materials; Licensed materials; Special collections/archives; Research and learning materials. For each, Metadata: Cost/value; Routine?; Consolidated?
Making data work Reading in the dark
Some thoughts … Fragmentation: reduces gravitational pull; increases cost. Consolidation: services; processing; mobilize collective capacity. Routinization/industrialization: programmatic extraction of metadata from digital resources. Agreement. Plural disclosure: want to make stuff available in lots of ways.
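As a small illustration of programmatic extraction, the sketch below pulls a title and any named `meta` statements out of an HTML page using only the Python standard library. It is a toy under stated assumptions: the sample page is invented, and real extraction from digital resources would have to cope with many more formats and with missing or unreliable embedded metadata.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name") and attributes.get("content"):
            # Lower-case the name so 'DC.creator' and 'dc.creator' collapse together.
            self.metadata[attributes["name"].lower()] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract(html_text: str) -> dict:
    """Return whatever embedded metadata the page declares about itself."""
    parser = MetadataExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Invented sample page.
page = (
    "<html><head><title>An example report</title>"
    '<meta name="DC.creator" content="Example, A.">'
    '<meta name="keywords" content="metadata, harvesting">'
    "</head><body><p>Body text.</p></body></html>"
)

extracted = extract(page)
```

Run over a whole repository, a routine like this produces records without a cataloger touching each item, which is the kind of industrialization the slide argues for.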
Thank you! Lorcan Dempsey DempseyL@oclc.org