Slavic Digital Text Workshop 2006
The Open Archives Initiative Protocol for Metadata Harvesting: an Opportunity for Sharing Content in a Distributed Environment
Muriel Foulonneau, Grainger Engineering Library, University of Illinois at Urbana-Champaign (UIUC), June 2006
June 15th, 2006, University of Illinois at UC
Outline
- Improving resource discoverability: hidden Web, portals and distributed digital libraries
- Interoperability: metadata and protocols
- The Open Archives Initiative Protocol for Metadata Harvesting: the protocol, examples of services and repositories
- Issues for digital libraries of distributed objects
Improving resource discoverability
Sharing content
- New services, new representations of the content, new audiences
- Bring your content to the attention of new users outside your immediate community
- 37% of visits to images of the State Library of New South Wales came from the PictureAustralia portal in 2002/3
Integrated Access to CIC Metadata
Thematic access to resources
Russian Publics collection at UIUC
On the CIC metadata portal
Search on Google
Multiple services use different features (figure labels: full text; metadata; collection description; metadata and resources)
Interoperability
Content and services
- Building services => new services need content with similar features
- (figure labels: collection, service)
What is interoperability?
- Interoperability is the capacity for different systems to talk to each other
- I need: a standard language; an interpreter (“ ” - this is a month - 01 = “Jan”)
Various types of interoperability
- Technical: protocols, hardware, … (Mac/PC, Netscape/IE, …)
- Organizational: Who is in charge? Competence? Politics? Update? Rules
- Content-related = metadata: What do you talk about?
  - The “item” = granularity and nature of the object
  - Semantic: date… Created? Published?
  - Syntactical: 04 January 2004
  - Linguistic: 04 Enero 2004
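The syntactical and linguistic cases above can be sketched in a few lines of Python. This is only an illustration, assuming dates always arrive as "day month-name year"; the `MONTHS` table and the `normalize` function are hypothetical names, and only English and Spanish month names are covered.

```python
from datetime import date

# Hypothetical lookup table: the "interpreter" that maps month names
# in two languages onto one shared, numeric representation.
MONTHS = {
    "january": 1, "february": 2, "march": 3, "april": 4,
    "may": 5, "june": 6, "july": 7, "august": 8,
    "september": 9, "october": 10, "november": 11, "december": 12,
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4,
    "mayo": 5, "junio": 6, "julio": 7, "agosto": 8,
    "septiembre": 9, "octubre": 10, "noviembre": 11, "diciembre": 12,
}


def normalize(text):
    """Turn 'DD MonthName YYYY' into an ISO 8601 date string."""
    day, month_name, year = text.split()
    return date(int(year), MONTHS[month_name.lower()], int(day)).isoformat()


# Both spellings end up as the same standard value.
assert normalize("04 January 2004") == normalize("04 Enero 2004") == "2004-01-04"
```

The semantic question (is this the date created or the date published?) is not something such a table can answer; it has to be carried by the metadata element itself.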
Metadata
- Are used to: manage; provide information; retrieve; preserve; define rights and conditions of use; describe structure
- Descriptive, administrative, structural
A metadata format
- Is a set of elements or information, mandatory or not, to apply together in order to reach one of the above-mentioned objectives
- Standard: as a text; as a DTD in SGML; as an XML Schema in XML
- => MARC, EAD, MODS, Dublin Core, LOM, MPEG7, MyHomeCookedSchema …
The Dublin Core Metadata Element Set: 15 elements
- Content: Coverage, Description, Relation, Type, Source, Title, Subject
- Intellectual property: Rights, Contributor, Publisher, Creator
- Instantiation: Language, Identifier, Format, Date
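As a rough sketch of how the 15 elements turn into a record, the Python below builds an unqualified Dublin Core record in the `oai_dc` container. The function name `make_oai_dc` and the field values are made up for illustration; the two namespace URIs are the standard ones for Dublin Core 1.1 and for `oai_dc`.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The 15 element names, in lower case as they appear in XML.
DC_ELEMENTS = {
    "coverage", "description", "relation", "type", "source", "title", "subject",
    "rights", "contributor", "publisher", "creator",
    "language", "identifier", "format", "date",
}

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def make_oai_dc(fields):
    """Build an oai_dc record from {element name: [values]} (hypothetical helper)."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:  # every element is optional and repeatable
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


# Illustrative values only.
record = make_oai_dc({"title": ["Example lithograph"], "date": ["2004-01-04"]})
print(ET.tostring(record, encoding="unicode"))
```

Rejecting unknown names is a design choice of this sketch; it simply makes the "closed set of 15" visible.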
Where metadata lie
- “Internal”: webpage, embedded, TEI, EAD
- External: catalogs, XML records, …; includes a link to the resource => third-party metadata
- Example: Library of Congress home page; The Library of Congress
Sharing metadata: federated search
- My user wants “mills”… wherever they come from
- (figure labels: Federated search; Mill?; My resource 04)
- E.g. Z39.50, SRU/SRW, WAIS
Sharing metadata: data aggregation
- The portal gathers metadata (and resources?)
- (figure labels: Mill?; My resource 04)
- E.g. search engines, union catalogs, OAI
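The difference between the two sharing models can be shown with a toy Python sketch. Everything here is hypothetical: the repositories are in-memory lists standing in for remote systems, and the function names are invented for the illustration.

```python
# Hypothetical repositories: name -> list of (identifier, title) records.
REPOSITORIES = {
    "repo_a": [("a:1", "Water mills of Illinois"), ("a:2", "Russian periodicals")],
    "repo_b": [("b:1", "Windmills in print")],
}


def search_records(records, term):
    """Case-insensitive substring match on the title."""
    term = term.lower()
    return [r for r in records if term in r[1].lower()]


def federated_search(term, repositories):
    """Federated search: query every repository at search time, merge the answers."""
    hits = []
    for records in repositories.values():
        hits.extend(search_records(records, term))  # stands in for a remote query
    return hits


def aggregate(repositories):
    """Aggregation: copy all metadata into one local index ahead of time."""
    index = []
    for records in repositories.values():
        index.extend(records)
    return index


index = aggregate(REPOSITORIES)  # done once, e.g. by a harvester
# Same hits either way; what differs is when and where the work happens.
assert federated_search("mill", REPOSITORIES) == search_records(index, "mill")
```

In the federated model each search depends on every remote system answering; in the aggregated model the search runs locally but the index is only as fresh as the last gathering.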
OAI divides the world between data providers and service providers
The OAI framework (diagram labels: Service provider, Harvester; Data provider, Repository; Aggregator)
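What a harvester does against a repository can be sketched as a loop that follows resumption tokens until the list is complete. This is a minimal illustration, not a full harvester: `harvest_identifiers`, `fake_fetch`, the base URL and the two canned pages are all hypothetical, and `fetch` is passed in so the sketch runs without a network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def harvest_identifiers(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield every identifier a repository lists, page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(f".//{OAI}resumptionToken")
        if not token:  # missing or empty token: the list is complete
            return
        # A resumption token replaces all other arguments except the verb.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


# Two canned response pages standing in for a real data provider.
PAGE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
    "<header><identifier>{ident}</identifier></header>"
    "<resumptionToken>{token}</resumptionToken>"
    "</ListIdentifiers></OAI-PMH>"
)


def fake_fetch(url):
    if "resumptionToken=page2" in url:
        return PAGE.format(ident="oai:example.org:2", token="")
    return PAGE.format(ident="oai:example.org:1", token="page2")


print(list(harvest_identifiers("http://example.org/oai", fake_fetch)))
```

An aggregator is simply both at once: it harvests like a service provider and re-exposes what it gathered like a data provider.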
OAI repositories can be organized in sets
Multiple representations of an object: Honoré Daumier lithograph (Brandeis University) as a MARC record in XML, a Dublin Core record in XML, a Qualified Dublin Core record, and a MODS record
OAI is based on standards: HTTP protocol, XML, XML Schemas, Dublin Core
OAI supports 6 verbs: Identify, ListSets, ListRecords, ListMetadataFormats, ListIdentifiers, GetRecord
Example request fragments: … dc; … dc; … ai_dc; … hotos.grainger.uiuc.edu:AP-1A &metadataPrefix=oai_dc
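Each verb is just an HTTP request with a `verb` argument, so request URLs are easy to assemble. The sketch below is illustrative only: `BASE_URL`, the identifier and the helper `oai_request` are hypothetical, while the verb names and their required arguments follow the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/oai"  # hypothetical repository base URL

# Arguments each verb requires when no resumptionToken is used.
REQUIRED = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}


def oai_request(verb, **args):
    """Return the request URL for one OAI-PMH verb (hypothetical helper)."""
    if verb not in REQUIRED:
        raise ValueError(f"unknown verb: {verb}")
    missing = REQUIRED[verb] - set(args)
    if missing and "resumptionToken" not in args:
        raise ValueError(f"{verb} needs: {sorted(missing)}")
    return BASE_URL + "?" + urlencode({"verb": verb, **args})


print(oai_request("Identify"))
# Selective harvesting of one set (the set name is illustrative):
print(oai_request("ListRecords", metadataPrefix="oai_dc", set="emblems"))
print(oai_request("GetRecord", identifier="oai:example.org:item-1",
                  metadataPrefix="oai_dc"))
```

Note that `urlencode` percent-encodes the colons in the identifier, which is what the protocol expects for special characters in URLs.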
An OAI response (excerpt): oai:images.library.uiuc.edu:emblems/ … emblems … Müller, Johann Heinrich Traugott, …
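To show how a service provider might read such a response, here is a small Python sketch that pulls the header and the Dublin Core fields out of a GetRecord reply. The `SAMPLE` document is invented for the illustration (its identifier, title and creator are placeholders, not the record shown on the slide), and `parse_record` is a hypothetical helper; the namespaces are the standard OAI-PMH 2.0, `oai_dc` and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented response, trimmed to the parts this sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
        <datestamp>2006-06-15</datestamp>
        <setSpec>emblems</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example emblem book</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_record(xml_text):
    """Extract identifier, sets and Dublin Core fields from one record."""
    record = ET.fromstring(xml_text).find(".//oai:record", NS)
    header = record.find("oai:header", NS)
    fields = {}
    for element in record.find("oai:metadata/oai_dc:dc", NS):
        name = element.tag.split("}")[1]  # strip the namespace part
        fields.setdefault(name, []).append(element.text)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "sets": [s.text for s in header.findall("oai:setSpec", NS)],
        "fields": fields,
    }


print(parse_record(SAMPLE))
```

Values are kept as lists because every Dublin Core element may repeat within a record.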
Examples of repositories: Library of Congress; ContentDM at UIUC (… bin/oai.exe); Ohio State Knowledge Bank
Examples of services
Turnkey systems and modules: CWIS, ContentDM, Digitool, DSpace, EPrints, DLXS, OAICat, XMLFile, DLESE OAI software
Useful tools: UIUC OAI registry; OAI repository explorer; Errol
Digital libraries of distributed objects
Metadata shareability issues
- Granularity
- Loss of context
- Completeness
- DLF-NSDL best practices on shareable metadata
What is behind URLs
Conveying actionable URLs: View, Resize, Select, Annotate, Share
Conclusions
- Interoperability is technical, content-related and organizational; OAI is the easy part
- Works even better for particular communities with similar organizational structures and metadata formats
- Extensions of the protocol for: objects; actionable URLs
References and useful material
- The Open Archives Website
- DLF/NSDL best practices for OAI and shareable metadata
- OAForum Tutorial
- Getting a Leg Up on OAI (… Science_Digital_Library_Conference.doc)