Collection/Item Metadata Relationships Allen H. Renear, Karen M. Wickett, Richard J. Urban, David Dubin, Sarah L. Shreeves Center for Informatics Research.

Slides:



Advertisements
Similar presentations
OAForum – September 2003 Muriel Foulonneau Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector Muriel.
Advertisements

Collection-level description & collection management: tool for the trade or information trade-off? Collection Description Focus Workshop 4 Newcastle, 8.
Optimising metadata workflows in a distributed information environment R. John Robertson & Jane Barton Centre for Digital Library Research University of.
An overview of collection-level metadata Applications of Metadata BCS Electronic Publishing Specialist Group, Ismaili Centre, London, 29 May 2002 Pete.
UKOLN is supported by: JISC Information Environment update Repositories and Preservation Programme meeting, October 24-25, 2006 Rachel Heery UKOLN
Can We Talk? MICHAEL Conference London May 23, 2008Joyce Ray.
Collection-level description & the Information Landscape: users evaluate strategies for resource discovery Collection Description Focus Workshop 5 Cambridge,
Collection description & Collection Description Focus JISC/DNER Moving Image & Sound Cluster Steering Group meeting, HEFCE Office, London, 24 September.
UKOLN is supported by: Bridget Robinson and Ann Chapman From analytical model to implementation and beyond CD Focus Schema Forum, CBI Conference Centre.
Towards consensus on collection-level description Collection Description Focus Briefing Day 1 British Library, St Pancras, London 22 October 2001 Bridget.
An introduction to collections and collection-level description Collection-Level Description & NOF-digitise projects NOF-digitise programme seminar, London,
Libraries in the New Research Environment Joyce Ray NAS/BRDI Symposium Associate Deputy for Libraries June 3, 2010.
Alexandria Digital Library Project The ADEPT Bucket Framework.
Investing in Collection Representation for More Useful Repositories Carole L. Palmer Center for Informatics Research in Science and Scholarship (CIRSS)
NSF – DLF – JISC/UKOLN Digital Library Service Registry Workshop National Science Foundation, Arlington, VA March 2006 The University of Illinois.
The NSDL Registry Diane Hillmann  Jon Phipps. What We’re Doing Received an NSF grant in Oct. 2006, to: Register metadata schemas, vocabularies, application.
NSDL 2 nd Generation Mathematics Digital Library ASEE Annual Meeting June 13, 2005 Portland, OR William H. Mischo
The Open Archives Initiative Simeon Warner (Cornell University) Symposium on “Scholarly Publishing and Archiving on the Web”, University.
Usability Evaluation of Digital Libraries Stacey Greenaway Submitted to University of Wolverhampton module Dec 15 th 2006.
Assessing Descriptive Substance in Free-Text Collection-Level Metadata Oksana L. Zavalina, Carole L. Palmer, Amy S. Jackson, Myung-Ja Han Center for Informatics.
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Teaching Metadata and Networked Information Organization & Retrieval The UNT SLIS Experience William E. Moen School of Library and Information Sciences.
1 The NSDL: A Case Study in Interoperability William Y. Arms Cornell University.
IMLS NLG Collection Registry & Item-Level Metadata Repository at the University of Illinois Timothy W. Cole Mathematics Librarian &
Final Search Terms: Archiving (digital or data) Authentication (data) Conservation (digital or data) Curation (digital or data) Cyberinfrastructure Data.
Creating rich shareable metadata: The DLF Aquifer MODS implementation guidelines Sarah L. Shreeves University of Illinois at Urbana-Champaign ALA Annual.
Data Curation Education and Biological Information Specialists DigCCurr 2007 Chapel Hill, April 20, 2007 P. Bryan Heidorn, Carole L. Palmer, Melissa H.
Multiple Interpretations: Implications of FRBR as a Boundary Object Ingbert Floyd Graduate School of Library and Information Science,
The role of Parthenos for CLARIN ERIC Steven Krauwer CLARIN ERIC Executive Director 1.
Sustaining Collection Value: Managing Collection/Item Metadata Relationships Allen H. Renear, Richard Urban, Karen Wickett, Carole L. Palmer, David Dubin.
This IMLS-funded project builds on the success of a program already in place at GSLIS, the Data Curation Education Program (DCEP), a concentration within.
Enhancing Content Visibility in Institutional Repositories: Maintaining Metadata Consistency Across Digital Collections Ahmet Meti Tmava and Daniel Gelaw.
Semantics and Syntax of Dublin Core Usage in Open Archives Initiative Data Providers of Cultural Heritage Materials Arwen Hutt, University of Tennessee.
19/10/20151 Semantic WEB Scientific Data Integration Vladimir Serebryakov Computing Centre of the Russian Academy of Science Proposal: SkTech.RC/IT/Madnick.
Metadata in a distributed information environment: Interoperability as recombinant potential Lorcan Dempsey OCLC/SCURL pre-IFLA conference, 15/16 Aug 02.
Common challenges, common issues Lorcan Dempsey School for scanning The Hague, 16 October 2002.
Collection Description Metadata Element Sets Pete Johnston UKOLN, University of Bath Chair, DC CD WG NISO Metasearch Initiative, Task Group 2 Durham, NC,
Metadata Semantic Web Solutions: Web 2.0 and beyond ______________________________ Stephanie Beene, 10/14/2008.
CONTENT DISCOVERY, SERVICES, AND SUSTAINED ACCESS Timothy Cole, William Mischo, Beth Sandore, Sarah Shreeves ~ University of Illinois Library
Application Profiles: interoperable friend or foe? Rachel Heery Michael Day TEL Milestone Conference Frankfurt am Main, April 2002.
Slavic Digital Text Workshop 2006 The Open Archives Initiative Protocol for Metadata Harvesting: an Opportunity for Sharing Content in a Distributed Environment.
JISC Information Environment Service Registry (IESR) Ann Apps MIMAS, The University of Manchester, UK.
OAI Overview DLESE OAI Workshop April 29-30, 2002 John Weatherley
Integrating Access to Digital Content Sarah Shreeves University of Illinois at Urbana-Champaign Visual Resources Association 23 rd Annual Conference Miami.
1 The NSDL Program Stephen Griffin National Science Foundation.
Search Interoperability, OAI, and Metadata Sarah Shreeves University of Illinois at Urbana-Champaign Basics and Beyond Grainger Engineering Library April.
11 October Shanghai Library © 2004 DCMI 1 The Dublin Core Metadata Initiative: DC-2004 Introduction Makx Dekkers, Managing Director, Dublin.
IMLS DCC Project Briefing ( ) Jenny Benevento ( ) Timothy W. Cole.
A Resource Discovery Service for the Library of Texas Requirements, Architecture, and Interoperability Testing William E. Moen, Ph.D. Principal Investigator.
Tracking Metadata Use for Digital Collections Ellen Knutson, Carole L. Palmer, Michael Twidale Graduate.
Research Data Access and Preservation Summit 2012 E-Research Roundtable Center for Informatics Research in Science and Scholarship Graduate School of Library.
Distributed Service Registry Workshop, Warwick, U.K. 1 Distributed Functionality in the UIUC OAI Registry
Carl Lagoze Digital Library Service Registry Workshop Services in a Scholarly Communication Framework.
Surveying the landscape: collection-level description & resource discovery JISC/NSF DLI Projects meeting, Edinburgh, 24 June 2002 Pete Johnston UKOLN,
Collection-level description: from theory to practice Minerva project meeting Paris, 24 January 2003 Pete Johnston UKOLN, University of Bath Bath, BA2.
Santi Thompson - Metadata Coordinator Annie Wu - Head, Metadata and Bibliographic Services 2013 TCDL Conference Austin, TX.
DLF Fall Forum The Distributed Library: OAI for Digital Library Aggregation UIUC’s Role: Registry of OAI Data Providers
Describing resources II: Dublin Core CERN-UNESCO School on Digital Libraries Rabat, Nov 22-26, 2010 Annette Holtkamp CERN.
UKOLN is supported by: Library futures in the new research landscape. Dr Liz Lyon, UKOLN, University of Bath, UK CURL Members Meeting October 2004, London.
Metadata Schema Registries: background and context MEG Registry Workshop, Bath, 21 January 2003 Rachel Heery UKOLN, University of Bath Bath, BA2 7AY UKOLN.
Timothy W. Cole Jenny Benevento Muriel Foulonneau
Utility of an OAI Service Provider Search Portal
Metadata for Special Collections
Integrating Access for Information Discovery and More
Research on Data Curation and Repositories
NSDL Data Repository (NDR)
Session 2: Metadata and Catalogues
Metadata in Digital Preservation: Setting the Scene
Metadata Principles Matthew Mayernik air • planet • people
Presentation transcript:

Collection/Item Metadata Relationships Allen H. Renear, Karen M. Wickett, Richard J. Urban, David Dubin, Sarah L. Shreeves Center for Informatics Research in Science and Scholarship (CIRSS) Graduate School of Library and Information Science University of Illinois at Urbana-Champaign DC 2008: International Conference on Dublin Core and Metadata Applications September 24, 2008; Berlin, Germany

Why collection-level metadata is important Collections are designed to support research and scholarship. Toward this end collection descriptions indicate such things as: –purpose –subject –method of selection –spatial/temporal coverage –completeness –representativeness –summary statistical features …etc. These descriptions enable collections to function as more than simply aggregates of items, –as intended by their creators and curators –as required by their users

But unfortunately…. Collection-level metadata is poorly understood and accommodated Most retrieval systems flatten the world, ignoring collection context Retrieval systems that do use metadata use only item-level metadata Even simple discovery is impeded: If the owner of a collection is indicated only at the collection-level, then retrieval accessing only item-level metadata… — cannot usefully process queries constrained by owner — cannot display the owner of item in the result set

Origins of our focus on this problem: DCC IMLS Digital Collections and Content University of Illinois at Urbana-Champaign Grainger Library & Graduate School of Library and Information Science Funded by IMLS, Timothy Cole, Principal Investigator Carole L. Palmer, Sarah L. Shreeves, Michael B. Twidale, Co-Investigators Deliverables… a collection metadata schema Based on RSLP CD and concurrent work on DC Collection Application Profile. a collection-level metadata registry for 202 IMLS digital collections. an item-level metadata repository 76 collections harvested using OAI-PMH. an experimental portal for searching aggregated metadata. [

Among the research findings: Users need collection-level information, for discovery and understanding (Palmer & Knutson, 2004; Foulonneau et al. 2005; Palmer, et al. 2006) But what information? And how to provide it? So we included this problem in our next IMLS proposal… Climax Miners, Leadville, CO. Courtesy Colorado School of Mines

The new project In 2007 the DCC received a new three year IMLS grant Carole L. Palmer, Principal Investigator Timothy Cole, Allen H. Renear, Michael B. Twidale, Co-Investigators A major deliverable: show how a formal description of collection/item metadata relationships can help registry users locate and use digital items across multiple collections. CIMR: Collection/Item Metadata Relationships Three phases: 1)Develop a logic-based framework of collection/item metadata relationships and inference rules. 2)Conduct empirical studies to see if the framework matches the behavior of metadata specification designers, metadata creators, and registry users. 3)Implement pilot applications to support searching, browsing, and navigation; including RDF/OWL formulations and inference rules. Our initial focus is on the Dublin Core Collections Application Profile ( DCCAP ).

Where we are now Phase 1: Develop a logic-based framework of collection/item metadata relationships and inference rules. The next few slides… three simple examples of collection/item metadata relationships

Attribute/Value Propagation: marcrel:OWN Consider the DCCAP metadata element marcrel:OWN… Plausibly: whoever owns a collection owns each of its items We say that metadata attributes with this behavior a/v-propagate. Informal definition an attribute a/v-propagates =df if a collection has some value for the attribute then each item in the collection has the same value for that attribute. Or, in first order logic: An attribute A a/v-propagates =df  x  y  z [(IsGatheredInto(x,y) & A(y,z))  A(x,z) ] [ IsGatheredInto(x,y) is adapted from from the DCMI DCCAP.]

Value Propagation: cld:itemType / dc:type Consider the DCCAP metadata element cld:itemType.* *a refinement, assuming homogeneous collections and no repetition of elements. cld:itemType* does not a/v-propagate… However, if a collection has a value for cld:itemType* then each of its items has the same value for dc:type. We call this v-propagation. Informal definition an attribute v-propagates =df if a collection has some value for the attribute then each item in the collection has that value for some other attribute. Or, in first order logic: An attribute A v-propagates to an attribute B =df  x  y  z [(IsGatheredInto(x,y) & A(y,z))  B(x,z) ]

Value Constraints: cld:dateItemsCreated / dcterms:created cld:dateItemsCreated* does not a/v propagate nor does it v-propagate to dcterms:created However, if a collection has a temporal range for cld:dateItemsCreated*, then its items may not have values for dcterms:created that fall outside that range. this is a constraint: the value of dcterms:created must be temporally-within the range given by cld:dateItemsCreated* Informal Definition an attribute A v-constrains an attribute B with respect to constraint C =df if a collection has the value z for A and an item in the collection has the value w for B, then w is related to z by C. In first order logic: An attribute A v-constrains an attribute B with respect to a constraint C =df  x  y  z  w [(IsGatheredInto(x,y) & A(y,z) & B(x,w))  C(w,z)]

How will the framework help? Metadata specification developers use the framework to classify metadata elements in their specifications. Metadata librarians use these classifications to confirm their understanding of the metadata elements they are assigning. Software architects use these classifications to guide the configuration of inferencing features in retrieval systems.

What is missing? A completed shared framework... a project for the community

Prior work? Of course. Relationships such as those just described have been studied elsewhere — which is a good thing. However as far as we know no one has focused on the IsGatheredInto relationship.

Some research questions how many relationship categories are there? which metadata attributes fall into which categories? when does propagation convert information without loss? what about propagation from items to collections? how expressive a logic is needed for propagation rules? –how much of first order logic? –what extensions to first order logic? (modal, default, …?) –what are the consequences for computational efficiency?

One result: Finishing the job requires modal logic An attribute A a/v-propagates =df I. a) ◇  y  z [Collection(y) & A(y,z)] & b) ◇  x  z [Member(x) & ~A(x,z)] & c) ◇  x  y  z [A(x,z) & ~A(y,z)] & II. ☐  x  y  z [(IsGatheredInto(x,y) & A(y,z) )  A(x,z) ]. See: The Return of the Trivial: Formalizing collection/item metadata relationships. Renear, A.H., Wickett, K.M., Urban, R.J., and Dubin, D. Proceedings of the 8 th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM Press, New York 2008.

Most importantly: Non-Reducible Collection Attributes Some vital collection-level attributes resist conversion to item-level attributes Examples are metadata indicating that a collection -- is complete or incomplete -- is representative (in some respect) -- is heterogeneous with respect to genre or type of object, etc. -- was developed according to some particular method -- was designed for some particular purpose -- has certain summary statistical features …. and so on. These are tightly tied to the distinctive role a collection is intended to play in the support of research and scholarship. If this information is inaccessible, the collection cannot be useful, as a collection, in the way originally intended by its creators.

Questions? We are just getting started and welcome comments and advice. Acknowledgements This research is supported by: The Institute of Museum and Library Services, a federal agency that fosters innovation, leadership, and a lifetime of learning. National Leadership Grant for Research & Demonstration Carole L. Palmer, Principal Investigator Hosted by the: Center for Informatics Research in Science and Scholarship Graduate School of Library & Information Science University of Illinois at Urbana-Champaign Project documentation: We have benefited from many discussions with other DCC/CIMR project members and with participants in the IMLS DCC Metadata Roundtable, including: Thomas Dousa, Myung-Ja Han, Amy Jackson, Mark Newton, Oksana Zavalina, Wu Zheng.

References Arms, W.Y. Dushay, N., Fulker, D. & Lagoze, C. (2003). A case study in metadata harvesting: the NSDL. Library Hi Tech, 21(2), pp. 228 – 237. Brachman, R. J. (1983). What ISA is and isn ’ t: An analysis of taxonomic links in Semantic Networks. IEEE Computer, 16 (10), pp Brachman R. J. et al. (1991). “ Living With Classic: When and how to use a KL-ONE-like language ”, in Principles of Semantic Networks: Explorations in the Representation of Knowledge, ed. John F. Sowa, Morgan Kaufman, pp Brockman, W. et al. (2001). Scholarly Work in the Humanities and the Evolving Information Environment. Washington, DC: Digital Library Federation/Council on Library and Information Resources. Christenson, H. Tennant, R. (2005). Integrating Information Resources: Principles, Technologies, and Approaches. California Digial Library. Currall, J., Moss, M., & Stuart, S What is a collection? Archivaria, 58, Dempsey, L. (2005). From metasearch to distributed information environments. Lorcan Dempsey ’ s Weblog (October 9, 2005). DLF. (2005). The Distributed Library: OAI for Digital Library Aggregation. OAI Scholars Advisory Panel, June 20-21, Washington, DC. Digital Library Federation. DCMI. (2007). Dublin Core Collections Application Profile. Retrieved April 13, 2008, Dushay, N. & Hillmann, D.I. (2003). Analyzing metadata for effective use and re – use. DC – 2003: Proceedings of the International DCMI Metadata Conference and Workshop, [United States]: Dublin Core Metadata Initiative, pp. 161 – 170. Foulonneau, M., Cole, T. W., Habing, T. G., & Shreeves, S. L. (2005). Using collection descriptions to enhance aggregation of harvested item-level metadata. Proceedings of the 5 th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM Press, Gasser, L. & Stvilia, B. (2001). A new framework for information quality. Technical report ISRN UIUCLIS--2001/1+AMAS. Champaign, Ill.: University of Illinois at Urbana Champaign. Guarino, N. & Welty, C. (2004). An overview of OntoClean. S. Staab and R. Studer, eds, The Handbook on Ontologies. Springer. Heaney, M. (2000). An Analytic Model of Collections and Their Catalogues, UK Office for Library and Information Science. Hutt, A. & Riley, J. (2005). Semantics and Syntax of Dublin Core Usage in Open Archives Initiative Data Providers of Cultural Heritage Materials. Proceedings of the 5th ACM/IEEE – CS Joint Conference on Digital Libraries, Denver, Colo. (June 7 – 11 June). New York: ACM Press, pp. 262 – 270. Lagoze, C. et al. (2006). Metadata aggregation and “ automated digital libraries ” : A retrospective on the NSDL experience. Proceedings of the 6 th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM Press, New York. Lalmas, M. (1998). Logical models in information retrieval. Information Processing and Management. 34, 1. Lee, H. (2005). The concept of collection from the user ’ s perspective. Library Quarterly, 75(1), Lee, H. (2000). What is a collection? JASIS, 51 (12), Palmer, C. L. (2004). Thematic research collections. S. Schreibman, R.Siemens, & J. Unsworth (Eds.). Companion to Digital Humanities. Oxford: Blackwell, pp Palmer, C.L., and Knutson, E. (2004) Metadata practices and implications for federated collections. Proceedings of the 67 th ASIS&T Annual Meeting (Providence, RI). Palmer, C.L., Knutson, E., Twidale, M., and Zavalina, O. (2006). Collection definition in federated digital resource development. Proceedings of the 69 th ASIS&T Annual Meeting (Austin, TX, Nov. 3-8, 2006). Renear, A. H., Urban, R., Wickett, K., Palmer, C.L., & Dubin, D. (2008a). Sustaining collection value: Managing collection/item metadata relationships. Proceedings of the Digital Humanities conference, June 2008, Oulu, Finland Renear, A.H., Wickett, K.M., Urban, R.J., and Dubin, D. (2008b). The return of the trivial: Formalizing collection/item metadata relationships. Proceedings of the 8 th ACM/IEEE-CS Joint Conference on Digital Libraries ACM Press, New York. Sebastiani, F. (1998). On the role of logic in information retrieval, Information Processing and Management 34, 1. Shreeves, S., Knutson, E., Stilva, B., et al. (2005). Is ‘ Quality ’ Metadata, ‘ Shareable ’ Metadata? The Implications of local metadata practices for federated collections. In H.A. Thompson (ed.) Proceedings of the Twelfth National Conference of the Association of College and Research Libraries, April , Minneapolis, MN. Chicago, IL: Association of College and Research Libraries. pp Stvilia, B., Gasser, L., Twidale, M., Shreeves, S.L. & Cole, T.W. (2004). Metadata quality for federated collections. Proceedings of ICIQ04 — 9 th International Conference on Information Quality. Cambridge, MA: 111 – 25. van Rijsbergen, C. J. (1986). A non-classical logic for information retrieval. The Computer Journal 29,6. Warner, S., Bekaert, J., Lagoze, C., Lin, X., Payette, S., & Van de Sompel, H. (2007). Pathways: Augmenting interoperability across scholarly repositories. International Journal on Digital Libraries. Wendler, R. (2004). The eye of the beholder: Challenges of image description and access at Harvard. In Hillmann, D. I. and Westbrooks, E. L., eds., Metadata in Practice. American Library Association, Chicago, IL, pp Woods, W. (1975). What ’ s in a link: Foundations for semantic networks. Representation and Understanding, D. Bobrow and A. Collins (eds.), Academic Press.