Digital Preservation Seminar
Using MPEG-21 DIDL to Represent Complex Digital Objects in the Los Alamos National Laboratory Digital Library
Authors: Jeroen Bekaert, Patrick Hochstenbach & Herbert Van de Sompel
Presenter: Rabia Haq
1/14/2019
- Introduction
- Digital Objects in the LANL Repository
- MPEG-21 Digital Item Declaration Language (DIDL)
- Use of MPEG-21 DIDL to represent the LANL Repository
- Conclusion
Digital Objects at LANL
- XML packaging is required that supports
  - Datastreams of various media types
  - Secondary data: metadata supporting
    - discovery
    - digital preservation
    - rights management
  - Persistent identifiers
MPEG-21 DIDL
- A DIDL complex object is a Digital Item Declaration (DID)
- Each received data item is a DID
- All DIDs are wrapped into one large XML file
- Note: data items are enclosed within <didl:Container> tags
DIDL – Data Model
- Container <didl:Container>
- Item <didl:Item>
- Component <didl:Component>
- Resource <didl:Resource>
- Descriptor <didl:Descriptor>
LANL defined a DIDL profile conforming to
- the MPEG-21 DIDL Schema, and
- a self-defined Schematron schema
DIDL – Data Model
Figure 1 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
Descriptors
- Provide flexibility to the Data Model
- Are associated with their parent entity
- Convey secondary information such as
  - Identification information: MPEG-21 Part 3, DII
  - Processing information: MPEG-21 Part 10, DIP
  - Rights information: MPEG-21 Part 5, REL / Part 4, IPMP
DII – Digital Item Identification
- Descriptors used to assign persistent identifiers to all entities: Container, Item, Component, Descriptor
- <dii:Identifier>
- Important, as the DIDL profile is id-centric
Terry Harrison mentioned that a data item is useless without code, or a Processing Item, to access that data item. The PI serves that purpose. Good from a digital preservation point of view.
DII - example
dii:Identifier (Item level)
<didl:Item>
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <dii:Identifier xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS">urn:isbn:0-395-36341-1</dii:Identifier>
    </didl:Statement>
  </didl:Descriptor>
  …
</didl:Item>
Table 2 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
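As a rough illustration of how a consumer might read the Item-level identifier from the DII example, here is a minimal Python sketch using the standard library. The DII namespace URI comes from the example itself. The DIDL namespace URI is not stated in these slides and is an assumption here, as are the names `item_identifiers` and `SAMPLE`.

```python
import xml.etree.ElementTree as ET

# The DII URI is copied from the slide example. The DIDL URI is an
# assumption: the slides never spell it out.
NS = {
    "didl": "urn:mpeg:mpeg21:2002:01-DIDL-NS",
    "dii": "urn:mpeg:mpeg21:2002:01-DII-NS",
}

SAMPLE = """
<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:01-DIDL-NS">
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <dii:Identifier xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS">
        urn:isbn:0-395-36341-1
      </dii:Identifier>
    </didl:Statement>
  </didl:Descriptor>
</didl:Item>
"""


def item_identifiers(item):
    """Return identifiers from the entity's own Descriptors (not nested entities)."""
    path = "didl:Descriptor/didl:Statement/dii:Identifier"
    return [el.text.strip() for el in item.findall(path, NS) if el.text]


item = ET.fromstring(SAMPLE)
print(item_identifiers(item))
```

The path deliberately stops at direct child Descriptors, so identifiers of nested Components or Items are not mistaken for the Item's own identifier.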
DIP – Digital Item Processing
- Provides an architecture for disseminating DIDs
- Introduces a new kind of Item, the Processing Item (PI): <dip:…>
- ObjectType is the link between an entity and a Processing Item: the <dip:ObjectType> value equals the <dip:Argument> of the PI
- An entity can have multiple ObjectTypes
- A PI can bind to more than one entity
DIP - example
Content
<didl:Item>
  …
  <!-- ObjectType of Item -->
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <dip:ObjectType xmlns:dip="urn:mpeg:mpeg21:2002:01-DIP-NS">urn:my:Argument</dip:ObjectType>
    </didl:Statement>
  </didl:Descriptor>
  …
</didl:Item>
Processing Item
<didl:Item>
  …
  <didl:Descriptor>
    <!-- Argument of processing method -->
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <dip:Argument xmlns:dip="urn:mpeg:mpeg21:2002:01-DIP-NS">urn:my:Argument</dip:Argument>
    </didl:Statement>
  </didl:Descriptor>
  <didl:Resource mimeType="…">…Link to processing code…</didl:Resource>
  …
</didl:Item>
Excerpt from Table 3 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
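The binding rule in the DIP example (an entity's `dip:ObjectType` value equals a Processing Item's `dip:Argument` value) could be checked along these lines. This is only a sketch under assumptions: the DIDL namespace URI, the choice to place both Items as siblings in one Container, the `text/plain` Resource, and the helper names are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:01-DIDL-NS",  # assumed, not given in the slides
    "dip": "urn:mpeg:mpeg21:2002:01-DIP-NS",    # from the slide example
}

# Hypothetical layout: content Item and Processing Item as siblings.
SAMPLE = """
<didl:Container xmlns:didl="urn:mpeg:mpeg21:2002:01-DIDL-NS"
                xmlns:dip="urn:mpeg:mpeg21:2002:01-DIP-NS">
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="text/xml; charset=UTF-8">
        <dip:ObjectType>urn:my:Argument</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
  </didl:Item>
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="text/xml; charset=UTF-8">
        <dip:Argument>urn:my:Argument</dip:Argument>
      </didl:Statement>
    </didl:Descriptor>
    <didl:Resource mimeType="text/plain">hypothetical-processing-code</didl:Resource>
  </didl:Item>
</didl:Container>
"""


def _values(item, tag):
    """Collect the text of dip:<tag> elements in the Item's own Descriptors."""
    path = f"didl:Descriptor/didl:Statement/dip:{tag}"
    return {el.text.strip() for el in item.findall(path, NS) if el.text}


def bind_processing_items(container):
    """Return (content_index, processing_index) pairs whose values match."""
    items = container.findall("didl:Item", NS)
    pairs = []
    for i, content in enumerate(items):
        for j, processing in enumerate(items):
            if _values(content, "ObjectType") & _values(processing, "Argument"):
                pairs.append((i, j))
    return pairs


container = ET.fromstring(SAMPLE)
print(bind_processing_items(container))
```

Using sets on both sides reflects the two points from the DIP slide: an entity can carry several ObjectTypes, and one Processing Item can bind to more than one entity.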
REL – Rights Expression Language
- A Descriptor associates rights expressions with DIDs and their contained entities
- MPEG-21 Intellectual Property Management and Protection (IPMP) provides tools to enforce the rights expressions declared by REL
REL/IPMP - example
<didl:Item>
  …
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <r:license xmlns:r="urn:mpeg:mpeg21:2003:01-REL-R-NS">
        <!-- optionally, specific rights can be added here.-->
        <r:otherInfo>
          <dc:rights xmlns:dc="http://purl.org/dc/elements/1.1/">Copyright 2003; American Physical Society</dc:rights>
        </r:otherInfo>
      </r:license>
    </didl:Statement>
  </didl:Descriptor>
  …
</didl:Item>
Table 4 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
MPEG-21 DIDL usage at LANL – DIDL Profile
- LANL DIDs are compliant with both the MPEG-21 and the self-defined schemas
- Types of DIDs in the examples
  - a PDF technical report
  - a MARC record
  - an XML representation of the MARC record
- All three are considered independent data items
- Relationships are established through self-defined Descriptors
DIDL profile DID structure
Figure 2 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
LANL’s usage of Descriptors
- Identifiers: mandatory for all DIDs
- PlaceHolders for Processing Items
- Defining Relationships
- Creation Date
Identifiers <dii:Identifier>
Two types of identifiers
- DID identifier, for entities (Container, Item, Processing Item, …)
  DID identifier: urn:uuid:10ba6842-ec45-3b19-8kub-hy8ff58c58a8b
- Content identifier, for content (technical reports, MARC records, …)
  Technical report identifier: urn:bar:99-6537
Identifiers from Table 5 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
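To make the distinction between the two identifier kinds concrete, the sketch below mints a `urn:uuid:` style DID identifier and tests whether a given identifier has that form. How LANL actually generates its UUIDs is not described in these slides, so the use of random (version 4) UUIDs and the function names are assumptions.

```python
import uuid


def new_did_identifier():
    """Mint a urn:uuid identifier (random UUID; the minting scheme is an assumption)."""
    return uuid.uuid4().urn


def is_uuid_urn(identifier):
    """True if the identifier is a syntactically valid urn:uuid value."""
    prefix = "urn:uuid:"
    if not identifier.startswith(prefix):
        return False
    try:
        uuid.UUID(identifier[len(prefix):])
    except ValueError:
        return False
    return True


print(is_uuid_urn(new_did_identifier()))   # a freshly minted DID identifier
print(is_uuid_urn("urn:bar:99-6537"))      # the content identifier from the slide
```

Note that the sample DID identifier on the slide contains non-hexadecimal characters, so it reads as an illustrative value and would not pass this syntactic check.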
PlaceHolders for Processing Items
- A change in the Processing Item of an entity requires a DIDL update
- Each update results in a new DIDL version
- This is inappropriate for LANL’s static data items
- PlaceHolders for Processing Items: <diph:PlaceHolder>
  - PlaceHolders are replaced with an ObjectType and a Processing Item during dissemination of the DID
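The PlaceHolder idea can be sketched as a substitution applied to a copy of the stored entity at dissemination time, leaving the stored DID untouched. Everything specific here is hypothetical: the `diph` namespace URI (`urn:example:diph`), the placeholder value, the `OBJECT_TYPES` lookup table, and the assumed DIDL namespace URI. The sketch only inserts the `dip:ObjectType`; attaching the Processing Item itself is left out.

```python
import copy
import xml.etree.ElementTree as ET

DIDL = "urn:mpeg:mpeg21:2002:01-DIDL-NS"  # assumed, not given in the slides
DIP = "urn:mpeg:mpeg21:2002:01-DIP-NS"    # from the DIP example
DIPH = "urn:example:diph"                 # hypothetical stand-in namespace

STORED = f"""
<didl:Component xmlns:didl="{DIDL}">
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <diph:PlaceHolder xmlns:diph="{DIPH}">urn:example:placeholder:pdf</diph:PlaceHolder>
    </didl:Statement>
  </didl:Descriptor>
</didl:Component>
"""

# Hypothetical mapping kept outside the stored DIDs, so it can change
# without creating a new DIDL version.
OBJECT_TYPES = {"urn:example:placeholder:pdf": "urn:my:Argument"}


def resolve_placeholders(entity, object_types):
    """Return a copy of the entity with known PlaceHolders replaced by dip:ObjectType."""
    out = copy.deepcopy(entity)
    for statement in list(out.iter(f"{{{DIDL}}}Statement")):
        for placeholder in statement.findall(f"{{{DIPH}}}PlaceHolder"):
            key = (placeholder.text or "").strip()
            if key in object_types:
                statement.remove(placeholder)
                object_type = ET.SubElement(statement, f"{{{DIP}}}ObjectType")
                object_type.text = object_types[key]
    return out


stored = ET.fromstring(STORED)
disseminated = resolve_placeholders(stored, OBJECT_TYPES)
print(ET.tostring(disseminated, encoding="unicode"))
```

Working on a deep copy mirrors the motivation on the slide: the static data item stays unchanged while the processing binding is decided late.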
Relationships
- Descriptor to express relationships between
  - entities (Container, Item, Component, …)
  - entities and resources external to that entity
- A Digital Item Relations XML namespace: <dir:Relations>
- Contains Resource Description Framework (RDF) statements such as “isDerivationOf”, “isPartOf”, “isTranslationOf”, “isDescriptiveMetadataOf”
- More statements can be defined as required
Relationships - example
<didl:Item>
  …
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <dir:Relations xmlns:dir="http://library.lanl.gov/2003-11/MPEG-21/DIR">
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:base="urn:uuid:10ba6842-ec45-3b19-8kub-hy8ff58c58a8b">
          <rdf:Description rdf:about="#//didl:Item[1]">
            <a:isPartOf xmlns:a="http://purl.org/dc/terms/#">
              <rdf:Description rdf:about="info:sid/library.lanl.gov:lanl-opac">
                <b:hasType xmlns:b="http://…/Relations#" rdf:resource="http://…/Relations#Collection"/>
              </rdf:Description>
            </a:isPartOf>
          </rdf:Description>
          <rdf:Description rdf:about="#//didl:Item[1]">
            <b:isDescriptiveMetadataOf xmlns:b="http://…/Relations#" rdf:resource="#//didl:Item[2]"/>
          </rdf:Description>
        </rdf:RDF>
      </dir:Relations>
    </didl:Statement>
  </didl:Descriptor>
  …
</didl:Item>
Table 8 from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
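A simple way to read such relationship statements is to walk the `rdf:Description` elements and collect subject, predicate, object triples. The sketch below handles only properties that point at their object with `rdf:resource`. It skips nested descriptions such as the `isPartOf` case and does not resolve `xml:base`. Because the relations namespace URI is elided in the slides, `http://example.org/Relations#` is a hypothetical stand-in.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL = "http://example.org/Relations#"  # hypothetical stand-in for the elided URI

SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:b="{REL}">
  <rdf:Description rdf:about="#//didl:Item[1]">
    <b:isDescriptiveMetadataOf rdf:resource="#//didl:Item[2]"/>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(rdf_root):
    """Collect (subject, predicate, object) for rdf:resource-style properties only."""
    triples = []
    for description in rdf_root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            target = prop.get(f"{{{RDF}}}resource")
            if target is not None:
                # prop.tag is in ElementTree's "{namespace}localname" form
                triples.append((subject, prop.tag, target))
    return triples


rdf_root = ET.fromstring(SAMPLE)
for triple in simple_triples(rdf_root):
    print(triple)
```

For anything beyond this flat case, a full RDF/XML parser would be the safer choice, since RDF/XML allows several equivalent serializations of the same statement.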
Creation Date
- Descriptor to contain the creation datetime of each entity: Container, Item, Component
- Digital Item Date Time namespace: <didt:Created>
Creation Date
<!-- Creation-datetime of Container -->
<didl:Descriptor>
  <didl:Statement mimeType="text/xml; charset=UTF-8">
    <didt:Created xmlns:didt="http://library.lanl.gov/2003-09/MPEG-21/DIDT">2003-09-05T21:51:01Z</didt:Created>
  </didl:Statement>
</didl:Descriptor>
Excerpt from Table 9, from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
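The creation datetime in the example is a UTC timestamp with a trailing `Z`. A minimal sketch for reading it into a timezone-aware Python value follows. The DIDT namespace URI is taken from the example. The DIDL namespace URI, the enclosing Container in `SAMPLE`, and the function name `created_at` are assumptions, and the parser only accepts the exact second-precision format shown on the slide.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:01-DIDL-NS",               # assumed
    "didt": "http://library.lanl.gov/2003-09/MPEG-21/DIDT",  # from the slide
}

SAMPLE = """
<didl:Container xmlns:didl="urn:mpeg:mpeg21:2002:01-DIDL-NS">
  <didl:Descriptor>
    <didl:Statement mimeType="text/xml; charset=UTF-8">
      <didt:Created xmlns:didt="http://library.lanl.gov/2003-09/MPEG-21/DIDT">
        2003-09-05T21:51:01Z
      </didt:Created>
    </didl:Statement>
  </didl:Descriptor>
</didl:Container>
"""


def created_at(entity):
    """Return the entity's creation datetime in UTC, or None if absent."""
    el = entity.find("didl:Descriptor/didl:Statement/didt:Created", NS)
    if el is None or not el.text:
        return None
    parsed = datetime.strptime(el.text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


container = ET.fromstring(SAMPLE)
print(created_at(container).isoformat())
```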
Conclusion
- LANL utilized the MPEG-21 DIDL Data Model: Container, Item, Component, Resource, Descriptor
- Utilized the flexibility provided by Descriptors
- Defined Namespaces for
  - Identifiers – DII
  - PlaceHolders for Processing Items
  - Rights Expressions
  - Creation Date and Time
Example – Item hierarchy
…
<didl:Item>
  Descriptor for dii:Identifier: content identifier of item
  Descriptor for didt:Created: creation date of item
  Descriptor for defining relationships of item
  <didl:Component>
    Descriptor for PlaceHolder of datastream
    Descriptor for didt:Created: creation date of datastream
    didl:Resource containing actual datastream
    didl:Resource containing datastream reference
  </didl:Component>
</didl:Item>
…
Excerpt from Appendix A, from Bekaert, Hochstenbach, Van de Sompel http://www.dlib.org/dlib/november03/bekaert/11bekaert.html#34
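The hierarchy above can be approximated programmatically. The following sketch builds a reduced Item (identifier, creation date, and one Component holding a by-reference Resource) and omits the relationship and PlaceHolder Descriptors. The DIDL namespace URI, the use of a `ref` attribute for the datastream reference, the example URL, and the function names are assumptions rather than details confirmed by these slides.

```python
import xml.etree.ElementTree as ET

DIDL = "urn:mpeg:mpeg21:2002:01-DIDL-NS"               # assumed
DII = "urn:mpeg:mpeg21:2002:01-DII-NS"                 # from the DII example
DIDT = "http://library.lanl.gov/2003-09/MPEG-21/DIDT"  # from the Creation Date example

# Prefix registration only affects how the XML is serialized.
for prefix, uri in (("didl", DIDL), ("dii", DII), ("didt", DIDT)):
    ET.register_namespace(prefix, uri)


def add_descriptor(parent, tag, text):
    """Append a Descriptor/Statement pair wrapping one secondary-information element."""
    descriptor = ET.SubElement(parent, f"{{{DIDL}}}Descriptor")
    statement = ET.SubElement(
        descriptor, f"{{{DIDL}}}Statement", {"mimeType": "text/xml; charset=UTF-8"}
    )
    ET.SubElement(statement, tag).text = text
    return descriptor


def build_item(content_id, created, resource_ref, mime_type):
    """Build a reduced Item: identifier, creation date, one Component with a Resource."""
    item = ET.Element(f"{{{DIDL}}}Item")
    add_descriptor(item, f"{{{DII}}}Identifier", content_id)
    add_descriptor(item, f"{{{DIDT}}}Created", created)
    component = ET.SubElement(item, f"{{{DIDL}}}Component")
    add_descriptor(component, f"{{{DIDT}}}Created", created)
    # Assumption: the datastream reference is carried in a "ref" attribute.
    ET.SubElement(
        component, f"{{{DIDL}}}Resource", {"mimeType": mime_type, "ref": resource_ref}
    )
    return item


item = build_item(
    "urn:bar:99-6537",
    "2003-09-05T21:51:01Z",
    "http://example.org/report.pdf",  # hypothetical location
    "application/pdf",
)
print(ET.tostring(item, encoding="unicode"))
```

Keeping every piece of secondary information inside its own Descriptor matches the pattern used throughout the examples, where each Statement wraps exactly one element.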