Medusa at the University of Illinois

1 Medusa at the University of Illinois
A Digital Preservation Repository Built Upon PREMIS Kyle Rimkus Preservation Librarian University of Illinois Urbana-Champaign presented October 2, 2012 iPres 2012, Toronto

2 National Digital Infrastructure and Information Preservation (NDIIPP) Program grants
Phase I : Phase II: Background of dp at UIUC… “Hub and Spoke” (HandS tool suite):

3 HandS METS Profile This approach was distinguished by several key factors: the reliance on PREMIS for digital preservation metadata; the reliance on MODS for descriptive metadata; and the packaging of PREMIS, MODS, and other associated metadata and file information in the METS format, using a METS profile designed specifically for this project, to describe the relationships

4 Medusa is Born The central idea of our PREMIS implementation is that it is platform and infrastructure independent. The PREMIS records that describe our digital objects do so in such a way that the system in which they are currently managed is of little consequence – the emphasis is placed not on the software, but on the objects in it and the records that describe them.

5 PREMIS in Medusa The central concept here is that of the self-describing, encapsulated object. That is, every digital asset stewarded in Medusa – whether a content or metadata file – is assigned a unique ID and an associated PREMIS file which tells the story of that item. We dislike the practice common to many repository platforms where, for example, digital content files live in one place, such as a file server, and metadata lives in a database. In such an infrastructure, there is an inherent risk to the long-term viability of digital objects, as their constituent parts are split up across a variety of systems subject to their own specific risk factors. We also like this because we are not leaving any metadata of importance in a database or other external application; we make sure to store all digital preservation metadata and relation metadata in our PREMIS files.
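As a rough illustration of the self-describing object idea, the sketch below mints an identifier for a file and writes a small PREMIS-style record next to it. It is not Medusa's actual code: the function name `describe_file`, the `.premis.xml` sidecar naming, and the `MEDUSA:` plus UUID identifier pattern are assumptions made for the example, and the record is a simplified subset of PREMIS elements (no namespace, not schema-valid).

```python
import hashlib
import tempfile
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path


def describe_file(content_path: Path) -> Path:
    """Write a simplified PREMIS-style record beside a file and return its path.

    Hypothetical helper: the sidecar naming and identifier pattern are
    illustrative assumptions, not Medusa's documented behavior.
    """
    data = content_path.read_bytes()

    obj = ET.Element("object")

    # Unique identifier for this asset.
    ident = ET.SubElement(obj, "objectIdentifier")
    ET.SubElement(ident, "objectIdentifierType").text = "LOCAL"
    ET.SubElement(ident, "objectIdentifierValue").text = f"MEDUSA:{uuid.uuid4()}"

    # Basic technical characteristics: checksum and size in bytes.
    chars = ET.SubElement(obj, "objectCharacteristics")
    fixity = ET.SubElement(chars, "fixity")
    ET.SubElement(fixity, "messageDigestAlgorithm").text = "SHA-256"
    ET.SubElement(fixity, "messageDigest").text = hashlib.sha256(data).hexdigest()
    ET.SubElement(chars, "size").text = str(len(data))

    # The record lives next to the content it describes, not in a database.
    sidecar = content_path.with_name(content_path.name + ".premis.xml")
    ET.ElementTree(obj).write(str(sidecar), encoding="utf-8", xml_declaration=True)
    return sidecar


# Demonstration on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "page-001.tif"
    sample.write_bytes(b"sample bytes, not a real TIFF")
    print(describe_file(sample).name)
```

Because the record travels with the file, copying the directory to another system carries the identifier and fixity information along with the content.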

6 PREMIS “relationship” vs METS “structMap”
Intellectual entities Rights Objects Agents Events …we do this without METS. We found from our Hub and Spoke experience that there is considerable overlap between many of the fields available in METS and PREMIS – you have to choose whether to place file information such as file type and file size, among other things, in one, the other, or both. When you come down to it, in fact, and this is perhaps a gross oversimplification – but for the sake of our discussion, let’s say that the one thing you have in METS that you do not necessarily have in PREMIS is the one required METS tag of “structMap” or “structure map” for indicating the structure of METS packages and the relationships between metadata and items. As anyone who has worked with METS knows, there can be considerable overhead involved in expressing such relationships in consistently valid XML, especially if you end up going down the path of generating vast METS files for complex objects.

7 PREMIS Controlled Vocabularies
Relationship Types and Subtypes: …taking a linked-data inspired approach to the use of the PREMIS “relationship” tag – preferring the flexibility it offers to construct a web of arbitrary relationships between digital assets with a fairly simple structure, rather than the more baroque, strictly hierarchical alternative offered by METS records.

8 A PREMIS snippet
<relationship>
  <relationshipType>BASIC_IMAGE_ASSET</relationshipType>
  <relationshipSubType>PARENT</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>LOCAL</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>MEDUSA:4052dc68-7c0f-420b-9d07-840c79768ae9-2</relatedObjectIdentifierValue>
  </relatedObjectIdentification>
</relationship>
By “linked data inspired,” and we’ve also talked about this being inspired by object oriented programming, I mean the following. By allowing a “relationshipType” and a “relationshipSubType” to refine the “relationship” tag, the PREMIS standard offers a considerable amount of flexibility.
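To show how little structure a consumer needs in order to read such a relationship, here is a minimal sketch that parses the snippet from this slide with Python's standard library. The function name `parse_relationship` is made up for the example, and the code assumes the element appears without an XML namespace, as it does on the slide; a full PREMIS file would normally declare one.

```python
import xml.etree.ElementTree as ET

# The relationship element from the slide, reproduced without a namespace.
SNIPPET = """<relationship>
  <relationshipType>BASIC_IMAGE_ASSET</relationshipType>
  <relationshipSubType>PARENT</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>LOCAL</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>MEDUSA:4052dc68-7c0f-420b-9d07-840c79768ae9-2</relatedObjectIdentifierValue>
  </relatedObjectIdentification>
</relationship>"""


def parse_relationship(xml_text):
    """Return (type, subtype, targets) for a single relationship element.

    targets is a list of (identifier_type, identifier_value) pairs, one per
    relatedObjectIdentification child.
    """
    rel = ET.fromstring(xml_text)
    rel_type = rel.findtext("relationshipType")
    rel_subtype = rel.findtext("relationshipSubType")
    targets = [
        (
            target.findtext("relatedObjectIdentifierType"),
            target.findtext("relatedObjectIdentifierValue"),
        )
        for target in rel.findall("relatedObjectIdentification")
    ]
    return rel_type, rel_subtype, targets


rel_type, rel_subtype, targets = parse_relationship(SNIPPET)
print(rel_type, rel_subtype, targets)
```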

9 PREMIS Controlled Vocabularies
What we end up with, in effect, is the ability to define any type of relationship we want between any single asset and any number of others, with similar flexibility in defining Events, Agents, and Rights specific to assets in our digital preservation environment. Currently, our technical team is creating these terms as they go along, and the terms often have only a very loose connection to controlled vocabularies.
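The "web of relationships" idea, together with a check against a vocabulary, might be sketched as follows. Everything here is hypothetical: the class `RelationshipGraph`, the set `ALLOWED_SUBTYPES`, the `CHILD` term, and the example identifiers are invented for illustration; only `BASIC_IMAGE_ASSET` and `PARENT` appear in the snippet shown earlier.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary of relationship subtypes.
ALLOWED_SUBTYPES = {"PARENT", "CHILD"}


class RelationshipGraph:
    """Typed, directed links between asset identifiers."""

    def __init__(self):
        # source id -> list of (type, subtype, target id)
        self._edges = defaultdict(list)

    def add(self, source_id, rel_type, rel_subtype, target_id):
        """Record one relationship, rejecting subtypes outside the vocabulary."""
        if rel_subtype not in ALLOWED_SUBTYPES:
            raise ValueError(f"unknown relationship subtype: {rel_subtype!r}")
        self._edges[source_id].append((rel_type, rel_subtype, target_id))

    def related(self, source_id, rel_subtype=None):
        """Return target ids linked from source_id, optionally filtered by subtype."""
        return [
            target
            for (_, subtype, target) in self._edges.get(source_id, [])
            if rel_subtype is None or subtype == rel_subtype
        ]


# Made-up identifiers: one page image that points at its parent and a child.
graph = RelationshipGraph()
graph.add("MEDUSA:example-page", "BASIC_IMAGE_ASSET", "PARENT", "MEDUSA:example-book")
graph.add("MEDUSA:example-page", "BASIC_IMAGE_ASSET", "CHILD", "MEDUSA:example-thumbnail")
print(graph.related("MEDUSA:example-page", "PARENT"))
```

An asset can carry any number of such links, so the structure is a graph rather than a single hierarchy; validating each term on entry is one way to tighten the loose connection to controlled vocabularies mentioned above.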

10 A PREMIS Archival Information Package
…system is still under development.

11 …sample of some actual XML generated by one of our test packages.

12 Questions? Public documentation coming soon (before 2013) at:
LibraryDigitalPreservation/Home

