Download presentation
Presentation is loading. Please wait.
Published byDarlene Marshall Modified over 9 years ago
1
Incompatible or Interoperable? A METS bridge for a small gap between two digital preservation software packages Lucas Mak Metadata & CatalogLibrarian makw@msu.edu Aaron Collie Digital Curation Librarian collie@msu.edu
2
What we wanted What we found What we did METS
3
We bridged a gap, but we didn’t close the bridge METS
4
METS 1.8 METS Archivematica Output: METS.xml AIP DIP METS Fedora Commons Input: METS Fedora Extension fedora-batch-ingest.sh Datastreams! Staging 12 TB Staging 12 TB Dark Archive 84 TB Dark Archive 84 TB Serving 5 TB Serving 5 TB AIP DIP AIP
5
METS METS Fedora Ext. 1.1 XSL Humans Staging 12 TB Staging 12 TB METS 1.8 DIP AIP ProQuest
6
Persistent ID (PID) METS PQ_DATA DC MODS PREMIS (…) METS DIP AIP (“E”) METS Fedora Ext. 1.1
7
Why? We wanted to be able to control and systematize ingest at the microservice level And we like the direction Archivematica is taking We wanted to pipe technical and preservation metadata into Fedora Commons This was the reason we got started We haven’t contributed to the open source community, and we wanted something to learn on. We are thinking of it as professional development…
8
Comparing Archivematica & Fedora METS Different schema Archivematica: METS v. 1.8 http://www.loc.gov/standards/mets/version18/mets.xsd Fedora: Fedora METS 1.1 http://fedora-commons.org/definitions/1/0/mets-fedora-ext1-1.xsd Differences in structure, elements, attributes, & values allowed
9
Archivematica Physical structMap of the bag (i.e. directory structure) Fedora: No per v.1.1* Solution: Structure represented by & attributes of by page no. embedded in filename only physical arrangement is possible unless changing file naming convention to include logical info by file type/usage (e.g. preservation master, high/low resolution access copies) * is allowed in schema v.1.0 (used until Fedora 3.0)
10
Archivematica Two file groups: “Original” & “Submission documentation” – Original: digital objects – Submission documentation: descriptive metadata XML files Fedora Datastreams to be ingested as files – Files of digital objects and others (e.g. Archivematica METS) – Descriptive metadata XML files are ingested as “inline XML datastreams” » Copy all XML files in “Submission documentation” into separate elements
11
Archivematica: Hierarchical structure … … 1 digital file has 1 All,, and pertaining to the same file are nested under the same
12
Fedora: Flat structure To accommodate inline XML datastream versioning – ID (syntax DSn.v) contains both: » the number of the inline datastream (n) and » the version number of the datastream (v) – Individual serves as container and its ID serves to indicate datastream number – and alike have their IDs to indicate datastream version number
13
attribute in Archivematica Pointing to one, which has,,, and nested within, per file – Fedora Pointing to multiple, each of which contains,,, or, per file –
14
Archivematica Only 1 Dublin Core record is allowed to describe the SIP Constrained by Archivematica workflow instead of METS schema Additional descriptive metadata XML records are included in “Submission documentation” folder Fedora Fedora extension element: Allowed MDTYPE: MARC, EAD, DC, NISOIMG, LC-AV, VRA, TEI Header, DDI, FGDC, & OTHER Copy XML files in “Submission documentation” folder into separate – MODS has to be labeled as “OTHER” – Use namespace URI to assign correct “MDTYPE” » Does not work with TEI Header or EAD
15
Archivematica Does not use (optional in METS schema) Fedora attribute to indicate whether the object is “active”, “inactive” or “deleted” Hard-coding in with constant data MSU Libraries Digital and Multimedia Center
16
attribute in Archivematica Does not use (optional in METS schema) Fedora To indicate whether the file is “managed by Fedora internally”, “externally referenced”, or “redirected” – Though optional according to Fedora-METS schema Determine based on filename or file format – Archivematica add “checksum” into filename for files generated during the preservation workflow
17
Proposed Workflow 84TB Dark Archive(s) Staging Area 12 TB Staging Area 12 TB Serving Share(s) Web Display METS AIP DIP
18
What a bridge gets us: Automatically extracts and captures technical & preservation metadata Eases handling of complex objects with lots of metadata or parts Maintains and manages separate AIP/DIP packages METS
19
What a full integration might benefit from: Archivematica A/DIP Content Model & Solution Pack Integrated AIP management Including dashboard GUI Including JMS messaging Integrated rebuilds from filesystem Currently supported in Fedora Commmons On Roadmap for archivematica Automated ingest, improved handling
20
Questions? Lucas Mak (makw@msu.edu) Aaron Collie (collie@msu.edu)
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.