
PREMIS at the British Library. Markus Enders, The British Library. PREMIS Implementation Fair, San Francisco, CA, 7 October 2009.


1 PREMIS at the British Library
Markus Enders, The British Library
PREMIS Implementation Fair, San Francisco, CA, 7 October 2009

2 General Archival Information Package (AIP)
- AIP is just a conceptual entity
- Conceptual (generic) data model
- Content files stored on write-once media
- Content files may be containerized (stored in ZIP or WARC files)
- One or more containers per AIP; files in containers may belong to various AIPs
- AIP Descriptor: METS file describes the content of the AIP (structure, files, descriptive metadata, preservation metadata)
- Different METS profiles for different content streams: eJournals, newspapers (born digital and digitized), web archiving
- Common underlying document model for all AIPs
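
The container rule above (one or more containers per AIP, and files in one container possibly belonging to several AIPs) can be pictured with a small sketch. This is only an illustration of the conceptual model; the class names, file names, and AIP identifiers are hypothetical and do not come from the British Library's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ContentFile:
    name: str
    aip_id: str  # hypothetical identifier of the AIP this file belongs to


@dataclass
class Container:
    name: str  # e.g. a ZIP or WARC file
    files: list = field(default_factory=list)


def aips_in_container(container):
    """Return the AIP identifiers that have at least one file in this container."""
    return sorted({f.aip_id for f in container.files})


def containers_for_aip(containers, aip_id):
    """Return the names of all containers holding files of the given AIP."""
    return [c.name for c in containers if any(f.aip_id == aip_id for f in c.files)]


# Hypothetical data: aip-1 is spread over two containers,
# and the first container holds files of two different AIPs.
c1 = Container("batch-0001.zip", [ContentFile("a.pdf", "aip-1"), ContentFile("b.pdf", "aip-2")])
c2 = Container("batch-0002.zip", [ContentFile("a.xml", "aip-1")])

print(aips_in_container(c1))
print(containers_for_aip([c1, c2], "aip-1"))
```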

3 METS Descriptor
What is stored in the METS Descriptor?
- Structure of the document (logical and physical in different structMaps)
- Not all content streams have two structMaps (born-digital streams have only one)
- Descriptive metadata
- File Section: defines container files as well as content files (nested elements)

4 METS Descriptor
What is stored in the METS Descriptor?
- Structure of the document (logical and physical in different structMaps)
- Not all content streams have two structMaps (born-digital streams have only one)
- Descriptive metadata
- File Section: defines container files as well as content files (nested elements)
- Preservation metadata: preservation metadata for files and representations
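
A minimal sketch of a file section in which content files are nested inside a container file. It assumes the nesting is expressed with nested METS `file` elements, which the METS schema permits; the transcript does not name the element, and the IDs and the helper `build_file_sec` are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name of a METS element."""
    return "{%s}%s" % (METS_NS, tag)


def build_file_sec(container_id, content_ids):
    """Build a fileSec with one container file and nested content files."""
    file_sec = ET.Element(q("fileSec"))
    file_grp = ET.SubElement(file_sec, q("fileGrp"))
    container = ET.SubElement(file_grp, q("file"), {"ID": container_id})
    for content_id in content_ids:
        # Content files are children of the container's file element.
        ET.SubElement(container, q("file"), {"ID": content_id})
    return file_sec


file_sec = build_file_sec("container-1", ["file-1", "file-2"])

# Collect the IDs of files nested one level below a container file.
path = "/".join([".", q("fileGrp"), q("file"), q("file")])
nested_ids = [f.get("ID") for f in file_sec.findall(path)]
print(nested_ids)
print(ET.tostring(file_sec, encoding="unicode"))
```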

5 METS Descriptor
What is stored in the METS Descriptor?
- Preservation metadata: preservation metadata for files and representations
- Focuses on:
  - Audit trail – events and agents
  - Technical metadata – basic technical metadata in METS and PREMIS
- Assumption: future migrations of files will be necessary
- No emulation considered; no environment information stored

6 Preservation Metadata (PREMIS) in METS
Content streams:
- eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
- Newspapers use PREMIS 2.0; MODS 3.3; METS 1.8
- Web Archiving uses PREMIS 2.0; MODS 3.3; DC; METS 1.8

7 Preservation Metadata (PREMIS): eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
AIP model:
- One AIP per article, issue, journal, digital manifestation
- Any change leads to a new AIP; the old version of the AIP is referenced

8 Preservation Metadata (PREMIS): eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
AIP model:
- One AIP per article, issue, journal, digital manifestation
- Journal, Issue, Article: the AIP consists of just a METS descriptor (mainly embedded descriptive metadata (MODS) and preservation metadata)
- PREMIS: journal, issue and article are regarded as representations of intellectual entities
- Relationships between representations are recorded in the MODS record

9 Preservation Metadata (PREMIS): eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE DTD
AIP model:
- One AIP per article, issue, journal, manifestation
- Digital Manifestation: the AIP consists of content files and a METS descriptor
- The METS descriptor contains PREMIS records for the files and one for the Digital Manifestation itself
- The relationship to the article is recorded in the PREMIS record (manifestationOf)
- The relationship to the submission is recorded in PREMIS (containedInSubmission)
- Submission: received content files in ZIP (one AIP)

10 Preservation Metadata (PREMIS) and METS: eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
- amdSec: one amdSec per PREMIS record; referenced from and elements
- Use of ; ; elements
- techMD:
  - Extracted data from JHOVE (files)
  - PREMIS record of a file
- digiprovMD:
  - PREMIS record of representations (journal, issue, article)
  - PREMIS record of a file

11 Preservation Metadata (PREMIS) and METS: eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
PREMIS elements used:
- objectIdentifier
- objectCategory
- preservationLevel
- size
- fixity (MD5, SHA-512)
- format (PRONOM)
- Relationships, events and agents where necessary

12 Preservation Metadata (PREMIS) and METS: eJournal content stream
Content streams: eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE output
PREMIS elements used:
- objectIdentifier
- objectCategory
- preservationLevel
- size
- fixity (MD5, SHA-512)
- format (PRONOM)
- Relationships, events and agents where necessary
Redundantly in METS element
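
As a rough illustration of the size and fixity values listed above, the following sketch computes them with Python's standard `hashlib` module. The function name `describe_file` and the dictionary keys are assumptions made for this example; they are not the British Library's implementation, and the sample bytes are made up.

```python
import hashlib


def describe_file(data: bytes) -> dict:
    """Return size and fixity digests for a file's content.

    The two algorithms match those named on the slide (MD5 and SHA-512).
    """
    return {
        "size": len(data),
        "fixity": {
            "MD5": hashlib.md5(data).hexdigest(),
            "SHA-512": hashlib.sha512(data).hexdigest(),
        },
    }


# Hypothetical content; in practice the bytes would be read from the content file.
record = describe_file(b"example article content")
print(record["size"])
print(record["fixity"]["MD5"])
print(record["fixity"]["SHA-512"])
```

Storing two independent digests means a later integrity check can still succeed if one algorithm is retired.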

13 Preservation Metadata (PREMIS): relationships
PREMIS relationships:
- manifestationOf (between Manifestation and Article)
- containedInSubmission (between Manifestation and Submission)
PREMIS relationships (between files: m-n relationships):
- migration
- uncompression
- modification
Relationships are always stored in PREMIS records for files
will have techMD and digiProvMD
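
The m-n nature of the file-level relationships can be sketched as a list of (type, source, target) triples indexed in both directions. The relationship type names come from the slide; the file identifiers, the triple layout, and the helper `index_relationships` are hypothetical.

```python
from collections import defaultdict

# Hypothetical relationships: two source files migrated into one target,
# and one compressed file uncompressed into two files.
relationships = [
    ("migration", "file-1", "file-3"),
    ("migration", "file-2", "file-3"),
    ("uncompression", "file-4", "file-5"),
    ("uncompression", "file-4", "file-6"),
]


def index_relationships(triples):
    """Index relationships by target (its sources) and by source (its targets)."""
    sources_of = defaultdict(set)
    targets_of = defaultdict(set)
    for rel_type, source, target in triples:
        sources_of[target].add((rel_type, source))
        targets_of[source].add((rel_type, target))
    return sources_of, targets_of


sources_of, targets_of = index_relationships(relationships)
print(sorted(sources_of["file-3"]))  # many sources, one target
print(sorted(targets_of["file-4"]))  # one source, many targets
```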

14 Preservation Metadata (PREMIS): events
PREMIS events (on file level):
- integrityCheck
- formatIdentification
- validation
- wellformness
- propertyExtraction
PREMIS events (on representation level):
- metadataUpdate
Relationships are always stored in PREMIS records for files
will have techMD and digiProvMD

15 Preservation Metadata (PREMIS): events
- PREMIS events always have an agent
- Events and agents are stored in each PREMIS record
- If an event affects more than one object, it must be repeated in each object's PREMIS record, using the same identifier to indicate that it is the same event
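
The repetition rule above can be sketched as follows: an event is copied into the record of every object it affects, and the shared identifier is what ties the copies together. The dictionary layout, the identifiers, and both helper functions are hypothetical; only the event type names are taken from the slides.

```python
def record_event(premis_records, event, object_ids):
    """Copy the event (including its agent) into each affected object's record."""
    for object_id in object_ids:
        premis_records.setdefault(object_id, []).append(dict(event))


def objects_for_event(premis_records, event_id):
    """Find all objects whose record contains an event with this identifier."""
    return sorted(
        object_id
        for object_id, events in premis_records.items()
        if any(e["eventIdentifier"] == event_id for e in events)
    )


premis_records = {}

# One integrity check that covered two files: the same event, repeated twice.
integrity_check = {
    "eventIdentifier": "event-0001",
    "eventType": "integrityCheck",
    "agent": "agent-ingest",
}
record_event(premis_records, integrity_check, ["file-1", "file-2"])

# A validation that touched only one file.
validation = {
    "eventIdentifier": "event-0002",
    "eventType": "validation",
    "agent": "agent-ingest",
}
record_event(premis_records, validation, ["file-2"])

print(objects_for_event(premis_records, "event-0001"))
```

The cost of this design is redundancy; the benefit is that each object's record carries its full audit trail without depending on another record.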

16 Preservation Metadata (PREMIS) in METS
Content streams:
- eJournals use PREMIS 1.1; MODS 3.2; METS 1.4; JHOVE DTD
- Newspapers use PREMIS 2.0; MODS 3.3; METS 1.8
- Web Archiving uses PREMIS 2.0; MODS 3.3; DC; METS 1.8
Move to PREMIS 2.0
Changes to AIP model

17 AIPs and PREMIS 2.0
Change of AIP:
- Newspapers need a second structMap (and structLink)
- Hierarchy of AIPs no longer possible
- Instead: one AIP per issue
- Manifestations are modelled as a (various manifestations per AIP possible)
- Support of container files (ZIP, WARC)
- Modelled as nested elements; no PREMIS record for container files
- No file-format-specific technical metadata is captured

18 METS and PREMIS 2.0
METS and PREMIS 2.0:
- Use of new METS schema versions: instead of objectCategory just use
- Agent, object, event in separate elements within the same
- PREMIS record should be self-containing

19 METS and PREMIS 2.0
Extended list of event types:
- deselection: files which are defined in the AIP descriptor but never ingested (no FLocat element)
- metadataExtraction vs. propertyExtraction
Extended list of relationship types (relationshipSubType):
- modification vs. manipulation
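
The deselection case (a file defined in the descriptor but lacking an FLocat element) can be detected with a short check over the METS file section. The sample fragment, its IDs, and its href are hypothetical, and the helper `files_without_flocat` is a sketch rather than the British Library's tooling.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical fragment: file-2 is declared but has no FLocat child.
SAMPLE = """\
<mets:fileSec xmlns:mets="http://www.loc.gov/METS/"
              xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileGrp>
    <mets:file ID="file-1">
      <mets:FLocat LOCTYPE="URL" xlink:href="file:///example/page-0001.tif"/>
    </mets:file>
    <mets:file ID="file-2"/>
  </mets:fileGrp>
</mets:fileSec>
"""


def files_without_flocat(root):
    """Return the IDs of file elements that have no direct FLocat child."""
    ns = {"mets": METS_NS}
    return [
        f.get("ID")
        for f in root.iter("{%s}file" % METS_NS)
        if f.find("mets:FLocat", ns) is None
    ]


root = ET.fromstring(SAMPLE)
print(files_without_flocat(root))
```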

21 METS and PREMIS 2.0
Problems:
- Validation
  - Using controlled vocabularies
  - Considering dependencies between METS and PREMIS
- Standardized workflow for creating METS and PREMIS for all content streams
  - Currently specific implementations for each content stream
- Extending the AIP Model
  - Preservation metadata for metadata records

22 Thanks
Markus Enders, The British Library
Markus.Enders@bl.uk

