DRS 2 Orientation
  Harvard University Library
  September 30, 2010
  DRS = Digital Repository Service
Agenda
  1. DRS 2
     1. Concepts (Andrea)
     2. New metadata (Robin)
     3. Overall schedule (Andrea)
  2. BatchBuilder 2 demo (Vitaly)
  3. Testing instructions (Vitaly)
  4. Questions & comments
DRS 2 Concepts
DRS 1: everything’s a file
  METS XML file, TIFF image file, JPEG image file, JP2 image file, JPEG image file, JP2 image file, Text file, ZIP file, PDF document file
File level is not a meaningful level for curatorial uses… Which DRS files make up my digital manuscript? HOLLIS number
METS XML file TIFF image file JPEG image file JP2 image file JPEG image file JP2 image file Text file ZIP file PDF document file DRS file ID =
METS XML file TIFF image file JP2 image file
METS XML file TIFF image file JP2 image file page 1 page 2
Objects
  Aggregations of files that together represent a coherent unit of content
    All the files that make up a single digital book
    All the master and use copies representing a single photograph
  Useful for management, reporting and searching
    “How many PDS document objects do I have in the DRS?”
Objects
  New hook for metadata
    Administrative categories (projects, exhibits, collections, etc.)
    Descriptive metadata, catalog records
  Diagram labels: Object; Hollis #; Digital Medieval Manuscripts at Houghton Library; Moralia in Job: manuscript
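The object idea can be illustrated with a small data structure. This is a minimal sketch, not the DRS data model: the class names, field names, and sample values are all hypothetical, chosen only to show files grouped under an object that carries its own metadata and can be counted by type.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RepositoryFile:
    file_id: int       # hypothetical file identifier
    file_format: str   # e.g. "TIFF", "JP2", "text"


@dataclass
class RepositoryObject:
    owner_supplied_name: str
    content_model: str                               # e.g. "PDS document"
    files: List[RepositoryFile] = field(default_factory=list)
    admin_categories: List[str] = field(default_factory=list)
    hollis_number: Optional[str] = None              # link to a catalog record


def count_by_content_model(objects):
    """Answer questions like 'How many PDS document objects do I have?'"""
    return Counter(obj.content_model for obj in objects)


# Illustrative objects only; names and IDs are made up.
manuscript = RepositoryObject(
    owner_supplied_name="moralia-in-job",
    content_model="PDS document",
    files=[RepositoryFile(1, "JP2"), RepositoryFile(2, "text")],
    admin_categories=["Digital Medieval Manuscripts at Houghton Library"],
)
photograph = RepositoryObject(
    owner_supplied_name="photo-0001",
    content_model="Still image",
    files=[RepositoryFile(3, "TIFF"), RepositoryFile(4, "JPEG")],
)

counts = count_by_content_model([manuscript, photograph])
print(counts["PDS document"])  # 1
```

Because metadata hangs off the object rather than off each file, reporting questions become a single pass over objects instead of a search across unrelated file records.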
Content models
  Object types
  Define
    valid file formats and relationships
    known delivery and rendering applications
    associated assessments and preservation plans
  Enforce conformity - we know what we have in the DRS and can monitor & preserve it
DRS 2.1 content models – deposit & delivery
  1. Still image: image objects, delivered by IDS
  2. PDS document: page-turned documents, delivered by PDS
  3. Document: initially just PDF files, delivered by FDS
  4. Opaque: files in any format
  5. Text: text, XML, etc., delivered by FDS
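How a content model could enforce conformity can be sketched as a format check. The table below is an assumption built only from the examples in this deck; it is not the actual DRS rule set, and the function name is hypothetical.

```python
# Hypothetical format rules, loosely based on the examples in this deck.
# None means "any format is accepted" (the Opaque model).
ALLOWED_FORMATS = {
    "Still image": {"TIFF", "JPEG", "JP2"},
    "PDS document": {"JP2", "text"},
    "Document": {"PDF"},
    "Text": {"text", "XML"},
    "Opaque": None,
}


def nonconforming_formats(content_model, file_formats):
    """Return, sorted, the formats that the content model does not allow."""
    if content_model not in ALLOWED_FORMATS:
        raise ValueError(f"unknown content model: {content_model}")
    allowed = ALLOWED_FORMATS[content_model]
    if allowed is None:
        return []
    return sorted(set(file_formats) - allowed)


print(nonconforming_formats("Document", ["PDF"]))          # []
print(nonconforming_formats("Document", ["PDF", "TIFF"]))  # ['TIFF']
print(nonconforming_formats("Opaque", ["WordPerfect"]))    # []
```

A real content model would also describe relationships between files and delivery applications; this sketch covers only the "valid file formats" part.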
Still image CM – print
  TIFF archival master
  Several derivative JPEG deliverables
  Derivative JPEG thumbnail
  Pope Joan Series: Illustration from Philippus Bergomensis, De Claribus Mulieribus. Ferrara, Rossi
  Harvard Art Museum/Fogg Museum, Gift of Philip Hofer
PDS document CM - book
  Zoeller, Karl William. Merchandising the plumbing business. Chicago : Domestic Engineering Co., c1921. Baker Library.
  JP2 archival master / deliverable images per page
  Plain text files per page
  …
Document CM - report
  Intergovernmental Panel on Climate Change (IPCC) WG1 Fourth Assessment Report, Environmental Science and Public Policy Archives, Harvard College Library
  PDF deliverable
Opaque content model
  The contents of Judge Trager’s hard drive, Harvard Law School Library
  WordPerfect files, text files, PDF documents, etc.
  Plus documentation about the collection
Text CM – methodology
  Plain text file
  Processing methodology for Intergovernmental Panel on Climate Change (IPCC) documents, HCL Imaging Services.
New metadata
Object descriptors
  A METS metadata file per object, on the file system alongside content files
  Descriptive, administrative, preservation, technical and structural metadata
  Describes the object, all its files and bitstreams, and related significant events
  Gives the metadata the same secure storage as the content files
  Self-contained, portable objects
The move to standards
  PREMIS -- for key preservation metadata, including
    Events that affect content
    Relationships that are not implicit
  MODS -- for descriptive metadata
  Format-specific schemas for technical metadata, including
    MIX for images
    textMD for text
    DocumentMD for PDF and other document formats
    More to come…
  Supplemented by local administrative schemas
New local metadata adminCategory adminFlag captions, phase 2 Behavior, default, unit name, description for objects content model identification DRS URI isFirstGenerationInDrs Closest to original capture isPreferredDeliverableSource
Changes to local metadata
  OwnerSuppliedName: required for objects, optional for files
  Role: repeatable for both objects and files
  Processing: instead of “purpose”; repeatable
  Quality: optional
  Methodology: now for objects and files of all types
Tracking changes
  DRS 2 will keep track of
    Changes that affect content
    Troubleshooting content errors
    Key administrative metadata
  Three types:
    Events
    Administrative flags
    “Versioned” metadata elements
  Not tracking every metadata change
Events
  Object
    creation
    deletion / recovery from deletion
    ingest
    merge
  File
    addition
    deletion / recovery from deletion
    integrity check confirmation
    replacement
    virus check confirmation
Other tracking
  Metadata where changes will be tracked:
    Access Flag
    Administrative Flag
    Billing Code
    Owner Code
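The split between events, tracked metadata, and untracked metadata can be sketched as follows. This is an illustration under stated assumptions, not DRS 2 code: the class, the field identifiers, and the sample values are hypothetical stand-ins for the four tracked fields named on the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Stand-ins for Access Flag, Administrative Flag, Billing Code, Owner Code.
# The identifiers themselves are hypothetical.
TRACKED_FIELDS = {"access_flag", "admin_flag", "billing_code", "owner_code"}


@dataclass
class TrackedObject:
    metadata: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # changes to tracked fields only
    events: list = field(default_factory=list)   # content-affecting events

    def set_metadata(self, name, value):
        """Update a field; keep the old value only if the field is tracked."""
        old = self.metadata.get(name)
        self.metadata[name] = value
        if name in TRACKED_FIELDS and old != value:
            self.history.append((datetime.now(timezone.utc), name, old, value))

    def record_event(self, event_type, detail=""):
        """Log an event such as ingest, deletion, or integrity check."""
        self.events.append((datetime.now(timezone.utc), event_type, detail))


obj = TrackedObject()
obj.record_event("ingest")
obj.set_metadata("owner_code", "EXAMPLE.ONE")  # tracked (made-up value)
obj.set_metadata("quality", "high")            # updated but not tracked
obj.set_metadata("owner_code", "EXAMPLE.TWO")  # tracked

print(len(obj.history))  # 2
print(len(obj.events))   # 1
```

Keeping history for only a few fields reflects the slide's point that not every metadata change is tracked.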
What’s inside a descriptor?
  Descriptive Metadata
    MODS
  Administrative Metadata
    For the object:
      PREMIS (including relationships)
      DRS administrative metadata
    For each file:
      PREMIS (including relationships)
      Format-specific metadata
      DRS administrative metadata
  PREMIS Events
  Inventory of Files
  Structure Map
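The descriptor layout can be approximated with a bare METS skeleton. The sketch below uses standard METS and MODS elements from Python's standard library; the function name, the IDs, and the exact arrangement are assumptions for illustration and are not the descriptor that DRS 2 actually writes. The administrative section is left empty where PREMIS and local metadata would go.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_descriptor(title, file_names):
    """Build a minimal METS tree: dmdSec, amdSec, fileSec, structMap."""
    mets = ET.Element(m("mets"))

    # Descriptive metadata: MODS wrapped inside a METS dmdSec.
    dmd = ET.SubElement(mets, m("dmdSec"), ID="dmd1")
    wrap = ET.SubElement(dmd, m("mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, m("xmlData"))
    mods = ET.SubElement(xml_data, f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title

    # Administrative metadata placeholder (left empty in this sketch).
    ET.SubElement(mets, m("amdSec"), ID="amd1")

    # Inventory of files plus a structure map pointing at each file.
    file_grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"))
    struct_map = ET.SubElement(mets, m("structMap"))
    top_div = ET.SubElement(struct_map, m("div"), TYPE="object")
    for i, name in enumerate(file_names, start=1):
        file_id = f"file{i}"
        ET.SubElement(file_grp, m("file"), ID=file_id, OWNERID=name)
        page = ET.SubElement(top_div, m("div"), ORDER=str(i))
        ET.SubElement(page, m("fptr"), FILEID=file_id)
    return mets


descriptor = build_descriptor("Moralia in Job", ["page1.jp2", "page2.jp2"])
print(ET.tostring(descriptor, encoding="unicode"))
```

Writing one such file next to the content files is what makes the object self-contained: the metadata travels with the files it describes.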
Overall schedule
Available now: first release of BatchBuilder 2 for depositor training and testing
  Supports 5 content models
Fall 2010 – Summer 2011
  BatchBuilder 2 enhancements & bug fixes
  Web Admin 2 development and testing
~September 2011: BatchBuilder 2 and Web Admin 2 in production
BatchBuilder 2
  Will build batches of objects rather than batches of files
  Will automatically determine most technical metadata (using FITS)
  Will automatically create all object descriptors (using OTS)
BatchBuilder 1 compared with BatchBuilder 2
  BatchBuilder 1: Expects files and creates batches of files.
  BatchBuilder 2: Expects objects and creates batches of objects.

  BatchBuilder 1: Can use an existing PDS METS file for PDS objects.
  BatchBuilder 2: Can import a structmap from an “old-style” PDS METS file to create a PDS Document descriptor.

  BatchBuilder 1: Uses batch genres.
  BatchBuilder 2: Uses DRS Content Models.

  BatchBuilder 1: Uses a supplied HOLLIS ID to import contents of a HOLLIS record to a PDS METS Label.
  BatchBuilder 2: Uses a supplied HOLLIS ID to import contents of a HOLLIS record into the MODS section of the object descriptor.

  BatchBuilder 1: Batch level and directory level metadata entered in Batch Template panel.
  BatchBuilder 2: Object level and directory level metadata entered in Object Template.

  BatchBuilder 1: Project level metadata is entered in Administrative Properties panel.
  BatchBuilder 2: Project level metadata is entered in Deposit Settings panel.

  BatchBuilder 1: No depositor authorization – anyone with access to the ftp dropbox can load batches.
  BatchBuilder 2: Depositor authorization – only depositors with permission to load into a particular owner code can load batches into that owner code.
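The depositor authorization rule in the last comparison can be sketched as a permission lookup keyed by owner code. Everything here is hypothetical (the depositor names, the owner codes, and both functions); it shows only the shape of the check, not how BatchBuilder 2 or the DRS loader implements it.

```python
# Hypothetical permission table: depositor -> owner codes they may load into.
PERMISSIONS = {
    "depositor_a": {"OWNER.ONE"},
    "depositor_b": {"OWNER.ONE", "OWNER.TWO"},
}


def can_load(depositor, owner_code, permissions=PERMISSIONS):
    """True only if this depositor has permission for this owner code."""
    return owner_code in permissions.get(depositor, set())


def load_batch(depositor, owner_code, object_names):
    """Refuse the whole batch unless the depositor is authorized."""
    if not can_load(depositor, owner_code):
        raise PermissionError(
            f"{depositor} may not load batches into {owner_code}"
        )
    # Stand-in for the real load: return the names that would be deposited.
    return [f"{owner_code}:{name}" for name in object_names]


print(load_batch("depositor_b", "OWNER.TWO", ["book-001"]))
# ['OWNER.TWO:book-001']
```

An unknown depositor falls through to an empty permission set, so the default is to deny.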
Testing instructions
Questions & Comments