The DigiTool to FDA Program Lydia Motyka Florida Center for Library Automation.

Presentation transcript:

The DigiTool to FDA Program Lydia Motyka Florida Center for Library Automation

What is the DigiTool to FDA Program? A program developed by FCLA that converts exported DigiTool entities into Submission Information Packages (SIPs) for archiving in the FDA repository.

Archiving DigiTool objects
Archiving DigiTool objects is a four-step process:
–Step 1: Affiliates flag DigiTool objects for export.
–Step 2: DigiTool objects flagged for export by Affiliates are exported using the “Export Digital Entities” job.
–Step 3: The DigiTool to FDA program (D2F) aggregates DigiTool objects into Intellectual Entities and creates Submission Information Packages (SIPs) and descriptors in the format required by the FDA.
–Step 4: The standard FDA Ingest process and program are used to archive the SIPs in the FDA repository.

DigiTool to Preservation Archive Workflow
–ETD information (PDF) is flagged in DigiTool.
–The flag causes export of metadata and files.
–The program creates a Submission Information Package (SIP).
–The SIP is ingested into the Florida Digital Archive preservation repository.

Definitions
DigiTool Digital Entity: Digital entities contain the following components:
–A persistent DigiTool internal ID (PID)
–Metadata of various types that describe the object
–A stream_ref section that points to an object

Definitions Submission Information Package (SIP): An FDA Submission Information Package (SIP) is a set of files intended for ingest into the Florida Digital Archive. (It is recommended practice that a single SIP should include only those files that comprise a single Intellectual Entity.)

Definitions Intellectual entity: “An Intellectual Entity is defined as something that can be reasonably described and used as a unit, and corresponds roughly to what might be described by a bibliographic record: a book, a sound recording, a photograph. (In the case of serial publications, it is recommended that a SIP include only a single issue, not a volume or set of volumes.)” FCLA Digital Archive (FDA) SIP Specification, Version 1.0

Selecting DigiTool entities for export to the FDA
Only those objects with filestreams in formats suitable for long-term preservation should be selected for archiving. (Format information can be found on the FDA website.) Examples:
–ETDs containing PDFs
–Institutional Repository materials
–Masters of scanned images when TIFF files have been loaded into DigiTool
Complex objects can be exported to the FDA, but care must be taken in flagging them.

Flagging DigiTool entities for export
DigiTool entities must have the following Control Fields in order to be exported for archiving in the FDA:
–Pres. Level = “Preservation Master”
–Partition C must contain a valid FDA Account and Project code, separated by a comma

Flagging DigiTool objects
–Each DigiTool object to be archived must be flagged with Pres. Level = “Preservation Master”.
–Related objects not flagged as “Preservation Master” will not be exported for archiving.
–Objects without proper Partition C content will not be archived.
–Usage Type = “Archive” is irrelevant to the DigiTool to FDA process.
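The flagging rules above can be sketched as a small check. This is only an illustration: the `ControlFields` class and `is_flagged_for_fda` function are hypothetical names, not part of DigiTool or D2F, and the check covers only the shape of Partition C. Whether the Account and Project codes are actually valid can only be decided by the FDA.

```python
from dataclasses import dataclass


@dataclass
class ControlFields:
    """Hypothetical stand-in for the control fields of a DigiTool entity."""
    pres_level: str = ""
    partition_c: str = ""
    usage_type: str = ""


def is_flagged_for_fda(fields: ControlFields) -> bool:
    """Return True when both flagging rules from the slides are met."""
    if fields.pres_level != "Preservation Master":
        return False
    # Partition C must hold two non-empty values separated by a comma:
    # the FDA Account code, then the Project code.
    parts = [part.strip() for part in fields.partition_c.split(",")]
    # usage_type is deliberately ignored: it is irrelevant to this process.
    return len(parts) == 2 and all(parts)
```

For example, an entity with Pres. Level = “Preservation Master” and a Partition C of the form `Account,Project` passes, while the same entity with a blank Partition C does not.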

Example – manifestations
–3 manifestations
–View Main (primary manifestation)
–Do NOT flag THUMBNAIL or INDEX for archiving

The Export Process
–FCLA will run the DigiTool “Export Digital Entities” job nightly to extract all flagged DigiTool entities and their filestreams and metadata.
–Only those objects flagged with Pres. Level = “Preservation Master” will be exported.
–Related objects (manifestations, parent/children) not flagged as Preservation Masters will not be exported.
–The objects output by this program are copied to a special workspace where the DigiTool to FDA (D2F) program uses them as input.

The DigiTool to FDA conversion process
–Step 1: Exported objects (metadata and filestreams) are aggregated into packages, one for each Intellectual Entity.
–Step 2: Metadata is extracted from the exported objects and a SIP descriptor file is created for the package.
–Step 3: Filestreams are listed as content files in the SIP descriptor.
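To make steps 2 and 3 concrete, here is a rough sketch of a descriptor skeleton. It assumes a METS-style layout, which the later mention of a dmdSec suggests; the exact structure the FDA requires is defined in the FDA SIP Specification, not here. The function name `build_descriptor`, the IDs, and the choice of a single Dublin Core title are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("mets", METS), ("xlink", XLINK), ("dc", DC)):
    ET.register_namespace(prefix, uri)


def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree notation."""
    return f"{{{namespace}}}{tag}"


def build_descriptor(package_id, dc_title, content_files):
    """Return a METS-style descriptor skeleton as a string (illustrative only)."""
    mets = ET.Element(q(METS, "mets"), {"OBJID": package_id})

    # Step 2: descriptive metadata goes into a dmdSec.
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    ET.SubElement(xml_data, q(DC, "title")).text = dc_title

    # Step 3: every filestream is listed as a content file.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS, "fileGrp"))
    struct = ET.SubElement(mets, q(METS, "structMap"))
    div = ET.SubElement(struct, q(METS, "div"), {"DMDID": "DMD1"})
    for number, name in enumerate(content_files, start=1):
        file_id = f"FILE{number}"
        entry = ET.SubElement(group, q(METS, "file"), {"ID": file_id})
        ET.SubElement(entry, q(METS, "FLocat"),
                      {"LOCTYPE": "URL", q(XLINK, "href"): name})
        ET.SubElement(div, q(METS, "fptr"), {"FILEID": file_id})

    return ET.tostring(mets, encoding="unicode")
```

The sketch leaves out everything the slides do not describe, such as agent information and checksums, so its output should not be expected to pass FDA validation as-is.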

Aggregation into Intellectual Entities
–An Intellectual Entity (e.g. a book) in DigiTool can consist of a number of digital entities linked by “Manifestation”, “Includes” and “Part of” relationship links.
–The “Export Digital Entities” job exports each flagged digital object separately.
–After export, DigiTool to FDA uses the relationship links to aggregate the exported objects into SIPs that include all of the filestreams that constitute the Intellectual Entity.
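The aggregation idea can be illustrated with a short sketch. The slides do not describe how D2F actually walks the links, so this is an assumption: links are treated as undirected pairs of PIDs, links to objects that were not exported are ignored, and every connected group of exported objects becomes one package. The function name `aggregate` is hypothetical.

```python
from collections import defaultdict


def aggregate(exported, links):
    """Group exported PIDs into packages by following relationship links.

    exported: PIDs present in the export workspace.
    links: (pid_a, pid_b) pairs, one per "Manifestation", "Includes"
           or "Part of" relationship link.
    """
    exported = set(exported)
    neighbours = defaultdict(set)
    for a, b in links:
        # A link only helps if both ends were actually exported.
        if a in exported and b in exported:
            neighbours[a].add(b)
            neighbours[b].add(a)

    packages, seen = [], set()
    for pid in sorted(exported):
        if pid in seen:
            continue
        group, stack = [], [pid]
        seen.add(pid)
        while stack:
            current = stack.pop()
            group.append(current)
            for nxt in neighbours[current] - seen:
                seen.add(nxt)
                stack.append(nxt)
        packages.append(sorted(group))
    return packages
```

Under these assumptions, exporting a parent 111 with children 222 and 444 yields the single package `[111, 222, 444]`, while exporting only the two children yields two separate packages, because the link that joins them runs through the parent that was not exported.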

Rules to Remember
–If you wish to archive multiple manifestations, make sure that one of the manifestations is flagged Usage Type = “View Main”.
–If you have a complex object (a parent and child objects), make sure to flag the parent for export.

Example of Aggregation/Flagging in DigiTool: Single Master (ETD)
PID 111 (manifestation)
–Dublin Core descriptive metadata
–Filestream: PDF
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 222 (manifestation)
–Filestream: thumbnail
–Pres. Level = blank
–Partition C = blank

Example of Aggregation – Export: Single Master (ETD)
“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today
–In DigiTool: PID 111, PID 222
–In the Export Workspace: PID 111

Example of Aggregation – D2F: Single Master
–Export Workspace: PID 111
–D2F Workspace: SIP 111, containing the descriptor (descriptive metadata) and the PDF content file

Example of Aggregation/Flagging in DigiTool: Manifestations
PID 111 (manifestation)
–Dublin Core descriptive metadata
–Usage Type = View (primary)
–Filestream: TIFF
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 222 (manifestation)
–Filestream: TIFF
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 333 (manifestation)
–Filestream: thumbnail
–Pres. Level = blank
–Partition C = blank

Example of Aggregation – Export: Manifestations
“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today
–In DigiTool: PID 111, PID 222, PID 333
–In the Export Workspace: PID 111, PID 222

Example of Aggregation – D2F: Manifestations
–Export Workspace: PID 111 (View Primary), PID 222
–D2F Workspace: SIP 111, containing the descriptor (descriptive metadata) and TIFF content file
The D2F program creates one SIP from the two exported objects, based on “Manifestation” links.

Example of Aggregation/Flagging in DigiTool: Complex Object
PID 111 (parent and manifestation)
–Dublin Core descriptive metadata
–No filestream
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 222 (child and manifestation)
–Filestream: TIFF
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 333 (manifestation)
–Filestream: thumbnail
–Pres. Level = blank
–Partition C = blank
PID 444 (child and manifestation)
–Filestream: JP2
–Pres. Level = Pres. Master
–Partition C = Account,Project
PID 555 (child and manifestation)
–Filestream: _*index.html
–Pres. Level = blank
–Partition C = blank

Example of Aggregation – Export: Complex Object
“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today
–In DigiTool: PID 111, PID 222, PID 333, PID 444, PID 555
–In the Export Workspace: PID 111, PID 222, PID 444

Example of Aggregation – D2F: Complex Object
–Export Workspace: PID 111 (parent), PID 222, PID 444
–D2F Workspace: SIP 111, containing the descriptor (descriptive metadata), the TIFF content file and the JP2 content file
The D2F program creates one SIP from the three exported objects, based on “Part of” and “Includes” links.

Creation of metadata in SIP descriptor
–Descriptive metadata is copied from the parent entity or main manifestation into the SIP descriptor (dmdSec).
–A checksum is generated for every file in the SIP and stored in the SIP descriptor.
–Other technical metadata is not copied from DigiTool into the SIP descriptor because the FDA generates its own.
–Administrative metadata (change history) is not copied into the SIP descriptor at this time. It may be added as Phase 2.
–Access restrictions are not copied into the SIP descriptor because the information is local to DigiTool.
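Generating a checksum for every file in a package might look like the sketch below. The slides do not say which checksum algorithm D2F uses, so the SHA-256 default is an assumption, and the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in chunks so large masters need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_for_package(package_dir, algorithm="sha256"):
    """Map each file's path (relative to the package) to its checksum."""
    root = Path(package_dir)
    return {
        path.relative_to(root).as_posix(): file_checksum(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

The resulting mapping is what a descriptor writer would need in order to record one checksum per content file.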

Descriptive metadata in DigiTool
DigiTool supports the following descriptive metadata formats:
–MARC21
–MODS
–Dublin Core
The FDA currently loads title information into its database only from MODS and Dublin Core metadata, although all MARC21 metadata is archived in the descriptor file. (MARC21 title information will be included in DAITSS 2.)

Step 4: Archiving converted SIPs
–SIPs created by D2F are sent to the FDA Ingest queue and processed by the standard FDA programs like all other SIPs.
–A successful ingest of a D2F-created SIP will result in an Ingest report being sent to the usual Affiliate reports address.
–Any D2F-created SIPs rejected by the FDA will result in Error reports being sent to the usual Affiliate reports address.

Why would the FDA reject D2F SIPs?
Even though D2F creates SIPs according to FDA specifications, the SIPs can be rejected for the following reasons:
–The FDA Account and Project codes in Partition C are invalid or are not comma-separated.
–The SIP contains no content files (DigiTool filestreams).

Problems that won’t be reported
The FDA ingest program does not recognize the following conditions as errors:
–If you flag a parent for export to the FDA but do not flag all of the appropriate children, critical portions of the Intellectual Entity won’t be archived.
–If you flag children but do not flag the parent, each child will create a separate SIP.
–If you don’t flag all manifestations appropriate for archiving, critical portions of the Intellectual Entity won’t be archived.
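Because these conditions are silent, a local review before flagging can help. The sketch below is a hypothetical helper, not a feature of DigiTool, D2F or the FDA. It cannot know which children are appropriate for archiving (thumbnails and index files are rightly left unflagged), so it only lists cases for a person to review.

```python
def review_flagging(parent, children, flagged):
    """List flagging patterns that the FDA ingest would not report as errors.

    parent: PID of the parent entity.
    children: PIDs of its child entities.
    flagged: PIDs flagged with Pres. Level = "Preservation Master".
    """
    flagged = set(flagged)
    flagged_children = [pid for pid in children if pid in flagged]
    unflagged_children = [pid for pid in children if pid not in flagged]

    notes = []
    if parent in flagged and unflagged_children:
        notes.append(
            f"Parent {parent} is flagged; confirm these unflagged children "
            f"should be left out: {unflagged_children}"
        )
    if parent not in flagged and flagged_children:
        notes.append(
            f"Parent {parent} is not flagged; each of these children would "
            f"become its own SIP: {flagged_children}"
        )
    return notes
```

An empty result means neither pattern was found, not that the flagging is necessarily complete.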

What to do after D2F SIPs are archived
FCLA recommends that you record the FDA IEID (Intellectual Entity ID), taken from the Ingest Report, in the Note Control Field of the DigiTool entity.

Next Steps?
–Beta testing, DigiTool workflow by DigiTool workflow
–Volunteers needed for beta testing
End