
1 May 23 2007 Archiving 2007
PAWN: A Policy-Driven Software Environment for Implementing Producer-Archive Interactions in Support of Long Term Digital Preservation
Mike Smorul, Mike McGann, Joseph JaJa
Institute for Advanced Computer Science Studies
University of Maryland, College Park
Sponsored by National Archives and Records Administration, Library of Congress and NSF

2 Problems Facing Ingestion
Ensure integrity of data ingestion
Each producer-archive interaction is unique
Final destination for items in an archive is unique
Differing roles between producer and archive
Hostile producers

3 What is PAWN?
Software that provides an ingestion framework
Distributed and secure ingestion of digital objects into an archive
Handles the process
  – From package assembly
  – To archival storage
Simple, customizable interface for end-users
Flexible interface for archive publication

4 Package Workflow
1. Create Producer-Archive Agreement
2. Client package template.
3. Create package based on template
4. Once approved, packages can be archived
5. Rejected packages can be held until rectified or deleted for resubmission.
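The approve/reject part of this workflow can be read as a small state machine. The sketch below is only an illustration under stated assumptions: the state names, the `SUBMITTED` review state, and the `IngestPackage` class are hypothetical and are not taken from PAWN itself.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical package states; the slides do not name PAWN's actual states.
enum PackageState {
    DRAFT, SUBMITTED, APPROVED, REJECTED, ARCHIVED, DELETED;

    // States reachable from this one, following workflow steps 3 to 5.
    Set<PackageState> next() {
        switch (this) {
            case DRAFT:     return EnumSet.of(SUBMITTED);
            case SUBMITTED: return EnumSet.of(APPROVED, REJECTED);
            case APPROVED:  return EnumSet.of(ARCHIVED);
            // A rejected package is either rectified and resubmitted, or deleted.
            case REJECTED:  return EnumSet.of(SUBMITTED, DELETED);
            default:        return EnumSet.noneOf(PackageState.class);
        }
    }
}

// Illustrative package that only permits the transitions listed above.
final class IngestPackage {
    private PackageState state = PackageState.DRAFT;

    PackageState state() {
        return state;
    }

    void moveTo(PackageState target) {
        if (!state.next().contains(target)) {
            throw new IllegalStateException(state + " -> " + target + " is not allowed");
        }
        state = target;
    }
}
```

With this model, a package cannot reach `ARCHIVED` without first passing through `APPROVED`, which mirrors step 4.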

5 Expanding a Simple Workflow
Support for multiple workflows
  – Grouped into logical domains
Definable roles per workflow
Pluggable components for assembly and archival publishing
Distributed components
  – Web-service based components

6 Domain Organization
Producers are organized into domains; each domain contains a transfer agreement negotiated with the archive.
Each domain contains a hierarchical organization of data grouped into record sets/templates (convenient groupings from the transfer agreement).
Each domain contains its own users.
An end-user operates within a set of record sets.
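One way to picture this organization is as a small object model. The following is a minimal sketch, not PAWN's data model: the class names (`Domain`, `RecordSet`, `DomainUser`) are hypothetical, and the rule that access to a record set extends to the record sets nested below it is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A record set groups data from the transfer agreement and may nest children.
final class RecordSet {
    final String name;
    final List<RecordSet> children = new ArrayList<>();

    RecordSet(String name) {
        this.name = name;
    }

    RecordSet addChild(String childName) {
        RecordSet child = new RecordSet(childName);
        children.add(child);
        return child;
    }
}

// An end-user operates within the record sets assigned to them.
final class DomainUser {
    final String login;
    final Set<RecordSet> assigned = new HashSet<>();

    DomainUser(String login) {
        this.login = login;
    }

    // Assumption: access to a record set covers everything nested under it.
    boolean canAccess(RecordSet target) {
        for (RecordSet rs : assigned) {
            if (contains(rs, target)) {
                return true;
            }
        }
        return false;
    }

    private static boolean contains(RecordSet root, RecordSet target) {
        if (root == target) {
            return true;
        }
        for (RecordSet child : root.children) {
            if (contains(child, target)) {
                return true;
            }
        }
        return false;
    }
}

// A domain holds one transfer agreement, a record-set hierarchy, and its own users.
final class Domain {
    final String name;
    final String transferAgreement;
    final RecordSet root = new RecordSet("root");
    final List<DomainUser> users = new ArrayList<>();

    Domain(String name, String transferAgreement) {
        this.name = name;
        this.transferAgreement = transferAgreement;
    }

    DomainUser addUser(String login) {
        DomainUser user = new DomainUser(login);
        users.add(user);
        return user;
    }
}
```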

7 Domain Example

8 Custom Roles
Actions in PAWN can be grouped together to create roles.
  – There are no common roles between archives, so allow custom ones.
Default roles
  – Producer – Individual data supplier
  – Records Manager – Oversight of producers
  – Archive Manager – Final review and archive publishing
  – Global Administrator – Creates domain, sysadmin-like account
Sample Actions
  – Setting permissions on record sets
  – Record Schedule creation and modification
  – Add or delete whole packages
  – Modify items in a package
…
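Grouping actions into roles can be sketched as a named set of permitted actions. This is an illustration only: the `PawnAction` constants paraphrase the sample actions above, the `Role` class is hypothetical, and the slides do not say which actions each default role actually carries.

```java
import java.util.EnumSet;

// Hypothetical action names paraphrasing the sample actions on the slide.
enum PawnAction {
    SET_RECORD_SET_PERMISSIONS,
    EDIT_RECORD_SCHEDULE,
    ADD_PACKAGE,
    DELETE_PACKAGE,
    MODIFY_PACKAGE_ITEMS
}

// A role is a name plus the set of actions it permits.
final class Role {
    private final String name;
    private final EnumSet<PawnAction> actions;

    Role(String name, EnumSet<PawnAction> actions) {
        this.name = name;
        this.actions = EnumSet.copyOf(actions); // defensive copy
    }

    String name() {
        return name;
    }

    boolean allows(PawnAction action) {
        return actions.contains(action);
    }
}
```

An archive could then define a custom role such as `new Role("Reviewer", EnumSet.of(PawnAction.MODIFY_PACKAGE_ITEMS, PawnAction.DELETE_PACKAGE))` and check `allows(...)` before performing an action; the "Reviewer" role is an invented example.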

9 Custom Package Building
PAWN provides an API for developing custom package builders
Custom package builders can be written in Java and implement a simple interface.
Builders interact with a hierarchical structured package
Manifest
  → Namespace
  → Type
  → Descriptive Name
Data
  → Type
  → Descriptive Name
  → Bits
Metadata …
Manifest …
Metadata
  → Type
  → Bits
  → Name
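To make the builder idea concrete, here is a minimal sketch of what such an interface and package structure could look like. Everything in it is an assumption: the slides do not show PAWN's real interface, so `PackageBuilder`, `FlatFileBuilder`, the field names, the `urn:example:flat` namespace, and the nesting of data, metadata, and child manifests inside a manifest are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Metadata entry: type, name, and raw bits, as listed on the slide.
final class Metadata {
    final String type;
    final String name;
    final byte[] bits;

    Metadata(String type, String name, byte[] bits) {
        this.type = type;
        this.name = name;
        this.bits = bits;
    }
}

// Data item: type, descriptive name, bits, and attached metadata.
final class Data {
    final String type;
    final String name;
    final byte[] bits;
    final List<Metadata> metadata = new ArrayList<>();

    Data(String type, String name, byte[] bits) {
        this.type = type;
        this.name = name;
        this.bits = bits;
    }
}

// Manifest: namespace, type, descriptive name, plus nested content (assumed nesting).
final class Manifest {
    final String namespace;
    final String type;
    final String name;
    final List<Data> data = new ArrayList<>();
    final List<Manifest> manifests = new ArrayList<>();
    final List<Metadata> metadata = new ArrayList<>();

    Manifest(String namespace, String type, String name) {
        this.namespace = namespace;
        this.type = type;
        this.name = name;
    }
}

// Hypothetical stand-in for the builder interface; not PAWN's actual API.
interface PackageBuilder {
    Manifest build(String packageName, Map<String, byte[]> files);
}

// Example builder: one Data item per file, each tagged with its byte count.
final class FlatFileBuilder implements PackageBuilder {
    @Override
    public Manifest build(String packageName, Map<String, byte[]> files) {
        Manifest manifest = new Manifest("urn:example:flat", "flat-files", packageName);
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            Data item = new Data("file", entry.getKey(), entry.getValue());
            String size = Integer.toString(entry.getValue().length);
            item.metadata.add(
                new Metadata("size", "byte-count", size.getBytes(StandardCharsets.UTF_8)));
            manifest.data.add(item);
        }
        return manifest;
    }
}
```

A different builder, for example one that groups scanned pages into a logical book, would implement the same interface and populate nested manifests instead of a flat list.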

10 PAWN Archive Gateway
Pluggable component that provides an API for developing gateways into various services.
Each gateway may have multiple instances, each configured differently.
PAWN handles managing and associating gateways with the appropriate data.
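The gateway idea can be sketched as an interface with separately configured instances and a registry that routes data to the right one. This is a hedged illustration, not PAWN's gateway API: `ArchiveGateway`, `InMemoryGateway`, `GatewayRegistry`, and the choice to associate gateways by record-set name are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical gateway contract: store a package and return its identifier.
interface ArchiveGateway {
    String publish(String packageName, byte[] content);
}

// Toy gateway that keeps content in memory; the prefix is its per-instance configuration.
final class InMemoryGateway implements ArchiveGateway {
    private final String prefix;
    private final Map<String, byte[]> store = new HashMap<>();

    InMemoryGateway(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public String publish(String packageName, byte[] content) {
        String id = prefix + "/" + packageName;
        store.put(id, content.clone());
        return id;
    }

    int size() {
        return store.size();
    }
}

// Associates each record set (by name, an assumption) with one gateway instance.
final class GatewayRegistry {
    private final Map<String, ArchiveGateway> byRecordSet = new HashMap<>();

    void associate(String recordSet, ArchiveGateway gateway) {
        byRecordSet.put(recordSet, gateway);
    }

    String publish(String recordSet, String packageName, byte[] content) {
        ArchiveGateway gateway = byRecordSet.get(recordSet);
        if (gateway == null) {
            throw new IllegalArgumentException("no gateway for " + recordSet);
        }
        return gateway.publish(packageName, content);
    }
}
```

Two instances of the same gateway class, configured with different prefixes, stand in here for "multiple instances, each configured differently".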

11 PAWN Architecture
Divided into producer and archive side components
  – Producer: data supplying and domain management
  – Archive: data storage, resource allocation and archival publishing
Web-service based communication
Trust relationship between producer and archive components
  – SAML and PKI

12 Components

13 Case Studies
ICDL Book Builder
SLAC Record Ingestion
10,000 CD-ROMs
Remote ingestion
Unskilled labor
Custom hardware
Sample NARA ingestion
Model government roles
DOE Record Schedule
Custom package builder
Multiple data sources
Model logical books

14 PAWN Summary
Platform for ingestion
Customizable Components
  – Roles, ingest and publishing
Distributed architecture

15 More information
Web site:
  – http://www.umiacs.umd.edu/research/adapt
Wiki link for technical details.
Or “I’m feeling lucky” Google keywords:
  – ADAPT UMIACS

