NGDA Architecture Update
Greg Janée
Greg Janée May 16,

Three motivations
Archival has to be cheap & easy
  – little incentive
  – no funding
Need to archive data semantics
  – key differentiator from text, audio, video
Focus on long-term preservation
  – need to migrate whole systems

Typical repository architecture
[Diagram labels: system, database, storage, handle resolver; annotation: "fragile"]

NGDA architecture
[Diagram labels: storage subsystem; standard, public data model; archival system; ADL; OAI; bulk loader; databases, caches, etc.; Web access; ingest]

Post-NGDA architecture
[Diagram labels: storage subsystem; standard, public data model; Web]

Storage system requirements
Requirements:
  – associate UUIDs/RIDs with bitstreams
  – retrieve global/local bitstream by UUID/RID
  – determine (parent) UUID of any bitstream
  – list all UUIDs
Satisfied by:
  – any filesystem
  – tag URIs for UUIDs
    tag:library.ucsb.edu,2005:identifier
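
The four storage requirements can be sketched on top of an ordinary filesystem. This is a minimal illustration, not the NGDA implementation: it assumes each UUID maps to one directory and each RID to one file inside it, and the names `FilesystemStore`, `tag_uri`, `put`, `get`, `parent_uuid` and `list_uuids` are hypothetical. Only the tag URI prefix comes from the slide.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote, unquote


def tag_uri(identifier):
    """Build a tag URI using the prefix shown on the slide."""
    return f"tag:library.ucsb.edu,2005:{identifier}"


class FilesystemStore:
    """Hypothetical store: one directory per UUID, one file per RID (an assumption)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def path(self, uuid, rid):
        # Percent-encode both parts so ':' and ',' in tag URIs are safe in file names.
        return self.root / quote(uuid, safe="") / quote(rid, safe="")

    def put(self, uuid, rid, data):
        """Associate a UUID/RID pair with a bitstream."""
        p = self.path(uuid, rid)
        p.parent.mkdir(exist_ok=True)
        p.write_bytes(data)

    def get(self, uuid, rid):
        """Retrieve a bitstream by UUID/RID."""
        return self.path(uuid, rid).read_bytes()

    def parent_uuid(self, bitstream_path):
        """Determine the parent UUID of a stored bitstream from its location."""
        return unquote(Path(bitstream_path).parent.name)

    def list_uuids(self):
        """List all UUIDs known to the store."""
        return sorted(unquote(p.name) for p in self.root.iterdir() if p.is_dir())


if __name__ == "__main__":
    store = FilesystemStore(tempfile.mkdtemp())
    uuid = tag_uri("example-object")  # hypothetical identifier
    store.put(uuid, "x.tiff", b"example bytes")
    print(store.list_uuids())
    print(store.parent_uuid(store.path(uuid, "x.tiff")))
```

Because the store is just directories and files, it can be copied or migrated with standard tools, which matches the "any filesystem" point above.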

Archival objects
[Diagram labels: directory, UUID, component, RID]

Archival objects
Directory info per component
  – named relationship/position
  – format & semantics by UUID references to definitions
  – fixity: checksum
  – provenance: isDerivative
  – policy: mutability
  – rights
Components may be provided by archive itself
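
The per-component directory information can be pictured as a small record with a fixity check. The sketch below is illustrative only: the field names, the `ComponentEntry` type and the choice of SHA-256 are assumptions, since the slide says only "checksum" and does not give a schema.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComponentEntry:
    """Hypothetical directory entry for one component of an archival object."""
    rid: str                        # identifier local to the object
    relationship: str               # named relationship/position, e.g. "data"
    format_uuid: str                # UUID reference to a format definition
    semantics_uuid: Optional[str]   # UUID reference to a semantic definition
    checksum: str                   # fixity value (SHA-256 hex digest, an assumption)
    is_derivative: bool = False     # provenance
    mutable: bool = False           # policy
    rights: Optional[str] = None    # rights statement, if any


def sha256_hex(data: bytes) -> str:
    """Compute the hex digest used as the fixity value in this sketch."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(entry: ComponentEntry, data: bytes) -> bool:
    """Return True when the bitstream still matches its recorded checksum."""
    return sha256_hex(data) == entry.checksum


if __name__ == "__main__":
    data = b"example bitstream"
    entry = ComponentEntry(
        rid="x.tiff",
        relationship="data",
        format_uuid="tag:library.ucsb.edu,2005:format-example",  # hypothetical
        semantics_uuid=None,
        checksum=sha256_hex(data),
    )
    print(verify_fixity(entry, data))
```

Referring to format and semantics by UUID rather than by name keeps each entry pointing at a definition that can itself be archived.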

Example: USGS DOQQ
[Diagram labels: Object x; components x.tiff, x.fgdc, x.gif; roles: metadata, data, derived; formats: GeoTIFF, FGDC, TIFF; relation: subtypeOf]

Archives
Archive = set of archival objects
  – no structure
  – no free-floating bitstreams
In anticipation of federation:
  – associations may cross archive boundaries
  – archival objects may not
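
One possible reading of these rules is as two checks over a flat set of objects. The sketch below is an assumption-laden illustration: `ArchivalObject`, `free_floating` and `external_associations` are hypothetical names, and bitstreams and objects are represented by plain string identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivalObject:
    """Hypothetical minimal object: a UUID, its bitstreams, and its associations."""
    uuid: str
    components: set = field(default_factory=set)    # bitstream ids owned by this object
    associations: set = field(default_factory=set)  # UUIDs of related objects


def free_floating(bitstreams, objects):
    """Return stored bitstreams that no archival object claims (should be empty)."""
    owned = set()
    for obj in objects:
        owned |= obj.components
    return set(bitstreams) - owned


def external_associations(objects):
    """Return association targets that live outside this archive (allowed)."""
    local = {obj.uuid for obj in objects}
    targets = set()
    for obj in objects:
        targets |= obj.associations
    return targets - local


if __name__ == "__main__":
    a = ArchivalObject("u1", {"b1", "b2"}, {"u2", "u9"})
    b = ArchivalObject("u2", {"b3"})
    print(free_floating({"b1", "b2", "b3", "b4"}, [a, b]))
    print(external_associations([a, b]))
```

In this reading, a non-empty result from `free_floating` signals a violation, while a non-empty result from `external_associations` is expected once archives are federated.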

Object types
Content
Format definition
Semantic definition
Provider
Organizational structures
  – collection
  – series
  – ingest session

Archive-provider agreement
Defines
  – common structure of objects to be ingested
  – necessary validations
  – associations to other objects
  – policies, rights, etc.
Represents choke point
  – requires human evaluation

Deferred functionality
Incremental ingest
Object revisions
Rights
3rd-party access
Federation

Status
Starting development now
Approach: iterative refinement