Centro Ricerche e Innovazione Tecnologica TAPE workshop on the curation and preservation of audiovisual collections University of Glasgow, Scotland, UK.

Presentation transcript:

Centro Ricerche e Innovazione Tecnologica TAPE workshop on the curation and preservation of audiovisual collections University of Glasgow, Scotland, UK Monday 12th – Friday 16th May 2008 Giorgio Dimino RAI Research Centre Storage and repositories

Reference Model for an Open Archival Information System (OAIS)
Consultative Committee for Space Data Systems (CCSDS)
This document is a technical Recommendation for use in developing a broader consensus on what is required for an archive to provide permanent, or indefinite long-term, preservation of digital information. This Recommendation establishes a common framework of terms and concepts which comprise an Open Archival Information System (OAIS). It allows existing and future archives to be more meaningfully compared and contrasted. It provides a basis for further standardization within an archival context and it should promote greater vendor awareness of, and support of, archival requirements.

OAIS environment model
The OAIS archive and its environment:
Producer: provides content to the archive
Consumer: uses the archive content
Management: decides the archive's strategic objectives

Data vs. Information (OAIS definition)
Data object: what we store
Representation information: knowledge about data interpretation
Information object: what we want
A data object, interpreted using its representation information, yields an information object.

Video data formats
Uncompressed raster formats (YUV and RGB)
  • Standard definition 4:2:2 video at 270 Mb/s requires about 120 GB per hour
Lossless compression (e.g. JPEG2000)
  • Variable efficiency, on average about half the uncompressed size
Compressed formats (e.g. MPEG2, MPEG4, VC1, DV)
  • Compression depends on the final quality expected; typical bit rates range from 3 Mb/s to 50 Mb/s, up to a 100-fold reduction
The “Representation Information” needed to interpret compressed formats is generally extremely complex. Rendering is done using specific software or hardware. The written specification must be seen only as a last-resort disaster recovery option.
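
The figure of about 120 GB per hour follows directly from the bit rate. A minimal Python sketch of the arithmetic, assuming decimal units (1 Mb = 10^6 bits, 1 GB = 10^9 bytes) and a constant bit rate; the function name is illustrative only:

```python
def gigabytes_per_hour(bit_rate_mbps: float) -> float:
    """Storage needed for one hour of video at a constant bit rate.

    Assumes decimal units: 1 Mb = 10**6 bits, 1 GB = 10**9 bytes.
    """
    bits_per_hour = bit_rate_mbps * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


# Bit rates quoted on the slide: uncompressed SD, then the compressed range.
for label, rate in [("Uncompressed SD 4:2:2", 270),
                    ("Compressed, upper end", 50),
                    ("Compressed, lower end", 3)]:
    print(f"{label}: {gigabytes_per_hour(rate):.2f} GB per hour")
```

At 270 Mb/s this gives 121.5 GB per hour, which the slide rounds to 120 GB. Going from 270 Mb/s to 3 Mb/s is a 90-fold reduction, in line with the "up to 100 times" quoted above.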

Video quality, some considerations
Digital master
  • The result of digitising analogue tapes. It becomes the new master, replacing the corresponding analogue tape, and should be stored at maximum quality
Publication master
  • If keeping all the digital masters online is too expensive, a surrogate master can in some cases be generated at lower quality; all subsequent publication copies are then derived from it by transcoding
Publication version
  • The version delivered to the user of a particular service (an archive can offer several services based on the same content)
Viewing version
  • A reduced-quality version used for content selection

OAIS Information Package
Content Information
  • Data object
  • Representation information
Preservation Description Information
  • Provenance
  • Context
  • Reference
  • Fixity
Packaging Information
Descriptive Information
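
The structure of the information package can be sketched as a set of nested records. The following Python sketch is only an illustration: the class names mirror the OAIS terms on the slide, but the field types, the example values and the use of SHA-256 for fixity are assumptions, not part of the OAIS model or of the presentation.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ContentInformation:
    data_object: bytes                # what we store
    representation_information: str   # knowledge needed to interpret the data


@dataclass
class PreservationDescriptionInformation:
    provenance: str   # history of the content
    context: str      # relationship to other content
    reference: str    # identifier of the content
    fixity: str       # assumed here to be a SHA-256 checksum of the data object


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging_information: str     # how the parts are bound together, e.g. a wrapper
    descriptive_information: str   # metadata used to find the package


# Hypothetical example values, for illustration only.
essence = b"\x00\x01\x02\x03"
package = InformationPackage(
    content=ContentInformation(essence, "Uncompressed 4:2:2 raster, 8 bit"),
    pdi=PreservationDescriptionInformation(
        provenance="Digitised from analogue tape",
        context="Part of an example series",
        reference="example-0001",
        fixity=hashlib.sha256(essence).hexdigest(),
    ),
    packaging_information="MXF wrapper",
    descriptive_information="Title, date, keywords",
)
print(package.pdi.reference, package.pdi.fixity[:16])
```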

Video packaging (wrappers)
SMPTE MXF
MPEG2 TS
Microsoft ASF, AVI
Apple QuickTime
Adobe Flash FLV, SWF
For reference see

OAIS collaboration diagram

OAIS functional entities

Storage technologies
Data tapes
  • IBM LTO Ultrium GB
  • Quantum DLT-S4 800 GB
  • Sony SAIT 800 GB
  • StorageTek T GB
Hard disk
  • Up to 1 TB per 3.5” disk
  • Several RAID configurations possible
Solid state disks
  • Still expensive but becoming interesting
  • Capacity still lower than hard disks: 128 GB (announced products), 2.5”
Optical disks
  • DVD RW 9 GB
  • Blu-ray 50 GB

Some remarks
The choice of storage technologies depends on many factors, including:
  • Total amount of data
  • Expected increase rate
  • Desired throughput
  • Access performance
  • Data security
No storage medium lasts forever
No technology can be considered 100% reliable
Never keep single copies!
Obsolescence occurs very rapidly
Data migration must be considered part of the management process, not an emergency operation

Digital vs. analogue archive
Chart: bookshelf meters required for 1000 hours of audio data (labels: 800 GB today, 1 TB today)

Flat storage
Diagram elements: user, front end, selection, content, database, file server, NAS

Storage hierarchy
Tiers shown: fast, on line, near-line
Media shown: RAM, solid state disk, hard disk/RAID, tape (robot)

Hierarchical Storage Management (HSM)
Diagram elements: user, front end, selection, content, database, file server, HD cache, tape robotic storage
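
The HD cache in front of the tape robotic storage can be read as a read-through cache: a file is served from disk when present, otherwise it is recalled from tape and kept on disk for later requests. A minimal Python sketch under stated assumptions: the class `HsmStore` and its methods are hypothetical, a dictionary stands in for the tape robot, and least-recently-used eviction is an assumed policy that the slide does not specify.

```python
from collections import OrderedDict


class HsmStore:
    """Toy hierarchical store: a small disk cache in front of a tape library."""

    def __init__(self, cache_capacity: int):
        self.cache_capacity = cache_capacity  # max number of files kept on disk
        self.disk_cache = OrderedDict()       # file name -> data, oldest first
        self.tape = {}                        # stands in for the tape robot
        self.tape_recalls = 0                 # how often tape had to be read

    def archive(self, name: str, data: bytes) -> None:
        """Write a file to the (slow, large) tape tier."""
        self.tape[name] = data

    def read(self, name: str) -> bytes:
        """Serve from the disk cache if possible, else recall from tape."""
        if name in self.disk_cache:
            self.disk_cache.move_to_end(name)  # mark as recently used
            return self.disk_cache[name]
        data = self.tape[name]                 # KeyError if never archived
        self.tape_recalls += 1
        self.disk_cache[name] = data
        if len(self.disk_cache) > self.cache_capacity:
            self.disk_cache.popitem(last=False)  # evict least recently used
        return data


store = HsmStore(cache_capacity=2)
for name in ("a", "b", "c"):
    store.archive(name, name.encode())
for name in ("a", "b", "a", "c", "a"):
    store.read(name)
print("tape recalls:", store.tape_recalls)
```

In this run only the first read of each file goes to tape; the repeated reads of "a" are served from the disk cache.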

Federated storage (GRID)
Based on GRID concepts of distributed computing and a file system over a WAN
Multiple self-contained, interconnected storage nodes
Each storage node contains its own storage medium, microprocessor, indexing capability and management layer, and is generally based on a commodity PC
Advantages
  • Fault tolerance
  • Scalability
  • Throughput
Examples: Google File System, Apache Hadoop

Basic functionalities
Virtualization
  • The user sees a single file system
Data replication
  • The system automatically manages the desired redundancy
Direct access to data
  • Data move from storage node to client without intermediation
Dynamic reconfiguration
  • Nodes can be switched on and off while the system is in operation
Automatic load balancing
  • Exploiting data replication and direct node access

Data blocking and replication
A data file is divided into fixed-length blocks
Each block is replicated n times on different nodes
Diagram: a file divided into Blocks 1 to 4, with the blocks spread over Nodes 1 to 5
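
The two rules on this slide (fixed-length blocks, n replicas on different nodes) can be sketched in a few lines of Python. This is a simplified illustration, not how Google File System or Hadoop actually place data: the function names are hypothetical, and the round-robin placement is an assumption chosen only because it guarantees that the replicas of a block land on distinct nodes.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file into fixed-length blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replicas: int) -> dict[int, list[str]]:
    """Assign each block to `replicas` different nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("cannot place more replicas than there are nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replicas)]
        for block in range(num_blocks)
    }


data = b"abcdefghijklmn"                    # 14 bytes of example data
blocks = split_into_blocks(data, block_size=4)
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_replicas(len(blocks), nodes, replicas=3)

for block, holders in placement.items():
    print(f"block {block + 1} ({len(blocks[block])} bytes): {holders}")
```

With three replicas per block, any two nodes can fail and every block is still available on at least one node.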

Architecture
Diagram elements: user, Name Node, DataNodes, filename, nodes list, data chunks, Cluster 1, Cluster 2

Digital Asset Management (1)
A software system that implements all the archive management policies
Provides the archive administrator with the tools needed to
  • Monitor the preservation state of the media
  • Restore backup copies when primary media are damaged
  • Monitor the use of the storage
  • Monitor software/hardware failures
  • Define ingestion and access policies
Should provide support for technology/system migration

Digital Asset Management (2)
Provides the functionalities needed to implement the ingestion workflow
  • Receive the SIP (or a batch of SIPs)
  • Analyse the SIP and verify that all the vital metadata are valid
  • Assign UMIDs
  • Transcode SIP into AIP
  • Generate proxies (low-resolution video, key frames)
  • Provide content documentation
Provides the functionalities needed to implement the access workflow
  • Verify that the user has access rights
  • Provide content selection functionalities (search, retrieval and browsing)
  • Verify the rights associated with the content
  • Transcode AIP into DIP (this can depend on the user request)
  • Deliver the DIP
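
The ingestion steps above can be sketched as a small pipeline. Everything in this Python sketch is a stand-in chosen for illustration: the packages are plain dictionaries, the list of vital metadata fields is hypothetical, a random UUID takes the place of a real SMPTE UMID, and transcoding and proxy generation are left as placeholders.

```python
import uuid

# Hypothetical list of "vital" metadata fields; a real archive defines its own.
REQUIRED_METADATA = ("title", "producer", "format")


def validate_sip(sip: dict) -> None:
    """Reject a SIP whose vital metadata are missing or empty."""
    metadata = sip.get("metadata", {})
    missing = [key for key in REQUIRED_METADATA if not metadata.get(key)]
    if missing:
        raise ValueError(f"SIP rejected, missing metadata: {missing}")


def ingest(sip: dict) -> dict:
    """Turn a submitted package (SIP) into an archived package (AIP)."""
    validate_sip(sip)
    return {
        "id": str(uuid.uuid4()),            # stand-in for a real UMID
        "metadata": dict(sip["metadata"]),
        "essence": sip["essence"],          # a real system would transcode here
        "proxies": [],                      # low-resolution video, key frames
    }


example_sip = {
    "metadata": {"title": "Example programme", "producer": "Example unit", "format": "DV"},
    "essence": b"...",
}
aip = ingest(example_sip)
print(aip["id"], aip["metadata"]["title"])
```

The access workflow would mirror this shape: check the user's rights, select the content, then derive a DIP from the stored AIP.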

OAIS Functions of Archival Storage

Business rights management
A BRM is a system that manages the usage rights associated with content
Without an automated BRM system, the reuse of content can be slowed down by manual rights clearing operations
Depending on the type of archive, it can be convenient to have the BRM closely coupled with the DAM

Digital archive design (1)
Analyse and state clearly your business requirements
  • What is your archive's primary goal?
  • Who are your users?
    - Producers
    - Consumers
  • … and what are their needs?
Assess your content
  • Number of items
  • Conservation status
  • Increase rate
  • Usage

Digital archive design (2)
Select archive video formats and quality
  • Target archived quality depends on foreseen usage and preservation issues
  • Define the AIP (Archival Information Package)
    - Video coding
    - File formats
    - Associated metadata
Estimate storage requirements
  • Amount of data
  • Level of data security
  • Increase rate
  • Input/output performance
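
A first storage estimate combines the amount of content, the chosen bit rate, the number of copies kept for security and the expected increase rate. A minimal Python sketch, assuming a constant bit rate, linear growth in hours per year and decimal units; the function name and all the figures in the example are hypothetical.

```python
def storage_estimate_tb(hours: float, bit_rate_mbps: float, copies: int,
                        yearly_increase_hours: float, years: int) -> float:
    """Terabytes needed after `years`, for all copies.

    Assumes decimal units: 1 Mb = 10**6 bits, 1 GB = 10**9 bytes, 1 TB = 1000 GB.
    """
    total_hours = hours + yearly_increase_hours * years
    gigabytes = total_hours * bit_rate_mbps * 1e6 * 3600 / 8 / 1e9
    return gigabytes * copies / 1000


# Hypothetical archive: 10,000 hours at 50 Mb/s, two copies,
# growing by 500 hours per year over a five-year planning horizon.
estimate = storage_estimate_tb(10_000, 50, copies=2, yearly_increase_hours=500, years=5)
print(f"{estimate:.1f} TB")
```

For these example figures the estimate comes to 562.5 TB; the same archive kept uncompressed at 270 Mb/s would need 5.4 times as much.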

Digital archive design (3)
Define ingestion workflow and SIP
  • Ingestion procedures are particularly critical if your content needs digitization and restoration
Define access workflow and DIP
  • Access is heavily dependent on proper documentation and retrieval tools
  • Properly dimension throughput
    - Affected by video bitrate and transcoding from AIP to DIP
Define archive maintenance procedures
  • Consistency check
  • Media replacement
  • Disaster recovery
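
One common way to implement the consistency check is to record a checksum for every file at ingestion and recompute it periodically. The Python sketch below assumes SHA-256 checksums and a manifest held as a dictionary of relative path to expected digest; both choices, like the function names and file names, are illustrative and not taken from the presentation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the files that are missing or whose checksum has changed."""
    damaged = []
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(relative_path)
    return damaged


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.mxf").write_bytes(b"essence A")
    (root / "b.mxf").write_bytes(b"essence B")
    manifest = {name: sha256_of(root / name) for name in ("a.mxf", "b.mxf")}
    (root / "b.mxf").write_bytes(b"essence B, corrupted")  # simulate damage
    damaged = find_damaged(manifest, root)

print(damaged)
```

Files reported as damaged are the ones to restore from a second copy, which is why single copies should never be kept.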

Digital archive design (4)
Consider migration
  • Storage technology
    - Media capacity follows Moore’s law
    - … but sometimes there is a technology leap (e.g. from tape library to hard disk arrays)
  • Coding formats
    - Compression schemes become more efficient, allowing greater bit savings at a given quality
    - Older formats become obsolete
    - Transcoding generally implies possible loss of quality
  • Software/hardware
    - Proprietary formats often pose upgrade constraints

Digital archive design (5)
Consider the need to interface with other systems
  • Federated libraries
  • Account systems
  • Production
  • Digital rights management
… and finally design or commission a system