An Overview of Open Digital Archive Architecture
Jan-Ming Ho, PhD
Research Fellow and Deputy Director, Inst. of Info. Sci., Academia Sinica



The Problem

Digital Archive Model
[Diagram. Labels: Collection Management; Proofreading; Preservation; Front-end Production; Dissemination; Digitization; Presentation; Workflow; AAA; User Services and Management; Value-Added Services; Knowledge Discovery; Other archive systems; Catalog Service; Multimedia raw data and metadata]

Requirements for NDAE
- Digital Archive Working Environment
  - Collection, digitization workflow, and storage
  - Metadata, indexing, and digital object management
- Discovery and Dissemination
  - Content distribution
  - Retrieval and presentation
  - Models the requirements of content holders and users
- Scalability and Interoperability
- Multimedia Processing and Presentation
  - Retrieval, watermark, summarization, virtual reality, etc.
- Multilingual Requirements
  - Unicode and Han Variants
  - Missing Han Characters
  - Thesaurus
- AAA: Authentication, Authorization, and Accounting
- Union Catalog and Value-added Services

Sample Content Projects in NDAP
- Rubbings of Bronze, Stones, and Bamboo Slips
- Holomorphic rubbings
- Archaeological Excavations
- Seal Database of Rare Books
- Archives of Specimens of Insects, Fish, and Shell, etc.
- Old Chinese Paintings
- Engravings on Bronze Wares
- Engravings on Bronze Wares made in Chin Dynasty ( A.D.)

Management of Holomorphic Rubbings

Directory of Species

Specimen Information System

Metadata Design
- Domain-specific and internationalization
- Standardizing metadata to facilitate preservation and dissemination of digital objects, and their applications

A Service Infrastructure
[Diagram. Labels: Dark Archive; Content Creation and Management (two instances); Union Catalog; Centralized Hosting; Domain Catalog; Access; Value Added (three instances); Education Service]

An Educators' Platform
[Diagram. Labels: Education Resource Exchange Platform; Front-end; Back-end; Educational Resources; Online Journals; Education Material; Textbook, Reading; Government Institutes, Non-governmental Consulting Teams, Seeding Schools; Online Counseling]
Educators' Activities
1. Retrieval of lesson plans and other educational resources
2. Community Interaction
3. Teaching Activity
4. Experience Sharing
5. Journal submission

A Survey of Related Standards

OAIS Preservation Metadata
- OAIS: Open Archival Information System
- Preservation Metadata
  - Preservation metadata is the information infrastructure that supports the processes associated with digital preservation.
  - It is the information necessary to maintain the viability, renderability, and understandability of digital resources over the long term.
- An OAIS has three basic functions: ingest, storage, and dissemination
- In the ERA concept, these functions are executed in three virtual workspaces: the Accession, Archival, and Reference workbenches.

ERA Block Diagram from [1]
[1] Kenneth Thibodeau, "Building the Archives of the Future, Advances in Preserving Electronic Records at the National Archives and Records Administration," D-Lib Mag., vol. 7, no. 2, Feb

OAI-PMH and Dublin Core
- OAI Protocol for Metadata Harvesting
  - The Open Archives Initiative Protocol for Metadata Harvesting provides an application-independent interoperability framework based on metadata harvesting
- Dublin Core
  - Addresses the problem of resource discovery for networked resources
  - 15-element set of descriptors
  - Interdisciplinary and international consensus reached on the semantics of each of the 15 elements
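To make the harvesting idea concrete, the following is a minimal sketch of how a harvester might form an OAI-PMH request for Dublin Core records. The repository address is a made-up placeholder and `build_harvest_url` is a hypothetical helper name; only the `verb` and `metadataPrefix` request arguments and the 15 element names come from the published standards.

```python
from urllib.parse import urlencode

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_harvest_url(base_url, verb="ListRecords", **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Hypothetical repository address, used only for illustration.
url = build_harvest_url("http://archive.example.org/oai",
                        metadataPrefix="oai_dc")
print(url)
```

A service provider would issue this request over HTTP and parse the XML response; the sketch stops at building the URL so that it runs without a network connection.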

A Typical OAI-PMH Architecture

Name Space
- DOI
  - The Digital Object Identifier (DOI®) is a system for identifying and exchanging intellectual property in the digital environment.
- URI
  - A URI can be further classified as a locator, a name, or both.
  - "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource.
  - "Uniform Resource Name" (URN) refers to the subset of URI that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable.
- URN
  - Uniform Resource Names (URNs) are intended to serve as persistent, location-independent resource identifiers and are designed to make it easy to map other namespaces into URN-space.
  - Syntax: <URN> ::= "urn:" <NID> ":" <NSS>
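As an illustration of the URN form, the sketch below splits a URN into its namespace identifier (NID) and namespace-specific string (NSS), following the "urn:" NID ":" NSS layout of RFC 2141. The function name `parse_urn` is an assumption made for this example, and the check is deliberately loose: it does not validate the character sets the RFC allows.

```python
def parse_urn(urn):
    """Split 'urn:<NID>:<NSS>' into (NID, NSS); raise ValueError otherwise."""
    scheme, first_colon, rest = urn.partition(":")
    nid, second_colon, nss = rest.partition(":")
    if (scheme.lower() != "urn" or not first_colon
            or not second_colon or not nid or not nss):
        raise ValueError("not a URN: %r" % urn)
    return nid, nss


nid, nss = parse_urn("urn:isbn:0451450523")
print(nid, nss)
```

Because the NSS may itself contain colons, only the first two colons are treated as separators.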

Descriptive Metadata
- METS
  - The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library
- EAD
  - The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML).

More on Descriptive Metadata used in NDAP
- MARC
- TEI
- CDWA
- Species 2000 Data Standard
- ECHO
- OLACMS
- CSDGM
- MARC 21 Concise Format for Authority Data
- ADL Gazetteer Content Standard

Our Approach

Architecture of ODAE
[Diagram. Labels: user#1, user#2, user#3; Remote systems; Union Catalog (Discovery Engine); Data Provider; Metadata Server; Metadata & Workflow Server; Missing-Character Server; Media Center; Repository Manager; Video, Audio, Image; Media Production; Streaming Server; SSO Server; AAA Server; Doc Center; Backend Production Client]

Missing Character Server

Number of Hanzi Characters
- BIG5: 13,051
- GB 2312: 6,763
- GBK: 21,003
- GB : 27,000+
- Unicode 2.1: 20,902
- Unicode 3.0: 27,484
- Unicode 3.1: 70,195
- Estimated number of characters: 50,000+
- Estimated number of glyphs: 100,000+
- In common use: 8,000 – 9,000
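The gap between these repertoires is what makes a character "missing": a character is unusable in a document whose encoding has no code for it. A minimal sketch of detecting such characters, assuming Python's built-in `big5` codec as a stand-in for the character set in use (its table may not match the counts above exactly), could look like this. The helper names are illustrative only.

```python
def is_encodable(char, charset):
    """Return True if the character has a code in the given charset."""
    try:
        char.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def missing_characters(text, charset="big5"):
    """List the characters of text that the charset cannot represent."""
    return [ch for ch in text if not is_encodable(ch, charset)]


# U+4E2D is a common character; U+20000 was added with CJK Extension B.
sample = "\u4e2d\U00020000"
print(missing_characters(sample))
```

Characters reported by `missing_characters` are the ones that would need a glyph expression or some other out-of-band representation.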

Missing Character Problem
- C.C. Hsieh et al.
  - Glyph Expression
  - Maintains a Hanzi Glyph Database
- Preparation
  - Heavy users, e.g., content holders
  - Occasional users
- Network Presentation
- Retrieval of documents containing missing characters

Preparing Missing Characters by Content Holders
- Installing the Hanzi glyph database at the client
  - URL:
  - It also contains MS Office document templates for preparing glyph expressions
- Inserting a glyph expression wherever needed in a document or database

Presenting Missing Characters
[Diagram. Labels: Content Holder; documents containing glyph expressions and a Java Applet; Glyph Image Server; Client; Web Server; Presentation module]

Glyph Image Server
- Accepts a glyph expression encoded in the form of a CGI query
- Returns a glyph image
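The slides do not give the query format, so the following is only a sketch of how a glyph expression could be carried in a CGI query string and recovered on the server side. The server address, the parameter names `expr` and `size`, and the sample expression are all assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of a glyph image server.
GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"


def glyph_query_url(expression, size=24):
    """Encode a glyph expression and pixel size as a CGI query URL."""
    return GLYPH_SERVER + "?" + urlencode({"expr": expression, "size": size})


def decode_glyph_query(url):
    """Recover (expression, size) from a URL built by glyph_query_url."""
    params = parse_qs(urlsplit(url).query)
    return params["expr"][0], int(params["size"][0])


url = glyph_query_url("\u6728+\u6728", size=32)
print(url)
print(decode_glyph_query(url))
```

Percent-encoding the expression keeps operators and non-ASCII components intact on the way to the server, which would then render and return the image.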

Missing Character Presentation
- The web server automatically inserts a presentation applet into each outgoing web page
  - The author can also choose to insert the applet into the HTML document
- The presentation applet retrieves the same HTML document from the server
  - Netscape 4.x compatibility
- The web server extracts the glyph expression from the document, and
  - converts it into a CGI query for the glyph image server, and
  - writes it back to the browser's cache
- The web browser renders the new web page with the glyph image retrieved from the glyph image server
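The substitution step can be sketched as a single text rewrite. This is not the applet-based mechanism described in the slide: it does the replacement in one pass over the HTML, and it assumes a made-up `[[glyph:...]]` delimiter for glyph expressions, a hypothetical server address, and an `expr` query parameter, none of which come from the original system.

```python
import re
from html import escape
from urllib.parse import urlencode

# Hypothetical glyph image server and an assumed in-document delimiter.
GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"
GLYPH_PATTERN = re.compile(r"\[\[glyph:(.+?)\]\]")


def render_missing_characters(html_text):
    """Replace each glyph expression with an <img> pointing at the server."""
    def to_img(match):
        expression = match.group(1)
        src = GLYPH_SERVER + "?" + urlencode({"expr": expression})
        return '<img class="glyph" src="%s" alt="%s">' % (
            escape(src), escape(expression))
    return GLYPH_PATTERN.sub(to_img, html_text)


page = "<p>before [[glyph:x]] after</p>"
print(render_missing_characters(page))
```

The browser then fetches each image from the glyph image server, which matches the final step of the flow above.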

Network-based Input Method for Missing Characters

Retrieving Documents with Missing Characters

ODAE Content Management Architecture
[Diagram. Labels: user#1, user#2, user#3; Remote systems; Union Catalog (Discovery Engine); Data Provider; Metadata Server; Metadata & Workflow Server; Missing-Character Server; Media Center; Repository Manager; Video, Audio, Image; Media Production; Streaming Server; SSO Server; AAA Server; Doc Center; Backend Production Client]

Metadata Server

Goals
- The metadata group interacts closely with content holders
- to look into existing international metadata activities
- to define domain-specific metadata and workflow to manage the digital archive

Metadata Server Design
[Diagram. Labels: Data Flow Engine; Data Provider of Union Catalog; Index Engine; Content Holders; Web Surfers; Presentation Engine; Preservation Engine; Media Center; Metadata Store]

Media Center

Major Functions
- A repository of multimedia objects
- Media Processing
  - Rotation, Creating Thumbnails
  - Adding Watermark
- Registering a unique name from the Local Name Authority

Integration with Local Name Authority
[Diagram. Labels: Content Holders; Media Center; Local Name Authority; Digital Object Repository (URN Handle System)]
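The registration step can be pictured with a small in-memory sketch. It assumes that the name authority mints sequential URNs inside one namespace and remembers where each object is stored; the class name, the `x-example` namespace, and the storage path are hypothetical, and a real deployment would sit on a persistent resolver rather than a dictionary.

```python
import itertools


class LocalNameAuthority:
    """Mint unique URN-style names and map them to storage locations."""

    def __init__(self, namespace):
        self.namespace = namespace
        self._counter = itertools.count(1)
        self._registry = {}

    def register(self, location):
        """Assign the next unused name to a stored digital object."""
        name = "urn:%s:%06d" % (self.namespace, next(self._counter))
        self._registry[name] = location
        return name

    def resolve(self, name):
        """Return the storage location registered under a name."""
        return self._registry[name]


authority = LocalNameAuthority("x-example")
name = authority.register("/repository/images/rubbing-0001.tif")
print(name, authority.resolve(name))
```

Keeping the name separate from the location lets the Media Center move or re-derive files without changing the identifier that catalogs refer to.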

Union Catalog and Data Provider

Union Catalog Services
- Goals: Archive, Commerce, and Public Access
- Functional Requirements
  - Full-text Search
    - Using character strings as the query to retrieve documents containing one or all of the strings
  - Dublin Core Search
    - Search for documents containing a query string in one of the 15 Dublin Core elements
    - To increase the precision of search results
  - Catalog
    - Advanced users can make better use of the above two search functions.
    - However, it is essential for general users to use a hierarchical catalog to get familiar with the archive of digital objects.
- For Discovery Purposes
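The two search functions can be sketched over in-memory data as follows. This is a naive linear scan written only to show the difference between them; the function names and the sample records are invented for the example, and a production catalog would use an index engine instead.

```python
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def full_text_search(documents, terms, match_all=False):
    """Return ids of documents containing any (or all) of the terms."""
    combine = all if match_all else any
    return [doc_id for doc_id, text in documents.items()
            if combine(term in text for term in terms)]


def dublin_core_search(records, element, query):
    """Return ids of records whose given DC element contains the query."""
    if element not in DC_ELEMENTS:
        raise ValueError("not a Dublin Core element: %r" % element)
    return [rec_id for rec_id, record in records.items()
            if any(query in value for value in record.get(element, []))]


# Invented sample data.
documents = {"d1": "bronze rubbing", "d2": "insect specimen", "d3": "bronze seal"}
records = {"d1": {"title": ["Bronze rubbing"], "subject": ["bronze"]},
           "d2": {"title": ["Insect specimen"], "subject": ["insects"]}}

print(full_text_search(documents, ["bronze", "seal"]))
print(full_text_search(documents, ["bronze", "seal"], match_all=True))
print(dublin_core_search(records, "subject", "insect"))
```

Restricting the match to one element is what raises precision: a query for "bronze" in `subject` ignores documents that merely mention the word in running text.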

Building an Inter-Agent Union Catalog
[Diagram. Labels: Domain metadata; Archive of Digital Objects; Union Catalog; Catalog Mapping; Metadata-DC mapping; DC metadata; OAI]

Individual Content Holder
[Diagram. Labels: Domain metadata; Archive of Digital Objects; For Individual Project]

Union Catalog and the Mappings
[Diagram. Labels: Domain metadata; Archive of Digital Objects; Union Catalog; Catalog Mapping; Metadata-DC mapping; DC metadata; For Union Catalog]

Defining a Union Catalog
- Domain Catalog and Union Catalog
  - Their mapping
- Metadata Mapping
  - Mapping essential archive metadata elements to DC elements
  - One-way mapping
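A one-way mapping of this kind can be sketched as a lookup table from domain fields to Dublin Core elements. The field names in `RUBBING_TO_DC` and the sample record are invented for illustration and are not taken from any NDAP metadata schema.

```python
# Hypothetical mapping from one domain schema to Dublin Core elements.
RUBBING_TO_DC = {
    "rubbing_name": "title",
    "maker": "creator",
    "dynasty": "coverage",
    "object_id": "identifier",
}


def to_dublin_core(domain_record, mapping):
    """Project a domain record onto DC elements; unmapped fields are dropped."""
    dc_record = {}
    for field, value in domain_record.items():
        element = mapping.get(field)
        if element is None:
            continue  # not an essential element, so it stays in the domain store
        dc_record.setdefault(element, []).append(value)
    return dc_record


record = {"rubbing_name": "Stele rubbing", "maker": "Unknown",
          "ink_type": "black", "object_id": "R-0001"}
print(to_dublin_core(record, RUBBING_TO_DC))
```

The mapping is one-way because dropped fields such as `ink_type` cannot be recovered from the Dublin Core record; the domain metadata remains the authoritative copy.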

Technical Support for a Union Catalog

Technical Supports
- Tools for Transferring Metadata to OAI Data Provider
- Two additional servers
  - data provider and service provider
- Data transfer protocol from metadata database to OAI data provider
- Server authentication
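One piece of such a transfer tool is turning a Dublin Core record into the `oai_dc` XML that a data provider serves. The sketch below assumes the record is already a dictionary of element names to value lists; the function name is hypothetical, while the two namespace URIs are the ones published for `oai_dc` and the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc_xml(dc_record):
    """Serialize {element: [values]} as an oai_dc:dc XML fragment."""
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for element, values in dc_record.items():
        for value in values:
            child = ET.SubElement(root, "{%s}%s" % (DC_NS, element))
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_oai_dc_xml({"title": ["Stele rubbing"],
                          "creator": ["A", "B"]})
print(xml_text)
```

A full data provider would wrap this fragment in the protocol's record and response envelope and add the schema location attribute; those parts are omitted here.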

An OAI Service Provider

Document Center

Conclusions
- Union Catalog
- Digital Object Model
  - Hierarchical data model is assumed in METS, OAI-PMH, etc.
  - Relational Model
- Workflow
  - NARA/ERA and ISO OAIS
- Impacts on Education
- AAA and E-Commerce
- Modularity and Scalability