Published by Dulcie Cole. Modified over 9 years ago.
1
An Overview of Open Digital Archive Architecture
Jan-Ming Ho, PhD
Research Fellow and Deputy Director, Inst. of Info. Sci., Academia Sinica
2
The Problem
3
Digital Archive Model
[Diagram labels: Digitization, Proofreading, Collection Management, Preservation, Front-end Production, Presentation, Dissemination, Workflow, AAA, User Services and Management, Value-Added Services, Knowledge Discovery, Catalog Service, Other archive systems, Multimedia raw data and metadata]
4
Requirements for NDAE
  Digital Archive Working Environment
    Collection, digitization workflow, and storage
    Metadata, indexing, and digital object management
  Discovery and Dissemination
    Content distribution
    Retrieval and presentation
    Models the requirements of content holders and users
  Scalability and Interoperability
  Multimedia Processing and Presentation
    Retrieval, watermark, summarization, virtual reality, etc.
  Multilingual Requirements
    Unicode and Han Variants
    Missing Han Characters
    Thesaurus
  AAA: Authentication, Authorization, and Accounting
  Union Catalog and Value-added Services
5
Sample Content Projects in NDAP
  Rubbings of Bronze, Stones, and Bamboo Slips
    Holomorphic rubbings
  Archaeological Excavations
  Seal Database of Rare Books
  Archives of Specimens of Insects, Fish, Shells, etc.
  Old Chinese Paintings
  Engravings on Bronze Wares
    Engravings on Bronze Wares made in Chin Dynasty (265-289 A.D.)
6
Management of Holomorphic Rubbings
15
Directory of Species
16
Specimen Information System
22
Metadata Design
  Domain-specific and internationalized
  Standardizing metadata to facilitate the preservation, dissemination, and application of digital objects
23
A Service Infrastructure
[Diagram labels: Dark Archive, Content Creation and Management (two instances), Union Catalog, Centralized Hosting, Domain Catalog, Access, Value Added (three instances), Education Service]
24
An Educators' Platform
  Education Resource Exchange Platform: Front-end, Back-end
  Educational Resources: Online Journals; Education Material (Textbook, Reading)
  Government Institutes, Non-governmental Consulting Teams, Seeding Schools
  Online Counseling
  Educators' Activities
    1. Retrieval of lesson plans and other educational resources
    2. Community Interaction
    3. Teaching Activity
    4. Experience Sharing
    5. Journal submission
25
A Survey of Related Standards
26
OAIS Preservation Metadata
  Open Archival Information System (OAIS)
  Preservation Metadata
    Preservation metadata is the information infrastructure that supports the processes associated with digital preservation: the information necessary to maintain the viability, renderability, and understandability of digital resources over the long term.
  An OAIS has three basic functions: ingest, storage, and dissemination.
  In the ERA concept, these functions are executed in three virtual workspaces: the Accession, Archival, and Reference workbenches.
27
ERA Block Diagram from [1]
[1] Kenneth Thibodeau, “Building the Archives of the Future: Advances in Preserving Electronic Records at the National Archives and Records Administration,” D-Lib Mag., vol. 7, no. 2, Feb. 2001.
28
OAI-PMH and Dublin Core
  OAI Protocol for Metadata Harvesting
    The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) provides an application-independent interoperability framework based on metadata harvesting.
  Dublin Core
    Addresses the problem of resource discovery for networked resources
    A 15-element set of descriptors
    Interdisciplinary and international consensus was reached on the semantics of each of the 15 elements
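As a rough illustration of metadata harvesting, the sketch below builds an OAI-PMH `ListRecords` request for unqualified Dublin Core (`oai_dc`) and lists the 15 Dublin Core element names. The base URL `http://archive.example.org/oai` and the helper name `list_records_url` are assumptions made for this example; they are not part of the presentation.

```python
from urllib.parse import urlencode

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date:
        params["from"] = from_date    # optional lower bound on the datestamp
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # The repository address is a placeholder, not a real data provider.
    print(list_records_url("http://archive.example.org/oai", from_date="2003-01-01"))
```

A harvester would issue this request over HTTP and parse the XML response; that part is omitted here.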
29
A Typical OAI-PMH Architecture
30
Name Space
  DOI
    The Digital Object Identifier (DOI®) is a system for identifying and exchanging intellectual property in the digital environment.
  URI
    A URI can be further classified as a locator, a name, or both.
    "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource.
    "Uniform Resource Name" (URN) refers to the subset of URI that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable.
  URN
    Uniform Resource Names (URNs) are intended to serve as persistent, location-independent resource identifiers and are designed to make it easy to map other namespaces into URN-space.
    Syntax: <URN> ::= "urn:" <NID> ":" <NSS>
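A URN has the general form `urn:<NID>:<NSS>`, where NID is the namespace identifier and NSS is the namespace-specific string (RFC 2141). The following minimal sketch splits a URN into those two parts. The function name `parse_urn` and the sample identifier are illustrative assumptions, and the sketch does not validate the characters allowed in each part.

```python
def parse_urn(urn):
    """Split a URN into (NID, NSS); raise ValueError if the shape is wrong."""
    scheme, _, rest = urn.partition(":")
    nid, sep, nss = rest.partition(":")
    if scheme.lower() != "urn" or not sep or not nid or not nss:
        raise ValueError("not a URN: %r" % urn)
    # The "urn" prefix and the NID are case-insensitive; the NSS is kept as written.
    return nid.lower(), nss


if __name__ == "__main__":
    print(parse_urn("urn:isbn:0451450523"))
```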
31
Descriptive Metadata
  METS
    The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library.
  EAD
    The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML).
32
More on Descriptive Metadata used in NDAP
  MARC
  TEI
  CDWA
  Species 2000 Data Standard
  ECHO
  OLACMS
  CSDGM
  MARC 21 Concise Format for Authority Data
  ADL Gazetteer Content Standard
33
Our Approach
34
Architecture of ODAE
[Diagram labels: user#1, user#2, user#3, Remote systems, Union Catalog (Discovery Engine), Data Provider, Metadata Server, Metadata & Workflow Server, Missing-Character Server, Media Center, Repository Manager, Video, Audio, Image, Media Production, Streaming Server, SSO Server, AAA Server, Doc Center, Backend Production Client]
35
Missing Character Server
36
Number of Hanzi Characters
  BIG5: 13,051
  GB 2312: 6,763
  GBK: 21,003
  GB 18030-2000: 27,000+
  Unicode 2.1: 20,902
  Unicode 3.0: 27,484
  Unicode 3.1: 70,195
  Estimated number of characters: 50,000+
  Estimated number of glyphs: 100,000+
  In common use: 8,000 to 9,000
37
Missing Character Problem
  C.C. Hsieh et al.
  Glyph Expression
    Maintains a Hanzi Glyph Database
  Preparation
    Heavy users, e.g., content holders
    Occasional users
  Network Presentation
  Retrieval of documents containing missing characters
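One way to see the missing character problem concretely is to test which characters of a text cannot be represented in a given character set. The sketch below uses Python's built-in `big5` codec for this; the function name `missing_characters` is an assumption for illustration and is not the mechanism used by the glyph database described here.

```python
def missing_characters(text, encoding="big5"):
    """Return the distinct characters of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # U+20000 lies in CJK Unified Ideographs Extension B, outside Big5.
    sample = "\u4e2d\u6587\U00020000"
    print([hex(ord(ch)) for ch in missing_characters(sample)])
```

Characters reported by such a check are the ones a content holder would have to describe some other way, for example with a glyph expression.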
38
Preparing Missing Characters by Content Holders
  Installing the Hanzi glyph database at the client
    URL: http://ckip.iis.sinica.edu.tw/CKIP/tool/
    It also contains MS Office document templates for preparing glyph expressions
  Inserting a glyph expression wherever needed in a document or database
39
Presenting Missing Characters
[Diagram: a Content Holder document containing glyph expressions and a Java Applet passes through the Web Server (Presentation module) to the Client, which obtains glyph images from the Glyph Image Server; numbered steps 1 to 4]
40
Glyph Image Server
  Accepts a glyph expression encoded in the form of a CGI query
  Returns a glyph image
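The idea of encoding a glyph expression as a CGI query can be sketched as follows. The endpoint `http://glyph.example.org/cgi-bin/glyph`, the parameter names `expr` and `size`, and the sample expression are all hypothetical; the presentation does not specify the actual query format or the glyph expression syntax.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; the real glyph image server address is not given in the slides.
GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"


def glyph_query_url(expression, size=24):
    """Encode a glyph expression as a CGI query string (parameter names are assumed)."""
    return GLYPH_SERVER + "?" + urlencode({"expr": expression, "size": size})


if __name__ == "__main__":
    url = glyph_query_url("\u6728+\u76ee")  # made-up expression, for illustration only
    print(url)
    # Decoding the query recovers the original expression.
    print(parse_qs(urlsplit(url).query)["expr"][0])
```

Percent-encoding matters here because glyph expressions contain Hanzi components and operators that are not safe in a URL as written.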
41
Missing Character Presentation
  The web server automatically inserts a presentation applet into each outgoing web page
    The author can also choose to insert the applet into the HTML document
  The presentation applet retrieves the same HTML document from the server
    Netscape 4.x compatibility
  The web server extracts the glyph expression from the document, converts it into a CGI query for the glyph image server, and writes it back to the browser's cache
  The web browser renders the new web page with the glyph image retrieved from the glyph image server
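The extract-and-convert step can be sketched as a text substitution: each glyph expression embedded in a page is replaced with an image reference that points at the glyph image server. The `{{...}}` delimiters, the server address, and the `expr` parameter are assumptions for illustration only, and this sketch does the rewriting in one pass rather than through an applet and the browser cache as the slides describe.

```python
import re
from html import escape
from urllib.parse import urlencode

GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"  # hypothetical endpoint
GLYPH_PATTERN = re.compile(r"\{\{(.+?)\}\}")              # assumed delimiter syntax


def render_missing_characters(html_text):
    """Replace each embedded glyph expression with an <img> tag for its glyph image."""
    def to_img(match):
        expression = match.group(1)
        src = GLYPH_SERVER + "?" + urlencode({"expr": expression})
        return '<img class="glyph" src="{}" alt="{}">'.format(
            escape(src), escape(expression)
        )
    return GLYPH_PATTERN.sub(to_img, html_text)


if __name__ == "__main__":
    print(render_missing_characters("<p>x{{A+B}}y</p>"))
```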
42
Network-based Input Method for Missing Characters
43
Retrieving Documents with Missing Characters
45
ODAE Content Management Architecture
[Diagram labels: user#1, user#2, user#3, Remote systems, Union Catalog (Discovery Engine), Data Provider, Metadata Server, Metadata & Workflow Server, Missing-Character Server, Media Center, Repository Manager, Video, Audio, Image, Media Production, Streaming Server, SSO Server, AAA Server, Doc Center, Backend Production Client]
46
Metadata Server
47
Goals
  The metadata group interacts closely with content holders
    to look into existing international metadata activities
    to define domain-specific metadata and workflow to manage the digital archive
48
Metadata Server Design
[Diagram labels: Data Flow Engine, Data Provider of Union Catalog, Index Engine, Content Holders, Web Surfers, Presentation Engine, Preservation Engine, Media Center, Metadata Store]
49
Media Center
50
Major Functions
  A repository of multimedia objects
  Media Processing
    Rotation, Creating Thumbnails
    Adding Watermark
  Registering a unique name from the Local Name Authority
52
Integration with Local Name Authority
[Diagram labels: Content Holders, Media Center, Local Name Authority, Digital Object Repository (URN Handle System)]
53
Union Catalog and Data Provider
54
Union Catalog Services
  Goals: Archive, Commerce, and Public Access
  Functional Requirements
    Full-text Search
      Using character strings as a query to retrieve documents containing one or all of the strings
    Dublin Core Search
      Search for documents containing a query string in one of the 15 Dublin Core elements
      To increase the precision of search results
    Catalog
      Advanced users can make better use of the above two search functions. However, it is essential for general users to use a hierarchical catalog to get familiar with the archive of digital objects.
      For Discovery Purposes
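The two search functions can be sketched with plain substring matching over in-memory records. The data layout (a dict of document text and a dict of Dublin Core records) and the function names are assumptions for this example; a real union catalog would use an index engine rather than a linear scan.

```python
def full_text_search(documents, terms, match_all=False):
    """Return ids of documents containing any (or, if match_all, every) query string."""
    combine = all if match_all else any
    return [
        doc_id for doc_id, text in documents.items()
        if combine(term in text for term in terms)
    ]


def dublin_core_search(records, element, query):
    """Return ids of records whose given Dublin Core element contains the query string."""
    return [
        rec_id for rec_id, record in records.items()
        if query in record.get(element, "")
    ]


if __name__ == "__main__":
    docs = {"d1": "bronze rubbing", "d2": "insect specimen", "d3": "bronze insect seal"}
    print(full_text_search(docs, ["bronze", "insect"]))
    print(full_text_search(docs, ["bronze", "insect"], match_all=True))
    recs = {"d1": {"title": "Bronze rubbing", "creator": "unknown"}}
    print(dublin_core_search(recs, "title", "Bronze"))
```

Restricting the match to one named element is what lets the Dublin Core search return fewer, more precise hits than the full-text search.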
55
Building an Inter-Agent Union Catalog
[Diagram labels: Domain metadata, Archive of Digital Objects, Union Catalog, Catalog Mapping, Metadata-DC mapping, DC metadata, OAI]
56
Individual Content Holder Domain metadata Archive of Digital Objects For Individual Project
57
Union Catalog and the Mappings
[Diagram labels: Domain metadata, Archive of Digital Objects, Union Catalog, Catalog Mapping, Metadata-DC mapping, DC metadata; caption: For Union Catalog]
58
Defining a Union Catalog
  Domain Catalog and Union Catalog
    Their mapping
  Metadata Mapping
    Mapping essential archive metadata elements to DC elements
    One-way mapping
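A one-way mapping from domain metadata to Dublin Core can be sketched as a lookup table applied field by field. The domain field names below (`specimen_name`, `collector`, and so on) are invented for illustration and do not come from any NDAP metadata schema.

```python
# Hypothetical domain-field to Dublin Core element mapping.
FIELD_TO_DC = {
    "specimen_name": "title",
    "common_name": "title",
    "collector": "creator",
    "collection_date": "date",
    "locality": "coverage",
}


def to_dublin_core(domain_record, mapping=FIELD_TO_DC):
    """Map a domain metadata record to Dublin Core; unmapped fields are dropped."""
    dc_record = {}
    for field, value in domain_record.items():
        element = mapping.get(field)
        if element is not None:
            # Several domain fields may land in the same DC element.
            dc_record.setdefault(element, []).append(value)
    return dc_record


if __name__ == "__main__":
    record = {
        "specimen_name": "Papilio xuthus",
        "common_name": "Asian swallowtail",
        "collector": "unknown",
        "wing_span_mm": "90",
    }
    print(to_dublin_core(record))
```

Because two domain fields can share one DC element and unmapped fields are discarded, the original record cannot be rebuilt from the DC record, which is why the mapping is one-way.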
59
Technical Support for a Union Catalog
60
Technical Support
  Tools for Transferring Metadata to the OAI Data Provider
  Two additional servers
    Data provider and service provider
  Data transfer protocol from the metadata database to the OAI data provider
  Server authentication
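When metadata is handed to an OAI data provider, each record is typically exposed in the `oai_dc` XML format. The sketch below serializes a Dublin Core record (a dict of element name to list of values, an assumed in-memory layout) into an `oai_dc:dc` element using the standard namespaces. It is only an illustration of the record format, not the transfer tool referred to on the slide.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_oai_dc_xml(dc_record):
    """Serialize {element: [values]} as an oai_dc:dc XML string."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for element, values in dc_record.items():
        for value in values:
            child = ET.SubElement(root, "{%s}%s" % (DC_NS, element))
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_oai_dc_xml({"title": ["Bronze rubbing"], "creator": ["unknown"]}))
```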
61
An OAI Service Provider
62
Document Center http://pkc.iis.sinica.edu.tw/user/ndap/
63
Conclusions
  Union Catalog
  Digital Object Model
    Hierarchical data model is assumed in METS, OAI-PMH, etc.
    Relational Model
  Workflow
    NARA/ERA and ISO OAIS
  Impacts on Education
  AAA and E-Commerce
  Modularity and Scalability