An Overview of Open Digital Archive Architecture
Jan-Ming Ho, PhD, Research Fellow and Deputy Director, Inst. of Info. Sci., Academia Sinica

1 An Overview of Open Digital Archive Architecture
Jan-Ming Ho, PhD
Research Fellow and Deputy Director
Inst. of Info. Sci., Academia Sinica

2 The Problem

3 Digital Archive Model
[Diagram. Labels: Collection Management; Proofreading; Preservation; Front-end Production; Dissemination; Digitization; Presentation; Workflow; AAA; User Services and Management; Value-Added Services; Knowledge Discovery; Other archive systems; Catalog Service; Multimedia raw data and metadata]

4 Requirements for NDAE
- Digital Archive Working Environment
  - Collection, digitization workflow, and storage
  - Metadata, indexing, and digital object management
- Discovery and Dissemination
  - Content distribution
  - Retrieval and presentation
  - Models the requirements of content holders and users
- Scalability and Interoperability
- Multimedia Processing and Presentation
  - Retrieval, watermark, summarization, virtual reality, etc.
- Multilingual Requirements
  - Unicode and Han Variants
  - Missing Han Characters
  - Thesaurus
- AAA: Authentication, Authorization, and Accounting
- Union Catalog and Value-added Services

5 Sample Content Projects in NDAP
- Rubbings of Bronze, Stones, and Bamboo Slips
  - Holomorphic rubbings
- Archaeological Excavations
- Seal Database of Rare Books
- Archives of Specimens of Insects, Fish, and Shell, etc.
  - Insects
- Old Chinese Paintings
- Engravings on Bronze Wares
  - Engravings on Bronze Wares made in Chin Dynasty (265-289 A.D.)

6 Management of Holomorphic Rubbings

15 Directory of Species

16 Specimen Information System

22 Metadata Design
- Domain-specific and internationalization
- Standardizing metadata to facilitate preservation and dissemination of digital objects, and their applications

23 A Service Infrastructure
[Diagram. Labels: Dark Archive; Content Creation and Management (twice); Union Catalog; Centralized Hosting; Domain Catalog; Access; Value Added (three times); Education Service]

24 An Educators' Platform
- Education Resource Exchange Platform
  - Front-end
  - Back-end
- Educational Resources
  - Online Journals
  - Education Material
  - Textbook, Reading
- Government Institutes, Non-governmental Consulting Teams, Seeding Schools
- Online Counseling
- Educators' Activities
  1. Retrieval of lesson plan and other educational resources
  2. Community Interaction
  3. Teaching Activity
  4. Experience Sharing
  5. Journal submission

25 A Survey of Related Standards

26 OAIS Preservation Metadata
- Open Archival Information System (OAIS)
- Preservation Metadata
  - Preservation metadata is the information infrastructure that supports the processes associated with digital preservation: the information necessary to maintain the viability, renderability, and understandability of digital resources over the long term.
- An OAIS has three basic functions: ingest, storage, and dissemination.
- In the ERA concept, these functions are executed in three virtual workspaces: the Accession, Archival, and Reference workbenches.

27 ERA Block Diagram from [1]
[1] Kenneth Thibodeau, "Building the Archives of the Future, Advances in Preserving Electronic Records at the National Archives and Records Administration," D-Lib Mag., vol. 7, no. 2, Feb. 2001.

28 OAI-PMH and Dublin Core
- OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
  - Provides an application-independent interoperability framework based on metadata harvesting
- Dublin Core
  - Addresses the problem of resource discovery for networked resources
  - 15-element set of descriptors
  - Interdisciplinary and international consensus reached on the semantics of each of the 15 elements
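As a rough illustration of metadata harvesting, the sketch below builds an OAI-PMH request URL and lists the 15 Dublin Core element names. The verbs and the `metadataPrefix` and `from` arguments come from the OAI-PMH 2.0 specification; the endpoint `http://archive.example.org/oai` and the helper `build_oai_request` are hypothetical and are not taken from the presentation.

```python
from urllib.parse import urlencode

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # "from" is a Python keyword, so callers pass it as from_.
    for name, value in arguments.items():
        query[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(query)}"


# Hypothetical data provider endpoint, used only for illustration.
BASE_URL = "http://archive.example.org/oai"

url = build_oai_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", from_="2002-01-01"
)
print(url)
```

A harvester would issue this request periodically and store the returned Dublin Core records; the sketch stops at URL construction so that it runs without a network connection.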

29 A Typical OAI-PMH Architecture

30 Name Space
- DOI
  - The Digital Object Identifier (DOI®) is a system for identifying and exchanging intellectual property in the digital environment.
- URI
  - A URI can be further classified as a locator, a name, or both.
  - "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource.
  - "Uniform Resource Name" (URN) refers to the subset of URI that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable.
- URN
  - Uniform Resource Names (URNs) are intended to serve as persistent, location-independent resource identifiers and are designed to make it easy to map other namespaces into URN-space.
  - Syntax: URN ::= "urn:" NID ":" NSS, where NID is the namespace identifier and NSS is the namespace-specific string.
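The URN form quoted on this slide, "urn:" followed by a namespace identifier (NID), a colon, and a namespace-specific string (NSS), can be checked with a short parser. This is a minimal sketch: the NID rule follows RFC 2141 (a letter or digit, then up to 31 letters, digits, or hyphens), while the NSS rule is deliberately simplified to "any non-empty string". The names `Urn` and `parse_urn` are illustrative only.

```python
import re
from typing import NamedTuple


class Urn(NamedTuple):
    nid: str  # namespace identifier, e.g. "isbn"
    nss: str  # namespace-specific string


# "urn:" and the NID are case-insensitive; the NSS check is simplified
# (assumption: any non-empty string is accepted here).
_URN_RE = re.compile(
    r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(.+)$", re.IGNORECASE
)


def parse_urn(text):
    """Split a URN into its NID and NSS parts, or raise ValueError."""
    match = _URN_RE.match(text)
    if match is None:
        raise ValueError(f"not a URN: {text!r}")
    return Urn(nid=match.group(1).lower(), nss=match.group(2))


print(parse_urn("urn:isbn:0451450523"))
```

Because the identifier carries no network location, resolving it to an actual object is left to a separate name authority or resolver.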

31 Descriptive Metadata
- METS
  - The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library.
- EAD
  - The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML).

32 More on Descriptive Metadata used in NDAP
- MARC
- TEI
- CDWA
- Species 2000 Data Standard
- ECHO
- OLACMS
- CSDGM
- MARC 21 Concise Format for Authority Data
- ADL Gazetteer Content Standard

33 Our Approach

34 Architecture of ODAE
[Diagram. Labels: user#1, user#2, user#3; Remote systems; Union Catalog (Discovery Engine); Data Provider; Metadata Server; Metadata & Workflow Server; Missing-Character Server; Media Center; Repository Manager; Video, Audio, Image; Media Production; Streaming Server; SSO Server; AAA Server; Doc Center; Backend; Production Client]

35 Missing Character Server

36 Number of Hanzi Characters
- BIG5: 13,051
- GB 2312: 6,763
- GBK: 21,003
- GB 18030-2000: 27,000+
- Unicode 2.1: 20,902
- Unicode 3.0: 27,484
- Unicode 3.1: 70,195
- Estimated number of characters: 50,000+
- Estimated number of glyphs: 100,000+
- In common use: 8,000 to 9,000

37 Missing Character Problem
- C.C. Hsieh, et al.
  - Glyph Expression
  - Maintains a Hanzi Glyph Database
- Preparation
  - Heavy users, e.g., content holders
  - Occasional users
  - Network
- Presentation
- Retrieval of documents containing missing characters
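One way to see the missing character problem concretely is to ask which characters of a text a legacy encoding cannot represent. The sketch below uses Python's built-in `big5` codec as a stand-in for a legacy system; the function name `missing_characters` and the sample text are illustrative and do not come from the presentation.

```python
def missing_characters(text, encoding="big5"):
    """Return the characters in text that the given encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# U+20000 belongs to CJK Unified Ideographs Extension B, which lies
# outside the Big5 repertoire; the first two characters are common Hanzi.
sample = "中文\U00020000"
for ch in missing_characters(sample):
    print(f"U+{ord(ch):04X} cannot be encoded in Big5")
```

Characters flagged this way are the ones that would need a glyph expression or a glyph image rather than a code point in the legacy encoding.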

38 Preparing Missing Characters by Content Holders
- Installing the Hanzi glyph database at the client
  - URL: http://ckip.iis.sinica.edu.tw/CKIP/tool/
  - It also contains MS Office document templates for preparing glyph expressions
- Inserting a glyph expression wherever needed in a document or database

39 Presenting Missing Characters
[Diagram. Labels: Content Holder; Web Server; Presentation module; Java Applet; glyph expression; Glyph Image Server; Client; steps 1 to 4]

40 Glyph Image Server
- Accepts a glyph expression encoded in the form of a CGI query
- Returns a glyph image
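The slide does not give the query format, so the following is only a sketch of how a glyph expression might travel as a CGI query string. The server URL, the parameter names `expr` and `size`, and the sample expression are all assumptions made for illustration; the real glyph expression syntax is not reproduced here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical glyph image server endpoint (not from the presentation).
GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"


def glyph_query_url(expression, size=24):
    """Client side: encode a glyph expression as a CGI query URL."""
    query = urlencode({"expr": expression, "size": size})
    return f"{GLYPH_SERVER}?{query}"


def decode_glyph_query(url):
    """Server side: recover the expression and size from the query string."""
    params = parse_qs(urlsplit(url).query)
    return params["expr"][0], int(params["size"][0])


# Placeholder expression; the actual glyph expression syntax may differ.
url = glyph_query_url("木+林", size=32)
print(url)
print(decode_glyph_query(url))
```

Percent-encoding keeps non-ASCII components and operator characters intact across the request, so the server can look up or compose the glyph image from exactly the expression the client sent.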

41 Missing Character Presentation
- The web server automatically inserts a presentation applet into each outgoing web page
  - Authors can also choose to insert the applet into the HTML document themselves
- The presentation applet retrieves the same HTML document from the server
  - Netscape 4.x compatibility
- The web server extracts the glyph expression from the document, and
  - converts it into a CGI query for the glyph image server, and
  - writes it back to the browser's cache
- The web browser renders the new web page with the glyph image retrieved from the glyph image server
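The extract-and-convert step described on this slide can be sketched as a text substitution: find each glyph expression in the HTML and replace it with an image reference that points at the glyph image server. The `[[glyph:...]]` marker, the server URL, and the `expr` parameter are hypothetical; the presentation does not show how glyph expressions are delimited in a document.

```python
import re
from html import escape
from urllib.parse import urlencode

# Hypothetical endpoint and marker syntax, assumed for illustration only.
GLYPH_SERVER = "http://glyph.example.org/cgi-bin/glyph"
GLYPH_MARKER = re.compile(r"\[\[glyph:(.+?)\]\]")


def render_missing_characters(html_text):
    """Replace each glyph marker with an <img> that queries the glyph server."""

    def to_img(match):
        expression = match.group(1)
        src = f"{GLYPH_SERVER}?{urlencode({'expr': expression})}"
        return (
            f'<img class="glyph" src="{escape(src)}" '
            f'alt="{escape(expression)}">'
        )

    return GLYPH_MARKER.sub(to_img, html_text)


page = "<p>A [[glyph:X1]] B</p>"
print(render_missing_characters(page))
```

In the presentation this work is done with a Java applet and the browser cache; the substitution above only shows the transformation itself, not that delivery mechanism.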

42 Network-based Input Method for Missing Characters

43 Retrieving Documents with Missing Characters

45 ODAE Content Management Architecture
[Diagram. Labels: user#1, user#2, user#3; Remote systems; Union Catalog (Discovery Engine); Data Provider; Metadata Server; Metadata & Workflow Server; Missing-Character Server; Media Center; Repository Manager; Video, Audio, Image; Media Production; Streaming Server; SSO Server; AAA Server; Doc Center; Backend; Production Client]

46 Metadata Server

47 Goals
- The metadata group interacts closely with content holders
  - to look into existing international metadata activities
  - to define domain-specific metadata and workflow to manage the digital archive

48 Metadata Server Design
[Diagram. Labels: Data Flow Engine; Data Provider of Union Catalog; Index Engine; Content Holders; Web Surfers; Presentation Engine; Preservation Engine; Media Center; Metadata Store]

49 Media Center

50 Major Functions
- A repository of multimedia objects
- Media Processing
  - Rotation, Creating Thumbnails
  - Adding Watermark
- Registering a unique name from Local Name Authority

52 Integration with Local Name Authority
[Diagram. Labels: Content Holders; Media Center; Local Name Authority; Digital Object Repository; (URN Handle System)]

53 Union Catalog and Data Provider

54 Union Catalog Services
- Goals: Archive, Commerce, and Public Access
- Functional Requirements
  - Full-text Search
    - Using character strings as a query to retrieve documents containing one or all of the strings
  - Dublin Core Search
    - Search for documents containing a query string in one of the 15 Dublin Core elements
    - To increase the precision of search results
  - Catalog
    - Advanced users can make better use of the above two search functions.
    - However, it is essential for general users to use a hierarchical catalog to get familiar with the archive of digital objects.
- For Discovery Purposes
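The two search functions on this slide can be sketched over a small in-memory list of records. The record layout (an `id`, a `text` field, and a `dc` dictionary of element values) and the sample records are made up for illustration; a real union catalog would use an index engine rather than a linear scan.

```python
def full_text_search(records, terms, match_all=False):
    """Return records whose text contains any (or all) of the query strings."""
    combine = all if match_all else any
    return [
        r for r in records
        if combine(t.lower() in r["text"].lower() for t in terms)
    ]


def dublin_core_search(records, element, query):
    """Return records whose given Dublin Core element contains the query."""
    return [
        r for r in records
        if any(query.lower() in v.lower() for v in r["dc"].get(element, []))
    ]


# Made-up sample records, for illustration only.
records = [
    {"id": "rubbing-1",
     "text": "Rubbing of a bronze inscription",
     "dc": {"title": ["Bronze rubbing"], "type": ["Image"]}},
    {"id": "specimen-7",
     "text": "Insect specimen with bronze-colored wings",
     "dc": {"title": ["Insect specimen"], "subject": ["Insects"]}},
]

print([r["id"] for r in full_text_search(records, ["bronze"])])
print([r["id"] for r in dublin_core_search(records, "title", "bronze")])
```

Restricting the query to one Dublin Core element drops the record that only mentions the term in passing, which is the precision gain the slide refers to.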

55 Building an Inter-Agent Union Catalog
[Diagram. Labels: Domain metadata; Archive of Digital Objects; Union Catalog; Catalog Mapping; Metadata-DC mapping; DC metadata; OAI]

56 Individual Content Holder
[Diagram. Labels: Domain metadata; Archive of Digital Objects; For Individual Project]

57 Union Catalog and the Mappings
[Diagram. Labels: Domain metadata; Archive of Digital Objects; Union Catalog; Catalog Mapping; Metadata-DC mapping; DC metadata; For Union Catalog]

58 Defining a Union Catalog
- Domain Catalog and Union Catalog
  - Their mapping
- Metadata Mapping
  - Mapping essential archive metadata elements to DC elements
  - One-way mapping
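A one-way mapping from domain metadata to Dublin Core can be sketched as a lookup table applied to each record. The domain field names and the sample record below are hypothetical; each content project would define its own table with its metadata group.

```python
# Hypothetical domain fields mapped to Dublin Core elements. Two domain
# fields map to "coverage", so the original fields cannot be recovered
# from the Dublin Core record: the mapping only goes one way.
DOMAIN_TO_DC = {
    "rubbing_title": "title",
    "collector": "contributor",
    "accession_no": "identifier",
    "dynasty": "coverage",
    "excavation_site": "coverage",
}


def to_dublin_core(domain_record):
    """Map a domain record to Dublin Core; unmapped fields are dropped."""
    dc_record = {}
    for field, value in domain_record.items():
        element = DOMAIN_TO_DC.get(field)
        if element is not None:
            dc_record.setdefault(element, []).append(value)
    return dc_record


# Made-up sample record, for illustration only.
record = {
    "rubbing_title": "Bronze vessel inscription",
    "dynasty": "Chin",
    "excavation_site": "Site A",
    "ink_type": "black",
}
print(to_dublin_core(record))
```

Only the essential fields reach the union catalog; the full domain record stays with the content holder's own archive.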

59 Technical Support for a Union Catalog

60 Technical Supports
- Tools for Transferring Metadata to OAI Data Provider
- Two additional servers
  - data provider and service provider
- Data transfer protocol from metadata database to OAI data provider
- Server authentication
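Before a data provider can serve a record, the Dublin Core values have to be serialized in the `oai_dc` XML format. The sketch below does that with the standard library; the two namespace URIs are the ones published for `oai_dc` and Dublin Core 1.1, while the function name `to_oai_dc` and the sample record are illustrative. It omits the `xsi:schemaLocation` attribute and the surrounding OAI-PMH response envelope.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when serializing.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(dc_record):
    """Serialize {element: [values]} as an oai_dc metadata fragment."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for element, values in dc_record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, for illustration only.
xml_text = to_oai_dc({"title": ["Bronze rubbing"], "type": ["Image"]})
print(xml_text)
```

A transfer tool could run this over each row exported from the metadata database and hand the fragments to the data provider.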

61 An OAI Service Provider

62 Document Center http://pkc.iis.sinica.edu.tw/user/ndap/

63 Conclusions
- Union Category
- Digital Object Model
  - Hierarchical data model is assumed in METS, OAI-PMH, etc.
  - Relational Model
- Workflow
  - NARA/ERA and ISO OAIS
- Impacts on Education
- AAA and E-Commerce
- Modularity and Scalability

