Published by Leo Fowler. Modified over 9 years ago.
1
eSciDoc, ViRR and Digitization Lifecycle - insights into an infrastructure for management of digitized resources
Natasa Bulatovic
Max Planck Digital Library, Research and Development
This work is licensed under a Creative Commons Attribution 2.0 Germany License: http://creativecommons.org/licenses/by/2.0/de/
2
The Max Planck Digital Library (MPDL) in a Nutshell
●The Max Planck Digital Library (MPDL) is a service unit within the Max Planck Society (MPG)
●The MPG consists of about 80 institutes in three scientific sections
○the Chemistry, Physics and Technology Section
○the Biology and Medicine Section
○the Human Sciences Section
●The core activities of the MPDL lie in building up service infrastructure and tools for publications and research data
●The MPDL develops software solutions in close cooperation with scientists, librarians and technicians
●In the Human Sciences Section, several institutes have digitized cultural artefacts and want to make them open access
3
eSciDoc SOA Landscape
4
Which data are managed?
5
How?
●PubMan – Publication management
●ViRR – Textual digitized resources management
●Imeji – Image management
6
PubMan: Management of publications
7
27.08.2015
ViRR is about
●Collaboration of the MPDL with the Max Planck Institute for European Legal History
●Motivation: The period of the Holy Roman Empire produced an enormous corpus of legislative sources. Until now, no complete collection of these works exists.
8
ViRR Key features
●Web-based collaborative application
●Editor (bibliographic metadata, table of contents and structural metadata)
●Viewer (online representation)
●Browser
9
ViRR Editor
●Combines a set of tools
○Paginator
○Table of Contents Editor
○Metadata Editor
●One complex, but flexible workspace
●No default order for the usage of the tools
10
ViRR Editor - Paginator
●Assign the logical page numbers to the physical ones
●Choose between different formats (Arabic, Latin, custom)
●Paginate manually or automatically
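Automatic pagination can be pictured as a mapping from physical scans to logical page labels. The following is a minimal sketch, not ViRR's actual implementation: the function names, the scan identifiers, and the reading of "Latin" as Roman numerals are all assumptions made for illustration.

```python
from typing import Dict, List


def to_roman(n: int) -> str:
    """Convert a positive integer to a lowercase Roman numeral."""
    if n <= 0:
        raise ValueError("Roman numerals need a positive integer")
    pairs = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def paginate(scans: List[str], start: int = 1, style: str = "arabic") -> Dict[str, str]:
    """Assign consecutive logical page labels to physical scans (hypothetical helper)."""
    formatters = {"arabic": str, "roman": to_roman}
    fmt = formatters[style]  # a custom format would be one more entry here
    return {scan: fmt(start + offset) for offset, scan in enumerate(scans)}


# Front matter numbered iii, iv, v; scan names are made up.
front_matter = paginate(["scan_0001", "scan_0002", "scan_0003"], start=3, style="roman")
```

Manual pagination would then amount to overwriting single entries of the returned mapping.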
11
ViRR Editor - ToC Editor
●Gather the logical structure of a work by breaking it down into structural elements
●Arrange the hierarchical order of structural elements in the tree
●Assign scans to structural elements
●Choose from fine-grained structural element types (over sixty)
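The logical structure described here is a tree whose nodes carry a type and a list of assigned scans. A minimal sketch of such a tree follows; the class name, field names, and element types ("monograph", "chapter", "section") are hypothetical and do not reflect ViRR's data model or its list of over sixty types.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class StructuralElement:
    kind: str                                   # element type, e.g. "chapter"
    label: str                                  # human-readable heading
    scans: List[str] = field(default_factory=list)
    children: List["StructuralElement"] = field(default_factory=list)

    def add(self, child: "StructuralElement") -> "StructuralElement":
        """Attach a child element and return it, so calls can be nested."""
        self.children.append(child)
        return child


def walk(element: StructuralElement, depth: int = 0) -> Iterator[Tuple[int, StructuralElement]]:
    """Yield (depth, element) pairs in table-of-contents order (depth-first)."""
    yield depth, element
    for child in element.children:
        yield from walk(child, depth + 1)


work = StructuralElement("monograph", "Example work")
chapter = work.add(StructuralElement("chapter", "Chapter 1", scans=["scan_0003", "scan_0004"]))
chapter.add(StructuralElement("section", "Section 1.1", scans=["scan_0004"]))

for depth, element in walk(work):
    print("  " * depth + f"{element.label} [{element.kind}]")
```

A scan may belong to more than one element, as with scan_0004 above, which is why scans are stored as plain references rather than moved between nodes.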
12
ViRR Editor – Metadata Editor
●Assign descriptive metadata to structural elements
●Detailed description of every structural element
●Systematic browsing
●Dedicated search will be possible
13
ViRR Viewer
●Browse by scan
●Browse by ToC
●Navigate to page
●View metadata of structural element
●Page (web resolution)
●Page (full resolution) on click
14
ViRR: Sharing and reuse http://virr.mpdl.mpg.de
15
From ViRR to Digitization Lifecycle
●Project goal: support the complete Digitization Lifecycle with guidelines, standards, tools and a publishing platform
●Partners:
○MPI for European Legal History, Frankfurt
○Kunsthistorisches Institut, Florence (KHI)
○Bibliotheca Hertziana, Rome
○MPI for Human Development, Berlin
●Related projects:
○ViRR (see http://colab.mpdl.mpg.de/mediawiki/ViRR:_Virtueller_Raum_Reichsrecht)
○XML-Workflow (see http://colab.mpdl.mpg.de/mediawiki/MPDL_Project_XML_Workflow)
16
Imeji: Management of image collections
17
Imeji: repository of Digital Images
●Organized into
○Collections: created and defined by the institution, project, working group
○Albums: created and defined by the researcher
18
Imeji: what is so different about it?
●Imeji is not Flickr, nor Facebook...
●Freely definable metadata profiles at collection level
●Controlled vocabularies may be integrated
●Smart search for dates, ranges (based on the metadata type)
●Helps gathering the metadata more effectively
●Focusses on collaboration and metadata quality
●Repository: data can be exported at any time
19
eSciDoc and other services
20
eSciDoc SOA Landscape
21
eSciDoc core infrastructure
[Diagram: handlers grouped under Resources & Data, Statistics, and Security]
●Set Handler (OAI-PMH)
●Admin Handler
●Aggregation Definition Handler
●Statistics Data Handler
●Scope Handler
●Report Handler
●Report Definition Handler
●Item Handler
●Container Handler
●Context Handler
●Organizational Unit Handler
●Content Model Manager
●User Account Handler
●Role Handler
●Group Handler
●Content Relation Handler
22
CoNE Service
●Manages named entities
○Journals
○Persons
○Dewey Decimal Classification (3 public levels)
○Creative Commons Licenses (CC licenses)
○ISO 639-3 Languages
○MIME Types
○PACS classification
○Custom classifications
●Reuse
○Data delivered in multiple formats (JSON, HTML, RDF/XML, Options list)
●Motivation
○Metadata quality: autosuggest components in solutions during metadata editing
○Disambiguation: each entity is a named graph
○Data linking: CoNE identifiers in publication metadata
○Technical facilitation: all lists in one place
○Persons: Researcher Portfolio
●Extensions
○Refresh data from external sources
23
CoNE – Control of Named Entities
●http://cone.mpdl.mpg.de/
●http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450
●Content negotiation supported
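Content negotiation means that one entity URL can serve several representations, selected by the HTTP Accept header. The sketch below only builds the requests and does not send them, since the service may no longer be reachable. The helper name is hypothetical, and which media types CoNE actually honours is an assumption based on the formats listed for the service (JSON, HTML, RDF/XML).

```python
from urllib.request import Request

# URL taken from the slide; it is not contacted by this sketch.
PERSON_URL = "http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450"


def cone_request(url: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(url, headers={"Accept": media_type})


# Same resource, two requested representations (media types are assumptions).
rdf_request = cone_request(PERSON_URL, "application/rdf+xml")
json_request = cone_request(PERSON_URL, "application/json")
```

Passing either request to `urllib.request.urlopen` would perform the actual retrieval.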
24
Transformation Service
●Transforms textual data formats
○Metadata
○Resources
○Standard formats
○Specific formats (e.g. EndNote custom fields)
●Motivation
○Migration of data from MPI
○Exports and dissemination
○Imports
○Continuous interoperability enhancement
○Implement once, use wherever needed
25
Search & Export Service
●Searches and exports results
●Citation styles (Citation style manager)
○EndNote
○BibTeX
○…
●Reuse
○Data delivered in multiple formats (PDF, HTML, XML, ODT)
○By external systems (content management, WordPress)
●Motivation
○Search results should be available in various outputs
○One service – many presentations (e.g. WordPress plug-in)
○One interface – easy inclusion of various export formats
26
Syndication Service
[Diagram: Syndication Service and eSciDoc Repository. Feeds: Recent releases in repository; Recent releases in repository (item versions); … Labeled steps: 2: Get feed definition, 3: Search/retrieve items]
●Provides the latest data updates
●RSS
●Atom
●Reuse
○Subscription to feeds and data reuse
○By any external clients
●Extensions
○Media RSS
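An external client reusing such a feed only needs a standard XML parser. The sketch below reads an Atom document with the Python standard library. The sample feed is invented for illustration and is not real output of the eSciDoc Syndication Service; only the Atom namespace and element names are taken from the Atom format itself.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

ATOM = "{http://www.w3.org/2005/Atom}"

# Invented sample data standing in for a "Recent releases in repository" feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent releases in repository</title>
  <entry>
    <title>Example item A</title>
    <id>urn:example:item-a</id>
    <updated>2010-01-02T00:00:00Z</updated>
  </entry>
  <entry>
    <title>Example item B</title>
    <id>urn:example:item-b</id>
    <updated>2010-01-01T00:00:00Z</updated>
  </entry>
</feed>"""


def recent_releases(feed_xml: str) -> List[Tuple[str, str]]:
    """Return (title, updated) for every entry in an Atom feed, in feed order."""
    root = ET.fromstring(feed_xml)
    return [
        (entry.findtext(f"{ATOM}title", default=""),
         entry.findtext(f"{ATOM}updated", default=""))
        for entry in root.findall(f"{ATOM}entry")
    ]


for title, updated in recent_releases(SAMPLE_FEED):
    print(updated, title)
```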
27
Validation service
●Semantic validation
●Contextual validation
●Validation rule editor (upcoming)
28
Data acquisition service
●Fetches data from known sources via identifier (unAPI interface)
●Transforms data to other formats
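unAPI addresses a record through two query parameters, `id` and `format`; a request with an `id` but no `format` asks which formats are available for that record. The sketch below builds such URLs. The base URL, the identifier, and the format name are placeholders, not the endpoint or values of the eSciDoc data acquisition service.

```python
from typing import Optional
from urllib.parse import urlencode

UNAPI_BASE = "http://example.org/unapi"  # hypothetical endpoint


def unapi_url(base: str, identifier: Optional[str] = None, fmt: Optional[str] = None) -> str:
    """Build a unAPI request URL from an optional identifier and format."""
    if fmt is not None and identifier is None:
        raise ValueError("a format can only be requested together with an id")
    params = {}
    if identifier is not None:
        params["id"] = identifier
    if fmt is not None:
        params["format"] = fmt
    return f"{base}?{urlencode(params)}" if params else base


# Placeholder identifier and format name.
formats_url = unapi_url(UNAPI_BASE, "example:12345")
record_url = unapi_url(UNAPI_BASE, "example:12345", "bibtex")
```

The fetched record could then be handed to a transformation step to obtain other formats.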
29
PubMan SWORD Server
●Deposit of data packages (metadata and full texts)
●Logic implements a PubMan-specific workflow
30
PID Cache manager
●Fetches Handles from the GWDG Handle System (dummy resolution)
●Assigns a pre-fetched handle to the resource
●Synchronizes the assigned handle with the resolution to a resource in the Handle system
EPIC – European Persistent Identifier Consortium (GWDG Germany, SARA Netherlands, CSC Finland, http://www.pidconsortium.eu/)
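The three steps above can be sketched as a small pool of pre-fetched handles. This is an illustration under stated assumptions, not the PID Cache manager's code: it assumes "dummy resolution" means that a pre-fetched handle initially points at a placeholder URL, and the class, the `mint` callback, and the in-memory `resolution` table standing in for the Handle System are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict

DUMMY_URL = "http://example.org/placeholder"  # hypothetical dummy target


class PidCache:
    """Keep a pool of pre-fetched handles and hand them out on demand."""

    def __init__(self, mint: Callable[[], str], size: int = 3) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self._mint = mint                      # stands in for the remote handle service
        self._size = size
        self._pool: Deque[str] = deque()
        self.resolution: Dict[str, str] = {}   # stands in for Handle System records
        self.refill()

    def refill(self) -> None:
        """Step 1: fetch handles ahead of time, each with a dummy resolution."""
        while len(self._pool) < self._size:
            handle = self._mint()
            self.resolution[handle] = DUMMY_URL
            self._pool.append(handle)

    def assign(self, resource_url: str) -> str:
        """Steps 2 and 3: assign a pre-fetched handle, then update its resolution."""
        handle = self._pool.popleft()
        self.resolution[handle] = resource_url
        self.refill()
        return handle
```

Pre-fetching keeps resource creation fast, because a handle is already available locally when an item is saved and the remote service is contacted only to top up the pool.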
31
A note on the metadata profiles
●DCAP based (Dublin Core Application Profile)
●DC terms (identified by URIs)
●eSciDoc solution specific terms (identified by URIs)
●METS/MODS
●Publicly available
●Functional description: http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Application_Profiles
●Schemas: http://metadata.mpdl.mpg.de/escidoc/metadata/schemas/0.1/
●Interoperability levels
○Shared term definitions (done)
○Semantic interoperability (done)
○Description set syntactic interoperability (prepared)
○Description set profile interoperability (prepared)
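Terms identified by URIs can be illustrated with a small description that uses the DCMI Metadata Terms namespace. The sketch below is not an eSciDoc metadata record: the `record` wrapper element and the sample values are hypothetical, and real profiles are defined by the published schemas.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"   # namespace URI of DCMI Metadata Terms
ET.register_namespace("dcterms", DCTERMS)


def describe(title: str, creator: str, issued: str) -> str:
    """Serialize a minimal description using three DC terms."""
    record = ET.Element("record")  # hypothetical wrapper, not an eSciDoc element
    for term, value in (("title", title), ("creator", creator), ("issued", issued)):
        ET.SubElement(record, f"{{{DCTERMS}}}{term}").text = value
    return ET.tostring(record, encoding="unicode")


# Sample values are invented.
xml_text = describe("Example title", "Example Author", "2010")
print(xml_text)
```

Because each element name expands to a URI such as http://purl.org/dc/terms/title, two systems that share the term definitions can interpret the same field without a private mapping.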
32
Premises
●Applications
○Web-based
○Internationalized
○Integrated help system
○Easy to use
○Easy to install
●Services and infrastructure
○Reusable, interoperable, composed, technology-independent
○Extensible, scalable and performant
●Data
○Persistently identified, versioned, discoverable, provenance and authenticity information, fine-grained authorization
○Described with published metadata profiles
○Interoperable and enabled for reuse and repurpose
33
Related projects and new developments
●DARIAH – Digital Research Infrastructure for Arts and Humanities (see http://dariah.eu)
●Imeji
●AWOB – Astronomers Workbench
●Resource Registries
●ECHO – European Cultural Heritage Online (see http://echo.mpiwg-berlin.mpg.de/home)
34
Thank you!
bulatovic@mpdl.mpg.de
http://colab.mpdl.mpg.de
http://escidoc.org