Storing digital assets on Grid/EGI FedCloud with gLibrary Giuseppe La Rocca, INFN DARIAH ERIC.

Presentation transcript:

Storing digital assets on Grid/EGI FedCloud with gLibrary
The INFN Digital Repository System
Giuseppe La Rocca, INFN
DARIAH ERIC 5th General VCC Meeting, Ljubljana, Slovenia – 22 April 2015

Outline
- The De Roberto Cultural Heritage use case
- The gLibrary Digital Repository System
- High-level architecture and technologies used
- Data Management APIs
- (Some) examples of the Cultural Heritage VRC
- Summary and conclusions

De Roberto Cultural Heritage use case
Federico De Roberto, an Italian writer born in Naples who spent his life in Catania, left numerous works to the humanities community. They consist of valuable and hard-to-manage pieces: manuscripts, typescripts, drafts with handwritten corrections, magazines, sketches, photos, etc.

Acquisition stage
Digitization of manuscripts, typescripts and printed works:
- TIFF files, one per page, 600 dpi, about 100 MB for an A3 page
- High-resolution scans for in-depth examination
- 8000 sheets/scans, 3 TB of disk space
- Different physical formats: A3, A4 and custom sizes
Embedded metadata:
- The TIFF files carry embedded metadata describing the physical features of the scan and giving information about its content
- Fields: ImageWidth, ImageHeight, XResolution, FileSize, CreationDate, ModifyDate, Description, Keywords, CaptionWriter, Title, Author, Copyright Status, Copyright Notice
- Added with Photoshop after the digitization phase
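As a rough sanity check on the storage figures above, the following sketch estimates the raw pixel-data size of a scan from its page size and resolution. The function name, the two colour depths tried and the assumption of uncompressed data are illustrative and not taken from the slides; real TIFF sizes depend on colour mode and compression.

```python
def uncompressed_scan_bytes(width_mm, height_mm, dpi, bytes_per_pixel):
    """Estimate the raw pixel-data size of a scan (no compression, no headers)."""
    mm_per_inch = 25.4
    width_px = round(width_mm / mm_per_inch * dpi)
    height_px = round(height_mm / mm_per_inch * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# A3 is 297 mm x 420 mm (ISO 216); 600 dpi is the resolution quoted above.
# The colour depths below are assumptions made only for this estimate.
for label, bpp in (("8-bit greyscale", 1), ("24-bit RGB", 3)):
    w, h, size = uncompressed_scan_bytes(297, 420, 600, bpp)
    print(f"A3 @ 600 dpi, {label}: {w} x {h} px, about {size / 1e6:.0f} MB")
```

Under these assumptions an A3 page is about 7016 x 9921 pixels, which gives roughly 70 MB in 8-bit greyscale and 209 MB in 24-bit RGB before compression, the same order of magnitude as the figure of about 100 MB per A3 page.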

Requirements
Make the works accessible to the communities:
- Always online (24 hours a day, 365 days a year) and available from everywhere
- Simple, easy-to-use interface for non-expert users
Quickly find the desired document:
- Documents organized according to their physical and semantic metadata
- Organization by type/collection
- Dynamic filtering of search result sets according to the selection of one or more document metadata values
Long-term preservation:
- Multiple copies (replicas) spread across different geographical sites
- Reliable storage systems and replica redundancy to achieve secure preservation
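The dynamic filtering requirement above can be illustrated with a small in-memory sketch. The sample records, field names and function names are hypothetical; they only show the idea of narrowing a result set by selected metadata values and then offering the remaining values as the next filter.

```python
from collections import Counter

# Hypothetical catalogue entries; the field names are illustrative only.
DOCUMENTS = [
    {"title": "Draft A", "type": "manuscript", "format": "A3"},
    {"title": "Draft B", "type": "typescript", "format": "A4"},
    {"title": "Sketch C", "type": "sketch", "format": "A4"},
    {"title": "Draft D", "type": "manuscript", "format": "A4"},
]


def filter_documents(documents, selection):
    """Keep the documents whose metadata match every selected attribute value."""
    return [d for d in documents
            if all(d.get(key) == value for key, value in selection.items())]


def facet_counts(documents, attribute):
    """Count the values of one attribute in a result set, to offer as the next filter."""
    return Counter(d[attribute] for d in documents if attribute in d)


manuscripts = filter_documents(DOCUMENTS, {"type": "manuscript"})
print([d["title"] for d in manuscripts])
print(dict(facet_counts(manuscripts, "format")))
```

Each additional selection simply adds a key to the `selection` dictionary, so the result set shrinks step by step as the user picks more metadata values.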

Requirements
- Store the 8000 scans of the De Roberto heritage and implement long-term digital preservation of the data: Grid and Cloud storage
- Enable round-the-clock access for scientists: Web service
- Organize documents for quick search: Metadata services
- A simple, easy-to-use system to search, organize, upload and download digitized documents on e-Infrastructures

The INFN Digital Repository System (

gLibrary in a nutshell
gLibrary is a platform developed by INFN that:
- provides a simple yet powerful system to organize, search, store and retrieve “digital assets” in distributed repositories built on Grid, Cloud or local storage infrastructures
- hides the underlying technical details from users
A “digital asset” is a digital object plus its corresponding metadata.

gLibrary abstractions
- Digital Object: any file (PNG, JPG, PDF, TIFF, RAW, MP3, MP4, etc.)
- Metadata: a set of attributes describing a digital object (resolution, author, title, description, location (geo-coordinates), subject, etc.)
- Digital Asset: a digital object plus its metadata
- Collection: a set of digital assets of the same type (e.g. presentations, manuscripts)
- Repository: a library of digital assets organized by collections (e.g. all the presentations and manuscripts)
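The abstractions above nest naturally, and a minimal data-model sketch can make the relationships concrete. The class and field names below are illustrative assumptions and do not reflect how gLibrary itself represents these concepts.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Any file, identified here by a name and a storage location."""
    filename: str
    location: str  # hypothetical storage URL or path


@dataclass
class DigitalAsset:
    """A digital object plus its metadata."""
    obj: DigitalObject
    metadata: dict = field(default_factory=dict)


@dataclass
class Collection:
    """A set of digital assets of the same type."""
    name: str
    assets: list = field(default_factory=list)


@dataclass
class Repository:
    """A library of digital assets organized by collections."""
    name: str
    collections: dict = field(default_factory=dict)

    def add_asset(self, collection_name, asset):
        # Create the collection on first use, then append the asset to it.
        collection = self.collections.setdefault(
            collection_name, Collection(collection_name))
        collection.assets.append(asset)


# Example values are made up for illustration.
repo = Repository("example-repository")
page = DigitalObject("page_001.tiff", "storage://example/page_001.tiff")
repo.add_asset("manuscripts", DigitalAsset(page, {"Title": "Example page", "XResolution": 600}))
print(sorted(repo.collections), len(repo.collections["manuscripts"].assets))
```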

gLibrary architecture (diagram). Labels shown: eToken service; Front ends; glibrary.ct.infn.it REST API; AuthN / AuthZ; Science Gateway; User Tracking DB; "Call gLibrary REST API through API Server Gateway"; Metadata Service; Local storage; Grid storage; Cloud Storage; Authorization service; GridBOX.

List of technologies used
- The gLibrary core services are implemented using Python and Node.js
- The gLibrary Metadata and File Transfer Services can be accessed through a set of REST APIs
- The REST APIs are developed as a WSGI module in an Apache container
- Grid-based and federated authentication are now supported
- The data transfer APIs are provided by GridBOX
- The Metadata service has been deployed using the Django framework
- An OAI-PMH interface has been implemented on top of the gLibrary Metadata services to allow external harvesters to extract the metadata of gLibrary repositories
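To show what an external harvester would send to an OAI-PMH interface such as the one mentioned above, the sketch below builds two standard OAI-PMH requests. The verbs `Identify` and `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH protocol; the base URL is a placeholder, since the address of the gLibrary endpoint is not given here, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def oai_pmh_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: the base URL plus 'verb' and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return f"{base_url}?{urlencode(query)}"


# Placeholder endpoint, not the real gLibrary address.
BASE_URL = "https://repository.example.org/oai"

# Ask the repository to describe itself.
print(oai_pmh_url(BASE_URL, "Identify"))
# Harvest all records as unqualified Dublin Core.
print(oai_pmh_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would issue these as plain HTTP GET requests and parse the XML responses.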

Repository Browser Web App

e-Cultural Science Gateway in INDICATE

Repository Uploader HTML5 Web App
It allows users to upload new assets to an existing repository and to specify their metadata using a predefined schema.

Native mobile clients for accessing repositories

Federated Authentication (implementation for mobile appliances)
1. Get available IDPs (Science Gateway)
2. Supported IDPs list
3. Open WebView
4. Extract Shibboleth token from response header
Now you can issue any API call to the gLibrary REST API (glibrary.ct.infn.it).
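The last step of the flow above, extracting the token and reusing it for API calls, might look roughly like the sketch below. It assumes the token arrives as a cookie whose name starts with `_shibsession_` and is replayed as a `Cookie` header; the slide does not say which header carries the token, so the header handling, the function names and the sample values are all hypothetical.

```python
def extract_shibboleth_token(headers):
    """Pull a session token out of a list of (name, value) response headers.

    Assumption: the token is a cookie whose name starts with '_shibsession_'.
    Returns the 'name=value' pair, or None if no such cookie is present.
    """
    for name, value in headers:
        if name.lower() != "set-cookie":
            continue
        cookie = value.split(";", 1)[0].strip()  # drop attributes such as path
        if cookie.startswith("_shibsession_"):
            return cookie
    return None


def authenticated_headers(token):
    """Headers for later REST calls, replaying the token as a cookie (assumed scheme)."""
    return {"Cookie": token}


# Made-up response headers, as a WebView might report them after the IdP login.
response_headers = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "_shibsession_abc123=def456; path=/; HttpOnly"),
]
token = extract_shibboleth_token(response_headers)
print(authenticated_headers(token))
```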

De Roberto Digital Repository from iPhone
Demo presented at the EGEE UF5 in Uppsala. Some screenshots here [1]; YouTube video here [2].

Data Management APIs for accessing Grid & Cloud Object Storages

Data Management APIs / Download from Grid SE

Data Management APIs / Upload to Grid SE (1/3)

Data Management APIs / Upload to Grid SE (2/3)

Data Management APIs / Upload to Grid SE (3/3)

Data Management APIs / Download from Cloud Object Storage

Data Management APIs / Upload to Cloud Object Storage

Repository Management for interacting with the Digital Repository System

Application’s workflow (diagram). Actions shown: Browse digital assets; run jobs; Register analytics on repository; Register DOI. Components shown: Science Gateway; HPC Clusters; Metadata Service; eToken service; glibrary.ct.infn.it REST API; User Tracking DB; Local storage; Grid storage; Cloud Storage.

Application’s workflow (diagram). Actions shown: Get digital assets; run jobs. Components shown: Science Gateway; HPC Clusters; Metadata Service; eToken service; glibrary.ct.infn.it REST API; User Tracking DB; Local storage; Grid storage; Cloud Storage.

Summary and conclusions
gLibrary aims to provide a simple framework to manage digital assets on distributed storage, hiding the details of the underlying technical infrastructure.
Current features:
- REST APIs to access the available digital assets
- Security: support for federated authentication
- Usability: several gLibrary front ends for web and mobile scenarios

Thank you!