Semantic Web and digital archives: KLIOS and the Knowledge Base of the CHAIN-REDS project. Giuseppina Inserra, INFN Catania. Workshop of the INFN Commissione Calcolo e Reti (Computing and Networks Committee), Genoa, 30 May 2013
Outline: Introductory considerations; e-Infrastructure; The CHAIN Knowledge Base; The KLIOS Search Engine; Semantic enrichment and linked data; Application; Summary and conclusions
e-Infrastructure: “e-Infrastructure is an environment where research resources (hardware, software and content) can be readily shared and accessed where necessary to promote better and more effective research. Such environments integrate hard-, soft- and middleware components, networks, data repositories, and all sorts of support enabling virtual research collaborations to flourish globally.” (*) (*) EC-endorsed definition: http://cordis.europa.eu/ictresults/index.cfm?ID=90825&section=news&tpl=article
Information: The CHAIN Knowledge Base (www.chain-project.eu/knowledge-base). The largest e-Infrastructure-related knowledge base, with information from both the survey and other sources for more than half of the countries of the world. Categories: RREN(s), NREN, NGI, CA(s), Id.Fed(s), ROC(s), Grid site(s), Application(s)
From CHAIN to CHAIN-REDS. CHAIN ended: 30 Nov 2012. CHAIN-REDS started: 1 Dec 2012; duration: 30 months
CHAIN-REDS program for Data Infrastructures: (1) identify standards to easily gather and access both Open Access Document Repositories (OADRs) and Data Repositories (DRs); (2) build a demonstrator to easily visualise and access OADRs and DRs (both geo-views and tab-views); (3) correlate OADRs and DRs to create linked data and discover new knowledge through semantic enrichment of metadata; (4) promote Data Infrastructure standards and identify new OADRs and DRs from the regions addressed by the project (Africa, Middle-East and Gulf Region, Latin America, China, India, Far-East Asia); (5) populate the demonstrator with these new repositories, add them to the semantic enrichment tool, and set up at least two use cases from different domains
Open Access Document Repositories (OADRs): more than 33 million documents
Data Repositories (DRs): lots of data!
Extending the CHAIN-REDS KB with KLIOS. The KLIOS project is funded by a 2-year INFN grant for developing small research projects and implementing them in real-life use cases
KLIOS: Knowledge Linking and sharIng in research dOmainS - http://klios.ct.infn.it. KLIOS aims to develop an open access, participatory infrastructure for linking scientists and scientific data/information resources. The approach is based on two fundamental pillars: (1) interconnection and integration of scientific resources through a grid of metadata networks, and (2) community facilitation for scientists as well as non-experts (the «citizen scientist»)
Figure: users from different organisations, having different roles and privileges (Administrator, Common User, Researcher, Doctor), access applications (View Your Data, Semantic Search, Add Publications, ..., App N) through a standard-based (SAGA), middleware-independent Grid Engine that sits on top of the Grid
Multi-layered architecture (layers, top to bottom): Linked data semantic search; Semantic enrichment; Metadata harvesting
Multi-layered architecture (components, top to bottom): Linked-data search engine; Semantic-web enrichment; Harvesters (running on grid/cloud); OAI-PMH end-points; Data Repos. and OADRs
Metadata Harvesting. The metadata harvester is a process running on either a Grid or a Cloud infrastructure. It consists of the following steps: (1) get the address of each repository that publishes a standard OAI-PMH endpoint; (2) using that address, retrieve the related Dublin Core encoded metadata in XML format; (3) extract the records from the XML files and, using the Apache Jena API, transform the metadata into RDF format; (4) save the RDF files into a Virtuoso triple store according to an OWL-compliant ontology built using Protégé.
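The first three harvesting steps can be illustrated with a minimal sketch. It is not the project's harvester: the slides name the Apache Jena API (Java), whereas this example swaps in Python's standard library and emits N-Triples lines by hand. The repository address repo.example.org, the sample response, and the function names are assumptions made for illustration. The Virtuoso load and the OWL ontology mapping of step 4 are not shown.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url):
    """Build an OAI-PMH ListRecords request asking for Dublin Core metadata."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return base_url + "?" + query


def escape_literal(text):
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def records_to_ntriples(xml_text):
    """Turn each Dublin Core field of each record into one N-Triples line.

    Assumption: the OAI identifier is used directly as the subject IRI.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        dc = record.find(f"{{{OAI_NS}}}metadata/{{{OAI_DC_NS}}}dc")
        if identifier is None or dc is None:
            continue  # e.g. deleted records have a header but no metadata
        subject = f"<{identifier.strip()}>"
        for field in dc:
            if not field.tag.startswith(f"{{{DC_NS}}}"):
                continue  # skip anything that is not a Dublin Core element
            value = (field.text or "").strip()
            if value:
                name = field.tag.split("}", 1)[1]
                lines.append(f'{subject} <{DC_NS}{name}> "{escape_literal(value)}" .')
    return lines


# Hypothetical response of a hypothetical repository, embedded so that the
# sketch runs without network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample paper</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://repo.example.org/oai"))
    for line in records_to_ntriples(SAMPLE_RESPONSE):
        print(line)
```

A real harvester would fetch the URL returned by list_records_url, follow OAI-PMH resumption tokens for large result sets, and then load the resulting RDF into the triple store.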
Semantic Enrichment
Linked data semantic search (www.chain-project.eu/linked-data)
Linked data semantic search (www.chain-project.eu/linked-data). Preliminary! New knowledge discovery!
Summary and conclusions. Data Infrastructures are becoming an essential component of e-Infrastructures. The biggest challenge for the coming years is to uniquely correlate scientific papers with the data used to write them and with the applications used to analyse those data, so that the knowledge path can be traversed in both directions. Semantic-web and linked-data technologies can play a major role in this context, and CHAIN-REDS aims to promote these standards in the targeted regions. Managers and owners of OADRs and DRs in those regions are welcome to contact us (at proj-office@chain-project.eu) to share their data within the CHAIN Knowledge Base
Thank you!