Doron Goldfarb & Yann LE FRANC

Presentation transcript:

Enhancing the discoverability and inter-operability of multi-disciplinary semantic repositories
Doron Goldfarb & Yann LE FRANC
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065

Question: Why is there a multitude of repositories rather than a single implementation for all?
Answer from "Normalized access to ontology repositories", Viljanen et al.:
- Different ontologies and user needs require different functionalities
- Some ontologies are not available as a file but only as a service
- Security/business reasons -> internal ontology repositories
- Informal ontologies are not really held in repositories

Problem: where to find semantic resources
Semantic repositories (ontology libraries, ontology registries [d'Aquin & Noy, 2012])

Growing landscape of semantic repositories
[Figure: repository logos, including LOV]

Poses challenges for various applications
- Semantic annotation: Where can I find the most suitable concept for annotating my data? Example: B2Note
- Creation/maintenance of own semantic resources: Are there existing resources covering concepts relevant to my domain? Example: Environmental Thesaurus
- In which repository can relevant resources be found?

Solution: aggregation of semantic repositories
[Diagram: repositories (e.g. LOV) feeding an aggregator that offers search, alignment, etc.]

Distributed vs centralized access
- Distributed: search "Human" separately through each repository API (BioPortal API, AgroPortal API, EBI-OLS API, FINTO/SKOSMOS API, LOV API)
- Centralized: search "Human" once through the aggregator
- Normalized Ontology Repositories (NOR)
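The difference between the two access patterns can be sketched roughly as follows. This is only an illustration with in-memory stand-ins: `RepositoryClient`, `Aggregator`, the repository names and the example.org URIs are hypothetical and do not reflect the real repository APIs.

```python
from dataclasses import dataclass


@dataclass
class RepositoryClient:
    """Hypothetical stand-in for one repository's search API."""
    name: str
    concepts: dict  # label -> URI (sample data only)
    calls: int = 0  # counts simulated API calls

    def search(self, term):
        self.calls += 1
        needle = term.lower()
        return [(self.name, label, uri)
                for label, uri in self.concepts.items()
                if needle in label.lower()]


def search_distributed(term, clients):
    """Distributed access: one API call per repository, merged by the caller."""
    hits = []
    for client in clients:
        hits.extend(client.search(term))
    return hits


class Aggregator:
    """Centralized access: harvest once, then answer searches locally."""

    def __init__(self, clients):
        self.rows = []
        for client in clients:
            # An empty search term matches every label, i.e. a full harvest.
            self.rows.extend(client.search(""))

    def search(self, term):
        needle = term.lower()
        return [row for row in self.rows if needle in row[1].lower()]


clients = [
    RepositoryClient("RepoA", {"Human": "http://example.org/a/human"}),
    RepositoryClient("RepoB", {"human being": "http://example.org/b/0001",
                               "Mouse": "http://example.org/b/0002"}),
]
distributed_hits = search_distributed("Human", clients)
aggregator = Aggregator(clients)
central_hits = aggregator.search("Human")
```

In this sketch every distributed search costs one call per repository, whereas the aggregator pays the harvesting cost once and serves later searches without touching the repositories.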

Distributed vs centralized access
Aggregator:
- Retrieve extracts from available info about all concepts in all ontologies in all repositories (e.g. LOV)
- Store extracts in database
- Search index and ranking
Uses of the aggregated data:
- Semantic service
- "Manual" search
- Cross-repository analytics (concept reuse, etc.)
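A minimal sketch of the "store extracts in database" and "search index" steps, assuming an in-memory SQLite table and a simple token-based inverted index. The table layout, function names and sample rows are illustrative assumptions, not the implementation used in the proof of concept.

```python
import sqlite3
from collections import defaultdict


def build_store(extracts):
    """Store (uri, label, ontology) extracts in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE extract (uri TEXT, label TEXT, ontology TEXT)")
    conn.executemany("INSERT INTO extract VALUES (?, ?, ?)", extracts)
    return conn


def build_index(conn):
    """Inverted index: lower-cased label token -> set of row ids."""
    index = defaultdict(set)
    for rowid, label in conn.execute("SELECT rowid, label FROM extract"):
        for token in label.lower().split():
            index[token].add(rowid)
    return index


def search(conn, index, query):
    """Return the rows whose label contains every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return []
    ids = set.intersection(*(index.get(token, set()) for token in tokens))
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    sql = ("SELECT uri, label, ontology FROM extract "
           f"WHERE rowid IN ({marks}) ORDER BY rowid")
    return conn.execute(sql, sorted(ids)).fetchall()


# Sample extracts (hypothetical URIs and ontology acronyms).
conn = build_store([
    ("http://example.org/x/1", "human heart", "X"),
    ("http://example.org/y/2", "Human", "Y"),
    ("http://example.org/y/3", "mouse heart", "Y"),
])
index = build_index(conn)
human_rows = search(conn, index, "human")
```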

Distributed vs centralized access
Fields stored per concept extract:
- Uris: URI of the ontology class.
- Labels: Human-readable label of the ontology class.
- Description: Definition of the ontology class.
- Short_form: Short form of the ontology class.
- Synonyms: List of synonym labels referenced for the ontology class.
- Ontology_acronym: Acronym of the ontology the class pertains to.
- Ontology_iri: IRI of the ontology the class pertains to.
- Ontology_name: Name of the ontology the class pertains to.
- Ontology_vdate: Date of the most recent version of the ontology.
- Ontology_version: Version ID of the most recent version of the ontology.
- Acrs_of_ontologies_reusing_uri: List of acronyms of the ontologies reusing the class.
- Domains: Scientific domain covered by the ontology.
Aggregator:
- Retrieve extracts from available info about all concepts in all ontologies in all repositories (e.g. LOV)
- Store extracts in database as a local resource (Concept 1, Concept 2, Concept 3, ..., Concept N)
- Search index and ranking
- Repeated queries
- Cross-repository analytics (concept reuse, etc.)
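The extract fields listed above could be modelled as a small record type, sketched below. The field names follow the slide (lower-cased); the Python types, the default values and the rule used to fill `acrs_of_ontologies_reusing_uri` (every other ontology in which the same URI occurs) are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ConceptExtract:
    uris: str                      # URI of the ontology class
    labels: str                    # human-readable label
    ontology_acronym: str          # acronym of the containing ontology
    description: str = ""
    short_form: str = ""
    synonyms: list = field(default_factory=list)
    ontology_iri: str = ""
    ontology_name: str = ""
    ontology_vdate: str = ""
    ontology_version: str = ""
    acrs_of_ontologies_reusing_uri: list = field(default_factory=list)
    domains: list = field(default_factory=list)


def fill_reuse(extracts):
    """Assumed rule: a URI is 'reused' by every other ontology it appears in."""
    acronyms_by_uri = defaultdict(set)
    for extract in extracts:
        acronyms_by_uri[extract.uris].add(extract.ontology_acronym)
    for extract in extracts:
        others = acronyms_by_uri[extract.uris] - {extract.ontology_acronym}
        extract.acrs_of_ontologies_reusing_uri = sorted(others)
    return extracts


# Hypothetical sample: one URI shared by ontologies "A" and "B".
a = ConceptExtract("http://example.org/c/1", "human", "A")
b = ConceptExtract("http://example.org/c/1", "Human", "B")
c = ConceptExtract("http://example.org/c/2", "mouse", "A")
fill_reuse([a, b, c])
```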

Challenge: heterogeneous repository APIs
- Different query/response syntaxes. Example: heterogeneous version information (also within one and the same repository)
- Desired information not always available. Example: currently only ontology-level information available
- Different combinations of queries necessary. Example: retrieve all ontologies, then for each ontology three additional calls: version, category, terms

Challenge: API call sequences
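The call sequence from the example (one listing call, then three additional calls per ontology for version, category and terms) might look like the sketch below. The `fetch` callable, the paths and the response shapes are hypothetical placeholders, not a real repository API.

```python
def harvest(fetch):
    """Run the call sequence: list ontologies, then 3 follow-up calls each.

    `fetch(path)` is a hypothetical function returning a parsed response.
    Returns the harvested records and the number of calls made.
    """
    calls = 0
    ontologies = fetch("/ontologies")
    calls += 1
    records = []
    for onto in ontologies:
        onto_id = onto["id"]
        record = {"id": onto_id}
        for part in ("version", "category", "terms"):
            record[part] = fetch(f"/ontologies/{onto_id}/{part}")
            calls += 1
        records.append(record)
    return records, calls


# Canned responses standing in for a repository (sample data only).
responses = {"/ontologies": [{"id": "A"}, {"id": "B"}]}
for onto_id in ("A", "B"):
    responses[f"/ontologies/{onto_id}/version"] = "1.0"
    responses[f"/ontologies/{onto_id}/category"] = "demo"
    responses[f"/ontologies/{onto_id}/terms"] = [f"{onto_id}_term"]

records, calls = harvest(responses.__getitem__)
```

With N ontologies this pattern needs 1 + 3N calls, which is one reason the slides list scale among the open challenges.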

Different approaches
- "Plug-in"/wrapper style: create individual retrieval logic for each repository. Flexible but technically challenging.
- Find a high-level "description language" for the available APIs. Less effort for integration, but less expressive.
- Foster common standards for ontology/concept metadata and for repository APIs: common metadata description, common API framework.
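As a rough illustration of the "plug-in"/wrapper style, each repository gets its own class that maps its response format onto one common record shape. The class names and the second response shape are invented for this sketch; only the `collection`, `@id` and `prefLabel` keys are taken from the BioPortal description elsewhere in the slides.

```python
from abc import ABC, abstractmethod


class RepositoryWrapper(ABC):
    """Common interface; one subclass per repository (hypothetical design)."""

    @abstractmethod
    def concepts(self, raw):
        """Convert one raw response into a list of {'uri', 'label'} dicts."""


class CollectionStyleWrapper(RepositoryWrapper):
    # Response shape using 'collection', '@id' and 'prefLabel'.
    def concepts(self, raw):
        return [{"uri": c["@id"], "label": c["prefLabel"]}
                for c in raw["collection"]]


class TermsStyleWrapper(RepositoryWrapper):
    # Invented second shape, to show why per-repository logic is needed.
    def concepts(self, raw):
        return [{"uri": t["iri"], "label": t["label"]} for t in raw["terms"]]


raw_a = {"collection": [{"@id": "http://example.org/1", "prefLabel": "human"}]}
raw_b = {"terms": [{"iri": "http://example.org/1", "label": "human"}]}
normalized_a = CollectionStyleWrapper().concepts(raw_a)
normalized_b = TermsStyleWrapper().concepts(raw_b)
```

Each wrapper can contain arbitrary logic, which is what makes this style flexible, but every new repository requires new code.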

Proof of concept approach
Specify API via JSONPath

    {
        "_comment": "From http://data.bioontology.org/documentation",
        "repo": { "name": "Bioportal" },
        "ontologies": {
            "url": "http://data.bioontology.org/ontologies?apikey=<KEY>&format=json&pagesize=500",
            "next": "links.nextPage",
            "ontolist": "$",
            "ontourl": "@id",
            "ontoprefix": "acronym",
            "ontoname": "name",
            "ext1": {
                "url": "http://data.bioontology.org/ontologies/<ONTOID>/latest_submission?apikey=<KEY>&format=json",
                "token": "<ONTOID>",
                "input": "ontoprefix",
                "fields": { "ontoversion": "version", "ontovdate": "released" }
            }
        },
        "terms": {
            "url": "http://data.bioontology.org/ontologies/<ONTOID>/classes?apikey=<KEY>&format=json&pagesize=500",
            "ontotoken": "<ONTOID>",
            "termlist": "collection",
            "termid": "@id",
            "label": "prefLabel",
            "description": "definition",
            "synonyms": "synonym",
            "obsolete": "obsolete"
        }
    }
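A declarative description like the one above could drive a generic harvester along the following lines. This is a sketch under stated assumptions: `resolve` handles only `$` and dot-separated keys (it is not a full JSONPath implementation), `fetch` is a hypothetical callable returning parsed JSON, and the example.org URLs and canned responses are made up. Only the key names of the description are taken from the slide.

```python
def resolve(doc, path):
    """Tiny path lookup: '$' is the document itself, otherwise a
    dot-separated chain of dictionary keys. Returns None if a key is missing."""
    if path == "$":
        return doc
    for key in path.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def harvest_ontologies(spec, fetch):
    """List ontologies, following the 'next' link while one is present."""
    o = spec["ontologies"]
    url, found = o["url"], []
    while url:
        page = fetch(url)
        for item in resolve(page, o["ontolist"]) or []:
            found.append({"url": resolve(item, o["ontourl"]),
                          "prefix": resolve(item, o["ontoprefix"]),
                          "name": resolve(item, o["ontoname"])})
        url = resolve(page, o["next"])
    return found


def harvest_terms(spec, onto_id, fetch):
    """Fetch the terms of one ontology and map them to common field names."""
    t = spec["terms"]
    page = fetch(t["url"].replace(t["ontotoken"], onto_id))
    names = ("label", "description", "synonyms", "obsolete")
    return [dict({"uri": resolve(item, t["termid"])},
                 **{n: resolve(item, t[n]) for n in names})
            for item in resolve(page, t["termlist"]) or []]


spec = {
    "ontologies": {"url": "https://example.org/ontologies",
                   "next": "links.nextPage", "ontolist": "$",
                   "ontourl": "@id", "ontoprefix": "acronym",
                   "ontoname": "name"},
    "terms": {"url": "https://example.org/ontologies/<ONTOID>/classes",
              "ontotoken": "<ONTOID>", "termlist": "collection",
              "termid": "@id", "label": "prefLabel",
              "description": "definition", "synonyms": "synonym",
              "obsolete": "obsolete"},
}
responses = {  # canned sample responses instead of real HTTP calls
    "https://example.org/ontologies": [
        {"@id": "https://example.org/ontologies/DEMO",
         "acronym": "DEMO", "name": "Demo Ontology"}],
    "https://example.org/ontologies/DEMO/classes": {
        "collection": [{"@id": "https://example.org/DEMO_1",
                        "prefLabel": "human", "definition": ["A person"],
                        "synonym": ["person"], "obsolete": False}]},
}
ontologies = harvest_ontologies(spec, responses.__getitem__)
terms = harvest_terms(spec, ontologies[0]["prefix"], responses.__getitem__)
```

The trade-off named on the previous slide shows up here: adding a repository only needs a new description, but anything the small path language cannot express (such as multi-step call sequences) needs extra keys or custom code.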

Proof of concept: aggregate BioPortal, AgroPortal and EBI-OLS

                AgroPortal (63/64)     BioPortal (534/586)    EBI-OLS (189/193)
Total           1,198,472/1,200,845    7,569,311/8,130,580    4,893,030/4,894,758
Unique URI      1,186,681              6,659,704              4,235,425
Unique label    1,122,242              5,379,485              3,938,468

Instances are missing from the statistics:
- Not as easy to harvest as classes (at least in BioPortal/AgroPortal)
- Less structured information available
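Counts of the kind shown in the table (total, unique URI, unique label) can be derived from harvested extracts as sketched below. The input format and the choice to compare labels by exact string match are assumptions; the slides do not say how labels were normalized.

```python
def summarize(extracts):
    """extracts: iterable of (uri, label) pairs harvested from one repository."""
    extracts = list(extracts)
    return {
        "total": len(extracts),
        "unique_uri": len({uri for uri, _ in extracts}),
        "unique_label": len({label for _, label in extracts}),
    }


# Hypothetical sample: one URI harvested twice, one label used twice.
stats = summarize([
    ("http://example.org/1", "human"),
    ("http://example.org/2", "human"),
    ("http://example.org/1", "person"),
])
```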

Proof of concept: aggregate BioPortal, AgroPortal and EBI-OLS
- Same ontology, different acronyms: "Beta Cell Genomics Ontology" is "bcgo" in EBI-OLS and "obi_bcgo" in BioPortal
- Same acronym, different ontologies: "aeo" is the "Agricultural Experiments Ontology" in AgroPortal and the "Anatomical Entity Ontology" in EBI-OLS/BioPortal
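Clashes like the two examples above can be found mechanically once ontology metadata from several repositories sits in one place. The sketch below uses the acronyms and names from the slide as sample data; the function name and the choice to compare acronyms case-insensitively are assumptions.

```python
from collections import defaultdict


def find_conflicts(entries):
    """entries: (repository, acronym, ontology name) triples.

    Returns two dicts: ontology names registered under several acronyms,
    and acronyms that stand for several ontology names.
    """
    acronyms_by_name = defaultdict(set)
    names_by_acronym = defaultdict(set)
    for _repo, acronym, name in entries:
        acronyms_by_name[name].add(acronym.lower())
        names_by_acronym[acronym.lower()].add(name)
    same_name = {name: sorted(acrs)
                 for name, acrs in acronyms_by_name.items() if len(acrs) > 1}
    same_acronym = {acr: sorted(names)
                    for acr, names in names_by_acronym.items() if len(names) > 1}
    return same_name, same_acronym


entries = [
    ("EBI-OLS", "bcgo", "Beta Cell Genomics Ontology"),
    ("BioPortal", "obi_bcgo", "Beta Cell Genomics Ontology"),
    ("AgroPortal", "aeo", "Agricultural Experiments Ontology"),
    ("EBI-OLS", "aeo", "Anatomical Entity Ontology"),
    ("BioPortal", "aeo", "Anatomical Entity Ontology"),
]
same_name, same_acronym = find_conflicts(entries)
```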


Conclusions
A centralized index of concept extracts from multiple resources in multiple repositories enables:
- Central access to multi-disciplinary concepts/terminology for semantic services
- Identification/cross-repo search of unique and overlapping resources in different repositories
- Global overview of concept reuse -> improved ranking of search results -> increased reuse
Open challenges:
- Scale
- Versioning
- Common metadata standards
- Common API framework
Need for international collaboration
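One simple way a global view of concept reuse could feed into ranking is to order search hits by the number of ontologies reusing each concept. This is only a sketch of that idea: the hit format, the `reused_by` key and the tie-breaking by label are assumptions, and the slides do not specify a ranking formula.

```python
def rank_by_reuse(hits):
    """Order hits by reuse count (highest first), then alphabetically by label.

    hits: dicts with 'uri', 'label' and 'reused_by' (list of ontology acronyms).
    """
    return sorted(hits, key=lambda hit: (-len(hit["reused_by"]), hit["label"]))


# Hypothetical search hits for illustration.
hits = [
    {"uri": "http://example.org/1", "label": "human", "reused_by": []},
    {"uri": "http://example.org/2", "label": "Homo sapiens", "reused_by": ["A", "B"]},
    {"uri": "http://example.org/3", "label": "person", "reused_by": ["A"]},
]
ranked = rank_by_reuse(hits)
```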

Conclusions
RDA Vocabulary and Semantic Service Interest Group
https://www.rd-alliance.org/ig-vocabulary-services-rda-10th-plenary-meeting
- Based on effort initiated by the EUDAT Semantic Web Working Group
- Open community of different stakeholders (repositories, research communities, etc.)
- Task forces on different topics:
  - Strategies for aggregating vocabularies
  - Vocabulary API White paper
  - Ontology metadata standard
  - Ontology governance: requesting changes
  - Strategies for selecting from vocabularies

Thank you
Contact & information
Doron Goldfarb, Environment Agency Austria, doron.goldfarb@umweltbundesamt.at
Yann le Franc, e-Science Data Factory, ylefranc@esciencefactory.com