Doron Goldfarb & Yann LE FRANC


1 Enhancing the discoverability and inter-operability of multi-disciplinary semantic repositories
Doron Goldfarb & Yann LE FRANC
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No

2 Question: Why a multitude of repositories, why not a single implementation for all?
Answer from "Normalized access to ontology repositories", Viljanen et al.:
x) Different ontologies and user needs require different functionalities
x) Some ontologies are not available as files but only as services
x) Security/business reasons -> internal ontology repositories
x) Informal ontologies are not really in repositories

3 Problem: Where to find semantic RESOURCES
Semantic Repositories (Ontology Libraries, -Registries [d'Aquin&Noy, 2012])

4 GROWING LANDSCAPE OF SEMANTIC REPOSITORIES
LOV

5 POSES CHALLENGES for various applications
Semantic Annotation
  Where can I find the most suitable concept for annotating my data?
  Example: B2Note
Creation/Maintenance of own semantic resources
  Are there existing resources covering concepts relevant to my domain?
  Example: Environmental Thesaurus
In which repository can relevant resources be found?

6 SOLUTION: AGGREGATION OF SEMANTIC REPOSITORIES
LOV
Aggregator
Search, Align, etc.

7 DISTRIBUTED VS CENTRALIZED ACCESS
BioPortal API: Search "Human"
AgroPortal API: Search "Human"
EBI-OLS API: Search "Human"
FINTO/SKOSMOS API: Search "Human"
LOV API: Search "Human"
Aggregator: Search "Human"

8 distributed VS CENTRALIZED ACCESS
BioPortal API: Search "Human"
AgroPortal API: Search "Human"
EBI-OLS API: Search "Human"
FINTO/SKOSMOS API: Search "Human"
LOV API: Search "Human"
Aggregator: Search "Human"
Normalized Ontology Repositories (NOR)

9 distributed VS CENTRALIZED ACCESS
LOV
Retrieve extracts from available info about all concepts in all ontologies in all repositories
Aggregator
Store extracts in database
Search index

10 distributed VS CENTRALIZED ACCESS
LOV
Retrieve extracts from available info about all concepts in all ontologies in all repositories
Aggregator
Store extracts in database
Semantic Service
"Manual" search
Cross Repository Analytics (Concept Reuse, etc.)
Search index
Ranking
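
The slide names a search index and a ranking step but does not say how either is built. The following is a minimal sketch under stated assumptions: the record keys (`uri`, `label`, `reused_by`), the example URIs and the rule "rank by number of reusing ontologies" are all hypothetical illustrations, not the aggregator's actual design.

```python
from collections import defaultdict

# Hypothetical concept extracts; keys and values are illustrative only.
extracts = [
    {"uri": "http://example.org/onto-a/Human", "label": "Human", "reused_by": ["A", "B", "C"]},
    {"uri": "http://example.org/onto-b/Human", "label": "human", "reused_by": ["B"]},
    {"uri": "http://example.org/onto-a/Mouse", "label": "Mouse", "reused_by": ["A"]},
]


def build_index(records):
    """Map each lower-cased label token to the records that contain it."""
    index = defaultdict(list)
    for record in records:
        for token in record["label"].lower().split():
            index[token].append(record)
    return index


def search(index, query):
    """Return matches for a single-token query, most widely reused first."""
    hits = index.get(query.lower(), [])
    return sorted(hits, key=lambda r: len(r["reused_by"]), reverse=True)


index = build_index(extracts)
results = search(index, "Human")
```

With this toy data, a search for "Human" returns both "Human" concepts, with the one reused by three ontologies ranked ahead of the one reused by a single ontology.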

11 distributed VS CENTRALIZED
Fields stored per concept extract:
  Uris: URI of the ontology class.
  Labels: Human-readable label of the ontology class.
  Description: Definition of the ontology class.
  Short_form: Short form of the ontology class.
  Synonyms: List of synonym labels referenced for the ontology class.
  Ontology_acronym: Acronym of the ontology the class pertains to.
  Ontology_iri: IRI of the ontology the class pertains to.
  Ontology_name: Name of the ontology the class pertains to.
  Ontology_vdate: Date of most recent version of the ontology.
  Ontology_version: Version ID of most recent version of the ontology.
  Acrs_of_ontologies_reusing_uri: List of acronyms for the ontologies reusing the class.
  Domains: Scientific domain covered by the ontology.
LOV
Retrieve extracts from available info about all concepts in all ontologies in all repositories
Local resource: Concept 1, Concept 2, Concept 3, ..., Concept N
Aggregator
Store extracts in database
Repeated queries
Cross Repository Analytics (Concept Reuse, etc.)
Search index
Ranking
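
The field list above could be modelled as a simple record type. This is a sketch only: the field names are taken from the slide (lower-cased), while the Python types, the defaults, the `reuse_count` helper and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConceptExtract:
    """One stored extract per ontology class; types and defaults are assumed."""

    uris: str
    labels: str
    description: str = ""
    short_form: str = ""
    synonyms: List[str] = field(default_factory=list)
    ontology_acronym: str = ""
    ontology_iri: str = ""
    ontology_name: str = ""
    ontology_vdate: str = ""
    ontology_version: str = ""
    acrs_of_ontologies_reusing_uri: List[str] = field(default_factory=list)
    domains: str = ""

    def reuse_count(self) -> int:
        """Hypothetical helper: number of ontologies reusing this class."""
        return len(self.acrs_of_ontologies_reusing_uri)


# Illustrative values only; none of them come from a real repository.
example = ConceptExtract(
    uris="http://example.org/onto/Human",
    labels="Human",
    ontology_acronym="EX",
    acrs_of_ontologies_reusing_uri=["EX2", "EX3"],
)
```

A field such as `acrs_of_ontologies_reusing_uri` is what would make cross-repository analytics like concept reuse possible from the stored extracts alone.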

12 CHALLENGE: HETEROGENEOUS REPOSITORY APIs
Different Query/Response Syntaxes
  Example: Heterogeneous version information (also within one and the same repository)
Desired information not always available
  Example: Currently only Ontology level information available
Different combinations of queries necessary
  Example: Retrieve all ontologies, for each ontology three additional calls: version, category, terms
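
The last example (one listing call, then three further calls per ontology) can be sketched as follows. The endpoint paths, the `acronym` key and the stubbed responses are hypothetical; no real repository API is being described, and the `fetch` function is injected so the sketch runs without network access.

```python
from typing import Any, Callable, Dict, List


def harvest(fetch: Callable[[str], Any]) -> List[Dict[str, Any]]:
    """List all ontologies, then issue three more calls for each one."""
    harvested = []
    for onto in fetch("/ontologies"):
        acronym = onto["acronym"]
        harvested.append({
            "acronym": acronym,
            "version": fetch(f"/ontologies/{acronym}/version"),
            "category": fetch(f"/ontologies/{acronym}/category"),
            "terms": fetch(f"/ontologies/{acronym}/terms"),
        })
    return harvested


# Stub standing in for a repository; paths and payloads are invented.
FAKE_API = {
    "/ontologies": [{"acronym": "EX"}],
    "/ontologies/EX/version": "1.0",
    "/ontologies/EX/category": "example",
    "/ontologies/EX/terms": ["Human", "Mouse"],
}
calls: List[str] = []


def fake_fetch(path: str) -> Any:
    calls.append(path)  # record the call sequence
    return FAKE_API[path]


result = harvest(fake_fetch)
```

For N ontologies this pattern costs 1 + 3N requests, which is one reason the call sequence differs so much between repositories that expose the same information in fewer calls.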

13 CHALLENGE: API CALL SEQUENCES

14 DIFFERENT APPROACHES
"Plug-in"/Wrapper style: Create individual retrieval logic for each repository
  Flexible but technically challenging
Find high-level "description language" for available APIs
  Less effort for integration, but less expressive
Foster common standards for ontology/concept metadata and for repository APIs
  Common Metadata Description
  Common API Framework
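
As an illustration of the first approach, a plug-in/wrapper design might give each repository its own class behind a shared interface. Everything here is an assumption: the class names, the two response shapes and the common `{uri, label, source}` output are invented for the sketch, and responses are passed in directly instead of being fetched.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List


class RepositoryWrapper(ABC):
    """Common interface that every repository-specific plug-in implements."""

    name: str = "unnamed"

    def __init__(self, payload: Dict[str, Any]) -> None:
        # A real plug-in would call the repository API; here the response
        # is handed in so the sketch runs offline.
        self._payload = payload

    @abstractmethod
    def concepts(self) -> Iterator[Dict[str, str]]:
        """Yield concepts in the shared {uri, label, source} shape."""


class RepoAWrapper(RepositoryWrapper):
    name = "RepoA"

    def concepts(self) -> Iterator[Dict[str, str]]:
        for item in self._payload["collection"]:
            yield {"uri": item["@id"], "label": item["prefLabel"], "source": self.name}


class RepoBWrapper(RepositoryWrapper):
    name = "RepoB"

    def concepts(self) -> Iterator[Dict[str, str]]:
        for doc in self._payload["docs"]:
            yield {"uri": doc["iri"], "label": doc["label"], "source": self.name}


def aggregate(wrappers: List[RepositoryWrapper]) -> List[Dict[str, str]]:
    """Collect the normalized concepts from all plug-ins."""
    return [concept for wrapper in wrappers for concept in wrapper.concepts()]


wrappers = [
    RepoAWrapper({"collection": [{"@id": "http://example.org/a/Human", "prefLabel": "Human"}]}),
    RepoBWrapper({"docs": [{"iri": "http://example.org/b/Human", "label": "human"}]}),
]
merged = aggregate(wrappers)
```

The flexibility comes from each wrapper being free to use any retrieval logic; the cost is that every new repository needs new code.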

15 PROOF OF CONCEPT APPROACH: Specify API via JSONPath
{
  "_comment": "From
  "repo": { "name": "Bioportal" },
  "ontologies": {
    "url": "
    "next": "links.nextPage",
    "ontolist": "$",
    "ontourl":
    "ontoprefix": "acronym",
    "ontoname": "name",
    "ext1": {
      "url": "
      "token": "<ONTOID>",
      "input": "ontoprefix",
      "fields": {
        "ontoversion": "version",
        "ontovdate": "released"
      }
    }
  },
  "terms": {
    "url": "
    "ontotoken": "<ONTOID>",
    "termlist": "collection",
    "termid":
    "label": "prefLabel",
    "description": "definition",
    "synonyms": "synonym",
    "obsolete": "obsolete"
  }
}

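One way such a declarative specification could drive extraction is sketched below. It is deliberately not a JSONPath implementation: `resolve` only understands `$` (the whole document) and dotted key paths such as `links.nextPage`. The spec keys `ontolist`, `ontoprefix` and `ontoname` and their values come from the slide; the function names and the sample response are hypothetical.

```python
from typing import Any, Dict, List


def resolve(data: Any, path: str) -> Any:
    """Resolve "$" (whole document) or a dotted key path like "links.nextPage"."""
    if path == "$":
        return data
    for key in path.split("."):
        data = data[key]
    return data


def extract_ontologies(response: Any, spec: Dict[str, str]) -> List[Dict[str, Any]]:
    """Apply the spec's paths to every entry of the ontology list."""
    return [
        {
            "ontoprefix": resolve(item, spec["ontoprefix"]),
            "ontoname": resolve(item, spec["ontoname"]),
        }
        for item in resolve(response, spec["ontolist"])
    ]


# Spec values as shown on the slide; the response below is invented.
spec = {"ontolist": "$", "ontoprefix": "acronym", "ontoname": "name"}
response = [
    {"acronym": "EX1", "name": "Example Ontology One"},
    {"acronym": "EX2", "name": "Example Ontology Two"},
]
ontologies = extract_ontologies(response, spec)
next_page = resolve({"links": {"nextPage": "page-2"}}, "links.nextPage")
```

Supporting a new repository then means writing a new spec instead of new code, at the price of only handling what the path language can express.
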
16 Proof of concept: Aggregate BioPortal, AgroPortal and EBI-OLS
              AgroPortal (63/64)     BioPortal (534/586)    EBI-OLS (189/193)
Total         1,198,472/1,200,845    7,569,311/8,130,580    4,893,030/4,894,758
Unique URI    1,186,681              6,659,704              4,235,425
Unq. Label    1,122,242              5,379,485              3,938,468
Instances missing in statistics
  Not as easy to harvest as classes (at least in BioPortal/AgroPortal)
  Less structured information available
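
Counts like "Total", "Unique URI" and "Unq. Label" could be derived from harvested records roughly as below. This is a sketch with invented records; the slides do not say how URIs or labels were compared, so exact string equality is an assumption.

```python
from typing import Dict, List

# Invented records; a real harvest would hold millions of these.
records = [
    {"uri": "http://example.org/a/1", "label": "Human"},
    {"uri": "http://example.org/a/1", "label": "Human"},
    {"uri": "http://example.org/b/2", "label": "Human"},
    {"uri": "http://example.org/b/3", "label": "Mouse"},
]


def summarize(rows: List[Dict[str, str]]) -> Dict[str, int]:
    """Count all records, distinct URIs and distinct labels (exact match)."""
    return {
        "total": len(rows),
        "unique_uri": len({row["uri"] for row in rows}),
        "unique_label": len({row["label"] for row in rows}),
    }


stats = summarize(records)
```

The gap between the total and the unique URI count hints at how often the same class is listed by several ontologies.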

17 Proof of concept: Aggregate BioPortal, AgroPortal and EBI-OLS
Same ontology, different acronyms:
  "Beta Cell Genomics Ontology": "bcgo" (EBI-OLS), "obi_bcgo" (BioPortal)
Same acronym, different ontologies:
  "aeo": "Agricultural Experiments Ontology" (AgroPortal), "Anatomical Entity Ontology" (EBI-OLS/BioPortal)
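
Both kinds of mismatch in these examples can be found mechanically once acronyms and names from all repositories sit in one place. The sketch below uses only the acronyms and ontology names quoted on the slide; the tuple layout and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (repository, acronym, ontology name), taken from the slide's examples.
entries = [
    ("EBI-OLS", "bcgo", "Beta Cell Genomics Ontology"),
    ("BioPortal", "obi_bcgo", "Beta Cell Genomics Ontology"),
    ("AgroPortal", "aeo", "Agricultural Experiments Ontology"),
    ("EBI-OLS", "aeo", "Anatomical Entity Ontology"),
    ("BioPortal", "aeo", "Anatomical Entity Ontology"),
]


def ambiguous(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group values by key and keep only keys with more than one value."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return {key: values for key, values in groups.items() if len(values) > 1}


# One acronym that names several ontologies.
names_by_acronym = ambiguous((acr, name) for _, acr, name in entries)
# One ontology known under several acronyms.
acronyms_by_name = ambiguous((name, acr) for _, acr, name in entries)
```

Here `names_by_acronym` flags "aeo" and `acronyms_by_name` flags "Beta Cell Genomics Ontology", so neither acronym nor name alone is a safe identifier across repositories.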

20 Conclusions
A centralized index of concept extracts from multiple resources in multiple repositories enables:
  Central access to multi-disciplinary concepts/terminology for semantic services
  Identification/cross-repo search of unique and overlapping resources in different repositories
  Global overview on concept-reuse -> improved ranking of search results -> increased re-use
Open Challenges:
  Scale
  Versioning
  Common metadata standards
  Common API framework

21 Conclusions
Need for international collaboration

22 Conclusions
RDA Vocabulary and Semantic Service Interest Group
  Based on effort initiated by EUDAT Semantic Web Working Group
  Open community of different stakeholders (Repositories, Research communities, etc.)
  Task forces on different topics:
    Strategies for aggregating vocabularies
    Vocabulary API White paper
    Ontology metadata standard
    Ontology Governance: Requesting changes
    Strategies for selecting from vocabularies

23 Thank YOU
Contact & Information
Doron Goldfarb, Environment Agency Austria
Yann le Franc, e-Science Data Factory

