HIVE: Enabling Common Language and Interdisciplinarity. EPA-NIEHS Advancing Environmental Health Data Sharing and Analysis: Finding a Common Language, June 25, 2013.

Presentation transcript:

HIVE: Enabling Common Language and Interdisciplinarity
EPA-NIEHS Advancing Environmental Health Data Sharing and Analysis: Finding a Common Language
June 25, 2013
Jane Greenberg, Professor, SILS; Director, SILS Metadata Research Center

Overview
• Languages of aboutness
• Ontology
• Vocabulary challenge(s) re … scientific data
• HIVE: Helping Interdisciplinary Vocabulary Engineering
• Conclusions, Q & A

Languages for aboutness
• A language: a systematic arrangement of concepts
• What makes a language systematic?
• What makes an indexing language systematic?
• Advantages & disadvantages
  – Discovery
  – Communication
  – Interoperability
  – Browsing, serendipity
  – Context, grouping
  – Overview of the scope of a service
  – Partitioning / segmenting (facets)
  – Multilingual access
  – Known by users
  – Machine processing
  – Costly
  – Stagnant / difficulty in adding new concepts

(McGuinness, D. L. (2003). Ontologies Come of Age. In Fensel et al., Spinning the Semantic Web (Cambridge, MIT Press), pp [see also, p ])

Vocabulary challenge(s) and scientific data management
Research challenge: apply standard vocabulary terms to data in collections to improve organization and discovery
Applications needed to…
• Help researchers select appropriate terms for describing data sets
• Integrate terminology selection with data ingestion tools
• Apply standard vocabularies and not reinvent the wheel
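
The first application need above (helping researchers select terms) can be illustrated with a minimal sketch. It assumes tiny hypothetical vocabularies ("VocabA", "VocabB") and a simple phrase-lookup strategy; it is not HIVE's actual indexing method, and the function name `suggest_terms` is invented for illustration.

```python
import re

# Hypothetical mini-vocabularies; real ones (e.g. LCSH, MeSH) are far larger.
VOCABULARIES = {
    "VocabA": {"wood density", "plant traits", "botany"},
    "VocabB": {"wood", "seeds", "plant traits"},
}

def suggest_terms(text, vocabularies, max_words=3):
    """Return {vocabulary name: sorted list of its terms found in text}."""
    words = re.findall(r"[a-z]+", text.lower())
    # Collect every run of 1..max_words consecutive words as a candidate phrase.
    phrases = {
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    }
    # A term is suggested when it appears verbatim among the candidate phrases.
    return {
        name: sorted(terms & phrases)
        for name, terms in vocabularies.items()
    }

description = "Wood density and other plant traits measured across species"
suggestions = suggest_terms(description, VOCABULARIES)
for name, terms in suggestions.items():
    print(name, terms)
```

Keeping the suggestions grouped by vocabulary lets the person describing the data see which source each term comes from before choosing.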

HIVE model
• Approach for integrating discipline controlled vocabularies (CVs)
• Model addressing CV cost, interoperability, and usability constraints (interdisciplinary environment)

Results from study with 600 keywords
• 431 topical terms, exact matches: NBII Thesaurus, 25%; MeSH, 18%
• 531 terms (topical terms, research method and taxon): LCSH, 22% found exact matches, 25% partial
Conclusion: Need multiple vocabularies
Dryad…nonprofit organization and an international repository of data underlying scientific and medical publications
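
As a rough illustration of how such match rates might be computed, here is a minimal sketch. The keywords and vocabularies are toy data, not the study's; the definition of a "partial" match (sharing at least one word with a vocabulary term) is an assumption, since the slide does not define it.

```python
def classify(keyword, vocabulary):
    """Classify a keyword as an 'exact', 'partial', or 'none' match."""
    key = keyword.lower().strip()
    if key in vocabulary:
        return "exact"
    key_words = set(key.split())
    # Assumed definition of "partial": shares at least one word with a term.
    if any(key_words & set(term.split()) for term in vocabulary):
        return "partial"
    return "none"

def match_rates(keywords, vocabulary):
    """Return the fraction of keywords in each match class."""
    counts = {"exact": 0, "partial": 0, "none": 0}
    for kw in keywords:
        counts[classify(kw, vocabulary)] += 1
    total = len(keywords)
    if total == 0:
        return {kind: 0.0 for kind in counts}
    return {kind: n / total for kind, n in counts.items()}

# Toy data for illustration only.
keywords = ["Seed size", "wood density", "leaf area", "phylogeny"]
vocab_a = {"wood density", "seeds", "leaves"}
vocab_b = {"phylogeny", "seed size", "leaf morphology"}

rates_a = match_rates(keywords, vocab_a)
rates_b = match_rates(keywords, vocab_b)
combined = match_rates(keywords, vocab_a | vocab_b)
print(rates_a, rates_b, combined)
```

In this toy case each vocabulary alone covers only part of the keywords, while the union covers more, which is the pattern behind the slide's conclusion.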

Meet Amy Zanne. She is a botanist. Like every good scientist, she publishes, and she deposits data in Dryad.
Amy’s data

About HIVE…
Goal
• Provide efficient, affordable, interoperable, and user-friendly access to multiple vocabularies during metadata creation activities
• Present a model and an approach that can be replicated -> not necessarily a service
Plan
• Build
• Plan
• Evaluate
Vocabulary Partners
• Library of Congress: LCSH
• Getty Research Institute (GRI): TGN (Thesaurus of Geographic Names)
• United States Geological Survey (USGS): NBII Thesaurus, Integrated Taxonomic Information System (ITIS)
• National Library of Medicine and the National Agricultural Library
• FAO
Workshop Hosts
• Columbia Univ.
• Univ. of California, San Diego
• George Washington University
• Univ. of North Texas
• Universidad Carlos III de Madrid, Madrid, Spain

HIVE Team
Craig Willis, Bob Losee, Lee Richardson, Hollie White, Jane Greenberg, Madhura Marathe, Lina Huang, José R. P. Agüera, Ryan Scherle

HIVE in LTER, Dryad,…
• Library of Congress Web Archives Minerva project
• Smithsonian Field Notebook project
• US Geological Survey, USGS Thesaurus
• Universidad Carlos III de Madrid (UC3M)
• Inst. Legal Information Theory & Techniques, NRC, Italy

HIVE/iRODS Integration
[Diagram components: HIVE System, iRODS Metadata Catalog, iDrop Web UI, HIVE Indexer, iDrop SPARQL Search]
User uses SPARQL for rich metadata queries, displaying links to DFC files and collections.
Demo web2
1. Search HIVE
2. Index with HIVE
3. Query via HIVE
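
The three steps (search, index, query) can be sketched with an in-memory set of triples. This is a stand-in for illustration only: it uses plain pattern matching rather than SPARQL, and every identifier, predicate name, and file path below is hypothetical rather than taken from HIVE or iRODS.

```python
# Toy triples (subject, predicate, object); all identifiers are hypothetical.
triples = {
    ("vocab:c1", "prefLabel", "wood density"),
    ("vocab:c2", "prefLabel", "seed mass"),
}

def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def search(store, text):
    """Step 1, search: find concepts whose label contains the text."""
    return sorted(
        s for s, _, label in match(store, p="prefLabel")
        if text.lower() in label
    )

def index(store, file_path, concept):
    """Step 2, index: attach a chosen concept to a file as metadata."""
    store.add((file_path, "subject", concept))

def files_about(store, concept):
    """Step 3, query: list files tagged with a given concept."""
    return sorted(s for s, _, _ in match(store, p="subject", o=concept))

hits = search(triples, "seed")
index(triples, "/zone/home/amy/seeds.csv", hits[0])
index(triples, "/zone/home/amy/wood.csv", "vocab:c1")
result = files_about(triples, "vocab:c2")
print(result)
```

Storing vocabulary concepts and file annotations in the same triple structure is what lets a single query language link terms to files and collections.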

HIVE Across the US DataNets
Survey: a framework for studying controlled vocabulary use across all DataNets
1. Which controlled vocabularies?
2. Purposes that these controlled vocabularies serve (e.g. subject description of datasets or description of analytical processes or protocols that have been applied to certain datasets)
3. Facilitators and inhibitors of controlled vocabulary use by data contributors, curators, NSF DataNet Partner administrators, and repository infrastructure developers

Conclusions
• Controlled vocabularies encourage consistent classification of data
  – With DFC (DataNet Federation Consortium) we’ll be addressing findability of data on distributed grids
• HIVE (or the HIVE approach) allows users to search and apply terms from multiple vocabularies
• Common languages can be generated in different ways
  – Emphasize the benefits, and reduce the limitations
Acknowledgements: Many people, students, IMLS, NSF, etc.

Technical overview and architecture
• HIVE combines several open-source technologies to provide a framework for vocabulary services.
• Java-based web services can run in any Java application server.
• Demonstration: RENCI and NESCent
• Open-source: Google Code ( mrc/)