SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data Mark Schildhauer 1,

Presentation transcript:

SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data
Mark Schildhauer (1), Shawn Bowers (2), Corina Gries (3), Deborah McGuinness (4), Philip Dibner (5), Josh Madin (6), Matt Jones (1), Luis Bermudez (7)
(1) NCEAS, UC Santa Barbara; (2) Gonzaga University; (3) NTL/LTER and Univ. of Wisconsin; (4) McGuinness & Associates; (5) OGC Interoperability Institute; (6) Macquarie University; (7) Southeastern Universities Research Association

Motivation
Need to answer increasingly complex and critical questions: What is the (local/regional/global) impact of …
- changing climate
- overfishing
- urban development, human population growth
- GMO crops, fertilization
- declining pollinators
- globalization of trade
- deforestation
On …
- food production, spread of disease, drought, global biodiversity, desertification, soil loss

Motivation
And a growing deluge of environmental data to assist in these investigations …

Motivation
But …
- locating desired information is already quite difficult …
  - Culling through irrelevant information (precision)
  - Failing to find useful information (recall)
- using the data you find is problematic …
  - Proper interpretation (units, context, methods)
  - Merging, transforming for re-use
  - Manual, ad-hoc, arduous
… Why?
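The two search-quality terms on this slide can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the dataset identifiers are hypothetical and do not come from the presentation. Precision is the share of retrieved items that are relevant; recall is the share of relevant items that were retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that the search returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical dataset identifiers, for illustration only.
retrieved = ["ds1", "ds2", "ds3", "ds4"]  # what the search returned
relevant = ["ds2", "ds4", "ds7"]          # what the scientist actually needed
precision, recall = precision_recall(retrieved, relevant)
```

In this made-up case two of the four returned datasets are relevant, and one relevant dataset is missed entirely, which mirrors the "culling" and "failing to find" problems named on the slide.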

Motivation
Environmental data are highly heterogeneous …
- Variable syntax (csv, xls, netCDF)
- Structures (tables, rasters, vectors, hierarchical)
- Semantics (terminology, units, methods)
Encompassing many disciplines:
- Biotic data: genomics, cellular, physiology, morphology, biodiversity, populations, communities, ecosystems
- Abiotic data: hydrology, geospatial, oceanography, atmospheric, soil, geology, etc.
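As a rough illustration of why semantic heterogeneity (terminology, units) makes merging hard, the sketch below harmonizes two records that report the same kind of measurement under different column names and units. All names here (`ANNOTATIONS`, `TO_METERS`, `harmonize`, the column labels) are assumptions made for the example, not part of SONet or any of its tools.

```python
# Hypothetical annotations: each local column name is mapped to a shared
# term and the unit its values are recorded in (illustrative only).
ANNOTATIONS = {
    "ht_cm": ("height", "centimeter"),
    "Height (m)": ("height", "meter"),
}

# Conversion factors to a common target unit (meters).
TO_METERS = {"centimeter": 0.01, "meter": 1.0}


def harmonize(record):
    """Rewrite one record so every value uses the shared term, in meters."""
    harmonized = {}
    for column, value in record.items():
        term, unit = ANNOTATIONS[column]
        harmonized[term] = value * TO_METERS[unit]
    return harmonized


# Two sources describing the same kind of measurement differently.
site_a = harmonize({"ht_cm": 250})
site_b = harmonize({"Height (m)": 2.5})
merged = [site_a, site_b]  # both rows now share the key "height" in meters
```

Without the explicit annotations, this step is the manual, ad-hoc work the previous slide describes; with them, the merge becomes mechanical.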

Need for interoperability
- MANY different “semantic” efforts underway to unify data within earth/biodiversity/environmental disciplines, converging on use of OBSERVATIONAL data construct
- SPECIALIZED needs and concerns of different domains may drive semantic technology solutions to be diverse and incompatible
- OPPORTUNITY exists for communicating and coordinating among different domains to achieve greater interoperability of emerging semantic technology solutions
- BENEFIT is providing cross-disciplinary scientists with more seamless and powerful access to a broad range of relevant data and information

NSF’s OCI INTEROP
Digital data are increasingly both the products of research and the starting point for new research and education activities. The ability to re-purpose data – to use it in innovative ways and combinations not envisioned by those who created the data – requires that it be possible to find and understand data of many types and from many sources. Interoperability (the ability of two or more systems or components to exchange information and to use the information that has been exchanged) is fundamental to meeting this requirement.

NSF’s OCI INTEROP
This NSF crosscutting program supports community efforts to provide for broad interoperability through the development of mechanisms such as robust data and metadata conventions, ontologies, and taxonomies.
Support is provided …
- for consensus-building activities: community workshops, web resources such as community interaction sites, and task groups.
- … and for providing the expertise necessary to turn the consensus into technical standards with associated implementation tools and resources: information sciences, software development, and ontology and taxonomy design and implementation.

Objectives of SONet
Broad Objectives
- Address semantic interoperability issues in environmental (earth sciences) data [sharing, discovery, integration]
- Build a network of practitioners (SONet), including domain scientists, computer scientists, and information managers
- Build generic, cross-disciplinary data interoperability solutions
Immediate Goals to Develop
- An extensible and open observations data model (“core model”) to unify existing domain-specific approaches
- A semantic (ontology) framework for scientific terminology and corresponding domain extensions
- Demonstration prototypes using these to address critical data interoperability issues

Working Groups
- Subgroup 1: Core Data Model for Observations
  - Collect interoperability requirements
  - Define common, unified data model
  - Engage tool & data providers, data consumers
- Subgroup 2: Catalog of Common Field Observations
  - Identify and catalog common observation types (semantics)
  - Engage data providers and information managers
- Subgroup 3: Scientist-Oriented Term Organization
  - Define general extension ontologies of scientific terms
  - Focus work on outputs of group 2
  - Engage range of domain scientists
- Subgroup 4: Demonstration Projects
  - Define and prototype demonstration projects
  - Ensure compatibility of subgroups
Each group consists of two team leads
Postdoc funded to work on demonstration projects & help ensure compatibility across subgroups
Core SONet Team

Workshops & Outreach
Community workshops … to bring together project members, data managers, domain scientists, computer scientists, and members of the larger environmental informatics community
- Workshop 1: Collect detailed requirements and use cases to frame a “Scientific Observations Interoperability Challenge”; begin defining core model
- Workshop 2: Discuss various data models in terms of addressing “Scientific Observations Interoperability Challenge”; refine core model
- Workshop 3: Roll-out of operational prototype; early evaluation and feedback
- Workshop 4: Training; further evaluative discussion, and plan SONet sustainability
… approximately 20+ participants at each workshop

Project Timeline
Workshops and meetings:
- Year 1: first community workshop, project meeting
- Year 2: second community workshop, project meeting
- Year 3: last two community workshops, including training
Project has just recently officially started
Timeline chart (Year 1, Year 2, Year 3)
Meetings:
- Project Leaders Meeting (1) (orientation & planning)
- Project Leaders Meeting (2) (evaluation & planning)
- Community Workshop (1) (develop requirements & use cases)
- Community Workshop (2) (discuss & refine models)
- Community Workshop (3) (roll-out & evaluate)
- Community Workshop (4) (training, sustainability)
Activities:
- set up project mgmt. infrastructure, Postdoc hiring
- finalize community participants, meeting preparation
- begin implementation & interoperability tests, set up support infrastructure
- document and contrast results, continue impl. & interop. tests
- continue impl. & interop. tests, meeting preparation
- finalize impl. & interop. tests, sustainability planning
- document results, execute plan for sustainability

1st SONet community workshop
- Identify initial key participants for effort (you!)
  - Representatives from semantic/observational efforts in diverse environmental sciences: plant genomics, oceanography, hydrology, biodiversity sciences, ecology, atmospheric sciences, geospatial community
  - Computer science experts in knowledge representation, semantic web, conceptual data modeling, informatics

1st SONet community workshop
- Postdoctoral fellow starting November 1
  - Huiping Cao
- Collaborative web site
  - Wiki-like editing and file uploading (requires login)
  - Public and private areas
  - Discussion forum (requires login)
- Logo!

M.O. for SONet, and this meeting
- Group discussion and collaboration, not presentation
- Brainstorming and refinement
- Semantics-based approach
  - Controlling concepts within a discipline
  - Using concepts across disciplines
- OCI award *
- Observational approach

Examples of “raw” observational data

Observation defined
- An observation represents any measurement of some characteristic (attribute) of some real-world entity or phenomenon.
- A measurement consists of a realized value of some characteristic of an entity, expressed in some well-specified units (drawn from a measurement standard).
- Observations can provide context for other observations (e.g. observations of spatial or temporal information would often provide context for some other observation).
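The definition on this slide can be read as a small data structure: an observation is about an entity, carries measurements (characteristic, value, unit), and may point to other observations as context. The Python sketch below is one possible reading under those assumptions. The class and field names are hypothetical and are not taken from OBOE or any SONet core model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Measurement:
    characteristic: str  # the attribute being measured, e.g. "height"
    value: float         # the realized value
    unit: str            # a well-specified unit, e.g. "meter"


@dataclass
class Observation:
    entity: str  # the real-world entity or phenomenon observed
    measurements: List[Measurement] = field(default_factory=list)
    context: List["Observation"] = field(default_factory=list)


def describe(obs: Observation) -> str:
    """Render an observation and, recursively, the observations giving it context."""
    parts = [f"{m.characteristic}={m.value} {m.unit}" for m in obs.measurements]
    text = f"{obs.entity}({', '.join(parts)})"
    if obs.context:
        text += " within " + ", ".join(describe(c) for c in obs.context)
    return text


# A spatial observation provides context for a tree observation.
site = Observation("Site", [Measurement("latitude", 34.4, "degree")])
tree = Observation("Tree", [Measurement("height", 12.5, "meter")], context=[site])
summary = describe(tree)
```

Keeping context as a list of observations, instead of as free-text metadata, is the design choice that lets spatial or temporal information be queried in the same way as any other observation.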

Another definition for observation
An observation is an action with a result which has a value describing some phenomenon. The observation is modelled as a Feature within the context of the General Feature Model. An observation feature binds a result to a feature of interest, upon which the observation was made. The observed property is a property of the feature of interest. An observation uses a procedure to determine the value of the result, which may involve a sensor or observer, analytical procedure, simulation or other numerical process. The observation pattern and feature is primarily useful for capturing metadata associated with the estimation of feature properties, which is important particularly when error in this estimate is of interest.
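The four roles named in this second definition (feature of interest, observed property, procedure, result) can also be sketched as a record. This is a loose illustration only and not the OGC schema: the class name, the field types, and the optional `result_uncertainty` field are assumptions added for the example, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FeatureObservation:
    feature_of_interest: str  # the feature upon which the observation was made
    observed_property: str    # a property of the feature of interest
    procedure: str            # sensor, observer, analysis, or simulation used
    result: Any               # the value determined by the procedure
    # Hypothetical field: the slide notes the pattern matters when the
    # error in the estimate is of interest.
    result_uncertainty: Optional[float] = None


# Invented example values, for illustration only.
obs = FeatureObservation(
    feature_of_interest="Stream gauge A",
    observed_property="water temperature",
    procedure="thermistor sensor",
    result=(14.2, "degree Celsius"),
    result_uncertainty=0.1,
)
```

Compared with the entity-and-measurement reading, this form makes the procedure a first-class part of every observation, which is one of the differences a common core model would need to reconcile.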

Formalizing “Observational Data” Concept

Prospective observation models…
Project | Domain | Observational data model
TDWG/OSR | Biodiversity | Meta-model to integrate field observational data with specimen data
VSTO | Atmospheric sciences | Ontologies for interoperations among different meteorological metadata standards
ODM | Hydrology | CUAHSI’s Observational Data Model for storing diverse hydrological data
SERONTO | Socioecological research | Ontology for integrating socio-ecological data
OGC’s O&M | Geospatial | Observations and Measurements standard for enhancing sensor data interoperability
SEEK’s OBOE | Ecology | Extensible Observation Ontology for describing data as observations and measurements

Variations of Observational Data Models

Developing a core model
- Identify the key observational models in the earth and environmental sciences
- Are these various observational models easily reconciled and/or harmonized?
- Are there special capabilities and features enabled by some observational approaches?
- What services should be developed around these observational models?

Goals for this meeting
- Begin formally identifying and resolving commonalities and discrepancies among our observational efforts
- Start defining a common core observational model for our data
- Articulate Use Cases (cross-disciplinary data integration tasks) that underpin a “Scientific Observations Interoperability Challenge”

Goals for this meeting
- Clarify specific short-term technology development that can catalyze and assist teams undertaking the “Scientific Observations Interoperability Challenge”
- Plan to publish results of “Interoperability Challenge” in special issue of ??

Scientific Observations Interoperability Challenge
- Understand the similarities, differences, and scope of the existing models for describing scientific observations
- Understand the main modeling concepts and relationships used by the different approaches
- Understand the services offered by systems supporting each approach, e.g., for data discovery, integration, etc.
- Identify approaches for enabling interoperability among the different approaches and systems
- Bring together a community to further develop interoperability solutions for sharing and integrating environmental data
- Further define and evaluate a "straw-man" core observation model and set of services to enable improved interoperability among systems

Scientific Observations Interoperability Challenge
Use Cases
- Begin gathering data, metadata for Use Cases
  - Should involve diverse representative data types
  - Should involve cross-domain integration
- Begin developing specific queries
- Use Cases will help define needed services for achieving goals of the interoperability challenge

Architecture example (SEEK project)

SONet: Scientific Observations Network
Contact: Mark Schildhauer
Sponsored by National Science Foundation, award