Transparency, applications, and ab- stuff – effect on tools for e-science: it’s all about Informatics June 21, 2010, IATUL 2010 Peter Fox (RPI and WHOI)

Transparency, applications, and ab- stuff – effect on tools for e-science: it’s all about Informatics June 21, 2010, IATUL 2010 Peter Fox (RPI and WHOI) Tetherless World Constellation

Working premise
Scientists (actually ANYONE) should be able to access a global, distributed knowledge base of scientific data that appears to be integrated and appears to be locally available.
But… data and information are obtained by multiple means (instruments, models, analysis) using various (often opaque) protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. The data may be inconsistent, incomplete, evolving, and distributed, AND created in a form that facilitates generation, not use (except by accident).
And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…

Data has Lots of Audiences
[Figure: audiences for data, ranked from "More Strategic" to "Less Strategic", with the labels "SCIENTISTS!" and "Tools". From “Why EPO?”, a NASA internal report on science education, 2005]

Means of conduct

So what about abduction? No, not the criminal meaning…
Abduction is a method of logical inference introduced by Peirce. It comes prior to induction and deduction, and its colloquial name is having a "hunch".
Abductive reasoning starts when an inquirer considers a set of seemingly unrelated facts, armed with an intuition that they are somehow connected.
The term abduction is commonly presumed to mean the same thing as hypothesis; however, an abduction is actually the process of inference that produces a hypothesis as its end result.
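As a rough illustration of "inference that produces a hypothesis", the sketch below ranks candidate explanations by how many observed facts each would account for. It is a toy under stated assumptions, not Peirce's formal account: the function name `abduce`, the scoring rule, and all facts and hypotheses are hypothetical.

```python
from typing import Dict, Set, Tuple


def abduce(observations: Set[str], hypotheses: Dict[str, Set[str]]) -> str:
    """Return the hypothesis whose predicted facts best cover the observations.

    Assumed scoring rule: prefer more observations explained, then fewer
    predicted facts that were never observed.
    """
    def score(name: str) -> Tuple[int, int]:
        predicted = hypotheses[name]
        covered = len(observations & predicted)
        unobserved_predictions = len(predicted - observations)
        return (covered, -unobserved_predictions)

    return max(hypotheses, key=score)


# Hypothetical, seemingly unrelated facts gathered by an inquirer.
observations = {"fewer scallops", "warmer water", "more shell fragments"}

# Hypothetical candidate explanations and the facts each would predict.
hypotheses = {
    "ecosystem change": {"fewer scallops", "warmer water", "more shell fragments"},
    "fishing pressure": {"fewer scallops", "more shell fragments", "dredge tracks"},
    "sensor fault": {"warmer water"},
}

best = abduce(observations, hypotheses)
print(best)
```

The output is a hypothesis to be tested by induction and deduction afterwards, not a conclusion.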

Abductive Information System?
What would this look like in application tools?
If you accept that induction is fundamentally part of how an information system is developed, then how might it allow for abduction before induction?
Design factors? Architecture factors? Library factors? Cognitive factors?

Modern informatics enables a new scale-free** framework approach
Use cases – requirements; Stakeholders; Distributed authority; Access control; Ontologies; Maintaining identity

Marine habitat - change
Image labels: Scallop, number, density; Scallop, size, shape, color, place; Scallop, shell fragment; Rock; What is this?; Flora or fauna?
Dirt/mud; one person’s noise is another person’s signal
Several disciplines: biology, geology, chemistry, oceanography
Several applications: science, fishing, habitat change, climate and environmental change, data integration
Complex inter-relations, questions
Use case: What is the temperature and salinity of the water, and are these marine specimens usual or part of an ecosystem change?
Src: WHOI and the HabCam group

Multi-tiered interoperability used by

Fox VSTO et al.
But back to reality
Fragmentation, disconnection, encapsulation … all are bad for … transparency

What is the ecosystem? Just a few elements and they are scattered
Accountability, Proof, Explanation, Justification, Verifiability, Transparency, Trust

Access Control
Essential for establishing trust
Licensing; Intellectual property; Security/defence; Endangered species; Sensitive data
Full life cycle data, information and knowledge management and stewardship

Provenance
Origin or source from which something comes, intention for use, who/what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility.
Knowledge provenance: enrich with semantics (especially the relations between concepts previously isolated, and retaining context) and semantically-aware tools.
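One way to picture the elements listed above is as a record whose gaps can be checked before claiming reproducibility. This is a minimal sketch: the class `ProvenanceRecord`, its field names, and the example values are all hypothetical, and it is not an implementation of any provenance standard.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical record mirroring the elements named on the slide."""
    source: Optional[str] = None          # origin or source
    intended_use: Optional[str] = None    # intention for use
    generated_for: Optional[str] = None   # who/what it was generated for
    method: Optional[str] = None          # manner of manufacture
    place: Optional[str] = None           # sense of place
    time: Optional[str] = None            # time of production or discovery
    owners: List[str] = field(default_factory=list)  # subsequent owners


def missing_fields(record: ProvenanceRecord) -> List[str]:
    """List the elements still undocumented (None, empty string, or empty list)."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Hypothetical, partially documented data product.
record = ProvenanceRecord(
    source="shipboard CTD cast",
    method="calibrated and averaged into 1 m bins",
    time="2009-06-14",
)

gaps = missing_fields(record)
print(gaps)
```

A non-empty result marks what still has to be documented before the detail is "sufficient to allow reproducibility".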

MODIS Terra & Aqua vs. AIRS Cloud Top Pressure
Correlation maps for Jan 1 – 16, 2008 (panels: AIRS vs. MODIS Aqua; AIRS vs. MODIS Terra; MODIS Aqua vs. MODIS Terra)
Impact: Findings using aerosol data apply to other geophysical parameters!
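A correlation map of this kind could be computed by correlating, for each grid cell, the time series from two gridded products. The sketch below assumes a plain Pearson correlation over daily grids stored as nested lists; the slides do not say how the maps were actually produced, and the tiny grids are made-up numbers.

```python
from math import sqrt
from typing import List, Optional, Sequence

Grid = List[List[float]]  # one day of data, indexed [row][col]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two equal-length series, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / sqrt(sxx * syy)


def correlation_map(a: List[Grid], b: List[Grid]) -> List[List[Optional[float]]]:
    """Correlate two stacks of daily grids cell by cell over the time axis."""
    rows, cols = len(a[0]), len(a[0][0])
    return [
        [pearson([day[r][c] for day in a], [day[r][c] for day in b])
         for c in range(cols)]
        for r in range(rows)
    ]


# Made-up 3-day stacks on a 1x2 grid, standing in for two sensors.
sensor_a = [[[1.0, 5.0]], [[2.0, 4.0]], [[3.0, 3.0]]]
sensor_b = [[[2.0, 1.0]], [[4.0, 2.0]], [[6.0, 3.0]]]

corr = correlation_map(sensor_a, sensor_b)
print(corr)
```

In the made-up data the first cell agrees between the two sensors and the second disagrees, which is the kind of spatial pattern such a map makes visible.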

Semantic Advisor
About your selected parameters (Parameter A, Parameter B, difference alert):
Parameter Name: Aerosol Optical Depth at 550 nm
Dataset: A = MYD08_D3.005; B = MOD08_D3.005 (Diff)
Data-Day definition: UTC (00:00-24:00Z) (The same but….)
Temporal resolution: Daily
Spatial resolution: 1x1 degree
Sensor: MODIS
Platform: A = Aqua; B = Terra (Diff)
EQCT: A = 13:30; B = 10:30 (Diff)
Day Time Node: A = Ascending; B = Descending (Diff)
Pre-Giovanni Processes: ATBD-MOD-30
Giovanni Processes: A = Spatial subset, Time average; B = Spatial subset, Time average
Your Selected Options:
Spatial Area: Longitude (-30, 150), Latitude (-10, 60)
Parameters: A: MYD08_D3.005 Aerosol Optical Depth at 550 nm; B: MOD08_D3.005 Aerosol Optical Depth at 550 nm
Temporal Range: Begin Date: Jan End Date: Jan
Visualization Function: Lat–Lon map, Time-averaged
[Continue process to display image] [Return to selection page]
Known Issues: The differences in EQCT and Day Time Node, modulated by the data-day definition, lead to a difference in the overpass times included, which produces an artifact in the difference between the two parameters.
See sample images: MODIS Terra vs. MODIS Aqua AOD Correlation; Included Overpass time Difference
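The "difference alert" column can be read as a field-by-field comparison of the metadata for the two selected parameters. The sketch below is only an assumption about that logic, not the Giovanni or Semantic Advisor implementation. The function name `difference_alerts` and the dictionary layout are hypothetical; the values are the ones shown on the slide.

```python
from typing import Dict, Tuple


def difference_alerts(meta_a: Dict[str, str],
                      meta_b: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return the fields present in both records whose values differ."""
    return {
        key: (meta_a[key], meta_b[key])
        for key in meta_a
        if key in meta_b and meta_a[key] != meta_b[key]
    }


# Metadata values as listed on the slide for Parameter A and Parameter B.
param_a = {
    "Dataset": "MYD08_D3.005",
    "Data-Day definition": "UTC (00:00-24:00Z)",
    "Temporal resolution": "Daily",
    "Spatial resolution": "1x1 degree",
    "Sensor": "MODIS",
    "Platform": "Aqua",
    "EQCT": "13:30",
    "Day Time Node": "Ascending",
}
param_b = {
    "Dataset": "MOD08_D3.005",
    "Data-Day definition": "UTC (00:00-24:00Z)",
    "Temporal resolution": "Daily",
    "Spatial resolution": "1x1 degree",
    "Sensor": "MODIS",
    "Platform": "Terra",
    "EQCT": "10:30",
    "Day Time Node": "Descending",
}

alerts = difference_alerts(param_a, param_b)
for name, (a, b) in alerts.items():
    print(f"{name}: A={a} B={b}  Diff")
```

A plain equality check like this cannot raise the "The same but…." alert on the data-day definition. That alert depends on knowing how an identical definition interacts with different overpass times, which is the kind of knowledge a semantic layer would have to supply.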

Tetherless World Constellation (tw.rpi.edu)
Themes:
Future Web (Hendler): Web Science, Policy, Social
Xinformatics (Fox): Data Science, Semantic eScience, Data Frameworks
Semantic Foundations (McGuinness): Knowledge Provenance, Ontology Engineering Environments, Inference, Trust
Multiple depts/schools/programs; ~30 (Post-doc, Staff, Grad, Ugrad)

Partitioning: Data Science, Xinformatics, Semantic eScience

Back shed

Use cases
1. Do you have any data online from Hutchins from award number OCE ?
2. I want to download (temperature, biological, ...) data in the following areas (N. Atlantic, bounding box, where the JGOFS survey was done, ...).
3. What new data has been added to this repository since last year (and organize it by project)?
4. Show me all the places where the surface temperature in the North Atlantic is 25 degrees during June.
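Use case 4 can be sketched as a filter over observation records by region, month, and temperature. Everything here is an assumption for illustration: the `Observation` type, the rough North Atlantic bounding box, reading "25 degrees" as Celsius with a tolerance, and the sample records. It does not reflect the repository's actual schema or search interface.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Observation:
    """Hypothetical surface observation record."""
    lat: float
    lon: float
    month: int             # 1-12
    surface_temp_c: float  # assumed degrees Celsius


# Hypothetical rough bounding box: (south, north, west, east) in degrees.
NORTH_ATLANTIC = (0.0, 65.0, -80.0, 0.0)


def places_at_temperature(observations: Iterable[Observation],
                          bbox: Tuple[float, float, float, float],
                          month: int,
                          target_c: float,
                          tolerance_c: float = 0.5) -> List[Tuple[float, float]]:
    """Return (lat, lon) pairs in the box and month within tolerance of target."""
    south, north, west, east = bbox
    return [
        (o.lat, o.lon)
        for o in observations
        if south <= o.lat <= north
        and west <= o.lon <= east
        and o.month == month
        and abs(o.surface_temp_c - target_c) <= tolerance_c
    ]


# Made-up records: only the first satisfies every condition.
records = [
    Observation(30.0, -60.0, 6, 25.2),
    Observation(45.0, -30.0, 6, 18.0),   # too cold
    Observation(30.0, -60.0, 1, 25.0),   # wrong month
    Observation(-20.0, -30.0, 6, 25.0),  # outside the box
]

matches = places_at_temperature(records, NORTH_ATLANTIC, month=6, target_c=25.0)
print(matches)
```

The natural-language question leaves the region boundary, the units, and the meaning of "is 25 degrees" unstated, so each has to be fixed as an explicit parameter before the query can run.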

Quick prototype of use case 1

Current version

Current version