Transparency, applications, and ab- stuff – effect on tools for e-science: it’s all about Informatics June 21, 2010, IATUL 2010 Peter Fox (RPI and WHOI) Tetherless World Constellation
Working premise
Scientists – actually ANYONE – should be able to access a global, distributed knowledge base of scientific data that: appears to be integrated; appears to be locally available.
But… data and information are obtained by multiple means (instruments, models, analysis) using various (often opaque) protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. They may be inconsistent, incomplete, evolving, and distributed, AND created in a form that facilitates generation, not use (except by accident).
And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…
Data has Lots of Audiences
[Figure, from “Why EPO?”, a NASA internal report on science education, 2005; axis labels: More Strategic, Less Strategic; surviving labels: SCIENTISTS!, Tools]
Means of conduct
So what about abduction?
No, not the criminal meaning… Abduction is a method of logical inference introduced by Peirce, which comes prior to induction and deduction and for which the colloquial name is to have a "hunch". Abductive reasoning starts when an inquirer considers a set of seemingly unrelated facts, armed with an intuition that they are somehow connected. The term abduction is commonly presumed to mean the same thing as hypothesis; however, an abduction is actually the process of inference that produces a hypothesis as its end result.
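One way to make the idea concrete is a toy ranking of candidate hypotheses by how many observed facts each would explain. This is only a sketch of abduction as "inference to a candidate explanation"; the function name `abduce`, the observations, and the hypotheses below are all made up for illustration and are not taken from the talk or from any particular tool.

```python
from typing import Dict, List, Set


def abduce(observations: Set[str], explains: Dict[str, Set[str]]) -> List[str]:
    """Rank hypotheses by how many of the observations each one would explain."""
    scored = []
    for hypothesis, consequences in explains.items():
        covered = observations & consequences
        if covered:  # ignore hypotheses that explain nothing we observed
            scored.append((len(covered), hypothesis))
    # Most observations explained first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [hypothesis for _, hypothesis in scored]


# Hypothetical, seemingly unrelated facts (illustrative only).
observations = {"fewer scallops", "warmer water", "more shell fragments"}

# Hypothetical background knowledge: what each hypothesis would account for.
explains = {
    "ecosystem change": {"fewer scallops", "warmer water", "more shell fragments"},
    "heavy fishing": {"fewer scallops", "more shell fragments"},
    "sensor drift": {"warmer water"},
}

ranked = abduce(observations, explains)
print(ranked)
```

The output is a list of hypotheses to be tested, not conclusions; induction and deduction would take over from there.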
Abductive Information System?
What would this look like in application tools? If you accept that induction is fundamentally part of how an information system is developed, then how might abduction before induction be made possible? Design factors? Architecture factors? Library factors? Cognitive factors?
Modern informatics enables a new scale-free framework approach: Use cases – requirements; Stakeholders; Distributed authority; Access control; Ontologies; Maintaining identity
Marine habitat - change
Scallop, number, density; Scallop, size, shape, color, place; Scallop, shell fragment; Rock; What is this? Flora or fauna?; Dirt/mud; one person’s noise is another person’s signal.
Several disciplines: biology, geology, chemistry, oceanography.
Several applications: science, fishing, habitat change, climate and environmental change, data integration.
Complex inter-relations, questions.
Use case: What is the temperature and salinity of the water, and are these marine specimens usual or part of an ecosystem change?
Src: WHOI and the HabCam group
Multi-tiered interoperability used by
Fox VSTO et al. But back to reality: Fragmentation, Disconnection, Encapsulation … all are bad for … transparency
What is the ecosystem? Just a few elements and they are scattered: Accountability, Proof, Explanation, Justification, Verifiability, Transparency, Trust
Access Control: Essential For Establishing Trust. Licensing; Intellectual property; Security/defence; Endangered species; Sensitive data. Full life cycle data, information and knowledge management and stewardship
Provenance: origin or source from which something comes, intention for use, who/what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility.
Knowledge provenance: enrich with semantics (especially the relations between concepts previously isolated, and retaining context) and semantically-aware tools.
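The elements of that definition can be read as the fields of a record. The sketch below is one possible shape, not a schema from the talk or from any provenance standard: the class `ProvenanceRecord`, its field names, and the sample values are all hypothetical. The `missing_fields` check is a crude stand-in for "documented in detail sufficient to allow reproducibility".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceRecord:
    """Hypothetical record mirroring the elements of the provenance definition."""
    origin: str          # origin or source from which something comes
    intended_use: str    # intention for use
    generated_for: str   # who/what it was generated for
    method: str          # manner of manufacture
    place: str           # sense of place of production or discovery
    time: str            # sense of time of production or discovery
    owners: List[str] = field(default_factory=list)  # history of subsequent owners

    def transfer(self, new_owner: str) -> None:
        """Append a new owner, keeping the full ownership history."""
        self.owners.append(new_owner)

    def missing_fields(self) -> List[str]:
        """Names of descriptive fields left empty (owners may legitimately be empty)."""
        return [name for name, value in vars(self).items()
                if name != "owners" and not value]


# Made-up values for illustration only.
record = ProvenanceRecord(
    origin="seafloor image survey",
    intended_use="habitat change study",
    generated_for="example project",
    method="",  # deliberately left blank
    place="example survey area",
    time="2010-06",
)
record.transfer("example data repository")
print(record.missing_fields())
```

A record with an empty `missing_fields()` list is not thereby reproducible; the check only flags obvious gaps.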
MODIS Terra & Aqua vs. AIRS Cloud Top Pressure
[Figure: correlation maps for Jan 1 – 16, 2008; panels: AIRS vs. MODIS Aqua, AIRS vs. MODIS Terra, MODIS Aqua vs. MODIS Terra]
Impact: Findings using aerosol data apply to other geophysical parameters!
Semantic Advisor
About your selected parameters:
Field | Parameter A | Parameter B | Difference alert
Parameter Name | Aerosol Optical Depth at 550 nm | Aerosol Optical Depth at 550 nm |
Dataset | MYD08_D3.005 | MOD08_D3.005 | Diff
Data-Day definition | UTC (00:00-24:00Z) | UTC (00:00-24:00Z) | The same but….
Temporal resolution | Daily | Daily |
Spatial resolution | 1x1 degree | 1x1 degree |
Sensor | MODIS | MODIS |
Platform | Aqua | Terra | Diff
EQCT | 13:30 | 10:30 | Diff
Day Time Node | Ascending | Descending | Diff
Pre-Giovanni Processes | ATBD-MOD-30 | ATBD-MOD-30 |
Giovanni Processes | Spatial subset, Time average | Spatial subset, Time average |
Your Selected Options:
Spatial Area: Longitude (-30, 150), Latitude (-10, 60)
Parameters: A: MYD08_D3.005 Aerosol Optical Depth at 550 nm; B: MOD08_D3.005 Aerosol Optical Depth at 550 nm
Temporal Range: Begin Date: Jan End Date: Jan
Visualization Function: Lat–Lon map, Time-averaged
Known Issues: The difference in EQCT and Day Time Node, modulated by the data-day definition, causes the included overpass times to differ, which produces the artifact difference. See sample images: MODIS Terra vs. MODIS Aqua AOD Correlation; Included Overpass Time Difference
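The difference alerts in the advisor table amount to a field-by-field comparison of the two parameters' metadata. The sketch below shows that comparison under simple assumptions: the metadata values are copied from the slide, but the dictionary layout and the function name `difference_alerts` are hypothetical and do not describe how Giovanni or the Semantic Advisor is actually implemented.

```python
from typing import Dict, List, Tuple

# Metadata values as listed on the slide; the dict layout is an assumption.
param_a: Dict[str, str] = {
    "Parameter name": "Aerosol Optical Depth at 550 nm",
    "Dataset": "MYD08_D3.005",
    "Data-day definition": "UTC (00:00-24:00Z)",
    "Temporal resolution": "Daily",
    "Spatial resolution": "1x1 degree",
    "Sensor": "MODIS",
    "Platform": "Aqua",
    "EQCT": "13:30",
    "Day time node": "Ascending",
}
param_b: Dict[str, str] = dict(
    param_a,
    **{"Dataset": "MOD08_D3.005", "Platform": "Terra",
       "EQCT": "10:30", "Day time node": "Descending"},
)


def difference_alerts(a: Dict[str, str], b: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (field, value_a, value_b) for every shared field whose values differ."""
    return [(key, a[key], b[key]) for key in a if key in b and a[key] != b[key]]


alerts = difference_alerts(param_a, param_b)
for field_name, value_a, value_b in alerts:
    print(f"Diff  {field_name}: {value_a} vs. {value_b}")
```

A plain equality check like this misses the case the slide marks "The same but….": the data-day definition is nominally identical for both parameters yet interacts with EQCT and Day Time Node to produce the artifact. Catching that needs encoded knowledge about how the fields relate, which is the role the slide assigns to semantics.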
Tetherless World Constellation: Themes
Future Web: Web Science; Policy; Social
Xinformatics: Data Science; Semantic eScience; Data Frameworks
Semantic Foundations: Knowledge Provenance; Ontology Engineering Environments; Inference, Trust
Hendler, Fox, McGuinness
Multiple depts/schools/programs; ~30 (Post-doc, Staff, Grad, Ugrad)
Partitioning Data Science Xinformatics Semantic eScience
Back shed
Use cases
1. Do you have any data online from Hutchins from award number OCE ?
2. I want to download (temperature, biological, ...) data in the following areas (N. Atlantic, bounding box, where the JGOFS survey was done, ...)
3. What new data has been added to this repository since last year (and organize it by project)?
4. Show me all the places where the surface temperature in the North Atlantic is 25 degrees during June.
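Use case 4 can be sketched as a filter over observation records. Everything in the example is an assumption made for illustration: the `Observation` type, the bounding box standing in for "North Atlantic", reading "25 degrees" as degrees Celsius within a tolerance, and the sample values, none of which are real data or part of the repository described in the talk.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Observation:
    """Hypothetical record: position, month (1-12), surface temperature in Celsius."""
    lat: float
    lon: float
    month: int
    surface_temp_c: float


# Assumed bounding box (lat_min, lat_max, lon_min, lon_max); not an official definition.
NORTH_ATLANTIC = (0.0, 65.0, -80.0, 0.0)


def places_at_temperature(
    observations: Iterable[Observation],
    target_c: float = 25.0,
    tolerance_c: float = 0.5,
    month: int = 6,
    box: Tuple[float, float, float, float] = NORTH_ATLANTIC,
) -> List[Tuple[float, float]]:
    """Return (lat, lon) of observations in the box and month whose temperature is near the target."""
    lat_min, lat_max, lon_min, lon_max = box
    return [
        (o.lat, o.lon)
        for o in observations
        if o.month == month
        and lat_min <= o.lat <= lat_max
        and lon_min <= o.lon <= lon_max
        and abs(o.surface_temp_c - target_c) <= tolerance_c
    ]


# Made-up sample values, for illustration only.
sample = [
    Observation(30.0, -40.0, 6, 25.2),   # matches
    Observation(30.0, -40.0, 7, 25.0),   # wrong month
    Observation(50.0, -30.0, 6, 14.0),   # too cold
    Observation(-20.0, -30.0, 6, 25.0),  # outside the assumed box
]
matches = places_at_temperature(sample)
print(matches)
```

The question as a user phrases it leaves the units, the tolerance, and the region boundary unstated; the defaults above make those choices explicit so they can be changed.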
Quick prototype of use case 1
Current version