Supporting Large-Scale Science with Workflows Deana Pennington University of New Mexico Long-Term Ecological Research Network Office ITR: Science Environment.

Presentation transcript:

Supporting Large-Scale Science with Workflows
Deana Pennington, University of New Mexico, Long-Term Ecological Research Network Office
ITR: Science Environment for Ecological Knowledge (SEEK) project
CI-Team: Advancing CI-Based Science through Education, Training, and Mentoring of Science Communities
WORKS ’07, June 25, 2007

Scientific Research Cycle
[Diagram labels: Theory, Hypothesis, Experiment, Results, Inference, Research Design. Flow labels: Data flow, Knowledge flow, Design flow, Knowledge flow]

[Diagram labels: Causes, System of interest, Consequences]
Topics: Vegetation Composition & Structure; Climate & Population Change; Disturbance (Wildfire, Bugs, Others); Biodiversity; Carbon; Others; Invasive species
Experts: Wildfire Specialist, Climatologist, Plant Scientist, Insect Scientist, Domain Modelers, Remote Sensing Scientists, GIS Specialists, Statisticians, Mathematicians, Biodiversity Scientist, Carbon Scientist

Data Flow: heterogeneous datasets/models/workflows
[Diagram labels, models: Plant Growth, Plant Dispersal, Species Invasion, Climate Change & Species Distribution, Wildfire, Carbon]
[Diagram labels, data sources as extracted: Biota Plant Satellite Imagery Environmental Field Ecologic; Plants Environmental]
[Diagram labels, operations: Query & Integrate, Query & Integrate, Transform & Integrate]

Metaprovenance
Provenance = dataset derivation: explicit information about which workflow components were used
Metaprovenance = dataset derivation: capture tacit information about why those components were used and which components go together
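One way to picture the distinction on this slide is a derivation record that stores, next to each component, the reason it was chosen and the components it is meant to be used with. The sketch below is only an illustration under stated assumptions: the class names, field names, and example values are hypothetical and do not come from SEEK or Kepler.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentUse:
    """One workflow component used to derive a dataset (hypothetical structure)."""
    component: str                                  # explicit: which component ran
    rationale: str = ""                             # tacit "why", written down
    used_with: list = field(default_factory=list)   # components it goes together with


@dataclass
class DerivationRecord:
    """Derivation history of a single dataset (hypothetical structure)."""
    dataset_id: str
    steps: list = field(default_factory=list)

    def provenance(self):
        # Provenance: only the explicit list of components that were used.
        return [step.component for step in self.steps]

    def metaprovenance(self):
        # Metaprovenance: why each component was used and what it pairs with.
        return {
            step.component: {"why": step.rationale, "goes_with": list(step.used_with)}
            for step in self.steps
        }


# Illustrative values only; none of these names appear in the presentation.
record = DerivationRecord(
    dataset_id="dataset_001",
    steps=[
        ComponentUse("reproject", "inputs use different map projections", ["resample"]),
        ComponentUse("resample", "grids must share one cell size", ["reproject"]),
    ],
)
```

Calling `record.provenance()` returns just the component names, while `record.metaprovenance()` returns the recorded reasons and pairings for each one.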

Complex workflows/parameter sweeps
For i = 1 to N climate scenarios
For j = 1 to N algorithms
For k = 1 to N parameter sets
System Workflow: Climate Workflow, Wildfire Workflow, Plant Growth Workflow, Other Subsystem Workflows
Many output datasets
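The three loops on this slide amount to running the system workflow once for every combination of climate scenario, algorithm, and parameter set. A minimal Python sketch of such a sweep follows. It assumes the loops are fully nested, and `run_system_workflow` is a hypothetical stand-in that only returns a label, not a real workflow engine call.

```python
from itertools import product


def run_system_workflow(scenario, algorithm, params):
    """Hypothetical stand-in for the coupled system workflow.

    A real run would execute the climate, wildfire, plant growth, and
    other subsystem workflows; here we only build a label for the output.
    """
    return f"output[{scenario}|{algorithm}|{params['id']}]"


def sweep(scenarios, algorithms, parameter_sets):
    """Run the workflow for every scenario/algorithm/parameter-set combination."""
    runs = []
    # product() is equivalent to the three nested loops on the slide.
    for scenario, algorithm, params in product(scenarios, algorithms, parameter_sets):
        runs.append(
            {
                "scenario": scenario,
                "algorithm": algorithm,
                "params": params,
                "dataset": run_system_workflow(scenario, algorithm, params),
            }
        )
    return runs
```

The number of output datasets is the product of the three list lengths, which is why even modest sweeps produce the "many output datasets" the slide refers to.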

Metaprovenance
Project coordination
Workflow => 1000 datasets
Parameter sweep => 100 parameter sets
Which dataset do I go to to see…???
Provenance = Given a dataset, what components/parameters were used?
Metaprovenance = Given a set of components/parameters, which dataset was produced?
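The two questions on this slide are inverse lookups over the same run records. The sketch below keeps one index in each direction. It is a sketch under assumptions: `RunIndex` and its method names are hypothetical, and parameter values are assumed to be hashable (numbers or strings).

```python
from collections import defaultdict


class RunIndex:
    """Two-way index over workflow runs (hypothetical helper, not a SEEK API)."""

    def __init__(self):
        self._by_dataset = {}                    # dataset id -> (components, parameters)
        self._by_settings = defaultdict(list)    # settings key -> dataset ids

    @staticmethod
    def _key(components, parameters):
        # Order-insensitive, hashable key for a set of components and parameters.
        return (tuple(sorted(components)), tuple(sorted(parameters.items())))

    def record(self, dataset_id, components, parameters):
        """Register one run and the dataset it produced."""
        self._by_dataset[dataset_id] = (list(components), dict(parameters))
        self._by_settings[self._key(components, parameters)].append(dataset_id)

    def provenance(self, dataset_id):
        """Given a dataset, return the components and parameters that were used."""
        return self._by_dataset[dataset_id]

    def datasets_for(self, components, parameters):
        """Given components and parameters, return the datasets that were produced."""
        return list(self._by_settings.get(self._key(components, parameters), []))
```

With a few hundred parameter sets, `datasets_for` answers "which dataset do I go to" directly instead of requiring a scan of every dataset's provenance record.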

Science Dashboard?
Enter project-level information: project approach and design
Control parameters

Design Flow
[Diagram labels: Abstract Workflow, Executable Workflow, Conceptual Model, Scientist-Developer Collaboration, Scientist-Developer Collaboration, Vegetation Composition & Structure, Climate Change, Cognitive Networks, Scientist-Scientist Collaboration]

Abstract Workflow Executable Workflow Cognitive Network Conceptual Model Formal Ontology Knowledge Flow Ontology-driven Workflows

Theory Experimental Design Empirical Results Data Analysis Inference Hypothesis Generation Conceptual Model Assumptions Idealizations Simplification Hypothesis testing Knowledge-Driven Workflows

Acknowledgments This work was heavily influenced by discussion within the SEEK project and especially the SEEK Knowledge Representation team. I appreciate all of their interaction. Only my own perspective is expressed, and they would not necessarily agree. The work was supported by National Science Foundation grant # for the Science Environment for Ecological Knowledge (SEEK) project and grant # for the CI-Team Demonstration Project: Advancing Cyberinfrastructure-Based Science through Education, Training, and Mentoring of Science Communities.