Indiana University School of Informatics The LEAD Gateway Dennis Gannon, Beth Plale, Suresh Marru, Marcus Christie School of Informatics Indiana University.

Presentation transcript:


Overview
The LEAD ITR Project
– Science Objectives
– Adaptive CyberInfrastructure for Mesoscale Storm Prediction
A tour of the LEAD project
– Components of our approach to Data and Data Driven Adaptive Workflow
Experience so far.
The Gateway Lifecycle

Predicting Storms
Hurricanes and tornadoes cause massive loss of life and damage to property.
The underlying physical systems involve highly non-linear dynamics, so they are computationally intense.
Data comes from multiple sources
– “Real time” data derived from streams of data from sensors
– Archived data in databases of past storms
Infrastructure challenges:
– Data mine instrument radar data for storms
– Allocate supercomputer resources automatically to run forecast simulations
– Monitor results and retarget instruments
– Log provenance and metadata about experiments for auditing

The LEAD Project

Traditional Methodology
STATIC OBSERVATIONS: Radar Data, Mobile Mesonets, Surface Observations, Upper-Air Balloons, Commercial Aircraft, Geostationary and Polar Orbiting Satellite, Wind Profilers, GPS Satellites
Analysis/Assimilation: Quality Control, Retrieval of Unobserved Quantities, Creation of Gridded Fields
Prediction/Detection: PCs to Teraflop Systems
Product Generation, Display, Dissemination
End Users: NWS, Private Companies, Students
The Process is Entirely Serial and Static (Pre-Scheduled): No Response to the Weather!

The LEAD Vision: Adaptive Cyberinfrastructure
DYNAMIC OBSERVATIONS
Analysis/Assimilation: Quality Control, Retrieval of Unobserved Quantities, Creation of Gridded Fields
Prediction/Detection: PCs to Teraflop Systems
Product Generation, Display, Dissemination
End Users: NWS, Private Companies, Students
Models and Algorithms Driving Sensors
The CS challenge: Build cyberinfrastructure services that provide adaptability, scalability, availability, usability, and real-time response.

Change the Paradigm
To make fundamental advances we need:
– Adaptivity in the computational model.
But also cyberinfrastructure to:
– Execute complex scenarios in response to weather events
   Stream processing, triggers
   Close the loop with the instruments.
– Acquire computational resources on demand.
   Need supercomputer-scale resources
   Invoked in response to weather events
– Deal with the data deluge
   Users can no longer manage their own experiment products

The LEAD Gateway Portal
To support three classes of users
– Meteorology research scientists & grad students.
– Undergrads in meteorology classes
– People who want easy access to weather data.
Go to:

Gateway Components
A Framework for Discovery
– Four basic components
Data Discovery
– Catalogs and index services
The experiment
– Computational workflow managing on-demand resources
Data analysis and visualization
Data product preservation
– Automatic metadata generation and experimental data provenance.

Data Search
Select a region, a time range, and the desired attributes
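A search by region, time range, and attributes can be sketched as a filter over catalog entries. This is only an illustration of the idea, not the LEAD catalog service: the `CatalogEntry` fields, the `search` function, and the sample entries are all hypothetical and invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; field names are assumptions."""
    name: str
    lat: float
    lon: float
    time: datetime
    attributes: frozenset


def search(catalog, south, north, west, east, start, end, wanted):
    """Return names of entries inside the box and time range
    that carry every wanted attribute."""
    wanted = frozenset(wanted)
    return [
        e.name
        for e in catalog
        if south <= e.lat <= north
        and west <= e.lon <= east
        and start <= e.time <= end
        and wanted <= e.attributes
    ]


# Made-up sample entries, for illustration only.
CATALOG = [
    CatalogEntry("radar_scan_a", 35.2, -97.4, datetime(2007, 5, 4, 21, 0),
                 frozenset({"reflectivity", "velocity"})),
    CatalogEntry("surface_obs_b", 39.1, -86.5, datetime(2007, 5, 4, 22, 0),
                 frozenset({"temperature"})),
    CatalogEntry("radar_scan_c", 35.9, -96.8, datetime(2007, 5, 6, 3, 0),
                 frozenset({"reflectivity"})),
]

hits = search(CATALOG, 34.0, 37.0, -99.0, -95.0,
              datetime(2007, 5, 4), datetime(2007, 5, 5), {"reflectivity"})
print(hits)
```

Only the first entry satisfies all three constraints: the second lies outside the region and the third outside the time range.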

Portal: Experimental Data & Metadata Space
CyberInfrastructure extends user’s desktop to incorporate vast data analysis space.
As users go about doing scientific experiments, the CI manages back-end storage and compute resources.
– Portal provides ways to explore this data and search and discover it.
Metadata about experiments is largely automatically generated, and highly searchable.
– Describes data object (the file) in application-rich terms, and provides URI to data service that can resolve an abstract unique identifier to real, on-line data “file”.
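The idea of resolving an abstract unique identifier to a real, on-line file can be sketched with a small in-memory registry. This is a minimal sketch under assumptions: the `DataResolver` class, its method names, and the example locations are hypothetical and are not the LEAD data service.

```python
import uuid


class DataResolver:
    """Hypothetical resolver: maps abstract identifiers to physical locations."""

    def __init__(self):
        self._locations = {}

    def register(self, location):
        """Assign a fresh abstract identifier to a physical location."""
        ident = f"urn:uuid:{uuid.uuid4()}"
        self._locations[ident] = location
        return ident

    def relocate(self, ident, new_location):
        """Move the data; the abstract identifier stays the same."""
        if ident not in self._locations:
            raise LookupError(f"unknown identifier: {ident}")
        self._locations[ident] = new_location

    def resolve(self, ident):
        """Return the current physical location for an identifier."""
        try:
            return self._locations[ident]
        except KeyError:
            raise LookupError(f"unknown identifier: {ident}") from None


resolver = DataResolver()
data_id = resolver.register("file:///archive/run1/output.nc")  # made-up path
resolver.relocate(data_id, "file:///tape/run1/output.nc")      # made-up path
print(data_id, "->", resolver.resolve(data_id))
```

Because metadata stores only the abstract identifier, the back end can move a file between storage systems without invalidating anything the user has saved.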

Workflow: Composing Computational Tools to build new Tools
Workflow is a term that describes the process of moving data through a sequence of analysis and transformational steps to achieve a goal.
Another Paradigm Shift for the users.
Each activity a user initiates in LEAD is an Experiment which consists of
– Data discovery and collection.
– Applied analysis and transformation
   A graph of activities (workflow)
– Curated data products and results
Each workflow activity is logged using an event system and stored as metadata in the user's workspace.
– Provides a complete provenance of work.
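A workflow as a graph of activities, with each activity logged as events, can be sketched as follows. This is an illustrative sketch only, not the LEAD workflow engine: `run_workflow`, the step names, and the event tuples are assumptions. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) to order the steps.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, deps, event_log):
    """Run each step after its upstream steps, logging start/finish events.

    steps: step name -> callable taking a dict of upstream results
    deps:  step name -> set of upstream step names
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        event_log.append(("started", name))
        inputs = {d: results[d] for d in deps.get(name, ())}
        results[name] = steps[name](inputs)
        event_log.append(("finished", name))
    return results


# Hypothetical three-step experiment: collect data, analyze it, forecast.
steps = {
    "collect": lambda inputs: [1, 2, 3],
    "analyze": lambda inputs: sum(inputs["collect"]),
    "forecast": lambda inputs: inputs["analyze"] * 2,
}
deps = {"collect": set(), "analyze": {"collect"}, "forecast": {"analyze"}}

events = []
results = run_workflow(steps, deps, events)
print(results["forecast"])
for event in events:
    print(event)
```

The event list doubles as a simple provenance record: it shows which activities ran and in what order.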

The Experiment Builder
A Portal “wizard” that leads the user through the set-up of a workflow
Asks the user:
– “Which workflow do you want to run?”
Once this is known, it can prompt the user for the required input data sources
Then it “launches” the workflow.

Parameter Selection

Selecting the forecast region


Gateway Support for Adaptive Queries
LEAD requires the ability to construct workflows that are
Data Driven
– Weather data streams define the nature of the computation
Persistent and Agile
– Data mining of a data stream detects an “interesting” feature, and the event triggers a workflow scenario that has been waiting for months.
Adaptive
– In response to the weather: the weather changes.
– Nature of workflow may have to change on-the-fly.
– Resources and requirements change.
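The persistent, data-driven pattern (a workflow that waits until a data stream shows an interesting feature) can be sketched as a loop over readings with a trigger predicate. This is a minimal sketch under assumptions: the `watch` function, the reading fields, and the 50.0 threshold are hypothetical and do not describe the LEAD stream-mining services.

```python
def watch(stream, is_interesting, launch):
    """Scan a stream of readings; launch a workflow for each interesting one."""
    launched = []
    for reading in stream:
        if is_interesting(reading):
            launched.append(launch(reading))
    return launched


# Made-up readings, for illustration only.
readings = [
    {"site": "radar_1", "reflectivity_dbz": 22.0},
    {"site": "radar_1", "reflectivity_dbz": 57.5},
    {"site": "radar_2", "reflectivity_dbz": 31.0},
]

launched = watch(
    readings,
    # Assumed trigger rule: a reading at or above an arbitrary threshold.
    is_interesting=lambda r: r["reflectivity_dbz"] >= 50.0,
    launch=lambda r: f"forecast workflow launched for {r['site']}",
)
print(launched)
```

In a real deployment the stream would be unbounded and the loop would run for as long as the query is registered; a finite list is used here so the example terminates.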

Experience with on-demand computing
We use TeraGrid.
– Actually “best effort” and not yet “on demand”
– Use Grid technology for remote job execution and security.
Reliability is critical.
Workflow can automatically resubmit a failed task to another resource
Urgent Computing handled by the Spruce Gateway.
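Automatic resubmission of a failed task to another resource can be sketched as a loop over candidate resources. This is an illustration only: `run_with_failover`, `forecast_task`, and the resource names are hypothetical, and a real Grid submission would use the site's job-execution middleware rather than a local function call.

```python
def run_with_failover(task, resources):
    """Try the task on each resource in turn; return the first success.

    Returns (resource used, task result, list of earlier failures).
    """
    failures = []
    for resource in resources:
        try:
            return resource, task(resource), failures
        except RuntimeError as exc:
            failures.append((resource, str(exc)))
    raise RuntimeError(f"task failed on every resource: {failures}")


def forecast_task(resource):
    """Stand-in for a remote job; fails on one made-up resource."""
    if resource == "site_a":
        raise RuntimeError("node unavailable")
    return f"forecast finished on {resource}"


used, result, failures = run_with_failover(
    forecast_task, ["site_a", "site_b", "site_c"]
)
print(used, result, failures)
```

Keeping the list of failures alongside the result means the resubmission itself can be logged with the rest of the experiment's metadata.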

Validating Scientific Discovery
The Gateway is becoming part of the process of science by being an active repository of data provenance
Disks are cheap, so why not record everything?
The system records each computational experiment that a user initiates
– A complete audit trail of the experiment or computation
– Published results can include link to provenance information for repeatability and transparency.
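Recording each experiment so that published results can link back to it can be sketched with an append-only trail whose entries carry a content-derived identifier. This is a sketch under assumptions: `record_experiment`, the entry fields, and the sample values are hypothetical, and hashing the entry is one possible way to obtain a stable link target, not necessarily what the Gateway does.

```python
import hashlib
import json


def record_experiment(trail, user, workflow, parameters, outputs):
    """Append an audit entry and return an identifier derived from its content."""
    entry = {
        "user": user,
        "workflow": workflow,
        "parameters": parameters,
        "outputs": outputs,
    }
    # Serialize with sorted keys so equal entries always hash the same way.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry_id = hashlib.sha256(payload).hexdigest()
    trail.append({"id": entry_id, **entry})
    return entry_id


trail = []
# Made-up experiment details, for illustration only.
first_id = record_experiment(
    trail, "student_1", "forecast", {"resolution_km": 2}, ["output_a"]
)
second_id = record_experiment(
    trail, "student_1", "forecast", {"resolution_km": 4}, ["output_b"]
)
print(len(trail), first_id != second_id)
```

A paper or report could then cite the identifier, and a reader could look up the matching entry to see exactly which workflow and parameters produced the result.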

Experience so far
First release to support “WxChallenge: the new collegiate weather forecast challenge”
– The goal: “forecast the maximum and minimum temperatures, precipitation, and maximum sustained wind speeds for select U.S. cities.
– to provide students with an opportunity to compete against their peers and faculty meteorologists at 64 institutions for honors as the top weather forecaster in the nation.”
– 79 “users” ran 1,232 forecast workflows generating 2.6 TBytes of data.
   Over 160 processors were reserved on Tungsten from 10am to 8pm EDT (EST), five days each week
National Spring Forecast
– First use of user-initiated 2 km forecasts as part of that program.
   Generated serious interest from National Severe Storm Center.
Integration with CASA project scheduled for final year of LEAD ITR.

The LEAD Gateway Lifecycle
Work began in 2003 with requirements analysis by the LEAD meteorology and CS teams.
First 2 years of development supported by LEAD ITR and NMI Portals project.
Year 3 & 4 support of 2 FTE from TG.
– Public Release March
Current Status
– A new production release in July
– Last year of LEAD ITR: hardened version of the Gateway to transition to community support
   UCAR - UNIDATA may be the host.
   Extensive planning underway.