LEAD Workflow Orchestration Lavanya Ramakrishnan Renaissance Computing Institute University of North Carolina – Chapel Hill Duke University North Carolina.

Slides:



Advertisements
Similar presentations
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
Advertisements

LEAD Portal: a TeraGrid Gateway and Application Service Architecture Marcus Christie and Suresh Marru Indiana University LEAD Project (
Abstraction Layers Why do we need them? –Protection against change Where in the hourglass do we put them? –Computer Scientist perspective Expose low-level.
The Anatomy of the Grid: An Integrated View of Grid Architecture Carl Kesselman USC/Information Sciences Institute Ian Foster, Steve Tuecke Argonne National.
Database Architectures and the Web
Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Kran Vahi, Jin-Soo Kim Oracle.
Agreement-based Distributed Resource Management Alain Andrieux Karl Czajkowski.
A Java Architecture for the Internet of Things Noel Poore, Architect Pete St. Pierre, Product Manager Java Platform Group, Internet of Things September.
The LEAD Effort at Unidata The Unidata Seminar will start at 1:30 PM MST.
DataGrid is a project funded by the European Union 22 September 2003 – n° 1 EDG WP4 Fabric Management: Fabric Monitoring and Fault Tolerance
Transparent Robustness in Service Aggregates Onyeka Ezenwoye School of Computing and Information Sciences Florida International University May 2006.
A Grid Resource Broker Supporting Advance Reservations and Benchmark- Based Resource Selection Erik Elmroth and Johan Tordsson Reporter : S.Y.Chen.
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
The new The new MONARC Simulation Framework Iosif Legrand  California Institute of Technology.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 17 Client-Server Processing, Parallel Database Processing,
Linked Environments for Atmospheric Discovery (LEAD): Web Services for Meteorological Research and Education.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
1 Using the Weather to Teach Computing Topics B. Plale, Sangmi Lee, AJ Ragusa Indiana University.
Focus Study: Mining on the Grid with ADaM Sara Graves Sandra Redman Information Technology and Systems Center and Information Technology Research Center.
Grid Computing for Real World Applications Suresh Marru Indiana University 5th October 2005 OSCER OU.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
18:15:32Service Oriented Cyberinfrastructure Lab, Grid Deployments Saul Rioja Link to presentation on wiki.
L inked E nvironments for A tmospheric D iscovery Linked Environments for Atmospheric Discovery (LEAD) Kelvin K. Droegemeier School of Meteorology and.
Presenter: Dipesh Gautam.  Introduction  Why Data Grid?  High Level View  Design Considerations  Data Grid Services  Topology  Grids and Cloud.
CONTENTS Arrival Characters Definition Merits Chararterstics Workflows Wfms Workflow engine Workflows levels & categories.
Active Monitoring in GRID environments using Mobile Agent technology Orazio Tomarchio Andrea Calvagna Dipartimento di Ingegneria Informatica e delle Telecomunicazioni.
Cluster Reliability Project ISIS Vanderbilt University.
1 School of Computer, National University of Defense Technology A Profile on the Grid Data Engine (GridDaEn) Xiao Nong
Ohio State University Department of Computer Science and Engineering 1 Cyberinfrastructure for Coastal Forecasting and Change Analysis Gagan Agrawal Hakan.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
Towards Low Overhead Provenance Tracking in Near Real-Time Stream Filtering Nithya N. Vijayakumar, Beth Plale DDE Lab, Indiana University {nvijayak,
Development Timelines Ken Kennedy Andrew Chien Keith Cooper Ian Foster John Mellor-Curmmey Dan Reed.
The Grid System Design Liu Xiangrui Beijing Institute of Technology.
ILDG Middleware Status Chip Watson ILDG-6 Workshop May 12, 2005.
Rule-Based Programming for VORBs Bertram Ludaescher Arcot Rajasekar Data and Knowledge Systems San Diego Supercomputer Center U.C. San Diego.
Middleware for Grid Computing and the relationship to Middleware at large ECE 1770 : Middleware Systems By: Sepehr (Sep) Seyedi Date: Thurs. January 23,
Ames Research CenterDivision 1 Information Power Grid (IPG) Overview Anthony Lisotta Computer Sciences Corporation NASA Ames May 2,
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Presented by Scientific Annotation Middleware Software infrastructure to support rich scientific records and the processes that produce them Jens Schwidder.
Streamflow - Programming Model for Data Streaming in Scientific Workflows Chathura Herath.
ICCS WSES BOF Discussion. Possible Topics Scientific workflows and Grid infrastructure Utilization of computing resources in scientific workflows; Virtual.
Sponsored by the National Science Foundation A New Approach for Using Web Services, Grids and Virtual Organizations in Mesoscale Meteorology.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Indiana University School of Informatics The LEAD Gateway Dennis Gannon, Beth Plale, Suresh Marru, Marcus Christie School of Informatics Indiana University.
System Center Lesson 4: Overview of System Center 2012 Components System Center 2012 Private Cloud Components VMM Overview App Controller Overview.
International Symposium on Grid Computing (ISGC-07), Taipei - March 26-29, 2007 Of 16 1 A Novel Grid Resource Broker Cum Meta Scheduler - Asvija B System.
Securing the Grid & other Middleware Challenges Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
6 march Building the INFN Grid Proposal outline a.ghiselli,l.luminari,m.sgaravatto,c.vistoli INFN Grid meeting, milano.
OGCE Workflow and LEAD Overview Suresh Marru, Marlon Pierce September 2009.
End-to-End Data Services A Few Personal Thoughts Unidata Staff Meeting 2 September 2009.
LEAD Project Discussion Presented by: Emma Buneci for CPS 296.2: Self-Managing Systems Source for many slides: Kelvin Droegemeier, Year 2 site visit presentation.
MSF and MAGE: e-Science Middleware for BT Applications Sep 21, 2006 Jaeyoung Choi Soongsil University, Seoul Korea
Physical Oceanography Distributed Active Archive Center THUANG June 9-13, 20089th GHRSST-PP Science Team Meeting GHRSST GDAC and EOSDIS PO.DAAC.
VgES Version 0.7 Release Overview UCSD VGrADS Team Andrew A. Chien, Henri Casanova, Yang-suk Kee, Jerry Chou, Dionysis Logothetis, Richard.
ACGT Architecture and Grid Infrastructure Juliusz Pukacki ‏ EGEE Conference Budapest, 4 October 2007.
G. Russo, D. Del Prete, S. Pardi Kick Off Meeting - Isola d'Elba, 2011 May 29th–June 01th A proposal for distributed computing monitoring for SuperB G.
VGrADS and GridSolve Asim YarKhan Jack Dongarra, Zhiao Shi, Fengguang Song Innovative Computing Laboratory University of Tennessee VGrADS Workshop – September.
Lessons from LEAD/VGrADS Demo Yang-suk Kee, Carl Kesselman ISI/USC.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
RESERVOIR Service Manager NickTsouroulas Head of Open-Source Reference Implementations Unit Juan Cáceres
A Quick tour of LEAD for the VGrADS
SC’07 Demo Draft VGrADS Team June 2007.
LEAD-VGrADS Day 1 Notes.
Design and Manufacturing in a Distributed Computer Environment
20409A 7: Installing and Configuring System Center 2012 R2 Virtual Machine Manager Module 7 Installing and Configuring System Center 2012 R2 Virtual.
Wide Area Workload Management Work Package DATAGRID project
The Anatomy and The Physiology of the Grid
The Anatomy and The Physiology of the Grid
Presentation transcript:

LEAD Workflow Orchestration Lavanya Ramakrishnan Renaissance Computing Institute University of North Carolina – Chapel Hill Duke University North Carolina State University

Outline Background and LEAD Service Architecture —web services, BPEL LEAD Workflows —status, site visit example run —characteristics, workflow structures LEAD Virtual Grid Integration —resource management

Linked Environments for Atmospheric Discovery Rationale —each year, mesoscale weather – floods, tornadoes, hail, strong winds, lightning, hurricanes and winter storms – causes hundreds of deaths, routinely disrupts transportation and commerce, and results in annual economic losses in excess of $13B. From “offline” to “online” forecasting —data assimilation and adaptive evaluation

Static Forecasting Large-scale tightly coupled components —not adaptable to weather or resource behavior —modifications are not trivial NetRad Radar ingest Fetch data products Convert to format for assimilation Assimilate into 3D grid forecast model execution Plan ensemble run Store results to personal catalog Analyze final product Center for Analysis and Prediction of Storms (CAPS, OU): It took an MS student 6 months, working with a research scientist, to learn how to modify the operational forecast system for her needs and run it using real data

Dynamic Adaptive LEAD System Meteorology goal —to provide timely and accurate forecasts using dynamic adaptation Computer Science goal —map application requirements to resource capabilities –redundant runs, scheduling policies —adapt to weather as well as resource behavior Need real time monitoring to make adaptation decisions

Define Domain, Data Requirements and Query for Desired Data ADAS Analysis Processing -producing- ADAS Analysis (3D Gridded Fields) + Background Fields ESML & LDM Decoding ADAS Remapping, Gridding, Conversion, Quality control ADAS-to-WRF Converter Yielding 3D Gridded Fields in WRF Height Coordinate STOP Adjust Forecast Configuration and Schedule Resources START Allocate Computational Resources and Move/Stream Data to Appropriate Location (e.g., PACI Center) Single or Multiple Copies of WRF Forecast Model WRF Gridded Output Meta Data Creation and Cataloging myLEAD Storage WRF Ensemble Initial Conditions Visualization Data Mining with ADaM Define Forecast Configuration, Mining Configuration and Thresholds Monitor Education & Meteorology LEAD Control and Data Flow

Service Oriented LEAD Workflows NetRad Radar ingest Fetch data products Convert to format for assim Assimilate into 3D grid Forecast model execution Plan ensemble run Store results to personal catalog Analyze final product BPEL Workflow Source: Beth Plale

Business Process Execution Language (BPEL) BPEL (aka WSBPEL and BPEL4WS) — enable cross-enterprise automated business processes —multiple vendor participation –Oracle, Microsoft, IBM, SUN, SAP, BEA, EDS, Adobe, … —multiple implementations –Oracle BPEL, IBM BPWS4J and Microsoft BizTalk Process Managers, Active BPEL ( Open Source Process Manager Features —loop and control logic, synchronous/asynchronous communication —composition and concurrent execution

LEAD Site Visit Experimental Ensemble Data Assimilation WRF IDV … IU (chinkapin) UNC (dante) ARPSPLT Unidata (lead1) OU (aquaman)

Characteristics of LEAD Workflows Coupled analysis and assimilation tools, data repositories —change configuration rapidly and automatically in response to weather —Streaming data, steer remote observing technologies Multilevel monitoring and intelligent control —workflow, resource, application, service —performance and reliability guarantees of the resources Model Resources Data Need resources Have more resources Get data Have more data Have more data

Adapting Workflow Structure Reactive Adaptation —e.g. service failure, resource behavior Proactive prediction and decisions —adjust i, j, k (weather science meets the infrastructure science) to meet individual workflow guarantees —global optimization of number of workflows being serviced …… … Adaptation in Time (i) Adaptation in Space (j) Adaptation based on Initial Conditions (k)

Outline Background and LEAD Service Architecture web services, BPEL LEAD Workflows status, site visit example run characteristics, workflow structures LEAD Virtual Grid Integration —resource management

Distributed Resources Computation Specialized Applications Steerable Instruments Storage Data Bases Resource Access Services GRAM Grid FTP SSH Scheduler LDM OPenDAP Generic Ingest Service User Interface Desktop Applications IDV WRF Configuration GUI LEAD Portal Portlets Visualization Workflow Education Monitor Control Ontology Query Browse Helper Crosscutting Services Authorization Authentication Monitoring Notification Configuration and Execution Services Workflow Monitor MyLEAD Workflow Engine/Factories Resource Cat THREDDS Application Resource Broker (Scheduler) Host Environment GPIR Application Host Execution Description WRF, ADaM, IDV, ADAS Application Description Application & Configuration Services Client Interface Observations Streams Static Archived Data Services Workflow Services Catalog Services RLS OGSA- DAI Geo-Reference GUI Control Service Query Service Stream Service Ontology Service Decoder/ Resolver Service Transcoder Service/ ESML LEAD Architecture

Distributed Resources (compute, disk) Protocols (e.g. web services, GridFTP, Gatekeeper) Crosscutting Services Authorization Authentication Monitoring Notification MyLEAD Client Interface LEAD Architecture Resource and Data Management (VGrid, myLEAD, etc) Application Services (e.g. WRF, IDV, etc) BPEL Workflow Engine LEAD Portal

Virtual Grid provides LEAD … Dynamic, scalable resource abstraction framework —scheduler, resource broker Integrated monitoring and notification of resource behavior —NWS, NWS-HAPI DVCW vgFAB vgES APIs vgMON Information Services vgLAUNCH VG vgAgent Grid Resources Resource Managers

DVCW vgFAB Proposed LEAD – VGrADS Architecture LEAD VGrid Manager vgES APIs vgMON vgDL Information Services Resource Managers vgLAUNCH VG vgAgent Grid Resources LEAD Event System BPEL Workflow Engine Expectations Violations Decision Process Other LEAD System Events (e.g. application, service)

Some Thoughts … LEAD VGrid Manager vgDL VG LEAD Event System BPEL Workflow Engine Expectations Violations Decision Process Web Service Progress of workflow Resource behavior Weather, Application behavior Performance, Fault tolerance, Availability Steer Workflow Find and bind resources Workflow Contract {time, accuracy (performance,reliability)} Workflow description Meet Guarantees

Streaming data —Unidata LDM Service monitoring —web service load Application —behavior on resources Resource —performance —reliability BPEL Workflow Engine LDM Service GridFTP Service WRF Service IDV Viz Service Data arrives 1. “Data has arrived” 3. Next step in workflow – Launch WRF 0. Possible advanced reservation 5. Move data to compute (virtual) nodes 4. Get resources Resource Broker Data Mining 6. Run WRF 8. WRF complete 7. More WRF runs needed? If yes repeat 5 to 6 2. Get required resources Start End Resources - compute, network storage LEAD Dynamic Workflows Multilevel monitoring Multiple decision points

vgDL Specification for LEAD LEADSpec = LDMNode = {isLDMNode} // Note the loose coupling far WRFNode = {memory >= 500MB, cpu > 2000, diskspace > 4GB} far VizNode = {memory >=4GB, cpu > 4000} vgidLEAD = vgCreateVG(vgESsrv.renci.org, LEADspec, 1000, null); vgRoot = vgGetRoot(vgidLEAD); LDMNode WRFNode VizNode vgRoot WRFspec = LooseBagof [1:32]; C = Clusterof [4:256]; node = {node.memory > 500MB, node.cpu > 2000}; C = {C.hasSharedFileSystem = true} vgidLEAD = vgGetMyVG(); wrfNode = vgGetMyNode(); status = vgAddToVG(vgidLEAD, WRFspec, wrfNode,1000, init_WRF); … … … Provisioning

LEAD Virtual Grid Research Implications Streaming data in resource scheduling —fixed point makes scheduling complicated Persistent and transient services —service directory, monitoring, scheduling Data management across the virtual grid —run time knowledge of data location Integrated decision process —performance and reliability guarantees —application, weather

Comments?

LEAD Monitoring Architecture Real-time Resource Monitoring (NWS-HAPI) Steer the workflow, etc Workflow (triggered by real-time data arrival) LEAD Monitoring and Orchestration Target Resource Set Resource Behavior Prediction Application Code Analysis (SvPablo) Historical data Workflow Engine Decision Process Scheduler Real-time Resource Monitoring (NWS-HAPI) Steer the workflow, etc Workflow (triggered by real- time data arrival) LEAD Monitoring and Orchestration Target Resource Set Application and Resource Behavior Prediction Application Code Analysis (SvPablo) Historical data Workflow Engine Decision Process Scheduler Weather, application change

LEAD Control and Data Flow LEGEND Control Flow Data Resources Input Data/Model grids in LEAD usable form ADAS Output in standard form ADAS Output in WRF input form WRF Output ADaM Output LEAD Metadata (Color Determines Resource) Experiment Metadata