Apache Airavata GSOC 2013


Target Community: Science Gateways
Enabling & Democratizing Scientific Research
[Figure labels: Knowledge and Expertise, Computational Resources, Scientific Instruments, Algorithms and Models, Archived Data and Metadata, Advanced Science Tools]

What does Apache Airavata do?
– Compose, manage, execute, and monitor distributed computational workflows.
– Wrap legacy command-line scientific applications with Web services.
– Run jobs on computational resources ranging from local resources to computational grids and clouds.
– Manage provenance data.
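As a rough illustration of what wrapping a command-line application can mean, the sketch below runs a program as a subprocess and packages its exit code and output into a plain dictionary that a web service handler could return. The function name `run_wrapped_application` and the result fields are hypothetical and are not part of the Airavata API. The "legacy application" is a one-line Python command standing in for a real scientific code.

```python
import subprocess
import sys


def run_wrapped_application(command, stdin_text=""):
    """Run a command-line program and return its result as a dictionary.

    Hypothetical helper for illustration only; not an Airavata function.
    """
    completed = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=False,  # report failures in the result instead of raising
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout.strip(),
        "stderr": completed.stderr.strip(),
    }


# Stand-in for a legacy scientific application: sums numbers read from stdin.
LEGACY_APP = [
    sys.executable,
    "-c",
    "import sys; print(sum(float(x) for x in sys.stdin.read().split()))",
]

result = run_wrapped_application(LEGACY_APP, stdin_text="1.5 2.5 4.0")
print(result)
```

Returning the exit code and error stream instead of raising keeps the wrapper usable by a caller that needs to report failures back to the user.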

[Figure: Apache Airavata architecture. Components: Workflow Interpreter, Application Factory, Message Box, Registry, Apache Airavata API. Other labels: End Users, Gateway Developer, Core Developer, Scientific Application, Computational Resources.]

Apache Airavata Components
Component: Description
– XBaya: Workflow graphical composition tool.
– Registry Service: Insert and access application, host machine, workflow, and provenance data.
– Workflow Interpreter Service: Execute the workflow on one or more resources.
– Application Factory Service (GFAC): Manages the execution of an application in a workflow.
– Messaging System: WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events.
– Airavata API: Single wrapping client to provide higher-level programming interfaces.

Hi, I’m Nolram. I’m a computational physicist. I run computational experiments every day. This is how I typically run my experiments.

First I collect my observed data, and then I pass the data to my applications and get the result. This is starting to become a very tiring task. [Figure labels: Scientific Application, Another Scientific Application]

How can I make this much simpler…? Logically, this is how my life would be made easier… Is it possible to automate this flow sequence without my guidance?

Scientists from many different fields face this problem every day. The solution is to use a workflow-powered science gateway to manage the experiment online. What is a workflow, you ask? Well, you just saw one in our previous animation…

A Typical Workflow
We introduce Apache Airavata, a system capable of composing, managing, executing, and monitoring small- to large-scale applications and workflows. Want to see how it works?

I will hand over my data and my experiment details (the workflow) to the Airavata server. Airavata will complete the experiment and return the results to me. While I wait for the results, Airavata will notify me with progress updates on my experiment. [Figure labels: The Gateway, Apache Airavata, Results, Progress of the experiment]

Let’s look closely at how Airavata manages workflows. [Figure labels: The Gateway, Apache Airavata, Results, Experiment progress]

Airavata has four main components:
1. The Workflow Interpreter: steers the workflow execution.
2. The GFac: steers science application executions and data transfers.
3. The Registry: defines the available applications and records all results of experiments.
4. The Message Box: records the progress of the workflow execution.
[Figure labels: The Gateway, Workflow Interpreter, GFac, Message Box, Registry]
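The division of labor among the four components can be sketched as a toy in-memory model. Everything below (class names, method names, the example applications) is a hypothetical stand-in chosen for illustration. It does not reflect Airavata's actual classes or service interfaces, which are exposed as services rather than local objects.

```python
class Registry:
    """Holds the available applications and records experiment results."""

    def __init__(self):
        self.applications = {}
        self.results = {}

    def register(self, name, func):
        self.applications[name] = func


class MessageBox:
    """Collects progress messages published during a workflow run."""

    def __init__(self):
        self.messages = []

    def publish(self, message):
        self.messages.append(message)


class GFac:
    """Executes one registered application on the data it is given."""

    def __init__(self, registry):
        self.registry = registry

    def execute(self, app_name, data):
        return self.registry.applications[app_name](data)


class WorkflowInterpreter:
    """Steers a workflow: runs each step in order, feeding output forward."""

    def __init__(self, registry, gfac, message_box):
        self.registry = registry
        self.gfac = gfac
        self.message_box = message_box

    def run(self, workflow_id, steps, data):
        for app_name in steps:
            self.message_box.publish(f"{workflow_id}: started {app_name}")
            data = self.gfac.execute(app_name, data)
            self.message_box.publish(f"{workflow_id}: finished {app_name}")
        self.registry.results[workflow_id] = data
        return data


registry = Registry()
registry.register("square", lambda values: [v * v for v in values])
registry.register("total", sum)

message_box = MessageBox()
interpreter = WorkflowInterpreter(registry, GFac(registry), message_box)
output = interpreter.run("exp-1", ["square", "total"], [1, 2, 3])
print(output)
print(message_box.messages)
```

In this sketch a gateway would read `message_box.messages` to show progress and `registry.results` to fetch the final result.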

A Stable API for Airavata
[Figure labels: Apache Airavata, End Users, Gateway Developer, Scientific Application, Computational Resources]

[Figure: four web-based clients interact with the Airavata Server through the Airavata Service Interface (which wraps the client API).
– Application Registration UI (Application Developer), steps A1 to A3. Labels: Service Map XML, Service Map to AWSDL.
– Web Based Workflow Composer (Workflow Developer), steps W1 to W4. Labels: Get AWSDL, Put XWF.
– Web Based Experiment Builder, steps E1 to E3. Labels: Get WI’s, Shred Workflow Inputs, Launch Workflow.
– Web Based Workflow Monitor, steps M1 to M3. Labels: Watch Progress, Get Workflow Graph, Monitor Workflow.]

Goal of the project
Design web-based interfaces for Airavata:
– Application Registration
– Workflow Construction
– Workflow Execution
– Workflow Monitoring
Provide an opportunity for GSoC participants to understand a distributed system in action.
Scope for research and software engineering papers.

Data Model
Application Description
– The user describes the inputs and outputs of the application.
– Currently this information is captured in the Service Map Schema.
– This schema is stored in the Airavata Registry as XML. The schema utility also generates an application service WSDL from this schema using the Airavata WSDL Generator.
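To make the idea of an XML application description concrete, here is a minimal sketch that writes an application's inputs and outputs to XML and reads them back. The element and attribute names (`application`, `input`, `output`, `name`, `type`) are invented for this example and are not the actual Service Map Schema, which this transcript does not show.

```python
import xml.etree.ElementTree as ET


def build_application_description(name, inputs, outputs):
    """Serialize an application's inputs and outputs to an XML string.

    The layout is a hypothetical stand-in for the Service Map Schema.
    """
    root = ET.Element("application", {"name": name})
    for tag, params in (("input", inputs), ("output", outputs)):
        for param_name, param_type in params:
            ET.SubElement(root, tag, {"name": param_name, "type": param_type})
    return ET.tostring(root, encoding="unicode")


def parse_application_description(xml_text):
    """Read the XML back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "inputs": [(e.get("name"), e.get("type")) for e in root.findall("input")],
        "outputs": [(e.get("name"), e.get("type")) for e in root.findall("output")],
    }


xml_text = build_application_description(
    "echo",
    inputs=[("message", "String")],
    outputs=[("echoed", "String")],
)
description = parse_application_description(xml_text)
print(xml_text)
print(description)
```

A registration UI could collect the same fields in a form and store the resulting XML, leaving WSDL generation to the server side.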

[Figure: the Application Registration UI (Application Developer, steps A1 and A2; labels: Service Map XML, Service Map to AWSDL) and the Web Based Workflow Composer (Workflow Developer, steps W1 and W2; label: Get AWSDL) exchange XML with the Airavata Server. Server components shown: API (launch and manage jobs), Messaging Subsystem (notify progress of job or workflow execution, real-time monitoring), Registry (Application Desc, Workflow), Workflow Interpreter and Application Factory (GFac) (execute and manage computations).]

A peek at one of the clusters
[Figure labels: Interconnect, Nodes]

Scheduling ‘qsub’ batch jobs on the cluster
[Figure: an SGE master node assigns jobs (JOB N, JOB O, JOB U, JOB X, JOB Y, JOB Z) to slots on worker nodes. Slots belong to Queue-A, Queue-B, and Queue-C (for example A Slot 1, B Slot 1, C Slot 1). The master node takes into account:
– Queues
– Policies
– Priorities
– Share/Tickets
– Resources
– Users/Projects]
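The figure's idea of a master node placing jobs into queue slots can be illustrated with a deliberately simplified sketch. It considers only a numeric priority and the number of free slots per queue. The policies, share/tickets, and resource requests that a real SGE scheduler weighs are left out, and the function name, job tuples, and slot counts are made up for the example.

```python
def schedule(jobs, free_slots):
    """Assign jobs to free queue slots, highest priority first.

    jobs: list of (job_name, queue_name, priority) tuples.
    free_slots: dict mapping queue_name to the number of free slots.
    Returns (running, pending), where running is a list of
    (job_name, queue_name) pairs and pending is a list of job names.
    """
    remaining = dict(free_slots)  # copy so the caller's dict is untouched
    running, pending = [], []
    # sorted() is stable, so equal priorities keep submission order.
    for name, queue, _priority in sorted(jobs, key=lambda job: -job[2]):
        if remaining.get(queue, 0) > 0:
            remaining[queue] -= 1
            running.append((name, queue))
        else:
            pending.append(name)
    return running, pending


free_slots = {"Queue-A": 1, "Queue-B": 2, "Queue-C": 0}
jobs = [
    ("JOB X", "Queue-A", 5),
    ("JOB Y", "Queue-A", 9),
    ("JOB Z", "Queue-B", 1),
    ("JOB U", "Queue-C", 7),
]

running, pending = schedule(jobs, free_slots)
print(running)  # [('JOB Y', 'Queue-A'), ('JOB Z', 'Queue-B')]
print(pending)  # ['JOB U', 'JOB X']
```

JOB Y takes the only free slot in Queue-A because it outranks JOB X, and JOB U waits because Queue-C has no free slot.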

[Figure: resource matching for a job, followed by selection and scheduling. System inputs: system characteristics, system status, resources. Job inputs: job policies, resources. User inputs: user policies, groups, roles, departments, projects.]

Simplified Gateway Architecture
[Figure: a user reaches the Gateway Interface with a username and password (Gateway Authentication). The Gateway Server fetches the community credential, obtains a grid proxy, and sends the proxy with a job submit or file transfer request to the Compute Servers, which return job status and output. The one-time gateway community setup covers the community account and grid certificate. Step labels: Step 0, Step 1, Step 2, 3, ...]
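The flow on this slide can be sketched as a small in-memory model. The mapping of actions to step numbers is an assumption, since the transcript lists the step labels without saying which action each one covers. The `Gateway` class, its methods, and the proxy string format are all hypothetical; no real grid middleware or certificate handling is involved.

```python
import secrets


class Gateway:
    """Toy gateway that submits jobs under a shared community credential."""

    def __init__(self, community_credential):
        # Assumed Step 0 (one-time setup): the gateway is given a single
        # community credential that is shared by all of its users.
        self._community_credential = community_credential
        self._users = {}
        self.submitted = []

    def add_user(self, username, password):
        # Plain-text storage keeps the sketch short; a real gateway
        # would store a salted hash instead.
        self._users[username] = password

    def authenticate(self, username, password):
        # Assumed Step 1: gateway-level authentication.
        stored = self._users.get(username)
        return stored is not None and secrets.compare_digest(stored, password)

    def _create_proxy(self):
        # Assumed Step 2: stand-in for deriving a short-lived grid proxy
        # from the community credential.
        return f"proxy-for-{self._community_credential}-{secrets.token_hex(4)}"

    def submit_job(self, username, password, job_request):
        if not self.authenticate(username, password):
            raise PermissionError("gateway authentication failed")
        proxy = self._create_proxy()
        # Assumed Step 3: forward the proxy and job request to a compute server.
        self.submitted.append({"user": username, "proxy": proxy, "job": job_request})
        return "SUBMITTED"


gateway = Gateway(community_credential="community-account")
gateway.add_user("nolram", "s3cret")
status = gateway.submit_job("nolram", "s3cret", "run simulation")
print(status)
```

The point of the pattern is that end users never hold a grid certificate themselves: they authenticate to the gateway only, and the gateway acts on the grid with the community credential.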

[Figure: science gateways and the Airavata versions behind them. Gateway labels: ParamChem, BioVLab, GridChem, DES, VLAB, UltraScan, NSG, CIPRES, POPLAR, … Version labels: Apache Airavata 1.0 (repeated several times) and Apache Airavata 2.0.]