1 P-GRADE Portal tutorial at EGEE'09 Gergely Sipos MTA SZTAKI EGEE Training and Induction.

Presentation transcript:

1 P-GRADE Portal tutorial at EGEE'09 Gergely Sipos MTA SZTAKI EGEE Training and Induction EGEE Application Porting Support

2 Agenda of the morning Introduction to workflow concept Workflow hands-on ~ Break Parameter studies Parameter study hands-on Further information and next steps

3 Workflow The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal. Workflow management system (WFMS) is the software that does it Workflow Reference Model, 19/11/1998

4 Why use workflows in Grid? Build distributed applications through orchestration of multiple services (a single job or a single service is good for nothing…). Integration of multiple teams involved: collaborative work. Unit of reuse. (E-)science requires traceable, repeatable analysis. (Typically) ease of use of grids: graphical representation.

5 Grid Workflow definition examples Grid workflow can be defined as the composition of grid application services which execute on heterogeneous and distributed resources in a well-defined order to accomplish a specific goal. R. Buyya The automation of the processes, which involves the orchestration of a set of Grid services, agents and actors that must be combined together to solve a problem or to define a new service. Geoffrey Fox [GGF 10]

6 Example: Ultra-short range weather forecast with P-GRADE Portal. Forecasting dangerous weather situations (storms, fog, etc.) is a crucial task in the protection of life and property. Processed information: surface level measurements, high-altitude measurements, radar, satellite, lightning, results of previously computed models. Requirements: execution time < 10 min, high resolution (1 km). Execution on a GT2 based Hungarian Grid. [Figure: workflow graph with job multiplicities 25 x, 10 x, 25 x, 5 x]

7 Example: Montage workflow with Pegasus (and DAGMan). Montage application: ~7,000 compute jobs in instance; ~10,000 nodes in the executable workflow; same number of clusters as processors; speedup of ~15 on 32 processors. Tasks run on NSF's TeraGrid. Source: Pegasus: a Framework for Mapping Complex Scientific Workflows onto Distributed Systems, Ewa Deelman, Gurmeet Singh, Mei-Hui Su, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, G. Bruce Berriman, John Good, Anastasia Laity, Joseph C. Jacob, Daniel S. Katz, Scientific Programming Journal, Volume 13, Number 3, 2005

8 Example: CancerGrid workflow with gUSE (and WS-PGRADE). [Figure: workflow graph with generator jobs and job multiplicities 1, N and NxM] N=20e-30e, M=100 → ~2.7 billion tasks! CancerGrid Portal: workflow is hidden from end users. Tasks run on Desktop Grids and RDBMS

9 Grid WFMS Source: Jia Yu and Rajkumar Buyya: A Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4 / September, 2005

10 What does a typical Grid WFMS provide?
A level of abstraction above grid processes
–gridftp, lcg-cr, lfc-mkdir, ...
–condor_submit, globus-job-run, glite-wms-job-submit, ...
–lcg-infosites, ...
A level of abstraction above "legacy processes"
–SQL read/write
–HTTP file transfer
–...
Automated mapping and execution of tasks on grid resources
–Submission of jobs
–Invocation of (Web) services
–Manage data
–Catalog intermediate and final data products
Improve successful application execution
Improve application performance
Provide provenance tracking capabilities

11 What does a typical grid workflow consist of? Dataflow graph Activities –Definition of Jobs –Specification of services Data channels –Data transfer –Coordination Cyclic / acyclic (DAG) Conditional statements

12 Data lifecycle in workflows Workflow Creation Workflow Mapping and Execution Workflow Reuse

13 User interaction Workflow Creation Workflow Mapping and Execution Workflow Reuse WF definition tools WF enactment service Storages, Catalogs

14 Layered architecture of WFMS
–Grid scheduler, e.g. Condor Schedd: reliable, scalable execution of independent tasks (locally, across the network), priorities, scheduling
–WF scheduler, e.g. Condor DAGMan: reliable and scalable execution of dependent tasks
–WF optimizer, e.g. Pegasus Mapper: a decision system that develops strategies for reliable and efficient execution in a variety of environments
–Cyberinfrastructure: Cluster, Condor pool, OSG, EGEE, TeraGrid
Diagram labels: Abstract Workflow, Results

15 (Some of the) available grid workflow systems
Categories for
–Composition tools
–Description languages: Scientific, Industrial, Formalism
–Engines
Some relevant tools for ARC, gLite, Globus, UNICORE grid users
Condor DAGMan
–Used as an enactor in P-GRADE Portal, Pegasus, …
–Uses DAGMan WF language (DAG = Directed Acyclic Graph)
MOTEUR
–Interfaced with "pilot job" framework on EGEE (pull style job execution)
–Uses SCUFL WF language
gLite WMS
–Describe workflows in JDL
–Share Input-Output sandboxes with multiple jobs
Taverna
–Mainly for cluster computing
–ARC interface is available by Lubeck University
…

16 P-GRADE Portal A Grid WFMS

17 Short History of P-GRADE portal Parallel Grid Application Development Environment Initial development started in the Hungarian SuperComputing Grid project in 2003 It has been continuously developed since 2003 Around 30 man-years of development + training + user support Detailed information: Open Source community development since January 2008: Current version: 2.8

18 Current P-GRADE Portal related projects GGF GIN (Since 2006) –Providing the GIN Resource Testing portal EU EGEE-II, EGEE-III ( ) –Tool recommended for application development –Intensively used in new users’ training EU SEE-GRID-SCI ( ) –Interfacing to DSpace-based workflow storage –Infrastructure testing workflows EU CancerGrid ( ) –Development of new generation P-GRADE (gUSE and WS-PGRADE) –Integration with desktop grids EU EDGeS ( ) –Transparent access to Desktop Grid systems

19 Portal installations P-GRADE Portal services: –SEE-GRID infrastructure –Several VOs of EGEE: Biomed, Astronomy, Central European, NA4,... –GILDA: Training VO of EGEE –Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.) –US Open Science Grid, TeraGrid –OGF Grid Interoperability Now (GIN) VO –… Portal services and account request: Account request form on portal login page

20 Multi-Grid portal installation:

21 Design principles of P-GRADE portal P-GRADE Portal is not only a user interface, it is a –General purpose –Workflow-level –Multi-Grid –Application Development and Execution Environment P-GRADE Portal includes a high-level middleware layer for orchestrating jobs on grid resources –inside a grid –among several different grids (and several VOs) P-GRADE Portal is grid-neutral: –Unlike many existing grid portals it is not tailored to any particular grid type –Can be connected to various grids based on different grid middleware LCG-2, gLite, GT2, GT4, ARC, Unicore, etc. –Implements the high-level grid middleware services on top of the existing grid middleware services –The workflow interface is the same no matter which type of grid is connected to it

22 What is a P-GRADE Portal workflow? A directed acyclic graph where –Nodes represent jobs (batch programs to be executed on a computing element) –Ports represent input/output files the jobs expect/produce –Arcs represent file transfer operations Semantics of the workflow: –A job can be executed if all of its input files are available
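The execution rule on this slide (a job can start once all of its input files are available) can be illustrated with a small sketch. This is not the portal's enactor (the slides say Condor DAGMan plays that role); the job names, file names and the `execution_waves` helper are hypothetical and only mimic the rule on plain Python dictionaries.

```python
def execution_waves(jobs):
    """Group jobs into waves; every job in a wave has all its inputs available.

    jobs maps a job name to {"inputs": set of files, "outputs": set of files}.
    Files that no job produces are assumed to be supplied by the user.
    """
    produced = {f for job in jobs.values() for f in job["outputs"]}
    available = {f for job in jobs.values() for f in job["inputs"]} - produced
    pending = dict(jobs)
    waves = []
    while pending:
        # A job is ready when all of its input files are available.
        ready = sorted(n for n, job in pending.items() if job["inputs"] <= available)
        if not ready:
            raise ValueError("cycle detected: the graph is not a DAG")
        for name in ready:
            available |= pending.pop(name)["outputs"]
        waves.append(ready)
    return waves


# Hypothetical four-job workflow with two parallel branches.
workflow = {
    "pre": {"inputs": {"raw.dat"}, "outputs": {"clean.dat"}},
    "simA": {"inputs": {"clean.dat"}, "outputs": {"a.out"}},
    "simB": {"inputs": {"clean.dat"}, "outputs": {"b.out"}},
    "merge": {"inputs": {"a.out", "b.out"}, "outputs": {"result.dat"}},
}
print(execution_waves(workflow))
```

Jobs that land in the same wave have no file dependency on each other, so they could run in parallel on different resources.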

23 Three levels of parallelism –PS workflow level: Parameter study execution of the workflow (multiple instances of the same workflow process different data files) –Workflow level: Parallel execution among workflow nodes (WF branch parallelism; multiple jobs run in parallel) –Job level: Parallel execution inside a workflow node (MPI job as workflow component; each job can be a parallel program)

24 Example: Computational Chemistry, Department of Chemistry, University of Perugia. Solution of Schrodinger equation for triatomic systems using time-dependent (RWAVEPR) or time-independent (ABC) method. Sequential Fortran 90. A single execution can take between 5 and 10 hours. Many simulations at the same time: ~100 independent jobs to run. Full story: EGEE Grid Application Porting Support -

25 Typical user scenario Job compilation phase Portal server Grid services DOWNLOAD BINARY(IES) UPLOAD JOB SOURCE(S) Client COMPILE – EDIT

26 Typical user scenario Workflow development phase Portal server Grid services START EDITOR OPEN & EDIT WORKFLOW ADD BINARIES SAVE WORKFLOW Client DSpace WF repository IMPORT WORKFLOW

27 Typical user scenario Workflow execution phase MyProxy Certificate servers Portal server Grid services Client DOWNLOAD PROXY CERTIFICATES TRANSFER FILES, SUBMIT JOBS MONITOR JOBS VISUALIZE JOBS and WORKFLOW PROGRESS DOWNLOAD (SMALL) RESULTS

28 Accessing local and remote files Portal server Grid services Computing elements Storage elements and File catalogs REMOTE INPUT FILES REMOTE OUTPUT FILES LOCAL INPUT FILES & EXECUTABLES LOCAL OUTPUT FILES LOCAL INPUT FILES & EXECUTABLES LOCAL OUTPUT FILES Only the permanent files! Use legacy executables with Grid files without touching the code

29 P-GRADE Portal structural overview Web browser Java Webstart workflow editor Extended DAGMan WF specification Extended DAGMan Globus and gLite command line clients + scripts EGEE, Globus (and ARC) Grid services + MyProxy service (gLite WMS, LFC,…; Globus GRAM, …) Globus GIIS gLite BDII DSpace repository

30 Web interface - Portlets

31 notifications NOTIFY

32 Workflow portlet WORKFLOW EDITOR

33 Graphical workflow editing To define a graph: 1.Drag & drop components: jobs and ports 2.Define their properties 3.Connect ports by channels (no cycles, no loops) System generates JDL for each job automatically
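The slide states that the system generates a JDL description for each job automatically but does not show the result. As a rough illustration only, the sketch below renders a minimal job description using common gLite JDL attributes (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`). The `to_jdl` helper, the file names and the choice of attributes are assumptions; the JDL actually produced by the portal may differ.

```python
def to_jdl(executable, arguments="", input_files=(), output_files=()):
    """Render a minimal JDL text for one job (hypothetical helper)."""

    def jdl_list(items):
        # JDL lists are written as {"a", "b"}.
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        # Assumption: the executable travels in the input sandbox.
        f"InputSandbox = {jdl_list([executable, *input_files])};",
        f'OutputSandbox = {jdl_list(["std.out", "std.err", *output_files])};',
    ]
    return "[\n  " + "\n  ".join(lines) + "\n]"


# Hypothetical job taken from a workflow node with one input and one output port.
jdl = to_jdl("sim.sh", "-n 10", ["file.in"], ["file.out"])
print(jdl)
```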

34 Workflow Editor Properties of a job Properties of a job: Executable file Type of executable (Sequential / Parallel) Command line parameters Which resource to use? Which VO? Broker or Computing element?

35 Workflow Editor Defining input-output files File properties Type: input: the executable reads output: the executable generates File type: local: comes from my desktop remote: comes from an SE File: location of the file Internal file name: Executable uses this e.g. fopen(“file.in”, …) File storage type (output files only): Permanent: final result Volatile: temp. data channel

36 How to refer to an I/O file?
Input file
–Local file: client side location, e.g. c:\experiments\11-04.dat
–Remote file: LFC logical file name (LFC file catalog is required – EGEE VOs), e.g. lfn:/grid/gilda/sipos/11-04.dat
–Remote file: GridFTP address (in Globus Grids), e.g. gsiftp://somengshost.ac.uk/mydir/11-04.dat
Output file
–Local file: client side location, e.g. result.dat
–Remote file: LFC logical file name (LFC file catalog is required – EGEE VOs), e.g. lfn:/grid/gilda/sipos/11-04_-_result.dat
–Remote file: GridFTP address (in Globus Grids), e.g. gsiftp://somengshost.ac.uk/mydir/result.dat
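The three reference styles on this slide can be told apart by their prefix. The sketch below is only an illustration of that distinction using the example references from the slide; `classify_file_reference` is a hypothetical helper, not a portal function.

```python
def classify_file_reference(ref):
    """Return (file type, reference style) for a port's file reference."""
    if ref.startswith("lfn:"):
        # Logical file name resolved through an LFC file catalog (EGEE VOs).
        return ("remote", "LFC logical file name")
    if ref.startswith("gsiftp://"):
        # GridFTP address, as used in Globus Grids.
        return ("remote", "GridFTP address")
    # Anything else is treated as a path on the user's own machine.
    return ("local", "client side location")


examples = [
    r"c:\experiments\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
]
for ref in examples:
    print(ref, "->", classify_file_reference(ref))
```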

37 Upload a workflow from client side or from FTP server UPLOAD STORED on FTP server

38 Importing an application INCOMPLETE WORKFLOW → Open it in editor and save it again

39 Import a workflow from DSpace repository

40 External access to DSpace

41 Certificate and proxy management Portlet

42 OGF GIN interoperability portal by P-GRADE Accessing Globus, gLite and ARC based grids/VOs simultaneously P-GRADE portal Proxy 1 Proxy 2 Proxy 5 Proxy 4 Proxy 3 Proxy 6

43 Application execution

44 Fault-tolerant execution Utilizing –Condor DAGMan's rescue mechanism –EGEE job resubmission mechanism of WMS If the EGEE broker leaves a job stuck in a CE's queue, the portal automatically –kills the job on this site and –resubmits the job to the broker, prohibiting this site. As a result –the portal guarantees the correct submission of a job as long as there exists at least one matching resource –job submission is reliable even in an unreliable grid
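The kill-and-resubmit behaviour described on this slide can be sketched as a loop that excludes each site where the job got stuck. This is a simulation under stated assumptions, not the portal's code: `submit_with_resubmission` and `run_on_site` are hypothetical names, and picking the first remaining candidate merely stands in for the broker's matchmaking.

```python
def submit_with_resubmission(matching_sites, run_on_site):
    """Try sites until the job runs; return (site used, prohibited sites).

    run_on_site(site) is a caller-supplied callable returning True when the
    job started on the site and False when it got stuck in the queue.
    """
    prohibited = []
    while True:
        candidates = [s for s in matching_sites if s not in prohibited]
        if not candidates:
            raise RuntimeError("no matching resource left")
        site = candidates[0]  # stand-in for the broker's choice
        if run_on_site(site):
            return site, prohibited
        # Job is stuck: it would be killed here, and the site is prohibited
        # for the next submission.
        prohibited.append(site)


# Hypothetical grid where only the third computing element actually works.
site, prohibited = submit_with_resubmission(
    ["ce1", "ce2", "ce3"], lambda s: s == "ce3"
)
print(site, prohibited)
```

The loop ends successfully as long as at least one matching site accepts the job, which mirrors the guarantee stated on the slide.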

45 Information system visualization

46 LFC-SE file browser portlet

47 Compilation support

48 WORKFLOW HANDS-ON

49 From workflows to parameter studies Advanced execution patterns

50 Scaling up a workflow to a parameter study Complete workflow P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs ) P-GRADE Portal: Results produced in the same catalog

51 Advanced parameter studies Generator component(s) Initial input data Generate or cut input into smaller pieces Collector component(s) Aggregate result Complete workflow P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs ) P-GRADE Portal: Results produced in the same catalog

52 Concept of parameter study workflows GEN SEQ COLL SEQ Parameter study part Collector part evaluates and integrates the results Generator part generates the input parameter space
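The generator, parameter study and collector parts can be pictured as three functions chained together. The sketch below is a local, sequential stand-in under stated assumptions: all function names and the toy computation (sum of squares) are hypothetical, and on a grid each workflow instance would run as an independent set of jobs.

```python
def generator(initial_input, pieces):
    """Generator part: cut the initial input into at most `pieces` chunks."""
    size = max(1, -(-len(initial_input) // pieces))  # ceiling division
    return [initial_input[i:i + size] for i in range(0, len(initial_input), size)]


def workflow_instance(chunk):
    """Parameter study part: one execution of the (toy) workflow."""
    return sum(x * x for x in chunk)


def collector(results):
    """Collector part: evaluate and integrate the partial results."""
    return sum(results)


def run_parameter_study(initial_input, pieces):
    chunks = generator(initial_input, pieces)
    # The instances are independent of each other; here they run in a loop.
    results = [workflow_instance(chunk) for chunk in chunks]
    return collector(results)


total = run_parameter_study(list(range(10)), 3)
print(total)
```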

53 Turning a WF into a parameter study By switching at least one of the open input ports into a “PS Input port” the WF is turned into a Parameter Study

54 Input-output files are stored in SEs. /grid/gilda/sipos/InputImages: Image.0, Image.1. /grid/gilda/sipos/XCoordinates: XCoordinate.0, XCoordinate.1. /grid/gilda/sipos/YCoordinates: YCoordinate.0, YCoordinate.1. /grid/gilda/sipos/Output: ImagePart.0, ... CROSS PRODUCT of data items: 2 x 2 x 2 = 8 executions of the whole workflow
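The count on this slide follows from taking the cross product of the files found in the three input directories. A minimal sketch, using the directory and file names from the slide; the `workflow_instances` helper is hypothetical.

```python
from itertools import product

# Contents of the three PS input directories under /grid/gilda/sipos/.
ps_inputs = {
    "InputImages": ["Image.0", "Image.1"],
    "XCoordinates": ["XCoordinate.0", "XCoordinate.1"],
    "YCoordinates": ["YCoordinate.0", "YCoordinate.1"],
}


def workflow_instances(ps_inputs):
    """One dict per workflow execution: PS input directory -> chosen file."""
    names = list(ps_inputs)
    return [dict(zip(names, combo)) for combo in product(*ps_inputs.values())]


instances = workflow_instances(ps_inputs)
print(len(instances))  # 2 x 2 x 2 combinations
for instance in instances:
    print(instance)
```

Adding one more file to any directory multiplies the number of executions accordingly, which is why PS workflows scale up quickly.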

55 Typical data-flow compositions, for input sets {A1, A2, A3} and {B1, B2, B3}. DOT ITERATOR (one-to-one): WF runs on the pairs (A1, B1), (A2, B2), (A3, B3). CROSS ITERATOR (all-to-all): WF runs on every pair (Ai, Bj). MATCH ITERATOR: WF runs on (Ai, Bj) if Ai and Bj have a common ancestor. Find these in e.g. TAVERNA, MOTEUR. Figure callout: P-GRADE Portal supports this
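The three iterators differ only in which pairs of data items they feed to the workflow. The sketch below shows that difference on plain lists; the function names and the `ancestors` provenance map are hypothetical and do not reproduce how Taverna, MOTEUR or the portal implement these strategies.

```python
from itertools import product


def dot(a_items, b_items):
    """Dot iterator: one-to-one pairing by position."""
    return list(zip(a_items, b_items))


def cross(a_items, b_items):
    """Cross iterator: all-to-all pairing."""
    return list(product(a_items, b_items))


def match(a_items, b_items, ancestors):
    """Match iterator: pair items that share at least one ancestor."""
    return [
        (a, b)
        for a, b in product(a_items, b_items)
        if ancestors[a] & ancestors[b]
    ]


A = ["A1", "A2", "A3"]
B = ["B1", "B2", "B3"]
# Assumed provenance: each item maps to the set of source items it derives from.
ancestors = {
    "A1": {"s1"}, "A2": {"s2"}, "A3": {"s3"},
    "B1": {"s2"}, "B2": {"s1"}, "B3": {"s4"},
}
print(dot(A, B))
print(len(cross(A, B)))
print(match(A, B, ancestors))
```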

56 PS Input Port Grid Directory instead of FILE reference

57 Parameter generator Generator can be attached to any parameter input port Generator can be Auto generator: to generate text files Custom generator: to generate any content Generated files are moved into SE by the portal

58 Definition Window of Auto Generator Job User defines the template of the text file User puts key(s) into the template User defines values for the key(s) Integer number Real number Custom set …
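The template-and-keys idea behind the Auto Generator can be sketched as plain text substitution. Everything specific here is an assumption: the `<X>` key syntax, the `auto_generate` helper, and the choice to combine the values of several keys as a cross product are illustrative and not taken from the portal.

```python
from itertools import product


def auto_generate(template, key_values):
    """Return one generated text per combination of key values."""
    keys = list(key_values)
    generated = []
    for combo in product(*(key_values[key] for key in keys)):
        text = template
        for key, value in zip(keys, combo):
            # Replace every occurrence of the key with the chosen value.
            text = text.replace(key, str(value))
        generated.append(text)
    return generated


# Hypothetical template with one integer key and one real-number key.
template = "x=<X>\ny=<Y>\n"
files = auto_generate(template, {"<X>": [1, 2], "<Y>": [0.5]})
for content in files:
    print(repr(content))
```

Each generated text would correspond to one input file, which according to the previous slide is then moved into an SE by the portal.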

59 Placement of result

60 Placement of result Will contain one compressed file for each execution of the workflow. Use the default value! Choose a „reliable” Storage Element

61 Executing PS workflows "PS Details" for parameter sweep workflow applications

62 Detailed view of a PS workflow Workflow instances Overall statistics of workflow instances Collector job(s) Generator job(s)

63 PARAMETER STUDY HANDS-ON

64 Thank you! Learn once, use everywhere Develop once, execute anywhere

65 Backup slides to answer questions

66 Proxy delegations MyProxy server P-GRADE Portal server GILDA services Proxy VOMS server Proxy VOMS ext. Proxy VOMS ext. username password Proxy based authentication Login & psw based authentication username password

67 Settings Portal administrator can –connect the portal to several grids –register default resources of the connected grids

68 Settings User can customize the connected grids by adding and removing resources