1 P-GRADE Portal tutorial MTA SZTAKI Gergely Sipos
2 Agenda – Basics of P-GRADE Portal (~1 hour) – Workflow hands-on (~90 minutes) – Developing an Application Specific Portal using P-GRADE (~30 minutes)
3 P-GRADE overview and introduction: workflows & parameter sweeps (Basics)
4 Introduction of LPDS (Lab of Parallel and Distr. Systems) Research division of MTA SZTAKI since 1998 Head: Peter Kacsuk, Prof. 22 research fellows Founding member of – Central European Grid Consortium (2003) – Hungarian Grid Competence Center (2003) Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (from 2000) – FP5: GridLab, DataGrid – FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid – FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE Central European Grid Training Center in EGEE (from 2004)
5 Short History of P-GRADE portal Parallel Grid Application and Development Environment Initial development started in the Hungarian SuperComputing Grid project in 2003 It has been continuously developed since 2003 Detailed information: Open Source community development since January 2008:
6 Download of OSS P-GRADE portal 110 downloads within the first month ~697 total downloads until now
7 Main P-GRADE related projects EU SEE-GRID-1 ( ) –Integration with LCG-2 and gLite EU SEE-GRID-2 ( ) –Parameter sweep extension EU CoreGrid ( ) –To solve grid interoperation for job submission –To solve grid interoperation for data handling: SRB, OGSA-DAI GGF GIN (2006) –Providing the GIN Resource Testing portal EGEE 2,3 ( ) –Respect program tool used for training and application development ICEAGE ( ) –P-GRADE portal is used for training as official portal of the GILDA training infrastructure EU EDGeS ( ) –Transparent access to any EGEE and Desktop Grid systems –See Demo Booth 5: EDGeS – Desktop Grid Extension of the EGEE Infrastructure
8 References P-GRADE Portal service is available for –SEE-GRID infrastructure –Central European VO of EGEE –GILDA: Training VO of EGEE –Many national Grids (UK National Grid Service, GridIreland, Turkish Grid, Croatian Grid, etc.) –US Open Science Grid –Economy-Grid, Swiss BioGrid, Bio and Biomed EGEE VOs, BioInfoGrid, BalticGrid –GIN VO of OGF –EGEE Respect program tool
9 Portal installations
10 Multi-Grid service portal To be used today!
11 Current situation and trends in Grid computing Fast evolution of Grid middleware: –GT2, OGSA, GT3 (OGSI), GT4 (WSRF), LCG-2, gLite, … Many production Grid systems are built with them –EGEE (LCG-2 gLite), UK NGS (GT2), Open Science Grid (GT2 GT4), NorduGrid (~GT2) Although the same set of core services is available everywhere, the services are implemented in different ways –Data services –Computation services –Security services (single sign-on) –(Brokers)
12 E-scientists’ concerns The P-GRADE Grid Portal gives you the answers! How to concentrate on my own research if the middleware I would like to use is in continuous change? How can I learn and understand the usage of the Grid? How can I develop Grid applications? How can I execute grid applications? How to tackle performance issues? How to use several Grids at the same time? How to migrate my application from one grid to another? How can I collaborate with fellow researchers?
13 Motivations for developing P-GRADE portal P-GRADE portal should –Hide the complexity of the underlying grid middleware –Provide a high-level graphical user interface that is easy-to-use for e-scientists –Support many different grid programming approaches: Simple Scripts & Control (sequential and MPI job execution) Scientific Application Plug-ins Complex Workflows Parameter sweep applications: both on job and workflow level Interoperability: transparent access to grids based on different middleware technology (both computing and data resources) –Support several levels of parallelism
14 Layers in a Grid system Basic Grid services: AA, job submission, info, … Higher-level grid services (brokering,…) Application toolkits, standards Application Grid middleware Command line tools P-GRADE Portal services Graphical interface
15 Design principles of P-GRADE portal P-GRADE Portal is not only a user interface, it is a –General purpose –Workflow-level –Multi-Grid –Application Development and Execution Environment P-GRADE Portal includes a high-level middleware layer for orchestrating grid resources –inside a grid –among several different grids P-GRADE Portal is grid-neutral: –Unlike many existing grid portals it is not tailored to any particular grid type –Can be connected to various grids based on different grid middleware LCG-2, gLite, GT2, GT4, ARC, Unicore, etc. –Implements the high-level grid middleware services on top of the existing grid middleware services –The workflow interface is the same no matter which type of grid is connected to it
16 What is a P-GRADE Portal workflow? a directed acyclic graph where –Nodes represent jobs (batch programs to be executed on a computing element) –Ports represent input/output files the jobs expect/produce –Arcs represent file transfer operations semantics of the workflow: –A job can be executed if all of its input files are available
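The execution rule on this slide (a job may start once all of its input files are available) can be illustrated with a minimal Python sketch. The job and file names below are hypothetical, and the function is not part of the portal; it only groups the jobs of a small DAG into waves that could run in parallel.

```python
# Hypothetical workflow: job name -> (input files, output files).
# File names stand in for ports; an arc exists wherever one job's
# output file is another job's input file.
WORKFLOW = {
    "generate": ((), ("a.dat", "b.dat")),
    "left": (("a.dat",), ("left.out",)),
    "right": (("b.dat",), ("right.out",)),
    "collect": (("left.out", "right.out"), ("result.dat",)),
}


def execution_waves(workflow, initial_files=()):
    """Group jobs into waves; every job in a wave has all its inputs available."""
    available = set(initial_files)
    pending = dict(workflow)
    waves = []
    while pending:
        # A job is ready when every one of its input files already exists.
        ready = sorted(
            name for name, (inputs, _) in pending.items()
            if set(inputs) <= available
        )
        if not ready:
            # Nothing can start: the graph has a cycle or an input is missing.
            raise ValueError("blocked jobs: " + ", ".join(sorted(pending)))
        for name in ready:
            # Outputs become available only after the whole wave has run.
            available.update(pending.pop(name)[1])
        waves.append(ready)
    return waves


print(execution_waves(WORKFLOW))
```

Jobs inside one wave do not depend on each other, which is the workflow-level (branch) parallelism described on the next slide.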
17 Three Levels of parallelism – PS workflow level: Parameter study execution of the workflow (multiple instances of the same workflow can process different data files) – Workflow level: Parallel execution among workflow nodes, i.e. WF branch parallelism (multiple jobs can run in parallel) – Job level: Parallel execution inside a workflow node, i.e. MPI job as workflow component (each job can be a parallel program)
18 Example: Computational Chemistry Department of Chemistry, University of Perugia Solution of the Schrödinger equation for triatomic systems using the time-dependent (RWAVEPR) or time-independent (ABC) method Sequential Fortran 90 A single execution can take between 5 and 10 hours Many simulations at the same time (figure label: 25 times) See at demo booth 11: EGEE Application Porting Support Group
19 Grid interoperation by P-GRADE Accessing Globus, gLite and ARC based grids/VOs simultaneously P-GRADE portal
20 Typical user scenario Job compilation phase Certificate servers Portal server Grid services UPLOAD SOURCE(S) COMPILE – EDIT DOWNLOAD BINARY(IES) Client
21 Typical user scenario Application development phase Certificate servers Portal server Grid services START EDITOR OPEN & EDIT WORKFLOW or PARAMETER STUDY SAVE APPLICATION Client
22 Typical user scenario Workflow execution phase Certificate servers Portal server Grid services DOWNLOAD PROXY CERTIFICATES TRANSFER FILES, SUBMIT JOBS MONITOR JOBS VISUALIZE JOBS and APPLICATION PROGRESS DOWNLOAD (SMALL) RESULTS Client
23 P-GRADE Portal structural overview User interface layer Presents the user interface Internal layer – Java classes Represents the internal concepts Java Webstart workflow editor Web browser EGEE and Globus Grid services (gLite WMS, LFC,…; Globus GRAM, GridFTP, …) Client P-GRADE Portal server Grid Grid layer – gLite and Globus command line tools Interfacing with grid services
24 Interface layer User interface layer Java Webstart workflow editor Web browser Client Web server P-GRADE Portal server GridSphere Web portal framework GridSphere portlets P-GRADE portlets Workflow monitor: Java applet generator Workflow editor: Java webstart application
25 Interface layer functionalities User interface layer Java Webstart workflow editor Web browser Client Web server P-GRADE Portal server GridSphere Web portal framework GridSphere portlets P-GRADE portlets Workflow monitor: Java applet generator Workflow editor: Java webstart application Workflow portlet: Workflow manager, Storage, Upload Certificate portlet: Upload, download and other operations Settings portlet: Grid settings, Quota settings File management: Manage files in the grid Compiler portlet: Compile jobs on portal server Login Welcome...
26 P-GRADE vs. Non-P-GRADE portlets P-GRADE Portal portlets GridSphere 2.x Grid Portal framework
29 Internal layer P-GRADE Portal server Grid layer Interfacing with grid services GridSphere portlets P-GRADE portlets Workflow monitor: Java applet generator Workflow editor: Java webstart application “Tracefile” Java package: Parses workflow monitoring information Workflow editor server (Java servlet): Workflow retrieval, upload, Workflow state publication “Szupergrid” Java package: Workflow representation, Resources configuration, Quota management, Certificate management Java interfaces
30 Grid layer P-GRADE Portal server Grid layer GridSphere portlets P-GRADE portlets Workflow monitor: Java applet generator Workflow editor: Java webstart application “Szupergrid” Java package “Tracefile” Java package Workflow editor server (Java servlet) Workflow manager (Condor DAGMan) shell scripts Grid middleware clients: gLite User Interface, Globus client packages EGEE and Globus Grid services (gLite WMS, LFC,…; Globus GRAM, …)
31 Grid layer P-GRADE Portal server Grid layer GridSphere portlets P-GRADE portlets Workflow monitor: Java applet generator Workflow editor: Java webstart application “Szupergrid” Java package “Tracefile” Java package Workflow editor server (Java servlet) Workflow manager (Condor DAGMan) shell scripts Grid middleware clients: gLite User Interface, Globus client packages EGEE and Globus Grid services (gLite WMS, LFC,…; Globus GRAM, …) The grid middleware clients are client side command line tools and programming APIs to interact with gLite and Globus Grid Services
32 Portlets/functionalities of P-GRADE portal Settings (portlet) Certificate and proxy management (portlet) Information system visualization (portlet) Graphical workflow editing Workflow manager (portlet) LFC (EGEE) file management (portlet) Compilation support (portlet) Fault-tolerance support
33 Settings Portlet Portal administrator can –connect the portal to several grids –register the basic resources of the connected grids
34 Settings Portlet User can customize the connected grids by adding and removing resources
35 Certificate and proxy management Portlet Users can upload their certificates for various grids to the MyProxy server Users can download proxies and assign them to grids Users can use as many proxies simultaneously as there are grids connected to the portal As a result, parallel branches of a workflow can be executed simultaneously in several grids SEE-GRID access HUNGRID access
36 Solving Grid interoperation by P-GRADE Portal Different jobs can be executed in parallel in different grids [Figure labels: P-GRADE Portal, EGEE Grid, UK NGS, London, Rome, Athens]
37 Interoperation vs. Interoperability Interoperation: –short term solution that defines what needs to be done to achieve interoperation between current production grids using existing technologies Interoperability: –native ability of Grids and Grid middleware to interact directly via common open standards As defined by the GIN (Grid Interoperation Now) CG (Community Group) of the OGF (Open Grid Forum) [Figure labels: Interoperation: Grid 1, Grid 2, Grid 3, P-GRADE Portal; Interoperability: Grid 1, Grid 2, Grid 3]
38 Information system Portlet
39 Graphical workflow editing The aim is to define a DAG of batch jobs: 1. Drag & drop components: jobs and ports 2. Define their properties 3. Connect ports by channels (no cycles, no loops, no conditions) 4. The JDL file is generated automatically
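To give an idea of what an automatically generated job description might look like, here is a minimal Python sketch that renders a few common gLite JDL attributes (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) from a job's properties. The function name, the argument values and the std.out/std.err file names are assumptions made for illustration; the file the portal actually generates may contain other or additional attributes.

```python
def quote(value):
    """Wrap a value in double quotes, as JDL string literals require."""
    return '"' + value + '"'


def make_jdl(executable, arguments, input_files, output_files):
    """Render a minimal JDL-style job description as text (illustrative only)."""
    # The executable travels in the input sandbox together with the input files.
    sandbox_in = ", ".join(quote(f) for f in [executable, *input_files])
    # Standard output and error come back together with the output files.
    sandbox_out = ", ".join(quote(f) for f in ["std.out", "std.err", *output_files])
    lines = [
        f"Executable = {quote(executable)};",
        f"Arguments = {quote(arguments)};",
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {{{sandbox_in}}};",
        f"OutputSandbox = {{{sandbox_out}}};",
    ]
    return "\n".join(lines)


# Hypothetical job: a binary called "multiply" with two inputs and one output.
print(make_jdl("multiply", "3 4", ["INPUT1", "INPUT2"], ["OUTPUT"]))
```

The point of the sketch is only that every property set in the editor (executable, parameters, ports) maps to a line of the job description, so the user never has to write it by hand.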
40 Workflow Editor Properties of a job: – Binary executable – Type of executable – Number of required processors – Command line parameters – The resource to be used for the execution: Grid/VO (Computing element)
41 Workflow Editor Defining broker jobs Select a Grid with broker! (*_BROKER) Ignore the resource field! If default JDL is not sufficient use the built-in JDL editor!
42 Workflow Editor Defining input-output files File properties Type: input (the job reads the file) or output (the job generates the file) File type: local (comes from my desktop) or remote (comes from an SE) File: location of the file Internal file name: the executable opens the file under this name – fopen(“file.in”, …) File storage type (output files only): Permanent (final result) or Volatile (only a data channel)
43 How to refer to an I/O file? Input file – Local file: client side location: c:\experiments\11-04.dat – Remote file: LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat or GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat Output file – Local file: client side location: result.dat – Remote file: LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat or GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat
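The three ways of referring to a file can be told apart by their prefix. The following Python sketch is a hypothetical helper (not a portal API) that classifies the example references from this slide.

```python
def classify_file_reference(reference):
    """Return which kind of I/O file reference a string is (illustrative)."""
    if reference.startswith("lfn:"):
        # Logical file name, resolved through the LFC file catalog.
        return "remote: LFC logical file name"
    if reference.startswith("gsiftp://"):
        # Direct GridFTP address of the file on a storage host.
        return "remote: GridFTP address"
    # Anything else is treated as a path on the user's own machine.
    return "local: client side location"


# Raw string so that the backslashes of the Windows path are kept literally.
for ref in (
    r"c:\experiments\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
):
    print(ref, "->", classify_file_reference(ref))
```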
44 Local vs. remote files Portal server Grid services Computing elements Storage elements REMOTE INPUT FILES REMOTE OUTPUT FILES LOCAL INPUT FILES & EXECUTABLES LOCAL OUTPUT FILES LOCAL INPUT FILES & EXECUTABLES LOCAL OUTPUT FILES Only the permanent files! Your binary can access data services directly too GridFTP API GFAL API lfc-*, lcg-* commands
45 Workflow manager Lists available workflows Enables –Submitting –Aborting –Deleting existing workflows Shows status, logs and results of workflow executions Orchestrates job executions inside a workflow
46 LFC (EGEE) file management
47 Compilation support
48 Fault-tolerant Grid applications Utilizing –Condor DAGMan’s rescue mechanism –EGEE job resubmission mechanism of WMS If the EGEE broker leaves a job stuck in a CE’s queue, the portal automatically –kills the job on this site and –resubmits the job to the broker with this site excluded. As a result –the portal guarantees the correct submission of a job as long as there exists at least one matching resource –job submission is reliable even in an unreliable grid
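The kill-and-resubmit behaviour can be sketched as a simple loop. The Python code below is an illustration under stated assumptions: `job_gets_stuck` is a hypothetical callback standing in for the portal's detection of a stuck job, and picking the first remaining site stands in for the broker's decision. It is not the portal's implementation.

```python
def run_with_exclusion(matching_sites, job_gets_stuck):
    """Resubmit a job, excluding every site where it got stuck.

    matching_sites: site names that satisfy the job's requirements.
    job_gets_stuck: hypothetical callback, True if the job hangs in that
                    site's queue.
    Returns (site that accepted the job, sorted list of excluded sites).
    """
    excluded = set()
    while True:
        candidates = [site for site in matching_sites if site not in excluded]
        if not candidates:
            raise RuntimeError("no matching resource left")
        site = candidates[0]  # stand-in for the broker's choice
        if not job_gets_stuck(site):
            return site, sorted(excluded)
        # Stuck: the job would be killed here and the site prohibited.
        excluded.add(site)


# Hypothetical sites: the job hangs on the first two and runs on the third.
broken = {"ce1", "ce2"}
print(run_with_exclusion(["ce1", "ce2", "ce3"], lambda site: site in broken))
```

Because a site is never retried once excluded, the loop ends after at most one attempt per matching site, which mirrors the claim that submission succeeds as long as at least one matching resource works.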
49 Lessons learnt P-GRADE portal provides –Easy-to-use but powerful workflow system (graphical editor, wf manager, etc.) –Three levels of parallelism MPI job level Workflow branch level Parameter sweep at workflow level –Multi-grid/multi-VO access mechanism for various grids (LCG-2, gLite and GT2) Simultaneous access Transparent access Migrating a workflow from one grid to another requires no modification in the workflow
50 About the Practices
51 Practice 1 Solve matrix multiplication on HUNGRID (one job workflow) Job executable: C code, compiled on GILDA UI Expects command line parameters: M V Knows nothing about the grid Job input/output files: Program reads matrices from two files called INPUT1 and INPUT2 Program writes result matrix into file called OUTPUT Local execution on a PC: ./multiply M V Task: Execute the program on EGEE, transfer input and output files in Sandboxes from the client [Figure labels: Binary executable, INPUT, INPUT, OUTPUT]
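The job's file contract (read INPUT1 and INPUT2, write OUTPUT, know nothing about the grid) can be mirrored in a few lines. The real executable is C code; the Python sketch below only illustrates the contract. The text file layout (one row per line, whitespace separated numbers) is an assumption, and the command line parameters M and V are left out because the slide does not explain them.

```python
import os


def read_matrix(path):
    """Read a matrix stored as one row per line, values separated by spaces."""
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def write_matrix(path, matrix):
    """Write a matrix in the same one-row-per-line layout."""
    with open(path, "w") as handle:
        for row in matrix:
            handle.write(" ".join(str(x) for x in row) + "\n")


def multiply(a, b):
    """Plain row-by-column matrix product of two lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def main(workdir="."):
    # The file names are fixed, so the program needs no knowledge of the grid:
    # the sandbox mechanism puts the files next to the executable.
    a = read_matrix(os.path.join(workdir, "INPUT1"))
    b = read_matrix(os.path.join(workdir, "INPUT2"))
    write_matrix(os.path.join(workdir, "OUTPUT"), multiply(a, b))
```

Calling `main()` in a directory that contains INPUT1 and INPUT2 produces OUTPUT there, which is exactly what happens on the worker node after the input sandbox has been transferred.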
52 Practice 2 Save the multiplication OUTPUT on a Storage Element and register it in the File Catalog Modify output file type from “Local” to “Remote” Specify a logical file name as target location: lfn:/grid/hungrid/userXX Logical file name is defined by you: lfn:/grid/hungrid/userXX/… Storage Element is selected automatically by gLite middleware Browse result file using the File Manager Portlet [Figure labels: Binary executable, INPUT, INPUT, OUTPUT, gLite Storage Element, gLite File Catalog]
53 Practice 3 Combine jobs to build a MatrixOperations workflow [Workflow figure: inputs Matrix A and Matrix B; intermediate results A * B, A * B [ *, 0 ], A * B [ *, 1 ] and A * B [ *, 0 ] T; final result ( A * B [ *, 0 ] T ) * ( A * B [ *, 1 ] )]
54 Practice 4 Matrix multiplication PS parameter study workflow with 5 parameters Multiplication job Matrix
55 Practice 4 Matrix multiplication PS parameter study workflow with 5 parameters Auto generator: Y 9 <= Y <=21, step Matrix2 Input files stored on SEs and registered in LFC catalog Multiplication job Output files stored on SEs and registered in LFC catalog Collector Compressed output files
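The generator, job and collector stages of a parameter study can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, a thread pool stands in for the grid, and the step value is not given on the slide. The step of 3 used in the usage lines is an assumption, chosen merely because it yields five values in the range 9 to 21.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_parameters(first, last, step):
    """Generator stage: list every parameter value from first to last inclusive."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(first, last + 1, step))


def run_parameter_study(values, job):
    """Run one independent job instance per value, then collect the results."""
    with ThreadPoolExecutor() as pool:
        # map() keeps the results in the same order as the input values.
        results = list(pool.map(job, values))
    # Collector stage: gather all outputs into one structure.
    return dict(zip(values, results))


# Assumed step of 3 (not stated on the slide); the job is a placeholder.
values = generate_parameters(9, 21, 3)
print(run_parameter_study(values, lambda y: y * y))
```

Each job instance sees only its own parameter value, so the instances can be placed on different resources without any coordination until the collector runs.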
56 Thank you! Learn once, use everywhere Develop once, execute anywhere