EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org Grid application development with gLite and P-GRADE Portal Miklos Kozlovszky MTA SZTAKI.

Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE Grid application development with gLite and P-GRADE Portal Miklos Kozlovszky MTA SZTAKI

Presenter
MTA SZTAKI (Hungarian Academy of Sciences), Laboratory of Parallel and Distributed Systems
–Miklos Kozlovszky
EGEE-III (Enabling Grids for E-sciencE)
o GASUC Team
o Trainings and dissemination activities
SEE-GRID2 / SEE-GRID-SCI (South Eastern European GRID-enabled eInfrastructure Development)
o Manager of “Dissemination and Training” (WP5/NA3)

Introduction of LPDS (Lab of Parallel and Distr. Systems)
Research division of MTA SZTAKI from 1998
Head: Peter Kacsuk, Prof.
22 research fellows
Foundation member
–Central European Grid Consortium (2003)
–Hungarian Grid Competence Center (2003)
Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (from 2000)
–FP5: GridLab, DataGrid
–FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid
–FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE
Central European Grid Training Center in EGEE (from 2004)

Webpage
Find it from the EGEE User Forum webpage OR the EGEE Training webpage (Google EGEE NA3)  Events and registration (top menu) ..., Paris, December
Save the direct link!
Long term storage of training material
–Presentations in PPT
–Tutorials in HTML/DOC/PDF

Feedback form
Your comments and feedback are highly valuable for EGEE training
Please fill in the feedback form and return it at the end of the course
Anonymous
Scores: (very bad - very good)
Comments are highly appreciated

Goals of the day
Basic concepts of
–Workflow
–Parameter study on EGEE
Implementation in P-GRADE Portal
Further information
–How to learn more
–How to get access to EGEE
–How to port your own application to EGEE

Agenda
Application development on gLite *
–Workflow and parameter study concepts on EGEE
–Workload management and data services in gLite
Workflow and parameter study support in P-GRADE Portal
Hands-on
–Workflow exercises
–Parameter study exercises
How to learn more
* = (mostly skipped, please refer to previous presentations from yesterday)


Grid vision
[Diagram: grid middleware linking visualising, workstation, mobile access, supercomputer and PC-cluster, data-storage, sensors, and experiments over the Internet and networks]

Problems to solve
Standardised access to resources
–Computers
–Storage
–Special equipment
–Software services
Access policy
Load balancing
Monitoring resources and services
Monitoring applications
Fault management
Programming concepts, level of abstraction
User interfaces
...

EGEE grid, gLite middleware
Where computer science meets the application communities!
The tools, services used by the VO’s applications
NA4 Recommended External Software Packages for EGEE CommuniTies (RESPECT)
–Current RESPECT tools:
 GridWay
 P-GRADE Portal
–“Grid software” menu
Production infrastructure contains these services
–High level services: help the users build their computing infrastructure, but should not be mandatory
–Basic services: must be complete and robust; should not assume the use of Higher-Level Grid Services
[Diagram, layers: Basic gLite services (CE, SE, info, security); Higher-level gLite services (WMS,...); Application toolkits; Application. Access through command line & APIs.]

VO concept
gLite middleware runs on each EGEE site to provide
–Computation services: Computing Element
–Data services: Storage Element
–Security service
Sites and users form Virtual Organisations: basis for collaboration
Each VO can / must have central software services and support groups
[Diagram labels: INTERNET, P-GRADE Portal]

Basic gLite use case: Job submission
[Diagram: User Interface, VO Management Service (DB of VO users), Resource Broker, Information System, File and Replica Catalog, Logging and bookkeeping, and Site X with a Computing Element and a Storage Element. Arrow labels: create proxy; submit job (executable + small inputs); query; publish state; submit job; input file(s); output file(s); register file; logging; job status; retrieve status & (small) output files.]


Main components
User Interface (UI): The place where users log on to the Grid
Computing Element (CE): A batch queue on a site’s computers where the user’s job is executed
Storage Element (SE): Provides (large-scale) storage for files
Resource Broker (RB) (Workload Management System (WMS)): Matches the user requirements with the available resources on the Grid
Information System: Characteristics and status of CE and SE
File and replica catalog: Location of grid files and grid file replicas
Logging and Bookkeeping (LB): Log information of jobs
All built upon authorisation, authentication, security

How can I get access to EGEE?
Obtain a certificate from a recognized CA:
–www.gridpma.org: find the official CA of your country
 1 year long, renewable certificates
 Accepted in every EGEE VO
–GILDA CA: two weeks long, renewable certificate
 Accepted only in the GILDA training VO (the VO to be used today)
Find and register at a VO
–List of VOs with usage rules: CIC Operations portal
 Scientific discipline
 Geographical region
Use the VO services
–Through (low level) command line tools of gLite (not today)
–Through high level tools
 E.g. P-GRADE Portal, GENIUS, GANGA,...
 Access mechanism varies from tool to tool
[Diagram: CA, VO manager, VOMS database (VO Membership Service), Grid sites. Obtaining certificate: annually. Joining VO: once.]

Application developer’s questions
I have a computationally intensive problem. How does it relate to this scenario?
–What is a grid job for me?
–How many jobs do I have, and how do they relate to each other and to my data?
–What is the input / output data for each job?
–How to write a job to access input / output data?
–How to submit and monitor the job? How to access the results?
–Do I need to use additional services to meet my application’s demands?
Answers
–Now (sometimes specifically on P-GRADE Portal)
–Or any time later, for general purposes, from the Grid Application Support group (GASuC)

Functional vs data parallelism
Functional Decomposition (Functional Parallelism)
–Decomposing the problem into different jobs which can be distributed to different CEs for simultaneous execution
 Different executables run on different CEs (and may or may not process the same data)
–Good to use when
 the data cannot be partitioned
 there is no static structure or fixed determination of the number of calculations to be performed

Functional decomposition
[Diagram: the problem is split into Job 1 on Computing Element #1, Job 2 on Computing Element #2, Job 3 on Computing Element #3, and Job 4 on Computing Element #4. Along the time axis: job submission, job monitoring, result download.]

Functional decomposition in practice: workflow
[Diagram: a workflow manager (e.g. P-GRADE Portal server) splits the problem into jobs linked by data dependencies. Along the time axis it repeats job submission and job monitoring, transfers results between dependent jobs, and finally downloads the result.]
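
The workflow idea on this slide can be sketched in a few lines of Python: jobs are nodes, data dependencies are edges, and a job may only be submitted once the jobs it depends on have finished. The job names and the `run_job` stand-in are hypothetical, and this is not how the P-GRADE Portal server is implemented; it only illustrates dependency-ordered execution.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of jobs whose output it needs.
workflow = {
    "job1": set(),
    "job2": set(),
    "job3": {"job1", "job2"},  # data dependency on job1 and job2
    "job4": {"job3"},
}

def run_job(name):
    """Stand-in for submitting, monitoring, and fetching results of one grid job."""
    return f"{name}: done"

def run_workflow(graph):
    """Run jobs in waves; jobs inside one wave do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # these could run on different CEs at once
        for job in ready:
            run_job(job)
            sorter.done(job)  # unlocks the jobs that were waiting for this one
        waves.append(ready)
    return waves

waves = run_workflow(workflow)
print(waves)
```

Jobs in the same wave are the ones a workflow manager could send to different Computing Elements simultaneously.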

Functional vs data parallelism
Data Decomposition (Data Parallelism)
–Partitioning the problem's data domain and distributing portions to multiple instances of the same job for simultaneous execution
–Same executable runs on different CEs and processes different data
–Good to use for problems where:
 data is static (e.g. factoring, solving large matrix or finite difference calculations, parameter studies)
 a dynamic data structure is tied to a single entity that can be subsetted (large multi-body problems)
 the domain is fixed but computation within various regions of the domain is dynamic (fluid vortices models)
> 90% of grid applications employ data parallelism (parameter study, parametric study)

Data decomposition
[Diagram: the problem's data is split into data segments 1 to 4; the same algorithm runs as Job 1 to Job 4 on Computing Elements #1 to #4, one segment each. Along the time axis: job submission, job monitoring, result download.]
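
As a small illustration of data decomposition, the sketch below splits a data set into four segments and applies the same function to each one. The sum-of-squares `algorithm` and the helper name `split_into_segments` are made up for this example; on the grid each call would be a separate job on its own Computing Element.

```python
def split_into_segments(data, n_segments):
    """Partition a list into n_segments contiguous, nearly equal parts."""
    size, extra = divmod(len(data), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        end = start + size + (1 if i < extra else 0)  # first `extra` parts get one more item
        segments.append(data[start:end])
        start = end
    return segments

def algorithm(segment):
    """The same 'executable' applied to every segment (here: sum of squares)."""
    return sum(x * x for x in segment)

data = list(range(10))
segments = split_into_segments(data, 4)
partial_results = [algorithm(s) for s in segments]  # one grid job per segment
total = sum(partial_results)                        # combine the downloaded results
print(segments, partial_results, total)
```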

Data decomposition in practice: Master-slave
[Diagram: a master process (e.g. P-GRADE Portal server) runs the master job, which generates inputs, spawns slaves (job submit), monitors slaves, collects results (get job output), and generates the final result. Each slave job receives inputs and returns results.]
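
The master steps named on this slide (generate inputs, spawn slaves, collect results, generate final result) can be mimicked locally with a thread pool. This is only a sketch under simplifying assumptions: threads stand in for grid jobs, and `slave_job` is a hypothetical placeholder for the real executable.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_inputs(n):
    """Master step: create one input per slave job (hypothetical inputs)."""
    return list(range(1, n + 1))

def slave_job(value):
    """Stand-in for a grid job; every slave runs the same code on its own input."""
    return value * value

def master(n_slaves):
    inputs = generate_inputs(n_slaves)
    # Spawn the slaves; the pool plays the role of the Computing Elements.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(slave_job, inputs))  # waits until all slaves finish
    # Collect the results and generate the final result.
    return sum(results)

final_result = master(5)
print(final_result)
```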

Multi-level master-slave
[Diagram: two stacked master-slave levels. At each level a master job generates inputs, spawns slaves, monitors slaves, and collects results (job submit, check job status, get job output), and slave jobs take an input and return results. The final step generates the final result.]

Complex master-slave
[Diagram: a master job with three groups of slave jobs. For each group: generate inputs, spawn slaves, monitor slaves, collect results. The last step generates the final result.]

Complex master-slave = Parameter study workflow
[Diagram: a workflow manager runs a master job with three groups of slave jobs. For each group: generate local inputs, spawn slaves, monitor slaves, collect local results. The last step generates the result. Example: 3 files on one input and 9 files on another give 3 x 9 = 27 workflow (WF) instances and 27 outputs.]
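
The 3 x 9 = 27 figure on this slide is a cross product of the input sets. A minimal sketch, with invented file names and dictionary keys, shows how the instances of a parameter study can be enumerated:

```python
from itertools import product

# Hypothetical file names: 3 files for one input, 9 files for another.
input_a = [f"a{i}.dat" for i in range(3)]
input_b = [f"b{i}.dat" for i in range(9)]

# Every combination of one file from each set becomes one workflow instance.
instances = [{"input_a": a, "input_b": b} for a, b in product(input_a, input_b)]

print(len(instances))
print(instances[0])
```

Adding a third input set with N files would multiply the number of instances by N, which is why parameter studies grow quickly.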

Defining a job
Executable (EGEE runs Scientific Linux v3 or v4)
–Script:
 No compilation is necessary
 Can invoke a real executable that is statically installed on the CE (VOBox)
–Binary:
 Must be compiled on the User Interface: binary compatibility with EGEE is then guaranteed
 Statically linked, to avoid errors caused by library versions
Input / output data
–Input files
 Smaller than 20 MByte? If YES, transfer them from the client side (“InputSandbox”). If NOT, upload them to a Storage Element before job submission
–Output files
 Smaller than 20 MByte? If YES, transfer them back to the client side (“OutputSandbox”). If NOT, upload them to a Storage Element from the Computing Element
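
The 20 MByte rule of thumb above amounts to a simple decision per file. The sketch below encodes it; the function name, the example file names, and the reading of "20 MByte" as 20 x 1024 x 1024 bytes are assumptions made for illustration.

```python
SANDBOX_LIMIT_BYTES = 20 * 1024 * 1024  # assumption: "20 MByte" read as 20 MiB

def transfer_route(size_bytes, direction):
    """Decide how a file should travel, following the slide's rule of thumb."""
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    if size_bytes < SANDBOX_LIMIT_BYTES:
        # Small files travel with the job through the broker.
        return "InputSandbox" if direction == "input" else "OutputSandbox"
    # Large files are stored on (or uploaded to) a Storage Element instead.
    return "StorageElement"

# Hypothetical input files with their sizes in bytes.
files = {"params.txt": 2_000, "mesh.dat": 500 * 1024 * 1024}
plan = {name: transfer_route(size, "input") for name, size in files.items()}
print(plan)
```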

Distribution of large datasets
Put large files into Storage Elements and register them in the Logical File Catalog (LFC) (covered already during previous sessions)
Large files do not go through the broker
[Diagram: the master job generates local inputs, spawns slaves, monitors slaves, collects local results, and generates the result (job submit, check job status, get job output). Only Logical File Names pass through the broker between master and slave jobs; inputs and results are exchanged with the LFC & SEs.]

File services in gLite
Users’ files are stored on Storage Elements
A file on an SE is identified by a Storage URL (SURL), e.g. sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/ /filec79a9e3c a2a5-235f
Users refer to files by Logical File Names (LFN)
LFC = directory structure of LFNs + pointers to SURLs (files can have replicas)
[Diagram: the LFC directory lfn:/grid/gilda/kozlovszky/run2/ holds input1, input2, input3, which point to files on four Storage Elements]
Storage Element 1: sfn://grid005.iucc.ac.il/storage/gilda/generated/ /fileb233d43f-5bc6-4ede-a5fe-611d48be2ba5
Storage Element 2: srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/ /filea21ab3e2-8ff6-4a44-82a7-f2
Storage Element 3: sfn://trigriden01.unime.it/flatfiles/SE00/gilda/generated/ /filec79a9e3c a2a5-235f
Storage Element 4: sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/ /filec79a9e3c a2a5-235f

Name conventions
Users primarily access and manage files through “logical filenames”
–Defined by the user
LFC Namespace
–LFC has a directory tree structure: lfn:/grid/ /
–Today: lfn:/grid/gilda/parisXX/...
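
To make the relation between logical file names and storage URLs concrete, here is a toy in-memory catalog. It is not the LFC API: the class, its method names, and the `example.org` storage URLs are hypothetical; only the `lfn:/grid/gilda/kozlovszky/run2` directory is taken from the slides.

```python
class ToyFileCatalog:
    """In-memory stand-in for a file catalog: LFN -> list of replica SURLs."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, surl):
        """Record that a copy of the logical file exists at the given SURL."""
        replicas = self._replicas.setdefault(lfn, [])
        if surl not in replicas:
            replicas.append(surl)

    def resolve(self, lfn):
        """Return all known replicas of a logical file (empty list if unknown)."""
        return list(self._replicas.get(lfn, []))

    def list_directory(self, directory):
        """List the LFNs stored directly under a directory."""
        prefix = directory.rstrip("/") + "/"
        return sorted(
            lfn for lfn in self._replicas
            if lfn.startswith(prefix) and "/" not in lfn[len(prefix):]
        )

catalog = ToyFileCatalog()
run_dir = "lfn:/grid/gilda/kozlovszky/run2"
catalog.register(f"{run_dir}/input1", "srm://se1.example.org/data/file-a")
catalog.register(f"{run_dir}/input1", "srm://se2.example.org/data/file-a")  # a replica
catalog.register(f"{run_dir}/input2", "srm://se1.example.org/data/file-b")

replicas = catalog.resolve(f"{run_dir}/input1")
entries = catalog.list_directory(run_dir)
print(replicas)
print(entries)
```

A job that knows only the logical name can ask the catalog for the replicas and then read from whichever Storage Element is convenient.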

Managing a workload with gLite command line tools
Login to the User Interface machine
Write your jobs. Operations in a job:
–Access LFC, resolve LFN
–Access SE, get file content
–Process file
–Write result to SE
–Register file in LFC
(Compile your jobs to get the executables)
Write a job description for each job using Job Description Language (JDL)
–Text file
–Specifies Executable, Input and Output LFNs
–Specifies resource requirements and preferences (which CE)
Write the description of your workload
–Workflow JDL or parametric job JDL (no parametric workflow!)  myworkload.jdl
Use shell commands to
–Submit the workload: glite-wms-job-submit myworkload.jdl  wlID
–Monitor the status: glite-wms-job-status wlID
–Get the output sandbox: glite-wms-job-output wlID
Write a program (e.g. script) to
–Register input files in LFC before the workload is started
–Resubmit failed jobs
–Download result files from Storage Elements when the workload is finished
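
As a rough sketch of the "write a job description, then use shell commands" steps, the Python below generates a minimal JDL text and assembles the three command lines named on the slide without running them. The helper names, the `myjob.sh` executable, and the chosen sandbox files are assumptions for illustration; real submissions usually need further options and attributes, so the gLite user guide remains the reference.

```python
def jdl_list(items):
    """Format a Python list as a JDL list, e.g. {"a", "b"}."""
    return "{" + ", ".join(f'"{item}"' for item in items) + "}"

def make_jdl(executable, arguments, input_sandbox, output_sandbox):
    """Build a minimal single-job JDL text (hypothetical helper, not a gLite tool)."""
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(input_sandbox)};",
        f"OutputSandbox = {jdl_list(output_sandbox)};",
    ]
    return "\n".join(lines) + "\n"

def cli_commands(jdl_file, workload_id):
    """Command lines as shown on the slide; extra options are omitted here."""
    return {
        "submit": ["glite-wms-job-submit", jdl_file],
        "status": ["glite-wms-job-status", workload_id],
        "output": ["glite-wms-job-output", workload_id],
    }

jdl_text = make_jdl("myjob.sh", "input1", ["myjob.sh"], ["std.out", "std.err"])
commands = cli_commands("myworkload.jdl", "wlID")  # "wlID" is the placeholder from the slide
print(jdl_text)
print(commands["submit"])
```

On a User Interface machine, a script could write `jdl_text` to `myworkload.jdl` and pass each command list to `subprocess.run`, using the identifier returned by the submit step in place of `wlID`.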


Further information, references
EGEE
gLite middleware
gLite manuals, documentation (gLite user guide)
Recommended External Software Packages for EGEE Communities (RESPECT)
P-GRADE Grid Portal
P-GRADE Grid Portal (Here to login…)