The EDGI (European Desktop Grid Initiative) infrastructure and its usage for the European Grid user communities
József Kovács (MTA SZTAKI), jozsef.kovacs@sztaki.hu
http://edgi-project.eu
Start date: 2010-06-01. Duration: 24 months.
EDGI is supported by the FP7 Capacities Programme under contract nr RI-261556
EU FP7 projects on desktop grids: EDGeS → EDGI and DEGISCO
EDGeS: DG ↔ SG integration (gLite → BOINC, XtremWeb; BOINC, XtremWeb → gLite) for compute-intensive applications
EDGI (further develops EDGeS): ARC, UNICORE, Clouds; QoS with Clouds; data-intensive apps; SG→DG direction support
DEGISCO (supports EDGeS): disseminate and support EDGeS results world-wide; Green IT aspects
Focus of EDGI
Vision
EDGeS scope: only compute-intensive applications, for EGEE (gLite)
EDGI scope: both compute-intensive and data-intensive applications, for EMI/EGI (gLite, ARC, UNICORE)
Extend Desktop Grids with Clouds for QoS
Types of Desktop Grids
Volunteer (Global, Public) Desktop Grid
  Aim: to collect resources for grand-challenge scientific problems
  Examples: SETI@home, Folding@home, Shakemovie@home, LHC@home, World Community Grid, IberCivis, SZTAKI Desktop Grid
Institutional (Local, Non-public) Desktop Grid
  Aim: to let any community (company, university, etc.) create a grid quickly, easily and inexpensively to run its own applications
  Examples: SZTAKI Desktop Grid (SZDG) local version (used within EDGeS, EDGI, DEGISCO), University of Westminster Institutional DG
Institutional DGs in practice: University of Westminster as an example
Nodes per site:
  New Cavendish Street: 576 nodes
  Marylebone Campus: 559 nodes
  Regent Street: 395 nodes
  Wells Street: 31 nodes
  Little Titchfield Street: 66 nodes
  Harrow Campus: 254 nodes
  Total: 1881 nodes
Lifecycle of a node:
  PCs are primarily used by students and staff
  If a PC is unused, it switches to Desktop Grid mode
  When the DG server has no more work, the PC shuts down (green solution)
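The node lifecycle above can be read as a small state machine. The following is a minimal sketch of that reading, not the Westminster implementation: the state names, the `next_state` function and its two boolean inputs are all hypothetical, and the rule that a powered-off PC stays off until a user turns it on is an assumption not stated on the slide.

```python
from enum import Enum


class NodeState(Enum):
    USER_SESSION = "used by students/staff"
    DESKTOP_GRID = "running Desktop Grid work"
    SHUTDOWN = "powered off"


def next_state(state: NodeState, user_active: bool, work_available: bool) -> NodeState:
    """Return the next lifecycle state of a PC (hypothetical model of the slide)."""
    if state is NodeState.SHUTDOWN:
        # Assumption: a powered-off PC stays off until a user turns it on.
        return NodeState.USER_SESSION if user_active else NodeState.SHUTDOWN
    if user_active:
        # Students and staff always take priority over grid work.
        return NodeState.USER_SESSION
    if work_available:
        # Unused PC: switch to (or stay in) Desktop Grid mode.
        return NodeState.DESKTOP_GRID
    # Unused PC and no more work from the DG server: shut down to save energy.
    return NodeState.SHUTDOWN
```

For example, `next_state(NodeState.DESKTOP_GRID, user_active=False, work_available=False)` yields `NodeState.SHUTDOWN`, which corresponds to the "green solution" step.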
EDGI main areas/topics
Middleware development: gLite, ARC and UNICORE bridge (to DG)
Infrastructure development: bridging services
Application porting activity: already more than 30 validated scientific applications in the application repository
Cloud development: DG → Cloud bridge, using clouds as resources
Improving DGs with QoS
Supporting data-intensive applications
Consolidation of EDGeS results: improving scalability, repository, etc.
Supporting user communities, developers, operators, etc.
European chapter of the International Desktop Grid Federation
Dissemination/support of EDGI results world-wide through EDGI and DEGISCO communication channels
The EDGI infrastructure
[Architecture diagram. Recoverable labels: UI (submit, upload, download, inspect); ARC grid and gLite grid (CREAM), each with MCE, attic, monitor and AR; 3GBridge with User IF and Bridge IF; Attic FS; DG Project with DG clients on Volunteer/Institutional Resources; cloud (Eucalyptus/Amazon); Monitor]
Developments were finalised by the end of March 2011
The production system will be operational from 1 May 2011
UNICORE extension from autumn 2011
A high-level scenario for grid users
Step 1: Select an application you want to execute on Grids
  Use any application from the EDGI repository
  Bring your own application
Step 2: The EDGI project performs the necessary preparations for you; as a result, the application appears in the EDGI application repository
  Application porting
  Registering the application in the AR and in several DGs
  Infrastructure setup for your access (by the operators of the connecting grids)
Step 3: Go to the EDGI AR and collect information for your submission
Step 4: Prepare your input files, then create and submit the JDL
Step 5: Query the status and download the outputs
Step 3: Collect information from the EDGI AR (authorized VOs, CEs)
Step 3 (cont.): Collect information from the EDGI AR (files)
Steps 4, 5: submit JDL and get results
Create your my.jdl (as usual):
  Executable = "dsp";
  Arguments = "-f 22 -i 22 -p 723 -n pools.txt";
  InputSandbox = { "gsiftp://dev17-portal.cpc.wmin.ac.uk:2811/srv/edgi/1001/1102/dsp", "pools.txt" };
  OutputSandbox = { "cost.txt" };
  SubmitTo = "cr1.edgi-grid.eu:8443/cream-pbs-edgidemo";
Submission, status query and output download can be performed with the well-known gLite commands: glite-wms-job-submit, glite-wms-job-status, glite-wms-job-logging-info, glite-wms-job-output, etc.
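When many similar jobs are submitted, the JDL can be generated rather than typed. The sketch below renders the attributes shown on this slide into JDL text; the helper names `jdl_value` and `render_jdl` are hypothetical, and it assumes attribute values contain no double quotes that would need escaping. Generating the file also ensures that every list is closed with its brace.

```python
def jdl_value(value):
    """Format a Python value as a JDL literal: a string or a list of strings."""
    if isinstance(value, (list, tuple)):
        return "{ " + ", ".join(f'"{item}"' for item in value) + " }"
    return f'"{value}"'


def render_jdl(attributes):
    """Render a dict of attributes as 'Name = value;' lines."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


# Attribute values copied from the slide.
job = {
    "Executable": "dsp",
    "Arguments": "-f 22 -i 22 -p 723 -n pools.txt",
    "InputSandbox": [
        "gsiftp://dev17-portal.cpc.wmin.ac.uk:2811/srv/edgi/1001/1102/dsp",
        "pools.txt",
    ],
    "OutputSandbox": ["cost.txt"],
    "SubmitTo": "cr1.edgi-grid.eu:8443/cream-pbs-edgidemo",
}

jdl_text = render_jdl(job)
print(jdl_text)
```

Writing `jdl_text` to `my.jdl` gives a file that can be passed to glite-wms-job-submit in the usual way.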
Latest development in EDGI to improve scalability: the MetaJob concept
[Architecture diagram with the same components as the EDGI infrastructure (UI, ARC grid, gLite grid with CREAM, MCE, attic, AR, 3GBridge, DG Project, DG clients, cloud, Monitor). Added annotations: Single job; MetaJob as a single job; Unfolding; Huge number of jobs]
Demonstrated at the EGI User Forum (UF), 12 April 2011, at booth #10 with 10,000 jobs through gLite
Step 4b: create and submit JDL
Create your JDL:
  Executable = "dsp";
  Arguments = "-f 22 -i 22 -p 723 -n pools.txt";
  InputSandbox = { "gsiftp://dev17-portal.cpc.wmin.ac.uk:2811/srv/edgi/1001/1102/dsp", "pools.txt", "_3gb-metajob-dsp-10000" };
  OutputSandbox = { "cost.txt" };
  SubmitTo = "cr1.edgi-grid.eu:8443/cream-pbs-edgidemo";
The MetaJob definition ("_3gb-metajob-dsp-10000") is passed as an extra input file.
Submit:
  glite-wms-job-submit -o id edgi-metajob-10000.jdl
MetaJob: input files and metajob definition
Upload your individual input files to a web server:
  http://somewhere.com/pools1.txt
  …
  http://somewhere.com/pools10000.txt
Create the description of your metajob:
  %Comment pools1.txt
  Arguments = "-i 0 -n pools.txt -f 22 -p 723"
  Input = pools.txt=http://somewhere.com/pools1.txt=7b7eb86bf50c58cbf92dc12ff5adf7f4=9652
  Queue
  %Comment pools10000.txt
  Input = pools.txt=http://somewhere.com/pools10000.txt=7b7eb86bf50c58cbf92dc12ff5adf7f4=9652
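A description with 10,000 entries is best generated by a script. The sketch below is one possible way to do it, under several assumptions: the two trailing fields of each Input line are taken to be the MD5 hash and the size in bytes of the file (the slide shows a 32-character hex string and a number but does not name them), a Queue line is emitted after every sub-job (the slide shows it only after the first entry), and the Arguments line is written once at the top. The function names are hypothetical.

```python
import hashlib


def input_line(logical_name, url, content):
    """Build one 'Input = name=url=md5=size' line from a file's bytes."""
    md5 = hashlib.md5(content).hexdigest()  # assumed meaning of the third field
    return f"Input = {logical_name}={url}={md5}={len(content)}"


def metajob_definition(arguments, base_url, files):
    """Build the metajob text; files is a list of (remote_name, content_bytes)."""
    lines = [f'Arguments = "{arguments}"']
    for remote_name, content in files:
        lines.append(f"%Comment {remote_name}")
        # Every sub-job sees its own input under the same local name, pools.txt.
        lines.append(input_line("pools.txt", f"{base_url}/{remote_name}", content))
        lines.append("Queue")  # assumption: one Queue line closes each sub-job
    return "\n".join(lines) + "\n"


# Two tiny stand-in input files; real pool files would be read from disk.
files = [("pools1.txt", b"pool data 1\n"), ("pools2.txt", b"pool data 2\n")]
definition = metajob_definition(
    "-i 0 -n pools.txt -f 22 -p 723", "http://somewhere.com", files
)
print(definition)
```

The resulting text would be saved under the name listed in the InputSandbox (for example "_3gb-metajob-dsp-10000") after the input files themselves have been uploaded to the web server.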
MetaJob: Query the status and logging
[Screenshot. Callout: location of the status description]
MetaJob: Query the detailed status
[Screenshot. Callouts: metajob handling rules; actual status of jobs; location of the mapping]
MetaJob: Download the results
Download the result as usual:
  glite-wms-job-output -i id --dir outputs
  ./outputs/cost.txt
Extract it:
  tar zxvf cost.txt
  ./outputs/<subjobid1>/cost.txt
  ./outputs/<subjobid2>/cost.txt
  …
  ./outputs/<subjobid10000>/cost.txt
Each directory name is one sub-job id. Use the mapping between your individual job definitions and the job ids to find which directory stores the output files of which job.
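With thousands of sub-job directories, collecting the outputs by hand is impractical. The sketch below unpacks the downloaded archive and builds a dictionary from sub-job id to output file. It assumes the archive is a gzip-compressed tar file, as the `tar zxvf` command suggests, that it comes from a trusted source, and that each sub-job directory contains a file named cost.txt; the function names are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_results(archive_path, target_dir):
    """Unpack the gzip-compressed tar archive downloaded as cost.txt."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Assumption: the archive is trusted, so no member filtering is done.
        archive.extractall(target_dir)


def collect_outputs(outputs_dir, file_name="cost.txt"):
    """Map each sub-job directory name (the sub-job id) to its output file."""
    results = {}
    for entry in sorted(Path(outputs_dir).iterdir()):
        candidate = entry / file_name
        # Skip plain files such as the downloaded archive itself.
        if entry.is_dir() and candidate.is_file():
            results[entry.name] = candidate
    return results
```

A typical use would be `extract_results("outputs/cost.txt", "outputs")` followed by `collect_outputs("outputs")`; the keys of the returned dictionary can then be matched against the mapping file to identify which input each result belongs to.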
SG/DG applications on the production infrastructure (ported by EDGeS, EDGI and DEGISCO)
Community                    Number of applications   Academic   Industry
Bioscience                   9                        8          1
Healthcare                   2
Physics
Audio and video processing   4                        3
Business
Applied mathematics
Engineering
Total                        31                       28
DesktopGrid applications available @ EGI AppDB
Key issue: SUSTAINABILITY
The International Desktop Grid Federation (IDGF) brings together:
  Desktop Grid developers
  Desktop Grid operators
  Application developers
  Everyone else interested in Desktop Grid computing
Open membership
http://desktopgridfederation.org
Members of
Thank you for your attention…
Contacts:
  Peter Kacsuk (kacsuk@sztaki.hu), coordinator of EDGI and EDGeS
  Jozsef Kovacs (jozsef.kovacs@sztaki.hu), deputy coordinator, technical leader
Websites:
  www.edgi-project.eu
  www.edges-grid.eu
Acknowledgements:
  EDGI EU FP7 project (RI-261556)
  EDGeS EU FP7 project (INFSO-RI-211727)