Introduction to Computing Element
HsiKai Wang, Academia Sinica Grid Computing Center, Taiwan


Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

Overview of gLite Middleware (diagram)
- Access: API, CLI
- Security Services: Authentication, Authorization, Auditing
- Information & Monitoring Services: Information & Monitoring, Network Monitoring, Service Discovery
- Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
- Job Management Services: Computing Element, Workload Management, Accounting, Job Provenance, Package Manager

How it works

Compute Element (diagram)
- User / Resource layers: User Interface, Workload Manager, Computing Element, Batch System
- Clients: gLite WMS (with ICE and Condor-G), CREAM or BES client, Globus client
- Computing Element components: CREAM, CEMon, LCG-CE (GT2/4 + add-ons), Condor-C, BLAH, GIP
- EGEE services: authZ, InfoSys, Accounting
- Legend: in production, existing prototype, gLite component, non-gLite component

Workload Management System (Ref: gLite-3.2 User Guide)
- The purpose of the Workload Management System (WMS) is:
  - to accept user jobs
  - to assign them to the most appropriate Computing Element
  - to record their status
  - to retrieve their output
- The WMS used to be called the Resource Broker (RB).
- The service is called gLite-WMS.

Job Workflow in gLite-WMS (diagram)
- Components: UI (User Interface), WMS (Workload Management System), IS (Information System), File catalog, CE (Computing Element), WN (Worker Node), SE (Storage Element)
- Data exchanged: JDL, Input Sandbox, Output Sandbox

Job submission (diagram; components: UI, WMProxy, Workload Manager, Job Controller - CondorG, RB storage, LFC, Information Service, Computing Element, Storage Element; the CE and SE publish their characteristics and status)
- Command: glite-wms-job-submit myjob.jdl
- Job status: submitted
- The Input Sandbox files are sent along with the job.
- WMProxy is responsible for accepting incoming requests.

Workload Manager (same diagram)
- Job status: submitted, waiting
- WM (Workload Manager): responsible for taking the appropriate actions to satisfy the request.

Matchmaker/Broker (same diagram)
- Job status: submitted, waiting
- Question to answer: where must this job be executed?
- Matchmaker: responsible for finding the "best" CE to which the job should be submitted.

Information Supermarket (same diagram)
- Job status: submitted, waiting
- Information Supermarket: holds the resource information that is available to the Matchmaker/Broker.

CE choice (same diagram)
- Job status: submitted, waiting
- Using the Information Supermarket, the Matchmaker/Broker makes the CE choice.
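
The matchmaking step described on the last few slides can be sketched as a filter followed by a ranking. This is only an illustration: the `ComputingElement` fields, the `example.org` CE names and the ranking formula are hypothetical, whereas the real Matchmaker evaluates the job's JDL against the resource information held in the Information Supermarket.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    ce_id: str         # hypothetical CE identifier
    free_cpus: int     # hypothetical snapshot of published CE status
    waiting_jobs: int


def match(ces, requirement, rank):
    """Keep the CEs that satisfy `requirement`, ordered best rank first."""
    candidates = [ce for ce in ces if requirement(ce)]
    return sorted(candidates, key=rank, reverse=True)


ces = [
    ComputingElement("ce-a.example.org:8443/cream-pbs-demo", free_cpus=4, waiting_jobs=10),
    ComputingElement("ce-b.example.org:8443/cream-pbs-demo", free_cpus=16, waiting_jobs=0),
    ComputingElement("ce-c.example.org:8443/cream-pbs-demo", free_cpus=0, waiting_jobs=3),
]

ranked = match(
    ces,
    requirement=lambda ce: ce.free_cpus >= 1,          # assumed requirement
    rank=lambda ce: ce.free_cpus - ce.waiting_jobs,    # assumed ranking formula
)
best = ranked[0]
print(best.ce_id)
```

Separating the requirement (a yes/no filter) from the rank (an ordering) mirrors the idea of first finding compatible resources and then choosing the "best" one among them.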

Job Controller (same diagram, now with the Task Queue)
- Job status: submitted, waiting, ready
- JC (Job Controller): responsible for the actual job management operations (done via CondorG).

Job scheduled (same diagram)
- Job status: submitted, waiting, ready, scheduled
- The job and its Input Sandbox files are passed on to the Computing Element.

Job running (same diagram)
- Job status: submitted, waiting, ready, scheduled, running
- The running job performs "Grid enabled" data transfers and accesses.

Job done (same diagram)
- Job status: submitted, waiting, ready, scheduled, running, done
- The Output Sandbox files are transferred back to the WMS (RB storage).

Output retrieval (same diagram)
- Command: glite-wms-job-output
- Job status: submitted, waiting, ready, scheduled, running, done, cleared
- The Output Sandbox files are retrieved to the UI.

Logging & Bookkeeping (diagram; components: UI, WMProxy, Workload Manager, Job Controller - CondorG, Computing Element, LB proxy, Logging & Bookkeeping)
- LB: receives and stores job events; processes the corresponding job status.
- Job status: glite-wms-job-status
- Log of job events: glite-wms-job-logging-info

Job state machine (diagram)

gLite-WMS Job States (Ref: gLite-3.2 User Guide)

Status      Description
SUBMITTED   submission logged in the LB
WAITING     job matchmaking for resources
READY       job being sent to the executing CE
SCHEDULED   job scheduled in the CE queue manager
RUNNING     job executing on a WN of the selected CE queue
DONE        job terminated without grid errors
CLEARED     job output retrieved
ABORTED     job aborted by the middleware; check the reason
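
The states in this table can be read as a simple state machine. The sketch below is a simplification under stated assumptions: it uses the standard spellings of the state names, it models only the straight path from submission to output retrieval, and it assumes that a job may be aborted from any state that is not already final. Resubmission and cancellation paths of the real service are not modelled.

```python
# Assumed forward path through the states listed in the table.
HAPPY_PATH = ["SUBMITTED", "WAITING", "READY", "SCHEDULED", "RUNNING", "DONE", "CLEARED"]

# States after which this sketch treats the job as finished (assumption).
FINAL_STATES = {"CLEARED", "ABORTED"}


def next_state(state):
    """Return the following state on the forward path, or None if there is none."""
    if state not in HAPPY_PATH:
        return None
    i = HAPPY_PATH.index(state)
    return HAPPY_PATH[i + 1] if i + 1 < len(HAPPY_PATH) else None


def can_transition(src, dst):
    """Allow one step forward, or an abort from any non-final state."""
    if src in FINAL_STATES:
        return False
    return dst == "ABORTED" or next_state(src) == dst


print(can_transition("READY", "SCHEDULED"))   # one step forward
print(can_transition("READY", "RUNNING"))     # skips SCHEDULED
```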

Computing Resource Execution And Management (CREAM)
- CREAM: Web Service Computing Element
  - The CREAM WSDL allows custom user interfaces to be defined
  - A C++ CLI allows direct submission
- Lightweight
- Fast notification of job status changes
  - via CEMon
- Improved security
  - no "fork-scheduler"
- Will support bulk jobs on the CE
  - optimization of the staging of input sandboxes for jobs with shared files
- ICE: Interface to CREAM Environment
  - being integrated in the WMS for submissions to CREAM

Job State Machine (diagram; Ref: gLite-3.2 User Guide)

CREAM Job States

Status          Description
REGISTERED      the job has been registered but it has not been started yet.
PENDING         the job has been started, but it has still to be submitted to the LRMS abstraction layer module (i.e. BLAH).
IDLE            the job is idle in the Local Resource Management System (LRMS).
RUNNING         the job wrapper, which "encompasses" the user job, is running in the LRMS.
REALLY-RUNNING  the actual user job (the one specified as Executable in the job JDL) is running in the LRMS.
HELD            the job is held (suspended) in the LRMS.
CANCELLED       the job has been cancelled.
DONE-OK         the job has successfully been executed.
DONE-FAILED     the job has been executed, but some errors occurred.
ABORTED         errors occurred during the "management" of the job, e.g. the submission to the LRMS abstraction layer software (BLAH) failed.
UNKNOWN         the job is in an unknown status.
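
A client that waits for a CREAM job typically polls until a state is reached from which the job will not move on. The following sketch assumes, based on the descriptions in the table, that CANCELLED, DONE-OK, DONE-FAILED and ABORTED are such terminal states. The `get_status` callable is a hypothetical stand-in for whatever actually queries the CE; here it is fed from a canned list so the example runs on its own.

```python
import time

# Terminal states as inferred from the table descriptions (assumption).
TERMINAL = {"CANCELLED", "DONE-OK", "DONE-FAILED", "ABORTED"}


def wait_for_job(get_status, interval=0.0, max_polls=100):
    """Poll get_status() until a terminal state appears; return all states seen."""
    seen = []
    for _ in range(max_polls):
        status = get_status()
        seen.append(status)
        if status in TERMINAL:
            return seen
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal state")


# Canned status sequence standing in for a real status query.
fake = iter(["REGISTERED", "PENDING", "IDLE", "RUNNING", "REALLY-RUNNING", "DONE-OK"])
history = wait_for_job(lambda: next(fake))
print(history[-1])
```

Note that RUNNING only means the job wrapper has started; a client interested in the user payload would look for REALLY-RUNNING instead.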

Job Control Commands (Ref: gLite-3.2 User Guide)

Delegate proxy
  gLite WMS:   glite-wms-job-delegate-proxy -d delegID
  gLite CREAM: glite-ce-job-delegate-proxy -e endpoint -d delegID
Submit
  gLite WMS:   glite-wms-job-submit [-d delegID] [-a] [-o joblist] jdlfile
  gLite CREAM: glite-ce-job-submit [-d delegID] [-a] [-o joblist] -r ceIDs jdlfile
Status
  gLite WMS:   glite-wms-job-status -i joblist | jobIDs
  gLite CREAM: glite-ce-job-status -i joblist | jobIDs
Logging
  gLite WMS:   glite-wms-job-logging-info -i joblist | jobIDs
Output
  gLite WMS:   glite-wms-job-output [--dir outdir] -i joblist | jobIDs
Cancel
  gLite WMS:   glite-wms-job-cancel -i joblist | jobID
  gLite CREAM: glite-ce-job-cancel -i joblist | jobID
Compatible resources
  gLite WMS:   glite-wms-job-list-match [-d delegID] [-a] jdlfile
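
The submit rows of the table differ in one respect: a direct CREAM submission names the target CE with `-r`, while a WMS submission leaves the choice to the matchmaker. A small helper can make that explicit. This is a sketch that only assembles the argument list from the options shown in the table; `submit_command` is a hypothetical name and nothing is executed.

```python
import shlex


def submit_command(jdl_file, ce_id=None, deleg_id=None):
    """Build the argv for a submission: CREAM if ce_id is given, WMS otherwise."""
    argv = ["glite-wms-job-submit" if ce_id is None else "glite-ce-job-submit"]
    # Either reuse a delegated proxy (-d delegID) or pass -a as in the examples.
    argv += ["-d", deleg_id] if deleg_id else ["-a"]
    if ce_id is not None:
        argv += ["-r", ce_id]
    argv.append(jdl_file)
    return argv


wms_cmd = submit_command("Host_wms.jdl")
cream_cmd = submit_command(
    "Host_cream.jdl", ce_id="as-ce01.euasiagrid.org:8443/cream-pbs-euasia"
)
print(shlex.join(wms_cmd))
print(shlex.join(cream_cmd))
```

Returning a list rather than a string keeps the arguments unambiguous if the command is later handed to `subprocess.run`.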

Job Description Language for WMS (Ref: gLite-3.2 User Guide)

    wms]$ ls
    checkHost.sh  Host_wms.jdl
    wms]$ cat Host_wms.jdl
    JobType = "Normal";
    CPUNumber = 1;
    Executable = "checkHost.sh";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"checkHost.sh"};
    OutputSandbox = {"std.out", "std.err", "Host.log"};
    RetryCount = 5;
    Requirements = other.GlueCEUniqueID == "as-ce01.euasiagrid.org:8443/cream-pbs-euasia";
    wms]$ cat checkHost.sh
    #!/bin/sh
    echo "HOST: `hostname`" >> Host.log
    printenv >> Host.log
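
A JDL file like the one above is a flat list of `Name = value;` statements, so it is easy to generate from a script. The sketch below is a minimal writer under stated assumptions: `Expr`, `jdl_value` and `to_jdl` are hypothetical helper names, only strings, numbers, lists and raw expressions are handled, and quotes inside strings are not escaped.

```python
class Expr(str):
    """Marks a value to be written verbatim (an expression, not a quoted string)."""


def jdl_value(value):
    """Render one Python value in JDL syntax."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, str):
        return f'"{value}"'
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    return str(value)


def to_jdl(attrs):
    """Render a dict of attributes as JDL statements, one per line."""
    return "\n".join(f"{name} = {jdl_value(value)};" for name, value in attrs.items())


job = {
    "JobType": "Normal",
    "CPUNumber": 1,
    "Executable": "checkHost.sh",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "InputSandbox": ["checkHost.sh"],
    "OutputSandbox": ["std.out", "std.err", "Host.log"],
    "RetryCount": 5,
    "Requirements": Expr(
        'other.GlueCEUniqueID == "as-ce01.euasiagrid.org:8443/cream-pbs-euasia"'
    ),
}
text = to_jdl(job)
print(text)
```

Generating the file this way avoids the typographic quotes that tend to creep in when JDL is copied from slides, which a JDL parser would reject.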

Example for WMS (Ref: gLite-3.2 User Guide)

    wms]$ glite-wms-job-submit -a Host_wms.jdl
    ====================== glite-wms-job-submit Success ======================
    The job has been successfully submitted to the WMProxy
    Your job identifier is:
    ==========================================================================
    wms]$ glite-wms-job-status
    ======================= glite-wms-job-status Success =====================
    BOOKKEEPING INFORMATION:
    Status info for the Job : 7qE54xk0Sw
    Current Status: Scheduled
    Status Reason: unavailable
    Destination: as-ce01.euasiagrid.org:8443/cream-pbs-euasia
    Submitted: Sat Feb 2 16:35: UTC
    ==========================================================================
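
When the status command is called from a script, the fields of interest have to be extracted from text like the listing above. The sketch below assumes one `Key: value` field per line, which is how the listing reads; the exact layout of the real output may differ, and `parse_status` is a hypothetical helper name.

```python
SAMPLE = """\
BOOKKEEPING INFORMATION:
Current Status: Scheduled
Status Reason: unavailable
Destination: as-ce01.euasiagrid.org:8443/cream-pbs-euasia
"""


def parse_status(text):
    """Collect 'Key: value' lines into a dict, splitting at the first colon only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


info = parse_status(SAMPLE)
print(info["Current Status"], "on", info["Destination"])
```

Splitting at the first colon only matters here because the destination itself contains a colon before the port number.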

Example for WMS (Ref: gLite-3.2 User Guide)

    wms]$ glite-wms-job-output --dir .
    ================================================================================
    JOB GET OUTPUT OUTCOME
    Output sandbox files for the job:
    have been successfully retrieved and stored in the directory:
    /home/hkw00/HAII/ce/wms/hkw00_FtH87_dKEfp-7qE54xk0Sw
    ========================================================
    wms]$ ls /home/hkw00/HAII/ce/wms/hkw00_FtH87_dKEfp-7qE54xk0Sw
    Host.log  std.err  std.out

Job Description Language for CREAM (Ref: gLite-3.2 User Guide)

    cream]$ ls
    checkHost.sh  Host_cream.jdl
    cream]$ cat Host_cream.jdl
    JobType = "Normal";
    CPUNumber = 1;
    Executable = "checkHost.sh";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandBox = {"/home/hkw00/HAII/ce/cream/checkHost.sh"};
    OutputSandBox = {"Host.log"};
    OutputSandboxDestURI = {"gsiftp://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/"};
    RetryCount = 5;
    cream]$ cat checkHost.sh
    #!/bin/sh
    echo "HOST: `hostname`" >> Host.log
    printenv >> Host.log
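
Comparing this file with the WMS one shows what changes for a direct CREAM submission: the CE is no longer selected through a Requirements expression, and the output is sent to a storage URI. The sketch below extracts the attribute names from abbreviated copies of both files and reports the difference. It assumes attribute names can be compared case-insensitively (the slides themselves mix InputSandbox and InputSandBox), and the regular expression only copes with simple statements that contain no semicolon inside a value.

```python
import re


def jdl_attributes(text):
    """Map lower-cased attribute names to their raw value strings."""
    pairs = re.findall(r"(\w+)\s*=\s*(.*?);", text, flags=re.DOTALL)
    return {name.lower(): value.strip() for name, value in pairs}


WMS_JDL = """
Executable = "checkHost.sh";
InputSandbox = {"checkHost.sh"};
OutputSandbox = {"std.out", "std.err", "Host.log"};
Requirements = other.GlueCEUniqueID == "as-ce01.euasiagrid.org:8443/cream-pbs-euasia";
"""

CREAM_JDL = """
Executable = "checkHost.sh";
InputSandBox = {"/home/hkw00/HAII/ce/cream/checkHost.sh"};
OutputSandBox = {"Host.log"};
OutputSandboxDestURI = {"gsiftp://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/"};
"""

wms = jdl_attributes(WMS_JDL)
cream = jdl_attributes(CREAM_JDL)
only_wms = sorted(set(wms) - set(cream))
only_cream = sorted(set(cream) - set(wms))
print("only in WMS JDL:  ", only_wms)
print("only in CREAM JDL:", only_cream)
```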

Example for CREAM (Ref: gLite-3.2 User Guide)

    cream]$ lcg-infosites --vo euasia ce
    # CPU Free Total Jobs Running Waiting ComputingElement
    as-ce01.euasiagrid.org:8443/cream-pbs-euasia
    ce-qamar.utmgrid.utm.my:8443/cream-pbs-euasia
    ce.utmgrid.utm.my:8443/cream-pbs-euasia
    (...)
    cream]$ glite-ce-job-submit -r as-ce01.euasiagrid.org:8443/cream-pbs-euasia -a Host_cream.jdl
    cream]$ glite-ce-job-status ce01.euasiagrid.org:8443/CREAM
    ****** JobID=[
    Status = [DONE-OK]
    ExitCode = [0]
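
The status listing above reports its fields as `Name = [value]` pairs, so a script can decide whether the job succeeded by reading Status and ExitCode. The sketch below assumes exactly that bracketed layout; `parse_fields` and `job_succeeded` are hypothetical helper names, and the sample text is copied from the listing.

```python
import re

SAMPLE = "Status = [DONE-OK] ExitCode = [0]"


def parse_fields(text):
    """Collect 'Name = [value]' pairs into a dict."""
    return dict(re.findall(r"(\w+)\s*=\s*\[([^\]]*)\]", text))


def job_succeeded(text):
    """True only for DONE-OK with exit code 0."""
    fields = parse_fields(text)
    return fields.get("Status") == "DONE-OK" and fields.get("ExitCode") == "0"


print(parse_fields(SAMPLE))
print(job_succeeded(SAMPLE))
```

Checking the exit code as well as the status matters because DONE-OK describes the execution by the CE, while the exit code comes from the user's own executable.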

Example for CREAM (Ref: gLite-3.2 User Guide)

    cream]$ lcg-ls srm://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/
    /dpm/euasiagrid.org/home/euasia/hkw00//Host.log
    (...)
    cream]$ lcg-cp srm://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/Host.log file:`pwd`/Host.log
    cream]$ ls
    checkHost.sh  Host_cream.jdl  Host.log