Geant4 GRID production Sangwan Kim, Vu Trong Hieu, AD At KISTI.


1 Geant4 GRID production Sangwan Kim, Vu Trong Hieu, AD At KISTI

2 Outline
- Status of KISTI integration with Geant4 resources
- The production system
- Some more details on DIANE
For more information see: Andrea Dotti 2012 J. Phys.: Conf. Ser. 396 032033

3 Status
As of February 18, 2013, jobs are running at KISTI via darthvader.
- All nodes occupied at 100%
- Full production performed in about 48 hrs
What has been done:
- Installed missing libraries
- Performed simple testing (starting the application locally)
- Performed remote testing (small scale): start one job at a time remotely from CERN (using the full infrastructure)
- Performed full production test (queue of 2200 jobs): submit the maximum number of jobs, monitor that the cluster stays 100% occupied for several hours, and check the output

4 Results: from production monitoring
[Monitoring screenshots: jobs configurations; jobs queue (total 2.4M events); output at CERN repository]
- Failures due to misconfiguration (my fault)
- Stable production mode: no problems observed over several hours
- Rate of produced events strongly depends on configuration; expect to simulate all events in 48 hrs
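The figures quoted on the Status and Results slides allow a rough throughput estimate. The sketch below assumes that the 2.4M events, the 2200-job queue and the roughly 48-hour duration all describe the same production run, which the slides suggest but do not state outright. The results are averages only, since the slide notes that the event rate depends strongly on the job configuration.

```python
# Back-of-envelope throughput estimate.
# Assumption: the three numbers below all refer to the same production run.
TOTAL_EVENTS = 2_400_000   # "Total 2.4M events" (Results slide)
N_JOBS = 2200              # "queue of 2200 jobs" (Status slide)
WALL_HOURS = 48            # "about 48hrs" (Status slide)

events_per_job = TOTAL_EVENTS / N_JOBS
events_per_hour = TOTAL_EVENTS / WALL_HOURS
events_per_second = events_per_hour / 3600

print(f"average events per job: {events_per_job:.0f}")
print(f"events per hour:        {events_per_hour:.0f}")
print(f"events per second:      {events_per_second:.1f}")
```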

5 Production system
System based on four components:
1. CernVM-FS: to distribute (read-only) the Geant4 software
2. DIANE/GANGA: to submit jobs to the grid and retrieve the output
3. SimplifiedCalorimeter: Geant4 application (LHC calorimeters) to extensively test all aspects of physics simulations
4. Results DataBase: to store summaries from 3. and logging information on job status; includes a web application to produce plots

6 Architecture
[Diagram of stacked layers: Python wrapper; Application; DIANE and GANGA; OS / GRID middleware; CernVM-FS]
Diagram annotations:
- Recognized as the most critical (DIANE is no longer supported)
- Includes interaction with the DB and analysis of results (not discussed here)

7 Deployment
[Deployment diagram. Labels: GANGA session, DIANE, CERN Repo, Node, Squid HTTP proxy (two instances), Failover, KISTI]
- Job: "connect to DIANE server and get work"
- Download: work config
- Upload: results
- Communication: CORBA

8 DIANE master
- Python application
- It defines a queue of tasks
- A task is defined by:
  - Command line to execute
  - Command line arguments
  - Input and output files (if any)
- User provides a "run" function that defines tasks:

    # tell DIANE that we are just running executables
    # the ExecutableApplication module is a standard DIANE test application
    from diane_test_applications import ExecutableApplication as application

    # the run function is called when the master is started
    # input.data stands for run parameters
    def run(input, config):
        d = input.data.task_defaults  # this is just a convenience shortcut

        # all tasks will share the default parameters (unless set otherwise in individual task)
        d.input_files = ['hello.sh']
        d.output_files = ['message.out']
        d.executable = 'hello'

        # here are tasks differing by arguments to the executable
        for i in range(20):
            t = input.data.newTask()
            t.args = [str(i)]

hello.sh:

    #!/bin/bash
    echo $1 > message.out
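To see what queue the "run" function produces without a DIANE installation, it can be called against stand-in objects. In the sketch below, `FakeTask` and `FakeRunData` are hypothetical classes written only for illustration; they are not part of DIANE and mimic just the attributes the slide uses (`task_defaults`, `newTask`, `args`). The body of `run` is copied from the slide.

```python
from types import SimpleNamespace


class FakeTask:
    """Hypothetical task record (not a DIANE class): shared defaults plus own args."""

    def __init__(self, defaults):
        self.defaults = defaults
        self.args = []

    def command_line(self):
        return [self.defaults.executable] + self.args


class FakeRunData:
    """Hypothetical replacement for input.data (not DIANE's real class)."""

    def __init__(self):
        self.task_defaults = SimpleNamespace(
            input_files=[], output_files=[], executable=None
        )
        self.tasks = []

    def newTask(self):
        task = FakeTask(self.task_defaults)
        self.tasks.append(task)
        return task


def run(input, config):
    # Same body as the run function on the slide.
    d = input.data.task_defaults
    d.input_files = ['hello.sh']
    d.output_files = ['message.out']
    d.executable = 'hello'
    for i in range(20):
        t = input.data.newTask()
        t.args = [str(i)]


fake_input = SimpleNamespace(data=FakeRunData())
run(fake_input, config=None)

task_queue = fake_input.data.tasks
print(len(task_queue))                # number of tasks defined
print(task_queue[0].command_line())   # first task
print(task_queue[-1].command_line())  # last task
```

Each of the 20 tasks shares the default executable and file lists and differs only in its single argument, which is the pattern the slide describes.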

9 DIANE master and workers
[Diagram: diane-master holding tasks T1 to T4, connected via CORBA to diane-workers]
Diane-worker, a second small Python application:
- Needs the CORBA IOR address of the master
1. Get a task (i.e. command line and parameters to execute)
2. Get input files (G4 macro file, analysis support files, execution script)
3. Execute task
4. Send results (ROOT files, log files)
5. Repeat if more tasks exist
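The five worker steps can be sketched as a plain Python loop. This is an illustration only and not DIANE code: the real system talks to the master over CORBA, whereas the hypothetical `InProcessMaster` below lives in the same process and keeps results in a dictionary. The task layout (a dict with `id`, `command`, `input_files`, `output_files`) and the small `hello.py` script, used in place of `hello.sh` so the example does not depend on bash, are likewise assumptions.

```python
import queue
import subprocess
import sys
import tempfile
from pathlib import Path


class InProcessMaster:
    """Hypothetical master: holds a task queue and collects results in memory."""

    def __init__(self, tasks):
        self._tasks = queue.Queue()
        for task in tasks:
            self._tasks.put(task)
        self.results = {}

    def get_task(self):
        try:
            return self._tasks.get_nowait()
        except queue.Empty:
            return None

    def send_results(self, task_id, files):
        self.results[task_id] = files


def worker_loop(master):
    while True:
        task = master.get_task()                      # 1. get a task
        if task is None:
            break                                     # 5. stop when no tasks remain
        with tempfile.TemporaryDirectory() as workdir:
            for name, content in task["input_files"].items():
                Path(workdir, name).write_text(content)            # 2. get input files
            subprocess.run(task["command"], cwd=workdir, check=True)  # 3. execute task
            outputs = {name: Path(workdir, name).read_text()
                       for name in task["output_files"]}
        master.send_results(task["id"], outputs)      # 4. send results


# Python equivalent of hello.sh: write the first argument to message.out.
HELLO_PY = """\
import sys
with open("message.out", "w") as out:
    print(sys.argv[1], file=out)
"""

tasks = [
    {
        "id": i,
        "command": [sys.executable, "hello.py", str(i)],
        "input_files": {"hello.py": HELLO_PY},
        "output_files": ["message.out"],
    }
    for i in range(4)
]

master = InProcessMaster(tasks)
worker_loop(master)
print(sorted(master.results))
print(master.results[3]["message.out"].strip())
```

Because the worker pulls tasks rather than having them pushed, several workers could drain the same queue independently, which is the property that lets DIANE workers be started by any mechanism that can reach the master.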

10 Some notes
- Diane-worker is not a GRID job
- We use GANGA to start the diane-workers on remote sites
- But we can use SSH / QSUB / whatever
- To start a worker, the only information needed is the CORBA address of the master
- CORBA (omniORB) is used to create a point-to-point communication channel between master and workers
- The machine where the master runs needs some ports open
- Multiple diane-masters are allowed as long as each one listens on its own port

11 Possible work-plan
A possible work activity: develop an alternative solution to DIANE.
Requirements:
- Should retrieve output and store results in a central repository. Output size: 10-100 GB / month
- Should allow several users to use the system at the same time
- Should be possible to use a GRID submission system (e.g. GANGA) to submit jobs
- Should integrate with LCG resources and OSG
- Support for batch submission and direct SSH
- What about clouds?

