Geant4 GRID production
Sangwan Kim, Vu Trong Hieu, AD at KISTI


2 Outline
- Status of the KISTI integration with Geant4 resources
- The production system
- Some more details on DIANE
For more information see: Andrea Dotti 2012 J. Phys.: Conf. Ser.

3 Status
As of February 18, 2013 jobs are running at KISTI via darthvader
- All nodes occupied at 100%
- Full production performed in about 48 hrs
What has been done:
- Installed missing libraries
- Performed simple testing (starting the application locally)
- Performed remote testing (small scale): start one job at a time remotely from CERN (using the full infrastructure)
- Performed a full production test (queue of 2200 jobs): submit the maximum number of jobs, monitor that the cluster stays 100% occupied over several hours, check the output

4 Results: from production monitoring
[Monitoring screenshots: job configurations and job queue, 2.4M events in total; output at the CERN repository]
- Failures due to misconfiguration (my fault)
- Stable production mode: no problems observed over several hours
- The rate of produced events strongly depends on the configuration; all events are expected to be simulated in 48 hrs

5 Production system
The system is based on four components:
1. CernVM-FS: distributes the (read-only) Geant4 software
2. DIANE/GANGA: submits jobs to the grid and retrieves the output
3. SimplifiedCalorimeter: a Geant4 application (LHC calorimeters) used to extensively test all aspects of the physics simulation
4. Results database: stores the summaries from 3. and job-status logging information; includes a web application to produce plots
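The only thing a worker node needs for component 1 is a working CernVM-FS client; as a minimal sketch, a job could first run a sanity check like the one below, assuming the standard /cvmfs/geant4.cern.ch mount point (the actual repository layout used in this production is not shown in the slides):

import os

# Assumed CernVM-FS mount point for the Geant4 software repository.
G4_CVMFS_ROOT = "/cvmfs/geant4.cern.ch"

def check_geant4_repository():
    # Listing the directory triggers the autofs mount on a CernVM-FS client.
    if not os.path.isdir(G4_CVMFS_ROOT):
        raise RuntimeError("CernVM-FS repository not available: " + G4_CVMFS_ROOT)
    return sorted(os.listdir(G4_CVMFS_ROOT))

if __name__ == "__main__":
    print(check_geant4_repository())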

6 Architecture
Layers: Python wrapper, Application, DIANE and GANGA, OS / GRID middleware, CernVM-FS
- The DIANE/GANGA layer is recognized as the most critical one (DIANE is no longer supported)
- The Python wrapper includes the interaction with the DB and the analysis of results (not discussed here)
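The wrapper itself is not shown in the slides; the following is a hedged sketch, assuming its role is simply to run the Geant4 application on one macro and collect a small summary that would later be pushed to the results database (all names are illustrative):

import json
import subprocess
import time

def run_and_summarize(executable, macro, summary_path="summary.json"):
    # Run the Geant4 application (e.g. SimplifiedCalorimeter) on one macro file.
    start = time.time()
    proc = subprocess.run([executable, macro], capture_output=True, text=True)
    # Collect a minimal job summary; the real wrapper would also extract
    # physics summaries and upload them to the results database.
    summary = {
        "macro": macro,
        "return_code": proc.returncode,
        "wall_time_s": round(time.time() - start, 1),
    }
    with open(summary_path, "w") as f:
        json.dump(summary, f)
    return proc.returncode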

7 Deployment
[Diagram: a GANGA session and the DIANE master run at CERN, next to the CERN repository; worker nodes at KISTI reach CERN through a Squid HTTP proxy, with a second Squid proxy as failover]
- Job: "connect to the DIANE server and get work"
- Download: work configuration; Upload: results
- Communication: CORBA

8 DIANE master
A Python application that defines a queue of tasks. A task is defined by:
- the command line to execute
- the command line arguments
- input and output files (if any)
The user provides a "run" function that defines the tasks:

# tell DIANE that we are just running executables
# the ExecutableApplication module is a standard DIANE test application
from diane_test_applications import ExecutableApplication as application

# the run function is called when the master is started
# input.data stands for run parameters
def run(input, config):
    # this is just a convenience shortcut
    d = input.data.task_defaults
    # all tasks will share the default parameters (unless set otherwise in an individual task)
    d.input_files = ['hello.sh']
    d.output_files = ['message.out']
    d.executable = 'hello'
    # here are tasks differing by the arguments passed to the executable
    for i in range(20):
        t = input.data.newTask()
        t.args = [str(i)]

hello.sh:
#!/bin/bash
echo $1 > message.out
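For the actual Geant4 production the same pattern applies; the sketch below is a hypothetical run file in the style of the hello-world example above (the script, macro and output file names and the task count are illustrative, not taken from the slides):

from diane_test_applications import ExecutableApplication as application

# Hypothetical production run file: one task per seed, all names illustrative.
def run(input, config):
    d = input.data.task_defaults
    d.input_files = ['run_simplified_calo.sh', 'calorimeter.mac']
    d.output_files = ['results.root', 'job.log']
    d.executable = 'run_simplified_calo.sh'
    # One task per random-number seed; the real production queued 2200 jobs.
    for seed in range(2200):
        t = input.data.newTask()
        t.args = [str(seed)]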

9 DIANE master and workers
[Diagram: tasks T1-T4 queued in the DIANE master; DIANE workers talk to the master over CORBA]
The worker is a second small Python application; it only needs the CORBA IOR address of the master.
1. Get a task (i.e. the command line and parameters to execute)
2. Get the input files (G4 macro file, analysis support files, execution script)
3. Execute the task
4. Send the results (ROOT files, log files)
5. Repeat if more tasks exist
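DIANE ships the worker, so no worker code appears in the slides; purely to illustrate the pull model of steps 1-5, here is a minimal sketch in which "master" stands for a proxy obtained from the master's CORBA IOR (all method names on it are invented, not DIANE's API):

import subprocess

def execute(executable, args):
    # Run the task's command line locally and return the completed process.
    return subprocess.run([executable] + list(args), capture_output=True, text=True)

def worker_loop(master):
    # Illustrative pull loop; all method names on 'master' are invented.
    while True:
        task = master.get_task()                      # 1. get a task
        if task is None:                              # 5. stop when no more tasks exist
            break
        master.get_input_files(task)                  # 2. G4 macro, analysis files, execution script
        result = execute(task.executable, task.args)  # 3. execute the task
        master.send_results(task, result.stdout)      # 4. send ROOT files / log files back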

10 Some notes
- A diane-worker is not a GRID job: we use GANGA to start the diane-workers on remote sites, but we could also use SSH / qsub / anything else
- To start a worker, the only information needed is the CORBA address of the master
- CORBA (omniORB) is used to create a point-to-point communication channel between the master and the workers
- The machine where the master runs needs some ports open
- Multiple diane-masters are allowed as long as each one listens on its own port
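Since only the master's CORBA address is needed, starting workers by hand over SSH is straightforward in principle; the sketch below is an assumption-laden illustration (the host names, the "MasterOID" file name and the start_worker.sh script are all hypothetical):

import subprocess

# Hypothetical: the master has written its CORBA IOR to a file called 'MasterOID',
# and each remote node has a start_worker.sh script that launches a diane-worker with it.
with open("MasterOID") as f:
    master_ior = f.read().strip()

hosts = ["node01.example.org", "node02.example.org"]
for host in hosts:
    # One worker per host; the IOR is passed on the command line.
    subprocess.Popen(["ssh", host, "./start_worker.sh", master_ior])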

11 Possible work plan
A possible work activity: develop an alternative solution to DIANE.
Requirements:
- Should retrieve the output and store the results in a central repository; output size: GB / month
- Should allow several users to use the system at the same time
- Should be possible to use a GRID submission system (e.g. GANGA) to submit jobs
- Should integrate with LCG resources and OSG
- Support for batch submission and direct SSH
- What about clouds?
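For the submission-related requirements, the kind of interface meant here already exists in GANGA's job model; a hedged sketch of what a user session could look like (to be typed inside a GANGA session, where Job, Executable and the backend classes are predefined; exact backend names vary between GANGA versions, and the script name is illustrative):

# Sketch of a GANGA-style submission; names are indicative only.
j = Job()
j.application = Executable(exe='run_simplified_calo.sh', args=['0'])
j.backend = LCG()   # grid submission; Batch() or an SSH-type backend would cover the other requirements
j.submit()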