Simulation in a Distributed Computing Environment


Simulation in a Distributed Computing Environment
S. Guatelli(1), P. Mendez Lorenzo(2), J. Moscicki(2), M.G. Pia(1)
(1) INFN Genova, Italy; (2) CERN, Geneva, Switzerland
IEEE Nuclear Science Symposium, San Diego, 30 October – 4 November 2006

Speed of Monte Carlo simulation
Speed of execution is often a concern in Monte Carlo simulation: there is often a trade-off between the precision of the simulation and the speed of execution.
Typical use cases:
- Semi-interactive response: detector design optimisation, oncological radiotherapy
- Very long execution time: high statistics simulation, high precision simulation
Methods for faster simulation response:
- Fast simulation
- Variance reduction techniques (event biasing)
- Inverse Monte Carlo methods
- Parallelisation

Requirements
Architectural requirements:
- Transparent execution in sequential/parallel mode
- Transparent execution on a PC farm and on the Grid
Semi-interactive simulation, e.g. Geant4 brachytherapy:
- Execution time for 20 M events: ~5 hours
- Goal: execution time of a few minutes
High statistics simulation, e.g. Geant4 medical_linac:
- Execution time for 10⁹ events: ~10 days
- Goal: execution time of a few hours
Reference: sequential mode on a Pentium IV, 3 GHz
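A back-of-envelope check of what these goals imply, assuming ideal scaling and negligible overhead (this arithmetic is mine, not from the talk):

```python
import math

# Workers needed to reach a target turnaround time, assuming ideal scaling
# and negligible per-worker overhead (illustrative arithmetic only).
def workers_needed(sequential_s: float, target_s: float) -> int:
    return math.ceil(sequential_s / target_s)

# Brachytherapy: ~5 h sequential, goal ~5 min  -> 60 workers
print(workers_needed(5 * 3600, 5 * 60))
# medical_linac: ~10 days sequential, goal ~4 h -> 60 workers
print(workers_needed(10 * 24 * 3600, 4 * 3600))
```

Both targets point at the same order of magnitude, roughly 60 workers, which matches the farm and worker-pool sizes used in the tests below.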

Features of this study
Geant4 in a distributed computing environment:
- Architecture
- Implications on Geant4 simulation applications
Environments: PC farm, GRID
Real-life use case: the Geant4 brachytherapy Advanced Example
- Test on a single CPU
- Test on a dedicated farm (60 CPUs)
- Test on a farm shared with other users
- Test on the GRID (LCG)

Parallel simulation execution: local cluster / GRID
Both applications have the same computing model: a job consists of a number of independent tasks which may be executed in parallel, and the result of each task is a small data packet (a few kB) which is merged as the job runs.
In a local cluster:
- Computing resources are used for parallel execution
- Input data for the job must be available on site
- Typically there is a shared file system and a queuing system
- The network is fast
GRID computing (resources from multiple computing centres):
- Typically there is no shared file system
- (Parts of) the input data must be replicated at remote sites
- The network connection is slower than within a local cluster

Architectural pattern
Strategy to minimise the cost of migrating a Geant4 simulation to a distributed environment: the Master-Worker architectural pattern.
DIANE http://cern.ch/DIANE
DIANE is a layer which provides applications with an easy and convenient way of executing in a distributed cluster environment.
Design goals:
- Easy customisation and adaptability to different needs
- Hide the details of the underlying technology (allow for easy migration)
- Location independent: accessible from anywhere in the network
- Limited to the Master-Worker model, which covers most of the typical jobs and needs in HEP
The Geant4 application developer is shielded from the complexity of the underlying technology via DIANE.

Distributed simulation
An interface class binds together the Geant4 application and the Master-Worker framework (UML deployment diagram for Geant4 applications). The original Geant4 application source code is left unmodified. A G4Simulation class is responsible for managing the simulation: random number seeds, Geant4 initialisation and termination.
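The talk does not show the interface itself; the sketch below is a Python-flavoured illustration of the kind of contract such a class could expose (the real G4Simulation class belongs to the C++ application, and all method names here are invented):

```python
# Illustrative sketch only: the real G4Simulation class is C++ and its API
# is not shown in the talk. All names below are invented for the example.
class G4Simulation:
    """Binds a Geant4 application to the Master-Worker framework."""

    def __init__(self, seed: int):
        self.seed = seed  # each task runs with its own random number seed

    def initialise(self) -> None:
        # would seed the random engine and perform Geant4 initialisation
        print(f"Geant4 initialised with seed {self.seed}")

    def execute(self, macro: str, n_events: int) -> str:
        # would execute the macro (e.g. a beamOn for n_events) and
        # return the name of the small output packet to be merged
        print(f"running {macro} for {n_events} events")
        return f"histograms_{self.seed}.root"

    def terminate(self) -> None:
        # would perform a clean Geant4 termination on the worker
        print("Geant4 terminated")
```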

Practical example: Geant4 simulation with analysis
Each task produces a file with histograms; the job result is the sum of the histograms produced by the tasks.
Master-Worker model:
- The client starts a job
- The workers perform tasks and produce histograms
- The master integrates the results
Distributed processing for Geant4 applications (a sketch of this model follows below):
- task = N events; job = M tasks
- Tasks may be executed in parallel
- Tasks produce histograms/ntuples
- Task output is automatically combined (histograms are added, ntuples are appended)
Responsibilities:
- The master steers the execution of the job, splits the job and merges the results
- The worker initialises the Geant4 application and executes macros
- The client gets the results
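A minimal, self-contained sketch of this computing model (not DIANE's actual API: the event loop, histogram format and all function names below are invented for illustration), with task outputs merged as they arrive:

```python
# Master-worker sketch of the model above: task = N events, job = M tasks,
# small task outputs (histograms) merged as the job runs.
from multiprocessing import Pool
import random

N_BINS = 10

def run_task(args):
    """Worker side: simulate n_events and return a small histogram packet."""
    seed, n_events = args
    rng = random.Random(seed)                  # distinct seed per task
    histogram = [0] * N_BINS
    for _ in range(n_events):
        histogram[rng.randrange(N_BINS)] += 1  # stand-in for Geant4 physics
    return histogram

def run_job(n_tasks=8, events_per_task=100_000, n_workers=4):
    """Master side: split the job into tasks, merge results as they arrive."""
    total = [0] * N_BINS
    with Pool(n_workers) as pool:
        tasks = [(seed, events_per_task) for seed in range(n_tasks)]
        for hist in pool.imap_unordered(run_task, tasks):
            total = [a + b for a, b in zip(total, hist)]  # add histograms
    return total

if __name__ == "__main__":
    print(run_job())  # client side: collect the final merged result
```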

Overhead at initialisation/termination
Test on a single dedicated CPU (Intel® Pentium IV, 3.00 GHz): execution via DIANE compared with sequential execution, for a run of 1 event.
- Standalone application: 4.6 ± 0.2 s
- Application via DIANE, simulation only: 8.8 ± 0.8 s
- Application via DIANE, with analysis integration: 9.5 ± 0.5 s
Overhead: ~5 s, negligible in a high statistics job.

Farm: execution time and efficiency
Dedicated farm: 30 identical bi-processor machines (Pentium IV, 3 GHz).
Load balancing: optimisation of the number of tasks and workers.
Thanks to the Regional Operation Centre (ROC) Team, Taiwan, and to Hurng-Chun Lee (Academia Sinica Grid Computing Center, Taiwan).

Optimising the number of tasks
The job ends when all the tasks have been executed by the workers. If the job is split into a larger number of tasks, the chance that the workers finish their tasks at about the same time is higher. Note: the overall time of the job is determined by the last worker to finish the last task.
[Figures: worker number vs. time (seconds) — an example of a job whose balancing can be improved, and an example of good job balancing. A toy illustration of this effect follows below.]
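The effect can be reproduced with a toy scheduling model (mine, with hypothetical task durations; the slide's own measurements are not used here):

```python
# Toy model: the job finishes when the slowest worker finishes, so splitting
# the work into more, smaller tasks evens out the worker finish times.
import random

def makespan(n_tasks: int, n_workers: int, total_events: int = 20_000_000) -> float:
    events_per_task = total_events / n_tasks
    # hypothetical duration per task: proportional to events, with 20% jitter
    durations = [events_per_task * 1e-4 * random.uniform(0.8, 1.2)
                 for _ in range(n_tasks)]
    finish = [0.0] * n_workers
    for d in durations:
        # the next task goes to the worker that becomes free first
        finish[finish.index(min(finish))] += d
    return max(finish)  # the job ends when the last worker finishes

random.seed(1)
print(makespan(n_tasks=60, n_workers=60))   # one task per worker: poor balance
print(makespan(n_tasks=360, n_workers=60))  # more tasks: better load balance
```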

Farm shared with other users
Real-life case: a farm shared with other users. Execution in parallel mode on 5 workers of CERN LSF, with DIANE used as the intermediate layer.
Preliminary! The load of the cluster changes quickly in time, so the conditions of the test are not reproducible: highly variable performance.

Results in a real-life use case
Required production for brachytherapy: 20 M events.
20 M events in sequential mode: 16646 s (~4 h 38') on an Intel® Pentium IV, 3.00 GHz.
The same simulation runs in ~5' in parallel on 56 CPUs, which is appropriate for clinical usage.
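Taking the reported numbers at face value, the implied scaling is close to ideal (a quick consistency check of my own, using the slide's figures):

```python
# Quick consistency check of the reported result (numbers from the slide).
sequential_s = 16646      # 20 M events, sequential, Pentium IV 3.00 GHz
parallel_s = 5 * 60       # ~5 minutes in parallel
cpus = 56

speedup = sequential_s / parallel_s
efficiency = speedup / cpus
print(f"speedup ~ {speedup:.0f}x, parallel efficiency ~ {efficiency:.0%}")
# speedup ~ 55x, parallel efficiency ~ 99%
```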

How the GRID load changes
Execution time of the brachytherapy application in two different conditions of the GRID: 20 M events, 60 workers initialised, 360 tasks.
[Figures: worker number vs. time (seconds) — very different results in the two conditions.]
The load of the GRID changes quickly in time; the conditions of the test are not reproducible.

Test results (LCG) with/without DIANE
Execution on the GRID without DIANE, and execution on the GRID through DIANE: 20 M events, 180 tasks, 30 workers.
[Figure: worker number vs. time (seconds).]
- Without DIANE: 2 jobs were not successful, due to set-up problems of the workers.
- Through DIANE: all the tasks were executed successfully, on 22 workers.

Farm/GRID execution
Preliminary indication (the conditions are not reproducible). Brachytherapy application, 20 M events, 180 tasks:
- Taipei cluster: 29 machines, 734 s (~12 minutes)
- GRID: 27 machines, 1517 s (~25 minutes)

Lessons learned
DIANE as an intermediate layer:
- Load balancing
- Transparency
- Good separation of the subsystems
- Good management of CPU resources
- Negligible overhead
Load balancing:
- A relatively large number of tasks increases the efficiency of parallel execution
- There is a trade-off between the optimisation of task splitting and the overhead introduced
Farm:
- Controlled and real-life situations on a farm are quite different
- A dedicated farm is needed for critical usage (e.g. in a hospital)
Grid:
- Highly variable environment
- Not mature for critical usage yet
- Work in progress; details still to be understood quantitatively

Conclusions
General solution for Geant4 simulation in a distributed computing environment:
- Transparent sequential/parallel application
- Transparent execution on a local farm or on the Grid
- The user code is the same
Quantitative results: on-going work to understand the details.
Acknowledgments: M. Lamanna, L. Moneta, A. Pfeiffer (CERN); the LCG teams at CERN; Hurng-Chun Lee (ASGC, Taiwan); the Regional Operation Centre Team of Taiwan.