Monte Carlo simulation for radiotherapy in a distributed computing environment
S. Chauvie (2,3), S. Guatelli (2), A. Mantero (2), J. Moscicki (1), M.G. Pia (2)
(1) CERN, (2) INFN, (3) S. Croce e Carle Hospital, Cuneo
Monte Carlo, April 2005, Chattanooga, TN, USA
Monte Carlo methods in radiotherapy
Monte Carlo methods have been explored for years as a tool for precise dosimetry, as an alternative to analytical methods
De facto, Monte Carlo simulation is not used in clinical practice (only in side studies)
The major limiting factor is speed
The reality
Treatment planning is performed by means of commercial software
The software calculates the dose distribution delivered to the patient
Commercial systems are based on analytical methods
– Advantages: quick response
– Disadvantages: they fail to calculate the dose in heterogeneities and for small or complex fields
– Open issues: each treatment planning software is specific to one radiotherapy technique
Project
Develop a dosimetric system for radiotherapy treatments based on Monte Carlo methods
– Calculation precision: Geant4 as simulation toolkit
– Quick response: parallelisation, access to distributed computing resources
Pilot project: distributed simulation for brachytherapy
Explore Geant4-based Monte Carlo simulations in a distributed computing environment
– Parallel execution in a local PC farm
– Geographically distributed execution on the GRID
Pilot project based on an existing simulation for brachytherapy
Focus on architectural design
– Transparent execution on a single machine, in parallel on a local farm or on the GRID
Preliminary evaluation of performance
Application to other radiotherapy simulations currently in progress
Brachytherapy
Simulation of the energy deposited by a radioactive source in a phantom
Requirement from clinical practice: real-time response
[Figures: Bebig Isoseed I-125 source; dose distribution in the plane containing the radioactive source]
Talk: “A general purpose dosimetric system for brachytherapy”, 20th April, MC 2005, Room 5
Performance in sequential mode
– Endocavitary brachytherapy: 1M events, 61 minutes
– Interstitial brachytherapy: 1M events, 67 minutes
– Superficial brachytherapy: 1M events, 65 minutes
on an “average” PIII machine
Monte Carlo simulation is not practically conceivable for clinical application, even if more precise
Speed adequate for clinical use
– Parallelisation: transparent configuration in sequential or parallel mode
– Access to distributed computing resources: transparent access to the GRID through an intermediate software layer
Access to distributed computing
IMRT: Geant4 simulation and Anaphe analysis on a dedicated Beowulf cluster (S. Chauvie et al., IRCC Torino, Siena 2002)
[Diagram: hospital LAN, switch, Node01 to Node04]
Speed OK, but expensive hardware investment + maintenance
Access to distributed computing
Alternative strategy: DIANE
– Parallelisation
– Access to the GRID
Transparent access to a distributed computing environment
Active Workflow Framework for Parallel Jobs
Applications run inside an Active Workflow Framework
For applications:
– underlying environment is transparent
– code changes to use the framework are minimal
The Framework provides:
– Automatic communication and synchronization of tasks
– Error recovery
– Optimization
DIANE: DIstributed ANalysis Environment
Prototype for an intermediate layer between applications and the GRID
Parallel cluster processing
– make fine tuning and customisation easy
– transparently using GRID technology
– application independent
Hide complex details of underlying technology
Developed by J. Moscicki, CERN
DIANE architecture
Master-Worker model
– Parallel execution of independent tasks
– Very typical in many scientific applications
– Usually applied in local clusters
R&D in progress for Large Scale Master-Worker Computing
Master-Worker Computing
– Workers are started up and register to the Master
– Client connects to the Master and starts up the job
– Master controls the execution, dispatches tasks to Workers and combines the results
– Client receives notifications about the current status of the job and collects the final result
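The sequence above can be illustrated with a minimal sketch in plain Python. This is not DIANE code: `run_job`, `do_task`, `combine` and `notify` are hypothetical names chosen for illustration, threads stand in for remote workers, and error recovery is left out.

```python
import queue
import threading


def run_job(tasks, do_task, combine, initial, n_workers=2, notify=None):
    """Master: dispatch independent tasks to workers and combine their results."""
    tasks = list(tasks)
    todo = queue.Queue()   # tasks waiting to be dispatched
    done = queue.Queue()   # results sent back by the workers
    for task in tasks:
        todo.put(task)

    def worker():
        # A worker keeps pulling tasks until none are left.
        while True:
            try:
                task = todo.get_nowait()
            except queue.Empty:
                return
            done.put(do_task(task))

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()

    # The master merges results as they arrive and notifies the client.
    result = initial
    for finished in range(1, len(tasks) + 1):
        result = combine(result, done.get())
        if notify is not None:
            notify(finished, len(tasks))
    for w in workers:
        w.join()
    return result


# Client side: start a job, follow its progress, collect the final result.
progress = []
total = run_job(
    tasks=range(10),
    do_task=lambda t: t * t,
    combine=lambda a, b: a + b,
    initial=0,
    n_workers=3,
    notify=lambda finished, n: progress.append((finished, n)),
)
```

Because the tasks are independent and the merge is commutative, the result does not depend on which worker executes which task or in what order results arrive.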
Running in a distributed environment
Not affecting the original code of the application
– the standalone and the distributed case use the same code
Good separation of the subsystems
– the application does not need to know that it runs in a distributed environment
– the distributed framework (DIANE) does not need to care about what actions an application performs internally
The application developer is shielded from the complexity of the underlying technology via DIANE
Distributed environments
Different distributed environments: local computing farm, GRID
Parallel mode: local cluster / GRID
Both applications have the same computing model
– a job consists of a number of independent tasks which may be executed in parallel
– the result of each task is a small data packet (a few kilobytes), which is merged as the job runs
In a cluster:
– computing resources are used for parallel execution
– user connects to a possibly remote cluster
– input data for the job must be available on the site
– typically there is a shared file system and a queuing system
– network is fast
GRID computing uses resources from multiple computing centres
– typically there is no shared file system
– (parts of) input data must be replicated in remote sites
– network connection is slower than within a cluster
Development costs
Strategy to minimise the cost of migrating a Geant4 simulation to a distributed environment for users
DIANE Active Workflow framework
– provides automatic communication/synchronization mechanisms
– application is “glued” to the framework using a small Python module; in most cases no code changes to the original application are required
– load balancing and error recovery policies may be plugged in as simple Python functions
Transparent adaptation for clusters/GRIDs, shared/local file systems, shared/private queues
Cost in the runtime phase
– near zero (except for loading networking libraries for the first time)
Development/modification of application code
– original source code unmodified
– addition of an interface class which binds together application and M-W framework
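To give an idea of what "policies as simple Python functions" can look like, here is a small sketch under stated assumptions. None of these names come from DIANE: `retry_up_to`, `smallest_first` and `run_tasks` are hypothetical, and the scheduler loop is sequential to keep the example short.

```python
def retry_up_to(max_attempts):
    """Build an error recovery policy: retry a failed task a limited number of times."""
    def policy(task, attempts, error):
        return attempts < max_attempts
    return policy


def smallest_first(pending):
    """Load balancing policy: hand out the task with the fewest events next."""
    return min(pending, key=lambda task: task["events"])


def run_tasks(tasks, execute, pick_next, should_retry):
    """Hypothetical scheduler loop that consults the two plugged-in policies."""
    pending = list(tasks)
    attempts = {}
    results, failed = [], []
    while pending:
        task = pick_next(pending)
        pending.remove(task)
        attempts[task["id"]] = attempts.get(task["id"], 0) + 1
        try:
            results.append(execute(task))
        except Exception as error:
            if should_retry(task, attempts[task["id"]], error):
                pending.append(task)      # give the task another chance
            else:
                failed.append(task["id"])  # give up on this task
    return results, failed


# Usage: task 1 fails on its first attempt and succeeds on the retry.
calls = {}

def flaky_execute(task):
    calls[task["id"]] = calls.get(task["id"], 0) + 1
    if task["id"] == 1 and calls[1] == 1:
        raise RuntimeError("worker lost")
    return task["events"]

tasks = [
    {"id": 0, "events": 500},
    {"id": 1, "events": 200},
    {"id": 2, "events": 300},
]
results, failed = run_tasks(tasks, flaky_execute, smallest_first, retry_up_to(2))
```

Keeping the policies as plain functions means they can be swapped without touching either the scheduler loop or the application code.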
Interfacing a Geant4 simulation to DIANE
[Figure: UML deployment diagram for Geant4 applications]
Practical example: G4 simulation with analysis
Each task produces a file with histograms
The job result is the sum of histograms produced by tasks
Master-worker model
– client starts a job
– workers perform tasks and produce histograms
– master integrates the results
Distributed Processing for Geant4 Applications
– task = N events
– job = M tasks
– tasks may be executed in parallel
– tasks produce histograms/ntuples
– task output is automatically combined (add histograms, append ntuples)
Master-Worker Model
– Master steers the execution of the job, automatically splits the job and merges the results
– Worker initializes the Geant4 application and executes macros
– Client gets the results
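The "task = N events, job = M tasks, add histograms" scheme can be sketched as follows. This is a toy illustration, not Geant4 or DIANE code: a uniform random number stands in for the simulated quantity, histograms are plain lists of bin counts, and seeding each task with `base_seed + task_id` is an assumption made only to keep the tasks distinct and reproducible.

```python
import random

N_BINS = 10


def run_task(task_id, n_events, base_seed=12345):
    """Worker side: 'simulate' n_events and return a histogram of bin counts."""
    rng = random.Random(base_seed + task_id)  # one random stream per task (toy choice)
    histogram = [0] * N_BINS
    for _ in range(n_events):
        value = rng.random()  # placeholder for a simulated quantity in [0, 1)
        bin_index = min(int(value * N_BINS), N_BINS - 1)
        histogram[bin_index] += 1
    return histogram


def add_histograms(total, part):
    """Master side: merge one task's histogram into the job result, bin by bin."""
    return [a + b for a, b in zip(total, part)]


def run_job(n_tasks, events_per_task):
    """Job = M tasks of N events each; the job result is the sum of task histograms."""
    total = [0] * N_BINS
    for task_id in range(n_tasks):
        total = add_histograms(total, run_task(task_id, events_per_task))
    return total


# 10 tasks of 500 events each, i.e. a 5000-event job.
job_histogram = run_job(n_tasks=10, events_per_task=500)
```

Since adding histograms is associative and commutative, the merged result is the same whether the tasks run one after another, as here, or in parallel on different workers.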
DIANE Prototype and Testing
Scalability tests
– 70 worker nodes
– 140 million Geant4 events
Performance: parallel mode
– Endocavitary brachytherapy: 1M events, 4 minutes 34’’
– Interstitial brachytherapy: 5M events, 4 minutes 36’’
– Superficial brachytherapy: 1M events, 4 minutes 25’’
on up to 50 workers, LSF at CERN, PIII machine, MHz
preliminary: further optimisation in progress
Performance adequate for clinical application, but… it is not realistic to expect any hospital to own and maintain a PC farm
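For orientation, the gain over the sequential timings quoted earlier (61, 67 and 65 minutes for 1M events) can be worked out with a few lines of Python. The sketch assumes that the three parallel timings correspond to the three brachytherapy techniques in the order listed, and it compares event throughput rather than wall-clock time because the interstitial run uses 5M events. The two sets of measurements may not come from identical machines, so the ratios are indicative only.

```python
def minutes(m, s=0):
    """Convert minutes and seconds to decimal minutes."""
    return m + s / 60.0


# (number of events, elapsed minutes), as quoted on the slides.
sequential = {
    "endocavitary": (1_000_000, minutes(61)),
    "interstitial": (1_000_000, minutes(67)),
    "superficial": (1_000_000, minutes(65)),
}
parallel = {
    "endocavitary": (1_000_000, minutes(4, 34)),
    "interstitial": (5_000_000, minutes(4, 36)),
    "superficial": (1_000_000, minutes(4, 25)),
}


def throughput_gain(name):
    """Ratio of events per minute in parallel mode to events per minute in sequential mode."""
    seq_events, seq_minutes = sequential[name]
    par_events, par_minutes = parallel[name]
    return (par_events / par_minutes) / (seq_events / seq_minutes)


gains = {name: throughput_gain(name) for name in sequential}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}x")
```

With these inputs the gain is roughly 13x for the endocavitary case, 15x for the superficial case and 73x for the interstitial case. The last figure exceeds the stated maximum of 50 workers, so it should be read with caution.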
Parallel mode: distributed resources
Distributed Geant4 simulation: DIANE framework and generic GRID middleware
Grid
Wave of interest in grid technology as a basis for a “revolution” in e-Science and e-Commerce
An infrastructure and standard interfaces capable of providing transparent access to geographically distributed computing power and storage space in a uniform way
Ian Foster and Carl Kesselman's book: “A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities”
Many GRID R&D projects (US projects, European projects), many related to HEP
Large distributed computing resource
Running on the GRID
Via DIANE
Same application code as running on a sequential machine or on a dedicated cluster
– completely transparent to the user
A hospital is not required to own and maintain extensive computing resources to exploit the scientific advantages of Monte Carlo simulation for radiotherapy
Any hospital, even a small one or one in a less wealthy country that cannot afford expensive commercial software systems, may have access to advanced software technologies and tools for radiotherapy
Traceback from a run on CrossGrid testbed
5000 events, 2 workers, 10 tasks (500 events each)
Current #Grid setup (computing elements):
- aocegrid.uab.es:2119/jobmanager-pbs-workq
- bee001.ific.uv.es:2119/jobmanager-pbs-qgrid
- cgnode00.di.uoa.gr:2119/jobmanager-pbs-workq
- cms.fuw.edu.pl:2119/jobmanager-pbs-workq
- grid01.physics.auth.gr:2119/jobmanager-pbs-workq
- xg001.inp.demokritos.gr:2119/jobmanager-pbs-workq
- xgrid.icm.edu.pl:2119/jobmanager-pbs-workq
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-infinite
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-long
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-medium
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-short
- ce01.lip.pt:2119/jobmanager-pbs-qgrid
Countries: Spain, Poland, Greece, Portugal
Resource broker running in Portugal, matchmaking CrossGrid computing elements
Study in progress
Capability of transparent execution of the radiotherapy simulation on the GRID has been demonstrated
Quantitative evaluation of performance, speed and stability currently in progress
A comprehensive study will be submitted for publication in the coming weeks
Optimisation of load balancing, error handling and other issues concerning access to distributed resources currently under study
Application to IMRT simulations
Determine the dose distribution in a phantom generated by the head of a linear accelerator
Requirement from clinical practice: fast response
Without parallelisation: events 100 CPU days on Pentium IV 3 GHz
[Figure: lateral profile, 6 MV, 5x5 cm field, 15 mm depth]
Talk: “Geant4 Simulation of an Accelerator Head for Intensity Modulated RadioTherapy”, 19th April, MC 2005, Room 6
Conclusions
Fast performance
– parallel processing
Access to geographically distributed computing resources
– GRID
Demonstrated with Geant4 simulation applications + DIANE
More information
– cern.ch/diane
– aida.freehep.org