Queensland University of Technology CRICOS No J AstroMed09, 14-16th December, The University of Sydney aka The Full Monte! Optimisation of Monte Carlo codes for High Performance Computing in Radiotherapy Applications Dr Iwan Cornelius, M.B. Flegg, C.M. Poole, Prof Christian Langton Faculty of Science and Technology Queensland University of Technology Queensland Cancer Physics Collaborative
Outline
  Introduction
  Development of a LINAC Monte Carlo model using GEANT4
  Optimisation
  Future Directions
  Conclusions
Introduction: Radiotherapy
  LINAC: produces a highly controllable source of MeV photons
    –Energy
    –Gantry angle
    –Patient position
Introduction: Radiotherapy
  LINAC: produces a highly controllable source of MeV photons
    –Multi-Leaf Collimators (MLCs) define arbitrarily shaped fields
Introduction: Radiotherapy
  Planning
    –Patient imaged
    –PTV and OARs contoured
    –Optimisation of fields to conform dose to tumour and spare healthy tissue
  Delivery
    –Fractionated
  Planning is based on analytical dose calculations
    –Can be inaccurate in regions of high heterogeneity
Monte Carlo
  What is it?
  How is it used in radiotherapy?
    –Treatment plan verification
    –Support new dosimetry measurements used in QA
  What tools exist?
    –EGSnrc/BEAMnrc, PENELOPE, MCNPX, GEANT4
  Challenges to overcome
    –Reduce computation times (maintain accuracy)
      Code optimisation
      Variance reduction
      High Performance Computing (HPC)
    –Usability
High Performance Computing
  Monte Carlo: trivial to parallelise
    –Launch identical application with unique random number generator seed
    –Collate results
  Centralised clusters
    –Multiple machines, Beowulf
    –Multiple CPUs, shared memory (SGI Altix)
  Cons
    –Look better on paper
    –Sharing resource with other users
    –Often limited in number of processors, wait in queue
  Single machine, multiple processors
    –Dual quad core
    –With hyper-threading, appears as 16 cores
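The "unique seed, then collate" recipe above can be sketched in a few lines. This is a toy illustration only, not the LINAC application: the attenuation coefficient, slab thickness, and the names `run_job`, `collate`, and `run_all` are all assumptions made for the example. Each job owns a private random number generator seeded differently, and the tallies are merged at the end.

```python
import math
import random

MU = 0.05         # assumed attenuation coefficient (1/cm), illustrative only
THICKNESS = 20.0  # assumed slab thickness (cm), illustrative only


def run_job(args):
    """One independent run: count photons whose first free path exceeds the slab."""
    seed, n_histories = args
    rng = random.Random(seed)  # private generator, unique seed per job
    transmitted = 0
    for _ in range(n_histories):
        free_path = rng.expovariate(MU)  # exponential free path, mean 1/MU
        if free_path > THICKNESS:
            transmitted += 1
    return transmitted, n_histories


def collate(results):
    """Merge the per-job tallies into a single transmission estimate."""
    hits = sum(r[0] for r in results)
    total = sum(r[1] for r in results)
    return hits / total


def run_all(seeds, n_histories, mapper=map):
    """Run one job per seed; `mapper` decides where the jobs execute."""
    jobs = [(seed, n_histories) for seed in seeds]
    return collate(list(mapper(run_job, jobs)))


estimate = run_all(seeds=range(1, 9), n_histories=5000)
expected = math.exp(-MU * THICKNESS)  # analytical answer for this toy problem
print(estimate, expected)
```

The built-in `map` runs the jobs one after another; passing the `map` method of a `multiprocessing.Pool` as `mapper` would spread the same jobs over the available cores without changing the jobs themselves.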
High Performance Computing: GPGPU
  General Purpose Graphics Processing Units
    –Hundreds of processors on a chip
    –NVIDIA Tesla C1060: PCI Express card, 240 cores per card, 4 GB memory
  CUDA
    –Compute Unified Device Architecture
    –Write ‘kernel’ in ‘C for CUDA’ to run on the GPU
    –Copy input from main memory to device memory
    –Kernel executes on GPU
    –Copy result back to main memory
    –Great for loops
  How to ‘accelerate’ Monte Carlo codes with GPUs
    –Re-engineer entire code into C for CUDA kernels
    –Re-write computationally intensive portions of code into ‘kernels’ using CUDA
    –Calculation time doesn’t scale with number of processors
GEANT4
  Toolkit of C++ classes
    –Primary beam, geometry, physics processes, scoring
    –User must create their own application based on these
  Very powerful general purpose Monte Carlo tool
    –High energy physics, space physics, medical physics, optics, radiation protection, astrophysics
GEANT4
  Pros
    –Extremely flexible
    –Time dependent geometries
    –Radioactive decay, neutron transport
    –Various visualisation tools
  Cons
    –Extremely flexible
    –Requires proficiency with C++ programming
    –Steep learning curve
    –Deterrent for first time users
    –Hospital based Medical Physicists with limited research time
The Full Monte!
  Create generic LINAC application using GEANT4
    –Capable of modelling Elekta, Varian, Siemens LINACs
    –Do for GEANT4 what BEAMnrc did for EGSnrc (just text inputs)
    –Accurate. Verify against experimental data.
  Optimise for HPC environments (Desktop Supercomputer)
    –Distribute over available CPUs
    –Port to the GPU
  User interface
    –Simple text-file based interface
    –Graphical User Interface
  Interface with TPS
    –Able to routinely verify treatment plans
Geometry
  Varian 2100 Clinac
    –Dimensions, material composition from Varian documentation
  Target
  Primary collimator
  Vacuum window
Geometry
  Flattening filter
    –Compensates for forward peaked distribution of bremsstrahlung photons
  Ionisation chamber
    –Monitors total dose delivery
Geometry
  Jaws
    –Define square fields
Geometry
  Multi-Leaf Collimators (MLCs)
    –Interleaved tungsten leaves
    –Varian Millennium
    –Model by Brad Oborn (UoW)
Primary Beam
  Monoenergetic electron beam
  Normally incident on target
  Gaussian spread radially
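A primary generator with those three properties might look like the sketch below. The energy is the tuned 5.85 MeV value quoted on the results slides; the spot width `SIGMA_CM`, the choice of -z as the beam direction, and the name `sample_primary` are assumptions made for illustration.

```python
import random

ENERGY_MEV = 5.85  # tuned beam energy quoted on the results slides
SIGMA_CM = 0.1     # assumed radial spread of the spot (hypothetical value)


def sample_primary(rng):
    """Return one primary electron: fixed energy, normal incidence, Gaussian spot."""
    # Independent Gaussians in x and y give a radially symmetric 2D Gaussian.
    x = rng.gauss(0.0, SIGMA_CM)
    y = rng.gauss(0.0, SIGMA_CM)
    return {
        "energy_mev": ENERGY_MEV,        # monoenergetic
        "position_cm": (x, y, 0.0),      # start on the target plane (assumed z = 0)
        "direction": (0.0, 0.0, -1.0),   # normal incidence, beam assumed along -z
    }


rng = random.Random(0)
primaries = [sample_primary(rng) for _ in range(2000)]
```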
Physics
  Photons
    –Photoelectric effect
    –Compton scattering
    –Gamma conversion
  Electrons
    –Multiple scattering
    –Ionisation
    –Bremsstrahlung
  Positrons
    –Ditto
    –Annihilation
Scoring
  Water phantom
    –50 cm x 50 cm x 50 cm
    –Score in voxelised geometry
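Scoring in a voxelised geometry amounts to mapping each energy deposit to a voxel index and accumulating it there. The sketch below assumes a 50 cm cube with its origin at one corner and 0.5 cm voxels; the voxel size and the names `voxel_index` and `DoseGrid` are hypothetical, since the slides do not give the scoring resolution.

```python
PHANTOM_SIZE_CM = 50.0  # phantom edge length, from the slide
N_VOXELS = 100          # assumed voxels per axis (0.5 cm voxels, hypothetical)


def voxel_index(x, y, z):
    """Map a position (cm) to an (i, j, k) voxel index, or None if outside."""
    index = []
    for coord in (x, y, z):
        if not 0.0 <= coord < PHANTOM_SIZE_CM:
            return None  # deposit lies outside the phantom
        i = int(coord / PHANTOM_SIZE_CM * N_VOXELS)
        index.append(min(i, N_VOXELS - 1))  # guard against rounding at the edge
    return tuple(index)


class DoseGrid:
    """Sparse tally of deposited energy per voxel."""

    def __init__(self):
        self.edep = {}

    def score(self, x, y, z, edep_mev):
        idx = voxel_index(x, y, z)
        if idx is not None:
            self.edep[idx] = self.edep.get(idx, 0.0) + edep_mev


grid = DoseGrid()
grid.score(25.0, 25.0, 25.0, 0.5)
grid.score(25.1, 25.2, 25.3, 0.5)   # falls in the same 0.5 cm voxel
grid.score(60.0, 25.0, 25.0, 1.0)   # outside the phantom, ignored
```

Converting deposited energy to dose would additionally require dividing by the voxel mass, which is omitted here.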
Validation / Commissioning
  Comparison with ionisation chamber measurements in a water phantom
    –Scanning in x, y, z
  [Figure: dose along beam axis]
Validation: Tune Electron Beam Energy
  Tuning of electron beam energy for best match
    –10 cm x 10 cm field
    –Compare between 10 cm and 30 cm depths
Results: Tune Electron Beam Energy
  Comparison with ionisation chamber measurements in water
  Tuning of electron beam energy for best match
    –10 cm x 10 cm field
    –Compare between 10 cm and 30 cm depths
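One way to make "best match" concrete is to pick the candidate energy whose simulated depth-dose curve has the smallest worst-case percentage difference from measurement over the comparison range. The slides do not state which metric was used, so the metric, the function names, and every number below are assumptions; the curves are synthetic placeholders, not measured or simulated data.

```python
def max_percent_diff(simulated, measured, depth_range_cm):
    """Largest |sim - meas| / meas (in %) over depths inside the given range."""
    lo, hi = depth_range_cm
    diffs = [
        abs(simulated[d] - measured[d]) / measured[d] * 100.0
        for d in measured
        if lo <= d <= hi and d in simulated
    ]
    return max(diffs)


def best_energy(candidates, measured, depth_range_cm=(10.0, 30.0)):
    """Return the candidate energy (MeV) with the smallest worst-case difference."""
    return min(
        candidates,
        key=lambda e: max_percent_diff(candidates[e], measured, depth_range_cm),
    )


# Synthetic placeholder curves: depth (cm) -> relative dose, normalised at 10 cm.
measured = {10.0: 100.0, 20.0: 60.0, 30.0: 35.0}
candidates = {
    5.5: {10.0: 100.0, 20.0: 57.0, 30.0: 32.0},
    6.0: {10.0: 100.0, 20.0: 60.5, 30.0: 35.3},
    6.5: {10.0: 100.0, 20.0: 63.0, 30.0: 38.0},
}

chosen = best_energy(candidates, measured)
worst = max_percent_diff(candidates[chosen], measured, (10.0, 30.0))
```

The same `max_percent_diff` helper can be reused over a wider depth range to check an agreement tolerance such as 2%.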
Results: 5.85 MeV, 10 cm x 10 cm
  Agreement within 2% between 0.5 cm and 38 cm
Results: 5.85 MeV, 5 cm x 5 cm
Results: 5.85 MeV, 20 cm x 20 cm
Results: 5.85 MeV, 40 cm x 40 cm
Optimisations
  No optimisation
    –Many photons produced will never reach the sensitive region of the geometry
Optimisations
  Kill zones
    –Nothing fancy-pants
    –Terminate histories that are unlikely to contribute to the observable
    –Above target
    –Around primary collimator
  Relative computation time: 78 %
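A kill zone is just a geometric test applied at each step: if the particle has entered a region from which it is unlikely to reach the scoring volume, its history is stopped. The sketch below is a toy version with made-up dimensions; `TARGET_Z_CM`, the collimator extents, and the function names are all hypothetical. The slides do not say how this was wired into the application; in GEANT4 a common approach is to set the track status to `fStopAndKill` from a user stepping action.

```python
import math

# Hypothetical geometry, beam travelling toward -z from a target at z = 0.
TARGET_Z_CM = 0.0
ABOVE_TARGET_MARGIN_CM = 1.0
COLLIMATOR_Z_RANGE_CM = (-8.0, -2.0)
COLLIMATOR_OUTER_RADIUS_CM = 10.0


def in_kill_zone(position_cm):
    """True if a particle at this position should have its history terminated."""
    x, y, z = position_cm
    if z > TARGET_Z_CM + ABOVE_TARGET_MARGIN_CM:
        return True  # above the target, heading away from the patient
    z_lo, z_hi = COLLIMATOR_Z_RANGE_CM
    if z_lo <= z <= z_hi and math.hypot(x, y) > COLLIMATOR_OUTER_RADIUS_CM:
        return True  # outside the primary collimator
    return False


def transport(step_positions):
    """Follow a history step by step; return how many steps were simulated."""
    n_steps = 0
    for position in step_positions:
        if in_kill_zone(position):
            break  # stop tracking: the remaining steps are never computed
        n_steps += 1
    return n_steps
```

For example, `transport([(0, 0, 0), (0, 0, 2), (0, 0, 3)])` simulates only the first step, because the second position lies above the target.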
Optimisations
  Phase space files
    –Some aspects of geometry don’t change
    –Create pre-calculated radiation field at a plane
    –Sample this population to save computation time
  Relative computation time: 38 %
  380 hrs, O(10^10)
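The phase space idea can be sketched as "record once, sample many times": particles crossing a plane below the fixed part of the head are written out, and later runs draw their source particles from that stored population instead of re-simulating the head. The CSV layout, the field names, and the three example particles below are assumptions for illustration, not the file format used by the authors.

```python
import csv
import io
import random

FIELDS = ["x_cm", "y_cm", "u", "v", "w", "energy_mev"]  # assumed record layout


def write_phase_space(stream, particles):
    """Store the particles that crossed the scoring plane."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    for p in particles:
        writer.writerow([p[f] for f in FIELDS])


def read_phase_space(stream):
    """Load a stored population back into memory."""
    reader = csv.DictReader(stream)
    return [{f: float(row[f]) for f in FIELDS} for row in reader]


def sample_source(population, rng, n):
    """Draw n source particles (with replacement) from the stored population."""
    return [rng.choice(population) for _ in range(n)]


# Placeholder particles standing in for the output of the expensive first stage.
recorded = [
    {"x_cm": 0.1, "y_cm": -0.2, "u": 0.0, "v": 0.0, "w": -1.0, "energy_mev": 1.5},
    {"x_cm": 1.3, "y_cm": 0.4, "u": 0.1, "v": 0.0, "w": -0.99, "energy_mev": 0.8},
    {"x_cm": -2.0, "y_cm": 0.0, "u": 0.0, "v": 0.05, "w": -0.99, "energy_mev": 3.2},
]

buffer = io.StringIO()       # stands in for a file on disk
write_phase_space(buffer, recorded)
buffer.seek(0)
population = read_phase_space(buffer)
sources = sample_source(population, random.Random(42), 10)
```

Sampling with replacement reuses stored particles, so the statistical quality of the final result is limited by the size of the stored population.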
HPC: GPU/CPU Desktop Supercomputer
  Purchase of Xenon T5 Desktop Supercomputer
    –“The Terminator”
    –4 x C1060 Tesla cards = 960 cores!
    –2 x quad-core processors with hyper-threading: Linux ‘sees’ 16 processors
  NVIDIA Professorial partnership grant
    –Awarded 3 x C1060 Tesla cards
  Research team learning CUDA
    –Mark Harris, local CUDA guru
Optimisations: Parallelise on CPUs
  Message Passing Interface (MPI)
    –Run identical simulation on each core with a unique random number seed
    –Geant4 MPImanager class
    –Computation time falls roughly in inverse proportion to the number of processors
    –Simulations in 24 hrs, O(10^10)
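The quoted times are consistent with near-ideal scaling, as the arithmetic below shows. It assumes that the 380 hrs figure from the phase space slide is the single-processor time and that the 24 hrs run used all 16 processors visible to Linux; the slides state neither explicitly, so treat the result as a plausibility check only.

```python
def ideal_time(t_serial_hrs, n_processors):
    """Run time under perfect linear scaling."""
    return t_serial_hrs / n_processors


def parallel_efficiency(t_serial_hrs, t_parallel_hrs, n_processors):
    """Fraction of ideal speed-up actually achieved (1.0 means perfect scaling)."""
    return t_serial_hrs / (n_processors * t_parallel_hrs)


print(ideal_time(380.0, 16))                           # 23.75
print(round(parallel_efficiency(380.0, 24.0, 16), 3))  # 0.99
```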
The GPU Dilemma
  1. Re-write entire code into C for CUDA?
    –C for CUDA doesn’t support sophisticated data types (classes)
    –O(10^6) lines of code, dozens of developers
    –Wait for CUDA to catch up (?)
  2. Create C++ wrapper classes for certain methods
    –First step, random number generator
    –Incorporated into GEANT4 framework via inheritance
    –Implementing Mersenne Twister algorithm (hack example from CUDA SDK) to generate cache of random numbers
    –Improvement of only a few percent
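A gain of only a few percent is what Amdahl's law predicts when the accelerated part is a small share of the total run time. The sketch below uses a hypothetical 5% share for random number generation; the slides do not report the actual fraction, so the numbers are illustrative only.

```python
def amdahl_speedup(fraction_accelerated, factor):
    """Overall speed-up when a fraction of the run time is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / factor)


# Hypothetical: random number generation takes 5% of the run time.
rng_share = 0.05
speedup_100x = amdahl_speedup(rng_share, 100.0)  # GPU makes that part 100x faster
upper_bound = 1.0 / (1.0 - rng_share)            # limit for an infinitely fast GPU
print(round(speedup_100x, 4), round(upper_bound, 4))  # 1.0521 1.0526
```

Under this assumption even an infinitely fast generator saves at most about 5% of the run time, which is why profiling to find the dominant cost comes before porting.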
Profiling!
  Great first step when optimising code
  Linux
    –gprof requires re-compiling with profiling flags set
  Mac OS X
    –Profiling tool doesn’t require a recompile
Conclusions
  GEANT4 LINAC application has been developed
    –Specific to Varian Clinac
    –Many parameters hard-coded
    –Work commenced on text-file based UI commands
    –Preliminary validation promising
  Optimisation
    –Phase space files
    –Kill zones
    –MPI for parallel processing on CPUs
    –Porting random number generator to GPU
Future Directions
  Validation
    –Verify dose distributions in heterogeneous phantoms
    –Verify model of MLCs (irregular fields)
    –Develop interface to Treatment Planning System
  Optimisation
    –Re-write part of GEANT4 to run on GPU
  Interface
    –User friendly text-file based commands
  Treatment Plan interface
    –Implement DICOM-RT interface
Acknowledgements
  QUT
    –Scott Crowe, Tanya Kairn, Andrew Fielding: discussion on Varian LINAC model, experimental data
    –Mark Barry, Mark O Dwyer: discussion on CPU optimisation, High Performance Computing
  Mater Hospital, Brisbane
    –Radiation Oncology Group
  UoW
    –Brad Oborn: Millennium MLC model
  GEANT4 Collaboration
    –Joseph Perl (SLAC): discussion on visualisation / profiling
  NVIDIA
    –Mark Harris