IceCube simulation with PPC (photon propagation code)
Dmitry Chirkin, UW Madison


Direct photon tracking with PPC
- simulating flasher/standard candle photons
- same code for muon/cascade simulation
- using precise scattering function: linear combination of HG+SAM
- using tabulated (in 10 m depth slices) layered ice structure
- employing 6-parameter ice model to extrapolate in wavelength
- tilt in the ice layer structure is properly taken into account
- transparent folding of acceptance and efficiencies
- precise tracking through layers of ice, no interpolation needed
- precise simulation of the longitudinal development of cascades and angular distribution of particles emitting Cherenkov photons

Updates to ppc since last meeting
PPC:
- LONG: simulate longitudinal cascade development
- ANGW: smear Cherenkov cone due to shower development
- Corrected ice density to average at detector center
- Made the code scalable with the number of GPU multiprocessors
- The flasher simulation now uses the wavelength profile read from file wv.dat
- Randomized the simulation based on system time (with µs resolution)
- Modified code to run CPU and GPU parts concurrently
- Added option to disable a multiprocessor
- Added the implementation of the simple approximate Mie scattering function
- Added a configuration file "cfg.txt"
- New oversized DOM treatment (designed for minimum bias compared to oversize=1):
  - oversize only in direction perpendicular to the photon
  - time needed to reach the nominal (non-oversized) DOM surface is added
  - re-use the photon after it hits a DOM and ensure the causality in the flasher simulation
[Figure: photon approaching a nominal DOM and an oversized DOM (oversized ~5 times)]

Timing of oversized DOM MC
[Figure: timing plots for flashing DOMs; legend entries listed below]
- xR=1
- default
- do not track back to detected DOM
- do not track after detection
- no oversize delta correction!
- do not check causality
- del=(sqrtf(b*b+(1/(e.zR*e.zR-1)*c)-D)*e.zR-h
- del=e.R-OMR

Photon angular profile from thesis of Christopher Wiebusch

New ice density: mwe
- Handbook of Chemistry and Physics
- T. Gow's data of density near the surface
- T= *d+5.822e-6*d (fit to AMANDA data)
- Fit to (1-p1*exp(-p2*d))*f(T(d))*(1+0.94e-12*9.8*917*d)

Simplified Mie Scattering
- Single radius particles, described better at smaller angles by SAM
- Also known as the Liu scattering function
- Introduced by Jon Miller

New approximation to Mie: f_SAM
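A minimal sketch of how a scattering angle could be drawn from a linear combination of HG and SAM. It assumes the standard Henyey-Greenstein form and the simplified Liu form p(cos θ) ∝ (1 + cos θ)^α with α = 2g/(1 - g), which gives a mean cos θ of g for both components. The function names, and the reading of sf as the SAM fraction, are assumptions made for illustration; this is not the ppc implementation. The values g=0.943 and sf=0.2 are the ones printed in the ppc "Configured:" line.

```python
import random

def sample_hg(g, u):
    """Inverse-CDF sample of cos(theta) for Henyey-Greenstein, u in [0, 1]."""
    if abs(g) < 1e-9:
        return 2.0 * u - 1.0  # isotropic limit
    s = (1.0 - g * g) / (1.0 - g + 2.0 * g * u)
    return (1.0 + g * g - s * s) / (2.0 * g)

def sample_sam(g, u):
    """Inverse-CDF sample of cos(theta) for p ~ (1 + cos(theta))**alpha."""
    alpha = 2.0 * g / (1.0 - g)  # chosen so that <cos(theta)> = g
    return 2.0 * u ** (1.0 / (alpha + 1.0)) - 1.0

def sample_cos_theta(g, sf, rng):
    """Pick SAM with probability sf (assumed meaning), HG otherwise."""
    u = rng.random()
    if rng.random() < sf:
        return sample_sam(g, u)
    return sample_hg(g, u)

def mean_over_u(sampler, g, n=100000):
    """Midpoint-rule average of the sampler over u, i.e. <cos(theta)>."""
    return sum(sampler(g, (i + 0.5) / n) for i in range(n)) / n

if __name__ == "__main__":
    rng = random.Random(1)
    print([round(sample_cos_theta(0.943, 0.2, rng), 4) for _ in range(5)])
```

Because both components have the same mean cosine, changing sf alters the shape of the angular distribution without changing g.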

ppc icetray module
- uses a wrapper: private/ppc/i3ppc.cxx, which is compiled by the cmake system into libppc.so
- it is necessary to compile an additional library libxppc.so by running make in private/ppc/gpu:
  - "make glib" compiles the GPU-accelerated version (needs CUDA tools)
  - "make clib" compiles the CPU version (from the same sources!)
- link to libxppc.so and libcudart.so (if GPU version) from the build/lib directory
- this library file must be loaded before the libppc.so wrapper library
- these steps are automated with a resources/make.sh script

ppc example script run.py
    if(len(sys.argv)!=6):
        print("Use: run.py [corsika/nugen/flasher] [gpu] [seed] [infile/num of flasher events] [outfile]")
        sys.exit()
    …
    det = "ic86"
    detector = False
    …
    os.putenv("PPCTABLESDIR", expandvars("$I3_BUILD/ppc/resources/ice/mie"))
    …
    if(mode == "flasher"):
        …
        str=63
        dom=20
        nph=8.e9
        tray.AddModule("I3PhotoFlash", "photoflash")(…)
        os.putenv("WFLA", "405")  # flasher wavelength; set to 337 for standard candles
        os.putenv("FLDR", "-1")   # direction of the first flasher LED
        …
        # Set FLDR=x+(n-1)*360, where 0<=x<360 and n>0 to simulate n LEDs in a
        # symmetrical n-fold pattern, with first LED centered in the direction x.
        # Negative or unset FLDR simulates a symmetric in azimuth pattern of light.
        tray.AddModule("i3ppc", "ppc")(
            ("gpu", gpu),
            ("bad", bad),
            ("nph", nph*0.1315/25),   # corrected for efficiency and DOM oversize factor; eff(337)=
            ("fla", OMKey(str, dom)), # set str=-str for tilted flashers, str=0 and dom=1,2 for SC1 and 2
            )
    else:
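The FLDR convention in the comments packs two numbers into one. A small sketch of that packing follows, assuming x is an azimuth in degrees with 0 <= x < 360 and n >= 1, which is what makes the encoding invertible. The helper names are hypothetical, and the even 360/n spacing of the LEDs is my reading of "symmetrical n-fold pattern", not something taken from the ppc source.

```python
def encode_fldr(x, n):
    """Pack first-LED azimuth x (degrees) and LED count n into one FLDR value."""
    if not (0 <= x < 360) or n < 1:
        raise ValueError("need 0 <= x < 360 and n >= 1")
    return x + (n - 1) * 360

def decode_fldr(fldr):
    """Return (x, n), or None for a negative FLDR (azimuthally symmetric light)."""
    if fldr < 0:
        return None
    n = int(fldr // 360) + 1
    x = fldr % 360
    return x, n

def led_directions(fldr):
    """Azimuths of the n LEDs, assumed evenly spaced starting at x."""
    decoded = decode_fldr(fldr)
    if decoded is None:
        return []
    x, n = decoded
    return [(x + k * 360.0 / n) % 360 for k in range(n)]

if __name__ == "__main__":
    fldr = encode_fldr(30, 6)
    print(fldr, decode_fldr(fldr), led_directions(fldr))
```

With x=30 and n=6 the encoded value is 1830, which would be passed as os.putenv("FLDR", "1830") in a script like the one above.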

ppc-pick and ppc-eff
ppc-pick: restrict to primaries below MaxEpri
    load("libppc-pick")
    tray.AddModule("I3IcePickModule ","emax")(
        ("DiscardEvents", True),
        ("MaxEpri", 1.e9*I3Units.GeV)
        )
ppc-eff: reduce efficiency from 1.0 to eff
    load("libppc-eff")
    tray.AddModule("AdjEff", "eff")(
        ("eff", eff)
        )
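Reducing the efficiency from 1.0 to eff after the fact can be pictured as random thinning of the simulated hits: each hit is kept independently with probability eff. The sketch below illustrates that idea only; thin_hits is a hypothetical helper and is not claimed to be how the AdjEff module is implemented.

```python
import random

def thin_hits(hits, eff, rng):
    """Keep each hit independently with probability eff (0 <= eff <= 1)."""
    if not 0.0 <= eff <= 1.0:
        raise ValueError("eff must be between 0 and 1")
    return [h for h in hits if rng.random() < eff]

if __name__ == "__main__":
    rng = random.Random(42)
    hits = list(range(100000))  # stand-in for hits simulated at 100% efficiency
    kept = thin_hits(hits, 0.7, rng)
    print(len(kept) / float(len(hits)))  # close to 0.7
```

This is why a single simulation at 100% efficiency can serve a whole range of lower efficiencies: the thinning is cheap and can be repeated with different eff values.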

Todo list from the last meeting
need to:
- verify that it works for V of simulation
- add code to treat high-efficiency DOMs correctly
- verify that it works for IC59
- improve flasher simulation (interface with photoflash)
- figure out the best way to compile
All done! Done?

ppc homepage

GPU scaling
Table rows: Original: 1/2.08 1/2.70; CPU c++; Assembly; GTX 295; GTX/Ori; C1060; C2050; GTX 480
On GTX 295: GHz
    Running on 30 MPs x 448 threads
    Kernel uses: l=0 r=35 s=8176 c=62400
On GTX 480: GHz
    Running on 15 MPs x 768 threads
    Kernel uses: l=0 r=40 s=3960 c=62400
On C1060: GHz
    Running on 30 MPs x 448 threads
    Kernel uses: l=0 r=35 s=3992 c=62400
On C2050: GHz
    Running on 14 MPs x 768 threads
    Kernel uses: l=0 r=41 s=3960 c=62400
- Uses cudaGetDeviceProperties() to get the number of multiprocessors
- Uses cudaFuncGetAttributes() to get the maximum number of threads

Kernel time calculation
Run 3232 (corsika) IC86 processing on cuda002 (per file):
    GTX 295: Device time: (in-kernel: ) [ms]
    GTX 480: Device time: (in-kernel: ) [ms]
If more than 1 thread is running using same GPU:
    Device time: (in-kernel: ) [ms]
3 counters:
    1. time difference before/after kernel launch in host code
    2. in-kernel, using cycle counter: min thread time
    3. max thread time
Also, real/user/sys times of top:
    gpus 6, cpus 1, cores 8, files 693
    Real 749m4.693s
    User 3456m10.888s
    sys 39m50.369s
    Device time: [ms]
    files: 693 real: user: gpu: kernel: [seconds]
81%-91% GPU utilization

Concurrent execution time
[Figure: CPU/GPU timelines for Thread 1, Thread 2, and One thread]
One thread:
- Create track segments
- Copy track segments to GPU
- Process photon hits
- Copy photon hits from GPU
Need 2 buffers for track segments and photon hits
However: have 2 buffers: 1 on host and 1 on GPU!
Just need to synchronize before the buffers are re-used
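The overlap described above can be sketched in plain Python, with a single worker thread standing in for the GPU. Everything here is a stand-in: make_segments and gpu_kernel are hypothetical placeholders for the CPU and GPU steps, and the real code would use asynchronous CUDA calls rather than a thread pool. The point is only the ordering: the host prepares the next batch while the previous one is still being processed, and synchronizes before handing over new data.

```python
from concurrent.futures import ThreadPoolExecutor

def make_segments(i):
    """CPU step (stand-in): create track segments for batch i."""
    return [i * 10 + k for k in range(3)]

def gpu_kernel(segments):
    """GPU step (stand-in): turn track segments into photon hits."""
    return [s * 2 for s in segments]

def run_pipeline(n_batches):
    results = []
    pending = None  # the batch currently "on the GPU"
    with ThreadPoolExecutor(max_workers=1) as gpu:
        for i in range(n_batches):
            # CPU work for batch i overlaps with the kernel for batch i-1.
            segments = make_segments(i)
            if pending is not None:
                # Synchronize before the buffers are re-used.
                results.append(pending.result())
            pending = gpu.submit(gpu_kernel, segments)
        if pending is not None:
            results.append(pending.result())
    return results

if __name__ == "__main__":
    print(run_pipeline(3))
```

Only one batch is in flight at a time, which mirrors the slide's observation that one host buffer and one device buffer are enough as long as there is a synchronization point before re-use.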

BAD multiprocessors (MPs)
clist cudatest cuda cuda cuda
#badmps cuda cuda cuda
Disable 3 bad GPUs out of 24: 12.5%
Disable 3 bad MPs out of 720: 0.4%!
    Configured: xR=5 eff=0.95 sf=0.2 g=0.943
    Loaded 12 angsens coefficients
    Loaded 6x170 dust layer points
    Loaded random multipliers
    Loaded 42 wavelenth points
    Loaded 171 ice layers
    Loaded 3540 DOMs (19x19)
    Processing f2k muons from stdin on device 2
    Total GPU memory usage:
    photons: hits: 991
    Error: TOT was a nan or an inf 1 times! Bad MP #20
    photons: hits: 393
    photons: hits: 570
    photons: hits: 501
    photons: hits: 832
    photons: hits: 717
    CUDA Error: unspecified launch failure
    Total GPU memory usage:
    photons: hits: 938
    Error: TOT was a nan or an inf 9 times! Bad MP #20 #20 #20 #20
    photons: hits: 442
    photons: hits: 627
    CUDA Error: unspecified launch failure
    gpu]$ cat mmc.1.f2k | BADMP=20 ./ppc 2 > /dev/null
    Configured: xR=5 eff=0.95 sf=0.2 g=0.943
    Loaded 12 angsens coefficients
    Loaded 6x170 dust layer points
    Loaded random multipliers
    Loaded 42 wavelenth points
    Loaded 171 ice layers
    Loaded 3540 DOMs (19x19)
    Processing f2k muons from stdin on device 2
    Not using MP #20
    Total GPU memory usage:
    photons: hits: 871
    …
    photons: hits: 114
    Device time: (in-kernel: ) [ms]
Failure rates:
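The two percentages on this slide follow from simple ratios, reproduced here as a quick check. The figure of 30 multiprocessors per GPU is inferred from 720 / 24 and is an assumption about the cluster, not something stated on the slide.

```python
def capacity_lost(n_bad, n_total):
    """Percentage of computing capacity lost when n_bad of n_total units are disabled."""
    return 100.0 * n_bad / n_total

GPUS = 24
MPS_PER_GPU = 30  # assumed: 720 MPs / 24 GPUs
TOTAL_MPS = GPUS * MPS_PER_GPU

if __name__ == "__main__":
    print(capacity_lost(3, GPUS))                 # whole GPUs disabled
    print(round(capacity_lost(3, TOTAL_MPS), 1))  # only the bad MPs disabled
```

Disabling individual multiprocessors instead of whole GPUs reduces the lost capacity by a factor equal to the number of MPs per GPU.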

Typical run times
corsika: run 3232: sec files
    ic86/spx/3232 on cuda00[123] (53.4 seconds per job)
    1.2 days of real detector time in 6.5 days
nugen: run 2972: event files; E^-2 weighted
    ic86/spx/2972 on cudatest (25.1 seconds per job)
    entire 10k set of files in 2.9 days
    -> this is enough for an atmnu/diffuse analysis!
Considerations:
- Maximize GPU utilization by running only mmc+ppc parts on the GPU nodes
- still, IC40 mmc+ppc+detector was run with ~80% GPU utilization
- run with 100% DOM efficiency, save all ppc events with at least 1 MC hit
- apply a range of allowed efficiencies (70-100%) later with ppc-eff module
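The nugen figure can be cross-checked with a line of arithmetic: 10k files at 25.1 seconds per job comes to about 2.9 days if the jobs run back to back on one device. That serial assumption is mine; the slide does not say how many jobs ran in parallel. The helper name wall_days is hypothetical.

```python
SECONDS_PER_DAY = 86400.0

def wall_days(n_jobs, seconds_per_job, n_parallel=1):
    """Wall-clock days needed for n_jobs, with n_parallel jobs running at once."""
    return n_jobs * seconds_per_job / (n_parallel * SECONDS_PER_DAY)

if __name__ == "__main__":
    print(round(wall_days(10000, 25.1), 1))  # nugen: 10k files at 25.1 s per job
```

The same function gives a rough estimate for other data sets once the number of files and the number of devices in use are known.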