Intelligent trigger for Hyper-K with GPUs

Akitaka Ariga University of Bern, Switzerland

Recent changes in design
The conventional design had 10 compartments, each with a noise rate of about the SK scale.
The design has recently come back to the SK style for cost optimization: 1 (or a few) large detectors.
This means a longer gate width, a larger number of PMTs per detector, and a large noise rate to cope with.

Noise rate in Hyper-K
SK -> HK: smaller signal and larger background.
Detector size: larger, so the gate width is longer (200 ns -> 500 ns).
Number of sensors: larger, so the noise N is larger (12k -> 20k ~ 80k).
Noise rate: larger, so the noise N is larger (4 kHz -> 10 kHz).
Photo coverage: smaller, so the signal S is smaller (40% -> 15% ~ 20%).
SK: 200 ns x 12,000 PMTs x 4 kHz = 10 hits/gate (SK threshold = 33 hits)
HK: 500 ns x 20,000 PMTs x 10 kHz = 100 hits/gate
This has a direct impact on low-energy neutrino physics and supernova detection, and a partial impact on proton decay.
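The hits-per-gate arithmetic above can be checked with a few lines of Python. This is only a sketch of the product gate width x number of PMTs x dark rate; the function name `noise_hits_per_gate` is made up for illustration and the inputs are the numbers quoted on the slide.

```python
def noise_hits_per_gate(gate_ns, n_pmts, dark_rate_khz):
    """Mean number of dark-noise hits that fall inside one trigger gate."""
    gate_s = gate_ns * 1e-9          # gate width in seconds
    rate_hz = dark_rate_khz * 1e3    # dark rate per PMT in Hz
    return gate_s * n_pmts * rate_hz


# Numbers quoted on the slide (SK rounds 9.6 up to 10 hits/gate).
sk_noise = noise_hits_per_gate(gate_ns=200, n_pmts=12_000, dark_rate_khz=4)
hk_noise = noise_hits_per_gate(gate_ns=500, n_pmts=20_000, dark_rate_khz=10)

print(f"SK: {sk_noise:.1f} hits/gate")
print(f"HK: {hk_noise:.1f} hits/gate")
```

With 80k PMTs instead of 20k, the same function gives four times the HK value, which is why the sensor count matters as much as the gate width.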

Signal / background
Signal: 6 hits/MeV (SK, 40% coverage), 3 hits/MeV (HK, 20% coverage)
Noise level: expected number of hits in a gate
SK: 200 ns x 12,000 PMTs x 4 kHz = 10 hits/gate
HK: 500 ns x 20,000 PMTs x 10 kHz = 100 hits/gate
Noise hits will be dominant at low energy (E < 30 MeV).
[Figure: signal in SK (40%) and signal in HK (20%) compared with the noise levels in SK and HK; the solar neutrino and supernova energy ranges are marked.]

Detectable energy
Detectable: signal + noise > noise + noise fluctuation.
The noise issue is essential for access to low-energy physics below 20 MeV, where most supernova and solar neutrino signals, and some proton decay signals, lie.
Noise + 5σ fluctuation = realistic threshold.
[Figure: signal + noise in SK and in HK against the realistic threshold; the solar neutrino and supernova ranges and the detectable region are marked.]
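To make the detectability condition concrete, the sketch below estimates the energy at which the signal alone equals a 5σ noise fluctuation. It assumes the noise fluctuation is purely Poissonian (square root of the mean number of noise hits), which the slides do not state; the function name `threshold_mev` is hypothetical, and real trigger thresholds (for example the 33-hit SK threshold) include effects that are not modelled here.

```python
import math


def threshold_mev(noise_hits, hits_per_mev, n_sigma=5.0):
    """Energy at which the signal equals n_sigma fluctuations of the noise.

    Assumption: the noise fluctuation is Poissonian, i.e. sqrt(noise_hits).
    """
    fluctuation = math.sqrt(noise_hits)
    return n_sigma * fluctuation / hits_per_mev


# Noise levels and light yields quoted on the slides.
sk_threshold = threshold_mev(noise_hits=10, hits_per_mev=6)
hk_threshold = threshold_mev(noise_hits=100, hits_per_mev=3)

print(f"SK: about {sk_threshold:.1f} MeV")
print(f"HK: about {hk_threshold:.1f} MeV")
```

Under this assumption the HK value lands in the 15 to 20 MeV region, consistent with the statement that physics below 20 MeV is affected.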

Need to improve trigger quality
Be intelligent! Use the 4D information of the hits, (x, y, z, t).
Many ideas:
Exploit TOF information to narrow the gate width (next page).
Vertex calculation: 2 hits define a hyperbolic surface; 4 hits identify the vertex position uniquely.
Ring pattern fitting.
[Figure: hits A, B and C, with the hyperbolic surface defined by A, B and the one defined by B, C.]

One of many ideas: Sub-volume triggering
The largest factor in the noise increase is the gate width, which is long because the detector is large. Let's make it small.
Sub-volume triggering:
Divide the detector into several sub-volumes.
In each sub-volume, back-calculate the hit times using the distance from the hit positions to the center of the sub-volume. This gives a smaller gate width and cancels the effect of the detector size increase.
Large computing power is required for triggering in O(100) sub-volumes.
[Figure: hit A at time t is projected to A' at time t' relative to the center of sub-volume V.]
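The time back-calculation for one sub-volume can be sketched as follows: subtract the time of flight from the sub-volume center to each hit PMT, then count hits in a narrow gate. This is an illustration only. The light speed uses a refractive index of 1.33 (an assumption), the PMT positions are a toy barrel, and the names `back_calculated_times` and `peak_in_narrow_gate` are hypothetical.

```python
import numpy as np

C_WATER_M_PER_NS = 0.299792458 / 1.33  # assumption: refractive index 1.33


def back_calculated_times(hit_pos, hit_t, center):
    """Subtract the time of flight from `center` to each hit PMT."""
    dist = np.linalg.norm(hit_pos - center, axis=1)
    return hit_t - dist / C_WATER_M_PER_NS


def peak_in_narrow_gate(times, t_min=500.0, t_max=1500.0, bin_ns=10.0):
    """Largest number of hits found in any single bin of width bin_ns."""
    edges = np.arange(t_min, t_max + bin_ns, bin_ns)
    counts, _ = np.histogram(times, bins=edges)
    return int(counts.max())


# Toy event: 100 signal hits on the barrel of a cylinder (R = 34.5 m, H = 100 m),
# light emitted at the center of the sub-volume at t = 1003 ns.
rng = np.random.default_rng(0)
n_hits = 100
phi = rng.uniform(0.0, 2.0 * np.pi, n_hits)
hit_pos = np.column_stack([34.5 * np.cos(phi), 34.5 * np.sin(phi),
                           rng.uniform(-50.0, 50.0, n_hits)])
center = np.zeros(3)
hit_t = 1003.0 + np.linalg.norm(hit_pos - center, axis=1) / C_WATER_M_PER_NS

raw_peak = peak_in_narrow_gate(hit_t)
corrected_peak = peak_in_narrow_gate(back_calculated_times(hit_pos, hit_t, center))
print(raw_peak, corrected_peak)
```

In this toy case the raw hit times are spread over more than 100 ns, while the corrected times all fall into one 10 ns bin, which is the effect that allows the narrow gate.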

Intelligent trigger with GPUs
To profit from the 4D data, more computing power is needed.
The GPU is an ideal solution, and LHEP-Bern has expertise with it.
GPU: Graphics Processing Unit.
Parallel processing with O(1000) processing cores.
The triggering code can be highly parallelized.

Parallel processing
A GPU allows parallel processing with thousands of processing cores.
[Figure: serial processing on a CPU (task 1, task 2, ...) versus parallel processing on a GPU.]

High computing power
NVIDIA GeForce Titan Z = 8 TFLOPS
1 full tower of a CPU-based computing cluster = 5-10 TFLOPS
FLOPS = floating-point operations per second

Experience of LHEP-Bern 1: High speed emulsion reconstruction
Custom-made real-time scanning microscope.
CMOS camera: 0.5 – 2.4 Gbyte/s (real time).
3D track reconstruction with GPUs: x90 faster.
GeForce GTX TITAN x 3: 2688 cores, 6 GB memory and 4.5 TFLOPS each.
JINST 9 P04002 (2014); GTC2014; GPU in high energy physics (2014)

Experience of LHEP-Bern 2: Reconstruction of LAr-TPC
LAr detector (ArgonTube at LHEP-Bern).
Hough transform with GPU: x50 faster processing achieved.

Possible hardware for HK
Data will be distributed to several nodes equipped with GPUs.
O(100) processes run on O(100,000) GPU cores.
4U processing server: 2 CPUs x 10 cores, 8 GPUs (24,000 cores).
[Figure: data at 2.5 Gbyte/s distributed to several processing machines, each with CPUs and GPUs.]

Improve WIT?
One of the bottlenecks of the current algorithm is the number of combinations.
To calculate a vertex from 4 hits, the number of combinations nC4 increases quickly, like n^4: 10C4 = 210 (SK level), 100C4 = 3.9 x 10^6 (HK level).
(According to Michael Smy, a hit selection can reduce n^4 -> n^3, which is implemented in WIT.)
A quick comparison of processing time was made.
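The growth of the number of 4-hit combinations can be reproduced with the standard library. The sketch below only evaluates the binomial coefficient for the two hit counts quoted on the slide; the variable names are illustrative.

```python
from math import comb

# Number of ways to pick 4 hits out of n hits in a gate.
sk_combinations = comb(10, 4)    # SK-level noise: about 10 hits/gate
hk_combinations = comb(100, 4)   # HK-level noise: about 100 hits/gate

print(f"SK level: {sk_combinations} combinations")
print(f"HK level: {hk_combinations} combinations")
print(f"ratio HK/SK: {hk_combinations / sk_combinations:.0f}")
```

A tenfold increase in hits gives roughly four orders of magnitude more combinations, which is the motivation for moving this step to a GPU.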

Vertexing by 4-hits combination
Using WCSim-simulated data provided by Yano: H 100 m, D 69 m, electrons starting from the center.
Only signal hits are used; 5000 events.
The code is implemented on both CPU and GPU. An equivalent result is, of course, obtained on the GPU.
Vertices are reconstructed at the center of the detector (0,0,0), as they should be.
[Figure: reconstructed vertex distributions for CPU and GPU.]
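The slides do not say how the vertex is solved for each 4-hit combination. As one possible illustration, the sketch below solves |p_i - v| = c (t_i - t0) for the vertex v and emission time t0 with a plain Newton iteration started at the detector center. The solver choice, the refractive index of 1.33, the four PMT positions and the function name `vertex_from_four_hits` are all assumptions.

```python
import numpy as np

C_WATER_M_PER_NS = 0.299792458 / 1.33  # assumption: refractive index 1.33


def vertex_from_four_hits(hit_pos, hit_t, n_iter=50, tol=1e-9):
    """Newton iteration for (vertex, t0) from exactly 4 hits.

    Unknowns are v (3 coordinates) and b = c * t0, expressed in meters.
    """
    v = np.zeros(3)  # initial guess: detector center
    d = np.linalg.norm(hit_pos - v, axis=1)
    b = np.mean(C_WATER_M_PER_NS * hit_t - d)
    for _ in range(n_iter):
        diff = hit_pos - v
        d = np.linalg.norm(diff, axis=1)
        residual = d - (C_WATER_M_PER_NS * hit_t - b)
        # Jacobian rows: d(residual)/dv = -unit vector, d(residual)/db = 1
        jac = np.hstack([-diff / d[:, None], np.ones((4, 1))])
        step = np.linalg.solve(jac, -residual)
        v = v + step[:3]
        b = b + step[3]
        if np.linalg.norm(step) < tol:
            break
    return v, b / C_WATER_M_PER_NS


# Toy check with four hypothetical PMT positions and a known vertex.
hit_pos = np.array([[34.5, 0.0, -30.0], [-34.5, 0.0, -30.0],
                    [0.0, 34.5, 30.0], [0.0, -34.5, 30.0]])
true_vertex = np.array([3.0, -2.0, 5.0])
true_t0 = 1003.0
hit_t = true_t0 + np.linalg.norm(hit_pos - true_vertex, axis=1) / C_WATER_M_PER_NS

vertex, t0 = vertex_from_four_hits(hit_pos, hit_t)
print(vertex, t0)
```

Each combination is solved independently of the others, which is why this step maps naturally onto one GPU thread per combination.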

First comparison in speed
Basic optimization was done for the CPU code.
The GPU is a factor of 35 faster.
In my experience, an additional factor of 2-5 can be gained with further optimization.
20 MeV: about 1.6 million combinations / event
15 MeV: about 500,000 combinations / event
[Figure: CPU and GPU processing time in seconds at 3, 5, 7, 9, 11, 13, 15 and 20 MeV; the CPU label reads 788 sec, the GPU value is missing.]

Sub-volume triggering
In each sub-volume, back-calculate the hit times using the distance from the hit positions. This gives a smaller gate width and cancels the effect of the detector size increase.
Test with simulated data: H 100 m, D 50 m, electron emitted from the center (0,0,0) in the x direction.
[Figure: detector with x, y, z axes; hit A at time t is projected to A' at time t' relative to the center of sub-volume V.]

Sub-volume triggering
Time back-calculation to predefined vertices along x.
Histogram x axis = [500, 1500] ns, 10 ns binning; blue histogram = event-related hits.
100 m height, 69 m diameter, 19k PMTs, 9 MeV.
[Figure: detector with x, y, z axes and the center predefined vertex; hit A projected to A' relative to V.]

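Sub-volume triggering
Time back-calculation to predefined vertices along z.
Histogram x axis = [500, 1500] ns, 10 ns binning; blue histogram = event-related hits.
The peak is more localized than when the vertices are lined up along the axis direction. The region with high values is ellipsoidal, so it can be tracked, and requiring several contiguous sub-volumes would also reduce the background.
100 m height, 69 m diameter, 19k PMTs, 9 MeV.
[Figure: detector with x, y, z axes and the center predefined vertex; hit A projected to A' relative to V.]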

Signal/BG Separation
Predefine vertices every 5 m in the detector volume (~3000 vertices).
Find the vertex that has the highest entry in any one time bin.
9 MeV electrons from the center, 5000 events.
Simply counting the number of hits in a 500 ns gate width: σ = 2.7.
Number of hits in 10 ns at the most probable predefined vertex (time-space): σ = 7.0.
[Figure: noise-only and noise + signal distributions for both methods.]
Numerically the separation improves from 2.7σ to 7σ, but it is not as good as expected. The distributions are not Gaussian in the first place. The cause is that, even for noise only, taking the maximum over 3000 vertices gives a high value through chance coincidence. This needs improvement.
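The scan over predefined vertices can be sketched with NumPy as below: build a 5 m grid inside the cylinder, back-calculate the hit times for every grid vertex, and keep the vertex whose best 10 ns bin has the most hits. This is a toy illustration, not the code used for the slides; the cell-centered grid layout, the refractive index of 1.33, the barrel-only toy hits and the names `predefined_vertices` and `best_vertex` are assumptions.

```python
import numpy as np

C_WATER_M_PER_NS = 0.299792458 / 1.33  # assumption: refractive index 1.33


def predefined_vertices(height=100.0, diameter=69.0, spacing=5.0):
    """Cell-centered grid of candidate vertices inside a cylinder."""
    n_xy = int(np.ceil(diameter / spacing))
    n_z = int(np.ceil(height / spacing))
    xy = (np.arange(n_xy) - n_xy / 2 + 0.5) * spacing
    z = (np.arange(n_z) - n_z / 2 + 0.5) * spacing
    gx, gy, gz = np.meshgrid(xy, xy, z, indexing="ij")
    inside = gx**2 + gy**2 <= (diameter / 2) ** 2
    return np.column_stack([gx[inside], gy[inside], gz[inside]])


def best_vertex(hit_pos, hit_t, vertices, t_min=500.0, t_max=1500.0, bin_ns=10.0):
    """Index and hit count of the vertex with the fullest time bin."""
    # dist has shape (n_vertices, n_hits)
    dist = np.linalg.norm(vertices[:, None, :] - hit_pos[None, :, :], axis=2)
    corrected = hit_t[None, :] - dist / C_WATER_M_PER_NS
    n_bins = int(round((t_max - t_min) / bin_ns))
    idx = np.floor((corrected - t_min) / bin_ns).astype(int)
    valid = (idx >= 0) & (idx < n_bins)
    rows = np.broadcast_to(np.arange(len(vertices))[:, None], idx.shape)
    counts = np.zeros((len(vertices), n_bins), dtype=int)
    np.add.at(counts, (rows[valid], idx[valid]), 1)  # per-vertex histograms
    peak = counts.max(axis=1)
    best = int(peak.argmax())
    return best, int(peak[best])


# Toy event: 100 signal hits on the barrel, emitted from one grid vertex.
rng = np.random.default_rng(1)
n_hits = 100
phi = rng.uniform(0.0, 2.0 * np.pi, n_hits)
hit_pos = np.column_stack([34.5 * np.cos(phi), 34.5 * np.sin(phi),
                           rng.uniform(-50.0, 50.0, n_hits)])
vertices = predefined_vertices()
true_index = len(vertices) // 2
hit_t = 1003.0 + np.linalg.norm(hit_pos - vertices[true_index], axis=1) / C_WATER_M_PER_NS

best, peak = best_vertex(hit_pos, hit_t, vertices)
print(len(vertices), best, peak)
```

Because the maximum is taken over every vertex and every time bin, a noise-only event also returns its luckiest bin, which is the chance-coincidence effect noted above. The per-vertex histograms are independent of one another, so the scan parallelizes well.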

Processing time
Noise hits are included.
Less dependent on the number of hits.
30 times faster on the GPU (a further x2-5 speed-up is expected from further optimization).
[Figure: processing time for CPU (divided by 10) and GPU, with the GPU overhead marked.]

Summary
The noise rate is a crucial issue for low-energy neutrino, supernova and proton decay physics.
We are investigating an intelligent trigger that exploits the 4D data from the detector.
Larger computing power, by a factor of >O(100), could be necessary.
The use of GPUs is a promising solution.
