Machine Learning at the Edge: High-Velocity Data Inferencing
Audrey Corbeil Therrien, Omar Quijano, Averell Gatton, Ryan Coffee
High-Velocity Data - Timetool
- Target latency: 10-100 us
- Schematic: timetool-to-laser synchronization
- Image: timetool waveform?
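The timetool analysis amounts to locating an arrival-time edge in a camera trace, e.g. by cross-correlating against a simulated waveform (as in the Timeline section's figure). A minimal sketch, assuming a step-shaped edge and a step kernel; all names, sizes, and values here are illustrative, not taken from the poster:

```python
import numpy as np

def find_edge(trace, kernel):
    """Locate the arrival-time edge in a timetool trace by
    cross-correlating with a simulated edge kernel (hypothetical sketch)."""
    # Mean-subtract both signals, then cross-correlate; the peak of the
    # correlation gives the offset where the kernel best matches the edge.
    corr = np.correlate(trace - trace.mean(), kernel - kernel.mean(), mode="valid")
    return int(np.argmax(corr))

# Synthetic example: a noisy step edge at pixel 300 in a 1024-pixel trace
rng = np.random.default_rng(0)
trace = np.where(np.arange(1024) < 300, 0.0, 1.0) + 0.05 * rng.standard_normal(1024)
kernel = np.where(np.arange(64) < 32, 0.0, 1.0)  # simulated step kernel
edge = find_edge(trace, kernel)  # offset of best match; edge pixel ~ edge + 32
```

The correlation peak lands where the kernel's step aligns with the trace's step, so the edge pixel is recovered as the offset plus the kernel's own step position.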
High-Velocity Data - CookieBox
- Schematic: CookieBox
- Downstream veto + time binning
- Image: CookieBox single pulse, double pulse
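The "downstream veto + time binning" step can be sketched as a per-shot decision: reject shots with too little integrated signal, and sort the survivors into coarse arrival-time bins. The threshold, bin edges, and function names below are hypothetical placeholders, not values from the poster:

```python
import numpy as np

VETO_THRESHOLD = 5.0                      # assumed minimum integrated signal
TIME_BINS = np.linspace(0.0, 100.0, 11)   # assumed 10 bins over a 0-100 window

def veto_and_bin(arrival_times, amplitudes):
    """Return (keep, bin_index); keep=False marks a vetoed bad shot."""
    if amplitudes.sum() < VETO_THRESHOLD:
        return False, -1                  # downstream veto: drop the shot
    # Amplitude-weighted mean arrival time decides the time bin
    t_mean = np.average(arrival_times, weights=amplitudes)
    bin_index = int(np.digitize(t_mean, TIME_BINS)) - 1
    return True, bin_index
```

For example, a shot with pulses at t = 40 and 45 and amplitudes 3 and 4 passes the veto and lands in the bin covering [40, 50), while a single weak pulse is vetoed.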
Workflow
- Diagram: relationship between FPGA, GPU, and CPU and their roles
- FPGA: fast inference (ultimately an ASIC?)
- GPU: training the model
- CPU: physics simulations providing ground truth - understanding the physics of corner cases
- Corner cases sent back for training
- Smart initialization - model library
- Confidence metric: selects corner cases for further training, recommends reloading a new model to scientists when off track
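The confidence metric that routes corner cases back to the GPU/CPU loop could be sketched as a predictive-entropy gate: shots the model is unsure about are flagged for retraining, and a persistently high flagged fraction triggers the model-reload recommendation. The metric choice, thresholds, and names below are assumptions for illustration only:

```python
import numpy as np

ENTROPY_THRESHOLD = 0.5   # assumed per-shot corner-case cutoff (nats)
RELOAD_FRACTION = 0.2     # assumed flagged fraction that signals model drift

def softmax(logits):
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def select_corner_cases(logits):
    """Flag shots whose predictive entropy exceeds the threshold,
    and recommend a model reload when too many shots are flagged."""
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    flagged = entropy > ENTROPY_THRESHOLD       # corner cases for retraining
    recommend_reload = flagged.mean() > RELOAD_FRACTION
    return flagged, recommend_reload

# A confident shot ([10, 0, 0]) has near-zero entropy; an ambiguous
# shot ([0, 0, 0], i.e. uniform) has entropy ln(3) and gets flagged.
logits = np.array([[10.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0]])
flagged, reload_model = select_corner_cases(logits)
```

Flagged shots would be queued for CPU simulation (ground truth) and GPU retraining, closing the loop the diagram describes.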
Hardware
CPU Simulation
- Dual CPU: Intel Xeon Gold 6148 (20 physical / 40 logical cores, 2.4 GHz at 150 W TDP, 3.7 GHz Turbo Boost)
- RAM: 764 GB DDR4
GPU Simulation
- NVIDIA Tesla P4 (inferencing accelerator): 5.5 TF, 15x IL, 60x EF
- NVIDIA Tesla P40 (GPU accelerator): 3,840 cores, 24 GB GDDR5
- NVIDIA Tesla V100: 640 Tensor Cores, 5,120 cores, 32 GB
FPGA
- KCU 1500 hardware accelerator
- 2 TB NVMe
Software
Timeline
- Present: CPU simulation
- In progress: GPU simulation (CUDA 10, cuDNN 7.4, NCCL 2.4, TensorRT 5.0)
- In progress: FPGA (JDK 1.8, Scala 2.12, Spatial, sbt)
- Milestones: pod integration, fast communication, NVMe cache, visualization
Figure - Left: spatially multiplexed 2D timetool signal (January 2018). Right: result of the convolution with the simulated waveform.
- 50% RAM / 80% CPU
- Data visualization for interactive mode
Conclusion
- Detectors provide data; users need information
- ML can convert data into information with low latency and act on that information to:
  - veto bad shots
  - bin specific cases
  - protect detectors
- We are building the software and hardware architecture for this objective
- The technology is extendable to other detectors
- Poster presentation: Using FPGAs for fast inferencing