Evaluation of 'OpenCL for FPGA' for DAQ and Acceleration in HEP applications CHEP 2015 Srikanth S Marie Curie Fellow: ICE-DIP

Slides:



Advertisements
Similar presentations
DEVELOPMENT OF ONLINE EVENT SELECTION IN CBM DEVELOPMENT OF ONLINE EVENT SELECTION IN CBM I. Kisel (for CBM Collaboration) I. Kisel (for CBM Collaboration)
Advertisements

Reconfigurable Computing After a Decade: A New Perspective and Challenges For Hardware-Software Co-Design and Development Tirumale K Ramesh, Ph.D. Boeing.
ECOE 560 Design Methodologies and Tools for Software/Hardware Systems Spring 2004 Serdar Taşıran.
Autonomic Systems Justin Moles, Winter 2006 Enabling autonomic behavior in systems software with hot swapping Paper by: J. Appavoo, et al. Presentation.
1 SECURE-PARTIAL RECONFIGURATION OF FPGAs MSc.Fisnik KRAJA Computer Engineering Department, Faculty Of Information Technology, Polytechnic University of.
Design Methodology for High-Level Model Based on an Eight Bit Entertainment System Alejandro Lizaola, Ricardo D. Castro, Gilberto Beltran. Manuel Salim.
Acceleration of the Smith– Waterman algorithm using single and multiple graphics processors Author : Ali Khajeh-Saeed, Stephen Poole, J. Blair Perot. Publisher:
© 2004, D. J. Foreman 1 O/S Organization. © 2004, D. J. Foreman 2 Topics  Basic functions of an OS ■ Dev mgmt ■ Process & resource mgmt ■ Memory mgmt.
Hardware accelerator for PPC microprocessor Final presentation By: Instructor: Kopitman Reem Fiksman Evgeny Stolberg Dmitri.
Chapter 13: I/O Systems I/O Hardware Application I/O Interface
Interface of DSP to Peripherals of PC Spring 2002 Supervisor: Broodney, Hen | Presenting: Yair Tshop Michael Behar בס " ד.
Hardware accelerator for PPC microprocessor By: Instructor: Kopitman Reem Fiksman Evgeny Stolberg Dmitri.
A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications From J. Fowers, G. Brown, P. Cooke, and G. Stitt, University.
1 Chapter 7 Design Implementation. 2 Overview 3 Main Steps of an FPGA Design ’ s Implementation Design architecture Defining the structure, interface.
Computing Platform Benchmark By Boonyarit Changaival King Mongkut’s University of Technology Thonburi (KMUTT)
Elad Hadar Omer Norkin Supervisor: Mike Sumszyk Winter 2010/11 Date: Technion – Israel Institute of Technology Faculty of Electrical Engineering High Speed.
General Purpose FIFO on Virtex-6 FPGA ML605 board midterm presentation
© 2011 Xilinx, Inc. All Rights Reserved Intro to System Generator This material exempt per Department of Commerce license exception TSU.
Department of Electrical Engineering National Cheng Kung University
Digital signature using MD5 algorithm Hardware Acceleration
CHAPTER 13: I/O SYSTEMS Overview Overview I/O Hardware I/O Hardware I/O API I/O API I/O Subsystem I/O Subsystem Transforming I/O Requests to Hardware Operations.
TM Efficient IP Design flow for Low-Power High-Level Synthesis Quick & Accurate Power Analysis and Optimization Flow JAN Asher Berkovitz Yaniv.
ICE-DIP project Parallel processing on Many-Core processors ICE-DIP introduction at Intel › 22/7/2014.
Niko Neufeld, CERN/PH-Department
Silberschatz, Galvin and Gagne  Operating System Concepts Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem.
1 Lecture 20: I/O n I/O hardware n I/O structure n communication with controllers n device interrupts n device drivers n streams.
Architectural Optimizations David Ojika March 27, 2014.
Eric Keller, Evan Green Princeton University PRESTO /22/08 Virtualizing the Data Plane Through Source Code Merging.
A Comparative Study of the Linux and Windows Device Driver Architectures with a focus on IEEE1394 (high speed serial bus) drivers Melekam Tsegaye
Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device Shuang LiangRanjit NoronhaDhabaleswar K. Panda IEEE.
TEMPLATE DESIGN © Hardware Design, Synthesis, and Verification of a Multicore Communication API Ben Meakin, Ganesh Gopalakrishnan.
F. Gharsalli, S. Meftali, F. Rousseau, A.A. Jerraya TIMA laboratory 46 avenue Felix Viallet Grenoble Cedex - France Embedded Memory Wrapper Generation.
ICE-DIP Mid-term review Data Transfer WP4a - ESR4: Aram Santogidis › 16/1/2015.
Background Physicist in Particle Physics. Data Acquisition and Triggering systems. Specialising in Embedded and Real-Time Software. Since 2000 Project.
A Critical Analysis of the Windows mLAN Driver
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015.
XStream: Rapid Generation of Custom Processors for ASIC Designs Binu Mathew * ASIC: Application Specific Integrated Circuit.
Harmony: A Run-Time for Managing Accelerators Sponsor: LogicBlox Inc. Gregory Diamos and Sudhakar Yalamanchili.
Lecture 12: Reconfigurable Systems II October 20, 2004 ECE 697F Reconfigurable Computing Lecture 12 Reconfigurable Systems II: Exploring Programmable Systems.
Abstract A Structured Approach for Modular Design: A Plug and Play Middleware for Sensory Modules, Actuation Platforms, Task Descriptions and Implementations.
Guido Haefeli CHIPP Workshop on Detector R&D Geneva, June 2008 R&D at LPHE/EPFL: SiPM and DAQ electronics.
1 Copyright  2001 Pao-Ann Hsiung SW HW Module Outline l Introduction l Unified HW/SW Representations l HW/SW Partitioning Techniques l Integrated HW/SW.
Connecting EPICS with Easily Reconfigurable I/O Hardware EPICS Collaboration Meeting Fall 2011.
VAPRES A Virtual Architecture for Partially Reconfigurable Embedded Systems Presented by Joseph Antoon Abelardo Jara-Berrocal, Ann Gordon-Ross NSF Center.
Barcelona 1 Development of new technologies for accelerators and detectors for the Future Colliders in Particle Physics URL.
Survey of multicore architectures Marko Bertogna Scuola Superiore S.Anna, ReTiS Lab, Pisa, Italy.
June, 1999©Vanu, Inc. Vanu Bose Vanu, Inc. Programming the Physical Layer in Wireless Networks.
A Design Flow for Optimal Circuit Design Using Resource and Timing Estimation Farnaz Gharibian and Kenneth B. Kent {f.gharibian, unb.ca Faculty.
ICHEC Presentation ESR2: Reconfigurable Computing and FPGAs ICE-DIP Srikanth Sridharan 9/2/2015.
EU-Russia Call Dr. Panagiotis Tsarchopoulos Computing Systems ICT Programme European Commission.
4/27/2000 A Framework for Evaluating Programming Models for Embedded CMP Systems Niraj Shah Mel Tsai CS252 Final Project.
ROM. ROM functionalities. ROM boards has to provide data format conversion. – Event fragments, from the FE electronics, enter the ROM as serial data stream;
Creation and Utilization of a Virtual Platform for Embedded Software Optimization: An Industrial Case Study Sungpack Hong, Sungjoo Yoo, Sheayun Lee, Sangwoo.
April 15, 2013 Atul Kwatra Principal Engineer Intel Corporation Hardware/Software Co-design using SystemC/TLM – Challenges & Opportunities ISCUG ’13.
CHEP 2010, October 2010, Taipei, Taiwan 1 18 th International Conference on Computing in High Energy and Nuclear Physics This research project has.
Off-Detector Processing for Phase II Track Trigger Ulrich Heintz (Brown University) for U.H., M. Narain (Brown U) M. Johnson, R. Lipton (Fermilab) E. Hazen,
Accelerating particle identification for high-speed data-filtering using OpenCL on FPGAs and other architectures for FPL 2016 Srikanth Sridharan CERN 8/31/2016.
FPGAs for next gen DAQ and Computing systems at CERN
Dynamo: A Runtime Codesign Environment
APPLICATION OF DESIGN PATTERNS FOR HARDWARE DESIGN
THE PROCESS OF EMBEDDED SYSTEM DEVELOPMENT
Effective Data-Race Detection for the Kernel
Next steps – with notes from during the meeting 17 March 2016
FPGAs in AWS and First Use Cases, Kees Vissers
Next steps - with notes from during the meeting 24 March 2016
Highly Efficient and Flexible Video Encoder on CPU+FPGA Platform
Konstantis Daloukas Nikolaos Bellas Christos D. Antonopoulos
I/O Systems I/O Hardware Application I/O Interface
Chapter 13: I/O Systems I/O Hardware Application I/O Interface
NetFPGA - an open network development platform
Presentation transcript:

Evaluation of 'OpenCL for FPGA' for DAQ and Acceleration in HEP applications CHEP 2015 Srikanth S Marie Curie Fellow: ICE-DIP › 13/04/2015

13/04/2015 Srikanth Sridharan – ICE-DIP Project 2

The Grand Challenge for proposed upgrade of LHCb experiment › 500 Data sources each generating data at 100 Gbps › Requires FPGAs for  DAQ modules ̵ Adaptive, high performance  Low Level Trigger decisions ̵ Algorithm acceleration › FPGAs are great  High Performance  Flexibility - Build anything  But… 13/04/2015 Srikanth Sridharan – ICE-DIP Project 3

FPGA pains › Programmability, Programmability, Programmability  Using Hardware Description Languages (HDLs)  At Register transfer level (RTL) abstraction  Create Custom Memory Hierarchies  Manage Communication with PC  Create PC side SW that plays nice with all this 13/04/2015 Srikanth Sridharan – ICE-DIP Project 4

OpenCL for FPGA › An unified programming model for accelerating algorithms on heterogeneous systems  C based programming language  System level design (Host CPU+FPGA)  Memory Hierarchy auto generated  Host CPU-FPGA communication abstracted away  Potential to integrate existing VHDL/Verilog IP 13/04/2015 Srikanth Sridharan – ICE-DIP Project 5

OpenCL for FPGA 13/04/2015 Srikanth Sridharan – ICE-DIP Project 6

End to end DAQ + Acceleration system with ‘OpenCL for FPGA’ ? 13/04/2015 Srikanth Sridharan – ICE-DIP Project 7 › OpenCL spec targets  Algorithm Acceleration  Not FPGA design › But, How about ‘OpenCL for FPGA’  More than Acceleration?  A DAQ system with OpenCL? › Only one way to find out  Go ahead and implement one

An existing FPGA DAQ system 13/04/2015 Srikanth Sridharan – ICE-DIP Project 8 Source Emulator Header Generator › A DAQ module to packetize streaming data › Attempt to implement the functionality in OpenCL kernels › But was it enough? RequirementOpenCL Std specAltera OpenCL extension IO, moving data between kernels X Pipes in v2.0 Channels (IO -> Kernel, Kernel -> Kernel, Kernel ->IO) Bit level manipulation, non std data width XBitfields & Other mechanism Control signalsXX

OpenCL Algorithm Acceleration › Hough Transform  Used for particle track reconstruction in LHCb for the future VertexDetector  Find tracks from the Hit data of the detector  Exisiting OpenCL GPU code ported to FPGA 13/04/2015 Srikanth Sridharan – ICE-DIP Project 9

› Erratum: on this slide a comparison measurement was shown which is erroneous and has been retracted. We will show a corrected measurement in the paper. xx/xx/2014 First Name and Family Name – ICE-DIP Project 10

Conclusions › FPGAs have always been capable devices  Performance, Versatility, Flexibility › But so have been limited by the usability  Separate and HW oriented programming model › OpenCL is a game changer  Unlocking the potential of FPGAs much easier  Sure, there are some road blocks  But future of OpenCL for FPGA is definitely bright 13/04/2015 Srikanth Sridharan – ICE-DIP Project 11

Thank You! 13/04/2015 Srikanth Sridharan – ICE-DIP Project 12