Calliope-Louisa Sotiropoulou. Cognitive Imaging Using FTK Hardware. Meeting on Medical Imaging.

A New Approach to Image Processing
The motivation:
Requirement of the FTK IAPP research project → impact of IAPP outside the HEP community.
A very attractive research target for the embedded system designers and engineers of the group.
Interdisciplinary research is required by almost all calls for funding.
But in fact, it is a very good idea to reuse the pattern matching hardware that already exists for HEP for generic image processing.

A New Approach to Image Processing
The FTK Processor is a very fast, powerful and efficient pattern matching machine.
The particle tracking problem is in fact an image processing problem.
[Figure: HEP event, before filtering and after filtering]

A New Approach to Image Processing
The concept is to use existing resources and ideas (e.g. the AMchip, AMboards) for image processing and pattern matching in different application fields.
A great number of interdisciplinary applications (not limited to image processing) is available: biomedical imaging (e.g. MRI), cognitive imaging (Del Viva's algorithm [1]), security (real-time face recognition, personnel tracking), DNA analysis, smart cameras, data mining (e.g. data servers, search engines), etc.
[1] M. Del Viva, G. Punzi, and D. Benedetti. Information and Perception of Meaningful Patterns. PLoS ONE 8(7) (2013): e69154.

FTK for Image Filtering
Image filtering → finding contours is the way the human brain operates: a new approach for image processing hardware.
Major difference between HEP and the human brain:
For HEP, we know the relevant patterns a priori.
The human brain is trained in real time (we do not know the patterns beforehand and we always adjust to new environments).
Target: design and prototype a pattern matching machine that executes both training and filtering in real time, using high performance embedded systems (FPGAs, ASICs and a combination of the two: SiP, System in Package).

FTK for Image Filtering
We are developing a system that can execute real-time contour identification, taking advantage of the AM chip technology.
The system will include a real-time training implementation that allows the selection of the "relevant" patterns to adapt to the external environment (e.g. smart cameras for security applications, robot vision, etc.).
The training phase is a "cognitive image processing" approach. Identifying the relevant patterns resembles the way the human brain operates → an interesting field of research: what additional information can we obtain in real time with the cognitive approach?

Hardware configuration: Use of AMchip 05 Test system
A Virtex-6 LX240T development board with a system implemented to test the AMchip.
Functions that exist: pattern loading, pattern comparison, Ethernet connection, IPbus controls, GTX transceivers, etc.

Hardware configuration: Use of AMchip 05 Test system
Training: performed by the FPGA + external memory.
Pattern filtering: performed by the AM chip.

Adapt the existing system to load images and execute the training phase:
Identify all existing patterns and generate Probability Density Histograms.
Identify useful patterns.
Adapt the existing system to execute pattern matching:
Load the useful patterns to the AMchip.
Execute pattern matching for various input images.
Extract contours and/or execute a post processing step (e.g. clustering).

Hardware Implementation: Training Phase: Calculate the frequency of each possible pattern
Each pattern is its own memory address (Pattern = ADDRESS), e.g. pattern = 11,11,11 - 11,11,11 - 01,11,11.
4 gray levels: 2^18 = 256 Kpatterns (B/W movement: 2^27 = 128 Mpatterns).
Memory: 32 b × 256 k = 500 kB (movement: 32 b × 128 M = 500 MB); external memory: 2 GB.
NumOfAppearances = NoA(pattern): each location contains the updated number of appearances of its pattern.
When a pattern is identified, its address is read, the contents are increased by one (+1) and written back to the same location (ACCUMULATION).
[Diagram labels: 18 (or 27) bits; 16 bits]
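The "pattern = address" idea can be sketched in software. The following is a minimal Python model, not the FPGA design: it assumes a 3×3 window of 2-bit gray levels packed row by row with the first pixel in the most significant bits (the packing order is an assumption; the slide only shows that 9 pixels × 2 bits give 18 bits), and a saturating counter per address. The function names `pattern_address`, `accumulate` and `table_bytes` are hypothetical. `table_bytes` simply evaluates the table size for a given address and counter width: 2^18 locations take 1 MiB with 32-bit counters and 512 KiB with 16-bit counters, and 2^27 locations take 512 MiB with 32-bit counters, so the counter width behind the slide's "500 kB" figure is not settled by the slide text alone.

```python
def pattern_address(patch, bits_per_pixel=2):
    """Pack a 3x3 patch of small integers into one address (row-major, MSB first)."""
    addr = 0
    for row in patch:
        for px in row:
            addr = (addr << bits_per_pixel) | px
    return addr


def accumulate(image, counts, bits_per_pixel=2, counter_bits=16):
    """Slide a 3x3 window over the image and count each pattern at its address."""
    max_count = (1 << counter_bits) - 1  # saturate instead of wrapping (assumption)
    rows, cols = len(image), len(image[0])
    for r in range(rows - 2):
        for c in range(cols - 2):
            patch = [image[r + i][c:c + 3] for i in range(3)]
            addr = pattern_address(patch, bits_per_pixel)
            if counts[addr] < max_count:
                counts[addr] += 1  # read, +1, write back to the same location
    return counts


def table_bytes(address_bits, counter_bits):
    """Size in bytes of a table with one counter per possible pattern."""
    return (1 << address_bits) * counter_bits // 8


# The example pattern from the slide: 11,11,11 - 11,11,11 - 01,11,11
example = [[3, 3, 3], [3, 3, 3], [1, 3, 3]]
example_address = pattern_address(example)

counts = accumulate([[0] * 4 for _ in range(4)], [0] * (1 << 18))
```

A 4×4 all-zero image contains four 3×3 windows, so address 0 ends up with a count of 4.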

Hardware Implementation: Training Phase: Calculate the frequency of each possible pattern
The accumulator is optimized to have a throughput of one pattern address per cycle, even if the address is the same as the previous one (special design to avoid data corruption).
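The slide does not say how the data corruption is avoided. One common technique for this kind of read-modify-write hazard is operand forwarding (bypassing), and the cycle-level Python sketch below illustrates it under that assumption; it is not a description of the actual accumulator. The model assumes a two-stage pipeline in which the value computed in one cycle is written back during the next cycle, so a read of the same address in that next cycle would see stale memory contents. The name `simulate_accumulator` and the `forwarding` flag are hypothetical.

```python
def simulate_accumulator(addresses, depth, forwarding=True):
    """Cycle-level model of a 2-stage read / increment / write-back accumulator.

    One address is accepted per cycle. The result computed for the previous
    address is still "in flight" when the next read happens, so back-to-back
    identical addresses read a stale value unless the result is forwarded.
    """
    mem = [0] * depth
    in_flight = None  # (address, new_value) waiting to be written back

    # One extra idle cycle (None) drains the last in-flight write.
    for addr in list(addresses) + [None]:
        read_val = None
        if addr is not None:
            read_val = mem[addr]  # memory read sees only completed writes
            if forwarding and in_flight is not None and in_flight[0] == addr:
                read_val = in_flight[1]  # bypass: use the pending value instead

        if in_flight is not None:
            mem[in_flight[0]] = in_flight[1]  # write-back of the previous result

        in_flight = (addr, read_val + 1) if addr is not None else None
    return mem


with_bypass = simulate_accumulator([5, 5, 5, 2], depth=8, forwarding=True)
without_bypass = simulate_accumulator([5, 5, 5, 2], depth=8, forwarding=False)
```

With forwarding, three consecutive hits on address 5 are all counted; without it, the model loses one increment, which is the kind of corruption the slide refers to.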

Hardware Implementation: Training Phase: Updating Frequencies
Select the patterns that are efficient carriers of information given the bandwidth (W) and memory (N) limits:
$$ f(p) = \begin{cases} -W \log(p) & \text{if } p > W/N \\ -pN \log(p) & \text{if } p < W/N \end{cases} $$
[Block diagram labels: N of appearances (per pattern); p_i; RAM: -log(p_i); DSP; compare to THR; yes or no]
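The selection function on this slide can be evaluated in software as a check of the arithmetic. The sketch below is only an illustration: it assumes natural logarithms (the slide does not give the base), estimates p as the count of a pattern divided by the total count, and keeps a pattern when f(p) exceeds a threshold (the slide says only "compare to THR", so the direction of the comparison is an assumption). The names `selection_score` and `select_patterns` are hypothetical. At p = W/N both branches give the same value, -W log(W/N), so the choice of branch at equality does not matter.

```python
import math


def selection_score(p, W, N):
    """f(p) = -W*log(p) if p > W/N, else -p*N*log(p); 0 for patterns never seen."""
    if p <= 0.0:
        return 0.0
    if p > W / N:
        return -W * math.log(p)
    return -p * N * math.log(p)


def select_patterns(counts, W, N, threshold):
    """Return the addresses whose score exceeds the threshold (assumed rule)."""
    total = sum(counts)
    if total == 0:
        return []
    return [
        addr
        for addr, count in enumerate(counts)
        if selection_score(count / total, W, N) > threshold
    ]


# Toy histogram over four pattern addresses
toy_counts = [50, 10, 40, 0]
selected = select_patterns(toy_counts, W=1.0, N=4, threshold=0.9)
```

For the toy histogram, p = 0.5, 0.1 and 0.4 give scores of about 0.693, 0.921 and 0.916, so only addresses 1 and 2 pass a threshold of 0.9.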

Hardware Implementation: Running Phase: Filtering Patterns
[Diagram labels: Running Phase; Bus_layer0 to Bus_layer7; layer0 to layer7; pattern0 to pattern3; Patt0 to Patt7]
For each comparison we (1) compare 8 patterns, (2) read the bitmap for each match, (3) give INIT.

Future Plans: Use of an Ultrascale evaluation board and System in Package (SiP) systems
More resources available on the FPGA device → new technology, new design approaches, greater challenges (Ultrascale).
Merge the devices on a single chip (SiP).
Target: the development of a portable and flexible hardware accelerator (possibly with a PCI-type interface) that can be easily connected to a PC.

Status
Development of the Training Phase implementation:
Accumulator: ready.
Calculation of frequency: in progress.
Loading of patterns: in progress.

Post Processing Step: 2D Pixel Clustering
A real-time implementation developed for the pixel sensors of the ATLAS detector, but generic enough to be adapted to various applications.
The cluster of pixels is replaced with the center of a bounding box.
The post processing step can be adapted to suit the application.
Versatility → different clustering engines can work independently, in parallel, on different data.
The implementation is fully tested and functioning.
[Figure: bounding box and its center coordinate. Min_col = 0, Min_col_pix_center = 0.5; Max_col = 4, Max_col_pix_center = 3.5; Min_row = 0, Min_row_pix_center = 0.5; Max_row = 5, Max_row_pixel_center = 4.5]
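The bounding-box step can be illustrated with a short software sketch. This is not the FPGA clustering engine: it assumes hits are given as integer (col, row) indices, groups them with a simple flood fill using 8-connectivity (the connectivity rule is an assumption), and places the center of a pixel at index + 0.5, as the pix_center values in the figure suggest. The name `cluster_centers` is hypothetical.

```python
def cluster_centers(hits):
    """Group neighboring pixels and return the bounding-box center of each cluster.

    hits: iterable of (col, row) integer pixel indices.
    Returns a sorted list of (center_col, center_row) in pixel-center coordinates.
    """
    remaining = set(hits)
    centers = []
    while remaining:
        seed = remaining.pop()
        stack = [seed]
        cols, rows = [seed[0]], [seed[1]]
        while stack:
            col, row = stack.pop()
            # Visit the 8 neighbors (the pixel itself is already removed).
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    neighbor = (col + dc, row + dr)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        stack.append(neighbor)
                        cols.append(neighbor[0])
                        rows.append(neighbor[1])
        center_col = (min(cols) + max(cols)) / 2 + 0.5
        center_row = (min(rows) + max(rows)) / 2 + 0.5
        centers.append((center_col, center_row))
    return sorted(centers)


# One cluster spanning columns 0..3 and rows 0..4, plus one isolated pixel
hits = [(0, 0), (1, 1), (2, 2), (3, 3), (3, 4), (10, 10)]
centers = cluster_centers(hits)
```

The first cluster has pixel centers from 0.5 to 3.5 in columns and 0.5 to 4.5 in rows, as in the figure, which gives a bounding-box center of (2.0, 2.5).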


Conclusions
The expertise and know-how for image processing is strong.
Parts of the hardware that belong to the FTK system are already available and can be adapted for generic image processing (AM chip, AM board, AM chip testbed, Pixel Clustering).
We are looking for suitable applications for an interdisciplinary approach.

Thank you…
