Wang Chen, Dr. Miriam Leeser, Dr. Carey Rappaport Goal Speedup 3D Finite-Difference Time-Domain.

Presentation transcript:

Wang Chen, Dr. Miriam Leeser, Dr. Carey Rappaport

Goal
Speed up the 3D Finite-Difference Time-Domain (FDTD) algorithm through the use of Field Programmable Gate Arrays (FPGAs). We previously implemented the 2D FDTD on the FIREBIRD™/PCI board. The new WILDSTAR™-II PRO/PCI board from Annapolis Micro Systems, Inc. is the target for our 3D FDTD hardware implementation.

[Poster panel titles: Reconfigurable Hardware; Performance Result; FDTD Hardware Design Structure; Current Work]

The Forward Model simulates the whole electromagnetic space and the wave propagation in the model space, including the Ground Penetrating Radar, dispersive soil, and a rough air-soil surface.

Based on data analysis, we quantize the double-precision floating-point data to fixed-point data for hardware implementation. We compare the relative error between the floating-point Fortran code and the fixed-point C code, and we choose a suitable bit-width by weighing the trade-off between accuracy and area.

Abstract
Understanding and predicting electromagnetic behavior is needed more and more in modern technology. The Finite-Difference Time-Domain (FDTD) method is a powerful computational electromagnetic technique for modelling electromagnetic space. However, the computation of this method is complex and time consuming. Implementing this algorithm in hardware will greatly increase its computational speed and widen its usage. We present the first fixed-point 3D FDTD FPGA accelerator, which supports a wide range of materials including dispersive media. By analyzing the performance of fixed-point arithmetic in both soil-based media and human tissue media, we choose the fixed-point representation that minimizes the relative error between fixed-point and floating-point results. The FPGA accelerator supports the UPML absorbing boundary conditions, which perform better in dispersive soil and human tissue media than PML boundary conditions. The 3D FDTD design is implemented on a WildStarII-Pro FPGA board and experimental results are provided.
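The fixed-point analysis described above (quantize the data, measure the relative error against the floating-point values, then pick a bit-width) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `quantize` and `max_relative_error`, the 32-bit word length, the candidate fractional widths, and the synthetic decaying-sinusoid samples are all hypothetical and are not taken from the poster or from the authors' Fortran and C codes.

```python
import math


def quantize(value, frac_bits, total_bits=32):
    """Round a float to signed fixed point with `frac_bits` fractional bits.

    The result is returned as a float so it can be compared directly with
    the original value. Out-of-range values saturate at the word limits.
    """
    scale = 1 << frac_bits
    max_int = (1 << (total_bits - 1)) - 1
    min_int = -(1 << (total_bits - 1))
    q = round(value * scale)
    q = max(min_int, min(max_int, q))
    return q / scale


def max_relative_error(samples, frac_bits, total_bits=32):
    """Worst-case relative error over all nonzero samples."""
    worst = 0.0
    for x in samples:
        if x == 0.0:
            continue  # relative error is undefined at zero
        err = abs(quantize(x, frac_bits, total_bits) - x) / abs(x)
        worst = max(worst, err)
    return worst


# Hypothetical stand-in for field data: a decaying sinusoid.
samples = [math.exp(-0.05 * n) * math.sin(0.3 * n) for n in range(1, 200)]

# Sweep candidate fractional widths (assumed values) and report the error.
errors = {bits: max_relative_error(samples, bits) for bits in (8, 12, 16, 20, 24)}
for bits, err in errors.items():
    print(f"{bits:2d} fractional bits: max relative error = {err:.3e}")
```

In a real study the sweep would run over data recorded from the floating-point simulation, and the chosen width would be the narrowest one whose error is acceptable, since wider words cost more FPGA area.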
Acceleration of the 3D FDTD Algorithm in Fixed-point Arithmetic using Reconfigurable Hardware

Our 3D FDTD implementation achieves a 16-times speedup compared to a 3.0 GHz PC, using fixed-point representation and supporting dispersive media and UPML boundary conditions. The speedup is due to pipelining, parallelism, use of fixed-point arithmetic, and careful memory architecture design.

• Fixed-point components are faster in hardware designs.
• The data range of the FDTD algorithm is well suited to fixed-point representation.

[Figure panels: Block Diagram Architecture; Electric Field Updating Pipeline; Memory Interface]

WILDSTAR™-II PRO/PCI
Features of the WILDSTAR™-II PRO/PCI boards:
• Two Xilinx® Virtex-II Pro™ XC2VP70 FPGAs (33,088 slices and 5,904 Kb of BlockRAM)
• 12 ports of DDR II SRAM totaling 48 MBytes
• 2 ports of DDR SDRAM totaling 256 MBytes
• 11 GBytes/sec memory bandwidth

FDTD Application Models
The 3D FDTD Buried Object Detection Forward Model and the Breast Cancer Detection Forward Model were developed by Panos Kosmas and Dr. Carey Rappaport of Northeastern University.
• Buried Object Detection Forward Model
• Breast Cancer Detection Forward Model: Accurate computational modelling of microwaves in human tissue with the FDTD method is very helpful for breast cancer detection research. This model uses the modified 3D FDTD algorithm and the modified UPML ABC for better performance in dispersive human tissue.
• Spiral Antenna Model: Uses the FDTD method to simulate the radiation of the Archimedean spiral antenna.

3D UPML FDTD Hardware Implementation
Schneider et al. implement the 1D FDTD in hardware, but the architecture is too simple. Durbano et al. implement the 3D FDTD in hardware, but their design uses floating-point representation, which sacrifices speed for precision.

State of the Art
[1] Ryan N. Schneider et al., "Application of FPGA Technology to Accelerate the Finite-Difference Time-Domain (FDTD) Method", Proceedings of FPGA 2002, pp.
[2] J. P. Durbano et al., "FPGA-Based Acceleration of the 3D Finite-Difference Time-Domain Method", Proceedings of FCCM 2004, pp.

Research Level 1 Thrust R3A
This work is a part of CenSSIS Research Thrust R3A. Forward modeling of large, complex scattering geometries is too slow for real-time applications or for the iterative solution of inverse problems. Our goal is to develop a hardware/software implementation of forward modeling that achieves real-time inversion.

[Figure: CenSSIS research framework diagram; labels: Fundamental Science (R1, R2, R3), Validating TestBEDs, L1, L2, L3, S1 to S5, Bio-Med, Enviro-Civil]

Publications Acknowledging NSF Support
[1] W. Chen, P. Kosmas, M. Leeser, C. Rappaport, "An FPGA Implementation of the Two-Dimensional Finite-Difference Time-Domain (FDTD) Algorithm", Proceedings of the 2004 ACM International Symposium on Field-Programmable Gate Arrays, February 2004, Monterey, CA, USA, pp.
[2] P. Kosmas, Y. Wang, C. Rappaport, "Three-Dimensional FDTD Model for GPR Detection of Objects Buried in Realistic Dispersive Soil", SPIE Aerosense Conference, Orlando, FL, April 2002, pp.

This work was supported in part by CenSSIS, the Center for Subsurface Sensing and Imaging Systems, under the Engineering Research Centers Program of the National Science Foundation (Award Number EEC ).

• Optimize 3D FDTD Implementation
• Two FPGA parallel computing on board
• 3D UPML FDTD accelerator for general FDTD problems
• More Generic Hardware Design
• Support More Complex Sources
• Better User Interface
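For readers unfamiliar with the field update that an FDTD accelerator pipelines, the leapfrog structure of the method can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' design: it is 1D rather than 3D, uses free space rather than dispersive media, uses fixed zero-field (PEC) ends rather than UPML absorbing boundaries, and runs in floating point rather than fixed point. The function name `run_fdtd_1d`, the grid size, the Courant number, and the Gaussian source are all assumptions chosen for the example.

```python
import math


def run_fdtd_1d(num_cells=201, num_steps=150, courant=0.5, source_cell=100):
    """Run a 1D free-space FDTD simulation in normalized units.

    Fields are scaled so that the free-space impedance is 1, which leaves
    the Courant number as the only update coefficient.
    """
    ez = [0.0] * num_cells        # electric field at integer grid points
    hy = [0.0] * (num_cells - 1)  # magnetic field at half grid points

    for step in range(num_steps):
        # Magnetic field update: uses the spatial difference of E.
        for i in range(num_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])

        # Electric field update: uses the spatial difference of H.
        # The two end cells are never updated, so they stay at zero (PEC).
        for i in range(1, num_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])

        # Soft Gaussian source added to one cell (assumed pulse shape).
        ez[source_cell] += math.exp(-(((step - 30) / 10.0) ** 2))

    return ez


field = run_fdtd_1d()
peak_cell = max(range(len(field)), key=lambda i: abs(field[i]))
print(f"largest |Ez| is {abs(field[peak_cell]):.3f} at cell {peak_cell}")
```

Each cell update needs only its nearest neighbours and a multiply-add, which is why the method lends itself to deep pipelines and parallel hardware. The 3D version applies the same pattern to six field components, and dispersive media and UPML add further terms per update.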