FPGA Implementations for Volterra DFEs

FPGA Implementations for Volterra DFEs Andreas Emeretlis George Theodoridis

PCI 2014

Outline
- Volterra Decision Feedback Equalizers
- Hardware Architecture
- Implementation Considerations
- Experimental Results and Comparisons
- Conclusions

Electronic Equalization in Optical Systems
- Limited capacity of optical fibers
- Channel impairments
  - Chromatic Dispersion (CD)
  - Polarization Mode Dispersion (PMD)
- Reduction of costly optical equalization
- Implementation of complex DSP algorithms
- Intersymbol Interference (ISI)

Decision Feedback Equalizer (DFE)
General form:
- Feed-Forward Filter (FFF): pre-cursor ISI
- Feedback Filter (FBF): post-cursor ISI
- Quantizer: symbol decision
- Adder
Implementation challenges:
- Pipelining the feedback loop
- Parallelism of the quantizer loop
- Non-linear filters
  - Increased complexity
  - Hardware resources
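
The general form of the DFE can be sketched in software as follows. This is a minimal floating-point model, not the authors' hardware design: it assumes binary symbols in {0, 1}, a decision threshold of 0.5, and a feedback term that is subtracted at the adder. The function name `dfe_equalize`, the tap values, and the toy channel are hypothetical.

```python
def dfe_equalize(samples, ff_taps, fb_taps, threshold=0.5):
    """Linear DFE: FFF on received samples, FBF on past decisions.

    Assumes symbols in {0, 1}; all names and values are illustrative.
    """
    decisions = []
    fb_state = [0] * len(fb_taps)          # past decisions, most recent first
    for n in range(len(samples)):
        # Feed-forward filter over the received samples.
        ff_out = sum(
            ff_taps[k] * samples[n - k]
            for k in range(len(ff_taps))
            if n - k >= 0
        )
        # Feedback filter over past decisions (post-cursor ISI).
        fb_out = sum(b * d for b, d in zip(fb_taps, fb_state))
        # Adder followed by the quantizer (symbol decision).
        y = ff_out - fb_out
        decision = 1 if y >= threshold else 0
        decisions.append(decision)
        fb_state = [decision] + fb_state[:-1]
    return decisions


# Toy channel with one post-cursor tap: r(n) = s(n) + 0.4 * s(n-1)
symbols = [1, 0, 1, 1, 0, 0, 1, 0]
received = [s + 0.4 * (symbols[i - 1] if i > 0 else 0)
            for i, s in enumerate(symbols)]
recovered = dfe_equalize(received, ff_taps=[1.0], fb_taps=[0.4])
```

Each decision depends on the previous one, which is why the feedback loop is the hard part to pipeline or parallelize in hardware.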

Linear vs Non-linear DFEs
[Figure: block diagrams; panels: Linear DFEs, Non-linear DFEs]

Volterra Decision Feedback Equalizer
- Direct detection → non-linear distortion
  - 2nd-order Volterra filters (VDFE)
- Sensitivity to sampling phase
  - Fractional spacing → processing of 2 samples/symbol
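
As an illustration of what a 2nd-order Volterra filter computes, the sketch below evaluates one output from a tap-delay vector: a linear FIR term plus a weighted sum of pairwise products. Only the upper triangle of the second-order kernel is used, since x[i]*x[j] equals x[j]*x[i]. The function name `volterra2` and all coefficient values are hypothetical; the slides do not give the filter lengths or coefficients.

```python
def volterra2(x, w1, w2):
    """Second-order Volterra filter output for one tap-delay vector x.

    w1[i]    : linear coefficients
    w2[i][j] : second-order coefficients, used for j >= i only
    """
    n = len(x)
    linear = sum(w1[i] * x[i] for i in range(n))
    quadratic = sum(
        w2[i][j] * x[i] * x[j]
        for i in range(n)
        for j in range(i, n)       # upper triangle: x[i]*x[j] == x[j]*x[i]
    )
    return linear + quadratic


# Illustrative values only.
x = [1.0, 0.5, -0.25]
w1 = [0.8, 0.1, 0.05]
w2 = [[0.02, 0.01, 0.00],
      [0.00, 0.03, 0.01],
      [0.00, 0.00, 0.04]]
y = volterra2(x, w1, w2)
```

For N taps the second-order part needs N(N+1)/2 products, which is where the increased hardware complexity of non-linear filters comes from.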

Feed-forward Transformations
- Parallelism
  - Unrolling the filter equation
- Pipelining
  - Registers between filter elements
  - Synchronization registers
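
Unrolling the filter equation can be illustrated with a plain FIR filter: instead of one output per step, each step produces p outputs from overlapping input windows. The sketch below is a behavioural model only and does not model pipeline registers; the names `fir_serial` and `fir_block_parallel` and the parameter `p` are hypothetical.

```python
def fir_serial(x, h):
    """Reference FIR: one output per step."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_block_parallel(x, h, p):
    """Unrolled FIR: each 'clock cycle' produces p outputs at once.

    In hardware the p lane computations would be p independent
    multiply-add structures working on overlapping input windows.
    """
    y = []
    for base in range(0, len(x), p):          # one iteration = one clock cycle
        block = []
        for lane in range(p):                 # p parallel lanes
            n = base + lane
            if n >= len(x):
                break
            block.append(
                sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            )
        y.extend(block)
    return y


x = [1, 2, 3, 4, 5, 6, 7]
h = [1, -1, 2]
assert fir_block_parallel(x, h, p=3) == fir_serial(x, h)
```

Because the feed-forward part has no dependence on its own outputs, the lanes are independent and can be pipelined freely.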

Feedback Transformations
- Loop precomputation
  - Computational units in the FF part
  - Multiplexer loop
- Loop pipelining
  - Lookahead
  - Loop unrolling

FB input                   FB output
Î(n-1)  Î(n-2)  Î(n-3)     yB(n)
0       0       1          b33 = bp1
0       1       0          b22 = bp2
0       1       1          b22 + b23 + b33 = bp3
1       0       0          b11 = bp4
1       0       1          b11 + b13 + b33 = bp5
1       1       0          b11 + b12 + b22 = bp6
1       1       1          b11 + b12 + b13 + b22 + b23 + b33 = bp7

J0(n) = Î(n-1) Î(n-1)
J1(n) = Î(n-1) Î(n-2)
J2(n) = Î(n-1) Î(n-3)
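
Loop precomputation can be sketched as follows: every possible feedback value is computed ahead of time, so the loop itself reduces to a multiplexer selected by the past decisions. The sketch assumes binary decisions in {0, 1} and a three-decision feedback memory; the coefficient values and the names `precompute_feedback` and `feedback_mux` are hypothetical.

```python
from itertools import product


def precompute_feedback(b):
    """Build the table of all possible feedback values.

    b maps index pairs (i, j), i <= j, to second-order coefficients b_ij.
    Decisions are assumed binary (0 or 1), so each product
    I(n-i) * I(n-j) is 1 only when both decisions are 1.
    """
    table = {}
    for past in product((0, 1), repeat=3):     # (I(n-1), I(n-2), I(n-3))
        table[past] = sum(
            coeff * past[i - 1] * past[j - 1]
            for (i, j), coeff in b.items()
        )
    return table


# Hypothetical coefficient values, for illustration only.
b = {(1, 1): 0.30, (1, 2): 0.05, (1, 3): 0.02,
     (2, 2): 0.20, (2, 3): 0.04, (3, 3): 0.10}
bp = precompute_feedback(b)


def feedback_mux(past_decisions):
    """Multiplexer loop: past decisions only select a precomputed value."""
    return bp[tuple(past_decisions)]
```

The arithmetic moves out of the loop into the feed-forward part, where it can be pipelined; only the selection remains on the critical path.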

Feedback Architectures: Area Reduction
- Straightforward Approach
- Incremental Processing Approach
[Figure: pipeline diagrams of the two approaches; labels: N, L, L-N, and L-1, L-2, L-3 stages]

Employed FPGA Platform (1/2)
Configurable logic architecture:
- Configurable Logic Blocks (CLBs)
- CLBs are interconnected via a switch matrix
- CLB → 2 slices
- Slice → 4 Look-Up Tables, carry computation chain, 8 flip-flops
Drawbacks:
- Predefined geometry
- High routing delay
- No 100% occupation of each slice

Employed FPGA Platform (2/2)
Hardcore DSP logic architecture:
- On-chip hardwired modules
- Low area occupation
- High-speed implementation of DSP algorithms
- Dedicated high-speed interconnection resources
DSP48E1 slice:
- 25 × 18-bit multiplier
- 48-bit accumulator
- Bypass multiplexers
- SIMD adder
- Internal pipeline registers
- Cascading I/O ports

Implementation Considerations: Wordlength

Signal            Total bits   Fractional bits
Input             7            6
Volterra inputs   9            8
Coefficients      13           12
Datapath          14           13
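
To make the wordlengths concrete, the sketch below quantizes a real value to each of the four formats. The slides give only total and fractional bit counts, so signed two's-complement representation, round-to-nearest (Python's `round`, which breaks ties to even), and saturation are assumptions; `to_fixed`, `to_real`, and `FORMATS` are hypothetical names.

```python
def to_fixed(value, total_bits, frac_bits):
    """Quantize a real value to a signed fixed-point integer code.

    Assumes two's complement, round-to-nearest and saturation; the
    slides give only the total and fractional wordlengths.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    code = int(round(value * scale))
    return max(lo, min(hi, code))


def to_real(code, frac_bits):
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


FORMATS = {                      # name: (total bits, fractional bits)
    "input": (7, 6),
    "volterra_input": (9, 8),
    "coefficient": (13, 12),
    "datapath": (14, 13),
}

x = 0.3
for name, (total, frac) in FORMATS.items():
    code = to_fixed(x, total, frac)
    print(f"{name:15s} code={code:6d} value={to_real(code, frac):.6f}")
```

Under these assumptions every format has one bit to the left of the binary point, so representable values lie in [-1, 1).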

Implementation Considerations
- FIR filters
  - Pipelined Mul-Add modules
  - Adder cascades
- Volterra kernel
  - Pipelined standalone adders
  - Fabric interconnection of DSP slices
- Pre-computation stage
  - Manual SIMD mode (3 × 14 bits)
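
One way to picture a manual 3 × 14-bit SIMD mode is to pack three operands into a single 48-bit word with guard bits between them, so that one wide addition performs three independent narrow additions. The slides do not describe how the mode is built; the 16-bit lane layout and the restriction to unsigned operands below are assumptions made for illustration, and signed operands would need extra sign handling.

```python
LANE_BITS = 16          # assumed lane width: 14 data bits + 2 guard bits
DATA_BITS = 14
LANES = 3               # 3 x 16 = 48 bits, the DSP48E1 adder width


def pack(values):
    """Pack three unsigned 14-bit values into one 48-bit word."""
    word = 0
    for lane, v in enumerate(values):
        assert 0 <= v < (1 << DATA_BITS)
        word |= v << (lane * LANE_BITS)
    return word


def unpack(word):
    """Split a 48-bit word back into its three 16-bit lanes."""
    mask = (1 << LANE_BITS) - 1
    return [(word >> (lane * LANE_BITS)) & mask for lane in range(LANES)]


a = [16383, 100, 5000]
b = [16383, 200, 7000]
# One wide addition performs three independent lane additions, because
# each 15-bit lane sum fits inside its 16-bit lane and never carries out.
packed_sum = (pack(a) + pack(b)) & ((1 << 48) - 1)
lane_sums = unpack(packed_sum)
```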

Experimental Results

Straightforward Approach
Device   Speed [Gb/s]   Parallel/Pipeline Level   Freq. [MHz]   Slices   DSPs
V6       5              12/37                     417           3,781    744
V6       7              18/43                     405           5,812    1,116
V6       10             25/50                     400           9,941    1,550
V7       5              11/27                     463           3,021    682
V7       7              16/30                     443           4,962    992
V7       10             24/31                     428           8,047    1,488

Incremental Processing Approach
Device   Speed [Gb/s]   Parallel/Pipeline Level   Freq. [MHz]   Slices   DSPs
V6       5              12/35                     419           3,537    744
V6       7              17/49                     418           4,646    1,054
V6       10             24/70                     417           6,324    1,488
V7       5              10/42                     503           3,068    620
V7       7              15/38                     467           4,543    930
V7       10             24/41                     430           6,911    (missing)
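
The reported figures can be cross-checked with simple arithmetic: if each parallel lane produces one bit per clock cycle, throughput is roughly the parallel level times the clock frequency. The sketch below applies this to the straightforward-approach results. The one-bit-per-symbol relation is an assumption, as is the reading that the V7 rows share the 5, 7, and 10 Gb/s targets of the V6 rows; `throughput_gbps` is a hypothetical helper.

```python
# (device, target Gb/s, parallel level, frequency in MHz), taken from the
# straightforward-approach results; V7 targets are assumed to match V6.
ROWS = [
    ("V6", 5, 12, 417), ("V6", 7, 18, 405), ("V6", 10, 25, 400),
    ("V7", 5, 11, 463), ("V7", 7, 16, 443), ("V7", 10, 24, 428),
]


def throughput_gbps(parallel, freq_mhz):
    """Symbols per cycle times cycles per second, at 1 bit per symbol."""
    return parallel * freq_mhz / 1000.0


for device, target, parallel, freq in ROWS:
    rate = throughput_gbps(parallel, freq)
    print(f"{device}: {parallel} x {freq} MHz = {rate:.2f} Gb/s (target {target})")
```

Under these assumptions each configuration meets its target, and the higher clock frequencies on V7 allow the same target to be reached with fewer parallel lanes.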

Experimental Results: Performance Comparison
[Figure: performance comparison; panels: Straightforward Approach, Incremental Processing Approach]

Experimental Results: DSP Utilization Comparison
[Figure: DSP utilization comparison; panels: Straightforward Approach, Incremental Processing Approach]

Conclusions
- Performance is not predictable
- Important progress of reconfigurable technology
- Efficiency of hardwired modules
- FPGA: suitable platform for high-speed communications
  - 10 Gb/s with ~50% of DSPs
  - 17 Gb/s with ~100% of DSPs

Thank you for your attention. Questions?