1
An FPGA Based Adaptive Viterbi Decoder
Sriram Swaminathan, Russell Tessier
Department of ECE, University of Massachusetts Amherst
2
6/26/2015
Overview
  Introduction
  Objectives
  Background
  Adaptive Viterbi Algorithm
  Architecture and Implementation Issues
  Results
  Related Work
  Summary and Future Work
3
Introduction
  A Digital Data Communication System
  [Block diagram: Source, Source Encoder, Channel Encoder (convolutional encoder), Modulator, channel with Noise, Demodulator, Channel Decoder (Viterbi), Source Decoder, Sink. Signal labels: information, bitstream, bitstream with redundancy.]
4
Goals
  Implement the Adaptive Viterbi Algorithm in hardware
  Constraints
    Data rate (or throughput): 20 Kbps
    Probability of error, or Bit Error Rate (BER) < 10^-5
      BER = number of errors / length of sequence
  Minimize: design time, area
5
Convolutional Encoder
  Accepts information bits as a continuous stream
  Operates on the current b-bit input (b ranges from 1 to 6) and some number of immediately preceding b-bit inputs to produce V output bits, V > b
  [Figure: encoder built from flip-flops (FF) and two adders, shown for b = 1, V = 2]
6
Definitions
  Constraint length (K): number of successive b-bit groups of information bits used in each encoding operation
  Code rate (or rate): b/V
  Typical values: K = 7; rate = 1/2, 1/3
7
The Viterbi Algorithm
  Finds the bit sequence, among all possible transmitted bit sequences, that most closely resembles the received data
  Maximum likelihood algorithm
  Each bit received by the decoder is associated with a measure of correctness
  Practical for short constraint length convolutional codes
8
State Diagram
  [Figure: state diagram with states 00, 10, 11, 01 and branches labeled 0/00, 1/11, 1/01, 1/10, 0/01, 0/11, 1/00, 0/10]
  State: encoder memory
  Branch: labeled k/ij, where i and j are the output bits associated with input bit k
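The eight branch labels in the state diagram are consistent with a K = 3, rate-1/2 encoder whose two outputs are u XOR s1 XOR s2 and u XOR s2 (generators 7 and 5 in octal). That generator choice is an inference from the labels, not something the slides state, and the function name below is hypothetical. A minimal Python sketch under that assumption:

```python
from typing import List, Tuple


def conv_encode(bits: List[int]) -> List[Tuple[int, int]]:
    """Rate-1/2, K = 3 convolutional encoder.

    Assumption: generators 7 and 5 (octal), inferred from the
    branch labels of the state diagram.
    """
    s1 = s2 = 0  # two flip-flops: s1 = previous input bit, s2 = the one before
    out = []
    for u in bits:
        v1 = u ^ s1 ^ s2  # generator 111
        v2 = u ^ s2       # generator 101
        out.append((v1, v2))
        s1, s2 = u, s1    # shift the register
    return out


print(conv_encode([0, 1, 0]))  # [(0, 0), (1, 1), (1, 0)]
```

The input 0 1 0 reproduces the encoder output 00 11 10 used in the trellis example.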
9
Trellis Diagram
  K = 3, rate 1/2; total number of states = 2^(K-1)
  ENC IN:   0  1  0
  ENC OUT:  00 11 10
  RECEIVED: 00 11 11
  Accumulated metric at T=3 (the smaller of the two incoming sums is kept):
    2+2, 3+0 : 3
    0+1, 3+1 : 1
    2+0, 3+1 : 2
    0+1, 3+1 : 1
  [Figure: trellis with states 00, 01, 10, 11 over time steps T=0 to T=3, branches labeled with encoder outputs]
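The add-compare-select step behind the trellis can be sketched in Python as follows. This is a minimal hard-decision decoder, not the authors' hardware design. It assumes the K = 3, rate-1/2 code with generators 7 and 5 (octal), which matches the example sequence, and a Hamming-distance branch metric. The function names are hypothetical.

```python
from typing import List, Tuple

INF = float("inf")


def branch_output(state: int, u: int) -> Tuple[int, int, int]:
    """Return (next_state, v1, v2); state is the 2-bit value s1 s2 (assumed 7,5 code)."""
    s1, s2 = (state >> 1) & 1, state & 1
    return (u << 1) | s1, u ^ s1 ^ s2, u ^ s2


def viterbi_decode(received: List[Tuple[int, int]]) -> Tuple[List[int], list]:
    """Keep one survivor per state; return (decoded bits, final path metrics)."""
    metrics = [0, INF, INF, INF]  # the encoder starts in state 00
    paths: List[List[int]] = [[], [], [], []]
    for r1, r2 in received:
        new_metrics = [INF] * 4
        new_paths: List[List[int]] = [[] for _ in range(4)]
        for state in range(4):
            if metrics[state] == INF:
                continue  # state not reachable yet
            for u in (0, 1):
                nxt, v1, v2 = branch_output(state, u)
                # Add: Hamming distance between branch output and received pair
                cost = metrics[state] + (v1 != r1) + (v2 != r2)
                # Compare-select: keep the cheaper path into each state
                if cost < new_metrics[nxt]:
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [u]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])
    return paths[best], metrics


print(viterbi_decode([(0, 0), (1, 1), (1, 1)]))
# ([0, 1, 0], [3, 1, 2, 1])
```

For the received sequence 00 11 11, the final metrics for states 00, 01, 10, 11 are 3, 1, 2, 1, the same values as the accumulated metrics in the trellis. States 01 and 11 tie at metric 1. The sketch breaks the tie by taking the lower-numbered state, which happens to recover the transmitted bits 0 1 0.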
10
Adaptive Viterbi Algorithm
  Motivation
    The Viterbi algorithm requires extremely large memory and logic
    Retaining fewer paths reduces memory and computation
  Definitions
    Path: bit sequence
    Path metric (or cost): accumulated error metric of a path
    Survivor: path that is retained for the subsequent time step
11
Adaptive Viterbi Algorithm
  Criteria for path survival
    1. A threshold T is introduced such that a path is retained if and only if its current path metric is less than d_m + T, where d_m is the minimum cost among all survivors of the previous time step.
    2. The total number of survivors per time step is limited to a user-selected critical number, N_max. Only the best N_max paths are retained at any time.
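The two survival rules can be illustrated with a short Python sketch. The function name and the dictionary representation of candidate paths are assumptions made for illustration. A software sort stands in for "best N_max paths", and the slides do not say this is how the hardware selects them.

```python
from typing import Dict, List, Tuple


def select_survivors(
    candidates: Dict[int, int], d_m: int, T: int, n_max: int
) -> List[Tuple[int, int]]:
    """Apply both survival rules to {state: path metric} candidates."""
    # Rule 1: retain a path only if its metric is below d_m + T.
    kept = [(state, d) for state, d in candidates.items() if d < d_m + T]
    # Rule 2: retain at most n_max paths, preferring the lowest metrics.
    kept.sort(key=lambda item: item[1])
    return kept[:n_max]


# Path metrics per state after the third trellis step of the earlier example;
# the minimum survivor metric of the previous step was 0.
print(select_survivors({0: 3, 1: 1, 2: 2, 3: 1}, d_m=0, T=3, n_max=2))
# [(1, 1), (3, 1)]
```

With T = 3, the path with metric 3 fails rule 1. Of the three paths left, rule 2 keeps the two with the lowest metric.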
12
Trellis Diagram for AVA
13
Parameters in the Algorithm
  Constraint length, K
  Truncation length, T_L
  Rate, R
  Threshold, T
  Maximum number of paths per time step, N_max
14
Influence of Threshold T and N_max
  Threshold T
    Smaller T: low average number of survivors, increased BER
    Larger T: high average number of survivors, reduced BER
  N_max
    Smaller N_max: possibility of discarding the best path (high BER), smaller area
    Larger N_max: reduced BER, larger area
  Selection of N_max and T is crucial
15
Variation of BER with T and N_max for K = 9 and 14
  [Plot, K = 9: SNR = 3.1 dB, T_L = 45; marked point T = 18, N_max = 9]
  [Plot, K = 14: SNR = 2.5 dB, T_L = 70; marked point T = 24, N_max = 41]
16
Optimal values of N_max, T and T_L for different values of K

  K    T_L  N_max  T
  4    20   4      14
  5    25   7      14
  6    30   8      18
  7    35   8      17
  8    40   8      17
  9    45   9      18
  10   50   21     20
  11   55   25     23
  12   60   25     23
  14   70   41     24
17
Simplified View of Adaptive Viterbi Decoder
  [Block diagram. Blocks: Branch metric generator, Add Compare Select, Survivor Memory, Logic for d_i < d_m + T. Signals: Symbols from channel, Branch metrics, Decision bits, Decoded output.]
18
Survivor Memory
  Stores all possible bit sequences (paths) before making a decision
  Size of memory for Viterbi
    Rows: N_max
    Columns: truncation length, (3 to 5) * K
  Two schemes
    Traceback: large latency, small area, low power
    Register exchange: fast, large area, high power
  [Figure: memory array with N_max rows and truncation-length columns]
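A rough sense of the memory saving follows from the row count. The Python sketch below compares N_max rows against one row per trellis state, using the K = 9 entry of the optimal-values table (T_L = 45, N_max = 9). It assumes one stored bit per row per time step (b = 1) and ignores any bookkeeping the adaptive decoder needs to record which state each row belongs to. Both are simplifying assumptions, not figures from the slides.

```python
def survivor_memory_bits(rows: int, truncation_length: int) -> int:
    """Bits needed for a rows x truncation_length array, one bit per entry (assumed)."""
    return rows * truncation_length


K, T_L, N_MAX = 9, 45, 9  # K = 9 row of the optimal-values table

full = survivor_memory_bits(2 ** (K - 1), T_L)  # one row per trellis state
adaptive = survivor_memory_bits(N_MAX, T_L)     # one row per retained survivor

print(full, adaptive)  # 11520 405
```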
19
Practical Considerations
  Serial implementation
    The same ACS unit is used repeatedly for all states
    Small area, inexpensive
    Slow, low throughput (data rate)
  Parallel implementation
    Each state has its own ACS unit (2^(K-1) ACS units)
    Fast, high throughput (data rate)
    Large area, a bottleneck for large values of K
20
Architecture
21
Architecture (contd.)
  [Flowchart: add branch metrics b1 and b2 to form sum1 and sum2; test d_i < d_m + T; count the paths that pass; if count < N_max (yes), update memory; otherwise (no), set T = T - 2 and repeat]
  Elimination of sorting
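The flowchart replaces sorting with repeated threshold tests. A minimal Python sketch of that loop is given below, reading the flowchart as "lower T by 2 until the number of passing paths is below N_max". The function name, the list representation, and the guard that stops once T reaches zero are assumptions added for illustration.

```python
from typing import List, Tuple


def survivors_without_sorting(
    metrics: List[int], d_m: int, T: int, n_max: int, step: int = 2
) -> Tuple[List[int], int]:
    """Return (indices of retained paths, final threshold) without sorting."""
    while True:
        # Threshold test d_i < d_m + T applied to every candidate path
        passing = [i for i, d in enumerate(metrics) if d < d_m + T]
        # Flowchart condition: count < N_max means the survivors fit in memory.
        # The T <= 0 check is an added safety guard, not part of the slide.
        if len(passing) < n_max or T <= 0:
            return passing, T
        T -= step  # too many paths passed: tighten the threshold and retest


print(survivors_without_sorting([3, 1, 2, 1], d_m=0, T=4, n_max=3))
# ([1, 3], 2)
```

With T = 4 all four paths pass, which is not fewer than N_max = 3. After one reduction to T = 2, only the two paths with metric 1 remain.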
22
System Model Test-bench
23
FPGA Implementation
  An FPGA can exploit the parallelism
  Dynamic reconfiguration for performance enhancement
  Implementation platform
    WildOne-XL FPGA board from Annapolis Microsystems Inc.
    2 XC4036 FPGAs, one for the user application
  Simulation on Virtex XCV1000
24
Hardware Implementation
  Design flow
    RTL description in VHDL
    HDL simulation: Cadence Affirma tools
    Synthesis: Synplicity Synplify Pro
    FPGA mapping, place and route: Xilinx Foundation 2.1i
    FPGA: XC4036XL-08
25
XC4036XL FPGA Resource Utilization

  K    CLBs   LUTs (4 i/p)  LUTs (3 i/p)  FFs
  4    553    978           196           278
  5    1194   2046          340           540
  6    1206   2081          482           724
  7    1215   2087          537           756
  8    1284   2119          654           788
  9    1296   2213          615           820
26
Decoding Rate on XC4036 FPGA
  Overheads
    32-bit, 33 MHz PCI bus
    Execution of WildOne API using VC++
    Slowdown of 1.5 to 2 times
  FPGA freq. (MHz): 40.455, 20.089, 19.857, 19.674, 17.576, 17.316
27
Issues in Reconfiguration
  Reconfigurable units
    Number of ACS units (depends on number of survivors)
    Run-time survivor memory
  Reconfiguration types
    Fine-grained: infeasible
    Coarse-grained: feasible
  Motivation
    Performance improvement
  Tradeoff
    Small SNR (noisy channel): large K, slow decoding
    Large SNR (less noisy channel): small K, fast decoding
    Maintain approximately the same BER
28
Coarse-timescale Reconfiguration
  20.9% performance improvement over static
  [Figure label: less noisy channel]
29
Coarse-timescale Reconfiguration: Experimental Approach
  Vary channel noise during transmission
  Noise changes every ~250,000 bits, or every ~1.5 to 2.5 seconds
  If a noise change is detected
    Download a new decoder configuration to the FPGA on the WildOne board
  Reconfiguration overhead ~40 ms
    PCI bus transfer + noise change detection + bitstream download
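To put the reconfiguration overhead in proportion, the sketch below divides the ~40 ms overhead by the ~1.5 to 2.5 second interval between noise changes. It assumes exactly one reconfiguration per noise change and treats the approximate slide figures as exact. The function name is hypothetical.

```python
def overhead_fraction(overhead_s: float, interval_s: float) -> float:
    """Fraction of each noise-change interval spent reconfiguring (one reconfiguration assumed)."""
    return overhead_s / interval_s


OVERHEAD_S = 0.040  # ~40 ms reconfiguration overhead

for interval_s in (1.5, 2.5):
    share = overhead_fraction(OVERHEAD_S, interval_s)
    print(f"{interval_s} s interval: {share:.1%} spent reconfiguring")
```

Under these assumptions the overhead is roughly 1.6% to 2.7% of each interval.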
30
Comparison with Microprocessor
  Intel Celeron, 366 MHz, 128 MB RAM
  Speed-up: up to 7.5X for XC4036 (incl. overheads)
31
Conclusions and Future Work
  A new, dynamically reconfigurable adaptive Viterbi decoder
  ~21% improvement over static
  Scales linearly
  Speed-up of up to 7.5X over a microprocessor
  Future research
    Extend the present concept to power-aware dynamic reconfiguration