Published by Garry Stevens. Modified over 9 years ago.
1 Reconfigurable acceleration of robust frequency-domain echo cancellation
C. H. Ho (1), K. F. C. Yiu (2), J. Huo (3), S. Nordholm (3) and W. Luk (1)
(1) Department of Computing, Imperial College London
(2) Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong
(3) Western Australian Telecommunications Research Institute, The University of Western Australia
2 Introduction
- Echo affects many communication systems
  - hands-free telephony
  - VoIP
- Adaptive filters are employed to cancel echo
  - computationally intensive
  - involves a large amount of data
3 Achievements
1. Novel reconfigurable architecture for two-path frequency-domain echo cancellation
2. Bit-width optimisations with fixed-point saturation arithmetic
3. Single core: 12.5 times faster than a 3.2 GHz Pentium 4 machine
4 Background: transmitting signals
Given an input signal x(n), speech v(n) and impulse response h(n), the return signal y(n) is
$y(n) = (x \ast h)(n) + v(n)$
where $\ast$ denotes convolution.
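The signal model on this slide can be illustrated with a minimal sketch. The signal values and the two-tap impulse response below are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the presentation.

```python
def convolve(x, h):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def return_signal(x, h, v):
    """y(n) = (x * h)(n) + v(n), truncated to len(x) samples."""
    echo = convolve(x, h)[:len(x)]
    return [e + vn for e, vn in zip(echo, v)]


x = [1.0, 0.0, 0.0, 0.0]  # input signal: a unit impulse (hypothetical)
h = [0.5, 0.25]           # impulse response of the echo path (hypothetical)
v = [0.0, 0.0, 1.0, 0.0]  # speech added to the echo (hypothetical)
y = return_signal(x, h, v)
print(y)
```

With a unit impulse as input, the echo part of y(n) is simply h(n), which makes the contribution of each term easy to see.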
5 Echo filtering
Search for filter coefficients that eliminate the echo of the input signal from the return signal.
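The transcript does not preserve the update rule used for this search. As a stand-in, the sketch below uses a plain time-domain normalised LMS (NLMS) filter, which is a common textbook choice and not necessarily the authors' method. The function name, step size `mu`, tap count and test signal are all assumptions made for illustration.

```python
import random


def nlms_echo_canceller(x, y, taps=4, mu=0.5, eps=1e-8):
    """Adapt FIR coefficients w so that w applied to x tracks the echo in y."""
    w = [0.0] * taps
    buf = [0.0] * taps  # most recent input samples, newest first
    errors = []
    for xn, yn in zip(x, y):
        buf = [xn] + buf[:-1]
        y_hat = sum(wi * bi for wi, bi in zip(w, buf))  # echo estimate
        e = yn - y_hat                                   # residual after cancellation
        norm = sum(b * b for b in buf) + eps             # input power normalisation
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
h = [0.5, -0.25]  # hypothetical echo path
y = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
w, errors = nlms_echo_canceller(x, y)
print([round(wi, 4) for wi in w])
```

With no near-end speech in y, the estimated coefficients approach the echo path and the residual error shrinks towards zero.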
6 Delayless sub-band filtering
- downsample the input signals and error signals
- compute the coefficients of each sub-band
- apply them through a weighted transform
[Block diagram. Labels: x(n), A(z), D, h_0(n), h_1(n), ..., h_{M-1}(n), Weighted Transform, y(n), e(n)]
7 Robust two-path adaptive filtering
- Two sets of filter coefficients
  - foreground coefficients
  - background coefficients
- Power level selects the coefficients
  - high power: foreground coefficients
  - low power: background coefficients
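The selection rule on this slide can be sketched as follows. The slide does not say which signal's power is measured or how the threshold is set, so the frame-based mean-square power, the `threshold` parameter and the function names here are assumptions for illustration only.

```python
def signal_power(frame):
    """Mean-square power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def select_coefficients(frame, foreground, background, threshold):
    """Pick the foreground set at high power, the background set at low power."""
    if signal_power(frame) >= threshold:
        return foreground
    return background


foreground = [0.5, -0.25]  # hypothetical coefficient sets
background = [0.4, -0.2]
loud = [0.9, -0.8, 0.7, -0.9]
quiet = [0.01, -0.02, 0.01, 0.0]

chosen_loud = select_coefficients(loud, foreground, background, threshold=0.1)
chosen_quiet = select_coefficients(quiet, foreground, background, threshold=0.1)
print(chosen_loud, chosen_quiet)
```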
8 Hardware architecture
- Supports the core operations: signal transform and filtering
- Fast Fourier transform
  - transforms the input and error signals to the frequency domain with multiple sub-bands
- Complex number multiplication
  - performs the adaptive filtering
- Inverse Fourier transform
  - transforms the filtered signal to the time domain
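The three core operations (forward transform, complex multiplication, inverse transform) can be sketched in software. This is a minimal illustration, not the hardware datapath: a direct O(n^2) DFT stands in for the FFT core, the sub-band structure is omitted, and the function names and test vectors are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Direct discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def filter_in_frequency_domain(x, h):
    """Filter x with h via transform, bin-wise complex multiply, inverse transform."""
    n = len(x) + len(h) - 1  # zero-pad so circular convolution equals linear
    X = dft(list(x) + [0.0] * (n - len(x)))
    H = dft(list(h) + [0.0] * (n - len(h)))
    Y = [Xk * Hk for Xk, Hk in zip(X, H)]  # complex multiplication per bin
    return [value.real for value in dft(Y, inverse=True)]


y = filter_in_frequency_domain([1.0, 2.0, 3.0], [0.5, 0.25])
print([round(v, 6) for v in y])
```

The result matches time-domain convolution of the same sequences, which is the property the frequency-domain approach relies on.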
9 Datapath
- Hf, Hb: coefficients
- buf: interface between core and hosts
- X: signals in frequency domain
10 Design optimisation
- Optimise number representation
  - explore quantisation error for different bitwidths
- Avoid overflow
  - fixed-point number format + saturation arithmetic
- Compare results
  - against double-precision floating-point arithmetic
- Use of pre-placed FFT core
  - increases the throughput of the FFT
- Multiple instances of the adaptive filter in an FPGA
  - increase the throughput of the overall system
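Fixed-point quantisation with saturation can be sketched as below. The slide does not define the number format in detail, so this assumes a signed two's-complement value in which `int_bits` includes the sign bit and `frac_bits` counts the bits after the binary point; the function name is hypothetical.

```python
def to_fixed(value, int_bits, frac_bits):
    """Quantise value to signed fixed point, saturating instead of wrapping.

    Returns the real number that the stored integer represents.
    """
    scale = 1 << frac_bits
    total_bits = int_bits + frac_bits
    max_raw = (1 << (total_bits - 1)) - 1   # largest representable integer
    min_raw = -(1 << (total_bits - 1))      # smallest representable integer
    raw = round(value * scale)
    raw = max(min_raw, min(max_raw, raw))   # saturation on overflow
    return raw / scale


print(to_fixed(0.3, 10, 10))      # in range: rounded to the nearest step
print(to_fixed(1000.0, 10, 10))   # too large: clamps to the positive limit
print(to_fixed(-1000.0, 10, 10))  # too small: clamps to the negative limit
```

Saturation keeps an overflowing value at the nearest representable limit, whereas wrap-around arithmetic would flip its sign.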
11 Bitwidth optimisation
- Perform filtering without introducing any near-end signal
  - expected result: all of the echo signal is filtered out
- Choose the smallest bitwidth that can perform the filtering effectively
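The search for the smallest adequate bitwidth can be sketched as a loop over candidate formats. The slides judge a format by how well the echo is filtered; this sketch substitutes a simpler criterion, the worst-case quantisation error against double-precision reference values, and the tolerance, sample values and function names are assumptions.

```python
def quantise(value, int_bits, frac_bits):
    """Signed fixed-point quantisation with saturation (int_bits includes sign)."""
    scale = 1 << frac_bits
    limit = 1 << (int_bits + frac_bits - 1)
    raw = max(-limit, min(limit - 1, round(value * scale)))
    return raw / scale


def smallest_frac_bits(samples, int_bits, candidates, tolerance):
    """Return the smallest candidate whose worst-case error meets the tolerance."""
    for frac_bits in sorted(candidates):
        worst = max(abs(quantise(s, int_bits, frac_bits) - s) for s in samples)
        if worst <= tolerance:
            return frac_bits
    return None  # no candidate is accurate enough


reference = [0.3, -0.7, 0.123456, 0.9]  # hypothetical double-precision values
best = smallest_frac_bits(reference, int_bits=10, candidates=[10, 14, 18],
                          tolerance=1e-4)
print(best)
```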
12 Results: fixed point optimisation
[Figure panels: i = 10, f = 10; i = 10, f = 14; i = 10, f = 18; i = 10, f = 54]
13 Filter performance
[Figure panels: near-end signals; mixed signals; filtered signals using double-precision arithmetic; filtered signals using optimised fixed-point arithmetic]
14 FPGA filter implementations
FPGA chip                          XC4VSX55      XC3S5000
Slices                             4372 (17%)    5255 (15%)
DSP48/MULT                         52 (10%)      48 (46%)
Block RAM                          24 (7%)       24 (23%)
Frequency                          180.0 MHz     98.8 MHz
Throughput (samples per second)    12.5M         6.87M
Faster than Pentium 4 at 3.2 GHz   13.2 times    7.2 times
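As a quick consistency check on the table, the sketch below derives two quantities that the slides do not report directly: clock cycles per output sample, and the software throughput implied by each speed-up figure. Both are simple ratios of the tabulated values, so they inherit the rounding of the table.

```python
# Figures copied from the table on this slide.
chips = {
    "XC4VSX55": {"clock_hz": 180.0e6, "samples_per_s": 12.5e6, "speedup": 13.2},
    "XC3S5000": {"clock_hz": 98.8e6, "samples_per_s": 6.87e6, "speedup": 7.2},
}


def cycles_per_sample(chip):
    """Clock cycles spent per processed sample."""
    return chip["clock_hz"] / chip["samples_per_s"]


def implied_software_rate(chip):
    """Pentium 4 throughput implied by the reported speed-up (derived value)."""
    return chip["samples_per_s"] / chip["speedup"]


summary = {name: (cycles_per_sample(c), implied_software_rate(c))
           for name, c in chips.items()}
for name, (cycles, rate) in summary.items():
    print(f"{name}: {cycles:.1f} cycles/sample, "
          f"implied software rate {rate / 1e6:.2f}M samples/s")
```

Both devices come out at roughly the same number of cycles per sample, which suggests the throughput difference between them follows from clock frequency alone.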
15 Multiple instances on XC4VSX55
16 Current and future work
- Embedded system
- Bit-width optimisation
- Power and energy consumption
- Run-time reconfiguration
- Adaptive filtering
17 Summary
1. Novel reconfigurable architecture for two-path frequency-domain echo cancellation
2. Bit-width optimisations with fixed-point saturation arithmetic
3. Single core: 12.5 times faster than a 3.2 GHz Pentium 4 machine