Published by Sarah Watson. Modified over 9 years ago.
1
Signal Processing Algorithms for Wireless Acoustic Sensor Networks
Alexander Bertrand
Electrical Engineering Department (ESAT), Katholieke Universiteit Leuven
06-07-2010, University of Oldenburg, MEDI-AKU-SIGNAL Kolloquium
2
Outline
1. Introduction
2. Multi-channel Wiener filter (MWF)
3. Example: distributed MWF in binaural hearing aids
4. DANSE in fully connected WASN
5. Tree-DANSE
6. Multi-speaker VAD
Slide annotations: tracking of speech power; noise reduction
4
Traditional sensor array DSP
- Centralized processing
- Known / fixed sensor positions
- Long distance to the source (SNR drops 6 dB for each doubling of distance)
- Sharp angle
- Number of microphones is limited
[Figure: sensor array connected to a DSP]
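The 6 dB figure can be checked with a short calculation. This is a minimal sketch that assumes a point source in free field (direct-sound amplitude proportional to 1/distance) and a noise level that does not change with position; the function name is illustrative and not from the slides.

```python
import math

def level_drop_db(d_near: float, d_far: float) -> float:
    """Drop in direct-sound level (dB) when moving from d_near to d_far.

    Assumes a point source in free field, so amplitude scales as 1/distance.
    With a constant noise level, the SNR drops by the same amount.
    """
    return 20.0 * math.log10(d_far / d_near)

# Doubling the distance costs about 6 dB, whatever the starting distance.
for d in (0.5, 1.0, 2.0):
    print(f"{d} m -> {2 * d} m: {level_drop_db(d, 2 * d):.2f} dB")
```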
5
Distributed sensor arrays: wireless acoustic sensor network (WASN)
- More sensors
- More spatial information
- Subset: high SNR recordings
6
Distributed sensor arrays: challenges
1) Unknown/changing positions, link failure: the processing must be ADAPTIVE
2) Bandwidth efficiency
3) Distributed processing
4) Subset selection
8
Multi-channel Wiener Filtering (MWF)
- Goal: estimate the speech component in 1 of the N microphones
- Output = sum of filtered microphone signals
[Figure: filters W1, W2, W3, W4; summation (+); clean speech]
10
Multi-channel Wiener Filtering (MWF)
- Goal: estimate the speech component in 1 of the N microphones
- Output = sum of filtered microphone signals
- Needs:
  - N x N noise+speech correlation matrix R_yy
  - N x 1 clean speech correlation (column of R_dd)
- R_dd can be estimated as R_dd = R_yy - R_nn, using a voice activity detection (VAD) mechanism
[Figure: filters W1, W2, W3, W4; summation (+); clean speech]
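The estimation steps listed above can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions that are not in the slides: real-valued signals, instantaneous (single-tap) mixing, an oracle VAD, and spatially uncorrelated noise. The names `mwf_filter` and `demo_mwf` and the microphone gains are hypothetical.

```python
import numpy as np

def mwf_filter(y, speech_active, ref=0):
    """Estimate MWF coefficients for reference microphone `ref`.

    y             : (N, T) array of microphone samples
    speech_active : (T,) boolean VAD decisions (True = speech + noise)
    """
    y_sn = y[:, speech_active]            # speech + noise samples
    y_n = y[:, ~speech_active]            # noise-only samples
    R_yy = y_sn @ y_sn.T / y_sn.shape[1]  # N x N noise+speech correlation
    R_nn = y_n @ y_n.T / y_n.shape[1]     # N x N noise correlation
    R_dd = R_yy - R_nn                    # clean-speech correlation estimate
    return np.linalg.solve(R_yy, R_dd[:, ref])

def snr_db(signal, noise):
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def demo_mwf(seed=0, T=40000):
    rng = np.random.default_rng(seed)
    a = np.array([1.0, 0.8, 0.6, 0.9])                # hypothetical mic gains
    speech_active = (np.arange(T) // 2000) % 2 == 1   # oracle on/off VAD
    s = rng.standard_normal(T) * speech_active
    d = np.outer(a, s)                    # speech component in each mic
    n = rng.standard_normal((4, T))       # uncorrelated sensor noise
    w = mwf_filter(d + n, speech_active, ref=0)
    snr_in = snr_db(d[0, speech_active], n[0, speech_active])
    snr_out = snr_db((w @ d)[speech_active], (w @ n)[speech_active])
    return snr_in, snr_out

snr_in, snr_out = demo_mwf()
print(f"SNR at reference mic: {snr_in:.1f} dB, after MWF: {snr_out:.1f} dB")
```

In practice an MWF is usually applied per frequency bin or with multi-tap filters, and the VAD is itself error-prone; the sketch only shows how R_yy, R_nn and R_dd fit together.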
11
Multi-channel Wiener Filtering (MWF): recap
- Given: N microphone signals
- Choose one (arbitrary) reference microphone
- MWF computes the optimal filters such that the sum of the filter outputs is as close as possible (in the mean-square sense) to the speech component in the reference microphone
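The recap can be written compactly. The notation below is an assumption chosen to match R_yy and R_dd as used in these slides: y stacks the N microphone signals, d stacks their speech components, and e_ref is the unit vector selecting the reference microphone. Speech and noise are assumed uncorrelated.

```latex
\hat{d}_{\mathrm{ref}} = \mathbf{w}^{H}\mathbf{y}, \qquad
\mathbf{w}_{\mathrm{MWF}}
  = \arg\min_{\mathbf{w}} \; E\left\{ \left| d_{\mathrm{ref}} - \mathbf{w}^{H}\mathbf{y} \right|^{2} \right\}
  = \mathbf{R}_{yy}^{-1}\,\mathbf{R}_{dd}\,\mathbf{e}_{\mathrm{ref}},
\qquad
\mathbf{R}_{yy} = E\{\mathbf{y}\mathbf{y}^{H}\}, \quad
\mathbf{R}_{dd} = E\{\mathbf{d}\mathbf{d}^{H}\} = \mathbf{R}_{yy} - \mathbf{R}_{nn}.
```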
12
Noise frame: destructive interference (noise = electro music)
[Figure: filters F1, F2, F3, F4; summation (+)]
13
Speech frame: constructive interference (noise = electro music)
[Figure: filters F1, F2, F3, F4; summation (+)]
14
Outline
1. Introduction
2. Multi-channel Wiener filter (MWF)
3. Example: distributed MWF in binaural hearing aids
4. DANSE in fully connected WASN
5. Tree-DANSE
6. Multi-speaker VAD
7. Subset selection
8. Conclusions
15
Example: binaural hearing aids
- MWF left and MWF right, connected by a binaural link
- Large bandwidth needed
- Full matrix inversion
- = 2-node WASN
16
Example: binaural hearing aids
[Figure: binaural link; filters w_11 and g_12 summed (+) on one side, g_21 and w_22 summed (+) on the other]
- Converges to the optimum if there is a single desired source (Doclo et al., 2007)
17
Motivation for DANSE
- More than 2 nodes? E.g. supporting external sensor nodes or multiple hearing aid users.
21
Motivation for DANSE
- More than 2 nodes
- Multiple desired sources, e.g. conversation monitoring.
24
DANSE
- The previous scenarios require a more general framework: distributed adaptive node-specific signal estimation (DANSE)
- Allows for multiple nodes (fully connected topology)
- Allows for multiple target sources: estimating K sources requires communication of K-channel signals (DANSE_K)
25
DANSE
Considered here:
- Fully connected WSN
- Multi-channel sensor signal observations
- Goal: each node estimates a node-specific signal, but all desired signals share a common latent signal subspace (dimension = number of targets)
26
[Figure: 3 nodes, fully connected]
27
Binaural hearing aids (revisited)
[Figure: binaural link; filters w_11 and g_12 summed (+) on one side, g_21 and w_22 summed (+) on the other]
28
Binaural hearing aids (revisited)
[Figure: binaural link; filters w_11(1), w_11(2), g_12(1), g_12(2) on one side and w_22(1), w_22(2), g_21(1), g_21(2) on the other, each side summed (+)]
- J=2, DANSE_2 (K=2)
- Auxiliary channels capture the signal space
- Converges to the optimum if #desired sources ≤ 2
29
Binaural hearing aids (revisited)
[Figure: binaural link between two nodes, each with a summation (+)]
- J=2, DANSE_K
- Converges to the optimum if K = #desired sources
30
Sequential updating: sequential round-robin update
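A round-robin update of this kind can be simulated on batch data. The sketch below is an illustration of DANSE_1 (K=1, single desired source) under assumptions that are not in the slides: real-valued instantaneous mixing, oracle desired signals instead of VAD-based estimates, and least-squares fits on a fixed batch. All function names and the scenario (3 nodes, 3 sensors each, one interferer) are hypothetical.

```python
import numpy as np

def ls_filter(x, d):
    """Least-squares filter w minimising mean (d - w @ x)^2; x is (channels, T)."""
    return np.linalg.solve(x @ x.T, x @ d)

def danse1_round_robin(y_nodes, d_nodes, rounds=100, seed=0):
    """Sequential round-robin DANSE_1 on batch data (real-valued sketch).

    y_nodes : list of (M_k, T) arrays, the sensor signals of each node
    d_nodes : list of (T,) arrays, each node's desired signal (oracle here)
    Returns the local filters w_kk and the final MSE at every node.
    """
    rng = np.random.default_rng(seed)
    J = len(y_nodes)
    w_local = [rng.standard_normal(y.shape[0]) for y in y_nodes]  # random init
    z = [w @ y for w, y in zip(w_local, y_nodes)]  # broadcast 1-channel signals

    def node_input(k):
        # Own sensors stacked with the signals received from all other nodes.
        others = [z[q] for q in range(J) if q != k]
        return np.vstack([y_nodes[k]] + others)

    for _ in range(rounds):
        for k in range(J):                          # one node updates at a time
            w = ls_filter(node_input(k), d_nodes[k])
            w_local[k] = w[: y_nodes[k].shape[0]]   # part applied to own sensors
            z[k] = w_local[k] @ y_nodes[k]          # re-broadcast updated signal

    mse = []
    for k in range(J):
        x = node_input(k)
        mse.append(np.mean((d_nodes[k] - ls_filter(x, d_nodes[k]) @ x) ** 2))
    return w_local, mse

def demo_danse(seed=1, T=5000, J=3, M=3):
    rng = np.random.default_rng(seed)
    s = rng.standard_normal(T)                      # single desired source
    v = rng.standard_normal(T)                      # one localized interferer
    y_nodes, d_nodes = [], []
    for _ in range(J):
        a, b = rng.standard_normal(M), rng.standard_normal(M)
        noise = rng.standard_normal((M, T))
        y_nodes.append(np.outer(a, s) + np.outer(b, v) + noise)
        d_nodes.append(a[0] * s)                    # speech part of the node's mic 0
    _, mse_danse = danse1_round_robin(y_nodes, d_nodes)
    y_all = np.vstack(y_nodes)                      # centralized reference
    mse_central = [np.mean((d - ls_filter(y_all, d) @ y_all) ** 2) for d in d_nodes]
    return mse_danse, mse_central

mse_danse, mse_central = demo_danse()
for k, (md, mc) in enumerate(zip(mse_danse, mse_central)):
    print(f"node {k}: DANSE MSE {md:.5f}, centralized MSE {mc:.5f}")
```

Each node only ever sees its own sensors plus one compressed channel per other node, yet the per-node error can be compared directly with the centralized least-squares solution that uses all sensors.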
31
DANSE with simultaneous updating
- Simultaneous updating: parallel computing
- Sometimes converges to the optimal solution, but not always
- Solution: relaxation yields convergence and optimality
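The relaxed update rule itself did not survive in this transcript. One common form of relaxation, given here only as an assumption and not as the formula on the slide, blends each node's previous local filter with its newly computed local MWF solution, using a step size 0 < alpha <= 1:

```latex
\mathbf{W}_{kk}^{i+1} = (1-\alpha^{i})\,\mathbf{W}_{kk}^{i} + \alpha^{i}\,\widetilde{\mathbf{W}}_{kk}^{i+1},
\qquad 0 < \alpha^{i} \le 1,
```

where \widetilde{\mathbf{W}}_{kk}^{i+1} (hypothetical notation) denotes the local solution node k would adopt without relaxation. Setting alpha = 1 recovers the unrelaxed simultaneous update.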
32
DANSE with simultaneous updating: without relaxation (S-DANSE)
[Figure: convergence plot; 4 nodes, 3-6 sensors/node]
33
DANSE with simultaneous updating: with relaxation (rS-DANSE)
[Figure: convergence plot; 4 nodes, 3-6 sensors/node]
34
DANSE audio demo (tracking omitted)
- Unfiltered
- rS-DANSE
- Centralized MWF
35
Robust DANSE
- Theory: DANSE == centralized MWF, but…
36
Robust DANSE
- Numerical errors due to:
  - Estimation errors in R_dd (especially at low SNR nodes): ripple effect
  - Reference microphones that are close to each other: ill-conditioned basis for the signal subspace
- Solution: estimate the speech component in the communicated signals, preferably from high SNR nodes (= Robust DANSE or R-DANSE)
- Convergence is proven under certain dependency conditions
38
What if not fully connected?
39
Nodes must pass on information from other nodes
1) Nodes act as relays (virtually fully connected):
  - huge increase in bandwidth if limited connections
  - routing problem
2) Nodes broadcast the sum of all filtered inputs:
  - no increase in bandwidth
  - no routing problem (?)
41
What if not fully connected?
- FEEDBACK!
42
What if not fully connected?
- Intuition
- Theoretical analysis
- Conclusion: feedback causes major problems
- Direct feedback (one edge) vs. indirect feedback (loops)
43
Direct feedback cancellation: transmitter feedback cancellation
44
Direct feedback cancellation: receiver feedback cancellation
45
What if not fully connected?
- Intuition
- Theoretical analysis
- Conclusion: feedback causes major problems
- Direct feedback (one edge) vs. indirect feedback (loops)
- Prune to a tree topology: T-DANSE (= still optimal output!)
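Pruning a connected network to a tree removes all loops, and with them the indirect feedback paths. The slides do not say how the tree is chosen, so the sketch below simply assumes a breadth-first spanning tree; the function name and the example network are hypothetical.

```python
from collections import deque

def prune_to_tree(adjacency, root=0):
    """Return the edges of a breadth-first spanning tree of a connected graph.

    adjacency : dict mapping node -> iterable of neighbouring nodes
    """
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                tree_edges.append((node, neighbour))  # keep this link
                queue.append(neighbour)
    if len(visited) != len(adjacency):
        raise ValueError("network is not connected")
    return tree_edges

# Hypothetical 5-node network containing the loops 0-1-2 and 1-2-3.
links = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
tree = prune_to_tree(links)
print(tree)
```

A spanning tree over J nodes always keeps J-1 links, so every node stays reachable while no loop remains.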
47
Multi-speaker VAD
- Goal: track the individual speech power of multiple simultaneous speakers or other non-stationary sources (VAD)
- Exploit the spatial diversity offered by a WASN
[Figure legend: speaker, microphone]
48
Multi-speaker VAD: ad-hoc microphone array
Assumptions:
1. Speakers in near-field
2. Speakers are independent
3. Limited noise/reverberation
4. Sources to track are well-grounded (= they attain zero values)
Advantages:
- Array geometry unknown
- Speaker positions unknown
- Energy-based: low data rate, synchronization not crucial
- Suited to WASNs!
49
Data model
51
Non-negative blind source separation - Theorem (Plumbley, 2002): “An orthogonal mixture of non-negative, well-grounded source signals, that preserves non-negativity, is a permutation of the original signals.”
52
Exploiting non-negativity and well-groundedness (J=N=2 example)
[Figure: scatter plots with axes s1, s2, before and after the mixing y = As]
53
Exploiting non-negativity and well-groundedness (J=N=2 example)
- An orthogonal transformation preserves uncorrelatedness
- Hence simple decorrelation (whitening) of the measurements gives the original sources up to a rotation
[Figure: scatter plots with axes s1, s2; whitening step; the remaining rotation is unknown (?)]
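The whitening argument can be checked numerically. The sketch below assumes two independent unit-variance sources and a made-up 2x2 mixing matrix; the whitening matrix is computed from the covariance (mean removed) but is meant to be applied to the raw, non-centred measurements. Function names are illustrative.

```python
import numpy as np

def whitening_matrix(y):
    """Symmetric matrix V such that the rows of V @ y have identity covariance."""
    eigval, eigvec = np.linalg.eigh(np.cov(y))
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

def demo_whitening(seed=0, T=100_000):
    rng = np.random.default_rng(seed)
    # Two independent, non-negative, well-grounded sources (often exactly zero).
    s = rng.exponential(1.0, (2, T)) * (rng.random((2, T)) < 0.5)
    s /= s.std(axis=1, keepdims=True)          # unit variance per source
    A = np.array([[1.0, 0.6], [0.3, 1.0]])     # hypothetical mixing matrix
    y = A @ s                                  # measurements y = As
    V = whitening_matrix(y)
    U = V @ A                                  # overall map from s to V @ y
    return U, np.cov(s)

U, C_s = demo_whitening()
print("U @ U.T =\n", np.round(U @ U.T, 3))     # close to the identity
```

Because the sample covariance of the sources is only approximately the identity, U is orthogonal up to a small estimation error; what remains after whitening is an unknown rotation (or reflection) of the sources.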
54
Exploiting non-negativity and well-groundedness (J=N=2 example)
- Well-grounded source signals
[Figure: scatter plots with axes s1, s2, before and after the mixing y = As]
55
Exploiting non-negativity and well-groundedness (J=N=2 example)
- Well-grounded source signals
[Figure: scatter plots with axes s1, s2; whitening step; the remaining rotation can now be identified (!)]
56
Exploiting non-negativity and well-groundedness (J=N=2 example)
- Well-grounded source signals
[Figure: scatter plots with axes s1, s2]
57
Non-negative blind source separation
- Theorem (Plumbley, 2002): “An orthogonal mixture of non-negative, well-grounded source signals, that preserves non-negativity, is a permutation of the original signals.”
- Two different techniques:
  1. Whitening, ignoring non-negativity constraints (= easy), followed by a search for the rotation matrix that restores non-negativity (= hard)
  2. Whitening with non-negativity constraints (= hard)
- 1st approach (Oja & Plumbley) = NPCA (non-negative principal component analysis)
- 2nd approach (Bertrand & Moonen) = MNICA (multiplicative non-negative independent component analysis)
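For two sources, the rotation search in the first technique can be illustrated with a brute-force scan over angles. This is neither NPCA nor MNICA, only a toy sketch of the idea in the theorem: among all rotations of the (already whitened) data, the one that leaves no negative values recovers the sources up to a permutation. The whitened data are simulated here by rotating known sources by a made-up 40 degrees; all names are hypothetical.

```python
import numpy as np

def rotation(theta):
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])

def restore_nonnegativity(z, n_angles=720):
    """Grid-search the rotation of 2-channel data z that minimises negative energy."""
    best_cost, best_R = np.inf, None
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        R = rotation(theta)
        cost = np.sum(np.minimum(R @ z, 0.0) ** 2)  # energy of the negative parts
        if cost < best_cost:
            best_cost, best_R = cost, R
    return best_R @ z

def demo_rotation_search(seed=0, T=5000):
    rng = np.random.default_rng(seed)
    # Non-negative, well-grounded sources: each is exactly zero about half the time.
    s = rng.exponential(1.0, (2, T)) * (rng.random((2, T)) < 0.5)
    z = rotation(np.deg2rad(40.0)) @ s     # stands in for whitened measurements
    return s, restore_nonnegativity(z)

s_true, s_hat = demo_rotation_search()
print("max abs error:", np.max(np.abs(s_hat - s_true)))
```

A grid search does not scale beyond a couple of sources, which is one reason dedicated algorithms are used for this step.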
58
MNICA: results