1
HIWIRE
ICCS-NTUA: WP1+WP2
Prof. Petros Maragos
NTUA, School of ECE
URL: http://cvsp.cs.ntua.gr
Computer Vision, Speech Communication and Signal Processing Research Group
2
ICCS - NTUA HIWIRE Meeting, Granada June 2005
HIWIRE Involved CVSP Members
Group Leader: Prof. Petros Maragos
Ph.D. Students / Graduate Research Assistants:
● D. Dimitriadis (speech: recognition, modulations)
● V. Pitsikalis (speech: recognition, fractals/chaos, NLP)
● A. Katsamanis (speech: modulations, statistical processing, recognition)
● G. Papandreou (vision: PDEs, active contours, level sets, AV-ASR)
● G. Evangelopoulos (vision/speech: texture, modulations, fractals)
● S. Leykimiatis (speech: statistical processing, microphone arrays)
3
ICCS-NTUA in HIWIRE: 1st Year
Evaluation
● Databases: Completed
● Baseline: Completed
WP1
● Noise Robust Features: Results 1st Year
● Audio-Visual ASR: Baseline + Visual Features
● Multi-microphone array: Exploratory Phase
● VAD: Prelim. Results
WP2
● Speaker Normalization: Baseline
● Non-native Speech Database: Completed
4
ICCS-NTUA in HIWIRE: 1st Year
Evaluation
● Databases: Completed
● Baseline: Completed
WP1
● Noise Robust Features: Results 1st Year
  ● Modulation Features: Results 1st Year
  ● Fractal Features: Results 1st Year
● Audio-Visual ASR: Baseline + Visual Features
● Multi-microphone array: Exploratory Phase
● VAD: Prelim. Results
WP2
● Speaker Normalization: Baseline
● Non-native Speech Database: Completed
5
WP1: Noise Robustness
Platform: HTK
Baseline + Evaluation: Aurora 2, Aurora 3, TIMIT+NOISE
● Modulation Features
  ● AM-FM Modulations
  ● Teager Energy Cepstrum
● Fractal Features
  ● Dynamical Denoising
  ● Correlation Dimension
  ● Multiscale Fractal Dimension
● Hybrid-Merged Features
Improvements: up to +62% (Aurora 3), up to +36% (Aurora 2), up to +61% (Aurora 2)
7
Speech Modulation Features
● Filterbank Design
● Short-Term AM-FM Modulation Features
  ● Short-Term Mean Inst. Amplitude (IA-Mean)
  ● Short-Term Mean Inst. Frequency (IF-Mean)
  ● Frequency Modulation Percentages (FMP)
● Short-Term Energy Modulation Features
  ● Average Teager Energy, Cepstrum Coef. (TECC)
8
Modulation Acoustic Features
[Block diagram. Block labels: Speech; Nonlinear Processing; Demodulation; Robust Feature Transformation/Selection; Regularization + Multiband Filtering; Statistical Processing; V.A.D.]
Energy Features: Teager Energy Cepstrum Coeff. (TECC)
AM-FM Modulation Features: Mean Inst. Ampl. (IA-Mean), Mean Inst. Freq. (IF-Mean), Freq. Mod. Percent. (FMP)
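The Teager energy and AM-FM demodulation steps named on this slide can be illustrated with a short sketch. It assumes the standard discrete Teager-Kaiser energy operator and the DESA-2 energy separation algorithm; the slides only say "Demodulation" and do not state which algorithm was used, and all function names here are hypothetical. Multiband filtering, regularization and the cepstral step of TECC are left out.

```python
import math

def teager_energy(x):
    """Discrete Teager-Kaiser energy psi[n] = x[n]^2 - x[n-1]*x[n+1] (interior samples)."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def desa2(x):
    """DESA-2 estimates of instantaneous amplitude and frequency (rad/sample).

    Assumption: DESA-2 stands in for the unspecified demodulation step.
    """
    psi_x = teager_energy(x)                                  # psi_x[k] <-> sample k+1
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]   # symmetric difference
    psi_y = teager_energy(y)                                  # psi_y[j] <-> sample j+2
    amps, freqs = [], []
    for j, py in enumerate(psi_y):
        px = psi_x[j + 1]                                     # align both to sample j+2
        if px <= 0.0 or py <= 0.0:
            continue                                          # skip degenerate samples
        arg = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freqs.append(0.5 * math.acos(arg))
        amps.append(2.0 * px / math.sqrt(py))
    return amps, freqs

def mean(values):
    return sum(values) / len(values)

# Toy frame: a pure sinusoid with amplitude 0.7 and frequency 0.3 rad/sample.
frame = [0.7 * math.cos(0.3 * n + 0.1) for n in range(200)]
inst_amp, inst_freq = desa2(frame)
ia_mean = mean(inst_amp)    # short-term mean instantaneous amplitude
if_mean = mean(inst_freq)   # short-term mean instantaneous frequency
```

For a pure sinusoid the Teager energy is constant and equal to A^2 sin^2(Omega), so the frame-level means recover the amplitude and frequency of the tone; on band-passed speech the same means would vary from frame to frame.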
9
TIMIT-based Speech Databases
TIMIT Database:
● Training Set: 3696 sentences, ~35 phonemes/utterance
● Testing Set: 1344 utterances, 46680 phonemes
● Sampling Frequency: 16 kHz
Feature Vectors: MFCC+C0+AM-FM + 1st + 2nd Time Derivatives
Stream Weights: (1) for MFCC and (2) for AM-FM
3-state left-right HMMs, 16 mixtures
All-pair, Unweighted grammar
Performance Criterion: Phone Accuracy Rates (%)
Back-end System: HTK v3.2.0
10
Results: TIMIT+Noise
Up to +106%
11
Aurora 3 - Spanish
Connected digits, sampling frequency 8 kHz
Training Set:
● WM (Well-Matched): 3392 utterances (quiet 532, low 1668 and high noise 1192)
● MM (Medium-Mismatch): 1607 utterances (quiet 396 and low noise 1211)
● HM (High-Mismatch): 1696 utterances (quiet 266, low 834 and high noise 596)
Testing Set:
● WM: 1522 utterances (quiet 260, low 754 and high noise 508), 8056 digits
● MM: 850 utterances (quiet 0, low 0 and high noise 850), 4543 digits
● HM: 631 utterances (quiet 0, low 377 and high noise 254), 3325 digits
2 Back-end ASR Systems (HTK and BLasr)
Feature Vectors: MFCC+AM-FM (or Auditory+AM-FM), TECC
All-Pair, Unweighted Grammar (or Word-Pair Grammar)
Performance Criterion: Word (digit) Accuracy Rates
12
Results: Aurora 3 (HTK)
Up to +62%
13
Databases: Aurora 2
Task: Speaker Independent Recognition of Digit Sequences
TI-Digits at 8 kHz
Training (8440 utterances per scenario, 55M/55F):
● Clean (8 kHz, G712)
● Multi-Condition (8 kHz, G712)
  ● 4 noises (artificial): subway, babble, car, exhibition
  ● 5 SNRs: 5, 10, 15, 20 dB, clean
Testing, artificially added noise:
● 7 SNRs: [-5, 0, 5, 10, 15, 20 dB, clean]
● A: noises as in multi-cond. train., G712 (28028 utterances)
● B: restaurant, street, airport, train station, G712 (28028 utterances)
● C: subway, street (MIRS) (14014 utterances)
14
Results: Aurora 2
Up to +12%
15
Work To Be Done on Modulation Features
● Refinements w.r.t. AM-FM Features
● Fusion w. Other Features
17
Fractal Features
[Block diagram. Labels: speech signal; N-d Signal; Noisy Embedding; Local SVD; Geometrical Filtering; Filtered Embedding; N-d Cleaned Embedding]
● Filtered Dynamics - Correlation Dimension, FDCD (8)
● Multiscale Fractal Dimension, MFD (6)
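As a rough illustration of the embedding and correlation-dimension idea on this slide, the sketch below builds a time-delay embedding and estimates a correlation dimension from the Grassberger-Procaccia correlation sum. This is a textbook formulation chosen for illustration, not the authors' FDCD pipeline: the local-SVD filtering step is omitted, and the function names, embedding dimension, lag and radii are all assumptions.

```python
import math

def delay_embed(x, dim, lag):
    """Time-delay embedding: row i is (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n = len(x) - (dim - 1) * lag
    return [[x[i + k * lag] for k in range(dim)] for i in range(n)]

def correlation_sum(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is at most r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                close += 1
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) against log r between two radii (two-point estimate)."""
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (math.log(r_large) - math.log(r_small))

# Toy usage: embed a short sinusoid in 3 dimensions with a lag of 2 samples.
signal = [math.sin(0.2 * n) for n in range(300)]
embedded = delay_embed(signal, dim=3, lag=2)
```

In practice the slope is fitted over a range of radii rather than two points, and the O(n^2) pair loop is replaced by a neighbour search; the two-point version is kept here only to show the definition.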
19
Results: Aurora 2
Up to +40%
20
Results: Aurora 2
Up to +27%
21
Results: Aurora 2
Up to +61%
22
Future Directions on Fractal Features
● Refine Fractal Feature Extraction.
● Application to Aurora 3.
● Fusion with other features.
24
Visual Front-End
Aim:
● Extract low-dimensional visual speech feature vector from video
Visual front-end modules:
● Speaker's face detection
● ROI tracking
● Facial Model Fitting
● Visual feature extraction
Challenges:
● Very high dimensional signal - which features are proper?
● Robustness
● Computational Efficiency
25
Face Modeling
● A well studied problem in Computer Vision:
  ● Active Appearance Models, Morphable Models, Active Blobs
● Both Shape & Appearance can enhance lipreading
● The shape and appearance of human faces “live” in low dimensional manifolds
26
Image Fitting Example
[Figure panels: step 2, step 6, step 10, step 14, step 18]
27
Example: Face Interpretation Using AAM
[Figure panels: original video; shape track superimposed on original video; reconstructed face]
This is what the visual-only speech recognizer “sees”!
● Generative models like AAM allow us to evaluate the output of the visual front-end
28
Evaluation on the CUAVE Database
29
Audio-Visual ASR: Database
Subset of CUAVE database used:
● 36 speakers (30 training, 6 testing)
● 5 sequences of 10 connected digits per speaker
● Training set: 1500 digits (30x5x10)
● Test set: 300 digits (6x5x10)
CUAVE database also contains more complex data sets: speaker moving around, speaker shows profile, continuous digits, two speakers (to be used in future evaluations)
CUAVE was kindly provided by Clemson University
30
Recognition Results (Word Accuracy)
Data:
● Training: ~500 digits (29 speakers)
● Testing: ~100 digits (4 speakers)
Task | Audio | Visual | Audiovisual
Classification | 99% | 46% | 85%
Recognition | 98% | 26% | 78%
31
Future Work
● Visual Front-end
● Better trained AAM
● Temporal tracking
● Feature fusion
● Experimentation with alternative DBN architectures
● Automatic stream weight determination
● Integration with non-linear acoustic features
● Experiments on other audio-visual databases
● Systematic evaluation of visual features
33
User Robustness, Speaker Adaptation
● VTLN Baseline
  ● Platform: HTK
  ● Database: AURORA 4, Fs = 8 kHz
  ● Scenarios: Training, Testing
  ● Comparison with MLLR
● Collection of non-Native Speech Data: Completed
  ● 10 Speakers
  ● 100 Utterances/Speaker
34
Vocal Tract Length Normalization
● Implementation: HTK
● Warping Factor Estimation: Maximum Likelihood (ML) criterion
● Frequency Warping
Figures from Hain99, Lee96
35
VTLN
Training
● AURORA 4 Baseline Setup
● Clean (SIC), Multi-Condition (SIM), Noisy (SIN)
Testing
● Estimate warping factor using adaptation utterances (Supervised VTLN)
  ● Per speaker warping factor (1, 2, 10, 20 Utterances)
● 2-pass Decoding
  ● 1st pass: get a hypothesized transcription, then use alignment and ML to estimate a per-utterance warping factor
  ● 2nd pass: decode the properly normalized utterance
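The maximum-likelihood warping-factor search described here can be sketched as a grid search. The piecewise-linear warping function, the 0.88 to 1.12 grid and the `loglik` callable are assumptions made for illustration; the slides do not give the warping function parameters or the search grid, and this is not HTK code.

```python
def warp_frequency(f, alpha, f_cut, f_nyq):
    """Piecewise-linear warp: scale by alpha below f_cut, then join linearly to (f_nyq, f_nyq)."""
    if f <= f_cut:
        return alpha * f
    slope = (f_nyq - alpha * f_cut) / (f_nyq - f_cut)
    return alpha * f_cut + slope * (f - f_cut)

def estimate_warp_factor(loglik, alphas=None):
    """Return the warping factor that maximizes loglik(alpha).

    loglik is a hypothetical callable: given alpha, it would extract warped
    features for the adaptation utterances, align them against the
    transcription (reference or first-pass hypothesis) and return the
    resulting log-likelihood.
    """
    if alphas is None:
        # Assumed search grid: 0.88, 0.90, ..., 1.12
        alphas = [round(0.88 + 0.02 * i, 2) for i in range(13)]
    return max(alphas, key=loglik)

# Toy stand-in for the acoustic likelihood, peaked at alpha = 1.04.
def toy_loglik(alpha):
    return -(alpha - 1.04) ** 2

best_alpha = estimate_warp_factor(toy_loglik)
```

In the 2-pass scheme the first decoding pass supplies the transcription used inside `loglik`, and the second pass decodes features extracted with `best_alpha`.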
36
Databases: Aurora 4
Task: 5000 Word, Continuous Speech Recognition
WSJ0: (16 / 8 kHz) + Artificially Added Noise
2 microphones: Sennheiser, Other
Filtering: G712, P341
Noises: Car, Babble, Restaurant, Street, Airport, Train Station
Training (7138 Utterances per scenario)
● Clean: Sennheiser mic.
● Multi-Condition: Sennheiser - Other mic., 75% w. artificially added noise @ SNR: 10-20 dB
● Noisy: Sennheiser, artificially added noise, SNR: 10-20 dB
Testing (330 Utterances - 166 Utterances each. Speaker # = 8)
● SNR: 5-15 dB
● 1-7: Sennheiser microphone
● 8-14: Other microphone
37
VTLN Results, Clean Training
38
VTLN Results, Multi-Condition Training
39
VTLN Results, Noisy Training
40
Future Directions for Speaker Normalization
● Estimate warping transforms at signal level
  ● Exploit instantaneous amplitude or frequency signals to estimate the warping parameters
  ● Normalize the signal
● Effective integration with model-based adaptation techniques (collaboration with TSI)
42
WP1: Appendix Slides Aurora 3
43
ASR Results I
TIMIT-Based Speech Databases (Correct Phone Accuracies (%))
Features | TIMIT | NTIMIT | TIMIT+Babble | TIMIT+White | TIMIT+Pink | TIMIT+Car | Av. Rel. Improv.
MFCC* | 58.40 | 42.42 | 27.71 | 17.72 | 18.60 | 52.75 | -
TEner. CC | 58.89 | 42.40 | 41.61 | 34.74 | 38.40 | 54.35 | 24.26
MFCC*+IA-Mean | 59.61 | 43.53 | 39.25 | 26.03 | 31.05 | 56.50 | 17.62
MFCC*+IF-Mean | 59.34 | 43.70 | 36.87 | 25.38 | 30.92 | 55.30 | 15.58
MFCC*+FMP | 59.92 | 43.69 | 38.60 | 26.15 | 32.84 | 55.97 | 18.17
* MFCC+C0+D+DD, # states=3, # mixtures=16
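The "Av. Rel. Improv." figures on this slide appear consistent with the relative gain of the six-condition mean accuracy over the MFCC* baseline. The sketch below checks that reading; the formula is inferred from the numbers rather than stated on the slide, and the row assignment of the TIMIT+Car values is likewise inferred from the slide layout. With these assumptions the reported column is reproduced to within about 0.02 points.

```python
# Phone accuracies (%) in the order:
# TIMIT, NTIMIT, TIMIT+Babble, TIMIT+White, TIMIT+Pink, TIMIT+Car
ACCURACY = {
    "MFCC*":         [58.40, 42.42, 27.71, 17.72, 18.60, 52.75],
    "TEner. CC":     [58.89, 42.40, 41.61, 34.74, 38.40, 54.35],
    "MFCC*+IA-Mean": [59.61, 43.53, 39.25, 26.03, 31.05, 56.50],
    "MFCC*+IF-Mean": [59.34, 43.70, 36.87, 25.38, 30.92, 55.30],
    "MFCC*+FMP":     [59.92, 43.69, 38.60, 26.15, 32.84, 55.97],
}

def mean(values):
    return sum(values) / len(values)

def avg_rel_improvement(features, baseline="MFCC*"):
    """Relative gain (%) of the mean accuracy over the baseline mean accuracy (assumed formula)."""
    base = mean(ACCURACY[baseline])
    return 100.0 * (mean(ACCURACY[features]) - base) / base

def rel_improvement(acc, base_acc):
    """Relative accuracy gain (%) for a single condition."""
    return 100.0 * (acc - base_acc) / base_acc

tecc_gain = avg_rel_improvement("TEner. CC")
pink_gain = rel_improvement(ACCURACY["TEner. CC"][4], ACCURACY["MFCC*"][4])
```

Under the same reading, the largest single-condition gain is the TEner. CC result on TIMIT+Pink (`pink_gain`), which matches the "Up to +106%" headline of the TIMIT+Noise results slide.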
44
Experimental Results IIa (HTK)
Aurora3 (Spanish Task) (Correct Word Accuracies (%)); scenarios: WM, MM, HM
Features | WM | MM | HM | Average | Av. Rel. Improv.
Aurora Front-End (WI007) | 92.94 | 80.31 | 51.55 | 74.93 | -
MFCC* | 93.68 | 92.73 | 65.18 | 83.86 | +35.62 %
TEnerCC+log(Ener)+CMS | 93.64 | 91.61 | 86.85 | 90.70 | 62.90 %
MFCC*+IA-Mean | 94.05 | 92.22 | 77.70 | 87.99 | 52.09 %
MFCC*+IF-Mean | 90.71 | 89.52 | 72.36 | 84.20 | 36.98 %
MFCC*+FMP | 94.41 | 92.46 | 82.73 | 89.87 | 59.59 %
* MFCC+log(Ener)+D+DD+CMS, # states=14, # mixtures=14
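For Aurora 3, the "Av. Rel. Improv." column appears to be the relative reduction in word error rate of the average accuracy with respect to the Aurora Front-End (WI007). The formula is inferred from the numbers, not stated on the slide; the sketch below reproduces the reported percentages from the reported averages.

```python
# Average word accuracies (%) over WM, MM and HM, as reported on the slide.
AVERAGE_ACCURACY = {
    "Aurora Front-End (WI007)": 74.93,
    "MFCC*": 83.86,
    "TEnerCC+log(Ener)+CMS": 90.70,
    "MFCC*+IA-Mean": 87.99,
    "MFCC*+IF-Mean": 84.20,
    "MFCC*+FMP": 89.87,
}

def rel_error_reduction(acc, base_acc):
    """Relative error-rate reduction (%) of acc with respect to base_acc (assumed formula)."""
    return 100.0 * (acc - base_acc) / (100.0 - base_acc)

baseline = AVERAGE_ACCURACY["Aurora Front-End (WI007)"]
improvements = {
    name: rel_error_reduction(acc, baseline)
    for name, acc in AVERAGE_ACCURACY.items()
    if name != "Aurora Front-End (WI007)"
}
```

Note that this differs from the TIMIT slide, where the reported gains match a relative increase in accuracy rather than a relative decrease in error.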
45
Aurora 3 Configs
● HM: States 14, Mix’s 12
● MM: States 16, Mix’s 6
● WM: States 16, Mix’s 16
46
WP1: Appendix Slides Aurora 2
47
Baseline: Aurora 2
Database Structure:
● 2 Training Scenarios, 3 Test Sets, [4+4+2] Conditions, 7 SNRs per Condition: Total of 2x70 Tests
Presentation of Selected Results:
● Average over SNR.
● Average over Condition.
● Training Scenarios: Clean vs. Multi-Train.
● Noise Level: Low vs. High SNR.
● Condition: Worst vs. Easy Conditions.
● Features: MFCC+D+A vs. MFCC+D+A+CMS
Set up: # states 18 [10-22], # mix’s [3-32], MFCC+D+A+CMS
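The "2x70 Tests" count can be checked by enumerating the test grid. The noise names and utterance totals below are taken from the "Databases: Aurora 2" slide; the variable names are illustrative only.

```python
TRAINING_SCENARIOS = ["clean", "multi-condition"]
TEST_SETS = {
    "A": ["subway", "babble", "car", "exhibition"],
    "B": ["restaurant", "street", "airport", "train station"],
    "C": ["subway (MIRS)", "street (MIRS)"],
}
SNRS = ["clean", 20, 15, 10, 5, 0, -5]   # dB, plus the clean condition

# One test per (training scenario, test set, noise condition, SNR).
tests = [
    (train, test_set, noise, snr)
    for train in TRAINING_SCENARIOS
    for test_set, noises in TEST_SETS.items()
    for noise in noises
    for snr in SNRS
]
n_tests = len(tests)

# Utterance totals per test set, as listed on the database slide.
TOTAL_UTTERANCES = {"A": 28028, "B": 28028, "C": 14014}
utterances_per_test = {
    name: TOTAL_UTTERANCES[name] // (len(noises) * len(SNRS))
    for name, noises in TEST_SETS.items()
}
```

Dividing each test-set total by its number of noise and SNR combinations gives the same number of utterances per individual test in all three sets.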
48
Average Baseline Results: Aurora 2
[Table. Column labels: Developed Baseline, Baseline*; Best CMS, Best PLAIN, CMS, PLAIN, Training Scenario. Values in source order: 57.58 84.17; Multi: 82.04 84.04 81.90 78.66; Clean: 59.10 55.48 58.26]
* Average HTK results as reported with the database.
Average over all SNRs and all Conditions.
Plain: MFCC+D+A; CMS: MFCC+D+A+CMS.
Mixture #: Clean train (both Plain, CMS): 3; Multi train: Plain 22, CMS 32.
Best: select for each condition/noise the # mix’s with the best result.
53
Aurora 2 Distributed, Multicondition Training
Multicondition Training - Full (word accuracy, %; test sets A, B, C)
SNR | A: Subway | A: Babble | A: Car | A: Exhibition | A: Average | B: Restaurant | B: Street | B: Airport | B: Station | B: Average | C: Subway M | C: Street M | C: Average | Average
Clean | 98.68 | 98.52 | 98.39 | 98.49 | 98.52 | 98.68 | 98.52 | 98.39 | 98.49 | 98.52 | 98.50 | 98.58 | 98.54 | 98.52
20 dB | 97.61 | 97.73 | 98.03 | 97.41 | 97.70 | 96.87 | 97.58 | 97.44 | 97.01 | 97.23 | 97.30 | 96.55 | 96.93 | 97.35
15 dB | 96.47 | 97.04 | 97.61 | 96.67 | 96.95 | 95.30 | 96.31 | 96.12 | 95.53 | 95.82 | 96.35 | 95.53 | 95.94 | 96.29
10 dB | 94.44 | 95.28 | 95.74 | 94.11 | 94.89 | 91.96 | 94.35 | 93.29 | 92.87 | 93.12 | 93.34 | 92.50 | 92.92 | 93.79
5 dB | 88.36 | 87.55 | 87.80 | 87.60 | 87.83 | 83.54 | 85.61 | 86.25 | 83.52 | 84.73 | 82.41 | 82.53 | 82.47 | 85.52
0 dB | 66.90 | 62.15 | 53.44 | 64.36 | 61.71 | 59.29 | 61.34 | 65.11 | 56.12 | 60.47 | 46.82 | 54.44 | 50.63 | 59.00
-5 dB | 26.13 | 27.18 | 20.58 | 24.34 | 24.56 | 25.51 | 27.60 | 29.41 | 21.07 | 25.90 | 18.91 | 24.24 | 21.58 | 24.50
Average | 88.76 | 87.95 | 86.52 | 88.03 | 87.82 | 85.39 | 87.04 | 87.64 | 85.01 | 86.27 | 83.24 | 84.31 | 83.78 | 86.39
54
Aurora 2 Distributed, Clean Training
Clean Training - Full (word accuracy, %; test sets A, B, C)
SNR | A: Subway | A: Babble | A: Car | A: Exhibition | A: Average | B: Restaurant | B: Street | B: Airport | B: Station | B: Average | C: Subway M | C: Street M | C: Average | Average
Clean | 98.93 | 99.00 | 98.96 | 99.20 | 99.02 | 98.93 | 99.00 | 98.96 | 99.20 | 99.02 | 99.14 | 98.97 | 99.06 | 99.03
20 dB | 97.05 | 90.15 | 97.41 | 96.39 | 95.25 | 89.99 | 95.74 | 90.64 | 94.72 | 92.77 | 93.46 | 95.13 | 94.30 | 94.07
15 dB | 93.49 | 73.76 | 90.04 | 92.04 | 87.33 | 76.24 | 88.45 | 77.01 | 83.65 | 81.34 | 86.77 | 88.91 | 87.84 | 85.04
10 dB | 78.72 | 49.43 | 67.01 | 75.66 | 67.71 | 54.77 | 67.11 | 53.86 | 60.29 | 59.01 | 73.90 | 74.43 | 74.17 | 65.52
5 dB | 52.16 | 26.81 | 34.09 | 44.83 | 39.47 | 31.01 | 38.45 | 30.33 | 27.92 | 31.93 | 51.27 | 49.21 | 50.24 | 38.61
0 dB | 26.01 | 9.28 | 14.46 | 18.05 | 16.95 | 10.96 | 17.84 | 14.41 | 11.57 | 13.70 | 25.42 | 22.91 | 24.17 | 17.09
-5 dB | 11.18 | 1.57 | 9.39 | 9.60 | 7.94 | 3.47 | 10.46 | 8.23 | 8.45 | 7.65 | 11.82 | 11.15 | 11.49 | 8.53
Average | 69.49 | 49.89 | 60.60 | 65.39 | 61.34 | 52.59 | 61.52 | 53.25 | 55.63 | 55.75 | 66.16 | 66.12 | 66.14 | 60.06
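The "Average" row in the two distributed-baseline tables appears to be the mean over the five SNRs from 20 dB down to 0 dB, leaving out the clean and -5 dB rows. That convention is inferred from the numbers; the sketch below checks it on the Subway column of both tables and on the overall column of the multicondition table.

```python
SNR_ROWS = ["Clean", "20 dB", "15 dB", "10 dB", "5 dB", "0 dB", "-5 dB"]
AVERAGED_ROWS = ["20 dB", "15 dB", "10 dB", "5 dB", "0 dB"]   # assumed averaging range

# Word accuracies (%) per SNR row, copied from the slides.
MULTI_SUBWAY = [98.68, 97.61, 96.47, 94.44, 88.36, 66.90, 26.13]
MULTI_OVERALL = [98.52, 97.35, 96.29, 93.79, 85.52, 59.00, 24.50]
CLEAN_SUBWAY = [98.93, 97.05, 93.49, 78.72, 52.16, 26.01, 11.18]

def aurora2_average(column):
    """Mean accuracy over the 20 dB to 0 dB rows of one table column."""
    by_snr = dict(zip(SNR_ROWS, column))
    return sum(by_snr[snr] for snr in AVERAGED_ROWS) / len(AVERAGED_ROWS)

multi_subway_avg = aurora2_average(MULTI_SUBWAY)
multi_overall_avg = aurora2_average(MULTI_OVERALL)
clean_subway_avg = aurora2_average(CLEAN_SUBWAY)
```

The computed values agree with the reported averages (88.76, 86.39 and 69.49) after rounding to two decimals.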
55
WP1: Appendix Slides Audio Visual: Details
56
Introduction: Motivations for AV-ASR
● Audio-only ASR does not work reliably in many scenarios:
  ● Noisy background (e.g. car's cabin, cockpit)
  ● Interference between talkers
● Need to enhance the auditory signal when it is not reliable
● Human speech perception is multimodal:
  ● Different modalities are weighed according to their reliability
  ● Hearing impaired people can lipread
  ● McGurk Effect (McGurk & MacDonald, 1976)
● Machines should also be able to exploit multimodal information
57
Audio-Visual Feature Fusion
● Audio-visual feature integration is highly non-trivial:
  ● Audio & visual speech asynchrony (~100 ms)
  ● Relative reliability of streams can vary wildly
● Many approaches to feature fusion in the literature:
  ● Early integration
  ● Intermediate integration
  ● Late integration
● Highly active research area (mainly machine learning)
● The class of Dynamic Bayesian Networks (DBNs) seems particularly suited for the problem:
  ● Stream interaction explicitly modeled
  ● Model parameter inference is more difficult than in HMM
58
Visual Front-End
AAM Parameters
● First frame of the 36 videos manually annotated
● 68 points on the whole face as shape landmarks
● Color appearance sampled at 10000 pixels
● Eigenvectors retained explain 70% variance
  ● 5 eigenshapes & 10 eigenfaces
● Initial condition at each new frame: the converged solution at the previous frame
● Inverse-compositional gradient descent algorithm
● Coarse-to-fine refinement (Gaussian pyramid - 3 scales)
59
AV-ASR Experiment Setup
Features:
● Audio: 39 features (MFCC_D_A)
● Visual (upsampled from 30 Hz to 100 Hz):
  ● 5 shape features (Sh)
  ● 10 appearance features (App)
● Audio-Visual: 39+45 feats (MFCC_D_A+SHAPP_D_A)
Two-stream HMM
● 8 state, left-to-right HMM whole-digit models with no state skipping
● Single Gaussian observation probability densities
● Separate audio & video feature streams with equal weights (1,1)
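The setup above can be made concrete with a small sketch covering the feature dimensions, the 30 Hz to 100 Hz upsampling and the two-stream score combination. Linear interpolation is an assumption (the slide does not say how the visual features were upsampled), and the function names are hypothetical. The stream combination follows the usual multi-stream HMM form, in which per-stream log-likelihoods are summed with stream weights.

```python
def upsample_linear(frames, src_hz=30, dst_hz=100):
    """Linearly interpolate a list of feature vectors from src_hz to dst_hz (needs >= 2 frames)."""
    n_out = int((len(frames) - 1) * dst_hz / src_hz) + 1
    out = []
    for k in range(n_out):
        t = k * src_hz / dst_hz                 # output time in source-frame units
        i = min(int(t), len(frames) - 2)        # left neighbour
        w = t - i                               # interpolation weight of right neighbour
        out.append([(1 - w) * a + w * b for a, b in zip(frames[i], frames[i + 1])])
    return out

def with_derivatives(static_dim, orders=2):
    """Dimension after appending delta (_D) and acceleration (_A) coefficients."""
    return static_dim * (1 + orders)

AUDIO_DIM = 39                           # MFCC_D_A, as stated on the slide
VISUAL_DIM = with_derivatives(5 + 10)    # 5 shape + 10 appearance, plus _D and _A

def two_stream_loglik(log_b_audio, log_b_video, w_audio=1.0, w_video=1.0):
    """State log-likelihood of a two-stream HMM: weighted sum of stream log-likelihoods."""
    return w_audio * log_b_audio + w_video * log_b_video

# Toy visual track: 3 frames at 30 Hz with one feature each.
video_100hz = upsample_linear([[0.0], [3.0], [6.0]])
```

With equal weights (1,1), as in this setup, the combined score is simply the sum of the audio and visual log-likelihoods; unequal weights let one stream dominate when the other is unreliable.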
60
WP1: Appendix Slides Aurora 4
61
Aurora 4, Multi-Condition Training
Multicondition training: 7138 utterances
● 3569 utterances (Sennheiser mic)
  ● 893 (no noise added)
  ● 2676 (1 out of 6 noises added at SNRs between 10 and 20 dB)
● 3569 utterances (2nd mic)
  ● 893 (no noise added)
  ● 2676 (1 out of 6 noises added at SNRs between 10 and 20 dB)
62
Aurora 4, Noisy Training
Noisy training: 7138 utterances
● 3569 utterances (Sennheiser mic)
  ● 893 (no noise added)
  ● 2676 (1 out of 6 noises added at SNRs between 10 and 20 dB)
● 3569 utterances (Sennheiser mic)
  ● 893 (no noise added)
  ● 2676 (1 out of 6 noises added at SNRs between 10 and 20 dB)