Published by Jasmin Holmes. Modified over 9 years ago.
To examine the feasibility of using confusion matrices from speech recognition tests to identify impaired channels, impairments in this study were simulated by setting to zero the amplitude of one acoustic model channel in the waveform presented to the normal-hearing subjects. Removing an acoustic model channel represents the worst-case impairment: a channel transmitting zero speech information. This is not a clinically relevant scenario, but it is an appropriate proof of concept, which is the intended scope of this initial study.

METHOD

Unimpaired model (baseline)
Eight analysis / eight presentation filters (p-of-p)
Logarithmically spaced from 150 Hz to 6450 Hz
Speech processed in 2 millisecond windows

Impaired models
Eight impaired models [imp1, imp2, imp3, imp4, imp5, imp6, imp7, imp8]
Single channel removed from model (envelope set to zero); other channels unaffected

Nine vowel tokens: [had, hawed, head, heard, heed, hid, hood, hud, who’d]
Fourteen consonant tokens: [b, d, f, g, j, k, m, n, p, s, sh, t, v, z], in /aCa/ context
Three noise levels: quiet, +8 dB SNR, and +6 dB SNR

Impairment measurably affects vowel recognition, with a smaller effect on consonant recognition. The transmission of F1 and F2 is noticeably affected by the impairments. Removing channels 2 and 3 affects F1 transmission, and removing channels 5 through 7 affects F2 transmission.

Inspired by Miller and Nicely information transmission analysis. Use the vowel formant-to-channel mapping to determine the classification matrix, assuming that the vowel formant locations are the critical discriminating features. The classification matrix for Channel Information Confusion Analysis (CICA) can be directly calculated from the distribution of first (F1) and second (F2) vowel formants across the model channels. Calculate transmission in bits per stimulus from the grouped confusion matrix constructed using the classification matrix.
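The transmission calculation can be sketched as follows. This is a minimal illustration that assumes the standard Miller and Nicely form, T = Σ p_ij log2(p_ij / (p_i p_j)), applied to a confusion matrix whose tokens have been grouped by a per-channel class label. The function names, the class labels, and the counts are hypothetical and are not taken from the study.

```python
import math


def group_confusions(confusions, classes):
    """Collapse a token-by-token count matrix into a class-by-class one.

    confusions[i][j] counts how often token i was heard as token j;
    classes[i] is the (assumed) class label of token i for one channel.
    """
    labels = sorted(set(classes))
    index = {label: k for k, label in enumerate(labels)}
    grouped = [[0] * len(labels) for _ in labels]
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            grouped[index[classes[i]]][index[classes[j]]] += count
    return grouped


def transmitted_bits(matrix):
    """Mutual information between stimulus and response, in bits per stimulus."""
    total = sum(sum(row) for row in matrix)
    row_p = [sum(row) / total for row in matrix]
    col_p = [sum(row[j] for row in matrix) / total for j in range(len(matrix[0]))]
    bits = 0.0
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if count:  # empty cells contribute nothing
                p = count / total
                bits += p * math.log2(p / (row_p[i] * col_p[j]))
    return bits


# Made-up counts for four tokens; tokens 0-1 and 2-3 share a class label.
confusions = [[8, 2, 0, 0],
              [2, 8, 0, 0],
              [0, 0, 9, 1],
              [0, 0, 1, 9]]
classes = [0, 0, 1, 1]
grouped = group_confusions(confusions, classes)
bits = transmitted_bits(grouped)
print(grouped, bits)
```

In this toy case every confusion stays inside its class, so the grouped matrix is diagonal and the binary feature is transmitted in full (1 bit). Confusions that cross class boundaries would lower the value toward 0.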
Impairments to the ideal speech presentation, which can result from various physiological or psychophysical conditions, are one factor affecting how much benefit severely deafened individuals receive from cochlear implants. Studies of the effects of impairments often use acoustic models to examine impairments on all channels; however, isolated impairments affecting a small number of electrodes are more common and have been shown to negatively affect speech recognition performance. Assessing cochlear implant device performance on an electrode-by-electrode basis requires extensive, time-consuming tests. In this study, methods for screening poorly performing electrodes are evaluated using vowel confusion data from normal-hearing subjects tested with acoustic models. Confusion patterns for vowel tokens in noise with missing spectral information were measured using an eight-channel acoustic model with individual channels removed. The missing channel model represents the most severe case of a poorly performing channel, with no speech information transmitted on the channel under investigation. Preliminary results indicate successful identification of the missing channels using the vowel confusion matrices for each acoustic model. A technique that quickly identifies potential poorly performing channels would allow researchers, and ultimately clinicians, to develop and implement alternate speech presentations that maximize transmission of critical speech information.

Identifying Poorly Performing Channels in Cochlear Implants Through Confusion Matrix Analysis
Jeremiah J. Remus and Leslie M. Collins
Department of Electrical and Computer Engineering, Duke University, Durham, NC

INTRODUCTION

Methods Used to Identify Poorly Performing Channels

Two issues were identified as critical for developing a method to identify impaired channels through speech-based confusion matrix analysis:
1) creating a set of stimuli, speech-based or nonsensical, that thoroughly tests each channel for impairments;
2) developing a measure for classifying the similarities and differences between the tokens with regard to the critical discriminating features transmitted on each channel.
Caution must be exercised to avoid creating a stimulus set that is overly simplified. Tokens must contain sufficient complexity to ensure their usefulness in measuring impairments of interest.

Token confusions in the listening test can be attributed to one of three causes:
1) phonetic similarities between the tokens;
2) the structure of the additive noise and experiment setup;
3) the effect of the impairments on the critical discriminating features.
Additional efforts to remove the influence of the first two causes of confusions from the impaired model confusion matrix, and isolate confusions resulting from impairments to the acoustic model, will improve the identification of the impaired channels.

The CICA method was able to identify the missing channels with half the false alarm rate of random chance. Using the analysis presented in this study on listening test results as a pre-screener to provide some prior information for a full test of channel impairments could significantly reduce the test complexity.
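The comparison with random chance can be checked with a short calculation. The sketch below assumes a single missing channel among eight and a screening order that is uniformly random without repeats, so the expected number of false alarms before the missing channel is reached is (8 - 1) / 2 = 3.5. The per-method values are the expected false alarm counts reported on the poster; the function and variable names are illustrative only.

```python
from fractions import Fraction


def expected_false_alarms_chance(n_channels):
    """Expected wrong guesses before hitting the one missing channel.

    With a random test order, the missing channel is equally likely to sit
    at any position, so k false alarms (k = 0 .. n-1) each have probability 1/n.
    """
    return sum(Fraction(k, n_channels) for k in range(n_channels))


chance = expected_false_alarms_chance(8)

# Expected false alarms per method, as reported on the poster.
reported = {"HMM": 2.5, "DTW": 2.3, "CICA": 1.6}

# Each method's false alarm count relative to chance.
relative = {name: value / float(chance) for name, value in reported.items()}

print(float(chance))
for name, ratio in relative.items():
    print(name, round(ratio, 2))
```

Under these assumptions the CICA figure of 1.6 is a little under half of the chance value, which is consistent with the statement above.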
Poster panels: Listening Experiment Performance; Identifying Missing Channels; Conclusions.

This research is supported by the National Science Foundation grant NSF-BES-00-85370.

Performance comparison: percentage of subjects whose missing channels were correctly identified versus the percentage of possible “false alarms” experienced. All methods perform better than chance, with the best performance obtained using the CICA method. Based on the percentage of total possible false alarms to find all missing channels, we would expect to observe (on average) the following number of false alarms en route to identifying the missing channel in a single subject with eight channels:
random guessing (chance): 3.5
HMM: 2.5
DTW: 2.3
CICA: 1.6

Channel Information Confusion Analysis
Figure: Distribution of first and second vowel formants across channels.

Hidden Markov Models (HMM)
Model (λ) parameters:
M: number of Gaussian mixtures
Q: number of model states
Token similarities measured quantitatively using the HMM will ideally mirror perceptual token similarities and correlate to the rate of token confusion.

Dynamic Time Warping (DTW)
Least cost mapping through cost/distance matrix. Find the path through the cost matrix resulting in the lowest cumulative cost D.
Decision metric: Compare the cepstrum coefficients of an impaired processed token to the cepstrum coefficients of an unimpaired processed token, used as a template. The difference between cepstrum coefficient vectors is calculated using Euclidean distance.

Post-processing on HMM/DTW decision metrics
Average the off-diagonal values in each row of the matrix of decision metrics for use as an estimated value of the confusion matrix diagonal. Greater average off-diagonal values suggest more similarity between the stimuli and the potential incorrect responses, suggesting a lower rate of confusion. Calculate the correlation coefficient between the diagonal of the listening test confusion matrix and the averaged off-diagonal values.
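The DTW decision metric and the row-wise post-processing can be sketched as below. This is an illustrative version rather than the study's implementation: it assumes the common three-way step pattern (match, insertion, deletion) with no slope constraint or path-length normalisation, which the poster does not specify. All function names are hypothetical, and the short one-dimensional sequences merely stand in for cepstrum coefficient vectors.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_cost(template, test):
    """Lowest cumulative cost D of a warping path between two sequences."""
    n, m = len(template), len(test)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(template[i - 1], test[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def mean_off_diagonal(matrix):
    """Average of the off-diagonal entries in each row of a square matrix."""
    size = len(matrix)
    return [sum(row[j] for j in range(size) if j != i) / (size - 1)
            for i, row in enumerate(matrix)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy stand-ins: unimpaired templates and impaired versions of three tokens.
templates = [[[0.0], [1.0]], [[0.0], [3.0]], [[4.0], [4.0]]]
impaired = [[[0.0], [1.5]], [[0.5], [3.0]], [[4.0], [3.5]]]

# Rows: impaired stimulus token; columns: unimpaired template token.
metrics = [[dtw_cost(t, x) for t in templates] for x in impaired]
row_scores = mean_off_diagonal(metrics)
print(row_scores)
```

The values in `row_scores` would then be passed to `pearson` together with the diagonal of the measured listening test confusion matrix, which is not reproduced here.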
HMM decision metric: Calculate the log likelihood that the HMM trained on the cepstrum coefficients of an unimpaired processed token will produce, as an observation, the cepstrum coefficients of an impaired processed token.

Figure: Hidden Markov Model Structure.

* Pooled across SNR
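A log likelihood of this kind is usually computed with the forward algorithm. The sketch below is a simplified illustration, not the model used in the study: it assumes one diagonal-covariance Gaussian per state (M = 1) instead of a Gaussian mixture, takes the model parameters as given instead of training them, and uses a made-up two-state left-to-right model. All names and numbers are hypothetical.

```python
import math

NEG_INF = float("-inf")


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def hmm_log_likelihood(obs, log_start, log_trans, means, variances):
    """log P(obs | model) via the forward algorithm in the log domain."""
    q = len(log_start)
    alpha = [log_start[s] + log_gaussian(obs[0], means[s], variances[s])
             for s in range(q)]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[r] + log_trans[r][s] for r in range(q)])
                 + log_gaussian(x, means[s], variances[s])
                 for s in range(q)]
    return logsumexp(alpha)


# Made-up two-state left-to-right model over one-dimensional features.
log_start = [0.0, NEG_INF]                      # always start in state 0
log_trans = [[math.log(0.5), math.log(0.5)],    # stay in 0 or move to 1
             [NEG_INF, 0.0]]                    # state 1 is absorbing
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]

matched = hmm_log_likelihood([[0.0], [0.0], [5.0], [5.0]],
                             log_start, log_trans, means, variances)
mismatched = hmm_log_likelihood([[5.0], [5.0], [0.0], [0.0]],
                                log_start, log_trans, means, variances)
print(matched > mismatched)
```

A sequence that follows the trained model's state order scores a higher log likelihood than one that does not, which is the property the decision metric relies on.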