Vocal Emotion Recognition with Cochlear Implants Xin Luo, Qian-Jie Fu, John J. Galvin III. Presentation by Archie Archibong.

What is a Cochlear Implant? The cochlear implant (CI) is an electronic hearing device that has restored hearing sensation to many deafened individuals.

How does it work? Contemporary CI devices use spectrally-based speech-processing strategies in which the temporal envelope is extracted from a number of frequency analysis bands and used to modulate pulse trains of electrical current delivered to the corresponding electrodes.
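As a rough illustration of that pipeline, here is a minimal Python sketch of a single analysis channel: band-pass filter, envelope extraction, and an envelope-modulated fixed-rate pulse train standing in for the current pulses sent to one electrode. The band edges, envelope cutoff, and pulse rate are illustrative assumptions, not parameters of any actual device.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def electrode_drive(signal, fs, band=(1000.0, 2000.0),
                    env_cutoff=200.0, pulse_rate=900.0):
    """One CI-style analysis channel (toy version): band-pass filter,
    temporal-envelope extraction, and an envelope-modulated fixed-rate
    pulse train standing in for one electrode's current pulses."""
    bp = butter(4, band, btype="band", fs=fs, output="sos")
    lp = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    # Envelope via rectification followed by low-pass filtering.
    env = sosfilt(lp, np.abs(sosfilt(bp, signal)))
    pulses = np.zeros(len(signal))
    pulses[::int(fs / pulse_rate)] = 1.0   # fixed-rate pulse train
    return env * pulses                    # envelope-modulated pulses
```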

Introduction to the Study Abstract: Spoken language lets speakers relay linguistic information as well as convey other important prosodic cues (speech rhythm, intonation). Prosodic speech cues can convey information about the speaker's emotion, and recognizing the speaker's emotion can greatly contribute to speech understanding. Existing studies show that CI users struggle to recognize emotional speech.

Introduction to the Study This study investigated CI users' ability to recognize vocal emotions in acted emotional speech, given CI patients' limited access to pitch information and spectro-temporal fine structure cues. Vocal emotion recognition was also tested in NH subjects listening to unprocessed speech and to speech processed by acoustic CI simulations. (Different amounts of spectral resolution and temporal information were tested to examine the relative contributions of spectral and temporal cues to vocal emotion recognition.)

Methodology & Procedure – Subjects: 6 CI and 6 normal-hearing (NH) listeners with an even gender breakdown (all native English speakers). – All NH subjects had pure-tone thresholds better than 20 dB HL at octave frequencies from 125 to 8000 Hz in both ears (essentially, they could hear just fine). – All CI users were post-lingually deafened; 5 of the 6 CI subjects had at least a year of experience with their device. – The three devices used: Nucleus-22, Nucleus-24, and Freedom.

Methodology & Procedure – An emotional speech database was recorded for the study, in which one male and one female speaker each produced 5 simple sentences according to 5 target emotions (neutral, anxious, happy, sad, and angry). – The same sentences were used to convey all target emotions in order to minimize contextual cues (forcing the listener to focus on the acoustic cues). – Sentences used: (list shown on the original slide).

Methodology & Procedure The CI subjects' vocal emotion recognition was tested using only unprocessed speech. The NH subjects' vocal emotion recognition was tested both with unprocessed speech and with speech processed by acoustic sine-wave vocoder CI simulations. (Testing NH listeners in both conditions allows their simulation performance to be compared against both their own unprocessed-speech baseline and the CI users' results.) (Slide image: example of 4-channel speech processing for the CI simulation.)
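The slide image is not reproduced here, but a toy sine-wave vocoder of the kind described above can be sketched in a few lines of Python. The channel count and envelope filter cutoff correspond to the two parameters the study varied (e.g., 4 channels with a 50-Hz cutoff); the 200-7000 Hz analysis range and logarithmic band spacing are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def sine_vocoder(signal, fs, n_channels=4, env_cutoff=50.0):
    """Toy acoustic CI simulation: split speech into n_channels bands,
    extract each band's temporal envelope (low-pass at env_cutoff),
    and use it to modulate a sine carrier at the band's center."""
    edges = np.geomspace(200.0, 7000.0, n_channels + 1)  # assumed range
    lp = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    t = np.arange(len(signal)) / fs
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        bp = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        env = sosfilt(lp, np.abs(sosfilt(bp, signal)))  # temporal envelope
        out += env * np.sin(2 * np.pi * np.sqrt(lo * hi) * t)  # sine carrier
    return out
```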

Methodology & Procedure Subjects were seated in a double-walled, sound-treated booth and listened to the stimuli presented in free field over a single loudspeaker. Interesting: the presentation level (65 dBA) was calibrated according to the average power of the "angry" emotion produced by the male talker. In each trial, a sentence was randomly selected (without replacement) from the stimulus set and presented to the subject; subjects responded by clicking on one of the five response choices shown on screen with labels (e.g., "neutral" or "angry"). No feedback or training was provided. Responses were collected and scored in terms of percent correct, with at least two runs for each experimental condition.
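A minimal sketch of that procedure, assuming a hypothetical `stimuli` list of (sentence, target-emotion) pairs and a `get_response` stand-in for the on-screen five-alternative response interface:

```python
import random

def run_block(stimuli, get_response):
    """One test run: sentences drawn at random without replacement,
    five-alternative forced choice, scored as percent correct."""
    emotions = ["neutral", "anxious", "happy", "sad", "angry"]
    correct = 0
    for sentence, true_emotion in random.sample(stimuli, len(stimuli)):
        response = get_response(sentence, emotions)  # no feedback given
        correct += (response == true_emotion)
    return 100.0 * correct / len(stimuli)
```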

Results (Slide figure, unprocessed speech: vocal emotion recognition performance for CI users (filled squares) and NH listeners (filled triangles); mean NH performance across subjects with the CI simulations is shown as a function of the number of spectral channels.)

Results NH: With unprocessed speech, mean NH performance was 90% correct. CI: With unprocessed speech, mean CI performance was just 45% correct. To note: although there was large inter-subject variability in each subject group, CI performance was significantly better than the chance level.

Results NH: With the acoustic CI simulations, NH subjects' vocal emotion recognition performance was significantly affected by both the number of spectral channels and the temporal envelope filter cutoff frequency. With the 50-Hz envelope filter, performance significantly improved as the number of spectral channels was increased (with the exceptions of the steps from 1 to 2 and from 4 to 8 channels). With the 500-Hz envelope filter, performance significantly improved only when the number of spectral channels was increased from 1 to more than 1, and from 2 to 16 channels.

Results NH performance with unprocessed speech was significantly better than performance in any of the CI simulations, with the exception of the 16-channel simulation with the 500-Hz envelope filter. Why is that?

Conclusions Results showed that both spectral and temporal cues significantly contributed to performance. With the 50-Hz envelope filter, performance generally improved as the number of spectral channels was increased from 1 to 2, and then from 2 to 16 channels. For all but the 16-channel simulations, increasing the envelope filter cutoff frequency from 50 Hz to 500 Hz significantly improved performance. CI users' vocal emotion recognition performance was statistically similar to that of NH subjects listening to 1-8 spectral channels with the 50-Hz envelope filter, and to 1 channel with the 500-Hz envelope filter. This suggests that even though spectral cues may contribute more strongly to the recognition of linguistic information, temporal cues may contribute more strongly to the recognition of emotional content coded in spoken language.