

A study on Prediction on Listener Emotion in Speech for Medical Doctor Interface M.Kurematsu Faculty of Software and Information Science Iwate Prefectural University

Virtual Medical Doctor System
What is a Virtual Medical Doctor System?
– It diagnoses people like a human doctor
– It interacts with people like a person
What should the speech interface module do?
– Speech Recognition: understand what people say; estimate the emotion in their speech
– Speech Synthesis: tell the diagnosis and ask questions; express emotion in speech
Research Target

How to Estimate Emotion in Speech (1): Conventional Approach in Existing Work
Learning Phase
Step 1. Collect human speeches. Each speech consists of sound data and the emotion label given by the speaker.
Step 2. Feature selection.
Step 2-1. Extract speech features.
Step 2-2. Calculate statistical values.
Step 3. Build classifiers over the speech features, extracting the relation between emotion and speech features.
Estimation Phase
Step 1. Record a human speech.
Step 2. Feature selection: extract speech features and calculate statistical values.
Step 3. Estimate the emotion using the classifiers.
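The learning and estimation phases above can be sketched end to end. This is a minimal illustration only: the pitch tracks are invented, and a toy nearest-centroid rule stands in for whatever classifier a real system would train.

```python
import statistics

def extract_features(pitch_track):
    """Step 2: reduce a per-frame pitch track (Hz) to fixed-length statistics
    (mean and max, the classic choices mentioned on the slide)."""
    return {"mean_pitch": statistics.mean(pitch_track),
            "max_pitch": max(pitch_track)}

def train_centroids(labelled_tracks):
    """Step 3 (learning phase): a toy nearest-centroid stand-in for a real
    classifier; it stores the mean pitch per emotion."""
    return {emotion: statistics.mean(extract_features(t)["mean_pitch"]
                                     for t in tracks)
            for emotion, tracks in labelled_tracks.items()}

def estimate_emotion(centroids, pitch_track):
    """Step 3 (estimation phase): pick the emotion whose centroid is closest
    to the new utterance's mean pitch."""
    mean_pitch = extract_features(pitch_track)["mean_pitch"]
    return min(centroids, key=lambda e: abs(centroids[e] - mean_pitch))

# Invented toy pitch tracks (Hz) for two emotions.
data = {"angry": [[220, 240, 260], [230, 250, 270]],
        "calm":  [[110, 115, 120], [105, 112, 118]]}
model = train_centroids(data)
print(estimate_emotion(model, [225, 245, 255]))  # prints: angry
```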

How to Estimate Emotion in Speech (2): Our Approach
We modify the Learning Phase of the conventional approach.
1. Collect Human Speeches. Points:
+ Use emotion labels given by listeners
+ Use synthetic speech in place of human speech
2-1. Feature Selection / Extract Speech Features. Points:
+ Focus on features within each syllable
2-2. Feature Selection / Calculate Statistical Values. Points:
+ Calculate quartiles and the interquartile range
+ Calculate the coefficients of a regression formula
3. Build Classifiers. Points:
+ Build a set of classifiers, with one classifier per emotion
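The new statistics of step 2-2 are easy to state precisely. The sketch below computes quartiles, the interquartile range, and the slope coefficient of a least-squares regression line over a per-syllable feature track; the syllable segmentation itself is assumed to have been done elsewhere.

```python
import statistics

def quartile_features(values):
    """Step 2-2: quartiles and interquartile range of a per-syllable
    feature track (e.g. the pitch values for the frames of one syllable)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # Q1, median, Q3
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}

def regression_slope(values):
    """Step 2-2: coefficient of a least-squares line fitted against the
    frame index, a rough summary of the contour's direction in a syllable."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = statistics.mean(values)
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den
```

The set of per-emotion classifiers in step 3 would then each consume these per-syllable statistics as input features.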

Evaluation of Our Approach, Based on Experiments
Modification | Evaluation
Step 1: Use emotion labels given by listeners | Weak (depends on the listener)
Step 1: Use synthetic speech in place of human speech | Weak (useful only for some emotions)
Step 2: Focus on features in each syllable | Weak (useful only for some emotions)
Step 2: Calculate quartiles & interquartile range | Maybe good
Step 2: Calculate regression coefficients | Not good
Step 3: Build a set of classifiers | Good
We should continue to improve this module.

Future Work on Estimation
Collecting speech:
– Subdivide emotions by expression pattern
– Collect more speeches (radio, TV, movies, etc.)
Feature selection:
– Consider other features, e.g. autocorrelation and the LPC spectrum
– Consider other statistical values, e.g. correlations between speech features
Building classifiers:
– Try other machine learning methods, e.g. bagging
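Bagging, the method named above as future work, trains each base model on a bootstrap resample of the data and predicts by majority vote. A minimal two-class sketch, using a one-dimensional threshold stump as the base learner (the stump is invented here; the slides do not specify a base model):

```python
import random
import statistics
from collections import Counter

def train_stump(samples):
    """Base learner: a threshold on one feature, placed at the midpoint of
    the two class means. Assumes at most two labels."""
    labels = {label for _, label in samples}
    if len(labels) == 1:                       # degenerate bootstrap sample
        only = labels.pop()
        return lambda x: only
    means = {label: statistics.mean(x for x, l in samples if l == label)
             for label in labels}
    (low_label, _low), (high_label, _high) = sorted(
        means.items(), key=lambda item: item[1])
    threshold = (means[low_label] + means[high_label]) / 2
    return lambda x: low_label if x < threshold else high_label

def bagging(samples, n_models=25, seed=0):
    """Train each stump on a bootstrap resample; predict by majority vote."""
    rng = random.Random(seed)
    models = [train_stump([rng.choice(samples) for _ in samples])
              for _ in range(n_models)]
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]
```

For example, with low-pitched "calm" samples and high-pitched "angry" samples, the ensemble's vote follows the side of the learned thresholds.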

How Does the System Express Emotion in Speech? Our Approach
Adjust speech features to the emotion, based on relations between emotion and features.
– Speech features: pitch, volume, speed, etc.
– A relation specifies how the system changes speech features to express an emotion.
How do we obtain the relations?
– A developer defines them from experience, or
– They are extracted from speech and the emotions estimated by people: people listen to speeches and judge the emotion.

How We Built Our Speech Module
Extracting relations between emotion and speech features:
1. Synthesize speeches whose features differ from one another. To synthesize them, we use SMARTALK by OKI Co., with a different parameter set for each synthesized speech. Parameters = { volume, speed, pitch, intonation }.
2. People listen to the synthetic speeches and report the emotion they hear. 14 men and 10 women answered.
3. Define parameter sets as relations. We select the parameter sets for which most listeners reported the same emotion, keeping 3 parameter sets per emotion.
Synthesizing speech to express an emotion:
– We give a phrase and an emotion to the module.
– The module selects a relation (a parameter set) and applies it.
– The module synthesizes the speech.
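The synthesis steps above amount to a lookup from emotion to a stored parameter set. In the sketch below, the emotion names and parameter values are invented for illustration; the real sets were chosen from the listening test with the SMARTALK engine and are not given on the slides.

```python
# Hypothetical relations: emotion -> stored parameter sets
# (volume, speed, pitch, intonation). Values are illustrative only.
RELATIONS = {
    "joy":     [{"volume": 110, "speed": 105, "pitch": 120, "intonation": 110}],
    "anger":   [{"volume": 130, "speed": 110, "pitch": 105, "intonation": 120}],
    "sadness": [{"volume": 80,  "speed": 90,  "pitch": 90,  "intonation": 90}],
}

def synthesize(phrase, emotion):
    """Select a stored parameter set for the emotion and pair it with the
    phrase. A real module would pass both to the TTS engine
    (Smartalk.OCX or SAPI); here we just return the request."""
    params = RELATIONS[emotion][0]  # the slides keep 3 sets per emotion
    return {"phrase": phrase, **params}
```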

Snapshot of Our System
Text box: input a phrase.
Synthesize buttons: synthesize speech with the emotion written on the button; [SPEAK] means no emotion is expressed.
Development environment:
+ OS: Windows XP SP3
+ Language: Visual C
+ Library: Smartalk.OCX or SAPI

Future Work on Synthesis
Improving the relations (parameter sets):
– People evaluated this module. We demonstrated it in a local museum and asked: "Does the synthesized speech sound like human speech?" Answers: Yes = 50, Moderately yes = 147, Moderately no = 133, No = 27.
– To make synthesized speech more human-like, we need to improve the relations: change other parameters, add variety to the parameters, add stress and pauses, etc.

Appendix
The next slide was shown at the workshop; its content was already covered in the preceding slides.

Estimating Emotion in Speech: Well-Known Approach vs. Our Approach
Step | Well-known approach | Our approach
Speech data | Intentional human speech | Synthesized voice
Analysis (measure) | Pitch & power | + Difference / ratio
Estimate features | Mean, max | + SD, kurtosis, skewness
Making classifier | A single classifier for all emotions | A classifier for each emotion