Presented by Ravi Kiran. Authors: Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, Cynthia Girand, Martin Graciarena, Andreas Kathol, Laura Michaelis, Bryan Pellom, Elizabeth Shriberg, Andreas Stolcke.

Problem. Solution. Results. Conclusion.

Problem Definition. Applications. Challenges.

Goal: distinguish deceptive from non-deceptive speech using machine learning techniques, automatically or semi-automatically extracted features, and a large corpus of annotated speech.

Government and corporations: law enforcement, military tasks, insurance companies, and others. Greater impact for the common man? Escape into a dream world? Authenticity of bargain-market deals and antique-shop deals. The typical movie scene: "Do you like me?"

Task difficulty – how well can we do? Let's see a couple of samples. What is deception? Is it a state of mind? It causes emotions like fear and elation. Others? Detecting emotions → detecting deception? There is an absence of work on automatic (ML-based) deception detection in speech: prior work covers detecting deception using visual and lexical cues, statement analysis, and non-automatic detection of deception in speech. There is also an absence of labeled speech corpora for ML use.

A typical ML-based task: a labeled speech corpus, feature extraction, and training a classifier.
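
A minimal sketch of this pipeline in Python with scikit-learn, assuming features have already been extracted from the labeled corpus; the file names and the use of logistic regression are placeholders, not the classifier used in the study:

```python
# Minimal sketch of the supervised pipeline: labeled corpus -> features -> classifier.
# X and y are assumed to be pre-extracted; the .npy files are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = np.load("features.npy")   # one row of extracted features per utterance (hypothetical file)
y = np.load("labels.npy")     # 1 = deceptive, 0 = non-deceptive, from corpus annotation

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```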

Some challenges. Method. Some challenges solved. Annotation and post-processing. Some interesting questions.

Any new method designed to collect data for this task has to consider a few things. The bogus-pipeline effect. Control of the information revealed to subjects before the conclusion of the experiment. Other implications for designing experiments? Deception might be an individualized phenomenon, requiring both deceptive and non-deceptive samples from each speaker to build speaker-specific models. Other implications for designing experiments?

Deception is not easy; it makes you think. So why do it? (Motivation plays an important role.) High stakes (shame, fear) → good results, but that is not possible presently. Financial reward – depends on the reward. A self-presentational perspective is needed to deceive well. Any others? What is deception? A factually wrong statement ("The world, as you know, will end in …"). A statement prompting an incorrect deduction or inference. An equivocal sentence. Example? Common in courts? Any other challenges?

32 native English speakers. Phase 1: tasks (activities and Q&A) in 6 different areas; task difficulty manipulated – 2G, 2P, 2S; results compared with a "Top Entrepreneurs" profile. Phase 2: a $100 reward for convincing the interviewer they did better; the interviewer asked questions (with constraints); subjects answered the questions and indicated the truthfulness of each answer.

Post-processing: audio data was orthographically transcribed; sentence-like units in the transcription were labeled; the transcription was automatically aligned with the audio. Annotation: units of data (words, slash units, breath groups, turn units); deception labels (Little Lie and Big Lie). Psychological motivation? Uses?

Corpus Collection - Post-Processing.

Individualized phenomenon: manipulation of task difficulty yields deceptive and non-deceptive speech for every user. Motivation: profiles compared with "Top Entrepreneurs"; a financial reward of $100. Types of deception: self-reported Little Lie (factual deception); inferred Big Lie (task-oriented deception). Any others?

Little vs. Big Lie: any psychological motivation? Uses? (It hasn't been used in the ML experiment.) Interviews: is the interviewer a trained psychologist? He selects the questions dynamically, so might a trained person help? Effect of the interview-based style on deception: the interviewer was asking questions that would definitely prompt deception; how does this affect the deception data collected?

Feature selection – how? Corpus analysis. Features selected for the ML task.

How do we determine which features to explore? Existing work to the rescue. Lexical features: patterns of word usage; filled pauses; deception → emotion → word choices? Non-lexical features: a significant increase in pitch for deceptive speech; emotions might affect energy changes; deception is a speaker-dependent phenomenon. Any others?

Lexical features. Acoustic and Prosodic features.

Word usage patterns (LIWC): the ratio of positive-emotion words is higher in deceptive speech than in non-deceptive speech. Demo. To explore: word count, causation-related items. Filled pauses: correlation with Little Truth > Little Lie; no significant correlation for Big Truth and Big Lie. Emotions experienced (DAL): the pleasantness dimension is most promising; an increase in the minimum pleasantness score → deception.
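
As a toy illustration of these lexical features: LIWC and DAL are licensed resources, so the tiny word lists below are stand-ins for the real categories, not what the study used.

```python
# Toy stand-ins for LIWC positive-emotion words and filled-pause tokens.
POSITIVE_WORDS = {"good", "great", "happy", "glad", "excellent"}
FILLED_PAUSES = {"um", "uh", "er", "mm"}

def lexical_features(tokens):
    """Return (positive-emotion word ratio, filled-pause ratio) for one SU."""
    n = max(len(tokens), 1)
    pos = sum(t.lower() in POSITIVE_WORDS for t in tokens)
    fp = sum(t.lower() in FILLED_PAUSES for t in tokens)
    return pos / n, fp / n

print(lexical_features("um I did really great on that task".split()))
```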

150 features for each segmented SU (sentence-like unit). Pitch: features extracted on the voiced region in 3 forms; derived: max, min, mean, range of pitch, etc.; normalized by 5 different approaches. Energy: raw energy of the SU; raw energy of the voiced region in 3 forms; derived: max, min, mean, range of energy, etc. Durational: max and average duration in the SU, raw and normalized.
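
A rough sketch of how the raw pitch and energy statistics above could be computed with librosa; the study's exact 150-feature set, its three "forms", and its five normalization schemes are not reproduced here, and the function name is hypothetical.

```python
# Sketch: max/min/mean/range of pitch (voiced region) and frame energy for one SU.
import numpy as np
import librosa

def pitch_energy_features(wav_path):
    y, sr = librosa.load(wav_path, sr=None)
    # pYIN pitch tracking; keep only voiced frames.
    f0, voiced, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr)
    f0 = f0[voiced]
    rms = librosa.feature.rms(y=y)[0]    # frame-level energy
    feats = {}
    for name, track in (("f0", f0), ("energy", rms)):
        feats[f"{name}_max"] = float(np.nanmax(track))
        feats[f"{name}_min"] = float(np.nanmin(track))
        feats[f"{name}_mean"] = float(np.nanmean(track))
        feats[f"{name}_range"] = feats[f"{name}_max"] - feats[f"{name}_min"]
    return feats
```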

Lexical, acoustic, prosodic, and speaker-dependent features were used for the ML task. Some lexical features selected: POS and word features (presence of cue phrases, verb tense, etc.); bag of words; presence of positive and negative emotion words; whether the utterance is a syntactic question; number of word repetitions from the interviewer's query; disfluency or laughter in the phrase. All non-lexical features (previous slide) were selected.

Some speaker-dependent features: subject ID; gender; subject-dependent ratios – the ratio of filled pauses in Little Truth vs. Little Lie, the ratio of laughter in Little Truth vs. Little Lie, the ratio of cue phrases in Little Truth vs. Little Lie, the ratio of phrases containing filled pauses to all phrases, the ratio of phrases containing laughter to all phrases, and the ratio of phrases containing cue words to all phrases. Any other interesting features?
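
The per-speaker ratio features reduce to grouped means over boolean phrase-level flags; a small pandas sketch, where the column names and the toy DataFrame are hypothetical:

```python
# Per-speaker ratios: the mean of a boolean column within a speaker's phrases
# is exactly the ratio of that speaker's phrases containing the event.
import pandas as pd

df = pd.DataFrame({
    "speaker": ["s1", "s1", "s1", "s2", "s2"],
    "has_filled_pause": [True, False, True, False, False],
    "has_laughter": [False, False, True, True, False],
    "has_cue_word": [True, True, False, False, True],
})

ratios = df.groupby("speaker")[
    ["has_filled_pause", "has_laughter", "has_cue_word"]].mean()
print(ratios)
```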

Input instances were 9491 SUs. The labels (classes) were Little Lie and Little Truth. The Ripper algorithm was used; the outputs were rules for each class. A 90%/10% training/test split was used. The baseline is the proportion of majority-class ("true") instances. Results were averaged over 5 trial runs.
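
Ripper is not part of scikit-learn; the sketch below uses the third-party `wittgenstein` package (pip install wittgenstein), which provides a RIPPER implementation with a scikit-learn-like interface. The feature table `csc_features.csv` and its `is_lie` label column are hypothetical placeholders for the study's feature set.

```python
# Sketch of the protocol: RIPPER rule induction, 90/10 split, averaged over 5 runs.
import pandas as pd
import wittgenstein as lw
from sklearn.model_selection import train_test_split

df = pd.read_csv("csc_features.csv")     # hypothetical feature table, one row per SU

errors = []
for seed in range(5):
    train, test = train_test_split(df, test_size=0.1, random_state=seed)
    clf = lw.RIPPER(random_state=seed)
    clf.fit(train, class_feat="is_lie", pos_class=True)
    acc = clf.score(test.drop(columns="is_lie"), test["is_lie"])
    errors.append(1 - acc)

print("mean error %:", 100 * sum(errors) / len(errors))
print(clf.ruleset_)                      # induced rules for the positive class
```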

All results are error rates (%). Baseline: 39.8%. Acoustic and prosodic features only: 38.5%, with rules based on energy and F0 features. Lexical features only: 39%, with rules based on word repetitions from the query, verb tense, presence of filled pauses, and presence of the word "poor" in the SU. Acoustic and prosodic + lexical features: 37.2%; the rules were dominated by acoustic and prosodic features. Acoustic and prosodic + lexical + speaker-dependent features: 33.6%. What about speaker-dependent features only?
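
For reference, a majority-class baseline error like the 39.8% above follows directly from the label distribution; a minimal sketch:

```python
# Majority-class baseline: always predict the most frequent label.
# An error of 39.8% implies roughly 60.2% of instances are "true".
from collections import Counter

def majority_baseline_error(labels):
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1 - majority_count / len(labels)

print(majority_baseline_error(["true"] * 60 + ["lie"] * 40))  # 0.4
```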

Creation of the CSC corpus, with within-speaker deceptive and non-deceptive speech. Preliminary analysis: positive-emotion words, filled pauses, and pleasantness scores differ significantly between the two classes. Speaker-independent and speaker-dependent features: speaker-dependent features are very important, so there will be greater focus on these techniques in the future. An ML-based classification task was carried out.