Vocalic Markers of Deception and Cognitive Dissonance for Automated Emotion Detection Systems Dr. Aaron C. Elkins The University of Arizona.

Presentation transcript:

1 Vocalic Markers of Deception and Cognitive Dissonance for Automated Emotion Detection Systems Dr. Aaron C. Elkins The University of Arizona

2 Emotional Voice

3 Can computers perceive vocal emotion? Yes, but: the science of the emotional voice is young; communication is complex and dynamic; moods and emotions switch with context; emotion is computationally ill-defined. Measuring emotion may inform theory.

4 Emotional Dimensions: DISGUST?

5 Four Components of Speech. Voiced vs. unvoiced sounds: [v] vs. [f]. Airstream through mouth or nose: [m] vs. [o].

6 Speech Sounds: (1) pitch, (2) loudness, and (3) quality. Sound is small variations in air pressure that occur in rapid succession. The vocal folds superimpose their vibration on the outgoing air of voiced sounds. The vocal folds vibrate to create a periodic vibration (100–250 Hz). We measure these features digitally.

7 Recording Father – Digital Audio. The waveform measures the pulses of the vocal folds, based on air pressure disturbance (dB). Voiced vs. unvoiced (low pressure). A peak occurs every 100th of a second (100 Hz).
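The relationship between peak spacing and pitch on this slide can be sketched in code. The example below is a minimal illustration, not the method used in the studies: it synthesizes a pure 100 Hz tone and recovers its fundamental frequency by picking the autocorrelation peak. The function name `estimate_f0`, the 8000 Hz sample rate, and the 75–300 Hz search range are all assumptions chosen for the example.

```python
import math


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=300.0):
    """Estimate fundamental frequency (Hz) from the autocorrelation peak."""
    min_lag = math.ceil(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate // f_min)       # longest period considered
    window = len(samples) - max_lag           # same number of terms for every lag
    if window <= 0:
        raise ValueError("signal too short for the requested pitch range")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


SAMPLE_RATE = 8000  # Hz; assumed for this synthetic example
# 0.1 s of a pure 100 Hz tone standing in for a voiced sound.
tone = [math.sin(2 * math.pi * 100.0 * n / SAMPLE_RATE) for n in range(800)]

f0 = estimate_f0(tone, SAMPLE_RATE)
period_s = 1.0 / f0  # time between successive peaks
print(f"estimated pitch: {f0:.1f} Hz, period: {period_s:.4f} s")
```

Real speech is noisier than a pure tone, so practical pitch trackers add windowing, voicing decisions, and octave-error handling on top of this idea.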

8 Vowel Articulation. Source-Filter Theory (Müller, 1848): the vocal folds vibrate at the same speed (pitch), while resonance changes in the vocal tract filter frequencies (formants).

9 Vocalics. Vocalic analysis examines how it was said: amplitude, pitch (frequency), response latency, tempo. Linguistics examines what was said.

10 Sound Production is Complex. When we tense our muscles, such as during stress, our larynx tenses, producing a higher pitch. The process is complex, and emotions affect its normal operation. Deception takes away cognitive resources and is stressful: more mistakes, lower quality, increased average and variation in pitch. Sympathetic nervous system response: increased auditory acuity, heightened arousal.

11 Standard Vocal Measures Calculated with Praat and Custom Signal Processing Software

12 Nemesysco LVA 6.50 Commercial Vocalic Software Evaluated

13 Five Vocalic Studies Summarized: Study One (Deception Experiment); Study Two (Cognitive Dissonance); Study Three (Embodied Conversational Agent and Trust); Study Four (Embodied Conversational Agent Security Screening - Bomber); Study Five (Embodied Conversational Agent Security Screening - Imposter).

14 Vocal Deception (Study 1) – Experimental Design. N = 96. $10 reward for appearing credible to a professional interviewer. Two sequences: first sequence DT DDTT TD TTDD T; second sequence DT TTDD TD DDTT T. 13 short-answer questions; only 8 had variation both within and between subjects. Two types of questions: charged and neutral.

15 Results. Built-in classification performed at chance level. Vocal measures independent of the system discriminated deception: FMain, AVJ, and SOS. Possible latent variables measuring conflicting thoughts, cognitive effort, and emotional fear. Logistic regression performed best on charged questions. Higher pitch, cognitive effort, and hesitations are predictive of deception in more stressful interactions. The claim that the vocal analysis software measures stress, cognitive effort, or emotion cannot be completely dismissed. Deception and stress can be predicted by acoustic measures of voice quality and pitch when controlling for speaker characteristics.
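One common way to control for speaker characteristics is to standardize each measure within speaker, so that a naturally high-pitched voice is not mistaken for an aroused one. The slides do not say how the control was implemented, so the sketch below is only an illustration under that assumption; the function name `zscore_by_speaker` and the pitch values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_by_speaker(pitch_by_speaker):
    """Convert each speaker's pitch values (Hz) to within-speaker z-scores."""
    normalized = {}
    for speaker, values in pitch_by_speaker.items():
        mu = mean(values)
        sigma = pstdev(values)
        # A speaker with no variation gets zeros rather than a division error.
        normalized[speaker] = [
            (v - mu) / sigma if sigma > 0 else 0.0 for v in values
        ]
    return normalized


# Hypothetical mean pitch per response (Hz); not data from the study.
pitch_hz = {
    "speaker_a": [110.0, 118.0, 114.0],
    "speaker_b": [205.0, 221.0, 213.0],
}
normalized_pitch = zscore_by_speaker(pitch_hz)
for speaker, zs in normalized_pitch.items():
    print(speaker, [round(z, 2) for z in zs])
```

After normalization the two hypothetical speakers are on the same scale, so a rise relative to a speaker's own baseline can be compared across voices with very different average pitch.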

16 Vocal Dissonance (Study 2) – Experimental Design. Modified induced-compliance paradigm. Participants (N = 52) made two vocal counter-attitudinal arguments for cutting funding for services for the disabled. Choice is manipulated, high vs. low (IV): high N = 24, low N = 28. Participants report their attitude towards the argument issue (DV).

17 Arousal (Vocal Pitch). High choice had a 10 Hz higher pitch, F(1,50) = 4.43, p = .04. All participants reduced their pitch over time, F(1,50) = 4.90, p = .03.

18 Cognitive Difficulty. High choice had nearly 2x the response latency on argument two, F(1,50) = 4.53, p = .04. Arousal moderation.

19 Cognitive Difficulty. Participants spoke with 33% more nonfluencies on the second argument, F(1,50) = 4.03, p = .05.
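The slides report F statistics without effect sizes. As a rough aid to reading them, partial eta squared can be recovered from an F value and its degrees of freedom using the identity F·df_effect / (F·df_effect + df_error). The values computed below are derived here from the reported statistics and are not figures given in the presentation; the dictionary keys are descriptive labels made up for this sketch.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(1,50) values as reported on slides 17-19; labels are hypothetical.
reported_tests = {
    "pitch_high_vs_low_choice": 4.43,
    "pitch_decrease_over_time": 4.90,
    "response_latency_argument_two": 4.53,
    "nonfluencies_second_argument": 4.03,
}

effect_sizes = {
    name: partial_eta_squared(f_value, df_effect=1, df_error=50)
    for name, f_value in reported_tests.items()
}
for name, eta in effect_sizes.items():
    print(f"{name}: partial eta squared = {eta:.3f}")
```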

20 The Importance of Language (Imagery as Abstract Language)

21 Vocal Dissonance Model. χ²(1, N = 51), p = .49; SRMR = .02; R²: Attitude Change = .17, Imagery = .11

22 From the lab to the AVATAR

23 First Kiosk

24 Kiosk from Last Year

25 Third-Generation Kiosk

26 Gender and Demeanor

27 Vocal Trust (Study 3) – Experimental Design. Participants completed a pre-survey, packed a bag before the ECA screening interview, and completed security screening. All responses to the ECA were recorded for vocal analysis.

28 ECA Demeanor and Gender. Question Block 1, Question Block 2, Question Block 3, Question Block 4. Repeated-measures Latin square design. All participants interacted with all demeanor and gender ECA combinations. 4 questions per block, 16 total questions. N = 88 participants (53 males, 35 females).

29 Trust and Time. Multilevel growth model specified with trust as the DV (N = 218) and subject as a random effect (N = 60). Main effects: initial trust = 4.09. Trust rate of change: .04 increase per second, p < .01. Duration: .05 decrease in trust for every second spent answering the ECA over the 7.6-second average, p < .001.

30 Vocal Pitch, Time, and Trust. Main effect of pitch: for every 1 Hz increase in pitch over 156 Hz, trust drops by .01, p = .03. Interaction of pitch and time: Pitch x Time b = 9.3e-05, p = .03. Over time, pitch predicts trust less and less.

31 Results. Human perceptions of trust transfer to the ECA. Time plays an important role in the interaction. All participants trusted the ECA more over time, particularly when it smiled: 48 increase in trust when ECA smiles. Vocal measures of pitch predicted trust, but only early on. For every 1 Hz increase in pitch over 156 Hz, trust drops by .01. Over time, pitch predicts trust less and less.
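To see what "pitch predicts trust less and less" means numerically, the two reported coefficients can be combined into a time-varying slope. This is a sketch under stated assumptions, not the fitted model: it assumes time is measured in seconds (as on the preceding slide), that the interaction is linear, and it uses the rounded coefficients from the slide, so the crossover point is only approximate.

```python
# Coefficients as reported on the slide (rounded).
PITCH_SLOPE = -0.01       # change in trust per 1 Hz of pitch above 156 Hz
PITCH_BY_TIME = 9.3e-05   # change in that slope per unit of time


def pitch_effect_on_trust(elapsed_time):
    """Slope of trust on pitch at a given elapsed time (assumed seconds)."""
    return PITCH_SLOPE + PITCH_BY_TIME * elapsed_time


# Time at which the combined slope reaches zero under these assumptions.
crossover_time = -PITCH_SLOPE / PITCH_BY_TIME

for t in (0, 30, 60, 90):
    print(f"t = {t:>3}: trust change per Hz = {pitch_effect_on_trust(t):+.4f}")
print(f"slope reaches zero near t = {crossover_time:.1f}")
```

Under these assumptions the negative pitch slope shrinks steadily toward zero as the interaction proceeds, which matches the slide's summary that pitch matters mainly early on.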

32 Vocalics of a Bomber (Study 4) – Experimental Design. 29 EU border guards were randomly assigned to build a bomb (N = 16) or to a control condition (N = 13), then pack a bag. Identical to Study 3, but with no breaks in the interview. Only the male, neutral-demeanor ECA interviewed participants. Bomb makers were instructed to successfully smuggle the bomb past the ECA.

33 Vocal Analysis. Recorded responses to the question: “Has anyone given you a prohibited substance to transport through this checkpoint?” Average response 2.68 sec (SD = 1.66). Responses such as “No” or “of course not”. Vocal measures of pitch and pitch variation.

34 Results of Vocal Pitch. Voice quality, gender, and intensity included as covariates. No difference in mean vocal pitch, F(1,22) = 0.38, p = .54. Main effect of pitch variation: bomb makers had 25.34% more variation, F(1,22) = 4.79, p = .04.

35 Pitch Contours

36 Eye Gaze: Guilty

37 Eye Gaze: Innocent

38 Vocalics of an Imposter (Study 5) – Experimental Design. 38 EU border guards. All were required to present a visa and passport through multiphase screening: e-gate, manual processing, AVATAR screening interview. Four randomly assigned imposters carried false documents with hostile intentions through screening.

39 AVATAR Interaction Example

40 iPad Output for Screener

41 Voice Quality Change from Baseline Question (What is your full name?)

42 Vocalic Classification Model

43 Vocalic Resulting Classification. 7 innocents falsely classified as terrorists. 27 correctly classified as innocent. All “guilty” referred to secondary. Overall accuracy = 81%, TPR = 100%, TNR = 79%, FPR = 20%, FNR = 0%.
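The rates on this slide follow from a standard confusion matrix. The sketch below assumes the four imposters described in the Study 5 design are the positive class and that all four were flagged, as the slide states; the function name `classification_rates` is made up for the example.

```python
def classification_rates(tp, fn, fp, tn):
    """Standard rates from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "tpr": tp / (tp + fn),  # true positive rate (sensitivity)
        "tnr": tn / (tn + fp),  # true negative rate (specificity)
        "fpr": fp / (fp + tn),  # false positive rate
        "fnr": fn / (tp + fn),  # false negative rate
    }


# Counts from the slide: 4 imposters flagged, 7 innocents flagged, 27 cleared.
rates = classification_rates(tp=4, fn=0, fp=7, tn=27)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Because only four participants were imposters, the TPR and FNR rest on very few cases, while the TNR and FPR are driven by the 34 innocent participants.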

44 Eye Fixations on Visa

45 Date of Birth Results – Correct?

46 Final Decision Model

47 Vocalic Resulting Classification. 3 innocents falsely classified as terrorists; one of these three was actually lying, so actually a true positive. 31 correctly classified as innocent. All “guilty” referred to secondary. Overall accuracy = 94.47%, TPR = 100%, TNR = 88.24%, FPR = 5.8% (reduced by 3/4), FNR = 0%.

48 Questions? Isn’t the voice amazing?

