
1 HUMAINE - WP5, Belfast04: Experience with emotion labelling in naturalistic data – L. Devillers, S. Abrilian, JC. Martin, LIMSI-CNRS, Orsay; E. Cowie, C. Cox, I. Sneddon, QUB, Belfast

2 QUB - LIMSI
QUB and LIMSI are developing complementary approaches (coding schemes and tools) for annotating naturalistic emotional behaviour in English and French TV videos. This cooperation will enable us:
- to study cross-cultural issues in naturalistic emotions
- to compare and eventually combine discrete/continuous coding schemes.
QUB and LIMSI have already exchanged some data and begun to annotate them.

3 Outline
1. Challenges in annotating naturalistic emotion
2. Experiments in emotion labelling on audio and audio-visual data: call centers, movies, TV
3. Experiment in emotion labelling on EmoTV1
4. On-going work: new emotions and metadata coding scheme
5. Illustrative examples (ANVIL + Feeltrace): 1. EmoTV1, 2. Belfast Naturalistic database

4 1 – Challenges in annotating naturalistic emotion
Goals: detection of « real-life » emotion; simulation of « real-life » emotion with ECAs
- Which emotions are modelled?
- Which context is annotated?
- Which representation?
Descriptors of emotional states:
- Verbal categories (Ekman, Plutchik)
- Abstract dimensions (Osgood, Cowie)
- Appraisal-based description (Scherer), interaction model OCC (Ortony)

5 Categories and dimensions: redundancy/complementarity
Verbal categories:
- Applied to segments: speaker turns, sub-units
- Chosen from a finite, task-dependent label list: finance, emergency, TV interviews
- Limited in number to remain tractable
Dimensions: continuous (Feeltrace: Cowie, Schröder) or scales (Craggs, Devillers & Vidrascu, Abrilian et al.)
- Segments: sequence, sub-units of the sequence
- 3 dimensions: Activation (Intensity) / Valence / Control
Dimensions match coarse classes of categories but do not allow distinguishing between, for example, fear and anger
-> Study the redundancy and complementarity of verbal categories and dimensions

6 Context annotation for naturalistic emotion
We have to take contextual information into account for naturalistic emotion at multiple levels. Different contextual information is needed for different types of applications and modalities. A tentative proposal of relevant contextual cues is in progress for the EmoTV corpus. This scheme can be refined through work with different databases.

7 In practice
Iterative annotation protocol:
- Definition
  - emotion labels and abstract dimensions: task-dependent label list, universal dimensions
  - segmental units: overall sequence, utterance, sub-units, words, etc.
- Annotation
  - one label or combined labels with abstract dimensions per segment
  - meta-annotation: context situation, appraisal-related descriptors
- Validation
  - inter-annotator agreement
  - perceptual tests

8 2 – Naturalistic data: audio and audio-visual
Audio: call centers
- Pros: natural H-H interaction; Cons: social aspects, phone channel
- Task-dependent emotion: financial matters, emergency, etc.
Audio-visual: TV, movies
- TV: more or less natural depending on the type of TV broadcast (games, reality shows, news, interviews, etc.), live/non-live, recording context, etc.
- EmoTV1: interviews, very variable emotional behaviour
- Realistic fiction: less naturalistic; emotions (such as fear, distress) in abnormal, dangerous situations are impossible to collect.
Goals:
- Call centers and movies: detection of emotion (audio-based)
- EmoTV: provocative corpus for studying ECA specification

9 Task-dependent annotations (1) – FP5 Amities project, collaboration LIMSI/ENST
Call center 1: Stock Exchange Customer Service Center – fear of losing money!
- Labels: fear/apprehension, anger/irritation, satisfaction, excuse, neutral
- 100 dialogs, 5000 speaker turns; 2 annotators; 12% of speaker turns with emotion; kappa 0.8
- [Devillers, Vasilescu, LREC 2004, Speech Prosody 2004, ICPhS 2003]
Call center 2: Capital Bank Service Center – fear of missing money!
- Labels: fear/apprehension, anger/irritation, satisfaction, excuse, neutral
- 250 dialogs, 5000 speaker turns extracted; 2 annotators on 1K speaker turns randomly extracted; 10% of speaker turns with emotion; kappa 0.5
- [Devillers, Vasilescu, LREC 2004]

10 Task-dependent annotations (2) – collaboration LIMSI / emergency call center
Call center 3: Emergency Service Center – fear of being sick, real fear/panic, calls for help
Broader emotional behaviour than in the financial call centers: 18 classes obtained after label selection from the HUMAINE emotion list (R. Cowie), 5 persons (transcribers), majority voting: anxiety, stress, fear, panic, annoyance, cold anger, hot anger, disappointment, sadness, despair, hurt, dismay, embarrassment, relief, interest, amusement, compassion, surprise + negative, positive and neutral.
Annotation of 20h (on-going process with the Transcriber tool, refinement of the label list):
1/ manual segmentation (sub-speaker-turn segments)
2/ segment annotation with major/minor emotion and abstract dimensions (scale)
3/ meta-annotation: contextual information (motive for the call, patient lies (kin), etc.); audio information (quality, accent, pathological voice, etc.)
PhD student: Laurence Vidrascu (LIMSI)

11 Task-dependent annotations (3) – collaboration ENST/LIMSI/THALES
Fiction: fear manifestations in realistic movies, for a video surveillance application – fear of aggression!
- Labels: fear vs. other negative emotions, other emotions, neutral
- Dimensions: Valence, Intensity, Control
- Video used to help annotate fear (providing context); context: e.g. aggressor, victim
POSTER – I. Vasilescu [Clavel, Vasilescu, Devillers, Ehrette, ICSLP 2004]
PhD student: Chloé Clavel

12 Task-dependent annotations (4) – EmoTV, FP6 HUMAINE
EmoTV1 – large number of topics – TV interviews from the news
51 clips, various contexts, 14 emotion labels, multimodal annotation
[Ref] S. Abrilian, L. Devillers, JC Martin, S. Buisine, Summer School WP5

13 3 – Experience of labelling on EmoTV1
Study of the influence of the modalities on the perception of emotions
Two independent annotators: master's students in psychology – coder 1 (male), coder 2 (female)
Annotations with the Anvil tool (Kipp 2001) for 3 conditions:
- Audio without video
- Video without audio
- Audio with video

14 Segmentation and annotation protocol for the 3 conditions
Instructions: detect emotional events
Segmentation (free), followed by an agreed segmentation
Annotation scheme combining:
- Labels (free choice)
- Two abstract dimensions: Intensity (from very low to very high), Valence (from negative to positive)
- Context: theme, what-for, etc. (for the audio+video condition)
[Ref: Buisine, Abrilian, Devillers, Martin, poster WP3]

15 Step 1: Audio-only and Video-only conditions
1. Segmenting the extracts: 2 independent coders; separate segmentation of the audio and video extracts; unifying the segments (intersection for the video corpus, union for the audio corpus)
2. Labeling the segments
3. Analysing inter-coder reliability for categories of labels (Cohen's kappa) on the audio and video corpora

16 Step 2: Audio+Video condition
1. Segmenting the extracts: 2 independent coders; separate segmentation of the audio-video extracts; unifying the segments
2. Labeling the segments
3. Analysing inter-coder reliability for categories of labels (Cohen's kappa) on the audio-video corpus

17 Analysis: speech vs. audio-visual differences for segmentation
Segmentation (free, for the two annotators):
Audio-only and Video-only
- Twice as many segments in the video condition as in the audio condition, for both annotators
- Automatic, semantically motivated decision for obtaining a common set of segments
- Intersection for the video condition: 295 segments
- Union for the audio condition: 181 segments
Audio+Video
- Agreed on a common set of 281 emotional segments
- Using the audio-only segmentation for audio+video is not straightforward; the audio+video segments are included in the audio-only segments
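As a rough illustration of the segment-unification step, here is a minimal sketch assuming each coder's segmentation is a list of (start, end) times in seconds; the function names and merge rules are illustrative assumptions, not the semantically motivated decision procedure actually applied to EmoTV1.

```python
# Sketch: unifying two coders' free segmentations into a common set of segments.
# Assumes segments are (start, end) pairs in seconds; the merge rules below are
# illustrative, not the semantically motivated decision used on EmoTV1.

def intersect_segments(coder_a, coder_b):
    """Keep only the time spans where both coders marked an emotional segment."""
    common = []
    for a_start, a_end in coder_a:
        for b_start, b_end in coder_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:                      # non-empty overlap
                common.append((start, end))
    return sorted(common)

def union_segments(coder_a, coder_b):
    """Merge both coders' segments, fusing any overlapping spans."""
    merged = []
    for start, end in sorted(coder_a + coder_b):
        if merged and start <= merged[-1][1]:    # overlaps the previous span
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

if __name__ == "__main__":
    coder1 = [(4.0, 9.0), (9.0, 20.0), (20.0, 31.0)]
    coder2 = [(4.0, 24.0), (24.0, 34.0)]
    print(intersect_segments(coder1, coder2))   # overlapping spans only
    print(union_segments(coder1, coder2))       # one fused span (4.0, 34.0)
```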

18 Emotional labels
From the three annotation experiments: a list of 176 different labels after normalization -> classified into a set of 14 labels: anger, despair, disgust, doubt, exaltation, fear, irritation, joy, neutral, pain, sadness, serenity, surprise and worry.

19 Analysis: speech vs. audio-visual differences for annotation
Inter-coder agreement: kappa values (on segments)
- Emotional labels (14 values): audio+video 0.37, video 0.43, audio 0.54
- 2 abstract dimensions:
  - Intensity: low inter-coder agreement except for audio (video and audio+video very low, audio 0.69)
  - Valence (Neg/?/Pos): high level of agreement for audio (0.75); audio+video 0.3, video 0.4
Low kappa for valence due to positive/negative confusion: audio+video 11%, video 7%, audio 3%
Real-life emotions -> blended, ambiguous, difficult to annotate
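To make the agreement figures concrete, here is a minimal sketch of Cohen's kappa for two coders labelling the same segments; the label sequences are made-up toy data, not taken from the corpus.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders' categorical labels on the same segments."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    # chance agreement expected from each coder's label frequencies
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Toy example using a few of the 14 EmoTV labels (invented data):
coder1 = ["anger", "anger", "sadness", "neutral", "anger", "despair"]
coder2 = ["anger", "irritation", "sadness", "neutral", "anger", "sadness"]
print(round(cohens_kappa(coder1, coder2), 2))   # 0.56
```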

20 Emotion annotation agreement for the 3 conditions (1)

21 Emotion annotation agreement for the 3 conditions (2)
- Anger, Irritation, Joy, Exaltation, Sadness -> high level of agreement for the Video, Audio and Audio+Video conditions
- Surprise, Worry -> for the Video condition (visual cues)
- Doubt -> for the Audio or Video conditions, not for Audio+Video
- Pain -> for the Audio and Audio+Video conditions (acoustic cues)
- Serenity -> only for the Audio+Video condition (much more subtle)
- Neutral -> 1% of agreement for the Video condition

22 Valence and Emotion: Audio-video

23 Clip 29: Joy/Disgust – valence?

24 Emotion perception – high subjectivity: examples
Different perceptions between coder 1 (male) and coder 2 (female):
- Within the same valence class: e.g. clip 3, audio/video conditions: anger/sadness, anger/despair -> blended emotion
- Between the negative and positive classes: e.g. a woman cries for joy (relief), clip 4: audio condition: sadness/sadness; video condition: sadness/don't know; audio+video condition: joy/sadness -> cause-effect conflicts

25 Clip 4: Joy (relief) / Sadness – valence?

26 Assessment of annotations: next steps
Inter-annotator agreement
- Kappa low (14 classes) -> ambiguously annotated data, but also rich data
- Study of disagreements in order to:
  - define the different types of complex or blended emotions: low-intensity, cause-effect, masked, sequential (transition), ambiguous, etc.
  - define hierarchical levels of annotation
Perceptual tests: multilingual cross-cultural perceptual tests
- for validating annotation labels and types of emotions
- for studying the emotional perceptual abilities of coders: personality, sensitivity to different emotional cues in audio, face, gesture, etc.
Collaboration WP3-WP5: Unige, QUB, LIMSI

27 Emotion categories: fine to coarse grain
Hierarchical levels of annotation: fine- to coarse-grained labels
Fine-grained labels (surprise, shame, embarrassment, doubt, pride, pain) are grouped into coarser classes (e.g. shame/embarrassment, neutral/other).
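A minimal sketch of how such a hierarchy could be applied as a simple lookup table; the groupings below are illustrative assumptions rather than the validated EmoTV hierarchy.

```python
# Sketch of a fine-to-coarse label mapping; the groupings are illustrative
# assumptions for the hierarchy idea, not a validated EmoTV hierarchy.
FINE_TO_COARSE = {
    "shame": "shame/embarrassment",
    "embarrassment": "shame/embarrassment",
    "surprise": "neutral/other",
    "doubt": "neutral/other",
    "pride": "neutral/other",
    "pain": "pain",
}

def coarsen(fine_label: str) -> str:
    """Map a fine-grained label to its coarse class, defaulting to 'neutral/other'."""
    return FINE_TO_COARSE.get(fine_label, "neutral/other")

print(coarsen("embarrassment"))   # shame/embarrassment
```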

28 4 – New annotation scheme (on-going)
Annotation of the global sequence and of the emotional segments of the sequence with:
- non-basic emotion patterns: blended, masked, sequential, etc.
- two emotion labels (major/minor)
- Activation, Intensity, Control, Valence (scale 1-5)
- a discrete temporal pattern describing the temporal evolution inside segments
Contextual annotations, including derived appraisal-based descriptions: the event that causes the emotion
Global multimodal descriptors: audio, face (eyes, gaze), torso, gesture (free-text fields)
Emotions and metadata coding scheme: annotation guide -> WP5 exemplar
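As an illustration only, a data-structure sketch of one emotional segment under this scheme; the field names and defaults are readability assumptions, not the actual ANVIL specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EmotionSegment:
    """One emotional segment: two labels, four 1-5 dimensions, pattern, context cues."""
    start: float                        # seconds
    end: float
    major_label: str                    # e.g. "anger"
    minor_label: Optional[str] = None   # e.g. "despair"
    activation: int = 3                 # 1-5
    intensity: int = 3                  # 1-5
    control: int = 3                    # 1-5
    valence: int = 3                    # 1-5
    pattern: str = "simple"             # blended, masked, sequential, ...
    cause_of_emotion: str = ""          # appraisal-derived, free text
    multimodal_cues: List[str] = field(default_factory=list)  # audio, gaze, gesture...

seg = EmotionSegment(7.0, 20.0, "anger", "despair",
                     activation=4, intensity=4, control=2, valence=1,
                     pattern="blended", multimodal_cues=["tense voice", "frown"])
```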

29 Intra-emotional-segment temporal evolution
Abstract dimensions are much more suitable than categorical descriptions for describing gradual and mixed emotional behaviour. In the ANVIL scheme, temporal evolution is given by the sequence of emotional segments (some of which are transitions), but intra-segment dynamics are lacking.
On-going study: temporal evolution + categorical labels
- Feeltrace continuous dimension annotation (LIMSI/QUB)
- Discrete temporal patterns describing intra-segment evolution (LIMSI/Paris 8), such as the sketch below
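A minimal sketch of what a discrete temporal pattern could look like, assuming a pattern is encoded as a few (relative time, intensity) anchor points; the pattern names and the encoding are illustrative assumptions, not the LIMSI/Paris 8 scheme.

```python
# Sketch: a discrete temporal pattern as a short sequence of (relative position,
# intensity 1-5) anchor points inside one emotional segment. Names and values
# are illustrative assumptions only.
TEMPORAL_PATTERNS = {
    "rise":      [(0.0, 2), (1.0, 5)],             # emotion builds up
    "fall":      [(0.0, 5), (1.0, 2)],             # emotion fades out
    "rise-fall": [(0.0, 2), (0.5, 5), (1.0, 2)],   # peak in the middle
    "plateau":   [(0.0, 4), (1.0, 4)],             # steady intensity
}

def intensity_at(pattern_name: str, t: float) -> float:
    """Linear interpolation of intensity at relative time t in [0, 1]."""
    points = TEMPORAL_PATTERNS[pattern_name]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]

print(intensity_at("rise-fall", 0.25))   # 3.5
```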

30 Context annotation for naturalistic emotion (on-going)
A tentative proposal of relevant contextual annotations:
Emotion context (some derived from appraisal-related descriptors)
- Cause of emotion: free text
- Time of emotion: immediate, near past, past, future
- Relation person-emotion: society subject, true story by self, by kin
- Degree of implication: low, normal, high
Overall communicative goal
- What-for: to claim, to share a feeling, etc.
- To-whom: public, kin, society, etc.
Scene descriptors: theme, type of interaction
Character descriptors: age, gender, race
Recording: quality, camera/person position, channel and time

31 5 – Example: EmoTV Clip 3

32 Global sequence annotation

33 Illustration of segmentation problems: segmentation/annotation of clip 3 (Summer School Belfast)
The four coders produced different segmentations (coder 1: 4-9, 9-20, 20-29, 29-31; coder 2: 4-24, 24-34; coder 3: 4-7, 7-11, 11-24, 24-32; coder 4: 4-7, 7-10, 10-13, 20-23, 26-31) with labels including anger, despair, sadness, irritation and disappointment. Final agreed segmentation: 4-7, 7-20, 20-33.
On-going study to find adequate rules for segmenting an audio-video sequence into emotional units

34 Emotional annotations per segment by several coders
French coders – new scheme: major/minor labels with (I, A, C, V) on a 1-5 scale
For clip 3, the three coders annotated the segments 4-7, 7-20 and 20-31 with labels including worry, anger, despair, sadness and disgust (several marked as blended) and dimension vectors such as (4,4,3,2), (4,4,2,1) and (5,4,1,1); the resulting weighted category vectors include weights such as worry 0.66, anger 0.5, despair 0.5, sadness 0.34 and disgust 0.16.
Instead of an a priori choice -> a weighted vector of categories could be kept
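A minimal sketch of how such a weighted vector could be built from several coders' major/minor labels; the 2:1 major/minor weighting and the example labels are assumptions for illustration, since the slide only shows the resulting vectors.

```python
from collections import defaultdict

def weighted_vector(annotations, major_weight=2.0, minor_weight=1.0):
    """Build a soft-category vector from (major_label, minor_label_or_None) pairs,
    one pair per coder. The 2:1 weighting is an illustrative assumption."""
    scores = defaultdict(float)
    for major, minor in annotations:
        scores[major] += major_weight
        if minor:
            scores[minor] += minor_weight
    total = sum(scores.values())
    return {label: round(w / total, 2) for label, w in scores.items()}

# Invented three-coder example in the spirit of clip 3:
print(weighted_vector([("anger", "despair"), ("anger", "sadness"), ("anger", None)]))
# {'anger': 0.75, 'despair': 0.12, 'sadness': 0.12} under these assumed weights
```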

35 Feeltrace annotations combined with ANVIL labels
(Figure: Feeltrace traces for the French coders, coder 1 and coder 2, with the ANVIL labels worry, anger and despair)

36 Clip 3 annotated by the QUB team with Feeltrace
Coders: Cate, Ellen, Ian
High similarity between the Feeltrace annotations (QUB and LIMSI) for this clip

37 Clip 3: global sequence annotations from LIMSI/QUB
QUB (label + intensity)
- Cate: Angry Strong, Sad Medium, Hurt Medium
- Ellen: Angry Strong, Hurt Strong, Despair Strong
- Ian: Angry Medium, Resentful Medium, Despair Weak
LIMSI (label + (I, A, C, V))
- Sarkis: Anger (5, 5, 2, 1)
- Jean-Claude: Anger (5, 4, 1, 1)
- Laurence: Anger (5, 4, 2, 1)
High similarity between the global label annotations (QUB and LIMSI) for this clip

38 Belfast Naturalistic Corpus – collaboration QUB/LIMSI – example: clip 61b

39 Weighted vectors combining emotion annotations from coders
French coders (clip 61) – major/minor, (I, A, C, V) scale 1-5, 3 coders
Instead of an a priori choice -> a weighted vector of categories per segment:
- 0-2.84: joy 0.66, pleased 0.34 – (4,4,3,4)
- 3.28-3.48: exaltation 0.5, joy 0.25, pride 0.25 – (4,4,4,5)
- 3.64-4.08: joy 0.5, pleased 0.25, pride 0.25 – (4,4,4,5)
- 4.48-6: joy 0.75, pride 0.25 – (4,4,3,5)
- 6.16-7.52: exaltation 0.5, joy 0.25, pride 0.25 – (5,5,5,5)
- 7.68-10.52: joy 0.2, serenity 0.2, pleased 0.2, doubt 0.2, worry 0.2 – (3,3,3,3)
- 11.04-12.24: doubt 0.5, pleased 0.5 – (3,3,4,4)
- 12.24-14.76: pride 0.5, pleased 0.25, serenity 0.25 – (4,4,5,4)

40 Clip 61b: Feeltrace annotations by the LIMSI coders (coder 1, coder 2)

41 Clip 61b: global annotation by QUB and LIMSI
QUB coders (6) – intensity: strong, medium, weak – labels used: Happy, Affectionate, Confident, Excited, Amused, Agreeable
LIMSI coders (3) – (I, A, V, C) on a 1-5 scale – Coder 1: Joy (5 4 5 3), Coder 2: Joy (4 4 4 4), Coder 3: Joy (4 5 5 4)
High similarity between the global label annotations (QUB and LIMSI) for this clip

42 Conclusion / Perspectives
Conclusions
- Annotation of 2 verbal labels per segment for naturalistic emotions
- Combination of emotion annotations from several coders (« soft categories »)
- Combination of categorical and dimensional emotion representations (QUB/LIMSI)
On-going work
- Temporal emotion evolution for ECAs (Univ. P8/LIMSI/QUB)
- Validation of the new annotation scheme
- Re-annotation of EmoTV1 (other coders)
- Perceptual tests (UNIGE/QUB/LIMSI)
Perspectives
- Correlation between multimodal and emotion annotations
- ECA with « real-life » emotion (Univ. P8/LIMSI)
- EmoTV2

43 Next talk: Manual Annotation of Multimodal Behaviors in Emotional TV Interviews with ANVIL, by Jean-Claude Martin
Thank you for your attention

