July 2010 eNTERFACE. Application: Surveillance in trains. Video and audio processing, sound localization, pattern recognition.

1 July 2010 eNTERFACE. Application: Surveillance in trains
- Video and audio processing
- Sound localization, pattern recognition

2 Lip reading, facial expression recognition
- Automatic recognition of facial expressions and lip-reading using vector flow
- Model-based approach

3 What makes visual speech recognition so hard?
- Visemes (several phonemes map to the same visible mouth shape)
- Smaller word separability
- The audio signal carries more speech information than the video signal

4 Lip-reading by humans
- People recognize speech better when the signal is both auditory and visual
- The difference in recognition rates grows with the level of noise in the environment

5 Inspiration
- In the 1968 Stanley Kubrick film 2001: A Space Odyssey, the computer reads the conversation of two astronauts from their lip movements.
- Thirty years later, automated lip-reading became a significant part of research in speech recognition systems.

7 New speech corpus (audio-visual speech corpus)

9 Databases of different quality and resolution

10 Recording a new speech corpus

11 Recording a new speech corpus

12 New speech corpus
- Dutch
- Recorded at high speed: 100 fps
- Front and profile views included
- 70 people: 49 male, 21 female (students, professors, secretaries, friends)
- Utterances: sentences, digits, spelling, conversation starters/endings, open questions
- Speaking styles: normal, fast, whispering

13 New speech corpus

14 Lip-reading by humans
- People recognize speech better when the signal is both auditory and visual
- The difference in recognition rates grows with the level of noise in the environment

15 ISFER Workbench: examples (continued)

16 Active contours
- Internal and external energies
  - Internal energy forces the contour to shrink
  - Locally defined external energy forces the contour to stop at the edge of the mouth
- Computationally cheap
- Sensitive to the initial setting of the contour
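
The two energies on the active-contour slide can be illustrated with a minimal greedy snake. This is only a sketch under stated assumptions, not the implementation used in the presentation: the function names, the weights `alpha` and `beta`, and the synthetic edge map (a ring standing in for the mouth edge) are all hypothetical. The internal energy is the squared distance to the two neighbouring contour points, which shrinks the contour, and the external energy is the negative edge strength at a single pixel, which stops a point once it reaches an edge.

```python
import math

def internal_energy(prev_pt, pt, next_pt):
    """Squared distance to both neighbours; minimizing it shrinks the contour."""
    return sum((pt[0] - q[0]) ** 2 + (pt[1] - q[1]) ** 2 for q in (prev_pt, next_pt))

def external_energy(edge_map, pt):
    """Negative edge strength at one pixel: a purely local image term."""
    x, y = pt
    return -edge_map[y][x]

def point_energy(edge_map, prev_pt, pt, next_pt, alpha, beta):
    return alpha * internal_energy(prev_pt, pt, next_pt) + beta * external_energy(edge_map, pt)

def greedy_snake_step(contour, edge_map, alpha=1.0, beta=100.0):
    """Move each point to the lowest-energy pixel in its 3x3 neighbourhood."""
    h, w = len(edge_map), len(edge_map[0])
    pts = list(contour)
    for i in range(len(pts)):
        prev_pt, next_pt = pts[i - 1], pts[(i + 1) % len(pts)]
        x0, y0 = pts[i]
        best = pts[i]
        best_e = point_energy(edge_map, prev_pt, best, next_pt, alpha, beta)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                x, y = x0 + dx, y0 + dy
                if 0 <= x < w and 0 <= y < h:
                    e = point_energy(edge_map, prev_pt, (x, y), next_pt, alpha, beta)
                    if e < best_e:  # only strict improvements move the point
                        best, best_e = (x, y), e
        pts[i] = best
    return pts

# Hypothetical toy data: a ring of edge pixels with radius 5 around (10, 10).
SIZE, CX, CY, RADIUS = 21, 10, 10, 5
edge_map = [[1.0 if abs(math.hypot(x - CX, y - CY) - RADIUS) < 0.5 else 0.0
             for x in range(SIZE)] for y in range(SIZE)]

# Initial contour: 8 points on a circle of radius 8, outside the ring.
contour = [(round(CX + 8 * math.cos(2 * math.pi * k / 8)),
            round(CY + 8 * math.sin(2 * math.pi * k / 8))) for k in range(8)]
initial = list(contour)
for _ in range(30):
    contour = greedy_snake_step(contour, edge_map)
print(contour)
```

Starting the same contour inside the ring would make it collapse instead, since nothing stops the shrinking, which illustrates the sensitivity to the initial setting.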

17 Template matching
- Internal and external energies
  - Internal energy forces the template to maintain its geometry
  - Globally defined external energy forces appropriate placement on the picture
- Better results than with snakes
- Integration of the energy functions at each step can be very time consuming
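
A minimal sketch of the template idea, assuming a deliberately simple template: an ellipse outline that may only translate and scale. The ellipse shape, the parameter names (`a0`, `b0`, `gamma`) and the synthetic edge map are assumptions made for illustration and do not come from the presentation. The internal energy penalizes deviation from the nominal scale, and the external energy is summed over all template points, so every candidate placement is scored on the template as a whole. The exhaustive search also shows why this is slower than a snake.

```python
import math

def template_points(cx, cy, scale, a0=5.0, b0=3.0, n=32):
    """Pixel positions of an ellipse outline (the hypothetical mouth template)."""
    return [(round(cx + scale * a0 * math.cos(2 * math.pi * k / n)),
             round(cy + scale * b0 * math.sin(2 * math.pi * k / n)))
            for k in range(n)]

def internal_energy(scale, gamma=10.0):
    """Penalty for deforming the template away from its nominal geometry."""
    return gamma * (scale - 1.0) ** 2

def external_energy(edge_map, pts):
    """Negative edge strength integrated over the whole template (global term)."""
    h, w = len(edge_map), len(edge_map[0])
    return -sum(edge_map[y][x] for x, y in pts if 0 <= x < w and 0 <= y < h)

def fit_template(edge_map, scales=(0.8, 1.0, 1.2, 1.4)):
    """Exhaustive search over position and scale for the lowest total energy."""
    h, w = len(edge_map), len(edge_map[0])
    best = None
    for s in scales:
        for cy in range(h):
            for cx in range(w):
                e = internal_energy(s) + external_energy(edge_map, template_points(cx, cy, s))
                if best is None or e < best[0]:
                    best = (e, cx, cy, s)
    return best[1:]

# Hypothetical toy edge map: the outline of one template instance at (13, 8), scale 1.2.
WIDTH, HEIGHT = 30, 20
edge_map = [[0.0] * WIDTH for _ in range(HEIGHT)]
for x, y in template_points(13, 8, 1.2):
    edge_map[y][x] = 1.0

found = fit_template(edge_map)
print(found)
```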

18 Model
- Goal: lip-reading
- Needed: an accurate description of the visible parts of the articulatory system
- Accurate description of the shape of the mouth:
  - measurements of the distance from the lips to the center of the mouth
  - measurements of the thickness of the visible part of the lips

19 Data processing
- Filtered image: intensity distribution, center of mouth
- Image in polar coordinates
- Conditional distribution
- Mean and variance functions
(continued)
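
The processing chain on this slide can be sketched as follows. This is an interpretation, not the presenters' code: it assumes the filtered image holds a non-negative "lip-likeness" intensity per pixel, takes the intensity-weighted centroid as the mouth center, and treats the radius of lip pixels in each angular bin as the conditional distribution. Under those assumptions the per-angle mean approximates the distance from the center to the lips and the per-angle variance relates to lip thickness. The function names and the bin count are hypothetical.

```python
import math

def mouth_center(img):
    """Intensity-weighted centroid of the filtered image."""
    total = sum(sum(row) for row in img)
    cx = sum(x * v for row in img for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(img) for v in row) / total
    return cx, cy

def polar_mean_variance(img, center, n_bins=8):
    """Mean and variance of the radius, conditioned on the angle bin."""
    cx, cy = center
    w_sum = [0.0] * n_bins
    r_sum = [0.0] * n_bins
    r2_sum = [0.0] * n_bins
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            r = math.hypot(x - cx, y - cy)
            if v <= 0 or r == 0:
                continue  # background pixel, or the center itself (no angle)
            angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * n_bins) % n_bins
            w_sum[b] += v
            r_sum[b] += v * r
            r2_sum[b] += v * r * r
    means, variances = [], []
    for b in range(n_bins):
        if w_sum[b] == 0:
            means.append(None)      # no lip pixels seen in this direction
            variances.append(None)
        else:
            m = r_sum[b] / w_sum[b]
            means.append(m)
            variances.append(r2_sum[b] / w_sum[b] - m * m)
    return means, variances

# Hypothetical toy image: a ring of "lip" pixels between radius 4 and 6.
SIZE = 21
img = [[1.0 if 4 <= math.hypot(x - 10, y - 10) <= 6 else 0.0
        for x in range(SIZE)] for y in range(SIZE)]

center = mouth_center(img)
means, variances = polar_mean_variance(img, center)
frame_vector = means + variances  # one possible single-frame data vector
print(center, frame_vector)
```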

20 Data visualization
- Single-frame data vector

21 Results of experiments
- Feed-forward BP (backpropagation)
- Utterance: "Vanmiddag komt de pianostemmer langs om mijn vleugel te stemmen" (Dutch: "This afternoon the piano tuner is coming by to tune my grand piano")

24 Tracking the face – Optical flow
- Captures the apparent motion between subsequent images in a grid of motion vectors
- Advantages:
  - No lip model required
  - Good at capturing motion
- Disadvantage:
  - Slow
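
To make "a grid of motion vectors" concrete, here is a small sketch that uses block matching, which is a swapped-in simple technique and not necessarily the optical-flow method used in the presentation. Each block of the previous frame is compared against shifted blocks in the current frame, and the shift with the smallest sum of absolute differences becomes that block's motion vector. The block size, search range and toy frames are assumptions. The nested loops also hint at why the slide lists "slow" as the disadvantage.

```python
def block_sad(prev, curr, x, y, dx, dy, size):
    """Sum of absolute differences between a block and its shifted counterpart."""
    return sum(abs(prev[y + j][x + i] - curr[y + dy + j][x + dx + i])
               for j in range(size) for i in range(size))

def motion_grid(prev, curr, block=4, search=2):
    """Return {(x, y): (dx, dy)} with one motion vector per block."""
    h, w = len(prev), len(prev[0])
    # Try small displacements first so that ties favour "no motion".
    offsets = sorted(((dx, dy) for dy in range(-search, search + 1)
                      for dx in range(-search, search + 1)),
                     key=lambda d: d[0] ** 2 + d[1] ** 2)
    flow = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            best, best_cost = (0, 0), None
            for dx, dy in offsets:
                if not (0 <= x + dx <= w - block and 0 <= y + dy <= h - block):
                    continue  # shifted block would leave the image
                cost = block_sad(prev, curr, x, y, dx, dy, block)
                if best_cost is None or cost < best_cost:
                    best, best_cost = (dx, dy), cost
            flow[(x, y)] = best
    return flow

def make_frame(px, py, width=16, height=12):
    """Hypothetical toy frame: one textured 4x4 patch on a black background."""
    frame = [[0] * width for _ in range(height)]
    for j in range(4):
        for i in range(4):
            frame[py + j][px + i] = 1 + i + 4 * j
    return frame

prev_frame = make_frame(4, 4)
curr_frame = make_frame(5, 6)  # patch moved 1 pixel right, 2 pixels down
flow = motion_grid(prev_frame, curr_frame)
print(flow[(4, 4)])
```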

25 Tracking the face – Lip Geometry Estimation
- Applies color filters and captures the lip contours in polar coordinates
- Advantages:
  - No lip model required
  - More or less person-independent
- Disadvantage:
  - Not robust to external factors

26 Tracking the face – Active Appearance Models
- Point tracking according to a statistical lip model
- Disadvantage:
  - Requires annotated training images
- Advantages:
  - Robust against external factors
  - Fast!

27 Active Appearance Models – Design of the lip model

28 AAM model point coordinates

29 Feature extraction: features plotted for “F” over time (frames)

30 5-state HMM
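
As an illustration of what a 5-state HMM computes, the sketch below builds a left-to-right model and scores an observation sequence with the forward algorithm. The left-to-right topology, the transition and emission probabilities, and the use of discrete symbols are all assumptions chosen for brevity. The slide does not specify them, and a recognizer working on lip features would normally use continuous feature vectors.

```python
def left_to_right_transitions(n_states=5, stay=0.6):
    """Each state either repeats or advances to the next; the last state absorbs."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = stay
        trans[i][i + 1] = 1.0 - stay
    trans[n_states - 1][n_states - 1] = 1.0
    return trans

def peaked_emissions(n_states=5, n_symbols=5, peak=0.8):
    """State i mostly emits symbol i; the remaining mass is spread evenly."""
    rest = (1.0 - peak) / (n_symbols - 1)
    return [[peak if s == i else rest for s in range(n_symbols)]
            for i in range(n_states)]

def forward_likelihood(obs, trans, emit):
    """P(observations | model), assuming the model always starts in state 0."""
    n = len(trans)
    alpha = [0.0] * n
    alpha[0] = emit[0][obs[0]]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return sum(alpha)

trans = left_to_right_transitions()
emit = peaked_emissions()

in_order = [0, 0, 1, 2, 3, 4, 4]   # symbols follow the state order
reversed_order = in_order[::-1]    # same symbols, played backwards
p_forward = forward_likelihood(in_order, trans, emit)
p_backward = forward_likelihood(reversed_order, trans, emit)
print(p_forward > p_backward)
```

In a recognizer built this way, one such model would be trained per word or viseme, and the model with the highest likelihood for the observed sequence would be chosen.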

31 Automatic bi-modal human emotion recognition
- Automatic recognition of facial expressions using an Active Appearance Model
- Model-based approach

32 Face localization

33 User-interface prototype: iCat, to help users in daily tasks

34 M.A.E.L.I.A.: our digital cat (H.C.I. Group)

37 Requirements in other words…
- Timeline labels: 7:00 AM, 8:00 AM, 11:00 AM, 2:00 PM, 4:00 PM
- "Are you out of your mind? I am sleeping!!!"
- "Get a life! I am still sleeping!"
- "I am so bored! I wish I had a companion!"
- "I feel so lonely!!! I am very sad and depressed."
- "Finally I have a friend! I am so happy and I even managed to pick up the bone! Wow!!!"
- "AIBO! Bring me my newspaper!!!"
- "AIBO! Let’s play!!!"
- "Follow me"

38 Multimodal communication
- "Uh, … I have no time to do anything with you"
- "Hello, do you like to chat with me?"
- "Uh, what a nerd"
- "I want a date"
- "She looks nice"

39 Multi-modal interaction

41 Would you like to join me for dinner?

47 Chat-session
- A cup of tea?
- Mmh, njeh, I don’t like tea.
- What’s wrong with tea?
- Tea makes me sick.
- That’s nonsense!!
- And my sister doesn’t like you too!
- She is very disappointed!!
- Hihi, I was joking!!!
- Oh, that’s funny!!!

48 Chat-session
- (f) A cup of tea? : - )
- (m) Mmh, njeh, I don’t like tea. (: - (
- (f) What’s wrong with tea? : - o
- (m) Tea makes me sick. % - \
- (f) That’s nonsense!! : - l l
- (f) My sister doesn’t like you too! : - l l
- (f) She is very disappointed!! : - (
- (m) Hihi, I was joking!!! ; - )
- (f) Oh, that’s funny!!! : - ]

49 A cup of tea? : - )

50 Mmh, njeh, I don’t like tea. (: - (

51 What’s wrong with tea? : - o

52 Tea makes me sick. % - \

53 That’s nonsense!! : - l l

54 My sister doesn’t like you too! : - l l

55 She is very disappointed!! : - (

56 Hihi, I was joking!!! ; - )

57 Oh, that’s funny!!! : - ]

