eNTERFACE 08 Project 2: “Multimodal high-level data integration”. Mid-term presentation, August 19th, 2008.


1 eNTERFACE 08 Project 2 “multimodal high-level data integration” Mid-term presentation August 19th, 2008

2 Team
- Olga Vybornova (Université catholique de Louvain, UCL-TELE, Belgium)
- Hildeberto Mendonça (Université catholique de Louvain, UCL-TELE, Belgium)
- Ao Shen (University of Birmingham, UK)
- Daniel Neiberg (TMH/CTT, KTH Royal Institute of Technology, Sweden)
- David Antonio Gomez Jauregui (TELECOM and Management SudParis, France)

3 Project objectives
- to augment and improve the previous work and look for new methods of data fusion
- to resolve the problem of distinguishing between data from different modalities that should be fused and data that should not be fused but analyzed separately, and to implement a technique for doing so
- to explore and employ a context-aware cognitive architecture for decision-making purposes

4 Background - Multimodality
A set of variables describing states of the world (user’s input, an object, an event, behavior, etc.), represented in different media and through different information channels.
Goal of data fusion: the result of the fusion (merging semantic content from multiple streams) should give an efficient joint interpretation of the multimodal behavior of the user(s), in order to provide effective and advanced interaction.

5 [Architecture diagram]
Inputs: Audio Stream, Video Stream
Processing components: Speech Recognizer, Syntactic Analyzer, Semantic Analyzer, Video Analyzer, Human Behavior Analyzer, Knowledge Base, Fusion Mechanism
Data passed between components: Sound Waves, Recognized String, Syntactic Triple, Linguistic meanings, Sequence of Images, Movements Coordinates, Movements Meanings
Output: Advise People

6 [Architecture diagram, same labels as slide 5]

7 [Architecture diagram with the tools used]
Inputs: Audio Stream, Video Stream
Tools and components: Sphinx-4, C&C Tools (parser), C&C Tools (Boxer), OpenCV, Human Behavior Analyzer, Protégé, Jena, Fusion Mechanism
Data and stage labels: Sound Waves, Recognized String, Syntax Analysis, Linguistic meanings, Semantic Validation, Sequence of Images, Movements Coordinates, Movements Meanings
Output: Advise People

8 Integration
- All tools are integrated through socket communication
- C++ and Java components interoperate normally
- The data interchange format is XML:
  - Verifiable
  - Easy data identification
  - Easy data compatibility
  - Low cost of manipulation
  - XML processed on demand
- Main issues: transparency, extensibility and customization
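
As an illustration of the socket-plus-XML integration described on this slide, here is a minimal sketch. It is written in Python rather than the project's C++ and Java, and the element name `message`, the attributes `modality` and `timestamp`, and the newline framing are assumptions made for the example, not the project's actual message format.

```python
import socket
import xml.etree.ElementTree as ET


def build_message(modality: str, timestamp_ms: int, payload: str) -> bytes:
    """Serialize one component result as a single-line, newline-terminated XML message.

    The element and attribute names are hypothetical; payloads are assumed
    to contain no newline characters.
    """
    root = ET.Element("message", {"modality": modality, "timestamp": str(timestamp_ms)})
    ET.SubElement(root, "payload").text = payload
    return ET.tostring(root, encoding="unicode").encode("utf-8") + b"\n"


def read_message(sock: socket.socket) -> ET.Element:
    """Read bytes until the terminating newline and parse them as XML."""
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a full message arrived")
        buffer += chunk
    return ET.fromstring(buffer.decode("utf-8"))


def demo():
    """Send one message through a connected socket pair and decode it again."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(build_message("speech", 1200, "yesterday i received an email from nick"))
        message = read_message(receiver)
        return message.get("modality"), int(message.get("timestamp")), message.findtext("payload")
    finally:
        sender.close()
        receiver.close()
```

Because both ends only agree on the XML text, the sender and receiver could be written in different languages, which is the property the slide relies on for C++ and Java interoperability.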

9 Speech Recognition
- Sphinx-4, integrated into the system
- Fine-tuned for the maximum length of n-best lists
- 2 language models created:
  - Scenario-dependent: 3-grams, 150 words, 86.9% accuracy, speed 0.94 x real time
  - Wall Street Journal + scenarios: 3-grams, 5000 words, 68.6% accuracy, speed 3.19 x real time
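
The two figures reported per language model can be made concrete with a small sketch. It assumes that "accuracy" means the usual word accuracy (1 minus the word error rate) and that "x real time" is processing time divided by audio duration; the slide does not state how either number was computed, and the function names are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance between two lists of words."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - (substitutions + deletions + insertions) / number of reference words."""
    ref_words = reference.split()
    return 1.0 - edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Values below 1.0 mean the recognizer runs faster than real time."""
    return processing_seconds / audio_seconds
```

For example, scoring the hypothesis "yesterday i received an email from nick to" against the seven-word reference without the trailing "to" counts one insertion, giving a word accuracy of 6/7.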

10 Speech Identification
- Standard GMM-based speaker identification system
- Developed in Matlab
- Results on a 2-person development set, as a function of the number of Gaussians, are shown to the right
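
The general idea of GMM-based speaker identification can be sketched as follows: train one Gaussian mixture per speaker, then pick the speaker whose model gives the test frames the highest total log-likelihood. This is a toy version in Python, not the project's Matlab system. It assumes one-dimensional synthetic features (real systems typically use multi-dimensional acoustic features), and the speaker names, data and parameter values are made up.

```python
import math
import random


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def component_logs(x, model):
    """log(weight_k * N(x; mean_k, var_k)) for every component k."""
    return [math.log(w) + log_gaussian(x, m, v) for w, m, v in model]


def log_likelihood(x, model):
    """log p(x | model), computed with the log-sum-exp trick."""
    logs = component_logs(x, model)
    peak = max(logs)
    return peak + math.log(sum(math.exp(l - peak) for l in logs))


def train_gmm(frames, n_components, n_iterations=25, seed=0):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, var) tuples."""
    rng = random.Random(seed)
    mean_all = sum(frames) / len(frames)
    var_all = max(sum((x - mean_all) ** 2 for x in frames) / len(frames), 1e-3)
    model = [(1.0 / n_components, m, var_all)
             for m in rng.sample(frames, n_components)]
    for _ in range(n_iterations):
        # E-step: posterior probability of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, model)
            peak = max(logs)
            probs = [math.exp(l - peak) for l in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means and variances.
        model = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-9  # guards against an empty component
            mean = sum(r[k] * x for r, x in zip(resp, frames)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, frames)) / nk
            model.append((nk / len(frames), mean, max(var, 1e-3)))  # variance floor
    return model


def identify(frames, models):
    """Return the speaker whose GMM gives the highest total log-likelihood."""
    scores = {name: sum(log_likelihood(x, gmm) for x in frames)
              for name, gmm in models.items()}
    return max(scores, key=scores.get)


def synthetic_frames(centers, n, rng):
    """Toy 1-D 'features' clustered around the given centers (made-up data)."""
    return [rng.gauss(rng.choice(centers), 1.0) for _ in range(n)]


def demo():
    rng = random.Random(42)
    train = {"speaker_a": synthetic_frames([0.0, 4.0], 200, rng),
             "speaker_b": synthetic_frames([8.0, 12.0], 200, rng)}
    models = {name: train_gmm(frames, n_components=2) for name, frames in train.items()}
    test_a = synthetic_frames([0.0, 4.0], 50, rng)
    test_b = synthetic_frames([8.0, 12.0], 50, rng)
    return identify(test_a, models), identify(test_b, models)
```

The `n_components` argument corresponds to the number of Gaussians that the slide's results are plotted against.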

11 Speech Recognition Output
yesterday i received an email from nick
yesterday i received an email from nick
yesterday i received an email from nick to
yesterday i received an email from nick for

12 Syntax and Semantics

13 Syntax and Semantics

14 Syntax and Semantics

15 Image Processing
- OpenCV library (open source)
- Motion history to calculate the motion direction
- Template matching to identify objects in the scene
- Gaussian probability distribution to model the color of clothes
- Background subtraction to detect the foreground
- Blob identification to track people in the scene
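
Two of the steps listed above, background subtraction and blob identification, can be sketched without any library. This is a plain-Python illustration on tiny grayscale grids, not the project's OpenCV code; the threshold value and the demo frame are assumptions chosen for the example.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [[abs(pixel - base) > threshold for pixel, base in zip(frame_row, back_row)]
            for frame_row, back_row in zip(frame, background)]


def find_blobs(mask):
    """Group 4-connected foreground pixels; return one bounding box per blob.

    Each box is (top, left, bottom, right), in the order the blobs are met
    when scanning the mask row by row.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for row in range(height):
        for col in range(width):
            if not mask[row][col] or seen[row][col]:
                continue
            stack = [(row, col)]
            seen[row][col] = True
            top, left, bottom, right = row, col, row, col
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append((top, left, bottom, right))
    return blobs


def demo_frames():
    """A uniform 6x8 background and a frame containing two bright 2x2 regions."""
    background = [[10] * 8 for _ in range(6)]
    frame = [row[:] for row in background]
    for y, x in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        frame[y][x] = 200
    for y, x in [(3, 5), (3, 6), (4, 5), (4, 6)]:
        frame[y][x] = 180
    return background, frame
```

Tracking people across frames would then amount to matching these bounding boxes from one frame to the next, which this sketch does not attempt.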

16 Image Processing

17 Image Processing Output

18 Ontology
- Restricted-domain ontology: structure and its instantiation
- Pattern situations (semantic frames)
- User profile: a priori collected information about users (preferences, social relationships, etc.) and dynamically obtained data
- Protégé is used to create and edit the ontology
- Jena is used to manage the ontology data
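
The split between an ontology's structure and its instantiation can be illustrated with subject-predicate-object triples, the data model that ontology toolkits such as Jena manage. The sketch below is a plain-Python stand-in and does not use or imitate the Jena API; the class name `TripleStore` and every fact in the demo data are hypothetical.

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the sorted triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(triple for triple in self._triples
                      if all(p is None or p == v for p, v in zip(pattern, triple)))


def build_demo_store():
    store = TripleStore()
    # Structure: classes of a hypothetical restricted domain.
    store.add("Person", "subClassOf", "Agent")
    # Instantiation: individuals and a priori user-profile facts (all made up).
    store.add("nick", "type", "Person")
    store.add("user", "type", "Person")
    store.add("nick", "colleagueOf", "user")
    store.add("user", "prefers", "email")
    return store
```

A query such as `build_demo_store().match(subject="nick", predicate="colleagueOf")` returns the social-relationship fact about `nick`, the kind of profile information a fusion step could consult when interpreting an utterance that mentions him.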

19 Ontology

20 Project schedule
Overall progress: 65%
- WP1 (Workshop preparation): done
- WP2 (Integration of multimodal components): done
- WP3 (Multimodal fusion implementation): running
- WP4 (Scenario implementation and reporting): to do
Strategic changes to achieve the goal:
- Everybody focuses on the fusion mechanism
- Lower priority for improving the individual modalities
- Each risky task has an associated plan B that is less time-consuming but also less robust

21 Next Steps
- Integration of WordNet into the ontology
- Rules to process human behavior
- Mapping the semantic analysis to the ontology
- Fusion mechanism

