1 eNTERFACE’08 Multimodal high-level data integration (Project 2)

2 Team
- Olga Vybornova (Université catholique de Louvain, UCL-TELE, Belgium)
- Hildeberto Mendonça (Université catholique de Louvain, UCL-TELE, Belgium)
- Ao Shen (University of Birmingham, UK)
- Daniel Neiberg (TMH/CTT, KTH Royal Institute of Technology, Sweden)
- David Antonio Gomez Jauregui (TELECOM and Management SudParis, France)

3 Project objectives
- augment and improve the previous work, and look for new methods of data fusion;
- resolve the problem of, and implement a technique for, distinguishing between data from different modalities that should be fused and data that should not be fused but analyzed separately;
- explore and employ a context-aware cognitive architecture for decision-making purposes.

4 Background - Multimodality
A set of variables describing states of the world (a user’s input, an object, an event, behavior, etc.) represented in different media and through different information channels.
Goal of data fusion: the result of the fusion (merging semantic content from multiple streams) should give an efficient joint interpretation of the multimodal behavior of the user(s), providing effective and advanced interaction.
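As an illustrative sketch only (not the project's actual algorithm), joint interpretation can be approximated by pairing semantic events from the speech and video streams that fall within a common time window; the event structure, field names and window size below are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ModalityEvent:
    """A semantic event observed on one input channel."""
    modality: str      # e.g. "speech" or "video"
    meaning: str       # semantic label produced by the analyzer
    timestamp: float   # seconds since session start

def fuse(events, window=2.0):
    """Pair speech and video events that fall within a time window into a
    joint interpretation; events with no partner pass through unfused."""
    speech = [e for e in events if e.modality == "speech"]
    video = [e for e in events if e.modality == "video"]
    joint, used = [], set()
    for s in speech:
        match = next((v for v in video
                      if abs(v.timestamp - s.timestamp) <= window
                      and id(v) not in used), None)
        if match:
            used.add(id(match))
            joint.append(("joint", s.meaning, match.meaning))
        else:
            joint.append(("speech-only", s.meaning))
    joint += [("video-only", v.meaning) for v in video if id(v) not in used]
    return joint
```

The time-window test is a deliberately crude stand-in for the knowledge-based decision the slides describe: it captures only the idea that some cross-modal data should be merged and the rest analyzed separately.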

5 Requirements
Cognitive behavior is goal-oriented and takes place in a rich, complex, detailed environment, so the system should:
- acquire and process a large amount of knowledge,
- be flexible and be a function of the environment,
- be capable of learning from the environment and from experience.
behavior = architecture + content

6 Types of context
- Domain context: prior knowledge of the domain; semantic frames with predefined action patterns; user profiles; situation modelling; an a priori developed and dynamically updated ontology defining the subjects, objects, activities and relations between them for a particular person.
- Video context: capturing the users’ actions in the observation scene.
- Linguistic context: derived from natural language semantic analysis.
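One way to make the three context types concrete is as plain data structures; every name and value below is invented purely for illustration, not taken from the project's ontology.

```python
# Hypothetical instances of the three context types listed above.
domain_context = {
    "ontology": {"Person": ["John"], "Object": ["medicine", "phone"]},
    "frames": {"take_medicine": ["approach cabinet", "open cabinet", "take pill"]},
    "user_profile": {"John": {"medication_time": "09:00"}},
}
video_context = [("John", "opens cabinet", 12.4)]             # (actor, action, timestamp)
linguistic_context = [("John", "where is my medicine", 13.1)]  # (speaker, utterance, timestamp)

def known_object(word, ctx=domain_context):
    """Check whether a word is grounded in the domain ontology."""
    return any(word in instances for instances in ctx["ontology"].values())
```

A grounding check like `known_object` is the kind of lookup that lets linguistic and video observations be related to the a priori domain knowledge.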

7 Example scenario

8 Knowledge-based approach
- Restricted-domain ontology: structure and its instantiation
- Pattern situations (semantic frames)
- User profile: a priori collected information about users (preferences, social-relationship information, etc.) and dynamically obtained data

9 Architecture
- Audio stream: sound waves → Speech Recognizer → recognized string → Syntactic Analyzer → syntactic triple → Semantic Analyzer → linguistic meanings → Fusion Mechanism
- Video stream: sequence of images → Video Analyzer → movement coordinates → Human Behavior Analyzer → movement meanings → Fusion Mechanism
Both chains and the Fusion Mechanism draw on the Knowledge Base; the fused result is used to advise people.
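The two chains above can be sketched as a toy pipeline, with trivial stand-in logic replacing the real components (Sphinx-4, the C&C parser, Visual Hull); every behavior here is invented for illustration.

```python
class SpeechRecognizer:
    def process(self, sound_waves):
        # Stand-in for Sphinx-4: audio -> recognized string (toy: join tokens).
        return " ".join(sound_waves)

class SyntacticAnalyzer:
    def process(self, text):
        # Stand-in for the C&C parser: string -> (subject, verb, object) triple.
        words = text.split()
        return (words[0], words[1], " ".join(words[2:]))

class VideoAnalyzer:
    def process(self, frames):
        # Stand-in for Visual Hull: image sequence -> movement coordinates
        # (toy: last observed position).
        return frames[-1]

class FusionMechanism:
    def process(self, triple, coords):
        # Joint interpretation of linguistic meaning + movement meaning.
        return {"utterance": triple, "position": coords}

# Toy run through both chains.
text = SpeechRecognizer().process(["I", "need", "my", "medicine"])
triple = SyntacticAnalyzer().process(text)
coords = VideoAnalyzer().process([(0, 0), (2, 3)])
result = FusionMechanism().process(triple, coords)
```

The point is only the dataflow: each component consumes the previous component's output, and the fusion step sees both modalities at the semantic level.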

10 Tooling / Implementation
- Speech recognition: Sphinx-4
- Syntactic analysis: C&C Parser + semantic analyzer (http://svn.ask.it.usyd.edu.au/trac/candc/wiki)
- Semantic analysis:
  - Ontology construction and instantiation: Protégé (http://protege.stanford.edu/)
  - Analysis: Soar (http://sitemaker.umich.edu/soar/home)
- Video analysis: Visual Hull (developed by Diego Ruiz, UCL-TELE)
- Human behavior analysis: Soar
- Ontology: Protégé
- Fusion mechanism: Soar
- Integration: OpenInterface (www.openinterface.org)

11 Challenges
- Unrestricted natural language
- Free, natural behavior within a home/office environment

12 Why do we need Soar?
Capabilities:
- manages the full range of tasks expected of an intelligent agent, from routine tasks to extremely difficult, open-ended problems
- represents and uses different forms of knowledge: procedural (rules), semantic (declarative facts), episodic
- employs various problem-solving methods
- interacts with the outside world
- integrates reaction, deliberation, planning and meta-reasoning, dynamically switching between them
- has integrated learning (chunking, reinforcement learning, episodic and semantic learning)
- is useful in cognitive modeling, taking emotions, feelings and mood into account
- is easy to integrate with other systems and environments (SML, the Soar Markup Language, efficiently supports many languages)
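Soar agents are written as productions that fire over working memory until quiescence. The toy matcher below mimics that style in plain Python; it is not real Soar or SML, and the rules and working-memory elements are invented to suggest how a rule-based fusion decision could look.

```python
# Minimal Soar-flavoured production sketch: each rule is a (condition, action)
# pair; rules fire whenever their condition matches working memory, adding new
# elements, until no rule produces anything new (quiescence).
def run(rules, wm):
    changed = True
    while changed:
        changed = False
        for cond, action in rules:
            if cond(wm):
                new = action(wm)
                if new and new not in wm:
                    wm.append(new)
                    changed = True
    return wm

rules = [
    # Fuse speech about medicine with a cabinet action into one intention.
    (lambda wm: ("speech", "medicine") in wm and ("action", "open_cabinet") in wm,
     lambda wm: ("intention", "take_medicine")),
    # An inferred intention plus the profile's schedule triggers advice.
    (lambda wm: ("intention", "take_medicine") in wm and ("profile", "dose_due") in wm,
     lambda wm: ("advise", "take your medicine now")),
]

wm = [("speech", "medicine"), ("action", "open_cabinet"), ("profile", "dose_due")]
run(rules, wm)
```

Real Soar adds what this sketch omits: operator proposal and selection, impasses and substates, and the learning mechanisms listed above.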

13 Soar architecture

14 Project schedule

15 WP 1: Pre-workshop preparation
Task 1.1 - identify and investigate the necessary multimodal components and systems to use during the workshop
Task 1.2 - define the system architecture, taking advantage of previously accumulated experience and existing results
Task 1.3 - describe precisely the scenario to be worked on during the workshop
Task 1.4 - make the video sequences

16 WP 2: Integration of multimodal components and systems (1st week)
Task 2.1 - implement the architecture, making all multimodal components and systems work together
Task 2.2 - explore and select the most suitable method(s) for action-speech multimodal fusion
Task 2.3 - investigate the fusion implementation

17 WP 3: Multimodal fusion implementation (2nd week)
Task 3.1 - implement the fusion algorithms
Task 3.2 - test the fusion algorithms in a distributed computing environment

18 WP 4: Scenario implementation and reporting (3rd and 4th weeks)
Task 4.1 - integrate and test all the components and systems on the OpenInterface platform
Task 4.2 - prepare a presentation and reports about the results
Task 4.3 - demonstrate the results

