eNTERFACE ’08 Project 2 “Multimodal High Level Data Integration” Final Report August 29th, 2008

Application challenges
- 2 users in their home/office environment
- unrestricted natural language
- free human behavior

Components integrated
[Diagram: processing pipeline from the audio and video streams to the fusion mechanism]
- Components: Speech Recognizer, Syntactic Analyzer, Semantic Analyzer, Video Analyzer, Human Behavior Analyzer, Knowledge Base, Fusion Mechanism
- Data labels: Audio Stream, Sound Waves, Recognized String, Syntactic Triple, Linguistic meanings, Video Stream, Sequence of Images, Movements Coordinates, Movements Meanings, Advise People

[Diagram: the same pipeline labelled with the tools used]
- Tools: Sphinx-4, OpenCV, C&C Parser, C&C Boxer, Protégé, Jena, Fusion Mechanism, Human Behavior Analyzer
- Data labels: Audio Stream, Sound Waves, Recognized String, Syntax Analysis, Semantic Validation, Linguistic meanings, Video Stream, Sequence of Images, Movements Coordinates, Movements Meanings, Advise People

Example Scenario
[Ronald] I want to call Nick. Nick mentioned that he attended a wine tasting course.
[Beto] It sounds interesting, I like wine.
[Ronald] Actually I plan to join the next class. He also mentioned a book about French wines, but I cannot recall the name of the author.
[Beto] Why don't you send a mail to Nick?
[Ronald] Maybe I can find a book about it in the library.
[Beto] Yes, you are right.
[Beto] Did you find it?
[Ronald] Yes, I did.

Hints for plan recognition by speech
Alerts: want, need, wish, require, going to, plan, look for, wonder, can, may, must, do you know, do we have, etc.
Stop-alerts:
- negation (I am not going to …)
- past tense (Yesterday I was going to …)
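The alert and stop-alert hints can be sketched as a small keyword filter. This is only an illustration, not the project's implementation: the alert list comes from the slide, while the function names, the regular expressions, and the negation and past-tense cue lists are assumptions made for the example.

```python
import re

# Alert cues copied from the slide; matching them as whole words is an assumption.
ALERTS = ["want", "need", "wish", "require", "going to", "plan", "look for",
          "wonder", "can", "may", "must", "do you know", "do we have"]

# Hypothetical stop-alert cues (not from the slides): simple negation words
# and a few past-tense markers.
NEGATION = re.compile(r"\b(?:not|never|cannot)\b|n['’]t\b", re.IGNORECASE)
PAST = re.compile(r"\b(?:yesterday|was|were|had|wanted|needed|planned)\b",
                  re.IGNORECASE)


def find_alerts(utterance):
    """Return the alert cues that occur as whole words in the utterance."""
    text = utterance.lower()
    return [cue for cue in ALERTS
            if re.search(r"\b" + re.escape(cue) + r"s?\b", text)]


def signals_plan(utterance):
    """True if an alert is present and no stop-alert cancels it."""
    if not find_alerts(utterance):
        return False
    if NEGATION.search(utterance) or PAST.search(utterance):
        return False
    return True


if __name__ == "__main__":
    for line in ["I want to call Nick.",
                 "I am not going to call Nick.",
                 "Yesterday I was going to call Nick."]:
        print(line, "->", signals_plan(line))
```

A real system would work on the parser output rather than on raw strings; the keyword match is only meant to make the alert/stop-alert idea concrete.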

“Maybe I can find a book about it in the library”
Ronald is moving towards the book shelves

Decision making
If (Ronald) [wants to send] { to Nick} & (Ronald [is moving to] {the computer} | He [is close to] {the computer})
then open the mail client with the “to” field filled with
If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the library})
then There is a book about French wines on the first shelf.
If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the computer})
then Open a web search website and put the keyword in the search field.
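The rules above combine one speech event with one movement event. A minimal sketch of that kind of rule matching follows; the class names, the intent labels ("send_mail", "find_book") and the action labels are hypothetical and chosen for illustration, and the sketch does not claim to reproduce the fusion mechanism built in the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechEvent:
    speaker: str
    intent: str        # hypothetical label, e.g. "send_mail" or "find_book"


@dataclass(frozen=True)
class MovementEvent:
    person: str
    relation: str      # "is moving to" or "is close to"
    target: str        # anchor object, e.g. "the computer"


# Each rule: (intent, allowed relations, target, action label).
# The action labels are assumptions standing in for the actions on the slide.
RULES = [
    ("send_mail", {"is moving to", "is close to"}, "the computer",
     "open_mail_client"),
    ("find_book", {"is moving to"}, "the library", "say_book_location"),
    ("find_book", {"is moving to"}, "the computer", "open_web_search"),
]


def decide(speech, movement):
    """Return the action of the first matching rule, or None."""
    if speech.speaker != movement.person:
        return None  # the two events must concern the same person
    for intent, relations, target, action in RULES:
        if (speech.intent == intent
                and movement.relation in relations
                and movement.target == target):
            return action
    return None


if __name__ == "__main__":
    said = SpeechEvent("Ronald", "find_book")
    moved = MovementEvent("Ronald", "is moving to", "the library")
    print(decide(said, moved))
```

Keeping the rules in a table rather than in nested conditionals makes it easier to add a rule for a new modality or a new anchor object.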

Achievements
- spatial relationships (based on the fixed “anchor” objects in the room)
- semantic fusion of events not coinciding in time
- good results in speaker identification: synchronisation between image and speech identification
- an open framework to manage fusion between two (our case) or more modalities was created during the project and will be enhanced further
- each component can run on a separate machine thanks to the distribution mechanism, which exchanges data over a TCP/IP network
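One way to picture spatial relationships defined against fixed anchor objects is to compare a person's tracked positions with the anchors' positions. The sketch below is an assumption-laden illustration: the anchor coordinates, the two thresholds and the function name are invented for the example, and the slides do not say how the project actually computed these relations.

```python
import math

# Hypothetical floor-plan coordinates (in metres) of the fixed anchor objects.
ANCHORS = {"the computer": (0.0, 0.0), "the library": (5.0, 0.0)}

CLOSE_RADIUS = 1.0   # assumed distance under which a person "is close to" an anchor
MIN_APPROACH = 0.5   # assumed distance reduction that counts as "is moving to"


def relations(track):
    """Derive (relation, anchor) pairs from a list of (x, y) positions."""
    start, end = track[0], track[-1]
    found = []
    for name, position in ANCHORS.items():
        d_start = math.dist(start, position)
        d_end = math.dist(end, position)
        if d_end <= CLOSE_RADIUS:
            found.append(("is close to", name))
        elif d_start - d_end >= MIN_APPROACH:
            found.append(("is moving to", name))
    return found


if __name__ == "__main__":
    # A person walking from near the computer towards the book shelves.
    print(relations([(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]))
```

Because the anchors do not move, only the person's track has to be updated from the video analysis.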

Future work
- implement effective learning
- efficient decision making even from information fragments
- spatial relationships relative to moving people
- 3D video analysis
- detection of orientation of the people in the scene
- eye gaze tracking
- recognition of various types of gestures
- dealing with natural language redundancy (repeating the same idea in different words)

Further development of results
- integration on the OpenInterface platform (openinterface.org)
- create an open-source community around the project to
  - gain ideas and contributions from outside
  - have new modalities to fuse
- create a website, a forum, a mailing list