Multimodal Human-Computer Interaction: the design and implementation of a prototype
Maurits André

2 Overview
– Introduction
  – Problem statement
– Technologies used
  – Speech
  – Hand gesture input
  – Gaze tracking
– Design of the system
– Multimodal issues

3 Overview
– Testing
  – Program tests
  – Usability tests
  – Human factors studies
– Conclusions and recommendations
– Future work
– Video

4 CAIP: Center for Computer Aids in Industrial Productivity (Prof. James L. Flanagan)
Research areas: robust image understanding, machine vision, groupware, networking, visiometrics and modeling, VLSI design, multimedia information systems, virtual reality, speech generation, image and video compression, adaptive voice mimic, microphone arrays, and speech / speaker recognition.

5 Multimodal HCI
– Currently: mouse and keyboard input
– More natural communication technologies are available:
  – sight
  – sound
  – touch
– Robust and intelligent combination of these technologies

6 Aim

7 Problem statement
– Study three technologies:
  – speech recognition and synthesis
  – hand gesture input
  – gaze tracking
– Design a prototype (around an appropriate application)
– Implement the prototype
– Test and debug
– Human performance studies

8 Speech recognition and text-to-speech (SR and TTS)
– Microsoft Whisper system (C++)
– Speaker-independent
– Continuous speech
– Restricted task-specific vocabulary (150 words)
– Finite-state grammar
– Sound capture via a microphone array
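The grammar itself is not shown on the slide; the following is a minimal sketch of how a finite-state grammar over a restricted command vocabulary can be represented and matched. The states and vocabulary are illustrative assumptions, not the Whisper API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy finite-state grammar over a restricted command vocabulary. */
public class CommandGrammar {
    // transitions[state] maps an allowed word to the next state
    private final List<Map<String, Integer>> next = new ArrayList<>();
    private final Set<Integer> accepting = new HashSet<>();

    public int addState(boolean accept) {
        next.add(new HashMap<>());
        if (accept) accepting.add(next.size() - 1);
        return next.size() - 1;
    }

    public void addArc(int from, String word, int to) {
        next.get(from).put(word.toLowerCase(), to);
    }

    /** Returns true if the word sequence is a legal command. */
    public boolean accepts(String[] words) {
        int state = 0;  // state 0 is the start state
        for (String w : words) {
            Integer to = next.get(state).get(w.toLowerCase());
            if (to == null) return false;  // word not allowed here
            state = to;
        }
        return accepting.contains(state);
    }

    public static void main(String[] args) {
        CommandGrammar g = new CommandGrammar();
        int s0 = g.addState(false), s1 = g.addState(false), s2 = g.addState(true);
        g.addArc(s0, "create", s1);
        for (String shape : new String[] {"circle", "rectangle", "line"})
            g.addArc(s1, shape, s2);
        System.out.println(g.accepts("create circle".split(" ")));  // true
        System.out.println(g.accepts("create tank".split(" ")));    // false
    }
}
```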

9 Hand gesture input
Advantages:
– natural
– powerful
– direct
Disadvantages:
– fatigue (RSI)
– learning
– non-intentional gestures
– lack of comfort

10 Force-feedback tactile glove
– Polhemus tracker for wrist position and orientation
– Five gestures are recognized

11 Implemented gestures
– Grab: “Move this”
– Open hand: “Put down”
– Point at an object: “Select” / “Identify”
– Thumb up: “Resize this”
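A compact way to express this gesture-to-command mapping; the enum constants and command strings below are illustrative, not the prototype's actual identifiers.

```java
/** Sketch of the glove-gesture-to-command mapping on this slide. */
public enum Gesture {
    GRAB("move this"),
    OPEN_HAND("put down"),
    POINT("select / identify"),
    THUMB_UP("resize this");

    private final String command;

    Gesture(String command) { this.command = command; }

    /** Command token handed on to the fusion agent. */
    public String command() { return command; }

    public static void main(String[] args) {
        for (Gesture g : values())
            System.out.println(g + " -> " + g.command());
    }
}
```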

12 Eyes as output
– Direction of gaze
– Blinks
– Closed eyes
– Part of emotional expression

13 Gaze tracker
– ISCAN RK-726 gaze tracker
– Samples at 60 Hz
– Requires calibration

14 Application
Requirements:
– multi-user, collaborative
– written in Java
– simple
Choice:
– drawing program
– military mission planning system

15 Drawing program

16 Military mission planning

17 Frames
Frames provide slots, inheritance, generic properties, and default values (defaults in brackets). Example, the Create Figure frame:
– Type: circle / rect / line
– Color: [white] / red / grn / ...
– Height: 0..max_y
– Width: 0..max_x
– Location: (x, y)

18 Command frames
Command frames build on the figure frame above:
– Exit: Confirm (yes / no)
– Command with object: Object ID (0..max_int)
– Create Figure: Type (circle/rect/line), Color ([white]/red/grn/...), Height (0..max_y), Width (0..max_x), Location (x, y)
– Move: as above, plus a Destination slot
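As an illustration of this frame-and-slot machinery, here is a minimal sketch in Java; the class name, slot names, and map-based storage are assumptions, not the prototype's actual code.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy frame with named slots, default values, and inheritance
 *  from a parent frame, in the spirit of the Create Figure frame. */
public class Frame {
    private final String name;
    private final Frame parent;  // inheritance chain for defaults
    private final Map<String, Object> slots = new HashMap<>();

    public Frame(String name, Frame parent) {
        this.name = name;
        this.parent = parent;
    }

    public void set(String slot, Object value) { slots.put(slot, value); }

    /** Local value, else inherited default, else null (slot unfilled). */
    public Object get(String slot) {
        if (slots.containsKey(slot)) return slots.get(slot);
        return parent == null ? null : parent.get(slot);
    }

    public static void main(String[] args) {
        Frame figureDefaults = new Frame("figure-defaults", null);
        figureDefaults.set("color", "white");  // the [white] default

        Frame createFigure = new Frame("create-figure", figureDefaults);
        createFigure.set("type", "circle");
        createFigure.set("location", new int[] {40, 25});

        System.out.println(createFigure.get("color"));  // white, inherited
        System.out.println(createFigure.get("type"));   // circle
    }
}
```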

19 Design
Gaze tracking, speech recognition, and gesture recognition feed the Fusion Agent, which drives the collaborative workspace; spoken feedback is produced by speech synthesis.

20 Fusion Agent
The Fusion Agent consists of a parser, a slot buffer, control, and rule-based feedback.
Example: “Move tank seven here.” fills a MOVE frame: the parser supplies object = tank 7 from speech, and the deictic where slot is filled with (x4, y4), the coordinate from the pointing stream (x1, y1) ... (x4, y4) that coincides with the word “here”.
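A minimal sketch of how such a fusion step could look, assuming a simple queue of pointing samples and a map-based frame; all identifiers are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a fusion slot buffer: speech fills symbolic slots,
 *  the glove/gaze stream fills deictic slots. */
public class FusionAgent {
    /** Most recent pointing coordinates, newest last. */
    private final Deque<int[]> pointingStream = new ArrayDeque<>();

    public void onPointing(int x, int y) { pointingStream.addLast(new int[] {x, y}); }

    /** Parser output for "Move tank seven here": frame MOVE, object
     *  filled from speech, 'where' resolved from the pointing stream. */
    public Map<String, Object> fuseMove(String objectId) {
        Map<String, Object> frame = new HashMap<>();
        frame.put("frame", "MOVE");
        frame.put("object", objectId);
        frame.put("where", pointingStream.peekLast());  // the (x4, y4) sample
        return frame;
    }

    public static void main(String[] args) {
        FusionAgent agent = new FusionAgent();
        agent.onPointing(10, 12);  // (x1, y1): pointer still moving
        agent.onPointing(30, 40);  // (x4, y4): position at the word "here"
        Map<String, Object> cmd = agent.fuseMove("tank 7");
        int[] where = (int[]) cmd.get("where");
        System.out.println(cmd.get("frame") + " " + cmd.get("object")
                + " -> (" + where[0] + ", " + where[1] + ")");
    }
}
```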

21 Classification of feedback
– Confirmation: “Exit the system.” → “Are you sure?”
– Information retrieval: “What is this?” → “This is tank seven.”; “Where is tank nine?” → visual feedback
– Missing data: “Create tank.” → “Need to specify an ID for a tank.”
– Semantic error: “Create tank seven.” → “Tank seven already exists.”; “Resize tank nine.” → “Cannot resize a tank.”
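The rule-based feedback could be realized roughly as follows; the command tokens and message strings are illustrative assumptions, not the prototype's actual rules.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy classifier producing the feedback types on this slide. */
public class FeedbackRules {
    private final Set<String> existingTanks = new HashSet<>();

    public String check(String command, String id) {
        switch (command) {
            case "exit":
                return "Confirmation: Are you sure?";
            case "create-tank":
                if (id == null) return "Missing data: need to specify an ID for a tank.";
                if (existingTanks.contains(id)) return "Semantic error: tank " + id + " already exists.";
                existingTanks.add(id);
                return "OK: tank " + id + " created.";
            case "resize-tank":
                return "Semantic error: cannot resize a tank.";
            default:
                return "OK.";
        }
    }

    public static void main(String[] args) {
        FeedbackRules rules = new FeedbackRules();
        System.out.println(rules.check("create-tank", null));  // missing data
        System.out.println(rules.check("create-tank", "7"));   // ok
        System.out.println(rules.check("create-tank", "7"));   // already exists
        System.out.println(rules.check("resize-tank", "9"));   // semantic error
    }
}
```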

22 Multimodal issues
Referring to objects:
– describing in speech: “Move the big red circle”
– using anaphora: “Move it”
– by glove (pointing)
– gaze + pronoun: “Delete this”
– glove + pronoun: “Delete this”
Timestamps: in “Create a red rectangle from here to here”, each recognized word carries a timestamp (T1..T8), and the gaze/glove coordinate stream (xy1..xy8) is timestamped as well; each “here” is bound to the coordinate sample closest in time (see the sketch below).
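A sketch of the timestamp matching, under the assumption that each coordinate sample and each recognized word carries a timestamp and the nearest sample wins.

```java
/** Sketch of timestamp-based fusion for "... from here to here":
 *  each deictic word picks the coordinate sample whose timestamp
 *  lies closest to the word's timestamp. */
public class DeicticBinder {
    /** Index of the coordinate timestamp nearest to wordTime. */
    static int nearest(long[] coordTimes, long wordTime) {
        int best = 0;
        for (int i = 1; i < coordTimes.length; i++)
            if (Math.abs(coordTimes[i] - wordTime) < Math.abs(coordTimes[best] - wordTime))
                best = i;
        return best;
    }

    public static void main(String[] args) {
        // T1..T8: one timestamp per gaze/glove sample xy1..xy8
        long[] coordTimes = {100, 200, 300, 400, 500, 600, 700, 800};
        long firstHere = 310, secondHere = 690;  // word timestamps from SR
        System.out.println("from here -> xy" + (nearest(coordTimes, firstHere) + 1));   // xy3
        System.out.println("to here   -> xy" + (nearest(coordTimes, secondHere) + 1));  // xy7
    }
}
```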

23 Multimodal issues
Ambiguity:
– saying x, looking at y → x
– saying x, pointing at y → x
– looking at x, pointing at y → x
– saying x, gesturing y → xy or yx
Redundancy:
– saying x, looking at x → x
– etc.
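Read as resolution rules, the ambiguity cases amount to a fixed precedence (speech over gaze, gaze over glove), with sequential gestures composed in order. A toy encoding of that precedence, not the prototype's code:

```java
/** Toy resolver for conflicting modality referents, following the
 *  precedence on this slide: speech beats gaze, gaze beats glove. */
public class Disambiguator {
    static String resolve(String spoken, String gazed, String pointed) {
        if (spoken != null) return spoken;  // speech wins outright
        if (gazed != null) return gazed;    // then gaze
        return pointed;                     // glove last
    }

    public static void main(String[] args) {
        System.out.println(resolve("x", "y", null));  // saying x, looking at y -> x
        System.out.println(resolve("x", null, "y"));  // saying x, pointing at y -> x
        System.out.println(resolve(null, "x", "y"));  // looking at x, pointing at y -> x
        System.out.println(resolve("x", "x", null));  // redundancy: still x
    }
}
```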

24 Program testing
– Implementation in Java
– Program testing and debugging:
  – module testing
  – integration testing
  – configuration testing
  – time testing
  – recovery testing

25 Testing
Usability tests:
– demonstration with military personnel
– human factors study:
  – script for the user
  – questionnaire for the user
  – tables for the observer
  – log file for the observer

26 Lab

27 Conclusions: selecting
(Table comparing the modalities — speech, gaze, glove, mouse — on accuracy, speed, and learning for the selection task, with ratings from – to ++.)

28 Conclusions / recommendations
Speech:
– real-time
– low error rate
– timestamps
– misunderstanding
– grammar in help file
Glove:
– real-time
– fatigue
– low precision
– non-intentional gestures
– 2D → 3D
– limited number of gestures
Gaze:
– real-time
– head movements
– self-calibration
– jumpiness of eye movements
– face tracker
– object of interest

29 General remarks
– Response time within 1 second
– Instruction and help files
– Application effective but limited

30 Future work
– Human performance studies
– Conversational interaction
– Context-based reasoning and information retrieval
– New design (next slide)

31 New design
A blackboard architecture: input agents (gaze tracker, face tracker, speaker identification, gesture recognition, speech recognition) and output agents (speech synthesis, haptic feedback, graphical feedback, discourse manager) connect the user to a shared blackboard holding control, status, and history; knowledge agents (user model, disambiguator, domain knowledge, application controller) read from and write to the blackboard.
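A minimal sketch of the blackboard idea behind this design: agents post timestamped entries and other agents poll for new ones. All names are hypothetical; the record syntax requires Java 16+.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal blackboard: recognizer agents post entries; knowledge
 *  agents (disambiguator, application controller, ...) read them. */
public class Blackboard {
    public record Entry(long time, String agent, String content) {}

    private final List<Entry> history = new ArrayList<>();

    public synchronized void post(String agent, String content) {
        history.add(new Entry(System.currentTimeMillis(), agent, content));
    }

    /** Entries newer than 'since', for a polling knowledge agent. */
    public synchronized List<Entry> readSince(long since) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : history)
            if (e.time() > since) out.add(e);
        return out;
    }

    public static void main(String[] args) {
        Blackboard bb = new Blackboard();
        bb.post("speech-recog", "move tank seven here");
        bb.post("gesture-recog", "point (30, 40)");
        for (Entry e : bb.readSince(0))
            System.out.println(e.agent() + ": " + e.content());
    }
}
```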

32 Problem statement (recap)
– Study three technologies: speech recognition and synthesis; hand gesture input; gaze tracking
– Design a prototype (around an appropriate application)
– Implement the prototype
– Test and debug
– Human performance studies