Professor John Canny Spring 2003


CS 160: Lecture 23. Professor John Canny, Spring 2003.

Preamble. Handout for next lecture. Quiz on Help systems.

Multimodal Interfaces. "Multi-modal" refers to interfaces that support non-GUI interaction. Speech and pen input are two common examples, and they are complementary.

Speech+pen Interfaces. Speech is the preferred medium for subject-verb-object expression. Writing or gesture provides locative information (pointing, etc.).

Speech+pen Interfaces. Speech+pen for visual-spatial tasks (compared to speech only): 10% faster. 36% fewer task-critical errors. Shorter and simpler linguistic constructions. 90-100% user preference for interacting this way.

Put-That-There. The user points at an object and says "put that" (grab), then points at the destination and says "there" (drop). Very good for deictic actions (speak and point), but these are only 20% of actions. The rest need complex gestures.
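The core of a Put-That-There-style interpreter is pairing each deictic word with the pointing event nearest to it in time. A minimal sketch (the event format, names, and 1-second window are illustrative assumptions, not the actual system's implementation):

```python
from dataclasses import dataclass

@dataclass
class PointEvent:
    t: float          # timestamp in seconds
    target: str       # object or location under the finger/cursor

def resolve_deictic(utterance_words, point_events, max_skew=1.0):
    """Pair each deictic word ('that'/'there') with the pointing
    event closest in time, within max_skew seconds."""
    bindings = []
    for word, t in utterance_words:
        if word in ("that", "there"):
            nearest = min(point_events, key=lambda p: abs(p.t - t))
            if abs(nearest.t - t) <= max_skew:
                bindings.append((word, nearest.target))
    return bindings

# "put that ... there" accompanied by two pointing gestures
words = [("put", 0.0), ("that", 0.4), ("there", 1.6)]
points = [PointEvent(0.5, "blue-square"), PointEvent(1.5, "top-right")]
print(resolve_deictic(words, points))
# [('that', 'blue-square'), ('there', 'top-right')]
```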

Multimodal advantages. Advantages for error recovery: Users intuitively pick the mode that is less error-prone. Language is often simplified. Users intuitively switch modes after an error, so the same problem is not repeated.

Multimodal advantages. Other situations where mode choice helps: Users with disabilities. People with a strong accent or a cold. People with RSI. Young children or non-literate users.

Multimodal advantages. For collaborative work, multimodal interfaces can communicate a lot more than text: Speech contains prosodic information. Gesture communicates emotion. Writing has several expressive dimensions.

Multimodal challenges. Using multimodal input generally requires advanced recognition methods: for each mode; for combining redundant information; and for combining non-redundant information, e.g. "open this file (pointing)". Information is combined at two levels: the feature level (early fusion) and the semantic level (late fusion).

Early fusion (diagram): vision data, speech data, and other sensor data each feed a feature recognizer; the features are fused, and a single action recognizer operates on the fused data.

Early fusion. Early fusion applies to combinations like speech + lip movement. It is difficult because: multimodal training data is needed; the data streams must be closely synchronized; and computational and training costs are high.
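At its simplest, feature-level fusion aligns two tightly synchronized streams frame-by-frame and concatenates their feature vectors, so one joint recognizer sees both modes. A minimal sketch (the frame counts and feature sizes are made-up numbers, not from any real system):

```python
def early_fuse(audio_frames, lip_frames):
    """Feature-level (early) fusion: align two synchronized feature
    streams frame-by-frame and concatenate their vectors, so a single
    downstream recognizer sees one joint feature space."""
    n = min(len(audio_frames), len(lip_frames))   # truncate to common length
    return [audio_frames[i] + lip_frames[i] for i in range(n)]

# Hypothetical sizes: 13 acoustic features and 4 lip-shape features per frame.
audio = [[0.0] * 13 for _ in range(100)]
lips  = [[0.0] * 4 for _ in range(98)]
fused = early_fuse(audio, lips)
print(len(fused), len(fused[0]))   # 98 17
```

The need for frame-level truncation and concatenation is exactly why the slide's synchronization requirement matters: a few frames of drift between the streams scrambles every joint feature vector.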

Late fusion (diagram): vision data, speech data, and other sensor data each feed a feature recognizer followed by a per-mode action recognizer; the recognized actions are then fused into the final recognized actions.

Late fusion. Late fusion is appropriate for combinations of complementary information, like pen + speech. Recognizers are trained and used separately. Unimodal recognizers are available off-the-shelf. It's still important to accurately time-stamp all inputs: typical delays between e.g. gesture and speech are known.
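A semantic-level fusion step can be sketched as follows: each unimodal recognizer emits time-stamped hypotheses, and pairs whose timestamps fall within a known typical delay are scored by the product of the unimodal probabilities. The hypothesis format, labels, and 4-second window are illustrative assumptions:

```python
def late_fuse(speech_hyps, pen_hyps, max_delay=4.0):
    """Semantic-level (late) fusion: each recognizer produces
    time-stamped hypotheses (t, label, prob); speech and pen hypotheses
    within max_delay seconds of each other are paired and scored by the
    product of their unimodal probabilities; the best pair wins."""
    pairs = []
    for ts, s_label, s_p in speech_hyps:
        for tp, p_label, p_p in pen_hyps:
            if abs(ts - tp) <= max_delay:
                pairs.append((s_p * p_p, s_label, p_label))
    return max(pairs) if pairs else None

speech = [(2.0, "create airstrip", 0.8), (2.0, "create airship", 0.2)]
pen    = [(3.1, "line:(x1,y1)-(x2,y2)", 0.9)]
best = late_fuse(speech, pen)
print(best)   # best pair: "create airstrip" combined with the drawn line
```

Note how the pen input disambiguates nothing here by itself; it is the joint score that lets the more probable speech hypothesis win, which is the point of fusing complementary modes.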

Contrast between MM and GUIs. GUI interfaces often restrict input to single non-overlapping events, while MM interfaces handle all inputs at once. GUI events are unambiguous; MM inputs are based on recognition and require a probabilistic approach. MM interfaces are often distributed on a network.
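The unambiguous-vs-probabilistic contrast can be made concrete with two dispatchers: a GUI dispatcher maps one event to one handler, while an MM dispatcher receives an n-best list and must decide whether the top hypothesis is confident enough to act on. All names and the 0.6 threshold are illustrative assumptions:

```python
def dispatch_gui(event, handlers):
    """GUI dispatch: one unambiguous event maps to exactly one handler."""
    return handlers[event]()

def dispatch_mm(nbest, handlers, threshold=0.6):
    """MM dispatch: recognition yields an n-best list of (action, prob).
    Act only when the top hypothesis is confident enough; otherwise
    fall back to a clarification dialogue."""
    action, prob = max(nbest, key=lambda ap: ap[1])
    if prob >= threshold:
        return handlers[action]()
    return "please repeat"

handlers = {"undo": lambda: "undone", "redo": lambda: "redone"}
print(dispatch_gui("undo", handlers))                          # undone
print(dispatch_mm([("undo", 0.7), ("redo", 0.3)], handlers))   # undone
print(dispatch_mm([("undo", 0.5), ("redo", 0.5)], handlers))   # please repeat
```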

Agent architectures. Allow parts of an MM system to be written separately, in the most appropriate language, and integrated easily. OAA, the Open Agent Architecture (Cohen et al.), supports MM interfaces. Blackboards and message queues are often used to simplify inter-agent communication: Jini, JavaSpaces, TSpaces, JXTA, JMS, MSMQ...
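The blackboard idea behind these systems is that agents publish typed messages to a shared store and other agents subscribe by type, so no agent needs to know who produced the data. A minimal in-process sketch of the pattern (not the API of OAA, TSpaces, or any listed system):

```python
class Blackboard:
    """Toy blackboard: agents subscribe to a message type and receive
    every payload posted under that type, decoupling producers from
    consumers."""
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, msg_type, agent):
        self.subscribers.setdefault(msg_type, []).append(agent)

    def post(self, msg_type, payload):
        for agent in self.subscribers.get(msg_type, []):
            agent(payload)

log = []
bb = Blackboard()
# Two independent agents react to the same speech message.
bb.subscribe("speech", lambda p: log.append(f"parser got {p}"))
bb.subscribe("speech", lambda p: log.append(f"logger got {p}"))
bb.post("speech", "put that there")
print(log)
# ['parser got put that there', 'logger got put that there']
```

Real agent platforms add networking, persistence, and pattern matching on message content, but the decoupling shown here is the reason the architecture simplifies integrating separately written recognizers.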

Administrative. Final project presentations are on May 12 and 13. Presentations go by group number: groups 7-12 on Monday the 12th, groups 1-6 on Tuesday the 13th. Final reports are due on Wednesday, May 7.

Symbolic/statistical approaches. These allow symbolic operations like unification (binding of terms like "this") plus probabilistic reasoning (over the possible interpretations of "this"). The MTC system is an example: members are recognizers; teams cluster data from recognizers; the committee weights results from the various teams.
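The combination of unification and probability can be sketched in a few lines: a command frame has an unbound slot for "this", symbolic type constraints filter the candidate referents, and recognizer probabilities rank the surviving interpretations. The frame layout and all names are hypothetical, not MTC's actual representation:

```python
def interpret_deictic(frame, referents):
    """Symbolic + statistical sketch: bind the frame's unfilled object
    slot to each type-compatible candidate (unification as a symbolic
    constraint), then rank the bound interpretations by the recognizers'
    probabilities."""
    interpretations = []
    for obj, prob in referents:
        if obj["type"] == frame["object_type"]:      # symbolic constraint
            bound = dict(frame, object=obj["name"])  # bind the slot
            interpretations.append((prob, bound))
    return sorted(interpretations, key=lambda x: x[0], reverse=True)

frame = {"action": "open", "object_type": "file", "object": None}
refs = [({"name": "report.txt", "type": "file"}, 0.6),
        ({"name": "trash", "type": "folder"}, 0.3),
        ({"name": "notes.txt", "type": "file"}, 0.1)]
ranked = interpret_deictic(frame, refs)
print(ranked[0])   # most probable type-compatible binding
```

Note that the folder is excluded symbolically even though it is more probable than the second file: the two kinds of reasoning prune and rank, respectively.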

MTC architecture (figure).

Probabilistic Toolkits. The Graphical Models Toolkit from U. Washington (Bilmes and Zweig): good for speech and time-series data. MSBNx, a Bayes-net toolkit from Microsoft (Kadie et al.). UCLA's MUSE: middleware for sensor fusion (also using Bayes nets).

MM systems: Designers' Outpost (Berkeley).

MM systems: QuickSet (OGI).

Crossweaver (Berkeley).

Crossweaver (Berkeley). Crossweaver is a prototyping system for multi-modal (primarily pen and speech) UIs. It also allows cross-platform development (for PDAs, Tablet PCs, and desktops).

Summary. Multi-modal systems provide several advantages. Speech and pointing are complementary. Challenges for multi-modal interfaces. Early vs. late fusion. MM architectures and fusion approaches. Examples of MM systems.