Creating a Multimodal Design Environment Using Speech and Sketching Aaron Adler Student Oxygen Workshop September 12, 2003.



Goals for System
Create a natural user interface for a design environment
Not command based
Create a natural multimodal UI by combining speech and sketching
Some things more easily expressed with sketching and speaking

ASSIST
Natural sketching tool for mechanical engineering designs
Stylus-style input devices

Motivating Example
Newton's Cradle

Natural Language
Need to determine how users naturally talk about the devices
Videotaped 6 users sketching 6 drawings at a non-interactive whiteboard
Transcribed data and produced time-stamped speech and sketching events

Video of People Sketching

Segmenting the Data
Once the data was transcribed, graphs and charts were created to help analyze the data
Rules were created to encapsulate the knowledge about segmentation

Rules
Three types of rules
– Rules about the text of the speech
    Repeated words, mumbled words, key words
– Rules about gaps between speech and sketching
    Long pauses, timing of speech and sketching events
– Rules about groups of sketched items
    Similarly shaped objects
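The first two rule types can be illustrated with a minimal Python sketch. It is not the rule system from the talk: the `Event` record, the `candidate_breaks` function, and the 1.5 second pause threshold are all assumptions made for illustration. Only the key phrases come from the speech data described in these slides.

```python
from dataclasses import dataclass

# Hypothetical event record; the talk only says that speech and
# sketching events were time-stamped.
@dataclass
class Event:
    kind: str        # "speech" or "sketch"
    start: float     # seconds from the start of the session
    end: float
    text: str = ""   # transcribed words, for speech events only

# Key phrases taken from the transcribed speech.
KEY_PHRASES = ("and then", "and", "then", "so", "next")

# Assumed value for "long pause"; the talk does not give a number.
PAUSE_THRESHOLD = 1.5

def starts_with_key_phrase(text):
    """True if the utterance opens with one of the key phrases."""
    lowered = text.lower().strip()
    return any(lowered == p or lowered.startswith(p + " ") for p in KEY_PHRASES)

def candidate_breaks(events):
    """Return start times of events that may open a new segment."""
    ordered = sorted(events, key=lambda e: e.start)
    breaks = []
    latest_end = None  # end time of the latest activity seen so far
    for event in ordered:
        if latest_end is not None:
            long_pause = event.start - latest_end >= PAUSE_THRESHOLD
            key_word = event.kind == "speech" and starts_with_key_phrase(event.text)
            if long_pause or key_word:
                breaks.append(event.start)
        latest_end = event.end if latest_end is None else max(latest_end, event.end)
    return breaks

events = [
    Event("sketch", 0.0, 2.0),
    Event("speech", 2.2, 4.0, "this is a ball"),
    Event("speech", 4.3, 6.0, "and then another ball"),
    Event("sketch", 9.0, 10.0),
]
print(candidate_breaks(events))
```

Here the third event is flagged by the key-phrase rule and the fourth by the pause rule. A rule about groups of similarly shaped objects would need shape information that this sketch does not model.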

Some Key Words from the Speech
And
And then
Then
So
Next
Also mumbled words, "ahhh" and "ummm", are important
We have
There is
We've got
It's
I'll

WATCH
Rule output too large, need tool to view relationships between rules
WATCH created to view output of rules as a timeline

Rule Layout

Results
Software matched 24 of 29 break points
Found an additional 18 break points: 10 were harmless, 7 were ambiguous, and 1 was wrong
Hand segmentation had all events to examine at once, spatial relationships
Rules kept general to avoid overfitting
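The reported counts can be checked with a few lines of Python. The variable names are illustrative; the numbers are the ones given on the slide.

```python
# Counts as reported on the Results slide.
hand_breaks = 29   # break points found by hand segmentation
matched = 24       # of those, break points the software also found
extra = {"harmless": 10, "ambiguous": 7, "wrong": 1}

missed = hand_breaks - matched
match_rate = matched / hand_breaks
extra_total = sum(extra.values())

print(f"matched {matched}/{hand_breaks} = {match_rate:.1%}, missed {missed}")
print(f"additional break points: {extra_total}")
```

The software therefore recovered roughly 83% of the hand-labeled break points, and the three categories of additional break points sum to the 18 reported.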

Harmless
I'm puzzled as to how to indicate that > equal size of the suspended balls

Ambiguous
[draws top anchor] The slopes are fixed in position [draws middle ramp] [draws middle anchor] > [draws bottom ramp] slope

Speech System
Speech done by SLS Sapphire system
The transcribed speech was used as a basis to generate a recognizer (missing words were added)
Speaker independent
Open microphone, continuous recognition

ASSIST Modifications
ASSIST needed some modification to allow the system to manipulate the widgets
– Identical, touching, equally spaced functions
Also needed to send the current widgets to the rule system to be combined with the speech input

System Overview
Combines ASSIST and speech recognizer using the developed rules

Ambiguity
Need some inherent knowledge of pendulums, wheels, etc.
Car on ramp example
– Two identical wheels
Need to know what a wheel is!
Where should this knowledge go?
– Top-down view: speech triggers search for pendulum

How it Finds the Pendulums
Based around nouns and adjectives
Speech like: "There are three identical touching pendulums."
– Look through widgets around that time
– Extract pendulums from group of possible widgets
    Looking for an attached rod and circle
– If the speech and the sketch disagree about the number of pendulums, don't do anything
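The steps on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's actual code: the `Widget` record, the `find_pendulums` and `spoken_count` functions, the number-word table, and the 10 second time window are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical widget record; ASSIST's real data model is not shown in the talk.
@dataclass
class Widget:
    ident: str
    shape: str                 # e.g. "rod" or "circle"
    drawn_at: float            # seconds from the start of the session
    attached_to: set = field(default_factory=set)  # idents of touching widgets

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
WINDOW = 10.0  # assumed size of "around that time", in seconds

def spoken_count(utterance):
    """Return the number of pendulums mentioned, or None if none are mentioned."""
    words = utterance.lower().replace(".", " ").split()
    if not any(w.startswith("pendulum") for w in words):
        return None
    for w in words:
        if w in NUMBER_WORDS:
            return NUMBER_WORDS[w]
    return None

def find_pendulums(widgets, utterance, spoken_at):
    """Return (rod, circle) ident pairs, or [] if speech and sketch disagree."""
    expected = spoken_count(utterance)
    if expected is None:
        return []
    # Look through widgets drawn around the time of the utterance.
    nearby = [w for w in widgets if abs(w.drawn_at - spoken_at) <= WINDOW]
    by_id = {w.ident: w for w in nearby}
    pendulums = []
    for rod in nearby:
        if rod.shape != "rod":
            continue
        # A pendulum is taken to be a rod with an attached circle.
        for other_id in sorted(rod.attached_to):
            other = by_id.get(other_id)
            if other is not None and other.shape == "circle":
                pendulums.append((rod.ident, other.ident))
                break
    # If speech and sketch disagree about the count, do nothing.
    if len(pendulums) != expected:
        return []
    return pendulums

widgets = []
for i in range(1, 4):
    widgets.append(Widget(f"r{i}", "rod", 2.0 * i, {f"c{i}"}))
    widgets.append(Widget(f"c{i}", "circle", 2.0 * i + 0.5, {f"r{i}"}))

print(find_pendulums(widgets, "There are three identical touching pendulums.", 8.0))
print(find_pendulums(widgets, "There are two identical pendulums.", 8.0))
```

The second call returns an empty list because the sketch contains three rod-and-circle pairs while the speech mentions two, which mirrors the "don't do anything" rule. Handling the adjectives "identical" and "touching" is left out of this sketch.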

The System in Action

Related work
Work at OGI by Oviatt and Cohen
ASSISTANCE
Several other command-based systems

Future Work
Larger vocabulary
Using Joshua instead of JESS
Learning new vocabulary and corresponding sketches
Next generation Blackboard-based system