Published by Baldric Reeves. Modified over 8 years ago.
1
Designing Speech Interfaces for Kiosks
Max Van Kleek, Buddhika Kottahachchi, Tyler Horton, Paul Cavallaro
2
AGENDA
● Background
● Motivation
● Design
● Current Implementation
● Demo (Video)
● Evaluation
● Conclusions & Future Work
3
Background: OK-Net Oxygen Kiosk Network
4
Background: Smart Kiosk Information Navigation and Noteposting Interface (SKINNI)
Provide timely, relevant information to visitors and members of the CSAIL community through a touchscreen GUI
7
MOTIVATION
Searching for specific information via touchscreen GUIs feels tedious and error-prone
- more time consuming than desirable
- poor pointing accuracy
- widgets behave differently on touchscreens
- no tactile feedback
Optimizing the GUI for touchscreens and adding shortcuts for searching and rapid information access yielded limited success
- screen clutter
- new vs. experienced users
- forced the user to use the attached keyboard
8
MOTIVATION
Example: Navigating the directory
Searching for “Howard Shrobe”:
- Touch “Directory Pane”
- (Scan list of names, realize they are alphabetical by last name)
- Touch scrollbar Down arrow
- Attempt to drag scroll box downward (fails)
- Touch “S” shortcut at top of screen
- (Scan list of names)
- Touch scrollbar Down arrow
- Touch row corresponding to Howard Shrobe
9
MOTIVATION
Example: Navigating the directory using keyboard
Searching for “Howard Shrobe”:
- Touch “Directory Pane”
- (Scan list of names, realize they are alphabetical by last name)
- Touch text field corresponding to “Last name”
- (Move hand / glance from screen to keyboard)
- Type “S”, “h”, “r”, “o”, “b”, “e”
- Touch row corresponding to Howard Shrobe
Much shorter, but much less frequently used: it is awkward, since eyes and hands keep switching between screen and keyboard
10
Example: Navigating the directory using keyboard
Why not
Kiosk, Kiosk on the wall... What's the best interface of them all?
11
DESIGN – Speech Challenges
Robustness
- Speaker independence
- Speech disfluencies and accents
- Signal capture in noisy environments
... achieving good recognition accuracy.
Usability
- Low threshold of use
- Initial learning curve
- Visibility of system state
- Handling misrecognition errors gracefully
- Managing user expectations
Related work:
- ESPRIT MASK project – Gauvain et al. (1996)
- Smart Kiosk project – Christian et al. (2000)
12
DESIGN - Galaxy
Galaxy gives us...
- Speaker independence
- Handling of speech disfluencies/accents
SpeechBuilder gives us...
- Ease of speech domain definition/manipulation
Distributed architecture lends itself well to kiosks
- Thin clients dependent on more powerful servers
13
IMPLEMENTATION - Architecture
14
IMPLEMENTATION – Speech Domain
Constrained domain
- Only directory field and map queries
Iterative design
- Initial domain extended through informal user survey
where is [room] [thirty two] two two six [A]
[can you] [please] [(show me | tell me)] [a map] [of] [where] [room] [thirty two] two two six [A]
[can you] [please] [(show me | tell me)] [a map] [of] [where] [is] Ben Bitdiddle office [is]
[Do you know] where [is] Ben Bitdiddle office [is]
Hal Abelson
Bryan Adams
Edward Adelson.
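The query forms on this slide can be read as templates with optional words. The following is a minimal sketch of that reading, assuming square brackets mark optional words and ( a | b ) marks alternatives. It is not SpeechBuilder's or Galaxy's actual format or API; the function names pattern_to_regex, normalize and matches are hypothetical and exist only for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a template such as '[please] (show me | tell me) a map' into a regex.

    Assumed notation (an interpretation of the slide, not a real grammar format):
    [ ... ] is optional, ( a | b ) is a choice, bare words are required.
    Every word is emitted with one trailing space so optional groups compose cleanly.
    """
    parts = []
    for token in re.findall(r"[\[\]()|]|[^\s\[\]()|]+", pattern):
        if token == "[":
            parts.append("(?:")
        elif token == "]":
            parts.append(")?")
        elif token == "(":
            parts.append("(?:")
        elif token == ")":
            parts.append(")")
        elif token == "|":
            parts.append("|")
        else:
            parts.append(re.escape(token.lower()) + " ")
    return re.compile("".join(parts))


def normalize(utterance: str) -> str:
    """Lowercase the utterance and keep only words, each followed by one space."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return "".join(word + " " for word in words)


def matches(pattern: str, utterance: str) -> bool:
    """True if the whole utterance fits the template."""
    return pattern_to_regex(pattern).fullmatch(normalize(utterance)) is not None


ROOM_QUERY = "where is [room] [thirty two] two two six [A]"
OFFICE_QUERY = (
    "[can you] [please] [(show me | tell me)] [a map] [of] "
    "[where] [is] Ben Bitdiddle office [is]"
)
```

Under these assumptions, matches(OFFICE_QUERY, "Can you please show me where Ben Bitdiddle office is") holds, while an utterance outside the constrained domain does not match any template. A real recognizer works on audio and scored hypotheses rather than exact text, so this only illustrates how optional words widen the set of accepted phrasings.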
15
IMPLEMENTATION – Innovation
Speech state feedback GUI
- Provides immediate visual feedback of the system state
- What was recognized?
- Is the system ready for interaction?
- Is the system busy?
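One way to picture the speech state feedback is as a small set of states, each mapped to a message shown on screen. This is a hedged sketch only: the state names, the message wording and the feedback_text helper are assumptions made for illustration, not the kiosk's actual implementation.

```python
from enum import Enum
from typing import Optional


class SpeechState(Enum):
    """Hypothetical states, one per question the feedback GUI answers."""
    READY = "ready"            # the system is ready for interaction
    BUSY = "busy"              # the system is processing an utterance
    RECOGNIZED = "recognized"  # the system has a recognition result


def feedback_text(state: SpeechState, recognized: Optional[str] = None) -> str:
    """Return the message a feedback panel could display for a given state."""
    if state is SpeechState.READY:
        return "Ready: please speak"
    if state is SpeechState.BUSY:
        return "Busy: processing your request"
    if state is SpeechState.RECOGNIZED:
        if recognized:
            return f'Heard: "{recognized}"'
        return "Sorry, nothing was recognized"
    raise ValueError(f"unknown state: {state!r}")
```

Showing the recognized text back to the user is what lets them see why a query failed, for example a misheard name, rather than just that it failed.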
16
IMPLEMENTATION – Innovation
Advantages
- User is made aware of what the system is trying to do
- Reasons for recognition failures can be determined
- Initial familiarization process is much smoother
- User retention increases
Disadvantages
- Isn't helpful for visually impaired users
- Takes up display space
17
DEMO
18
EVALUATION - Methodology
Informal user study
- 10 subjects (lab members – not representative)
Task
- Look up the phone number for 18 randomly selected lab members
- First 6 using the Speech Interface
- Second 6 using the Touchscreen Interface
- Final 6 using the subject's preferred interface
Metric
- Time taken
- From: when the name to be looked up is provided to the subject
- To: when the subject retrieves the number from the kiosk
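The time-taken metric described above can be sketched as a per-trial duration averaged per interface. The Trial record, its field names and mean_time_by_interface are hypothetical and chosen for illustration; no measurements from the study are reproduced here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List


@dataclass
class Trial:
    """One lookup task (hypothetical record layout)."""
    interface: str              # e.g. "speech" or "touchscreen"
    name_given_at: float        # seconds: name handed to the subject
    number_retrieved_at: float  # seconds: subject reads the number off the kiosk

    @property
    def time_taken(self) -> float:
        """Duration from receiving the name to retrieving the number."""
        return self.number_retrieved_at - self.name_given_at


def mean_time_by_interface(trials: Iterable[Trial]) -> Dict[str, float]:
    """Average time taken, grouped by the interface used for each trial."""
    durations: Dict[str, List[float]] = {}
    for trial in trials:
        durations.setdefault(trial.interface, []).append(trial.time_taken)
    return {interface: mean(values) for interface, values in durations.items()}
```

Grouping by the interface actually used keeps the final six "preferred interface" trials comparable with the first twelve.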
19
EVALUATION - Results
Subjects were not aware of supported query forms
- recognition rate in the first 2 queries: 50%
- thereafter: 72%
8/10 subjects preferred the speech interface
When recognition was successful, performance was consistently better!
20
CONCLUSIONS
- Users are receptive to using speech interfaces
- Failed recognition imposes severe penalties on performance
- “Ramp-up” time can be reduced and user retention increased by providing appropriate feedback
21
FUTURE WORK
Improve recognition rates
- Improve speech domain
- Update voice models (current ones from phone data)
Further evaluation
Extend speech interface to support all functionality exposed via touchscreen interface
Conversation support
- dialog and discourse management
Multi-language support
- Stata visitors come from all over the world