CEN3722 Human Computer Interaction Advanced Interfaces Dr. Ron Eaglin
Objectives Define multi-modal interfaces Describe disadvantages and advantages of using speech in interfaces Define speech synthesis Define and provide examples of non-speech sound Describe problems of using handwriting recognition and computer vision as modes in systems Describe virtual reality and how it might be used to enhance systems Describe advances in computing and HCI needs
Multi-modal interfaces We have many senses, but use only visual (little auditory) in most computer systems Develop multi-modal and multimedia systems to take advantage of our other senses
Speech Speech is a natural form of human communication Difficult for computers Language is: Complex Context-dependent Varied
Speech Recognition Barriers to success Background noise Natural continuation “umm…”, “errr…” Variations between individuals Different inflections Different euphemisms Different regional accents
Speech Advantages Users with disabilities: visual, physical, dyslexia Jobs requiring hands: factory workers Jobs requiring visual focus (speech input still OK)
Speech Synthesis Can use speech as output Provide for more natural, human-like conversation capabilities Problem: Understanding and meaning can be highly dependent upon variations in intonation. System must understand the problem domain Canned messages
Uninterpreted Speech Fixed prerecorded messages to supplement or replace visual information Because recorded, have natural prosody and pronunciation Announcements at airports Automatic pilot
Non-speech Sound Traditionally used to provide warnings, alarms and status information. Dual-mode displays thought to be better Presentation of information along different, parallel channels allows brain to search along two paths Presentation of redundant information may increase user’s performance May remember sound, not visual, or vice versa.
Non-speech Sound Traditionally UI almost entirely visual Potential to overload visual channel Force user to attend to too many things at once Persistence of visual information means that even detail that is quickly out of date may remain on display after it is required
Non-speech Sound Speech is serial, we must listen to most of a sentence to extract meaning May take a long time Non-speech sound Can be associated with a particular action and assimilated in a much shorter period of time. Can also be universal, like icons; not true of speech. Must be learned, whereas speech doesn’t (as long as the listener knows the language!)
Non-speech Sound Non-speech sound Is short-lived, so can provide transitory information Provide status on background processes Provide a second representation of actions and objects to support the visual mode and provide confirmation to the user. Navigation
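The idea of tying a short sound to a particular action can be sketched in a few lines of Python. This is only an illustration, not a recommended sound design: the event names, frequencies, durations, and the 8000 Hz sample rate are all assumptions made for the example. Each event maps to a brief sine tone, which is rendered to WAV data using only the standard library.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

# Hypothetical mapping: interface event -> (frequency in Hz, duration in ms).
EARCONS = {
    "copy_done": (880.0, 150),  # short, high tone as confirmation
    "warning": (440.0, 300),
    "error": (220.0, 500),      # longer, lower tone
}


def synthesize(event):
    """Return the tone for an event as a list of signed 16-bit samples."""
    freq, duration_ms = EARCONS[event]
    n = SAMPLE_RATE * duration_ms // 1000
    amplitude = 0.5 * 32767  # half of full scale
    return [
        int(amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
        for i in range(n)
    ]


def to_wav_bytes(samples):
    """Pack mono 16-bit samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()
```

Calling `to_wav_bytes(synthesize("copy_done"))` yields data that could be written to a file and played when a background copy finishes, giving the status information without occupying the visual channel.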
Handwriting Recognition Writing as an input medium A natural form of communication Digitizing tablet used to capture input Problems Similar to speech, variations in individuals Writing varies from day to day for individuals
Computer Vision Additional input channel for the computer Video camera and software used to identify user and tailor system to perceived requirements Bottom-up approach Pixels progressively analyzed to extract meaning More complex problem than speech or handwriting recognition
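A toy version of the bottom-up approach might look like the following Python sketch. It is not how a production vision system works; the tiny grayscale grid, the cutoff of 128, and the function names are assumptions chosen for illustration. Raw pixels are first reduced to a binary mask, then adjacent "on" pixels are grouped into regions, so that each stage extracts slightly more meaning than the one before.

```python
def threshold(image, cutoff):
    """Stage 1: turn grayscale pixels (0-255) into a binary mask."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


def count_regions(mask):
    """Stage 2: group adjacent 'on' pixels (4-connectivity) and count groups."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                stack = [(r, c)]
                # Flood fill: visit every pixel connected to this one.
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return regions


# Hypothetical 4x5 grayscale image with three bright blobs.
IMAGE = [
    [10, 200, 210, 12, 11],
    [12, 220, 15, 10, 190],
    [11, 13, 14, 12, 205],
    [180, 10, 12, 11, 10],
]

mask = threshold(IMAGE, 128)
print(count_regions(mask))  # prints 3
```

Even this small example hints at why the problem is hard: the result depends entirely on the chosen cutoff, and counting blobs is still far from identifying a user.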
Ubiquitous Computing Computers on the desktop to computers “everywhere” Intention: “… to create a computing infrastructure that permeates our physical environment so much that we do not notice it anymore” Invisible Computing Analogy with motors Size depends on use Are invisible
Ubiquitous Computing Automated capture, integration, access Meeting room environments Note taking Context-aware computing Most “ubiquitous computing” is miniaturized desktop computing Location awareness Other Scalable interfaces Unobtrusive ubiquity
Users with Special Needs Input/output capabilities driven by the “average user” Leaves out entire groups of people Physically challenged Sight Hearing Limbs GUI and the sight-challenged
Virtual Reality Computer generated simulation of a world Specialized hardware and software (headset) Use of multi-modal devices Heavily used in training systems (military) Expanding rapidly into entertainment industry Lagging in many industries – still an emerging market (2016)
Artificial Intelligence (AI) Advances in AI open up an entirely new field of HCI Interfaces for self-driving vehicles (application of AI) Customer service AI systems in use today Systems like IBM Watson can process incredible amounts of information – provide (with correct interface) Can write applications for Watson (ibm.com/Watson)
Personal Robotics Interaction with personal robotics now being defined