CP SC 881 Spoken Language Systems

2 of 23 Auditory User Interfaces
Welcome to SLS
Syllabus
Introduction

3 of 23 Good Design (our goal!)
“Every designer wants to build a high-quality interactive system that is admired by colleagues, celebrated by users, circulated widely, and imitated frequently.” (Shneiderman, 1992, p. 7)
…and anything goes!…

4 of 23 Auditory User Interfaces
An auditory user interface (AUI) is an interface that relies primarily or exclusively on audio, including speech and non-speech sound, for interaction. (Weinschenk & Barker, 2000)

5 of 23 Auditory User Interfaces
Natural Language/Speech User Interfaces: conversation is natural.
Multimodal User Interfaces: combine voice, text, graphics, gestures, keypad, stylus, etc. into one interface.

6 of 23 Multimodal User Interfaces
Simultaneous multimodality: multiple modes at the same time (e.g., voice plus visual).
Sequential multimodality: multiple modes used one after another, seamlessly.

7 of 23 But What Makes a Good VUI?
Functionality
Speed and efficiency
Reliability, security, data integrity
Standardization, consistency
USABILITY!

8 of 23 Closer to Fine: A Philosophy
…The human user of any system is the focus of the design process. Planning and implementation are done with the user in mind, and the system is made to fit the user, not the other way around….
Bruce Walker, Georgia Institute of Technology

9 of 23 How Do You Know It’s Good?! Usability Test and Evaluation

Human Factors in Speech

11 of 23 Human Factors in Speech
High Error Rates
Speech recognition: background noise, intonation, pitch, volume
Grammars: missing words, size limitations
“When speech recognition becomes genuinely reliable, this will cause another big change in operating systems.” (Bill Gates, The Road Ahead, 1995)

12 of 23 Human Factors in Speech
Unpredictable Errors
Grammars: sound-alike words (e.g., Austin/Boston), missing words, grammar size limitations
Note: we do not like using unpredictable machines.
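The grammar errors above can be sketched in a few lines of Python. This is a toy illustration, not part of the course materials: the grammar, the city names, and the `recognize` helper are all hypothetical, standing in for a real acoustic front end.

```python
# Toy illustration: a fixed recognition grammar as a set of allowed words.
# "austin" and "boston" are acoustically similar, so a noisy recognizer may
# return either; anything outside the grammar is rejected outright.
GRAMMAR = {"austin", "boston", "dallas", "denver"}

def recognize(hypotheses):
    """Pick the first recognizer hypothesis that the grammar allows.

    `hypotheses` is an ordered list of (word, score) guesses from an
    imaginary acoustic front end, best score first.
    """
    for word, score in hypotheses:
        if word in GRAMMAR:
            return word, score
    return None, 0.0  # out-of-grammar: a "missing word" error

# The acoustic model cannot cleanly separate Austin from Boston,
# so a near-tie like this can resolve to the wrong city:
word, score = recognize([("boston", 0.51), ("austin", 0.49)])  # -> ("boston", 0.51)
```

The unpredictability the slide warns about lives in that near-tie: a 0.51/0.49 split means the "winner" can flip between utterances for reasons the user cannot see.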

13 of 23 Human Factors in Speech
User Expectations
Novice users have high expectations of computers and speech.
Natural language: novices expect to be able to say “anything” to the machine (e.g., Star Trek).
Spoken language differs from written language, e.g., “um”s and “uh”s appear in spoken language.

14 of 23 Human Factors in Speech
Memory
Speech-only systems can tax human memory, e.g., large telephone menu systems (Miller: 7 plus or minus 2 items).
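One common response to the 7 ± 2 limit is to break a long phone menu into memory-sized submenus. A minimal sketch, with a hypothetical `chunk_menu` helper and made-up option names:

```python
def chunk_menu(options, chunk_size=5):
    """Split a long option list into submenus small enough to hold in
    working memory (Miller's 7 +/- 2; 5 leaves some headroom)."""
    return [options[i:i + chunk_size]
            for i in range(0, len(options), chunk_size)]

# Twelve menu options become three submenus of 5, 5, and 2 options:
menus = chunk_menu([f"option {i}" for i in range(1, 13)])
```

A caller would then read one submenu at a time ("for more options, say 'next'"), so the listener never has to hold all twelve choices in mind at once.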

Definitions and Terms

16 of 23 Speech Recognition
Refers to the technologies that enable computing devices to identify the sound of the human voice. Example input: “List all the Clemson University orders.”

17 of 23 Speech Recognition
Continuous recognition: allows a user to speak to the system in an everyday manner without using specific, learned commands.
Discrete recognition: recognizes a limited vocabulary of individual words and phrases spoken by a person.

18 of 23 Speech Recognition
Word spotting: recognizes predefined words or phrases; used by discrete recognition applications.
“Computer, I want to surf the Web.”
“Hey, I would like to surf the Web.”
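Word spotting can be sketched as a scan over a transcript for trigger phrases, ignoring everything around them. A toy illustration (the phrase list and `spot` helper are invented for this example; a real spotter works on audio, not text):

```python
# Toy word spotter: look for predefined trigger phrases in a transcript
# and ignore all the surrounding words, so phrasing doesn't matter.
SPOT_PHRASES = ["surf the web", "check email"]

def spot(utterance):
    """Return whichever trigger phrases occur in the utterance."""
    text = utterance.lower()
    return [p for p in SPOT_PHRASES if p in text]

# Both phrasings from the slide trigger the same action:
spot("Computer I want to surf the Web")    # -> ["surf the web"]
spot("Hey, I would like to surf the Web")  # -> ["surf the web"]
```

This is why word spotting suits discrete-recognition applications: the system only has to recognize the small fixed phrase set, not the free-form speech around it.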

19 of 23 Speech Recognition
Voice Verification or Speaker Identification
Voice verification is the science of verifying a person’s identity on the basis of their voice characteristics. Unique features of a person’s voice are digitized and compared with the individual’s pre-recorded “voiceprint” sample stored in a database. It differs from speech recognition in that the technology does not recognize the spoken words themselves.
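The compare-against-a-voiceprint step can be sketched as a similarity test between feature vectors. Everything here is a simplification: the three-number "voiceprints", the `verify` helper, and the 0.95 threshold are invented for illustration; real systems use high-dimensional acoustic features and tuned thresholds.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def verify(voiceprint, sample, threshold=0.95):
    """Accept the identity claim if the sample is close enough to the
    enrolled voiceprint. Note: nothing here looks at WHAT was said,
    only at how the voice's features compare."""
    return cosine_similarity(voiceprint, sample) >= threshold

enrolled = [0.9, 0.1, 0.4]            # stored "voiceprint" (made-up features)
verify(enrolled, [0.88, 0.12, 0.41])  # genuine speaker -> True
verify(enrolled, [0.1, 0.9, 0.2])     # impostor -> False
```

The slide's distinction falls out of the code: the words in the utterance never appear anywhere, only the feature vectors do.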

20 of 23 Speech Synthesis
Refers to the technologies that enable computing devices to output simulated human speech. Example output: “James, here are the Clemson University orders.”

21 of 23 Speech Synthesis
Formant synthesis: uses a set of phonological rules to control an audio waveform that simulates human speech. Sounds robotic and very synthetic, but is getting better.

22 of 23 Speech Synthesis
Concatenated synthesis: assembles recorded voice sounds by computer to create meaningful speech output. Sounds very human; most people can’t tell the difference.
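The contrast with formant synthesis can be shown as a lookup-and-splice sketch. The unit database here holds tiny made-up number lists standing in for recorded audio snippets; a real concatenative system stores actual recordings of diphones or words and smooths the joins between them.

```python
# Toy concatenative synthesis: look up pre-recorded unit "waveforms"
# (here, placeholder sample lists) and splice them end to end.
UNIT_DB = {
    "here":   [0.1, 0.3, -0.2],   # hypothetical recorded snippets
    "are":    [0.0, 0.2, 0.1],
    "the":    [-0.1, 0.1, 0.0],
    "orders": [0.2, -0.3, 0.1],
}

def concatenate(words):
    """Build the output waveform by splicing recorded units in order."""
    out = []
    for w in words:
        out.extend(UNIT_DB[w])   # real systems also smooth these joins
    return out

wave = concatenate(["here", "are", "the", "orders"])  # 12 samples
```

The human-sounding output comes directly from the fact that every sample originated in a human recording; the trade-off is that the system can only say what its unit database covers.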

23 of 23 Uses of Speech Technologies
Interactive voice response systems (call centers)
Medical, legal, business, commercial, warehouse
Handheld devices
Toys and education
Automobile industry
Universal access (visually or physically impaired users)