Auditory User Interfaces

Presentation transcript:

Multimedia
Auditory User Interfaces
T.Sharon - A.Frank

Auditory User Interfaces
An auditory user interface (AUI) is an interface that relies primarily or exclusively on audio, including speech and sound, for interaction (Weinschenk & Barker 2000).
Examples:
- Natural language/speech user interfaces.
- Hands-free automobile navigation systems.
- Interactive voice response (IVR) systems, such as an automated payment center.
- Products for the visually impaired.

Why Audio I/O?
- Hands busy
- Eyes engaged
- Disabilities

Potential Applications
Auditory interfaces can be used in many areas of daily life:
- Dictation systems
- Navigation systems
- Transaction systems
- Operator services
- Recording meetings and indexing them later

Why has audio I/O been underused until now?
- Needs multiple I/O channels
- Cost problems
- Technical problems
- Algorithmic problems

Audio I/O Main Technologies
- Speech synthesis
- Speech recognition
- Speaker recognition
- Non-speech audio

Speech Synthesis
- Text-to-Speech
- Phoneme-to-Speech
- Stored Messages

Basic workflow of Text-to-Speech

Phoneme-to-Speech
- Stored phonemes: pre-recorded.
- Parameterization (male/female, old/young).
- Phonemes are combined in sequence to generate words and sentences.
- Synthesizer chip.
Diagram labels: Parameters, Stored Phonemes, Synthesizer Chip.
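The slide above describes selecting pre-recorded phonemes by voice parameters and combining them in sequence. A minimal sketch of that idea follows. The inventory `STORED_PHONEMES`, the voice names, the phoneme labels, and the sample values are all hypothetical placeholders, not data from the presentation; a real synthesizer stores recorded waveforms and also smooths the joins.

```python
# Hypothetical inventory: one set of pre-recorded phonemes per voice.
# Each phoneme maps to a short list of placeholder audio samples.
STORED_PHONEMES = {
    "female": {"HH": [0.1, 0.2], "AY": [0.3, 0.4, 0.5]},
    "male": {"HH": [0.05, 0.1], "AY": [0.15, 0.2, 0.25]},
}


def synthesize(phonemes, voice="female"):
    """Concatenate the stored samples for each phoneme, in order."""
    if voice not in STORED_PHONEMES:
        raise ValueError(f"unknown voice: {voice}")
    inventory = STORED_PHONEMES[voice]
    samples = []
    for phoneme in phonemes:
        if phoneme not in inventory:
            raise KeyError(f"no stored recording for phoneme: {phoneme}")
        samples.extend(inventory[phoneme])
    return samples


# Example: the word "hi" as the phoneme sequence HH, AY.
hi_samples = synthesize(["HH", "AY"], voice="female")
```

Here the voice parameter simply picks a different stored set, which is one simple way to model the male/female parameterization mentioned on the slide.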

Stored Messages
- Prerecorded parts
- Message splicing
- How to smooth speech?
- Voice playback
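The slide asks how spliced messages can be smoothed but does not give an answer. One common approach, shown here only as an assumption and not as the presenters' method, is a linear crossfade: the end of one prerecorded part is blended into the start of the next. The function name `splice` and the sample values are illustrative.

```python
def splice(first, second, overlap):
    """Join two sample lists, blending `overlap` samples with a linear crossfade."""
    if overlap < 0 or overlap > min(len(first), len(second)):
        raise ValueError("overlap must fit inside both parts")
    if overlap == 0:
        # Plain splicing: no smoothing at the join.
        return list(first) + list(second)
    start = len(first) - overlap
    blended = []
    for i in range(overlap):
        # Weight rises from near 0 to near 1 across the overlap region.
        weight = (i + 1) / (overlap + 1)
        blended.append((1 - weight) * first[start + i] + weight * second[i])
    return list(first[:start]) + blended + list(second[overlap:])


# Example: two constant "recordings" joined with a one-sample crossfade.
joined = splice([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], overlap=1)
```

A longer overlap gives a softer transition at the cost of shortening the combined message by `overlap` samples.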

Speech Synthesis Timeline

Speech Recognition
- Get acoustic patterns (sampling).
- Match to templates (map acoustic patterns to known templates).
- Identify tokens.
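The three steps above can be sketched with a toy template matcher. The slide does not name a matching algorithm; this sketch assumes dynamic time warping (DTW), a classic choice for template-based recognition because it tolerates differences in speaking rate. The templates and the one-dimensional "acoustic patterns" are made-up numbers; real systems compare sequences of feature vectors.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(pattern, templates):
    """Return the token whose template is closest to the acoustic pattern."""
    return min(templates, key=lambda token: dtw_distance(pattern, templates[token]))


# Hypothetical templates, one per known token.
TEMPLATES = {"yes": [1, 3, 4, 3, 1], "no": [4, 2, 1, 1, 2]}

# A slower utterance of "yes": the same shape with some frames repeated.
token = recognize([1, 1, 3, 4, 4, 3, 1], TEMPLATES)
```

Because DTW allows frames to be stretched, the slower utterance still aligns exactly with the "yes" template.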

Speech Recognition Problems
- Fast talkers
- Swallowed words
- Speech problems
- Slang words (culture oriented)
- Similar-sounding words
- Environmental noise

Speech Recognition Factors
- Speaker (in)dependent
  - Single voice training
  - Pre-train/generalize
- Vocabulary size
  - Training cost
  - Database complexity
- Pace of speech
  - Isolated words
  - Continuous speech
  - Connected speech

Factors affecting error rate of speech recognition
- Vocabulary size
- Background noise
- Speech spontaneity
- Sampling rate
- Amount of training data available

Word error rate of speech recognition
[Chart: word error rate (0% to 40%) plotted against level of difficulty. Task labels: Digits, Continuous, Command and Control, Letters and Numbers, Broadcast News, Read Speech, Conversational Speech.]
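The chart reports word error rate (WER) without defining it. WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of reference words. A minimal sketch using word-level edit distance follows; the function name and the example sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("the" -> "a") and one deletion ("now") over 4 words.
wer = word_error_rate("call the bank now", "call a bank")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.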

Basic workflow of Speech-to-Text

Siri as an Example
Siri is an intelligent personal assistant that helps you get things done just by asking. It allows you to use your voice to send messages, schedule meetings, place phone calls, search the web, and more. Siri understands your natural speech, and it asks you questions if it needs more information to complete a task.