VoiceXML. Nuance Speech Analysis 92% of customer service is through phone. 84% of industrialists believe speech better than web.

1 VoiceXML

2 Nuance Speech Analysis: 92% of customer service is delivered by phone. 84% of industrialists believe speech is better than the Web.

3 History of VoiceXML
AT&T (’95): PML
Bell/Lucent (’98): PML
IBM (’98): SpeechML
HP (’98): TalkML
Motorola (’98): VoxML
VoiceXML Forum (’00): VoiceXML 1.0
W3C (’02): VoiceXML 2.0

4 VoiceXML An open standard language for serving voice/audio documents. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations.

5 VoiceXML (Cont’d) VoiceXML documents can be produced by scripts, CGI programs, etc. Can take input from the listener via speech (filling out forms, as in HTML). Used extensively for automated call handling. Makes information accessible over (cell) phones. The next revolution on the Web.

6 Architectural Model

7 Goals of VoiceXML Bring Web development and content delivery to voice response applications. Minimize client/server interactions. Separate user interaction code from service logic. Shield application authors from platform-specific details.

8 Voice Browser A software platform running on a network server. It supports the following features: ASR (automatic speech recognition), DTMF input, recognition grammars, mixed-initiative dialog, and TTS (text-to-speech). Voice browser : VoiceXML :: Web browser : HTML

9 Voice Enabling

10 Sample VoiceXML Code Would you like to get rich quick? Gotcha. You want to be rich! You don't want to be rich.
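The markup of this sample did not survive in the transcript; only the prompt text remains. As a rough sketch of what such a dialog might look like, the Python script below generates a VoiceXML 2.0 form with one yes/no field. The element layout (a boolean `field` with `filled`, `if`, and `else`), the field name `rich`, and the assignment of each sentence to a branch are assumptions for illustration, not the slide's original code.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_rich_dialog() -> str:
    """Return a VoiceXML document that asks one yes/no question."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")

    # A boolean field relies on the platform's built-in yes/no grammar.
    # The field name "rich" is a hypothetical choice.
    field = ET.SubElement(form, "field", {"name": "rich", "type": "boolean"})
    ET.SubElement(field, "prompt").text = "Would you like to get rich quick?"

    # <filled> runs once the listener has answered; <if>/<else> picks a reply.
    filled = ET.SubElement(field, "filled")
    branch = ET.SubElement(filled, "if", {"cond": "rich"})
    ET.SubElement(branch, "prompt").text = "Gotcha. You want to be rich!"
    ET.SubElement(branch, "else")
    ET.SubElement(branch, "prompt").text = "You don't want to be rich."

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_rich_dialog())
```

Generating the document from a script also illustrates the earlier point that VoiceXML pages can be served by CGI-style programs rather than written by hand.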

11 Problem with VoiceXML Navigation of the voice document. The author has to ask where the listener would like to go next. The listener has no control over navigation. Tedious; advanced applications are not possible. Analogy: scroll vs. book.

12 Our Architecture

13 Voice Anchors Speech labels that listeners can place on a dialog. The listener can return to that dialog later by uttering the label. Hard to implement, as free-form speech recognition is not possible. Needs to be incorporated into the voice browser.
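The idea of placing and recalling an anchor can be sketched as a small bookkeeping structure inside the browser. The class below is a minimal illustration only: `AnchoredDocument` and its method names are hypothetical, labels arrive as already-recognized text, and speech recognition itself is out of scope.

```python
from typing import Dict, List, Optional


class AnchoredDocument:
    """A linear list of dialogs with listener-placed anchors (hypothetical)."""

    def __init__(self, dialogs: List[str]) -> None:
        self.dialogs = dialogs
        self.position = 0
        self.anchors: Dict[str, int] = {}

    def current(self) -> str:
        return self.dialogs[self.position]

    def skip(self) -> str:
        """Move to the next dialog, stopping at the last one."""
        self.position = min(self.position + 1, len(self.dialogs) - 1)
        return self.current()

    def previous(self) -> str:
        """Move to the preceding dialog, stopping at the first one."""
        self.position = max(self.position - 1, 0)
        return self.current()

    def place_anchor(self, label: str) -> None:
        """Attach a label to the dialog currently being read."""
        self.anchors[label.strip().lower()] = self.position

    def recall_anchor(self, label: str) -> Optional[str]:
        """Jump back to the labelled dialog, or return None if unknown."""
        target = self.anchors.get(label.strip().lower())
        if target is None:
            return None
        self.position = target
        return self.current()


if __name__ == "__main__":
    doc = AnchoredDocument(["Welcome", "Weather", "News"])
    doc.skip()
    doc.place_anchor("forecast")
    doc.skip()
    print(doc.recall_anchor("forecast"))
```

Keeping the anchor table in the browser rather than in the document is what gives the listener, not the author, control over where to return.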

14 Voice Anchors (Cont’d) We developed a number of methods for attaching voice anchors. Most practical method: spelling. Anchor as a whole word. Default anchors. Default navigation strategies.

15 Cumulative Anchors Different dialogs can be marked with the same label. Recalling the label reads out the corresponding dialogs. Multiple cumulative anchors in a single document.
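A cumulative anchor differs from a plain anchor in that one label collects several dialogs. The sketch below assumes that recalled dialogs are read in document order and that marking the same dialog twice has no extra effect; neither detail is stated on the slide, and the class name is hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List


class CumulativeAnchors:
    """Lets one label mark several dialogs of a document (hypothetical)."""

    def __init__(self, dialogs: List[str]) -> None:
        self.dialogs = dialogs
        self._marks: DefaultDict[str, List[int]] = defaultdict(list)

    def mark(self, label: str, dialog_index: int) -> None:
        """Add the dialog at dialog_index to the set marked by label."""
        if not 0 <= dialog_index < len(self.dialogs):
            raise IndexError("dialog index out of range")
        indices = self._marks[label.lower()]
        if dialog_index not in indices:
            indices.append(dialog_index)

    def recall(self, label: str) -> List[str]:
        """Return every dialog marked with label, in document order."""
        indices = self._marks.get(label.lower(), [])
        return [self.dialogs[i] for i in sorted(indices)]


if __name__ == "__main__":
    anchors = CumulativeAnchors(["Intro", "Rain today", "Stocks", "Rain tomorrow"])
    anchors.mark("weather", 3)
    anchors.mark("weather", 1)
    anchors.mark("money", 2)
    print(anchors.recall("weather"))
```

Because labels are independent keys, several cumulative anchors can coexist in the same document.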

16 Grammar The set of valid expressions. Each dialog references one or more grammars. Written in the Nuance Grammar Specification Language (GSL). Inline grammars and offline grammars. Offline grammars provide the following advantages: they can be generated dynamically (via CGI scripts or ASP pages), reused by multiple dialogs or applications, and updated or modified without changing the source code. Subgrammars and form-level grammars.

17 Sample Grammar code
<![CDATA[
[
  [(skip)] { }
  [(previous)] { }
  [(place anchor) (call mark) (begin mark)] { }
  [(recall mark) (recall anchor) (recall)] { }
]
]]>
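Since offline grammars can be generated dynamically, a grammar of the shape shown on this slide could be produced by a short script. The sketch below is an assumption-laden illustration: the braces on the slide are empty in this transcript, so the slot name `command` and the slot values are hypothetical fillers in GSL's `{<slot "value">}` form, and `build_gsl_grammar` is not part of any Nuance tool.

```python
from typing import Dict, List


def build_gsl_grammar(commands: Dict[str, List[str]], slot: str = "command") -> str:
    """Emit a GSL-style grammar with one alternative list per command.

    `commands` maps a slot value to the phrases that trigger it.
    The slot name and values are illustrative assumptions.
    """
    lines = ["["]
    for value, phrases in commands.items():
        # Each phrase becomes a parenthesized word sequence.
        alternatives = " ".join(f"({phrase})" for phrase in phrases)
        lines.append(f'  [{alternatives}] {{<{slot} "{value}">}}')
    lines.append("]")
    return "\n".join(lines)


if __name__ == "__main__":
    navigation = {
        "skip": ["skip"],
        "previous": ["previous"],
        "place": ["place anchor", "call mark", "begin mark"],
        "recall": ["recall mark", "recall anchor", "recall"],
    }
    print(build_gsl_grammar(navigation))
```

Serving this text from a CGI-style endpoint would let the phrase lists change without touching the VoiceXML source that references the grammar.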

18 Future work

19 Applications The Voice Web. Talking books. Mathematics for the visually impaired. Hazardous material emergency response.

