VoiceXML
Team 1: Matt Ganis, Jonathan Hill, Henry Wong, Anne I. Mannette-Wright

April 8, 2006, Team 1, VoiceXML

Agenda
– History of Voice Applications and VoiceXML
– Related Voice Type Languages
– Advantages of VoiceXML
– Architecture of VoiceXML
– Paper 1
– Paper 2
– Paper 3
– Demonstration
– VoiceXML 2.0
– Differences between VoiceXML 1.0 and 2.0
– The Future: VoiceXML 2.1

History of Voice Applications
Voice technologies emerged in the 1990s:
– Automatic Speech Recognition (ASR)
    Small vocabulary and speech recognition problems were solved
– Text-to-Speech Systems
    Can generate speech responses on the fly
– Interactive Voice Response (IVR) applications

History of Voice Applications
IVRs became programmable, but programmable IVRs:
– Were difficult to program (call scripting was often vendor-specific), so each vendor had to “reinvent the wheel”
– Did not allow an application to be moved easily from one IVR to another, because IVRs were proprietary

History of VoiceXML
1995: AT&T started work on Phone Markup Language (PML)
Oct. 1998: Motorola developed VoxML (Voice Markup Language)
Feb. 1999: IBM developed SpeechML technology
Mar. 1999: VoiceXML Forum was formed by IBM, AT&T, Lucent, and Motorola
– Mission was to design a standard dialog design language that developers could use to build conversational applications
Mar. 2000: VoiceXML Forum released VoiceXML 1.0 to the general public
May 2000: VoiceXML 1.0 accepted by W3C

W3C Speech Interface Framework
[Figure] From: McGlashan, Dr. Scott, “VoiceXML 2.0 from the Inside”, retrieved from

Related Voice Type Languages
Related to VoiceXML:
– Grammar XML (grXML)
    Provides speech grammars used by speech recognition engines
– Speech Synthesis Markup Language (SSML)
    The SSML specification is based upon the JSML (JSpeech Markup Language) and JSGF (JSpeech Grammar Format) specifications, which are owned by Sun
    Introduced in September 2004; currently a W3C standard at Version 1.0
    Standardized way of specifying how text is rendered as speech; includes tags for pronunciation, tone, inflection, etc.
    Often embedded in VoiceXML scripts to drive interactive telephony systems
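As a rough illustration of SSML embedded in a VoiceXML script, the sketch below places the SSML `emphasis`, `break` and `prosody` elements inside a VoiceXML 2.0 prompt. The form id and the wording are invented for the example, and how each element actually sounds depends on the platform's synthesizer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: the form id and prompt text are hypothetical. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="greeting">
    <block>
      <prompt>
        <!-- emphasis: ask the synthesizer to stress a phrase -->
        Welcome to the <emphasis>flight information</emphasis> line.
        <!-- break: insert a half-second pause -->
        <break time="500ms"/>
        <!-- prosody: slow down the speaking rate for this sentence -->
        <prosody rate="slow">Please have your booking number ready.</prosody>
      </prompt>
    </block>
  </form>
</vxml>
```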

Related Voice Type Languages
Related to VoiceXML (Continued):
– Call Control XML (CCXML)
    W3C standard markup language for controlling telephony and telephony equipment; currently at Version 1.0
    Performs tasks such as setting up conference calls, transferring incoming calls, etc.
    Works hand-in-hand with VoiceXML

Architecture of VoiceXML
[Figure] From: Voice eXtensible Markup Language (VoiceXML™) version 1.0

Advantages of VoiceXML
VoiceXML is a markup language that:
– Minimizes client/server interactions by specifying multiple interactions per document
– Shields application authors from low-level and platform-specific details
– Separates user interaction code (in VoiceXML) from service logic (e.g. CGI scripts)
– Promotes service portability across implementation platforms
– Is easy to use for simple interactions, and yet provides language features to support complex dialogs
VoiceXML is a common language for content providers, tool providers, and platform providers.

Paper 1
Authored by Bruce Lucas: “VoiceXML for Web-based Distributed Conversational Applications”
– Presents an introduction to VoiceXML
– Comparison to HTML
– Support for Natural Dialogue

Paper 1
VoiceXML is an XML application, which results in the following benefits:
– Allows the reuse and easy retooling of existing tools for creating, transforming, and parsing XML documents
– Allows VoiceXML to make use of other complementary XML-based standards. Example: Java Speech Markup Language for speech synthesis
A form is VoiceXML’s basic dialogue unit
– Contains a set of inputs (fields)
– Specifies what to do with a set of fields after data is collected
A field includes a prompt and a specification of what the user is allowed to say

Paper 1 – VoiceXML Code Example
Menu prompt: Say one of: Sports scores, Weather information, Log in
Field prompt: Please say your complete phone number
Field prompt: Please say your PIN code
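Only the prompt text of this slide's code example survives in the transcript. The sketch below shows one way a menu and a login form with these prompts could be written in VoiceXML 2.0; it is not a recovery of the slide's original markup. All URLs, the form id and the field names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: URLs, ids and field names are assumptions. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- A menu offers a set of choices; enumerate reads them out in order. -->
  <menu>
    <prompt>Say one of: <enumerate/></prompt>
    <choice next="http://www.example.com/sports.vxml">Sports scores</choice>
    <choice next="http://www.example.com/weather.vxml">Weather information</choice>
    <!-- "#login" jumps to the form below in this same document. -->
    <choice next="#login">Log in</choice>
  </menu>

  <!-- A form collects a set of fields and then acts on them. -->
  <form id="login">
    <field name="phone_number" type="phone">
      <prompt>Please say your complete phone number</prompt>
    </field>
    <field name="pin_code" type="digits">
      <prompt>Please say your PIN code</prompt>
    </field>
    <block>
      <!-- Send both collected values to a (hypothetical) server script. -->
      <submit next="http://www.example.com/login" namelist="phone_number pin_code"/>
    </block>
  </form>
</vxml>
```

Each field is visited in document order; once both are filled, the block runs and hands the values to the service logic on the server.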

Paper 1
VoiceXML includes support for common field types, including numbers, digits, phone, date and time, and for user-specified fields using grammars
Field prompt: What would you like to drink?
Grammar: coffee | tea | orange juice | milk | nothing
Field prompt: What sandwich would you like?
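A minimal sketch of the two kinds of field described here, written for VoiceXML 2.0: a built-in type (`number`) and user-specified grammars, one inline and one referenced by URI. The slide's original markup is not preserved, so the form id, the field names, the extra "quantity" field and the `sandwich.grxml` file are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: ids, field names and sandwich.grxml are hypothetical. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="order">

    <!-- User-specified field: the inline grammar lists what may be said. -->
    <field name="drink">
      <prompt>What would you like to drink?</prompt>
      <grammar version="1.0" root="drink" mode="voice">
        <rule id="drink">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
            <item>orange juice</item>
            <item>milk</item>
            <item>nothing</item>
          </one-of>
        </rule>
      </grammar>
    </field>

    <!-- User-specified field: the grammar lives in a separate file. -->
    <field name="sandwich">
      <prompt>What sandwich would you like?</prompt>
      <grammar src="sandwich.grxml" type="application/srgs+xml"/>
    </field>

    <!-- Built-in field type: the platform supplies the number grammar. -->
    <field name="quantity" type="number">
      <prompt>How many sandwiches would you like?</prompt>
    </field>
  </form>
</vxml>
```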

Paper 1 – The Distributed Model
VoiceXML provides support for advanced features such as:
– Local validation and processing
– Audio playback and recording
– Support for context-specific and tapered help and reusable subdialogues
From: Lucas, Bruce, “VoiceXML for Web-Based Distributed Conversational Applications”, Communications of the ACM, Vol. 43, No. 9, September 2000.
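The features listed above might look roughly like this in VoiceXML 2.0. This is a sketch under assumptions: the form, the PIN rule and the prompt wording are invented. It shows local validation in `filled`, help that changes on repeated requests via `count`, and recording followed by playback.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: names, the PIN rule and the wording are hypothetical. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="voicemail">

    <field name="pin" type="digits?length=4">
      <prompt>Please say your four digit PIN.</prompt>

      <!-- Tapered help: the second request gets a more detailed message. -->
      <help count="1">Say the four digits of your PIN.</help>
      <help count="2">
        Your PIN is the four digit number you chose when you signed up.
        Say it one digit at a time.
      </help>

      <!-- Local validation: checked in the browser, no server round trip. -->
      <filled>
        <if cond="pin == '0000'">
          <prompt>That PIN is not allowed.</prompt>
          <clear namelist="pin"/>
        </if>
      </filled>
    </field>

    <!-- Recording: capture up to 30 seconds of caller audio. -->
    <record name="msg" beep="true" maxtime="30s" finalsilence="3s">
      <prompt>Record your message after the beep.</prompt>
    </record>

    <!-- Playback: play the recording back to the caller. -->
    <block>
      <prompt>You said: <audio expr="msg"/></prompt>
    </block>
  </form>
</vxml>
```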

Paper 1 – VoiceXML compared with HTML
An HTML document is a single unit specified by a URI and presented to the user all at once
– A VoiceXML document contains a number of dialogue units (menus or forms) presented sequentially
An HTML document has no markup to identify distinct units
– A VoiceXML document is structured to reflect the sequential nature of the voice medium
An HTML document is like one single dialogue
– A VoiceXML document requires dialogue elements so they can be presented one at a time
– VoiceXML has application logic for sequencing among dialogue units

Paper 1 – Support for Natural Dialogue
VoiceXML supports “directed” and “mixed initiative” dialogues
– “Directed” dialogues: the computer directs the conversation at each step by prompting the user for the next piece of information
    Example:
    C: On what date do you wish to fly?
    H: May 6th
– “Mixed initiative” dialogues: each participant can take the initiative in leading the conversation. VoiceXML does this by allowing input grammars to be specified at the form level
    Example:
    C: How can I help you?
    H: I’d like to fly from New York on May 8th
    C: Where would you like to fly to?
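The mixed-initiative exchange above could be sketched in VoiceXML 2.0 as follows. The form-level grammar is what lets a single utterance fill several fields; any field still empty afterwards is then prompted for in directed style. The grammar files `flight.grxml` and `city.grxml`, the submit URL and all names are hypothetical, and the form-level grammar is assumed to return values for `origin`, `destination` and `travel_date`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: grammar files, URL and names are assumptions. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="book_flight">

    <!-- Form-level grammar: active for the whole form, so one sentence
         such as "from New York on May 8th" can fill several fields. -->
    <grammar src="flight.grxml" type="application/srgs+xml"/>

    <!-- Open question that starts the mixed-initiative exchange. -->
    <initial name="start">
      <prompt>How can I help you?</prompt>
    </initial>

    <!-- Fields not filled by the first utterance are asked for one by one. -->
    <field name="origin">
      <prompt>Where would you like to fly from?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <field name="destination">
      <prompt>Where would you like to fly to?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <field name="travel_date" type="date">
      <prompt>On what date do you wish to fly?</prompt>
    </field>

    <block>
      <submit next="http://www.example.com/book"
              namelist="origin destination travel_date"/>
    </block>
  </form>
</vxml>
```

In the slide's example the caller supplies the origin and date up front, so only the destination field would still need its prompt.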

Paper 2
Concepts of Programming by Voice
– Motivated by the need to program without typing, thereby preventing repetitive stress injuries (RSI), a common injury among those who spend long hours typing
– Voice-activated software for the disabled is a prime motivator in development
– Paper proposes a system that creates an environment for voice-activated programming

Paper 2
The cost of such software has fallen dramatically:
– $7500 in 1998
– $100 in 2005
Products include:
– Dragon NaturallySpeaking
– IBM ViaVoice
– Lernout & Hauspie Voice Xpress

Paper 2
Authors developed a generator called VocalGenerator using Dragon NaturallySpeaking with MS Visual C++
– Input: a context-free grammar compatible with most programming languages
– Output: an environment in which a syntax-directed program can be written by voice input alone
– Allows for better recognition and selection of sections of code

Paper 2
Evaluation of the product
– Programming is faster using a syntax-directed voice recognition system than a natural language DVR
– A programmer suffering from repetitive stress injuries will be able to program at a speed sufficient to ‘maintain competitive employment’

Paper 3
– Focuses on ‘V-commerce’ through a survey of VoiceXML applications for business communication
– Looks at the inherent risks in human-to-human communication and the challenges these pose to human-to-computer communication
– Examines speech recognition
– Seeks to leverage the predominance of telephone usage globally

Paper 3
Utilizes the W3C Voice Browser Working Group design criteria, including:
– Consistency
– Interoperability
– Generality
– Internationalization
– Generalization and Readability
– Implementation

Paper 3
Looks at the potential for a voice-activated Web interface
Looks at a transactional communication method with six phases:
– Sender has an idea
– Sender transforms the idea into a message
– Sender transmits the message
– Receiver gets the message
– Receiver interprets the message
– Receiver reacts and sends feedback

Paper 3
Challenges include:
– Unproven business models
– Business process change requirements
– Channel conflicts
– Technology hurdles
– Legal issues
– Security & privacy

Paper 3
Conclusions
– Speech is natural, flexible and efficient
– Voice technology will improve
– Voice recognition capabilities will improve
– The intersection of voice recognition, telecom and Web technologies may lead to a large market for products that take advantage of this intersection

Demo
Using TellMe Studio
TellMe Studio provides you with resources to:
– Build and test your own Internet-powered "phone sites" with nothing but your Web browser and an ordinary telephone, in the following ways:
    Type VoiceXML directly into an area called the “Scratchpad” and then call the phone number to preview the code
    Publish the VoiceXML and audio files on a publicly accessible Web server, point Studio at the URL for your application's "home page", and once again call the Studio phone number to preview the application
– Browse and leverage an extensive library of sample code, grammars, audio, and VoiceXML documentation
– Participate in the Voice Web development community through open newsgroups

Demo (Continued)
This demo, Drink Recipes I, uses one of the “prebuilt” VoiceXML scripts available from the TellMe Studio Code Library
This version of Drink Recipes:
– Asks the caller for a drink name
– In response, plays back the drink's ingredients list and mixing instructions
– Demonstrates the use of large grammars and how to create data-driven applications

VoiceXML 2.0
[Figure] From: McGlashan, Dr. Scott, “VoiceXML 2.0 from the Inside”, retrieved from

Differences Between VoiceXML 1.0 and 2.0
Differences between VoiceXML 1.0 and 2.0:
– Interoperability
– Functional Completeness
– Clarity

VoiceXML 2.0
Interoperability: VoiceXML 2.0 mandates the following formats, which assure developers that their applications will run on any VoiceXML platform conforming to the VoiceXML 2.0 specification:
– input: the XML Format of the Speech Recognition Grammar Specification is used for speech and DTMF input; VoiceXML 1.0 did not require any particular speech grammar format
– output: Speech Synthesis Markup Language (SSML) is used for text-to-speech and audio output; VoiceXML 1.0 did not use SSML, and the speech markup elements of VoiceXML 1.0 are not supported in VoiceXML 2.0

VoiceXML 2.0
Interoperability (Continued):
– protocol: support for the HTTP protocol for fetching documents and resources is required; VoiceXML 1.0 did not require support for HTTP
– audio: audio formats recommended for platform support in VoiceXML 1.0 are now required in VoiceXML 2.0

VoiceXML 2.0
Functional Completeness: new elements, attributes and variables have been added in VoiceXML 2.0 so that developers can be sure the key aspects of the cycle of generating system output, interpreting user input and transitioning from one dialog to another are described.
NOTE: VoiceXML 1.0 contained “gaps”, for example about when prompts are played to the user
Some of the new/enhanced elements, variables and support include:
– application.lastresult$ variable: provides info about the last recognition in the application
– <log> element: generates a debug message
– [...] and [...] elements: enhanced to provide more info
– [...] element: enhanced with an “expr” attribute
– [...]: enhanced with an “accept” attribute
– Enhanced support for greater control over universal grammars
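As a small illustration of two of these VoiceXML 2.0 additions, the sketch below writes the last recognition result to the platform log with `log` and uses `application.lastresult$` to re-ask when confidence is low. The grammar file `city.grxml`, the field name and the 0.7 threshold are assumptions chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: city.grxml, names and the threshold are hypothetical. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="city_form">
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- log: emit a debug message containing the recognition details. -->
        <log>
          Heard <value expr="application.lastresult$.utterance"/>
          with confidence <value expr="application.lastresult$.confidence"/>
        </log>
        <!-- Re-prompt when the recognizer was not confident enough. -->
        <if cond="application.lastresult$.confidence &lt; 0.7">
          <prompt>I am not sure I heard you correctly.</prompt>
          <clear namelist="city"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Where the log message ends up (console, file, call trace) is platform-specific.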

VoiceXML 2.0
Clarity: VoiceXML 2.0 provides a clear description and interpretation of ALL elements (and their attributes), how they interact with one another, and their expected behavior.
NOTE: VoiceXML 1.0 contains omissions and contradictions in this respect
Some clarification changes include:
– Subdialogs: description clarified
– Root and leaf document definitions explicitly defined
– Prompt queueing and input collection: relationship between these two clarified
– Relationship between VoiceXML 2.0 and ECMAScript variables clarified
– Conformance of VoiceXML documents and VoiceXML processors clarified
– Alignment of VoiceXML 2.0 with the Speech Grammar and Speech Synthesis specifications

VoiceXML 2.1
VoiceXML 2.1 was released on June 13, 2005 by the W3C as a “candidate” recommendation
VoiceXML 2.1 proposes 8 enhancements to VoiceXML 2.0, as follows:
– Referencing grammars dynamically
– Referencing scripts dynamically
– Using <mark> to detect barge-in during prompt playback
– Using <data> to fetch XML without requiring a dialog transition
– Concatenating prompts dynamically using <foreach>
– Recording user utterances while attempting recognition
– Adding namelist to <data>
– Adding type to <transfer>
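A sketch of how three of these VoiceXML 2.1 enhancements might fit together: `data` fetches XML without leaving the current dialog, its `namelist` sends a variable with the request, and `foreach` concatenates a prompt from the result. The URL, the variable names and the shape of the returned XML (a list of `day` elements with text content) are assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: the URL and the assumed response format
     (<forecast><day>...</day><day>...</day></forecast>) are hypothetical. -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <var name="city" expr="'Boston'"/>

  <form id="forecast">
    <block>
      <!-- data: fetch XML in place; namelist submits "city" with the request.
           The result is exposed to scripts as a read-only DOM in "report". -->
      <data name="report" src="http://www.example.com/weather" namelist="city"/>

      <!-- Copy the text of every day element into an ECMAScript array. -->
      <var name="days" expr="new Array()"/>
      <script><![CDATA[
        var nodes = report.documentElement.getElementsByTagName('day');
        for (var i = 0; i < nodes.length; i++) {
          days.push(nodes.item(i).firstChild.data);
        }
      ]]></script>

      <!-- foreach: build one prompt out of all array entries. -->
      <prompt>
        Here is the forecast for <value expr="city"/>.
        <foreach item="day" array="days">
          <value expr="day"/>
          <break time="300ms"/>
        </foreach>
      </prompt>
    </block>
  </form>
</vxml>
```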

References
1. Ali, Sanwar, Albohali, Mohamed, Wibowo, Kustim, “VoiceXML for Business Applications: A Survey”, First Annual ABIT Conference, May 3-5, 2001, Pittsburgh, Pennsylvania.
2. Arnold, Stephen A., Mark, Leo and Goldthwaite, John, “Programming by Voice, VocalProgramming”, ASSETS’00, November 13-15, Arlington, Virginia.
3. Lucas, Bruce, “VoiceXML for Web-based Distributed Conversational Applications”, Communications of the ACM, September 2000, Vol. 43, No. 9.
4. Voice eXtensible Markup Language (VoiceXML) version 1.0
5. Voice eXtensible Markup Language (VoiceXML) version 2.0
6. Voice eXtensible Markup Language (VoiceXML) version 2.1
7. McGlashan, Dr. Scott, “VoiceXML 2.0 from the Inside”, retrieved from