VoiceXML: Application and Session variables, N-best and Multiple Interpretations.

Acknowledgements
Prof. McTear, Natural Language Processing, University of Ulster.
BeVocal Café Documentation.

Overview
- More on variables
- Variable scope
- Session variables
- Application variables
- Using N-best results
- Multiple interpretations

More on variables: scope
The scope of a variable defines when and where the variable exists and may be accessed. When the VoiceXML interpreter exits the scope where the variable was defined, the variable is destroyed. While the VoiceXML interpreter is in the scope of a variable, that variable can be accessed by both VoiceXML and ECMAScript.

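Variable scopes (where defined; when initialised; where accessible; when destroyed)
- Session: defined by the VoiceXML interpreter; initialised at the beginning of the session; accessible during the session; destroyed at the end of the session.
- Application: defined in the root document, as a child of <vxml>; initialised when the interpreter loads the root document; accessible throughout the application; destroyed when the interpreter leaves the root document.
- Document: defined in a document, as a child of <vxml>; initialised when the interpreter loads the document; accessible within the document; destroyed when the interpreter leaves the document.
- Dialog: defined within a form, as a child of <form> or of a form item element; initialised when the interpreter loads the form; accessible within the form; destroyed when the interpreter leaves the form.
- Anonymous: defined in a <block>, <filled> or <catch> element; initialised when the interpreter interprets the element; accessible within the element; destroyed when the interpreter leaves the element.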

Example of variable scope
…
<var name="student_id" expr="123456"/>   (document scope)
<var name="student_id" expr="654321"/>   (dialog scope)
<var name="new_student_id" expr="document.student_id"/>   (reference to the variable with document scope)
…
To reference a variable in an enclosing scope, use that scope's name as a prefix.
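The lookup rule in the scope example can be modelled in a few lines of ECMAScript. This is only a sketch of the idea, not an interpreter API: the `resolveVariable` and `hasOwn` helpers and the `exampleScopes` object are hypothetical names introduced here, and scopes are represented as plain objects.

```javascript
// Illustrative model of VoiceXML variable lookup (not a real interpreter API).
// Scope names are listed innermost first.
var SCOPE_ORDER = ["anonymous", "dialog", "document", "application", "session"];

function hasOwn(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function resolveVariable(scopes, reference) {
  // A qualified reference such as "document.student_id" names its scope.
  var dot = reference.indexOf(".");
  if (dot !== -1) {
    var prefix = reference.slice(0, dot);
    if (SCOPE_ORDER.indexOf(prefix) !== -1) {
      var scope = scopes[prefix] || {};
      var name = reference.slice(dot + 1);
      return hasOwn(scope, name) ? scope[name] : undefined;
    }
  }
  // An unqualified reference is searched from the innermost scope outwards.
  for (var i = 0; i < SCOPE_ORDER.length; i++) {
    var candidate = scopes[SCOPE_ORDER[i]] || {};
    if (hasOwn(candidate, reference)) {
      return candidate[reference];
    }
  }
  return undefined;
}

// Values mirror the example: the dialog-scope variable shadows the document one.
var exampleScopes = {
  document: { student_id: 123456 },
  dialog: { student_id: 654321 }
};
var shadowedValue = resolveVariable(exampleScopes, "student_id");
var qualifiedValue = resolveVariable(exampleScopes, "document.student_id");
```

Here `shadowedValue` comes from the dialog scope, while `qualifiedValue` reaches past it to the document scope because of the prefix.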

Session variables
Global, read-only, predefined variables that exist for the lifetime of the session:
- session.connection.local.uri - URI for the local device. (Also called DNIS, Dialed Number Identification Service; provides the number of the called party.)
- session.connection.remote.uri - URI for the remote caller device. (Also called ANI, Automatic Number Identification; provides the number of the caller.)
- session.connection.protocol.name - name of the connection protocol.
- session.connection.protocol.version - connection protocol version.
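As a rough illustration of how these variables might be read from ECMAScript, the sketch below uses a mock object in place of the interpreter-provided `session`. The `mockSession` values (phone numbers, protocol name and version) and the `describeConnection` helper are invented for the example; a real interpreter supplies the session object itself and keeps it read-only.

```javascript
// Mock stand-in for the interpreter-provided session object (values are made up).
var mockSession = {
  connection: {
    local: { uri: "tel:+15550100" },   // called party (DNIS)
    remote: { uri: "tel:+15550199" },  // caller (ANI)
    protocol: { name: "sip", version: "2.0" }
  }
};

// Build a one-line summary from the four session variables listed above.
function describeConnection(session) {
  var c = session.connection;
  return "Call from " + c.remote.uri + " (ANI) to " + c.local.uri +
    " (DNIS) over " + c.protocol.name + " " + c.protocol.version;
}

var connectionSummary = describeConnection(mockSession);
```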

Application variables
Provide information about the text most recently recognized by the speech recognition engine. The engine stores the token (word or phrase) from the grammar that best matches the user's utterance in the application.lastresult$ variable.

Attributes of application.lastresult$
The variable application.lastresult$ has four attributes:
- utterance - The word from the grammar that the speech recognition engine determines is the best match for the user's utterance.
- confidence - The speech recognition engine's estimate of the accuracy of the utterance word. The value ranges from 0.0 (least confidence) to 1.0 (most confidence).
- interpretation - The ECMAScript value derived by the semantic interpretation script, which replaces the utterance word. If the field has no semantic interpretation script, the utterance word is copied to the interpretation.
- inputmode - Whether the user entered the information using DTMF or voice.

Example
- If the confidence for the user's last utterance is less than 0.6, save the value and branch to a control that validates the utterance.
- Verify the user's last utterance based on what the system recognised.
- Take appropriate action.
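The decision in the steps above can be sketched in ECMAScript as follows. This is an assumption-laden illustration: `decideNextStep` is a hypothetical helper, and the objects passed to it only imitate the four attributes of application.lastresult$; the 0.6 threshold is the one quoted in the example.

```javascript
// Threshold from the example: below this, the utterance is verified with the user.
var CONFIDENCE_THRESHOLD = 0.6;

// lastResult imitates application.lastresult$:
// { utterance, confidence, interpretation, inputmode }
function decideNextStep(lastResult) {
  if (lastResult.confidence < CONFIDENCE_THRESHOLD) {
    // Save what was heard and branch to a verification dialog.
    return { action: "verify", saved: lastResult.utterance };
  }
  // Confident enough: continue with the interpreted value.
  return { action: "accept", saved: lastResult.interpretation };
}

var uncertain = decideNextStep({
  utterance: "austin", confidence: 0.45, interpretation: "austin", inputmode: "voice"
});
var confident = decideNextStep({
  utterance: "boston", confidence: 0.9, interpretation: "boston", inputmode: "voice"
});
```

In a VoiceXML document the same comparison would normally sit in a conditional inside the field's filled handler, with the "verify" branch leading to a confirmation field.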

Using list of n-best tokens
- Set the recogniser to return the best 2 matches, e.g. by setting the maxnbest property to 2.
- The result is returned in an array, application.lastresult$, in which the matched token with the highest confidence is in position 0 and the token with the second highest confidence is in position 1.
- lastresult$ may contain up to maxnbest elements; in most cases, fewer results are returned.
- This can be used where words are easily confused, e.g. Boston / Austin, and the user is asked to verify which of the returned words is correct.
- The application.lastresult$ array contains at least one element, namely application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array.
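A minimal sketch of how an application might turn the n-best array into a verification prompt is shown below. The `sampleNbest` array only imitates application.lastresult$ with made-up confidence values, and `buildVerificationPrompt` and `pickByKey` are hypothetical helpers, not part of VoiceXML.

```javascript
// Imitation of application.lastresult$ with maxnbest set to 2 (values invented).
// Position 0 holds the match with the highest confidence.
var sampleNbest = [
  { utterance: "austin", confidence: 0.52 },
  { utterance: "boston", confidence: 0.47 }
];

// Ask the user to choose among the returned candidates by key press.
function buildVerificationPrompt(nbest) {
  var parts = ["I am not sure what you said."];
  for (var i = 0; i < nbest.length; i++) {
    parts.push("If you said " + nbest[i].utterance + ", press " + (i + 1) + ".");
  }
  return parts.join(" ");
}

// Map the pressed key ("1", "2", ...) back to the candidate, or null if invalid.
function pickByKey(nbest, key) {
  var index = parseInt(key, 10) - 1;
  return (index >= 0 && index < nbest.length) ? nbest[index] : null;
}

var verificationPrompt = buildVerificationPrompt(sampleNbest);
var chosen = pickByKey(sampleNbest, "2");
```

Looping over the array's length rather than a fixed count matters because, as noted above, fewer than maxnbest results are often returned.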

Example using n-best recognition scores
Cities [... (austin ?texas) { } (austin ?california) { } (boston ?massachusetts) { }... ]

N-best example continued
I am not sure what you said. If you said …, press 1. If you said …, press 2.
You said …
You said …

Alternatively:

Multiple Interpretations
In some applications, a single recognized utterance may have multiple interpretations, indicating that the utterance is ambiguous.
Cities [... (portland ?maine) { } (portland ?oregon) { }... ]
The multiple interpretations feature lets an application access the different interpretations for a given recognized utterance. If multiple grammar rules match the recognized utterance, all resulting interpretations are returned. You enable multiple interpretations by setting the bevocal.maxinterpretations property to a value other than one.

Combining the features
Cities [... (austin ?texas) { } (austin ?california) { } (boston ?massachusetts) { }... ]
Enable both the N-best and multiple interpretations features.
Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please say 1 if you mean Boston Massachusetts; 2 if you mean Austin Texas; 3 if you mean Austin California. If you want to start over, answer 0.
User: Two.
Application: Scheduling a visit with the Austin Texas office.
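The dialogue above can be reproduced with a short ECMAScript sketch. The data shape is an assumption made for illustration: each n-best entry carries a hypothetical `interpretations` array, which is not the actual BeVocal result format, and `flattenChoices`, `buildDisambiguationPrompt` and `chooseOption` are invented helper names.

```javascript
// Hypothetical combined result: n-best entries, each with its interpretations.
var combinedResults = [
  { utterance: "boston", interpretations: ["Boston Massachusetts"] },
  { utterance: "austin", interpretations: ["Austin Texas", "Austin California"] }
];

// Flatten every interpretation of every n-best entry into one numbered list.
function flattenChoices(nbest) {
  var choices = [];
  for (var i = 0; i < nbest.length; i++) {
    for (var j = 0; j < nbest[i].interpretations.length; j++) {
      choices.push(nbest[i].interpretations[j]);
    }
  }
  return choices;
}

function buildDisambiguationPrompt(choices) {
  var parts = [];
  for (var k = 0; k < choices.length; k++) {
    parts.push((k + 1) + " if you mean " + choices[k]);
  }
  return "Please say " + parts.join("; ") +
    ". If you want to start over, answer 0.";
}

// Answer 0 (or anything out of range) means "start over", signalled by null.
function chooseOption(choices, answer) {
  return (answer >= 1 && answer <= choices.length) ? choices[answer - 1] : null;
}

var officeChoices = flattenChoices(combinedResults);
var officePrompt = buildDisambiguationPrompt(officeChoices);
var selectedOffice = chooseOption(officeChoices, 2);
```

With the sample data, answering "Two" selects the Austin Texas office, matching the last turn of the dialogue.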