Tuning Jenny Burr August 2006. Discussion Topics What is tuning? What is the process of tuning?

Slides:



Advertisements
Similar presentations
{Customer} Divisions Plan {Date} – {Version} Salesforce.com.
Advertisements

Copyright © SoftTree Technologies, Inc. DB Tuning Expert.
Atomatic summarization of voic messages using lexical and prosodic features Koumpis and Renals Presented by Daniel Vassilev.
Convergys Confidential and Proprietary Copyright © 2007 Convergys Corporation To Confirm, or Not to Confirm … That is the Question Kristie Goss Convergys.
© 2002 – 2007 Versay Solutions, LLC. All rights reserved. Building Fault Tolerant Voice User Interfaces SpeechTEK 2007 Tuesday, August 21 Track B “Getting.
CSC271 Database Systems Lecture # 18. Summary: Previous Lecture  Transactions  Authorization  Authorization identifier, ownership, privileges  GRANT/REVOKE.
Data Mining Methodology 1. Why have a Methodology  Don’t want to learn things that aren’t true May not represent any underlying reality ○ Spurious correlation.
1 Profit from usage data analytics: Recent trends in gathering and analyzing IVR usage data Vasudeva Akula, Convergys Corporation 08/08/2006.
Describing Process Specifications and Structured Decisions Systems Analysis and Design, 7e Kendall & Kendall 9 © 2008 Pearson Prentice Hall.
5.00 Understand Promotion Research  Distinguish between basic and applied research (i.e., generation of knowledge vs. solving a specific.
Improvement of Audio Capture in Handheld Devices through Digital Filtering Problem Microphones in handheld devices are of low quality to reduce cost. This.
Managing Complexity: 3rd Generation Speech Applications Roberto Pieraccini August 7, 2006.
ITCS 6010 Spoken Language Systems: Architecture. Elements of a Spoken Language System Endpointing Feature extraction Recognition Natural language understanding.
Screen guidelines For data entry. Screen Layout for Data Entry Identify screen (name and purpose). Keep number of screens to a minimum. Ensure that all.
Adjusting EPLC to your Project Colleen Robinson & Teresa Kinley Friday, February 6, 2009.
Testing - an Overview September 10, What is it, Why do it? Testing is a set of activities aimed at validating that an attribute or capability.
Database System Development Lifecycle Transparencies
Brief Overview of Data Processing of Afghanistan Household Listing, Pilot Census Results, Population and Housing Census and NRVA Survey Brief Overview.
© 2005 by Prentice Hall Appendix 2 Automated Tools for Systems Development Modern Systems Analysis and Design Fourth Edition Jeffrey A. Hoffer Joey F.
Listening Methods In-Depth IVR and End to End Call Analytics What is the Real Performance of My Voice Self-Service Solution? Can I compare my IVR Design.
Virtual Memory Tuning   You can improve a server’s performance by optimizing the way the paging file is used   You may want to size the paging file.
1 A Practical Rollout & Tuning Strategy Phil Shinn 08/06.
1 © 2007 Avaya Inc. All rights reserved. Avaya – Proprietary & Confidential. Under NDA Tips for Writing Gooder Grammars Judi Halperin Speech Engineer.
Introduction to Systems Analysis and Design Trisha Cummings.
VIRTUAL BUSINESS RETAILING
Chapter 9 Database Planning, Design, and Administration Sungchul Hong.
Database Planning, Design, and Administration Transparencies
Database System Development Lifecycle © Pearson Education Limited 1995, 2005.
Overview of the Database Development Process
Beyond Usability: Measuring Speech Application Success Silke Witt-Ehsani, PhD VP, VUI Design Center TuVox.
Appendix 2 Automated Tools for Systems Development © 2006 ITT Educational Services Inc. SE350 System Analysis for Software Engineers: Unit 2 Slide 1.
SpeechCycle Confidential Confidential 1 Optimizing Natural Language Interfaces: No Data Like More Data SpeechTEK New York, 2007 Jonathan Bloom & Roberto.
Testing. Definition From the dictionary- the means by which the presence, quality, or genuineness of anything is determined; a means of trial. For software.
1 Validation & Verification Chapter VALIDATION & VERIFICATION Very Difficult Very Important Conceptually distinct, but performed simultaneously.
1 High Resolution Statistical Natural Language Understanding: Tools, Processes, and Issues. Roberto Pieraccini SpeechCycle
Super-charge Your Speech Application With Cross- Functional Reporting Techniques Bjorn Austraat Director, Solution Delivery TuVox, Inc. August 9 th, 2006.
Describing Process Specifications and Structured Decisions Systems Analysis and Design, 7e Kendall & Kendall 9 © 2008 Pearson Prentice Hall.
Chapter 7: A Summary of Tools Focus: This chapter outlines all the customer-driven project management tools and techniques and provides recommendations.
Database System Development Lifecycle 1.  Main components of the Infn System  What is Database System Development Life Cycle (DSDLC)  Phases of the.
ISV Innovation Presented by ISV Innovation Presented by Business Intelligence Fundamentals: Data Cleansing Ola Ekdahl IT Mentors 9/12/08.
PG 1. PG 2 Improving Speech Applications with Usability Surveys How does Nortel measure the ‘Usability Pulse’ of Self Service? Judith Sherwood Sales Engineer.
Computers Are Your Future Tenth Edition Chapter 13: Systems Analysis & Design Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall1.
Tuning Your Application: The Job’s Not Done at Deployment Monday, February 3, 2003 Developing Applications: It’s Not Just the Dialog.
Test and Review chapter State the differences between archive and back-up data. Answer: Archive data is a copy of data which is no longer in regular.
1 Boostrapping language models for dialogue systems Karl Weilhammer, Matthew N Stuttle, Steve Young Presenter: Hsuan-Sheng Chiu.
3 Copyright © 2009, Oracle. All rights reserved. Creating an Oracle Database Using DBCA.
© 2006 Cisco Systems, Inc. All rights reserved.1 Connection 7.0 Serviceability Reports Todd Blaisdell.
Sylnovie Merchant, Ph.D. MIS 161 Spring 2005 MIS 161 Systems Development Life Cycle II Lecture 5: Testing User Documentation.
Voice Activity Detection based on OptimallyWeighted Combination of Multiple Features Yusuke Kida and Tatsuya Kawahara School of Informatics, Kyoto University,
Metadata By N.Gopinath AP/CSE Metadata and it’s role in the lifecycle. The collection, maintenance, and deployment of metadata Metadata and tool integration.
© 2013 by Larson Technical Services
The Implementation of BPR Pertemuan 9 Matakuliah: M0734-Business Process Reenginering Tahun: 2010.
Unit 17: SDLC. Systems Development Life Cycle Five Major Phases Plus Documentation throughout Plus Evaluation…
Additional Features in Microsoft Word Session Version 1.0 © 2011 Aptech Limited.
Speech Processing 1 Introduction Waldemar Skoberla phone: fax: WWW:
HTBN Batches These slides are intended as a starting point for further discussion of how eTime might be extended to allow easier processing of HTBN data.
Chapter 9 Database Planning, Design, and Administration Transparencies © Pearson Education Limited 1995, 2005.
AREA MEETING 2016 What is the CLIMATE of your CLUB?
A Scientific Approach to Improving VUI Design Peter U. Leppik, CEO Rick Rappe, VP of Business Development Vocal Laboratories Inc.
Graphical Data Engineering
Business process management (BPM)
STANAG for Non-Specialists
Chapter 6. Data Collection in a Wizard-of-Oz Experiment in Reinforcement Learning for Adaptive Dialogue Systems by: Rieser & Lemon. Course: Autonomous.
Business process management (BPM)
Business System Development
Developing an Android application for
Specifying, Compiling, and Testing Grammars
Office of Education Improvement and Innovation
Introduction to Systems Analysis and Design
Presentation transcript:

Tuning Jenny Burr August 2006

Discussion Topics What is tuning? What is the process of tuning?

What is tuning?

What is Tuning? It is optimizing application performance –The caller experience –The accuracy of the speech recognizer (ASR engine) Optimize based on quantifiable data Using statistical significance of real world data

Why is it Important? Fine tune the application based on: –Caller/Customer experience –Goal/transaction completion rates –Recognition accuracy rates Results: –Improved call automation rates –Increased user satisfaction –Larger return on investment

What are the Goals of Tuning? Its a balancing act –Business and caller goals need to be defined at the beginning of the applications life cycle Goal/Transaction completion rates Caller satisfaction Call Transfer rates –Grammar complexity Too many items in the grammars can cause false accepts; too few leads to out-of-grammars –Threshold Parameters

Optimizing Recognition Accuracy Tuning the Grammars –What can be said: words and phrases –What is never said: over-generation Tuning the Recognition Lexicon –How it can be said: the pronunciations Dialect Cross word combination: Phrases often get run together resulting in phonological differences Optimize current default dictionary pronunciations

Optimizing the VUI Review the prompts –Wording, clarity, volume Review the error strategy –Are callers able to recover? Caller reactions Unexpected language, confusion

Design Phase Requirements Gathering Requirements Analysis Functional Requirements Document Creation & Approval Sample Call – Conversations Voice Talent Selection Full Design Specification VUI Business Logic Data Interaction Customer Specification Sign-off Development Testing Usability Testing Production Tuning Sample Implementation Methodology

The Tuning Process

1.Collect tuning data –logging & utterances 2.Transcribe caller utterances 3.Generate statistical reports 4.Analyze performance statistics –Recognition, Transaction/Goal Completion 5.Listen to call recordings (if available) 6.Analyze issues and create recommendations 7.Run experiments on recommendations 8.Write report outlining issues and recommendations 9.Review report 10. Implement recommendations

Collect Tuning Data Collect from a live production environment –Do not collect test calls during User Acceptance Testing Need a statistical sample of data –Generally 2,000-5,000 calls –Or a minimum of 100 utterances per task –No changes should be made with fewer than 30 samples –Tuning for SLM grammars requires thousands of utterances Changes to the application should not occur during this collection period

Discussion How can production calls be directed to the application for tuning purposes?

Transcription Listen to callers utterances Determine if the utterance matched the logged recognition result –If not, notate what occurred and/or what was spoken Notate noise events –Background noise, side conversations, coughing, etc Spelling in transcriptions should match grammar

Generate Accuracy Statistics

Accuracy Statistics

Reviewing the Statistics In-grammar statistics –Review for problem areas –What is the percentage of false acceptance? –Will adding/modifying pronunciations help? –Will modifying the grammar help? –What is the percentage of false rejections? –Is the caller able to recover with the reprompt? –Should the confidence thresholds (or other parameters) be changed? How will that affect the false acceptance and false rejection rates?

Exercise Review the false accept in-grammar rates and determine ways of improving the recognition. –See handout

Accuracy Statistics

Reviewing the Statistics Out-of-grammar statistics –Review problem areas –Ensure that the issues are statistically significant Should you add any of the OOG/OOC items to the grammar? Should the prompt be rewritten to provide clearer instructions? Is the system correctly rejecting the OOG/OOC phrases?

Exercise Review OOG listing and listen to prompt: determine if the grammar should be modified or if the prompt should be changed –See handout

Grammar Over-generation Occurs when a grammar contains unlikely and/or unused phrases Results in unnecessary complexity Could result in nonsensical false acceptances Depending on grammar complexity, could result in recognition latency

Exercise Review a grammar for over-generation Determine items to remove –See handouts

Transaction/Goal Completion Review the goal/transaction completion statistics –Did the calls end with callers completing their goals? If not, where were the problems?

Compile Recommendations Determine the problem areas and identify ways to improve the application –Prompt changes –Error strategy –Pronunciations –Grammar additions/deletions –Weighting grammar items –Parameter changes –Frequent Timeouts –Sensitivity –Caller Confusion

Grammar Guidelines NEVER make up words in the grammar to approximate pronunciations –Example: tee vee as a grammar option for t v –Use the dictionary to create any needed pronunciations Apostrophes are a good thing - USE THEM! –thats not thats –im not im –im all done not im all done Use pre-and-post fillers only where appropriate Limit synonyms for initial production roll-out

Experiments Whats the purpose? Test potential grammar and parameter changes against the collected data set Regenerate the accuracy statistics to validate the recommendations Enumerate the difference the recommended change had on application performance

Tuning Report The tuning report outlines –Any adjustments: Grammars Pronunciations Parameters Prompts Application logic

Next Steps Implement recommended changes Gather data for a tuning effort to validate changes (if needed)

Summary –Data gathering is very important Need enough data per recognition state Data should vary for different caller populations and caller types –The transcription process is critical –Ensure that any issues reported are statistically significant Changes to the application should be made based on multiple occurrences of the problem Do not make changes for one-offs