Speech in .NET. Sphinx CMU, November 2002. Presenter: casey chesnut, brains-N-brawn.com (Web Services, Mobile / Wireless, Speech).

Presentation transcript:

Speech in .NET
Sphinx CMU, November 2002

2 Presenter
casey chesnut
brains-N-brawn.com
  – Web Services
  – Mobile / Wireless
  – Speech

3 Audience
Java / C++ / VB / C#?
VoiceXml?
SALT / Speech.NET?

4 Outline
MS Technologies
VoiceXml
  – Demo
Speech.NET
  – Demo
Future
Questions (throughout)
~25 slides

5 MS Technologies
Tools
Devices
  – Phone
  – Desktop PC
  – Pocket PC
  – Tablet PC

6 Tools
MS Agents
SAPI / Speech SDK 5.1 (.NET wrappable)
Office
AutoPC ???
ASP.NET (VoiceXml)
(beta) Speech.NET / IE Speech Add-In
… SALT Telephony gateway (early 2003)
… Pocket IE Speech Add-In (mid 2003)

7 Devices
Phone
  – billions of devices that people are comfortable speaking into
Desktop PC
  – large market, but speech input is slower and uncomfortable
Pocket PC
  – small market, opportunities for speech (device limitations)
Tablet PC
  – new market, speech friendly (slate models don't have keyboards)

8 Phone
ASP.NET w/ VoiceXml 2.0
  – Production quality now
  – Multiple vendor support
Speech.NET VoiceOnly
  – Currently no way to deploy and test over a phone
  – Speech.NET Beta 2 has telephony simulation
  – MS target market for Speech.NET

9 Desktop PC
Web
  – Speech.NET MultiModal Beta 2 IE Speech Add-In
  – Embedded control w/SAPI
  – MS Agents
Fat
  – SAPI
  – MS Agents

10 Pocket PC
Web
  – SALT Pocket IE Speech Add-Ins (mid 2003)
Fat
  – 3rd parties only
  – MS Reader does not support TTS

11 Tablet PC - TODAY!
Web
  – … same as desktop PC
  – Beta 2 has added support for Tablet PC
  – Virtual keyboard has speech control
Fat
  – … same as desktop PC
  – Virtual keyboard has speech control
  – MS Reader should be able to support TTS
  – Digital Ink is currently more compelling to MS

12 VoiceXml
XML-based language
  – Declarative: XML tags, grammars
  – Procedural: Javascript
Telephony Gateway is the client
  – Event driven: Bargein, Goodbye
  – Object oriented: Properties
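
As a rough illustration of the declarative, procedural, event-driven, and property aspects listed on this slide, the sketch below uses Python's standard `xml.etree.ElementTree` to assemble a small VoiceXML 2.0 document. It is not from the talk: the form id `order`, the field name `drink`, the prompts, and the `http://example.com/order` URL are all invented for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_drink_form() -> str:
    """Build a small VoiceXML 2.0 document (all names are illustrative)."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Property: allow the caller to barge in on prompts.
    ET.SubElement(vxml, "property", {"name": "bargein", "value": "true"})

    form = ET.SubElement(vxml, "form", {"id": "order"})
    field = ET.SubElement(form, "field", {"name": "drink"})

    # Declarative: a prompt plus an inline grammar listing the valid answers.
    ET.SubElement(field, "prompt").text = "Would you like coffee or tea?"
    grammar = ET.SubElement(
        field, "grammar",
        {"type": "application/srgs+xml", "version": "1.0", "root": "drink"},
    )
    rule = ET.SubElement(grammar, "rule", {"id": "drink"})
    one_of = ET.SubElement(rule, "one-of")
    for word in ("coffee", "tea"):
        ET.SubElement(one_of, "item").text = word

    # Event driven: handle the noinput event by asking again.
    noinput = ET.SubElement(field, "noinput")
    noinput.text = "Sorry, I did not hear you."
    ET.SubElement(noinput, "reprompt")

    # Procedural: an ECMAScript condition runs once the field is filled.
    filled = ET.SubElement(field, "filled")
    branch = ET.SubElement(filled, "if", {"cond": "drink == 'tea'"})
    ET.SubElement(branch, "prompt").text = "Tea it is."
    ET.SubElement(filled, "submit",
                  {"next": "http://example.com/order", "namelist": "drink"})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_drink_form())
```

The generated document would be fetched and interpreted by the telephony gateway, which acts as the client; the Python code only stands in for whatever server-side page produces the markup.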

13 Usage
Input
  – Speech Recognition (Command and Control)
  – DTMF
  – Voice recording and posting to a server
Output
  – Text-To-Speech
  – Prerecorded audio files
Telephony control
  – Hang-up, Transfers, …
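
The input, output, and telephony-control categories above can be sketched with a VoiceXML 2.0 `<menu>`. The example below is an assumption-laden illustration rather than anything shown in the talk: the helper name `build_menu`, the audio file `welcome.wav`, and the option names are hypothetical. Each option is assumed to be a single word, so it can double as a form id, and at most nine options are assumed so each fits one DTMF key.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu(options, greeting_audio="welcome.wav"):
    """Build a VoiceXML 2.0 menu; `options` is a list of one-word phrases."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    menu = ET.SubElement(vxml, "menu", {"id": "main"})

    # Output: prerecorded audio, with fallback text the gateway can speak
    # via text-to-speech if the audio file cannot be played.
    prompt = ET.SubElement(menu, "prompt")
    audio = ET.SubElement(prompt, "audio", {"src": greeting_audio})
    audio.text = "Welcome. Say or key in your choice."

    # Input: each choice accepts a spoken phrase or a DTMF key.
    for key, phrase in enumerate(options, start=1):
        choice = ET.SubElement(
            menu, "choice", {"dtmf": str(key), "next": "#" + phrase}
        )
        choice.text = phrase

    # Telephony control: each target form confirms the choice, then hangs up.
    for phrase in options:
        form = ET.SubElement(vxml, "form", {"id": phrase})
        block = ET.SubElement(form, "block")
        ET.SubElement(block, "prompt").text = "You chose " + phrase + ". Goodbye."
        ET.SubElement(block, "disconnect")

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_menu(["sales", "support"]))
```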

14 Architecture

15 VoiceXml DEMO
  – /vxml (VS.NET)
  – Mobile ADK (menu1.aspx)
  – BeVocal

16 VoiceXml - SALT
VoiceXml : ??? :: SALT : Speech.NET
  – Nuance has some WYSIWYG
SALT is considered lightweight compared to VoiceXml
SALT was submitted to W3C in August 2002
VoiceXml is v2.0 in W3C
  – Mandatory W3C grammar spec
Beta 2 Speech.NET has moved to W3C SRGS
VoiceXml has complementary specs (ccXml)
VoiceXml is moving to MultiModal as well

17 VoiceXml - SALT
VoiceXml = AT&T, Motorola, TellMe, (IBM)
SALT = MS, SpeechWorks, Intel, (BeVocal)
VoiceXml has multiple vendor support with venture capital from before the burst
Most vendors will support both specs
VoiceXml has ~15,000 developers
SALT has potentially millions

18 SALT
I have not read the new spec
I remember mentally mapping it to VoiceXml when reading an early spec
Why
  – Common spec for MultiModal operation
  – Multiple modes of interaction with the same syntax
  – Speech enabling existing sites
Why not VoiceXml
  – MultiModal retrofit is harder than a redo

19 Speech.NET
MS implementation of SALT
(VoiceWebSolutions + DreamWeaver MX)
Some Beta 1 Speech.NET apps still work, because SALT has not changed much, but Speech.NET Beta 2 controls have
VoiceXml is not as portable between vendors as it should be; the Speech.NET controls could help mitigate this for SALT
  – i.e. a layer of abstraction for voice browser wars

20 Architecture

21 Code
Creating static grammars and prompts
Very little server-side code
  – Only dynamic grammars / prompts
  – Server-side code mods to better support speech
Mainly setting properties on Speech controls and tying to client-side javascript
Tie javascript to mouse-click events to avoid redundant code
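
To make the "dynamic grammars" point concrete, here is a minimal sketch of server-side code that turns a list of phrases (for example, rows from a database) into a W3C SRGS XML grammar. It is written in Python purely for illustration and is not the Speech.NET API; the function name `build_grammar`, the rule id `city`, and the phrase list are assumptions.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"


def build_grammar(rule_id, phrases, lang="en-US"):
    """Return an SRGS 1.0 XML grammar whose root rule accepts any one phrase."""
    grammar = ET.Element("grammar", {
        "xmlns": SRGS_NS,
        "version": "1.0",
        "xml:lang": lang,
        "mode": "voice",
        "root": rule_id,
    })
    # A public rule can be referenced from outside this grammar file.
    rule = ET.SubElement(grammar, "rule", {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in phrases:
        # ElementTree escapes characters such as & and < in item text.
        ET.SubElement(one_of, "item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # In a real application the phrases would come from live data.
    print(build_grammar("city", ["Pittsburgh", "Madison", "Dallas"]))
```

A static grammar would be the same markup saved once as a file; only grammars whose phrase lists change at run time need to be generated this way.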

22 Impression
Separate app layers to reduce complexity
  – Voice UI will be less functional, design is key
Learning low level SALT might be easier than high level Speech.NET controls
Application controls change this in Beta 2
Speech.NET has a great debugger (now server side too), grammar, and prompt tools
Speech Control Editor was needed for dev
IE Audio meter was needed for MultiModal
MultiModal has some time to grow

23 Speech.NET DEMO
  – Speech.NET Beta 2 (VS.NET)
  – /noHands (VoiceOnly web app)

24 Industry
Wrote 1st VoiceXml article a year ago
  – Received 1st proposal request last month
  – 1 other proposal request since then
Wrote 1st Speech.NET article 5 months ago
  – Request for an article from MSDN magazine

25 Voice Recognition
PSTN is less secure than Internet!
  – More accessible and easier to automate a hack
Traditionally spoken password OR DTMF pin, also #
Clients always confuse it with speech recognition
Not a part of VoiceXml or SALT specs
  – Telephony gateways have proprietary implementations
Not useful for identifying somebody
Useful for confirming somebody is who they say they are
Prints have to change when device changes
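
The distinction between identifying a caller and confirming a claimed identity can be sketched as a one-to-one comparison against a single enrolled voice print. The code below is a toy model under loud assumptions: real voice prints are not three-number vectors, and cosine similarity with a 0.9 threshold merely stands in for whatever proprietary matching a telephony gateway uses. Prints are keyed by user and device to reflect the point that a print has to change when the device changes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(claimed_id, device, sample, enrolled, threshold=0.9):
    """One-to-one check: does the sample match the print enrolled for the
    claimed identity on this device? Unknown (user, device) pairs fail."""
    voice_print = enrolled.get((claimed_id, device))
    if voice_print is None:
        return False
    return cosine_similarity(sample, voice_print) >= threshold


if __name__ == "__main__":
    # Hypothetical enrolled prints, keyed by (user, device).
    enrolled = {
        ("alice", "landline"): [1.0, 0.0, 0.2],
        ("bob", "landline"): [0.0, 1.0, 0.1],
    }
    sample = [0.9, 0.1, 0.2]  # hypothetical features from the current call
    print(verify("alice", "landline", sample, enrolled))
    print(verify("bob", "landline", sample, enrolled))
    print(verify("alice", "cell", sample, enrolled))
```

The caller first claims an identity (by spoken password, DTMF pin, or similar), and the print only confirms or rejects that claim; nothing here searches all enrolled prints to work out who is speaking.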

26 Future (MS Speech)
SALT Telephony gateways
Speech.NET (VoiceOnly then MultiModal)
Pocket IE Speech Add-In
.NET Fat-client Speech APIs
  – Desktop / Tablet / PPC
MS or 3rd party VS.NET VoiceXml controls
Possibility for Speech.NET controls to render both SALT and VoiceXml

27 Future
Lots of W3C Voice specs …
VoiceXml MultiModal browser
Auto (hands-free, navigation, radio)
3G (bridge voice and wireless web)
  – offload Speech processing
  – VOIP or PSTN
  – Pocket PC Phone Edition / SmartPhones
IBM recently announced chip for Speech on mobile devices

28 Questions