UWSP Web Speech Research Group Joe Frost Mark Stenerson Professor Dave Gibbs Presentation to AITP Monday, October 17, 2005.


Before beginning…  Thanks for the opportunity to share  Feel free to interrupt with Questions/Comments  This will hopefully be of interest to you!  AITP needed a presentation! We’re happy to be here…

Overview  UWSP Web Speech Research Group Purpose Research Interest Origins  Previous Work Shaker – PowerPoint conversion SIDE – Speaking Integrated Development Environment  Current Work Interactions with Web Pages Voice-controlled database interaction Telephony Applications  Questions

UWSP Web Speech Research Group  Purpose Research the functionality and usefulness of voice input and output on the web by building meaningful projects and prototypes  Voice input: speech recognition (SR), also called automatic speech recognition (ASR)  Voice output: speech synthesis, or text-to-speech (TTS)

UWSP Web Speech Research Group  Completed Research Developed an IDE to assist in the preparation of speaking online course materials  Development tool to create speaking web pages  Useful for instructors, instructional designers, and training materials  Current Research Interactive browsing capability (forms, etc.) Investigating speaking web pages with broader applicability – including speech recognition  Interactive database prototype – “hands-free” database updating  Telephony-based systems

Origins of Web Speech Research Group  Online Course: WDMD 170 Spring 2004 Audio over PowerPoint, or saved as HTML  Large files – inaccessible to dial-up users  Clumsy to edit, maintain  Investigated Speech Recognition XP prompted me to train my profile  Opera introduced its “speaking browser” March 2004 Press Release Currently Opera 8.5 (load same page in Opera and speak)  Investigated Text-To-Speech (TTS)  Microsoft Speech Application Language Tags (SALT)  VoiceXML

Initial Inquiry  Investigated Speech Recognition XP prompted me to train my profile Demonstrate training speech profile  (Quick Launch Speech button)

Initial Inquiry  Investigated Text-To-Speech (TTS) Built into XP Demonstrate TTS Speech XP-supplied voices  LH Michael, Michelle  MS Mary, Mike, Sam  Purchased voices NeoSpeech Kate, Paul

Speaking Integrated Development Environment: SIDE  Uses TTS (Text-To-Speech) technology  TTS in a web page: essentially a markup language SALT (Speech Application Language Tags)  Developed by Microsoft and the SALT Forum VoiceXML  Roots in a research project called PhoneWeb at AT&T Bell Laboratories.  Eventually picked up by the VoiceXML Forum

Web Page TTS: SALT  SALT: “The Speech Application Language Tags (SALT) specification enables multimodal and telephony-enabled access to information, applications, and Web services from PCs, telephones, tablet PCs, and wireless personal digital assistants (PDAs). The Speech Application Language Tags extend existing mark-up languages such as HTML, XHTML, and XML.” -SALTForum.com

Web Page TTS: VoiceXML  VoiceXML A language for creating voice-user interfaces, particularly for the telephone. It uses speech recognition and touchtone (DTMF keypad) for input, and pre-recorded audio and text-to-speech synthesis (TTS) for output. It is based on the World Wide Web Consortium's (W3C's) Extensible Markup Language (XML), and leverages the web paradigm for application development and deployment. By having a common language, application developers, platform vendors, and tool providers all can benefit from code portability and reuse. -VoiceXML Forum

Comparison
  SALT                              VXML
  Newer technology                  Older technology
  Support from Microsoft            Support from VXML community
  Designed for internet age         Designed for telephony
  Controllable                      Many functions, not interactive
  Voice purchase availability       Single voice available currently
  Large download to enable speech   Very small add-in download

SALT and HTML  SALT tags are entered into the head of the document. When rendering the document, IE recognizes the embedded SALT using a special plug-in

SALT example: Hello World  How does this work? Use of tags within the HTML document to invoke the Windows voice Example of simple tags: Hello World Example (Note: this example only “speaks” if you have the I.E. Web Speech Add-In installed. You can download the add-in from our web page)

VoiceXML and HTML VXML speech text within and VXML tags before Insert to ev:event = "load" ev:handler = "#objID"

VoiceXML example: Hello World  How does this work? Use of tags within the HTML document to invoke the voice within the Opera 8 web browser Example of simple tags: Hello World NOTE: Hello World Example must be manually loaded into Opera. Filename is Opera-HelloWorld.xml

Web Speech Research Group: First Iteration  Fall 2004: Independent Study – 2 students Conversion “wizard”  PPT saved as HTML was the input  Used notes section of PPT file as the text to be spoken by the page Added SALT tags (worked only with SALT) Added “controls” via JavaScript Dubbed “Shaker”, as in SALT Shaker Presented December 2004 (requires I.E. Speech Add-in)

Web Speech Research Group: Second Iteration  Spring 2005: Senior Projects Team (CIS 480) – 4 students create an application: Speech Integrated Development Environment (SIDE) To enable a web author to add speech to pages Useful in online courses Create SALT or VXML pages  CIS 499 Independent Study – 2 students Interactive database prototype

Spring 2005: Speech IDE  SIDE – Speech Integrated Development Environment Take SALT Shaker Wizard to an integrated development environment, or Speech IDE Allow the modification of any existing web page into one with speech

Web Page Conversion SIDE Project  SIDE Function: permits a web author to easily add speech to his/her web pages. Conversion:  HTML with SALT tags and Control Panel (IE)  HTML with VoiceXML tags (Opera) [Diagram labels: HTML Text; HTML with SALT; SIDE; HTML with VXML; Keyboard; Text file; Voice]
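As a rough illustration of the conversion step, the sketch below wraps a piece of page text in a SALT prompt element. It is not SIDE's actual code: the helper names `escapeXml` and `toSaltPrompt` and the `id` value are hypothetical, and the `salt:` prefix simply follows the `<salt:listen>` snippet shown later in this presentation.

```javascript
// Hypothetical helper: escape characters that are special in XML markup.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap the text to be spoken in a SALT prompt element.
function toSaltPrompt(id, text) {
  return '<salt:prompt id="' + escapeXml(id) + '">' + escapeXml(text) + "</salt:prompt>";
}

// Example: text taken from a page (or a slide's notes) becomes a prompt tag.
console.log(toSaltPrompt("intro", "Welcome to AITP & friends"));
```

Escaping the text first keeps stray `&` or `<` characters in the source page from breaking the generated markup.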

Where do the TTS tags go (SALT)? JavaScript SALT tags Control Panel JavaScript and SALT tags before Control Panel before

SIDE Conversion Example Can add speaking text to any HTML page  Convert the AITP Portal Page to a speaking page using SIDE. (close PPT to avoid invoking “Train Profile”) Will create SALT tags for I.E.

Speaking Pages USES??  Online course page “lectures”  Low overhead / low fidelity applications  Training Situations Looking for prototypical application

SALT Examples  Recognition only  Recognition and response combined Demonstrate internal simple processing  Interactive form Potentially linked to db and submission  Web page navigation JF

Main SALT Tags  <prompt>: the speaking (output) tag; TTS Methods:  Start() – begin speaking  Stop()  Pause()  Resume()  <listen>: the listening (input) tag; recognition Contains one or more grammars  <grammar>: grammars define what words are listened for  <bind>: binds the recognized value to a form element JF
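The relationship between a grammar and a binding can be imitated in plain JavaScript. This is only a simulation under stated assumptions, not the SALT API: `countryGrammar`, `recognize`, and `bindToField` are hypothetical names, the grammar is reduced to a list of phrases, and the form is a plain object rather than an HTML form.

```javascript
// Hypothetical stand-in for a grammar: the phrases the recognizer listens for.
const countryGrammar = ["Austria", "Belgium", "Cyprus", "Czech Republic"];

// Return the grammar phrase that matches the utterance (ignoring case and
// surrounding whitespace), or null when nothing in the grammar matches.
function recognize(utterance, grammar) {
  const heard = utterance.trim().toLowerCase();
  const match = grammar.find((phrase) => phrase.toLowerCase() === heard);
  return match === undefined ? null : match;
}

// Hypothetical stand-in for a binding: copy a recognized value into a field.
function bindToField(form, fieldName, value) {
  if (value !== null) {
    form[fieldName] = value;
  }
  return form;
}

const form = { country: "" };
bindToField(form, "country", recognize("czech republic", countryGrammar));
console.log(form.country);
```

Anything outside the grammar is simply not recognized, which is why a small, well-chosen grammar matters for recognition accuracy.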

Example: recognition  Recognize a country of the European Union (requires I.E. Speech Add-In and microphone) Courtesy of Mark Huckvale, University College London www.phon.ucl.ac.uk/home/mark/salt/ Key code snippets JF

Example: recognition and response  Recognize a country and supply its capital city (Huckvale) (requires I.E. Speech Add-In and microphone) Key code snippets:
function LookupCapital(country) {
  if (country == "Austria") return("Vienna");
  else if (country == "Belgium") return("Brussels");
  else if (country == "Cyprus") return("Nicosia");
  else if (country == "Czech Republic") return("Prague");
  etc.
JF
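The same lookup can be written as a table rather than a chain of comparisons. The sketch below is an alternative formulation, not Huckvale's code: it contains only the four countries shown on the slide, and returning `null` for an unknown country is an assumption, since the slide elides the rest of the function.

```javascript
// Country-to-capital table; only the entries shown on the slide are included.
const capitals = new Map([
  ["Austria", "Vienna"],
  ["Belgium", "Brussels"],
  ["Cyprus", "Nicosia"],
  ["Czech Republic", "Prague"],
]);

// Return the capital for a recognized country, or null if it is not listed
// (the null fallback is an assumption; the original's behavior is not shown).
function lookupCapital(country) {
  return capitals.has(country) ? capitals.get(country) : null;
}

console.log(lookupCapital("Cyprus"));
```

With a table, adding a country means adding one row, and the same data could also be used to generate the recognition grammar.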

Example: interactive form  Interactively order a pizza, using a Pizza order form (Hill, WSRG) (requires I.E. Speech Add-In and microphone) JF

Example: web page navigation  Navigate pages with links already established Page Control using speech recognition (Gibbs) (requires I.E. Speech Add-In and microphone) Allows speaking interruptions Allows continuous listening <salt:listen id="testreco" onsilence="testreco.start();" onnoreco="testreco.start();" onreco="CheckCommand();"> JF
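The `onreco` handler in the snippet calls `CheckCommand()`, whose body is not shown on the slide. The sketch below is one guess at the kind of logic such a handler might contain: the command phrases, page names, and the function `resolveCommand` are all hypothetical. Reading the recognized text from the listen object and loading the page are left out so that the example runs outside a browser.

```javascript
// Hypothetical command table: spoken phrase -> page to load.
const pageCommands = {
  "home": "index.html",
  "next page": "page2.html",
  "previous page": "page1.html",
};

// Return the target page for a recognized phrase, or null when the phrase
// is not a known command (own properties only, so "toString" is rejected).
function resolveCommand(recognizedText, table) {
  const phrase = recognizedText.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(table, phrase) ? table[phrase] : null;
}

console.log(resolveCommand("Next Page", pageCommands));
console.log(resolveCommand("sing a song", pageCommands));
```

Restarting the listener on silence and on failed recognition, as the slide's markup does, is what keeps the page listening continuously between commands.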

What It Can Provide  Accessibility No keyboard needed  Hands Free Use Automobiles!  Telephony Menu driven pages  Transparent Web Applications Same pages serving both www and listen-only devices? JF

Managing Data With Voice Recognition  Navigate Records  Create New  Update  Delete MS
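The four operations can be pictured as spoken commands applied to a record list with a current position. This is a minimal in-memory sketch, not the group's prototype: the command phrases, `createStore`, and `applyCommand` are hypothetical, and a real version would talk to a database rather than an array.

```javascript
// Hypothetical in-memory record store with a "current record" cursor.
function createStore(records) {
  return { records: records.slice(), index: 0 };
}

// Apply one recognized command; returns false for an unrecognized phrase.
function applyCommand(store, command, value) {
  switch (command.trim().toLowerCase()) {
    case "next": // navigate forward, stopping at the last record
      if (store.index < store.records.length - 1) store.index += 1;
      break;
    case "previous": // navigate backward, stopping at the first record
      if (store.index > 0) store.index -= 1;
      break;
    case "create new": // append a record and move to it
      store.records.push(value);
      store.index = store.records.length - 1;
      break;
    case "update": // overwrite the current record
      if (store.records.length > 0) store.records[store.index] = value;
      break;
    case "delete": // remove the current record and keep the cursor in range
      if (store.records.length > 0) {
        store.records.splice(store.index, 1);
        store.index = Math.max(0, Math.min(store.index, store.records.length - 1));
      }
      break;
    default:
      return false;
  }
  return true;
}

const store = createStore(["Ann", "Bob"]);
applyCommand(store, "next");
applyCommand(store, "update", "Robert");
applyCommand(store, "create new", "Cara");
applyCommand(store, "delete");
console.log(store.records, store.index);
```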

Example: interactive database  Hands-free data management MS

Where Now?  Pre-Built Grammars  Telephony  Transparent Web Applications

Contact Information  UWSP Web Speech Group http://www.uwsp.edu/cis/dgibbs/speechgroup Questions? Comments?
