UWSP Web Speech Research Group Joe Frost Mark Stenerson Professor Dave Gibbs Presentation to AITP Monday, October 17, 2005.


1 UWSP Web Speech Research Group Joe Frost Mark Stenerson Professor Dave Gibbs Presentation to AITP Monday, October 17, 2005

2 Before beginning…
- Thanks for the opportunity to share
- Feel free to interrupt with questions or comments
- This will hopefully be of interest to you!
- AITP needed a presentation! We’re happy to be here…

3 Overview
- UWSP Web Speech Research Group
  - Purpose
  - Research Interest
  - Origins
- Previous Work
  - Shaker: PowerPoint conversion
  - SIDE: Speaking Integrated Development Environment
- Current Work
  - Interactions with Web Pages
  - Voice-controlled database interaction
  - Telephony Applications
- Questions

4 UWSP Web Speech Research Group
- Purpose: research the functionality and usefulness of voice input and output on the web by creating meaningful projects and prototypes
  - Voice input: speech recognition (SR, or ASR)
  - Voice output: speech synthesis, or text-to-speech (TTS)

5 UWSP Web Speech Research Group
- Completed Research
  - Developed an IDE to assist in preparing speaking online course materials
    - Development tool to create speaking web pages
    - Useful for instructors, instructional designers, and training materials
- Current Research
  - Interactive browsing capability (forms, etc.)
  - Investigating speaking web pages with broader applicability, including speech recognition
    - Interactive database prototype: “hands-free” database updating
    - Telephony-based systems

6 Origins of Web Speech Research Group
- Online Course: WDMD 170 Spring 2004
  - Audio over PowerPoint, or saved as HTML
    - Large files, inaccessible to dial-up users
    - Clumsy to edit and maintain
- Investigated Speech Recognition
  - XP wished to train my profile
- Opera introduced its “speaking browser”
  - March 2004 Press Release
  - Currently Opera 8.5 (load same page in Opera and speak)
- Investigated Text-To-Speech (TTS)
  - Microsoft Speech Application Language Tags (SALT)
  - VoiceXML

7 Initial Inquiry
- Investigated Speech Recognition
  - XP prompted me to train my profile
  - Demonstrate training speech profile (Quick Launch Speech button)

8 Initial Inquiry
- Investigated Text-To-Speech (TTS)
  - Built into XP
  - Demonstrate TTS Speech
  - XP-supplied voices
    - LH Michael, Michelle
    - MS Mary, Mike, Sam
  - Purchased voices
    - NeoSpeech Kate, Paul

9 Speaking Integrated Development Environment: SIDE
- Uses TTS (Text-To-Speech) technology
- TTS in a web page: essentially a markup language
  - SALT (Speech Application Language Tags)
    - Developed by Microsoft and the SALT Forum
  - VoiceXML
    - Roots in a research project called PhoneWeb at AT&T Bell Laboratories
    - Eventually picked up by the VoiceXML Forum

10 Web Page TTS: SALT
- SALT: “The Speech Application Language Tags (SALT) specification enables multimodal and telephony-enabled access to information, applications, and Web services from PCs, telephones, tablet PCs, and wireless personal digital assistants (PDAs). The Speech Application Language Tags extend existing mark-up languages such as HTML, XHTML, and XML.”
- Source: SALTForum.com

11 Web Page TTS: VoiceXML
- VoiceXML: A language for creating voice-user interfaces, particularly for the telephone. It uses speech recognition and touchtone (DTMF keypad) for input, and pre-recorded audio and text-to-speech synthesis (TTS) for output. It is based on the World Wide Web Consortium's (W3C's) Extensible Markup Language (XML), and leverages the web paradigm for application development and deployment. By having a common language, application developers, platform vendors, and tool providers all can benefit from code portability and reuse.
- Source: VoiceXML Forum

12 Comparison
- SALT
  - Newer technology
  - Support from Microsoft
  - Designed for internet age
  - Controllable
  - Voice purchase availability
  - Large download to enable speech
- VXML
  - Older technology
  - Support from VXML community
  - Designed for telephony
  - Many functions, not interactive
  - Single voice available currently
  - Very small add-in download

13 SALT and HTML
- SALT tags are entered into the head of the document.
- When IE renders the document, it recognizes the embedded SALT by means of a special plug-in.

14 SALT example: Hello World
- How does this work?
  - Tags within the HTML document invoke the Windows voice
  - Example of simple tags: Hello World
- Hello World Example (Note: this example only “speaks” if you have the I.E. Web Speech Add-In installed. You can download the add-in from our web page.)

15 VoiceXML and HTML VXML speech text within and VXML tags before Insert to ev:event = "load" ev:handler = "#objID"

16 VoiceXML example: Hello World
- How does this work?
  - Tags within the HTML document invoke the voice within the Opera 8 web browser
  - Example of simple tags: Hello World
- NOTE: Hello World Example must be manually loaded into Opera. Filename is Opera-HelloWorld.xml

17 Web Speech Research Group: First Iteration
- Fall 2004: Independent Study, 2 students
  - Conversion “wizard”
    - PPT saved as HTML was the input
    - Used the notes section of the PPT file as the text to be spoken by the page
  - Added SALT tags (worked only with SALT)
  - Added “controls” via JavaScript
  - Tabbed “Shaker”, as in SALT Shaker
  - Presented December 2004 (requires I.E. Speech Add-in)

18 Web Speech Research Group: Second Iteration
- Spring 2005: Senior Projects Team (CIS 480), 4 students
  - Create an application: Speech Integrated Development Environment (SIDE)
    - To enable a web author to add speech to pages
    - Useful in online courses
    - Create SALT or VXML pages
- CIS 499 Independent Study, 2 students
  - Interactive database prototype

19 Spring 2005: Speech IDE
- SIDE: Speech Integrated Development Environment
  - Take the SALT Shaker Wizard to an integrated development environment, or Speech IDE
  - Allow the modification of any existing web page into one with speech

20 Web Page Conversion: SIDE Project
- SIDE function: permits a web author to easily add speech to his/her web pages.
- Conversion:
  - HTML with SALT tags and Control Panel (IE)
  - HTML with VoiceXML tags (Opera)
- Diagram labels: HTML, Text, HTML with SALT, SIDE, HTML with VXML, Keyboard, Text file, Voice

21 Where do the TTS tags go (SALT)? JavaScript SALT tags Control Panel JavaScript and SALT tags before Control Panel before
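
The placement described on slides 13 and 21 can be sketched as a small string transformation. This is only an illustration, not SIDE's actual code: the function names, the `pagePrompt` and `speechControls` ids, and the choice of `</head>` and `</body>` as insertion points are all assumptions (the tag names on slide 21 did not survive, so the head placement is inferred from slide 13). The sketch is written in present-day JavaScript.

```javascript
// Escapes text so it can sit safely inside an element body.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Inserts `snippet` immediately before the first match of `tagPattern`.
// Returns the input unchanged when the tag is not present.
function insertBeforeTag(html, tagPattern, snippet) {
  const match = tagPattern.exec(html);
  if (match === null) return html;
  return html.slice(0, match.index) + snippet + html.slice(match.index);
}

// Hypothetical conversion step: speech markup goes before </head>,
// a placeholder control panel goes before </body> (both assumed positions).
function addSpeech(html, spokenText) {
  const speechBlock =
    '<salt:prompt id="pagePrompt">' + escapeHtml(spokenText) + "</salt:prompt>\n";
  const controlPanel = '<div id="speechControls"></div>\n';
  const withSpeech = insertBeforeTag(html, /<\/head>/i, speechBlock);
  return insertBeforeTag(withSpeech, /<\/body>/i, controlPanel);
}
```

A page with neither closing tag passes through untouched, which keeps the step safe to run on fragments.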

22 SIDE Conversion Example
- Can add speaking text to any HTML page
  - Convert the AITP Portal Page to a speaking page using SIDE (close PPT to avoid invoking “Train Profile”)
  - Will create SALT tags for I.E.

23 Speaking Pages: Uses?
- Online course page “lectures”
- Low overhead / low fidelity applications
- Training situations
- Looking for a prototypical application

24 SALT Examples
- Recognition only
- Recognition and response combined
  - Demonstrate internal simple processing
- Interactive form
  - Potentially linked to db and submission
- Web page navigation
JF

25 Main SALT Tags
- <prompt>: the speaking (output) tag; TTS
  - Methods:
    - Start(): begin speaking
    - Stop()
    - Pause()
    - Resume()
- <listen>: the listening (input) tag; recognition
  - Contains one or more grammars
    - Grammars define what words are listened for
  - Binds the recognized value to a form element
JF
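
The grammar-and-bind idea on this slide can be mimicked in plain JavaScript without a speech engine. The sketch below is not the SALT API: `recognize`, `bindToField`, and the plain object standing in for a form are hypothetical names chosen for illustration, and the "grammar" is reduced to a flat list of accepted phrases.

```javascript
// Returns the grammar phrase that matches the utterance (ignoring case and
// surrounding whitespace), or null when nothing in the grammar matches.
function recognize(grammar, utterance) {
  const heard = utterance.trim().toLowerCase();
  const match = grammar.find((phrase) => phrase.toLowerCase() === heard);
  return match === undefined ? null : match;
}

// Copies a recognized value into a named field of a form-like object.
// Returns true when a value was bound and false on a no-match.
function bindToField(form, fieldName, grammar, utterance) {
  const value = recognize(grammar, utterance);
  if (value === null) return false;
  form[fieldName] = value;
  return true;
}
```

Only phrases listed in the grammar can ever reach the field, which is the point of restricting what the recognizer listens for.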

26 Example: recognition
- Recognize a country of the European Union (requires I.E. Speech Add-In and microphone)
- Courtesy of Mark Huckvale, University College London: www.phon.ucl.ac.uk/home/mark/salt/
- Key code snippets
JF

27 Example: recognition and response
- Recognize a country and supply its capital city (Huckvale) (requires I.E. Speech Add-In and microphone)
- Key code snippets:
  function LookupCapital(country) {
    if (country=="Austria") return("Vienna");
    else if (country=="Belgium") return("Brussels");
    else if (country=="Cyprus") return("Nicosia");
    else if (country=="Czech Republic") return("Prague");
    etc.
JF
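
The if/else chain in the snippet can also be expressed as a lookup table. The following is a sketch of that alternative, not Huckvale's code: the `CAPITALS` table and the lowercase `lookupCapital` name are assumptions, it covers only the four countries visible on the slide, and it uses present-day JavaScript syntax.

```javascript
// Hypothetical table-driven variant of the slide's LookupCapital function.
// Only the four countries shown on the slide are included.
const CAPITALS = {
  "Austria": "Vienna",
  "Belgium": "Brussels",
  "Cyprus": "Nicosia",
  "Czech Republic": "Prague"
};

// Returns the capital for a recognized country name, or null when unknown.
function lookupCapital(country) {
  // hasOwnProperty guards against inherited keys such as "toString".
  if (Object.prototype.hasOwnProperty.call(CAPITALS, country)) {
    return CAPITALS[country];
  }
  return null; // the caller decides what to speak for an unknown country
}
```

Adding a country then means adding one table entry rather than another branch.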

28 Example: interactive form
- Interactively order a pizza, using a Pizza order form (Hill, WSRG) (requires I.E. Speech Add-In and microphone)
JF

29 Example: web page navigation
- Navigate pages with links already established
  - Page Control using speech recognition (Gibbs) (requires I.E. Speech Add-In and microphone)
  - Allows speaking interruptions
  - Allows continuous listening
<salt:listen id="testreco" onsilence="testreco.start();" onnoreco="testreco.start();" onreco="CheckCommand();">
JF
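
The slide names a `CheckCommand()` handler but does not show its body. One plausible shape for such a handler is a phrase-to-action table, sketched below under stated assumptions: the command words, the action names, and the `handleUtterances` helper that stands in for the restart-on-silence loop are all hypothetical, and no speech engine is involved.

```javascript
// Hypothetical command table: spoken phrase -> navigation action name.
const COMMANDS = {
  "next": "nextPage",
  "back": "previousPage",
  "home": "homePage",
  "stop": "stopSpeaking"
};

// Normalizes recognized text and returns the matching action, or null.
function checkCommand(recognizedText) {
  const phrase = String(recognizedText).trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(COMMANDS, phrase)
    ? COMMANDS[phrase]
    : null;
}

// Simulates continuous listening over a list of utterances. Unrecognized
// input is skipped and the loop carries on, much as the markup restarts
// the listener on silence or on a no-match.
function handleUtterances(utterances) {
  const actions = [];
  for (const text of utterances) {
    const action = checkCommand(text);
    if (action !== null) actions.push(action);
  }
  return actions;
}
```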

30 What It Can Provide
- Accessibility
  - No keyboard needed
- Hands-free use
  - Automobiles!
- Telephony
  - Menu-driven pages
- Transparent Web Applications
  - Same pages serving both www and listen-only devices?
JF

31 Managing Data With Voice Recognition
- Navigate Records
- Create New
- Update
- Delete
MS

32 Example: interactive database
- Hands-free data management
MS
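
The four operations listed on slide 31 (navigate, create, update, delete) can be modelled as commands applied to a record set with a cursor. This is a minimal in-memory sketch, not the group's prototype: `createRecordSet`, the command words, and the record shape are assumptions, and a real system would receive the command from the recognizer and write to a database.

```javascript
// Hypothetical in-memory record set driven by already-recognized commands.
function createRecordSet(records) {
  const rows = records.map((r) => ({ ...r })); // copy so callers' data is untouched
  let cursor = rows.length > 0 ? 0 : -1;       // -1 means "no current record"
  return {
    current() { return cursor >= 0 ? rows[cursor] : null; },
    count() { return rows.length; },
    // Applies one command; returns false for an unknown command.
    handle(command, payload) {
      switch (command) {
        case "next":
          if (cursor < rows.length - 1) cursor += 1;
          break;
        case "previous":
          if (cursor > 0) cursor -= 1;
          break;
        case "new":
          rows.push({ ...payload });
          cursor = rows.length - 1;
          break;
        case "update":
          if (cursor >= 0) Object.assign(rows[cursor], payload);
          break;
        case "delete":
          if (cursor >= 0) {
            rows.splice(cursor, 1);
            if (cursor >= rows.length) cursor = rows.length - 1;
          }
          break;
        default:
          return false;
      }
      return true;
    }
  };
}
```

Returning false for unknown commands lets the caller re-prompt the speaker instead of silently ignoring the input.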

33 Where Now?
- Pre-Built Grammars
- Telephony
- Transparent Web Applications

34 Contact Information
- UWSP Web Speech Group: http://www.uwsp.edu/cis/dgibbs/speechgroup
- Questions? Comments?

