1
Creating User Interfaces
Discussion: current speech recognition products.
VoiceXML
Homework: register as a developer at studio.tellme.com. Do the tutorial. Come to class with phones (prepare to share).
2
Discussion
Reports on current speech products
3
Telephone
Caller to system:
  speech recognition, using grammars (limited vocabulary, general audience, no training)
  optional use of touch tones (numbers)
System to caller:
  recorded audio (wav files) plus TTS (text to speech)
Limited bandwidth in comparison to other applications, but a very familiar, ubiquitous medium
Examples: 800 long distance, some airline information systems, others?
4
studio.tellme.com
Company that provides the ‘engine’ for applications
Provides a development environment
We are doing the tellme version of VoiceXML, but it appears to be standard.
Register as a developer: provide your own id; you are assigned a PIN
Scratchpad for quick testing
  Put VoiceXML in the ScratchPad (no audio files)
  VXML (8965)
  SAY id and then PIN.
Application URL for projects with multiple files
To look at someone else's project, you change your Application URL. This is called pointing your account to a new source.
5
Previously, tellme offered something called My Extensions that was another phone number.
6
VoiceXML
XML document (VXML header)
VoiceXML has tags for flow of control and calculations.
  Also can use <script> for JavaScript
Grammars come in different varieties. We will use the tellme way.
  Grammars are enclosed in CDATA sections to prevent XML interpretation.
Many grammars are constructed for you.
  <field name="answer" type="boolean"> … will listen for yes or no.
  <field name="price" type="currency"> … will listen for currency.
  <menu> <choice> <choice> for a list
We will only touch on the topic. The tellme site has several tutorials and examples.
7
Example
This is the current URL that my MyStudio points to. You can set yours to point to it.
NOTE: this is not particularly appropriate but …
8
Very brief overview
<vxml> document contains <form> and/or <menu> elements.
<form> can contain <block>, <field>
  <block> can contain <audio> or do its own audio
  <field> can contain <prompt>, <grammar>, <noinput>, etc.
    NOTE: certain types of <field> elements use built-in grammars, for example, boolean
    Can have a child node <filled> that indicates what to do if there is a match
<menu> is a compressed way to use a simple grammar
9
Very brief, cont.
Logic can be done using
  a <script> element that contains a variant of JavaScript
  and/or vxml logic elements, including <var>, <if>, <else>, <elseif>, and others
These may be part of a <filled> element
10
Audio
Tellme studio provides a way to record [your] speech as a wav file to upload to a website.
  It sends the file to your address
You upload your VoiceXML file plus any wav files (and anything else)
<audio src="mygreeting.wav">Welcome to my site </audio>
  If tellme can't find the mygreeting.wav file, it uses its Text to Speech on the string "Welcome to my site".
  Note: you also can use a full URL:
You put the URL for the VoiceXML file into your Tellme studio account; this is called pointing to the URL.
TEST
11
VoiceXML basics, continued
<form> element can contain
  <block> elements, which can contain <audio>, <goto>, other
  <field>, which can contain
    <prompt>
    <grammar> (if not one of the built-in grammars)
    <filled>
<var> tags can be at different levels (for example, document, block, or higher levels)
<if> <elseif> <else> tags
<script> elements for JavaScript (which can also appear in expressions)
The <filled> element is a possible child of the <field> element.
12
VoiceXML basics: typical case
a form element
  <field>
    <prompt>, made up of <audio>, with reference to a recorded wav file and backup text
    <grammar>, if NOT using the built-in grammars designated by the type attribute of the field. This is a CDATA section.
    <filled> with (follow-on) code using the field
    <catch> for nomatch, noinput cases
13
Caution
A form contains various elements, including a field.
If a field has a grammar and the grammar is satisfied, control goes to the <filled> tag
14
obligatory…
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <audio src="prompt1.wav">Hello, world </audio>
    </block>
  </form>
</vxml>
prompt1.wav: recorded using tellme studio
"Hello, world": backup using TTS, just in case the src file is missing
Try it!
15
Preparation: objects
JavaScript (and other languages) use classes and objects
Objects (aka object instances) are declared (created, instantiated) as members of a class
Objects have
  properties ('the data')
  methods (functions that you can use 'on' the objects)
Classes can also have static methods (called on the class itself), for example Math.random
Preparation for use of the date/time object
16
Example: tm_date
var dt = new tm_date;
  creates a date/time object.
Use methods to extract/manipulate information held 'in' dt.
  var day = dt.get_day();
Use static methods supplied to do common tasks:
  var dn = tm_date.to_day_of_week_name(day);
or directly:
  var dn = tm_date.to_day_of_week_name(dt.get_day());
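The difference between a method called on an object and a static method called on the class can be tried outside the phone environment. What follows is a minimal sketch in plain JavaScript that uses the standard Date object and a hypothetical DayNames helper as a stand-in for tm_date, which is only available on the Tellme platform. The names DayNames and dayNameFor are assumptions made for illustration.

```javascript
// Hypothetical stand-in for a class with a static method; this is NOT Tellme's tm_date.
var DayNames = {
  names: ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'],
  // Static-style method: called on DayNames itself, the way Math.random is called on Math.
  toDayOfWeekName: function (dayIndex) {
    return DayNames.names[dayIndex];
  }
};

// Takes an object instance and uses one of its methods to read data held 'in' it.
function dayNameFor(dateObject) {
  // getUTCDay() is a standard Date method; it returns 0 for Sunday through 6 for Saturday.
  var day = dateObject.getUTCDay();
  return DayNames.toDayOfWeekName(day);
}

// Instance: created with new, holding its own date/time data.
var dt = new Date(Date.UTC(2024, 0, 1)); // 1 January 2024
console.log(dayNameFor(dt)); // Monday
```

The pattern is the same as on the slide: create the object with new, call a method on the object to get a number, then pass that number to a static method to get a name.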
17
outline
Header stuff
script with external reference
script (code) encased in CDATA notation
Form/Block, with text to speech using value produced by script
Closing stuff
18
<?xml version="1.0"?>
<vxml version="2.0">
<script src="
Will make use of date functions
19
<script>
<![CDATA[
var dt = new tm_date();
var monis = tm_date.to_month_name(dt.get_month());
var dateis = dt.get_date();
var dayis = tm_date.to_day_of_week_name(dt.get_day());
var yearis = tm_date.to_year_name(dt.get_full_year());
var houris = dt.get_hours() - 4; // brute force correction from GMT
var minutesis = dt.get_minutes();
var whole = 'The date is ' + monis + ' ' + dateis + '. It is ' + dayis + '. The time is ' + houris + ' ' + minutesis;
]]>
</script>
20
<form>
  <block>Hello. <value expr="whole"/> Good bye. </block>
</form>
</vxml>
Can use block for audio
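The script in this example corrects from GMT by subtracting 4 from the hour. That gives a negative hour during the first four hours of the GMT day, and single-digit minutes are spoken without a leading zero. One possible refinement is sketched here in plain JavaScript. The helper names toLocalHour and twoDigits are hypothetical, and the fixed offset of -4 is simply the value used on the slide, not the result of any time zone lookup.

```javascript
// Shift a GMT hour (0-23) by a fixed offset and wrap the result back into 0-23.
function toLocalHour(gmtHour, offsetHours) {
  // The double modulo keeps the result non-negative when the shift crosses midnight.
  return ((gmtHour + offsetHours) % 24 + 24) % 24;
}

// Format a number below 100 as a two-character string, for example 5 -> '05'.
function twoDigits(n) {
  return (n < 10 ? '0' : '') + n;
}

var houris = toLocalHour(2, -4);   // 2 a.m. GMT becomes 22
var minutesis = twoDigits(5);      // '05'
console.log('The time is ' + houris + ' ' + minutesis);
```

This sketch only fixes the hour and the minutes. When the shift crosses midnight the calendar date also changes, and that is not handled here.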
21
Example: my family
Directed responses to 3 family members:
  Daniel, question/response on activities
  Aviva, question/response on number of cranes
  Esther, response
Calculations (arithmetic) done using variables
if tags
  The cond attribute is a condition test.
Limited error handling: exit on no-match event
  alternative is to repeat the prompt, generally using the count attribute
You will need to sketch a tree of possibilities.
22
<vxml version="2.0">
<form>
  <field name="childid">
    <prompt>
      <audio src="whosthis.wav">Hello. Who is calling?</audio>
    </prompt>
4 possibilities for Daniel, 2 each for Aviva and Esther.
23
    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[
      [
        [dan daniel (daniel meyer) (dan meyer)] {<childid "daniel">}
        [aviva (aviva meyer)] {<childid "aviva">}
        [esther (esther minkin)] {<childid "esther">}
      ]
      ]]>
    </grammar>
24
    <catch event="noinput nomatch">
      <audio src="sorry.wav">Sorry. I didn't get that.</audio>
      <exit/>
    </catch>
    <filled>
      <if cond="'daniel'==childid">
        <goto next="#danfollowup"/>
      <elseif cond="'aviva'==childid"/>
        <goto next="#avivafollowup"/>
      <elseif cond="'esther'==childid"/>
        <goto next="#estherfollowup"/>
      <else/>
        <reprompt/>
      </if>
    </filled>
  </field>
</form>
The <else/> branch never happens.
The #xxx are forms in this document.
Note inner, single quote marks. Note double ='s
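Taken together, the grammar and the <filled> branching amount to two lookups: from what the caller says to a slot value, and from the slot value to the next form. The JavaScript here only simulates that logic so the tree of possibilities can be checked away from the phone. It does not show how the Tellme recognizer works, and the names phrases, recognizeChild and nextForm are hypothetical.

```javascript
// Phrases accepted by the grammar on the slide, keyed by the childid value they return.
var phrases = {
  daniel: ['dan', 'daniel', 'daniel meyer', 'dan meyer'],
  aviva: ['aviva', 'aviva meyer'],
  esther: ['esther', 'esther minkin']
};

// Return the slot value for an utterance, or null when nothing matches (a nomatch event).
function recognizeChild(utterance) {
  var said = utterance.toLowerCase();
  for (var childid in phrases) {
    if (phrases[childid].indexOf(said) !== -1) {
      return childid;
    }
  }
  return null;
}

// Mirror of the <if>/<elseif> chain: pick the form to go to next.
function nextForm(childid) {
  if ('daniel' == childid) {
    return '#danfollowup';
  } else if ('aviva' == childid) {
    return '#avivafollowup';
  } else if ('esther' == childid) {
    return '#estherfollowup';
  }
  return null; // the <else/> case, which the grammar never produces
}

console.log(nextForm(recognizeChild('Dan Meyer'))); // #danfollowup
```

Because the grammar can only return one of three values, the final branch is unreachable, which is the point of the "never happens" remark.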
25
<form id="danfollowup">
  <field name="today">
    <prompt>
      <audio src="congratsdan.wav">Congratulations on the new job. Did you work on your thesis, or do aikido or jo today?</audio>
    </prompt>
    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[
      [
        [aikido (i key dough)] {<today "aikido">}
        [thesis (work)] {<today "thesis">}
        [jo (joe)] {<today "jo">}
        [both (all) (everything) ((i key dough) jo)] {<today "both">}
        [none nothing (sort of)] {<today "nothing">}
      ]
      ]]>
    </grammar>
    <catch event="noinput nomatch">
      <audio>I didn't quite understand. Call or send .</audio>
      <exit/>
    </catch>
Daniel's followup.
26
    <filled>
      <if cond="today=='aikido'">
        <audio>Some aikido is fine. </audio>
      <elseif cond="today=='thesis'"/>
        <audio>Good, but do other things also.</audio>
      <elseif cond="today=='jo'"/>
        <audio>don't get hit in the head.</audio>
      <elseif cond="today=='both'"/>
        <audio>Doing some of everything is best. </audio>
      <elseif cond="today=='nothing'"/>
        <audio> You deserve a break, but remember you want to be done by September </audio>
      <else/>
        <audio> See you soon.</audio>
      </if>
    </filled>
  </field>
  <block>
    <audio> Good bye </audio>
  </block>
</form>
Aikido and jo are martial arts.
Again: note that the value of the cond attribute is a string holding a logical test.
27
<form id="avivafollowup">
  <var name="rest" expr="1000"/>
  <field name="bcount" type="number">
    <prompt>
      <audio src="howmanycranes.wav">Hello, Aviva. How many cranes have you made? </audio>
    </prompt>
    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[ NATURAL_NUMBER_THRU_9999 ]]>
    </grammar>
    <catch event="noinput nomatch">
      <audio src="sorry.wav">Sorry. I didn't get that.</audio>
      <exit/>
    </catch>
This makes use of a grammar that VoiceXML supplies, namely a natural (whole) number from zero to 9999.
The code expects a number from 0 to 1000.
28
    <filled>
      <assign name="rest" expr="1000-bcount"/>
      <audio> <value expr="rest"/> </audio>
      <audio src="togo.wav"> to go. </audio>
      <if cond="rest&lt;200">
        <audio src="homestretch.wav">You're in the home stretch </audio>
      <elseif cond="rest&lt;500"/>
        <audio src="morethanhalf.wav">More than half way </audio>
      <elseif cond="rest&lt;800"/>
        <audio src="goodstart.wav">Off to a good start </audio>
      <else/>
        <audio> Get a move on </audio>
      </if>
      <audio src="goodbye.wav">Good bye. </audio>
    </filled>
  </field>
</form>
Note: can't use < inside the cond attribute; write &lt; for the less than sign.
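The arithmetic and the if/elseif chain for Aviva can be traced in plain JavaScript before trying them on the phone. The sketch below uses the hypothetical function name craneReport; the goal of 1000, the thresholds, and the messages are the ones on the slide.

```javascript
// Mirror of the <assign> and <if>/<elseif>/<else> logic in the avivafollowup form.
function craneReport(bcount) {
  var rest = 1000 - bcount; // cranes still to make
  var message;
  if (rest < 200) {
    message = "You're in the home stretch";
  } else if (rest < 500) {
    message = 'More than half way';
  } else if (rest < 800) {
    message = 'Off to a good start';
  } else {
    message = 'Get a move on';
  }
  return { rest: rest, message: message };
}

var report = craneReport(600);
console.log(report.rest + ' to go. ' + report.message); // 400 to go. More than half way
```

The tests run from the smallest threshold up, so each branch only needs an upper bound. Since the grammar accepts numbers up to 9999, a reply above 1000 gives a negative rest, which still satisfies the first test.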
29
<form id="estherfollowup">
  <block>
    <audio>Hello, Mommy. This is all I can do now. </audio>
  </block>
</form>
</vxml>
I ran out of energy, so this is all the system says to Esther (my Mommy).
30
[again] Application logic
Implicitly in the way menus and grammars work
VoiceXML elements (for example, <if> and <var>)
JavaScript code in attributes (for example, cond, expr)
JavaScript code in <script> </script>
  Encase in CDATA to avoid problems with certain characters
External JavaScript code, cited using <script src=file address />
Any or all of these options. More than one way to do things.
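When JavaScript sits in an attribute such as cond or expr, a CDATA section is not available, so characters that XML treats specially have to be written as entities. Here is a small sketch of that escaping in plain JavaScript. The helper escapeForAttribute is a hypothetical aid for preparing attribute text by hand; it is not part of VoiceXML or of the tellme tools.

```javascript
// Replace the characters that XML treats specially inside a double-quoted attribute.
function escapeForAttribute(text) {
  return text
    .replace(/&/g, '&amp;')   // must come first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

console.log(escapeForAttribute('rest<200'));        // rest&lt;200
console.log(escapeForAttribute("'daniel'==childid")); // unchanged: single quotes are safe here
```

Inside a <script> element the same code can instead be wrapped in a CDATA section and left unescaped.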
31
Class work [if time]
EVERYONE (who hasn't already): sign up at studio.tellme.com
Design a SIMPLE application (you may work in groups):
  Ask one question
  Detect and respond to each of 2 or 3 answers
  Use the examples here as models
  All text to speech
Pick (at least) one and implement.
32
Homework
Go to studio.tellme.com
  [sign up as developer]
  try examples (using scratch pad)
  record some voice samples
Study tellme tutorials!!!!
Note: the final project will be a tellme application; it may be done in teams of 2 or 3.