VoiceXML continued Speech reco/speech synthesis recap rps example ( ) Homework: Do VoiceXML examples. Start planning Project 2.

1 VoiceXML continued Speech reco/speech synthesis recap rps example ( ) Homework: Do VoiceXML examples. Start planning Project 2.

2 Speech recognition / Speech synthesis Area of research Area of ‘marketing’ / product research –What is the killer app? Perhaps phones & PDAs. Focus was on dictation You do not need to understand the science since Tellme Studio does it for us. –Some background is always helpful.

3 Speech recognition concepts Air pressure → diaphragm in phone → electrical signal → (Fourier Transform) → wave pattern matched against sets of canonical patterns (native speaker of English, perhaps male/female & young/old alternatives) generated for the specified grammar (using a segmentation = dividing up of the parts) produce set of probabilities for each option in grammar –Tellme/VoiceXML provides fieldname$.confidence
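The last step above, turning a set of probabilities into one accepted answer, can be sketched in JavaScript. This is only an illustration: the function name bestMatch, the shape of the scores object, and the 0.5 threshold are assumptions, not the Tellme recognizer's actual interface.

```javascript
// Hypothetical recognizer output: one probability per grammar option.
// Pick the most likely option, but reject it if confidence is too low.
function bestMatch(scores, threshold) {
  let best = null;
  for (const option of Object.keys(scores)) {
    if (best === null || scores[option] > scores[best]) {
      best = option;
    }
  }
  if (best === null || scores[best] < threshold) {
    // Nothing convincing was heard: a dialog would treat this as a nomatch.
    return { match: null, confidence: best === null ? 0 : scores[best] };
  }
  return { match: best, confidence: scores[best] };
}

// Made-up numbers for illustration only.
const clear = bestMatch({ rock: 0.81, paper: 0.12, scissors: 0.07 }, 0.5);
const unclear = bestMatch({ rock: 0.4, paper: 0.35, scissors: 0.25 }, 0.5);
```

Here clear.match is 'rock', while unclear.match is null because no option reaches the threshold.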

4 Fourier Transform (Discrete Fourier Transform, usually computed with the Fast Fourier Transform, FFT) Takes data representing a signal and produces numbers representing the combination of sine and cosine waves that make up the signal
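A minimal sketch of what the transform computes, assuming a short list of real-valued samples. This is the slow, direct form of the discrete Fourier transform, written for readability; an FFT produces the same numbers with far less work. The names dft, signal and spectrum are made up for this example.

```javascript
// Naive discrete Fourier transform, O(n^2).
// For each frequency k it measures how much of a cosine wave (re)
// and a sine wave (im) at that frequency is present in the samples.
function dft(samples) {
  const n = samples.length;
  const bins = [];
  for (let k = 0; k < n; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    bins.push({ re: re, im: im, magnitude: Math.sqrt(re * re + im * im) });
  }
  return bins;
}

// A made-up test signal: exactly one cosine cycle across 8 samples.
const signal = [];
for (let t = 0; t < 8; t++) {
  signal.push(Math.cos((2 * Math.PI * t) / 8));
}
const spectrum = dft(signal);
```

For this signal the energy lands in bin 1 (and its mirror image, bin 7), and the other bins are zero up to rounding error.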

5 Speech recognition Works on the product of the FFT Uses (in most cases) –Segmentation: attempt to break up into pieces, perhaps syllables or words –Grammar: definition of what is to be expected –Probabilities: if first part matched X, then greater probability that the next would match Y

6 Speech synthesis aka TTS (text to speech) lexical units (syllables of words) → phonemes → pre-recorded (wav) files of phonemes This is again a segmentation process: need to divide up the words and then put together so speech sounds 'natural'. –particular phoneme may [need to] sound different in different contexts. –also need to deal with abbreviations & local accents –Place names (important in travel & weather applications) Special case: detect and use wav file for each name. Older methods were all synthesized –similar distinction between all synthesized and samples of music

7 Speech synthesis is essentially ‘the computer’ reading ‘out loud’. Easy to do most things More and more difficult to do complete job

8 VoiceXML elements <nomatch> and <noinput> –for handling less than ideal conditions <menu> and <choice> –for selection from list (alternative to grammar) <break> –pause

9 Accepting caller input Application: getting caller name More complex than you would think There is the <record> element

10 Simple record/playback Hello. Who is calling? Hello. Have a nice day.

11 Problems Replays in caller’s voice Can’t use outside the form Could save to server: lookup record under VoiceXML elements. –Sample php code

12 More involved dialogue Hello. Who is calling? Great to hear from you.

13 Continued: ask for class Which class are you in? <![CDATA[ [ [databases] { } [interfaces both] { } [no none not] { } ] ]]>
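The grammar above lets several spoken words stand for one answer. A rough JavaScript sketch of that idea follows. The returned values ('databases', 'interfaces', 'none') are assumptions, since the actions inside the braces did not survive in the transcript, and classGrammar and interpret are hypothetical names.

```javascript
// Each rule lists the words a caller may say and the single value
// the dialog would receive. The values are assumed for illustration.
const classGrammar = [
  { words: ['databases'], value: 'databases' },
  { words: ['interfaces', 'both'], value: 'interfaces' },
  { words: ['no', 'none', 'not'], value: 'none' },
];

// Return the value for what was heard, or null when nothing in the
// grammar matches (the situation a nomatch handler deals with).
function interpret(utterance) {
  const heard = utterance.trim().toLowerCase();
  for (const rule of classGrammar) {
    if (rule.words.indexOf(heard) !== -1) {
      return rule.value;
    }
  }
  return null;
}
```

With these assumed values, interpret('both') and interpret('interfaces') give the same result, and an unexpected word gives null.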

14 Continued: respond for each class Good to hear from you check the next class notes for how to do this This may not be quite what some of you wanted. I don't know why you are calling but it is nice to chat

15 Suggestion Get the simple one working and then work on the second one Suggested use is for callers to record a greeting –So it makes sense to be in the caller’s voice Experiment with grammar

16 rock paper scissors Note: a different version is on the tellme site. This is not a great choice (identify a better one) since it doesn't support the illusion that 'the computer' is making the choice at the same time that the player is. … but it does illustrate the features.

17 rps logic requirements create a random move for 'the computer' –use Math.random and Math.floor prompt and then distinguish caller saying 'rock', 'paper', 'scissors', 'score', 'quit' –use <menu> and <choice> elements keep score –use JavaScript variables modest amount of error handling –<nomatch> and <noinput> and count attribute in <prompt>

18 VoiceXML features Menu element replaces form element (and its use of field and grammar elements) Each prompt has a count. Used when nomatch or noinput occurs and you don’t want to re-prompt with the exact same words. Can distinguish nomatch from noinput Can control timing: amount of time waiting for caller input and time between system utterances. –My version could be improved!
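How the count picks a different wording on a repeat attempt can be simulated in JavaScript. This sketch follows the VoiceXML rule of playing the prompt with the highest count that does not exceed the current attempt number. The function name selectPrompt and the count values assigned to the two prompts are assumptions; the prompt wording is taken from the rps example.

```javascript
// Choose the prompt whose count is the highest one not greater than
// the current attempt number (1 for the first try, 2 for the second...).
function selectPrompt(prompts, attempt) {
  let chosen = null;
  for (const p of prompts) {
    if (p.count <= attempt && (chosen === null || p.count > chosen.count)) {
      chosen = p;
    }
  }
  return chosen === null ? null : chosen.text;
}

// Assumed counts: a short first prompt, a longer one after a failure.
const prompts = [
  { count: 1, text: 'Say rock, paper, scissors, score or quit' },
  { count: 2, text: 'Please make a choice, rock, paper or scissors, or say score to get the score or quit to quit' },
];
```

On the first attempt the short prompt plays. On the second and every later attempt the longer one plays, because it has the highest count that still fits.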

19 var moves = ['rock', 'paper', 'scissors']; // the three possible moves
// Pick the computer's move at random.
function randommove() {
  var r = Math.floor(Math.random() * 3); // 0, 1 or 2
  return moves[r];
}

20 Say rock, paper, scissors, score or quit Please make a choice, rock, paper or scissors, or say score to get the score or quit to quit I guess you're done.

21 I didn't understand. I didn't hear anything rock scissors paper score quit

22 Scores are wins. Losses Ties loops back
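The score prompt above reads out three counters. A minimal sketch of keeping them in JavaScript variables, assuming the outcome of each round arrives as one of the labels 'win', 'lose' or 'tie' (these labels and the function names are made up for this example):

```javascript
// Score kept in plain JavaScript variables.
var wins = 0;
var losses = 0;
var ties = 0;

// Update the counters after one round.
function recordResult(outcome) {
  if (outcome === 'win') {
    wins += 1;
  } else if (outcome === 'lose') {
    losses += 1;
  } else {
    ties += 1;
  }
}

// Build the sentence the score prompt would speak.
function scoreMessage() {
  return 'Scores are ' + wins + ' wins. ' + losses + ' losses. ' + ties + ' ties.';
}

recordResult('win');
recordResult('tie');
recordResult('win');
```

After these three rounds, scoreMessage() reports 2 wins, 0 losses and 1 tie.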

23 Computer played Tie You lose. Paper covers rock. You win. Rock breaks scissors
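The responses above depend on comparing the caller's move with the computer's. One way to sketch that comparison in JavaScript is shown below. The names beats, reasons, judge and announce are hypothetical, and "Scissors cut paper" is an assumed wording, since only the rock and paper phrases appear in the slides.

```javascript
// For each move, the move it defeats.
const beats = { rock: 'scissors', paper: 'rock', scissors: 'paper' };

// Spoken explanation, keyed by the winning move.
const reasons = {
  rock: 'Rock breaks scissors',
  paper: 'Paper covers rock',
  scissors: 'Scissors cut paper', // assumed wording
};

// Outcome from the caller's point of view: 'win', 'lose' or 'tie'.
function judge(player, computer) {
  if (player === computer) {
    return 'tie';
  }
  return beats[player] === computer ? 'win' : 'lose';
}

// Build the full response for one round.
function announce(player, computer) {
  const outcome = judge(player, computer);
  const said = 'Computer played ' + computer + '. ';
  if (outcome === 'tie') {
    return said + 'Tie.';
  }
  if (outcome === 'win') {
    return said + 'You win. ' + reasons[player] + '.';
  }
  return said + 'You lose. ' + reasons[computer] + '.';
}
```

For example, announce('rock', 'paper') yields "Computer played paper. You lose. Paper covers rock."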

24 Exercise: write the missing forms. Exercise: improve prompts. (Exercise when you can test it using the phone: improve use of breaks.)

25 Good bye

26 testing Go to your tellme account Point your account at the website. For my examples, find out URLs by going to my XML stuff site. Rock-paper-scissors is at http://newmedia.purchase.edu/~Jeanine/interfaces/rps.xml

27 Adding your voice Use tellme studio procedure for recording using the phone. They mail you the file. You should rename file. You upload that file to your website. You will need to keep track of files. They will use a name/number system for files sent. –Do one at a time

28 Other VoiceXML elements subdialog –to jump to encapsulated dialog OR server-side script (which generates a VoiceXML document) submit –to jump to a new document generated by server-side script data –to access XML content (generated by server-side programs) record –recording caller input for storage on server transfer –transfer to another phone number

29 Phone as interface: recap Familiar, ubiquitous, friendly (?), potentially hands-free Design challenge same (similar, not graphic design) –focus on function (information exchange) –caller actions can be guided but still potentially quite variable

30 Homework VoiceXML exploration exercises Begin to plan Project 2. Look at my pages for ideas. Make posting to Forum (indicate team members). –another Web/XML project –VoiceXML –WML and/or XHTML-MP (using Nokia or OpenWave. Can also try with your phone, if your phone is WAP enabled.)

