Published by Piers Welch. Modified over 9 years ago.
1
The Speech Speech
casey chesnut
brains-N-brawn.com
Madison.NET, April 2007
2
PowerPoint
Page Up
Page Down
3
brains-N-brawn.com
Pervasive Computing
  – Tablet PC (MVP 03)
  – Compact Framework (MVP 04)
  – Advanced Web Services (MVP 05)
  – Media Center (MVP 06)
  – Speech
  – Location Based Services
  – Artificial Intelligence
  – 3D
4
Outline
Speech Overview
Vista Speech Recognition
SAPI 5.3 / System.Speech
Speech Server 2007
5
Outline : Speech Overview
Voice User Interface
How does it work?
  – Synthesis (TTS)
  – Recognition (SR)
6
Overview
Speech is just another presentation system
  – Synthesis = Output to user
  – Recognition = User input
Voice User Interface (VUI)
7
VUI Modes
Applications
  – Multi-modal
  – Voice-only
8
VUI Tips
Don't replicate the touch-tone-based menu system
Restrict options on the main (opening) menu to 4 or fewer
Make sure your opening greeting is short
Don't design the app solely for the new user
Focus on task completion above all
What can I say?
http://blogs.msdn.com/anandis_thoughts/archive/2006/02/08/528181.aspx
9
Speech Synthesis
Text to Speech
  – Dynamic
  – Prompt database
10
How Synthesis Works
Text parsing
  – Sentences, numbers, symbols, pauses
Natural language processing
  – Part of speech, tense
Phonemes are looked up or sounded out
Diphones are appended together
Post process audio to add emphasis
Play speech audio
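The first stages of the pipeline above can be sketched in a few lines of Python. This is a toy illustration, not how any real TTS engine is implemented: the lexicon, the number table, and the function names are all hypothetical, and the "sounded out" fallback simply spells a word letter by letter as a stand-in for real letter-to-sound rules.

```python
import re

# Hypothetical toy lexicon: word -> phonemes (ARPAbet-like symbols, illustrative only).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
    "two": ["T", "UW"],
}

# Hypothetical number expansion table used by the text-parsing stage.
NUMBERS = {"2": "two"}


def normalize(text):
    """Text parsing: lowercase, keep words and digit runs, expand known numbers."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    return [NUMBERS.get(tok, tok) for tok in tokens]


def to_phonemes(words):
    """Look each word up; fall back to spelling it out letter by letter."""
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON.get(word, list(word.upper())))
    return phonemes


def to_diphones(phonemes):
    """Pair adjacent phonemes, padding with silence ('_') at both ends."""
    padded = ["_"] + phonemes + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


words = normalize("Hello, 2 world!")
phonemes = to_phonemes(words)
diphones = to_diphones(phonemes)
```

A real engine would then fetch recorded audio for each diphone, concatenate it, and post-process the result for emphasis before playback.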
11
How Synthesis Works
Demo
  – /xnaSynth app
Article
  – http://www.brains-N-brawn.com/ttSpeech/
  – http://www.brains-N-brawn.com/xnaSynth/ (codebase from /ttSpeech)
12
Speech Recognition
Speech to Text
  – Dictation
  – Command and Control
13
How Recognition Works
Audio signal is processed
Look for signals which might be speech
Phonemes are found in audio signals
Phonemes are mapped to a dictionary of words
  – Dictation or grammar-based
Apply natural language processing
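The "look for signals which might be speech" step can be illustrated with a simple energy threshold. This is a minimal sketch under assumptions: the frame size, threshold, and function names are hypothetical, and real recognizers use far more robust detection than mean energy.

```python
import math


def frame_energies(samples, frame_size):
    """Split samples into fixed-size frames; return mean squared amplitude per frame."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


def speech_frames(samples, frame_size=160, threshold=0.01):
    """Return indices of frames whose energy exceeds the (assumed) threshold."""
    energies = frame_energies(samples, frame_size)
    return [i for i, energy in enumerate(energies) if energy > threshold]


# Synthetic test signal at 8 kHz: two silent frames, two frames of a 440 Hz tone,
# then one more silent frame. The tone stands in for speech.
rate = 8000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(320)]
signal = silence * 2 + tone + silence

detected = speech_frames(signal)
```

Only the frames flagged here would be passed on to the later stages (phoneme detection, dictionary or grammar lookup).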
14
How Recognition Works
Demo
  – /wavReader app
Article
  – http://www.brains-N-brawn.com/noReco/
  – http://www.brains-N-brawn.com/speakerVerify/ (codebase from /noReco)
15
Outline : Vista Speech Recognizer
Built into Vista’s shell
Microphone bar
Language support
Can be trained to improve accuracy
Command-and-control and dictation
Automagic application support
Horrible Office integration
UAC problems
16
Demo
Say what you see
Show numbers
Correct
Spell it
Mouse grid
http://www.istartedsomething.com/20060808/vista-speech-recognition-screencast/
17
High Risk Demo
18
Hack
http://news.bbc.co.uk/1/hi/technology/6320865.stm
/micBarExtend – tap and talk
19
Narrator
Vista’s screen reader
20
Outline : SAPI 5.3 / System.Speech
Desktop applications
  – SAPI 5.3
  – System.Speech
21
SAPI 5.3
COM based
Native applications
Managed apps which need more control
22
System.Speech
Part of .NET 3.0 WPF
Managed wrapper built on SAPI 5.3
Simple API
Standards support (SSML, SRGS)
Language support
Vista Speech Recognition integration
Does not work in XBAP
23
System.Speech.Synthesis
SpeechSynthesizer
SSML
PromptBuilder
Voices
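SSML is a W3C XML vocabulary, so a prompt can be assembled with any XML library. The sketch below uses Python's standard library rather than System.Speech itself; the function name and the prompt text are hypothetical, and only the SSML 1.0 elements speak, emphasis, and break are used.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize SSML elements without a prefix (as the default namespace).
ET.register_namespace("", SSML_NS)


def build_ssml(greeting, name):
    """Build a small SSML prompt: greeting, emphasized name, pause, question."""
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
    )
    speak.text = greeting + " "
    emphasis = ET.SubElement(speak, f"{{{SSML_NS}}}emphasis", {"level": "strong"})
    emphasis.text = name
    pause = ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": "500ms"})
    pause.tail = " How can I help?"
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Hello", "Casey")
print(ssml)
```

The resulting string is the kind of markup an SSML-aware synthesizer accepts in place of plain text.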
24
System.Speech.Synthesis
Demo
  – /speechSamples - /speechSynth
25
System.Speech.Recognition
SpeechRecognizer / SpeechRecognitionEngine
SRGS
GrammarBuilder
Advanced users
  – Deep-link functionality
  – Mixed initiative
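A command-and-control grammar in SRGS XML form is just a rule listing the phrases the recognizer should accept. The sketch below generates one with Python's standard library; it is not the GrammarBuilder API, and the function name, rule id, and phrases are hypothetical.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize SRGS elements without a prefix (as the default namespace).
ET.register_namespace("", SRGS_NS)


def q(tag):
    """Qualify a tag name with the SRGS namespace."""
    return f"{{{SRGS_NS}}}{tag}"


def build_choice_grammar(rule_id, phrases):
    """Build an SRGS grammar whose root rule accepts exactly one of the phrases."""
    grammar = ET.Element(
        q("grammar"),
        {
            "version": "1.0",
            f"{{{XML_NS}}}lang": "en-US",
            "root": rule_id,
            "mode": "voice",
        },
    )
    rule = ET.SubElement(grammar, q("rule"), {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, q("one-of"))
    for phrase in phrases:
        item = ET.SubElement(one_of, q("item"))
        item.text = phrase
    return ET.tostring(grammar, encoding="unicode")


grammar_xml = build_choice_grammar("command", ["page up", "page down"])
print(grammar_xml)
```

Restricting recognition to a short list like this is what makes command-and-control more reliable than open dictation.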
26
System.Speech.Recognition
Demo
  – /speechSamples - /speechReco
27
System.Speech
Demo
  – /micBarExtend
  – /mceSapiMcpl
Article
  – http://www.brains-N-brawn.com/speechSamples/
  – http://www.brains-N-brawn.com/micBarExtend/
  – http://www.brains-N-brawn.com/mceSapi/ (not updated for Vista yet)
28
What about Mobile Devices
OEMs can add VoiceCommand
  – VoiceCommand is not accessible to developers
WindowsMobile has the SAPI API, but no engines
PlatformBuilder is supposed to have engines
There are 3rd party engines for purchase
29
Outline : Speech Server 2007
30
Speech Server 2007
Telephony Applications
Outgoing calls
Speaker Independent
31
Speech Server 2007
VOIP
Language support
VoiceXML / SALT
Workflow development model
Reports
Still in beta
32
Speech Server 2007
Speech Synthesis
  – Inline
  – PromptBuilder
  – SSML
  – Prompt databases
Speech Recognition
  – Inline
  – Dynamic Grammar
  – SRGS
  – Conversational Grammar Builder
  – DTMF
33
VoiceXML
Declarative language
Article
  – http://www.brains-N-brawn.com/vxml/
  – http://www.brains-N-brawn.com/myVoices/
  – http://www.brains-N-brawn.com/voiceBio/
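To show what "declarative" means here, the sketch below generates a minimal VoiceXML 2.0 menu with Python's standard library. The function name, prompt text, and the "#sales" / "#support" targets are hypothetical placeholders; a complete application would also define the dialogs those targets point to.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Serialize VoiceXML elements without a prefix (as the default namespace).
ET.register_namespace("", VXML_NS)


def q(tag):
    """Qualify a tag name with the VoiceXML namespace."""
    return f"{{{VXML_NS}}}{tag}"


def build_menu(prompt_text, choices):
    """Build a VoiceXML menu; choices is a list of (phrase, DTMF key, target URI)."""
    vxml = ET.Element(q("vxml"), {"version": "2.0"})
    menu = ET.SubElement(vxml, q("menu"))
    prompt = ET.SubElement(menu, q("prompt"))
    prompt.text = prompt_text
    for phrase, key, target in choices:
        choice = ET.SubElement(menu, q("choice"), {"dtmf": key, "next": target})
        choice.text = phrase
    return ET.tostring(vxml, encoding="unicode")


document = build_menu(
    "Say sales or support.",
    [("sales", "1", "#sales"), ("support", "2", "#support")],
)
print(document)
```

The markup states what the caller can say or press and where each choice leads; the voice browser decides how to play the prompt and run recognition.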
34
SALT
Yet another declarative language
Multimodal support has been dropped
Article
  – http://www.brains-N-brawn.com/noHands/
  – http://www.brains-N-brawn.com/speechMulti/
  – http://www.brains-N-brawn.com/tabletWeb/
  – http://www.brains-N-brawn.com/mceSalt/
35
Speech Workflow
Speech Sequence Workflow designer
Speech activities
  – Statement
  – QuestionAnswer
Debugging tools
36
Speech Workflow
Demo
  – /speechTextAdv
  – /speakerVerify
  – /mobileRecord
Article
  – http://www.brains-N-brawn.com/speechTextAdv/
  – http://www.brains-N-brawn.com/speakerVerify/
37
Where
Accessibility
Telephony
Telematics
Home automation
Mobile Devices / Tablets
Gaming
Warehouses
…
38
Possible Future
Telematics
Service Pack for Office Support
Exchange Server 2007
Speech Server 2007 release
Rumors that WindowsMobile will get a public API
Dictation has room to improve
Hope that System.Speech will ultimately work in XBAP
39
Questions