Multimodal Apps: Tablet PC & Speech Development in .NET casey chesnut brains-N-brawn.com Wisconsin .NET June 2005
Source Code The associated source can be found here: –
Seamless Computing Advanced Web Services (MVP05) Compact Framework (MVP04) MapPoint Tablet PC (MVP03) Speech Artificial Intelligence Direct3D Media Center
Questions How many programmers? –Tablet PC –Speech –Media Center
Outline Tablet PC Speech –Speech API (SAPI) –Speech Application SDK (SASDK) –Speech Server Demo –Tablet and Speech –Media Center and Speech
Outline : Tablet PC Development environment How it works Working with Ink Opinion Future
Development Environment Windows XP Pro (non Tablet edition) Visual Studio .NET 1.1 Tablet PC SDK 1.7 – 4b83-a821-40bc-aa85-c9ee3d6e9699&displaylang=en Recognizer Pack – 184DD-5E B E9F918B&displaylang=en Digitizer Board – Tablet PC
How Ink works Digitizer collects stroke information Strokes are broken up into characters / words / drawings Character / word stroke info is transformed into some feature set Feature set is run through some sort of pre-trained AI Output is mapped to a dictionary of words
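The feature-set and pre-trained-AI steps above can be sketched in a few lines. This is only a toy illustration, not how the Tablet PC recognizer is implemented: the direction-histogram feature, the nearest-template classifier, and the names `direction_histogram`, `classify`, and `TEMPLATES` are all assumptions made for the example.

```python
import math

def direction_histogram(stroke, bins=8):
    """Turn a stroke (list of (x, y) points) into a normalized histogram
    of segment directions, weighted by segment length."""
    hist = [0.0] * bins
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # repeated point, carries no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Round to the nearest of `bins` evenly spaced directions.
        index = round(angle / (2 * math.pi) * bins) % bins
        hist[index] += length
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def classify(stroke, templates):
    """Return the label whose stored feature vector is closest
    (squared Euclidean distance) to the stroke's features."""
    feats = direction_histogram(stroke)
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(feats, templates[label]))
    return min(templates, key=distance)

# Hypothetical "training" data: one clean example stroke per symbol.
TEMPLATES = {
    "-": direction_histogram([(0, 0), (10, 0)]),
    "|": direction_histogram([(0, 0), (0, 10)]),
    "L": direction_histogram([(0, 0), (0, 10), (10, 10)]),
}

# A slightly wobbly horizontal stroke still lands on "-".
print(classify([(0, 0), (5, 1), (10, 0)], TEMPLATES))
```

A real recognizer would use far richer features and a trained model, but the shape of the pipeline (points in, feature vector, closest known class out) is the same.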
Demo Digitizer collects stroke information Tablet PC Inspector –
Demo Strokes are broken up into characters / words / drawings InkDivider –Tablet PC SDK Sample
Demo Character / word stroke info is transformed into some feature set Feature set is run through some sort of pre-trained AI Demo –/aiTabletOcr Article – / /
Demo Output is mapped to a dictionary of words Dictionary Tool – Article – / /
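One way to picture the dictionary step: take the raw recognizer output and snap it to the closest known word. The sketch below uses plain Levenshtein edit distance; that choice, and the helper names `edit_distance` and `snap_to_dictionary`, are assumptions for illustration and not a description of the Dictionary Tool.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]

def snap_to_dictionary(raw, dictionary):
    """Return the dictionary word closest to the raw recognizer output."""
    return min(dictionary, key=lambda word: edit_distance(raw.lower(), word))

# Hypothetical word list; "tab1et" stands in for a misread "tablet".
WORDS = ["tablet", "table", "speech"]
print(snap_to_dictionary("tab1et", WORDS))
```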
Working with Ink InkControls InkOverlay –Collection –Recognition RealTimeStylus Ink on the web
Ink Controls InkEdit InkPicture Code from scratch
InkOverlay Collection Recognition Demo apps
RealTimeStylus RealTimeStylusPlugin –Tablet PC SDK Sample
Ink on the Web IE only InkBlogWeb –Tablet PC SDK Sample Article –
Opinion Green Light –Tablet PC Edition 2005 improved recognition and usability dramatically –Recognizer Pack made development more accessible –Language Support Chinese (Traditional and Simplified), U.S. English, U.K. English, French, German, Italian, Japanese, Korean, Spanish
Possible Future VS.NET 2005? Avalon? Will IE7 have tighter integration with ink? Longhorn – baked in Possibility for training ink recognition
What about Pocket PCs? Handwriting Recognition Form factors
Outline : Speech How does it work? –Synthesis (TTS) –Recognition (SR) Development –Speech API (SAPI) –Speech Application SDK (SASDK) –Speech Server (MSS)
How Synthesis Works Text is converted to phonemes Phonemes are appended together Audio is played back Demo –/ttSpeech app Article –
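The three synthesis steps can be mocked up as follows. This is a sketch under loud assumptions: the tiny `LEXICON`, the phoneme symbols, and the tone-based `unit_for` stand-in are invented for the example; a real engine uses a full pronunciation model and recorded or modeled speech units.

```python
import math

# Hypothetical pronunciation lexicon (ARPAbet-style symbols).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

def text_to_phonemes(text):
    """Step 1: convert text to a flat phoneme sequence via the lexicon."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word not in LEXICON:
            raise KeyError("no pronunciation for %r" % word)
        phonemes.extend(LEXICON[word])
    return phonemes

def unit_for(phoneme, sample_rate=8000, duration_ms=50):
    """Stand-in for a recorded speech unit: a short tone whose pitch
    depends on the phoneme name (purely illustrative)."""
    freq = 200 + 10 * (sum(map(ord, phoneme)) % 40)
    count = sample_rate * duration_ms // 1000
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(count)]

def synthesize(text):
    """Steps 2 and 3: append the units together into one sample buffer
    that an audio device could then play back."""
    samples = []
    for phoneme in text_to_phonemes(text):
        samples.extend(unit_for(phoneme))
    return samples

print(text_to_phonemes("Hello, world!"))
print(len(synthesize("hello")), "samples")
```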
How Recognition Works Audio wav is transformed to some meaningful form Phonemes are found in audio signals Phonemes are mapped to a dictionary of words Demo –wavReader app Article –
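The last recognition step, mapping phonemes to dictionary words, can be sketched as a segmentation problem: split the phoneme stream into pieces that each match a known pronunciation. The `PRONUNCIATIONS` table and `phonemes_to_words` helper are hypothetical, and the sketch assumes the phonemes were already detected perfectly, which real audio never guarantees.

```python
# Hypothetical pronunciation dictionary: phoneme tuple -> word.
PRONUNCIATIONS = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("W", "ER"): "were",
}

def phonemes_to_words(phonemes):
    """Segment a phoneme sequence into dictionary words.

    best[i] holds one word list covering phonemes[:i], or None if no
    segmentation of that prefix exists. Returns None when the whole
    sequence cannot be covered by dictionary entries."""
    best = [None] * (len(phonemes) + 1)
    best[0] = []
    for end in range(1, len(phonemes) + 1):
        for start in range(end):
            key = tuple(phonemes[start:end])
            if best[start] is not None and key in PRONUNCIATIONS:
                best[end] = best[start] + [PRONUNCIATIONS[key]]
                break
    return best[-1]

print(phonemes_to_words(["HH", "AH", "L", "OW", "W", "ER", "L", "D"]))
```

Note that a greedy left-to-right match would pick "were" after "hello" and then get stuck on the leftover phonemes; tracking every prefix avoids that dead end.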
Speech API (SAPI) Old school COM Windows applications Can do dictation Demo –SAPI app
Opinion Yellow light –It works, but is aging –Has to be trained for dictation –Limited language support Green light for Tablet PCs –Tablet PC has recognition and synthesis engines installed –Some Tablets have microphone arrays built in
Future System.Speech –Simple API –Reflection capabilities –Standards support (SSML, SRGS) –Engines should be improved from all the Speech Server work
What about Pocket PCs? OEMs can add VoiceCommand Windows Mobile has the SAPI API, but no engines Platform Builder is supposed to have engines There are 3rd party engines for purchase
Speech Application SDK VS.NET 1.1 integration For web based apps –Voice-only telephony –Multimodal browser Demo –Code voice-only from scratch Article –
SASDK Speech Synthesis –Inline –Code behind –Prompt functions –Prompt databases Speech Recognition –Inline –Static Grammar –Dynamic Grammar –DTMF
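To make the static versus dynamic grammar distinction concrete: a static grammar is a fixed file, while a dynamic grammar is generated at run time, for example from a database query. The sketch below builds a minimal W3C SRGS XML grammar from a list of phrases. It is written in Python purely for illustration and does not use any SASDK API; the `build_grammar` helper and the city list are assumptions.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

def build_grammar(rule_id, phrases):
    """Build an SRGS XML grammar whose root rule accepts any one
    of the given phrases."""
    grammar = ET.Element("grammar", {
        "version": "1.0",
        "xmlns": SRGS_NS,
        "xml:lang": "en-US",
        "root": rule_id,
    })
    rule = ET.SubElement(grammar, "rule", {"id": rule_id})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in phrases:
        ET.SubElement(one_of, "item").text = phrase
    return ET.tostring(grammar, encoding="unicode")

# Hypothetical choices, e.g. loaded from a database at request time.
print(build_grammar("city", ["Madison", "Milwaukee", "Green Bay"]))
```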
Speech Server Runs SASDK applications Primarily for Voice-only apps Also for Multimodal Pocket PC apps Speech Language Packs –North American Spanish –Canadian French Article –
Deployment
Opinion Green light for Voice-Only –Great tool support –Cheap hardware –Language support Red light for Multimodal –Standards battle with VoiceXML –IE Speech Add-Ins are not accessible –Pocket IE Speech Add-In not updated for R2 release, nor does it support Smartphone
Possible Future VS.NET 2005? XAML? Will IE7 have voice browsing built-in? Other browsers to add SALT support? Pocket IE Professional?
Combo Demos Ink and Speech (WinForm) –InkCollection app – Ink and Speech (WebForm) –Video – Remote and Speech (AddIn) – Remote and Speech (HostedHTML) –
Questions