Speaking to Computers. Alex Acero, Manager, Speech Research Group, Microsoft Research. Feb 14th, 2003
Talk Outline: Role of speech technology in devices; Telephony; Smartphones and PDAs; Multimodality in User Interface
The Promise of Speech Technology
Role of Speech in Different Devices [Chart: Phone, Screen Phone, PDA, Tablet PC, PC, Car, and Internet TV plotted by ease of text input (keyboard/pen) against ease of GUI (screen/pointer), each axis running from Low to High]
A Roadmap for Speech [Chart: the same devices on the same two axes, with regions labeled Speech-Only Telephony, Dictation, and Multimodal Command/Control]
Speech Technology Market Opportunity [Chart: Meeting / Voicemail Transcription; Mobile Devices / Cars; Telephony / Call Center; Accessibility; Desktop Dictation; Desktop Command & Control, rated on Technology Readiness, Customer Need, and Poor Alternative]
The Business Value of Speech for Call Centers: Customer Focus; Less Time/Call; Efficient Agents; Less Time in Queue; Increased System Usage; Customer Retention; $5/call to $0.20/call; Reduced Call Time; Fewer Agents; New Revenue Opportunities; Up-Sell/Cross-Sell
Call Center Examples
  Amtrak: 61% increase in satisfaction; 75% increase in automation rate; 90% increase in ticket sales
  Thrifty Car Rental: 40% increase in CSR productivity; $1 million first-year savings
  Merrill Lynch: automation rates from 82% to 90%; first-year savings of $6.3M
The Business Value of Speech for Operators [Chart: revenue in US$M] Mobile operators need to make money from value-added services!
If you still doubt speech is good for the call center…
Why Speech at Microsoft? "Natural UI, or the combination of speech recognition, natural language understanding, automatic learning... Those are the key technologies that will have the most impact over the next 15 years." (Bill Gates, Microsoft Chairman)
Microsoft Speech Server & SDK: Visual Studio + ASP.NET + SALT; Multiple devices; Call center + multimodal solution; Unifies web & call center; Reduces TCO
Speech in Mobile Devices
  Rich Client (Microsoft Smartphone & PocketPC Phones): 3% to 16% of worldwide mobile phone market
  Thin Client (Smartphones): 11% to 25% of worldwide mobile phone market
  No Client (Cellular Phones): 86% to 59% of worldwide mobile phone market
  SOURCE: Gartner, IDC, Microsoft
Thin Client Devices Over Voice Channel [Diagram labels: Web Server; MS Speech Server; PSTN; SMS Messages; Voice Only Apps]
Rich Client Devices Over Data Channel [Diagram labels: Grammars; Prompts; ASP.NET Dialogs; Speech Engine Services; Telephony App Services; Web Server; MS Speech Server; SMS Push for Browser Launch]
Microsoft Voice Command: Pocket PC voice-enabled applications (Voice Dialer, Contacts, Calendar, Media Player); No connectivity necessary (100% embedded); No training needed (speaker-independent); Continuous speech recognition, e.g. “Call John at home”
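The “Call John at home” command suggests how a voice dialer might turn recognized text into a structured action. The slides do not describe how Voice Command does this internally, so the following is only a minimal sketch under stated assumptions: it works on text already produced by a recognizer, and the names `DialCommand` and `parse_dial_command`, as well as the small set of locations (home, work, mobile), are hypothetical and chosen purely for illustration.

```python
import re
from typing import NamedTuple, Optional


class DialCommand(NamedTuple):
    """Structured form of a spoken dialing request (hypothetical type)."""
    contact: str
    location: Optional[str]  # e.g. "home"; None when the user gave no location


# Assumed toy grammar: "call <contact> [at|on <home|work|mobile>]".
_CALL_PATTERN = re.compile(
    r"^call\s+(?P<contact>.+?)"
    r"(?:\s+(?:at|on)\s+(?P<location>home|work|mobile))?$",
    re.IGNORECASE,
)


def parse_dial_command(utterance: str) -> Optional[DialCommand]:
    """Parse recognized text into a DialCommand, or return None if it is not a call request."""
    match = _CALL_PATTERN.match(utterance.strip())
    if match is None:
        return None
    location = match.group("location")
    return DialCommand(
        contact=match.group("contact"),
        location=location.lower() if location else None,
    )


if __name__ == "__main__":
    print(parse_dial_command("Call John at home"))
    print(parse_dial_command("Call Mary Smith"))
    print(parse_dial_command("Play music"))
```

A real speaker-independent system would more likely constrain the recognizer itself with a grammar built from the contact list, rather than post-process free text as this sketch does.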
Multimodal Interactive Pad (MIPAD)
Multimodal Map
Current Speech User Interfaces: Improved speech user interfaces are needed; Even error-free, fast processing is not sufficient; But errors do occur, so better error correction is needed; Social issues: microphones can’t tether the user; Users are more comfortable talking to phones and cars; Talking to computers is not likely in meetings or cubicles
The Future of Natural User Interfaces
Bridging The Gap [Diagram labels: End User Needs; Technology, Research; Software; Scenarios]
Thank You!