Smart E-mail System
Aarthi Natarajan, Suhaib A. Obeidat, Ganesh Sridharan
Outline
- Motivation
- High-level design
- Functional design
  - E-mail message and address book sorting/prioritizing
  - Device control using e-mail
  - Voice interface
- Implementation
- Performance metrics
- Future work
- Conclusions
Motivation
- Reduce human distraction
  - Free the hands and eyes
  - Less time to achieve tasks (user behavior, voice)
  - Least amount of attention
- Provide more services and flexibility
  - Access over a phone
  - Remote control of devices
  - Ability to prioritize
High-Level Design (HLD)
- User interface (GUI + voice-enhanced)
- User behavior monitor (capturer)
- SMTP/IMAP system interface
- Message sorting and prioritizing
- Device control module
- Arrival notification module
Display Management
1. Sorting of e-mails, based on the user's access patterns.
2. Sorting of the address book, based on the frequency of correspondence.
3. Notifying the user of new message arrival, based on:
   i. access to the user's calendar
   ii. frequency of correspondence
   iii. user-assigned trust
   iv. subject keywords
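The notification criteria above can be sketched as a single priority score. This is a minimal illustration in Python (the client itself is Java-based); the `Contact` fields, the keyword set, the weights, and the thresholds are all assumptions, since the slides do not give any values.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    address: str
    frequency: int   # messages exchanged with this contact (assumed counter)
    trust: float     # user-assigned trust in [0, 1] (assumed scale)


# Hypothetical subject keywords; the slides do not list any.
KEYWORDS = {"urgent", "deadline", "meeting"}


def priority(contact, subject, max_frequency):
    """Combine frequency, trust and subject keywords into a score in [0, 1]."""
    freq_part = contact.frequency / max_frequency if max_frequency else 0.0
    words = {w.strip(".,:;!?").lower() for w in subject.split()}
    keyword_part = 1.0 if words & KEYWORDS else 0.0
    # Assumed weights: 40% frequency, 40% trust, 20% keywords.
    return 0.4 * freq_part + 0.4 * contact.trust + 0.2 * keyword_part


def should_notify(score, user_is_busy, threshold=0.5):
    """Raise the bar while the calendar shows the user as busy (assumed policy)."""
    return score >= (0.8 if user_is_busy else threshold)


def sort_address_book(contacts):
    """Most frequent correspondents first."""
    return sorted(contacts, key=lambda c: c.frequency, reverse=True)
```

Keeping the calendar check separate from the score means the same score can also drive the ordering of the message list.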
Storing E-mail Addresses and Priorities
- E-mail addresses, user access patterns, frequency of correspondence, and trust levels are stored as XML elements in an XML file.
- Whenever the client is active, the data is loaded from the file system into memory and held in a hash table.
- Periodically, the XML file is updated from the hash table.
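A minimal sketch of this load-and-save cycle, written in Python rather than the Java of the actual client. The element and attribute names (`contacts`, `contact`, `address`, `frequency`, `trust`) are assumptions; the slides do not show the real schema, and access patterns are left out for brevity.

```python
import xml.etree.ElementTree as ET


def load_contacts(xml_text):
    """Parse the XML store into a hash table (dict) keyed by e-mail address."""
    table = {}
    for elem in ET.fromstring(xml_text).iter("contact"):
        table[elem.get("address")] = {
            "frequency": int(elem.get("frequency", "0")),
            "trust": float(elem.get("trust", "0")),
        }
    return table


def dump_contacts(table):
    """Serialize the in-memory table back to XML for the periodic update."""
    root = ET.Element("contacts")
    for address, info in sorted(table.items()):
        ET.SubElement(
            root,
            "contact",
            address=address,
            frequency=str(info["frequency"]),
            trust=str(info["trust"]),
        )
    return ET.tostring(root, encoding="unicode")
```

In a running client, `dump_contacts` would be called from a timer and its result written to the file that `load_contacts` reads at startup.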
Device Control Using E-mail
- Reduces human distraction by eliminating the need for physical movement.
- Ability to control different appliances/devices remotely:
  - Devices connected to a serial or parallel port.
  - USB support is left as future work.
- Auto-response to provide feedback.
Device Control Using E-mail (cont.)
- An e-mail address is assigned for the purpose of hardware control.
- The sender's address is checked.
  - A digital signature can be used for security purposes.
- The computer that controls the hardware is online at all times, with the e-mail client running.
- Upon message arrival, the subject line is parsed, e.g., “A.C: Switch On”.
- An e-mail message addressed to the user is created, reporting the result of the operation (e.g., success, fail, device is not connected).
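The control flow above (check the sender, parse the subject line, act, report the result) can be sketched as follows. This is a Python illustration under stated assumptions: the authorized address, the device-to-port table, and the `send_to_port` callback are hypothetical stand-ins for the real mail client and port driver.

```python
# Hypothetical configuration; the slides give neither addresses nor port names.
AUTHORIZED_SENDERS = {"owner@example.org"}
DEVICE_PORTS = {"a.c": "COM1", "lamp": "LPT1"}


def parse_subject(subject):
    """Split 'Device: Command' into a normalized (device, command) pair."""
    device, sep, command = subject.partition(":")
    if not sep or not device.strip() or not command.strip():
        return None
    return device.strip().lower(), command.strip().lower()


def handle_message(sender, subject, send_to_port):
    """Return the result text that would be mailed back to the user."""
    if sender.lower() not in AUTHORIZED_SENDERS:
        return "fail: sender not authorized"
    parsed = parse_subject(subject)
    if parsed is None:
        return "fail: subject not understood"
    device, command = parsed
    port = DEVICE_PORTS.get(device)
    if port is None:
        return "fail: device is not connected"
    # send_to_port is an assumed callback that drives the serial/parallel port.
    return "success" if send_to_port(port, command) else "fail"
```

Passing the port driver in as a callback keeps the parsing logic testable without any hardware attached.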
Voice Interface
- Speech synthesis: text-to-speech (TTS)
  - Notifications, message text, etc.
- Speech recognition: speech-to-text (STT)
  - Rule-based grammar (higher recognition accuracy), as opposed to free-form dictation.
  - Temporal awareness.
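To illustrate why a rule-based grammar is easier to recognize accurately than free dictation, the sketch below accepts only utterances of the form `<action> <object>`. It works on already-transcribed text, not audio, and the vocabulary is hypothetical; a real recognizer would apply the same kind of constraint during acoustic decoding.

```python
# Hypothetical rule grammar: <command> = <action> <object>
GRAMMAR = {
    "action": {"read", "delete", "reply to"},
    "object": {"next message", "this message", "address book"},
}


def match_command(utterance):
    """Return (action, object) if the utterance fits the grammar, else None."""
    text = " ".join(utterance.lower().split())  # normalize case and spacing
    for action in GRAMMAR["action"]:
        if text.startswith(action + " "):
            rest = text[len(action) + 1:]
            if rest in GRAMMAR["object"]:
                return action, rest
    return None
```

With three actions and three objects there are only nine valid commands, so anything else can be rejected and the user prompted again.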
Speech Synthesis
- Synthesizers provide the computer with the ability to speak.
- Users and applications provide text to a speech synthesizer, which converts it to audio.
- Bad news: the output does not sound natural.
[Diagram: the application passes the text “Computers can speak” to the speech synthesizer, which speaks it aloud.]
Speech Synthesis: Processing Steps
1. Structure analysis: locate the start and end of paragraphs, sentences, and other structures.
2. Text pre-processing: expand abbreviations, dates, numbers, currency amounts, etc.
3. Text-to-phoneme conversion: e.g., “times” becomes t ay m s.
4. Prosody analysis: determine the appropriate prosody for the sentence.
5. Waveform production, by one of:
   - Concatenation of chunks of recorded human voice.
   - Formant synthesis: signal processing techniques based on knowledge of how phonemes sound and how prosody affects those phonemes.
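As a small illustration of the text pre-processing step, the sketch below expands a few abbreviations and spells out integers from 0 to 99 before the text would be handed to text-to-phoneme conversion. The abbreviation table is hypothetical, and a real synthesizer handles far more cases (dates, currency, context-dependent abbreviations).

```python
import re

# Hypothetical abbreviation table; real systems resolve these by context.
ABBREVIATIONS = {"dr.": "doctor", "etc.": "et cetera", "vs.": "versus"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def preprocess(text):
    """Expand known abbreviations and one- or two-digit numbers."""
    words = []
    for token in text.split():
        lower = token.lower()
        if lower in ABBREVIATIONS:
            words.append(ABBREVIATIONS[lower])
        elif re.fullmatch(r"[0-9]{1,2}", token):
            words.append(number_to_words(int(token)))
        else:
            words.append(token)
    return " ".join(words)
```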
Why Voice
- Allows access over a phone.
- The user's hands are occupied.
- The user has a physical disability (e.g., limited use of hands).
- The user's eyes are on something other than the screen (e.g., driving, maintenance and repair).
- The user has a physical disability (e.g., visual impairment).
Challenges Involved
- Transience: “What did you say?”
- Invisibility: which actions can be performed?
- Asymmetry: people can speak faster than they can type, but listen much more slowly than they can read.
- Synthesis quality: recorded or synthesized?
- Recognition: flexibility vs. accuracy.
Design Issues in Speech Applications
- Feedback and latency:
  - People read meaning into pauses.
  - Speech applications cause pauses in places where they do not naturally belong.
- Prompts: assessing the tradeoff between flexibility and performance.
  - Explicit prompts: when the user must be tightly constrained.
  - Implicit prompts: when the application is able to accept more flexible input.
- Handling errors: no repetition, more constrained grammar.
Protocols: IMAP
- IMAP: Internet Message Access Protocol, an incoming-mail protocol.
- Provides support for different access modes:
  - Online: NFS-like (connection maintained throughout).
  - Offline: download and delete from the server (periodic connections).
  - Disconnected (hybrid): download, manipulate, upload.
- The offline paradigm allows minimum connect time.
- Includes constructs that permit online performance optimization, especially over low-speed links.
Protocols: SMTP
- SMTP: Simple Mail Transfer Protocol.
- Used for sending messages between servers or from a client to a server.
- Proved useful in the wireless domain (e.g., used in SMS).
- Drawback: not fast enough (it was not intended for wireless).
Protocols: MIME
- MIME: Multipurpose Internet Mail Extensions.
- Specifies how messages must be formatted so that they can be exchanged between different systems.
- Compatible with WAP.
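As a sketch of how a MIME-formatted message might be produced, the example below builds the kind of result notification used by the device control module. It uses Python's standard `email` package rather than the Java libraries of the actual client; the addresses and the wording of the subject and body are assumptions.

```python
from email.message import EmailMessage


def build_result_message(sender, recipient, device, result):
    """Build a plain-text MIME message reporting the outcome of an operation."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"{device}: {result}"
    # set_content adds the Content-Type and MIME-Version headers.
    msg.set_content(f"Operation on {device} finished with result: {result}.")
    return msg
```

A message built this way could then be handed to an SMTP server, for example with `smtplib.SMTP(host).send_message(msg)`, where `host` is whatever outgoing server the deployment uses.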
Implementation
- Java-based e-mail client.
  - IMAP, SMTP, and MIME.
  - XML-based display manager.
- JSAPI for speech synthesis and recognition.
  - FreeTTS speech synthesizer (written entirely in Java).
  - ViaVoice speech recognizer (from IBM).
- JavaComm for hardware control.
  - Mapping a device name to the corresponding port.
  - Interfacing from the mail system to the particular device.
Performance Metrics
- Subjective tests:
  - User distraction.
  - Usability study (i.e., flexibility of use, UI convenience).
- Accuracy (e.g., of the voice interface, of the remote control of the hardware).
Future Work
- Extending the hardware control capabilities of the system.
  - E.g., wireless access to the different devices.
  - Harmonic control of the overall system.
- Providing speech recognition that allows for user-machine dialogues.
- Web-based implementation.
- Access over a phone.
Conclusions
- The capabilities of a context-aware e-mail system can go well beyond the traditional e-mail functionality.
- Implementing voice-enhanced systems introduces many issues.
- Ubiquitous computing is a revolutionary rather than an evolutionary field.