Multimodal Interaction
Modalities vs Media: Modalities are ways of encoding information, e.g. graphics. Media are instantiations of modalities, e.g. a particular image.
How Do Multimodal Systems Differ? Domain/application; available media; modeling of context/environment; modeling of user; focus of research.
Example Multimodal Systems (not speech-centric): MIT paintbrush, soundbrush; wearables.
Example Multimodal Systems (speech-centric): MSOIP; COMIC; SmartKom.
MSOIP Keywords: multimodal mobile dialog; integration of speech and pen input; user modeling for presentations (Johnston et al. 2001).
MATCH Video Scroll down to the bottom of the page
About MATCH: What input modalities? What output modalities? What application(s)? What aspects of context?
COMIC Keywords: ambient intelligence; HHI/HCI research; collaborative problem solving; user modeling; avatar (Alexandersson et al. 2004).
COMIC Video
About COMIC: What input modalities? What output modalities? What applications? What aspects of context?
SmartKom Keywords: multimodal dialog across applications, devices, and situations; avatar; situation aware (Alexandersson et al.; Reithinger et al. 2003).
SmartKom Video: I showed the SK-Mobile one, but the other one is also interesting.
About SmartKom: What input modalities? What output modalities? What applications? What aspects of context?
Parts of a Multimodal System: Interpreter; Dialog Manager; Generator; Knowledge Base. Inputs: Speech In, Gesture In, Text In. Outputs: Speech Out, Present Out, Text Out.
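To make the component list concrete, here is a minimal sketch of how one turn might flow through the four parts. The class and method names, the toy lookup logic, and the `run_turn` helper are all assumptions for illustration; none of the example systems is claimed to be built this way.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Shared store for domain facts and dialog history (hypothetical)."""
    facts: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


class Interpreter:
    """Maps raw input from any input channel to a semantic frame."""

    def interpret(self, channel, raw):
        # Toy interpretation: the normalized input string is the intent.
        return {"channel": channel, "intent": raw.strip().lower()}


class DialogManager:
    """Chooses the next system act from the frame and the knowledge base."""

    def __init__(self, kb):
        self.kb = kb

    def decide(self, frame):
        self.kb.history.append(frame)
        answer = self.kb.facts.get(frame["intent"], "unknown")
        return {"act": "inform", "content": answer}


class Generator:
    """Renders one system act on each requested output channel."""

    def render(self, act, channels):
        return {ch: f"[{ch}] {act['content']}" for ch in channels}


def run_turn(kb, channel, raw, out_channels):
    """One turn: input -> Interpreter -> Dialog Manager -> Generator."""
    frame = Interpreter().interpret(channel, raw)
    act = DialogManager(kb).decide(frame)
    return Generator().render(act, out_channels)
```

For example, with `kb = KnowledgeBase(facts={"nearest cafe": "Cafe Uno, 200 m"})`, calling `run_turn(kb, "speech", "Nearest cafe", ["speech", "text"])` yields one rendering per output channel and appends the interpreted frame to `kb.history`.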
HCI and Multimodal Systems: input integration/fusion; representations; effective help; quality presentations; managing context; understanding the user.
Different Uses of Modalities: concurrent or sequential; redundant, complementary, or contradicting.
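The two distinctions on this slide can be sketched as simple checks over a pair of inputs. This is a toy illustration: the `ModalityInput` type, the slot/value view of meaning, and the exact rules are assumptions, not definitions taken from the systems discussed.

```python
from dataclasses import dataclass


@dataclass
class ModalityInput:
    modality: str   # e.g. "speech" or "pen"
    start: float    # seconds
    end: float      # seconds
    meaning: dict   # slot -> value (hypothetical semantic content)


def timing(a, b):
    """Concurrent if the two time intervals overlap, else sequential."""
    overlap = a.start < b.end and b.start < a.end
    return "concurrent" if overlap else "sequential"


def relation(a, b):
    """Compare the semantic content of two inputs."""
    shared = set(a.meaning) & set(b.meaning)
    if any(a.meaning[k] != b.meaning[k] for k in shared):
        return "contradicting"   # same slot, different values
    if shared and set(a.meaning) == set(b.meaning):
        return "redundant"       # same slots, same values
    return "complementary"       # each input adds something the other lacks
```

Saying "zoom" while circling a map region would come out as concurrent and complementary under these rules; saying "delete" and then drawing a delete gesture would be sequential and redundant.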
Input Integration/Fusion: Key elements are time, multiple uses of some modalities, and error rates. The typical approach is to map straight to semantics if possible.
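A minimal sketch of fusion at the semantic level, assuming each recognizer already emits a slot/value frame with a timestamp and a confidence score. The event layout, the 1.5 second window, and the conflict rule are illustrative assumptions, not the method of any system named in these slides.

```python
def fuse(speech, pen, window=1.5):
    """Merge two input events into one semantic frame.

    Each event is a dict with 'time' (seconds), 'confidence' (0..1)
    and 'slots' (slot -> value, None for unfilled). Hypothetical layout.
    """
    # Time: only fuse events that occur close together.
    if abs(speech["time"] - pen["time"]) > window:
        return None

    frame = {}
    # Error rates: apply the less confident event first, so the more
    # confident one wins if both fill the same slot.
    for event in sorted((speech, pen), key=lambda e: e["confidence"]):
        frame.update({k: v for k, v in event["slots"].items() if v is not None})
    return frame


speech_event = {
    "time": 2.0,
    "confidence": 0.7,
    "slots": {"action": "show", "type": "restaurant", "location": None},
}
pen_event = {
    "time": 2.6,
    "confidence": 0.9,
    "slots": {"location": "circled_area_1"},
}
fused = fuse(speech_event, pen_event)
```

Here the pen input fills the `location` slot that speech left empty, which is the complementary case; if the two events were further apart than `window`, `fuse` would return `None` and each input would have to be handled on its own.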
Representation: Increasing use of XML-based languages (SMIL, EMMA), but these don’t solve the semantic problems. Keep ‘backbone’ knowledge separate from ‘peripheral’ information (Alexandersson et al.).
Effective Help: How does each of the systems provide the user with explicit help? With implicit help?
Quality Presentations: Talking heads (advantages, disadvantages). Informative presentations are key. User modeling/adaptive presentations are a bonus. These systems go beyond scripts.
Managing Context: What kinds of context are there in a mobile multimodal interaction?
Understanding the User: What kinds of information can we gather about users in general? About one user in particular? How can we use this information?
Commercial Multimodal Systems: Most are for research. Military: training and battlefield. Education: tutoring systems. Commercial ones include the Wii and Microsoft Surface.
Trade-offs: You get more intuitive technology; more information, more easily; less (dumb stuff) for you to do. You trade privacy and control.
Towards the Future: Design: multimodal systems in virtual worlds, or crossing over from virtual to real worlds; ambient multimodal interaction. Implementation: user-controlled mashups; pervasive multimedia.
Towards the Future: Mashups
SciFi? The Lathe of Heaven by Ursula K. Le Guin; Summa Technologiae by Stanisław Lem; Fast Times at Fairmont High by Vernor Vinge; The Human Machine Merger, talk by Raymond Kurzweil (at main=memelist.html?m=6%23581).
Additional Info arch/multimodal+system.html