Forschungszentrum Telekommunikation Wien
An initiative of the K plus Programme
Multimodal applications for mobile devices in Java
Michael Pucher (FTW Vienna), Georg Niklfeld (FTW Vienna), Robert Finan (Mobilkom Austria AG), Wolfgang Eckhart (Sonorys Vienna AG)
Contents
Multimodality
- History and types of multimodality
- The importance of multimodality for mobile devices
- Applications
Architectures and Algorithms
- Logical design of multimodal applications
- Server and client side speech processing
- Java class architecture
- Multimodal integration algorithms in Java
- Parsing and Integration
- Servlet/Midlet architecture
VoiceXML
History and types of multimodality
Multimodality research since the 1980s
Early versus late fusion
Types of multimodality
- First order multimodality allows sequential multimodal input
- Second order multimodality allows uncoordinated, simultaneous multimodal input
- Third order multimodality allows coordinated, simultaneous multimodal input
The importance of multimodality for mobile devices
Multimodal communication is perceived as natural
Disadvantages of unimodal interfaces for mobile devices
- Small displays
- No comfortable alphanumeric keyboards
- Visual access to the display is not always possible
These disadvantages cannot be overcome by increasing processor and memory capabilities
Applications
List selection (e.g. addresses)
Map navigation (Location Based Services, GPS)
Voice mail
Car environments
Advanced call management
Specialized applications for mobile working environments
Logical design of multimodal applications
Visual Browser
Voice Browser
Final architecture
Server and client side speech processing
Server based ASR and TTS
Embedded ASR and TTS
Distributed Speech Recognition
- ETSI standard
- Feature extraction
- Compression and error detection (4800 bit/s)
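To put the 4800 bit/s figure in perspective, the following minimal sketch computes the payload of a compressed feature stream for one utterance and compares the rate with uncompressed PCM audio. The utterance length and the PCM format (8 kHz, 16 bit) are assumptions chosen for illustration, and transport overhead is ignored.

```java
class DsrBandwidth {
    static final int DSR_BITS_PER_SECOND = 4800;   // figure from the slide

    /** Payload in bytes for an utterance of the given length, ignoring transport overhead. */
    static long payloadBytes(double seconds) {
        return Math.round(seconds * DSR_BITS_PER_SECOND / 8.0);
    }

    /** Raw PCM bit rate for comparison; sample rate and sample width are assumptions. */
    static int pcmBitsPerSecond(int sampleRateHz, int bitsPerSample) {
        return sampleRateHz * bitsPerSample;
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes(3.0) + " bytes for a 3 s utterance");
        double ratio = pcmBitsPerSecond(8000, 16) / (double) DSR_BITS_PER_SECOND;
        System.out.println("PCM needs about " + Math.round(ratio) + " times the bit rate");
    }
}
```

Under these assumptions a three second command costs under 2 kB on the uplink, which is why sending features rather than audio is attractive on a mobile link.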
Java class architecture
MMAction
MMReaction
MMRule
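The class diagrams themselves are not reproduced here. As a rough reconstruction, the class names above together with the methods used in the code on the following slides (addMMAction, integrateActions, getMMReaction, act, and the timestamp variable of MMObject) suggest a shape like the one below. The constructors, the use of abstract classes instead of interfaces, the timestamp unit, and the wrapper class ArchitectureSketch are assumptions.

```java
class ArchitectureSketch {
    /** Common base; the talk mentions a timestamp variable on MMObject. */
    static abstract class MMObject {
        final long timestamp;                       // unit assumed: milliseconds
        MMObject(long timestamp) { this.timestamp = timestamp; }
    }

    /** One user input in one modality, e.g. a click or a spoken command. */
    static abstract class MMAction extends MMObject {
        MMAction(long timestamp) { super(timestamp); }
    }

    /** An output produced once a rule has matched. */
    static abstract class MMReaction {
        abstract void act(Object context) throws Exception;
    }

    /** A rule collects actions, integrates them, and yields reactions. */
    static abstract class MMRule {
        abstract void addMMAction(MMAction action);
        abstract void integrateActions();
        abstract MMReaction[] getMMReaction();      // null while the rule is incomplete
    }

    public static void main(String[] args) {
        // Anonymous subclass, only to show that an action carries its timestamp.
        MMAction click = new MMAction(System.currentTimeMillis()) { };
        System.out.println("action created at " + click.timestamp);
    }
}
```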
Multimodal integration algorithms in Java
Handling Parsing and Integration in MMIntegrator

import java.util.ListIterator;

public MMReaction[] getReactions(String id) {
    Transaction trans = this.getTransaction(id);
    trans.removeOldObjects();

    // Offer every pending action to every rule.
    ListIterator actions = trans.getAllObjects();
    while (actions.hasNext()) {
        MMAction mma = (MMAction) actions.next();
        ListIterator rules = ruleList.listIterator();
        while (rules.hasNext()) {
            ((MMRule) rules.next()).addMMAction(mma);
        }
    }   // end of the loop over actions (placement assumed)

    // Let each rule combine the actions it has collected.
    ListIterator rulesI = ruleList.listIterator();
    while (rulesI.hasNext()) {
        ((MMRule) rulesI.next()).integrateActions();
    }

    // Return the reactions of the first rule that is complete.
    ListIterator rulesR = ruleList.listIterator();
    while (rulesR.hasNext()) {
        MMReaction[] mmreac = ((MMRule) rulesR.next()).getMMReaction();
        if (mmreac != null) return mmreac;
    }
    return null;
}
Handling Parsing and Integration in Route (MMRule)

// Parsing: fill the next free slot that fits the type of the action.
public void addMMAction(MMAction mmo) {
    if (mmo instanceof PointClick && actArray[0] == null) {
        this.intActionSize++;
        actArray[0] = mmo;
    } else if (mmo instanceof PointClick && actArray[1] == null) {
        this.intActionSize++;
        actArray[1] = mmo;
    } else if (mmo instanceof RouteShow && actArray[2] == null) {
        this.intActionSize++;
        actArray[2] = mmo;
    }
}

// Integration: once all three slots are filled, pass both points to the reactions.
public void integrateActions() {
    if (this.intActionSize == 3) {
        ShowRoute show = (ShowRoute) this.reacArray[0];
        show.pc0 = (PointClick) this.actArray[0];
        show.pc1 = (PointClick) this.actArray[1];
        SayRoute say = (SayRoute) this.reacArray[1];
        say.pc0 = (PointClick) this.actArray[0];
        say.pc1 = (PointClick) this.actArray[1];
    }
}
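To see the slot filling and integration work end to end, here is a self-contained sketch of the Route rule with a tiny driver. It follows the slide code, but several parts are assumptions: the coordinate fields of PointClick, the body of getMMReaction (which the slides do not show), the reading of RouteShow as the spoken "show route" command, and the wrapper class RouteRuleSketch with its integrate helper.

```java
import java.util.ArrayList;
import java.util.List;

class RouteRuleSketch {
    static abstract class MMAction { }
    static class PointClick extends MMAction {
        final int x, y;
        PointClick(int x, int y) { this.x = x; this.y = y; }
    }
    static class RouteShow extends MMAction { }      // assumed: the spoken command

    static abstract class MMReaction { PointClick pc0, pc1; }
    static class ShowRoute extends MMReaction { }
    static class SayRoute extends MMReaction { }

    static class Route {
        private final MMAction[] actArray = new MMAction[3];
        private final MMReaction[] reacArray = { new ShowRoute(), new SayRoute() };
        private int intActionSize = 0;

        void addMMAction(MMAction a) {
            int slot;
            if (a instanceof PointClick && actArray[0] == null) slot = 0;
            else if (a instanceof PointClick && actArray[1] == null) slot = 1;
            else if (a instanceof RouteShow && actArray[2] == null) slot = 2;
            else return;                             // no free slot for this action type
            actArray[slot] = a;
            intActionSize++;
        }

        void integrateActions() {
            if (intActionSize != 3) return;          // rule not yet complete
            for (MMReaction r : reacArray) {
                r.pc0 = (PointClick) actArray[0];
                r.pc1 = (PointClick) actArray[1];
            }
        }

        MMReaction[] getMMReaction() {               // assumed behavior
            return intActionSize == 3 ? reacArray : null;
        }
    }

    /** Feeds the actions to one rule, integrates, and asks for reactions. */
    static MMReaction[] integrate(List<MMAction> actions) {
        Route rule = new Route();
        for (MMAction a : actions) rule.addMMAction(a);
        rule.integrateActions();
        return rule.getMMReaction();
    }

    public static void main(String[] args) {
        List<MMAction> input = new ArrayList<>();
        input.add(new PointClick(10, 20));
        input.add(new RouteShow());
        input.add(new PointClick(30, 40));
        MMReaction[] out = integrate(input);
        System.out.println(out == null ? "incomplete" : out.length + " reactions");
    }
}
```

Because slots are filled by type rather than by position, the spoken command may arrive before, between, or after the two clicks and the rule still completes.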
Optimizing Parsing and using probabilistic information

public MMReaction[] getReactions(String id) {
    ...
    while (actions.hasNext()) {
        MMAction mma = (MMAction) actions.next();
        // Only rules that are already partially filled are offered the action.
        ListIterator rules = partialRuleList.listIterator();
        while (rules.hasNext()) {
            ((MMRule) rules.next()).addMMAction(mma);
        }
    ...

1. Add a probability to each MMAction, based on empirical investigations (usability studies).
2. Calculate the probability after the integration, using either a specific rule for each MMRule or a global rule based on the timestamp variable of MMObject. For example, the SpeechCommand is more likely to occur between the two PointClick commands than before them.

public void integrateActions() {
    ...
    ((ShowRoute) this.reacArray[0]).calcProb();
    ((SayRoute) this.reacArray[1]).calcProb();
    ...
}
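The slides do not show the body of calcProb. A minimal sketch of the two steps could look as follows: the per-action probabilities are multiplied, and the result is weighted by whether the speech timestamp lies between the two click timestamps. The weights 0.8 and 0.2, the independence assumption behind the product, the method signature, and the class name RouteProbabilitySketch are all hypothetical.

```java
class RouteProbabilitySketch {
    // Hypothetical weights; real values would come from usability studies.
    static final double SPEECH_BETWEEN_CLICKS = 0.8;
    static final double SPEECH_OUTSIDE_CLICKS = 0.2;

    /**
     * Scores one integrated route interpretation.
     * priors: per-action probabilities (step 1).
     * click0, click1, speech: timestamps of the three actions (step 2).
     */
    static double calcProb(double[] priors, long click0, long click1, long speech) {
        double p = 1.0;
        for (double prior : priors) p *= prior;      // assumes independent actions
        long first = Math.min(click0, click1);
        long last = Math.max(click0, click1);
        boolean between = speech >= first && speech <= last;
        return p * (between ? SPEECH_BETWEEN_CLICKS : SPEECH_OUTSIDE_CLICKS);
    }

    public static void main(String[] args) {
        double[] priors = {0.9, 0.9, 0.7};
        System.out.println(calcProb(priors, 1000, 3000, 2000));   // speech between the clicks
        System.out.println(calcProb(priors, 1000, 3000, 500));    // speech before the clicks
    }
}
```

With several competing rules, the integrator could then return the reactions of the highest scoring rule instead of the first complete one.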
Servlet/Midlet architecture
Act method of SayRoute and ShowRoute

The act method is executed in the context of a Servlet

public void act(Object obj) throws Exception {
    ((HttpServletResponse) obj).setContentType(res.getString("contenttype"));
    PrintWriter out = ((HttpServletResponse) obj).getWriter();
    out.println(res.getString("xmlversion"));
    out.println(res.getString("vxmlversion"));
    .....
    // Print VoiceXML page here
    .....
}

The act method is executed in the context of an Applet/Midlet
The Applet/Midlet implements MapInterface.

public void act(Object obj) throws Exception {
    ((MapInterface) obj).drawRoute(pc0.getPoint(), pc1.getPoint());
}
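Both reactions share the signature act(Object) and cast the context to whatever their side of the architecture provides. The sketch below reproduces that pattern without the Servlet API: a PrintWriter stands in for the HttpServletResponse writer, points are plain {x, y} arrays, and the MapInterface signature, the VoiceXML markup, and the class name ActDispatchSketch are assumptions.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class ActDispatchSketch {
    /** Implemented by the client side map view (signature assumed). */
    interface MapInterface { void drawRoute(int[] from, int[] to); }

    static abstract class MMReaction {
        int[] pc0, pc1;                              // clicked points as {x, y}
        abstract void act(Object context) throws Exception;
    }

    /** Server side: writes a VoiceXML page to the writer it is given. */
    static class SayRoute extends MMReaction {
        void act(Object context) {
            PrintWriter out = (PrintWriter) context;
            out.println("<?xml version=\"1.0\"?>");
            out.println("<vxml version=\"1.0\"><form><block>");
            out.println("Route from " + pc0[0] + "," + pc0[1] + " to " + pc1[0] + "," + pc1[1]);
            out.println("</block></form></vxml>");
            out.flush();
        }
    }

    /** Client side: draws on whatever implements MapInterface. */
    static class ShowRoute extends MMReaction {
        void act(Object context) {
            ((MapInterface) context).drawRoute(pc0, pc1);
        }
    }

    public static void main(String[] args) throws Exception {
        SayRoute say = new SayRoute();
        say.pc0 = new int[] {1, 2};
        say.pc1 = new int[] {3, 4};
        StringWriter page = new StringWriter();
        say.act(new PrintWriter(page));
        System.out.print(page);

        ShowRoute show = new ShowRoute();
        show.pc0 = say.pc0;
        show.pc1 = say.pc1;
        show.act((MapInterface) (from, to) ->
            System.out.println("draw " + from[0] + "," + from[1] + " -> " + to[0] + "," + to[1]));
    }
}
```

Passing the context as Object keeps the integrator independent of both environments, at the cost of a runtime cast in every reaction.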
Servlet/Midlet architecture
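VoiceXML
Dialogs
Sie können eine Nachricht hinterlassen, eine Notiz abhören oder auf den Kalender zugreifen
(English: "You can leave a message, listen to a note, or access the calendar")
<submit method="get" enctype="application/x-www-form-urlencoded" next=" at.ftw.voicexml.GetVoiceXMLPageServlet" namelist="pagename" />
Grammars
[ ( (?eine ?neue nachricht) ?[hinterlassen aufnehmen aufzeichnen hinterlegen] ?bitte ) { return("storemessage.vxml") } ]
(The grammar accepts German phrases such as "eine neue Nachricht hinterlassen bitte", in English "leave a new message, please", and returns the page name storemessage.vxml.)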
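In the grammar, ? marks an optional word, [...] lists alternatives, and (...) is a sequence. To make clear which utterances it accepts, the sketch below expresses the same phrase structure as a Java regular expression over already recognized text. This is only an illustration of the grammar's coverage, not how a speech recognizer evaluates it; the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class GrammarSketch {
    // ?eine ?neue nachricht ?[hinterlassen aufnehmen aufzeichnen hinterlegen] ?bitte
    private static final Pattern STORE_MESSAGE = Pattern.compile(
        "(eine )?(neue )?nachricht( (hinterlassen|aufnehmen|aufzeichnen|hinterlegen))?( bitte)?");

    /** Returns the page name for a recognized utterance, or null if the grammar does not match. */
    static String pageFor(String utterance) {
        String normalized = utterance.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        return STORE_MESSAGE.matcher(normalized).matches() ? "storemessage.vxml" : null;
    }

    public static void main(String[] args) {
        System.out.println(pageFor("eine neue Nachricht hinterlassen bitte"));
        System.out.println(pageFor("Nachricht aufnehmen"));
        System.out.println(pageFor("Kalender"));
    }
}
```

The returned value corresponds to the pagename parameter that the submit element passes on to the servlet.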