TeleMorph & TeleTuras: Bandwidth determined Mobile MultiModal Presentation
Student: Anthony J. Solon
Supervisors: Prof. Paul Mc Kevitt, Kevin Curran
School of Computing and Intelligent Systems, Faculty of Informatics, University of Ulster, Magee
Objectives of Research
- Develop a system, TeleMorph, that dynamically morphs between output modalities depending on available network bandwidth:
  - The wireless system's output presentation (unimodal or multimodal) depends on the network bandwidth available
- Implement TeleTuras, a tourist information guide for the city of Derry:
  - Receive and interpret questions from the user
  - Map questions to a multimodal semantic representation
  - Match the multimodal representation to a database to retrieve the answer
  - Map answers to a multimodal semantic representation
  - Query bandwidth status
  - Generate a multimodal presentation based on bandwidth data
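The idea of morphing between output modalities according to bandwidth can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `ModalitySelector`, the four modality names, and the threshold values are hypothetical and are not taken from the TeleMorph design. It is also written in standard Java rather than within J2ME constraints (for example, it uses `EnumSet`).

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch: choose output modalities from a measured bandwidth. */
final class ModalitySelector {

    /** Illustrative output modalities; the real system may use others. */
    enum Modality { TEXT, SPEECH, GRAPHICS, VIDEO }

    // Assumed thresholds in kbps, chosen only to make the example concrete.
    static final int SPEECH_MIN_KBPS = 20;
    static final int GRAPHICS_MIN_KBPS = 50;
    static final int VIDEO_MIN_KBPS = 200;

    /** Returns the modalities that the given bandwidth is assumed to support. */
    static Set<Modality> select(int bandwidthKbps) {
        if (bandwidthKbps < 0) {
            throw new IllegalArgumentException("bandwidth must be non-negative");
        }
        // Text is treated as the fallback that is always available.
        Set<Modality> chosen = EnumSet.of(Modality.TEXT);
        if (bandwidthKbps >= SPEECH_MIN_KBPS) chosen.add(Modality.SPEECH);
        if (bandwidthKbps >= GRAPHICS_MIN_KBPS) chosen.add(Modality.GRAPHICS);
        if (bandwidthKbps >= VIDEO_MIN_KBPS) chosen.add(Modality.VIDEO);
        return chosen;
    }

    public static void main(String[] args) {
        int[] samplesKbps = {9, 56, 384};
        for (int kbps : samplesKbps) {
            System.out.println(kbps + " kbps -> " + select(kbps));
        }
    }
}
```

Under these assumed thresholds, a low-bandwidth connection falls back to a unimodal text presentation, while higher bandwidth adds further modalities on top of it.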
Wireless Telecommunications
- Generations of mobile networks:
  - 1G: analog voice service with no data services
  - 2G: circuit-based digital networks, capable of data transmission speeds averaging around 9.6 kbps
  - 2.5G (GPRS): technology upgrades to 2G, boosting data transmission speeds to around 56 kbps; allows packet-based, always-on connectivity
  - 3G (UMTS): digital multimedia, different infrastructure required, data transmission speeds from 144 kbps through 384 kbps up to 2 Mbps
- Positioning systems:
  - GPS
  - DGPS
  - GLONASS
  - GSM
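To make these data rates concrete, the following sketch estimates how long a single media item would take to download at each rate. The 100,000-byte payload is an assumed example size (roughly a small map image), the class name `TransferTime` is hypothetical, and the calculation ignores protocol overhead and latency, so the results are idealised lower bounds.

```java
/** Hypothetical sketch: idealised transfer time at a nominal data rate. */
final class TransferTime {

    /**
     * Seconds needed to send sizeBytes at rateKbps, where 1 kbps = 1000 bits/s.
     * Protocol overhead, latency, and retransmissions are ignored.
     */
    static double seconds(long sizeBytes, double rateKbps) {
        if (sizeBytes < 0 || rateKbps <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and rate must be > 0");
        }
        return (sizeBytes * 8.0) / (rateKbps * 1000.0);
    }

    public static void main(String[] args) {
        long payloadBytes = 100_000L; // assumed example size, not from the slides
        String[] labels = {"2G", "2.5G (GPRS)", "3G (UMTS)", "3G (UMTS, peak)"};
        double[] ratesKbps = {9.6, 56.0, 384.0, 2000.0};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-16s %7.1f kbps: %6.2f s%n",
                    labels[i], ratesKbps[i], seconds(payloadBytes, ratesKbps[i]));
        }
    }
}
```

With this assumed payload, the transfer takes about 83 seconds at 9.6 kbps, about 14 seconds at 56 kbps, and about 2 seconds at 384 kbps.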
Mobile Intelligent MultiMedia Systems
- SmartKom
  - Mobile, Public, Home/office
  - Saarbrücken, Germany
  - Combines speech, gesture and facial expressions on input & output
  - Integrated trip planning, Internet access, communication applications, personal organising
- VoiceLog
  - BBN Technologies in Cambridge, Massachusetts
  - Views/diagrams of military vehicles and direct connection to support
  - Damage identification and ordering of parts using diagrams
- MUST
  - MUltimodal multilingual information Services for small mobile Terminals
  - EURESCOM, Heidelberg, Germany
  - Future multimodal and multilingual services on mobile networks
Intelligent MultiMedia Presentation
- Flexibly generate various presentations to meet individual requirements of: 1) users, 2) situations, 3) domains
- Fine-grained coordination of communication media and modalities
- Key research problems:
  - Semantic representation
  - Fusion, integration & coordination
  - Synchronisation
- Semantic representation: represents the meaning of multimodal input and output
  - Frame-based representations: CHAMELEON, REA
  - XML-based representations: SmartKom, MUST
- Fusion, integration & coordination of modalities
  - Integrating different media in a consistent and coherent manner
  - Multimedia coordination leads to effective integration of multiple media in output
- Synchronisation of modalities
  - Time threshold between modalities
  - E.g. input: "What building is this?"; output: "This is the Millennium Forum"
  - If the modalities are not synchronised, the side effect is contradiction
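The time threshold between modalities can be sketched as a simple check on the start times of two output events. This is an illustrative assumption about how such a check might look, not the mechanism used by any of the systems named above; the names `SyncCheck`, `OutputEvent`, and `isSynchronised`, and the 500 ms threshold, are hypothetical.

```java
/** Hypothetical sketch of a time-threshold check between two output modalities. */
final class SyncCheck {

    /** One piece of output in a single modality, with its scheduled start time. */
    static final class OutputEvent {
        final String modality;
        final long startMillis;

        OutputEvent(String modality, long startMillis) {
            this.modality = modality;
            this.startMillis = startMillis;
        }
    }

    /** True when the two events start within thresholdMillis of each other. */
    static boolean isSynchronised(OutputEvent a, OutputEvent b, long thresholdMillis) {
        if (thresholdMillis < 0) {
            throw new IllegalArgumentException("threshold must be non-negative");
        }
        return Math.abs(a.startMillis - b.startMillis) <= thresholdMillis;
    }

    public static void main(String[] args) {
        // Spoken answer and the highlighted building graphic it refers to.
        OutputEvent speech = new OutputEvent("speech", 1000);
        OutputEvent graphic = new OutputEvent("graphics", 1200);
        long thresholdMillis = 500; // illustrative value only
        System.out.println("synchronised: " + isSynchronised(speech, graphic, thresholdMillis));
    }
}
```

If the spoken answer and the graphic it refers to drift further apart than the threshold, the user may associate the speech with the wrong visual, which is the contradiction the slide warns about.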
Intelligent MultiMedia Presentation Systems
- Automatically generate coordinated intelligent multimedia presentations
- User-determined presentation
- COMET
  - COordinated Multimedia Explanation Testbed
  - Generates instructions for maintenance and repair of military radio receiver-transmitters
  - Coordinates text and 3D graphics of mechanical devices
- WIP
  - Intelligent multimedia authoring system
  - Presents instructions for assembling/using/maintaining/repairing devices (e.g. espresso machines, lawn mowers, modems)
- IMPROVISE
  - Graphics generation system
  - Constructive/parameterised graphics generation approaches
  - Uses an extensible formalism to represent a visual lexicon for graphics generation
Intelligent MultiMedia Interfaces & Agents
- Intelligent multimedia interfaces
  - Parse integrated input and generate coordinated output
  - CUBRICON
    - Calspan-UB Research Center Intelligent CONversationalist
    - Air Force Command and Control
    - Generates and recognises speech and natural language text, displays graphics, and interprets gestures made with a pointing device
- Intelligent multimedia agents
  - Embodied Conversational Agents
    - Natural human communication: speech, facial expressions, hand gestures, & body stance
  - COLLAGEN
    - COLLaborative AGENt
    - Object-oriented Java middleware for building collaborative interface agents
  - MIT Media Laboratory work on embodied conversational agents
Project Proposal
- Research and implement a mobile intelligent multimedia presentation system called TeleMorph
  - Dynamically generates a multimedia presentation determined by the bandwidth available
- TeleTuras: tourist navigation aid providing a testbed for TeleMorph, incorporating route planning, maps, spoken presentations, graphics of points of interest, and animations
  - Output modalities used
  - Effectiveness of communication
- TeleTuras examples:
  - "Where is the Millennium Forum?"
  - "Take me to the Guildhall"
  - "What buildings are of interest in this area?"
  - "Is there a Chinese restaurant in this area?"
Architecture of TeleMorph
Comparison of Intelligent MultiMedia Systems
Comparison of Mobile Intelligent MultiMedia Systems
Prospective Tools
- Development language: J2ME (Java 2 Micro Edition)
- Speech input/output: Java Speech API (JSAPI)
  - IBM's implementation of JSAPI, Speech for Java
  - US & UK English, French, German, Italian, Spanish, and Japanese
  - Java Speech API Markup Language (JSML)
  - Java Speech API Grammar Format (JSGF)
- Positioning system: GPS (Global Positioning System)
  - Provides the accurate location information necessary for an LBS (Location Based Service)
- Graphics input/output: the User Interface (UI) defined in J2ME is logically composed of two sets of APIs:
  - Low-level UI API
  - High-level UI API
Project Schedule
Conclusion
- A mobile intelligent multimodal presentation system called TeleMorph will be developed
  - Dynamically morphs between output modalities depending on available network bandwidth
  - Bandwidth- and device-determined mobile multimodal presentation
- TeleTuras will be used as a testbed for TeleMorph
  - Corpora of questions to test TeleTuras (prospective users/tourists)