LINGTOUR: a PDA for tourists Alain Goyé, Eric Lecolinet, Mutsuko Tomokiyo, Gérard Chollet GET-ENST 46, rue Barrault 75634 Paris Cedex 13 goye | elc |

Presentation transcript:

LINGTOUR: a PDA for tourists
Alain Goyé, Eric Lecolinet, Mutsuko Tomokiyo, Gérard Chollet
GET-ENST, 46, rue Barrault, 75634 Paris Cedex 13
goye | elc | lin |
Catherine Pelachaud
IUT de Montreuil - Université Paris 8, 140, rue de la Nouvelle France, Montreuil, France
Ding Xiaoqing, Mao Yuhang
Dept. of Electronic Engineering, Tsinghua University, Beijing, China
Ni Yang
Institut National des Télécommunications, Département Electronique et Physique, 9, rue Charles Fourier, Evry Cedex, France

Multimodal interfaces for a travel assistant

LINGTOUR: history
Collaboration with Tsinghua University:
– Memorandum of understanding (2000)
– Spoken French-Chinese dictionary with Le Robert
– Master's thesis of Dong Qingfu: "Realization of Intelligent Camera Capable of Character Recognition and Translation"

The LINGTOUR project
Multilingual information management
Initially, a PDA for travellers:
– Virtual guide: access to multilingual practical and cultural information for tourists
– Communication assistant: translation help, navigation within a lexicon, and access to typical conversations
– Travel assistant: orientation and interpretation of the surroundings using local and positioning information
A personal assistant (PDA or smartphone) with multimodal and ergonomic capabilities:
– Inputs: text, speech, stylus, images
– Outputs: text, speech, images, video

PDA-server interactions
[Diagram. Labels: multimodal navigation in maps and lexicon; Tsinghua University; sound capture; selection / extraction of text; refinement / correction of the image; images, sound; images, sound, text; character recognition, speech recognition; multilingual translation, speech synthesis; supervision]

Exploiting the specific features of the PDA
The PDA's capabilities are used to the full for multimodality:
– Input combines the touch screen, the microphone and the camera, without any keyboard
– Information is presented through graphics and sound, alternately or simultaneously, depending on the context
The PDA connects to the Internet whenever possible:
– To download up-to-date information
– To offload to a remote server the tasks that are too complex or too costly in memory
– To allow a human operator to step in if necessary
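The offloading rule on this slide can be illustrated with a minimal Python sketch. The cost estimates, the two limits and the names `Task` and `choose_executor` are all assumptions made for illustration; the slides do not say how the project actually decides where a task runs.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu_cost: float      # estimated compute cost, arbitrary units (assumed)
    memory_mb: float     # estimated working memory in MB (assumed)


# Hypothetical limits of the handheld device.
MAX_LOCAL_CPU = 10.0
MAX_LOCAL_MEMORY_MB = 16.0


def choose_executor(task: Task, connected: bool) -> str:
    """Return 'local', 'server' or 'deferred' for a task."""
    fits_locally = (task.cpu_cost <= MAX_LOCAL_CPU
                    and task.memory_mb <= MAX_LOCAL_MEMORY_MB)
    if fits_locally:
        return "local"
    if connected:
        return "server"
    return "deferred"   # too heavy and no connection: wait for the network
```

With these placeholder limits, a light task stays on the device whether or not a connection is available, while a heavy task goes to the server only when the PDA is online.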

Three types of multimodal interface
Gesture and voice: combination of control menus and voice input
– Controlling zoomable interfaces for graphic or text input
Intelligent camera: image refinement
– Based on the correlation of a series of images
– To improve character recognition
Cultural agents: animated conversational agents adapted to the culture
– Adding non-verbal behaviour to speech (face, eyes, gestures), depending on the culture

ZUIs and 2D control menus (Gesture and voice)
PDA constraint: screen size
ZUIs: zoomable user interfaces
– Concept of semantic zoom: progressive disclosure of levels of detail
Control menus [1]:
– Selection and control of the action (movement, zoom) in a single gesture
– No change of context, no juggling of several interactions for a single operation
[1] Pook, S., Lecolinet, E., Vaysseix, G. and Barillot, E., Control Menus: Execution and Control in a Single Interactor. Proc. ACM Conf. on Human Factors in Computing Systems (CHI) 2000, ACM Press.

Characteristics of control menus (Gesture and voice)
– Combine the selection and the control of an operation in a single gesture
– Can integrate up to two scroll bars (vertical and horizontal)
– The user keeps their attention on the content
– Can have sub-menus
– Like Pie menus [2] and Marking menus [3], offer a beginner mode and an expert mode:
  the spatial layout of the menus helps memorization;
  with quick gestures, the menus do not appear on the screen;
  the transition from one mode to the other is implicit
[2] Hopkins, D., The design and implementation of Pie menus. Dr Dobb's Journal of Software Tools, 1991, 16 (12).
[3] Kurtenbach, G. et al., The Hotbox: efficient access to a large number of menu-items. Proc. ACM CHI, 1993.
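The single-gesture principle can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation described in [1]: the four-item layout, the activation radius, the display delay and the use of distance from the pen-down point as the control value are all hypothetical choices.

```python
import math
from dataclasses import dataclass

# Hypothetical layout: four operations placed around the pen-down point.
ITEMS = {"east": "zoom", "north": "pan", "west": "select", "south": "lexicon"}
ACTIVATION_RADIUS = 30.0   # pixels; assumed threshold for selecting an item
MENU_DELAY = 0.3           # seconds; assumed pause before the menu is drawn


@dataclass
class GestureResult:
    operation: str      # operation chosen by the direction of the stroke
    control: float      # pixels travelled beyond the activation radius
    menu_shown: bool    # True in beginner mode, False in expert mode


def direction(dx: float, dy: float) -> str:
    """Map a displacement to one of four sectors (screen y grows downward)."""
    angle = math.degrees(math.atan2(-dy, dx)) % 360
    if angle < 45 or angle >= 315:
        return "east"
    if angle < 135:
        return "north"
    if angle < 225:
        return "west"
    return "south"


def interpret(points):
    """Interpret (t, x, y) samples recorded from pen-down to pen-up."""
    t0, x0, y0 = points[0]
    operation = None
    menu_shown = False
    control = 0.0
    for t, x, y in points[1:]:
        dist = math.hypot(x - x0, y - y0)
        if operation is None:
            if dist >= ACTIVATION_RADIUS:
                # Crossing the radius selects the item in that direction.
                operation = ITEMS[direction(x - x0, y - y0)]
            elif t - t0 >= MENU_DELAY:
                menu_shown = True   # the user paused: draw the menu
        if operation is not None:
            # The rest of the same stroke controls the operation.
            control = dist - ACTIVATION_RADIUS
    if operation is None:
        return None
    return GestureResult(operation, control, menu_shown)
```

A fast stroke to the right selects the hypothetical "zoom" item without the menu ever being drawn (expert mode), whereas a stroke that starts with a pause has the menu displayed first (beginner mode). Both paths run through the same code, which is one way to read the implicit transition between modes.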

Applications of control menus (Gesture and voice)
Navigation in a city map
Navigation in a lexicon:
– Words and phrases useful to tourists
– Organized hierarchically in categories such as: accommodation > hotel > reservation…
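A hierarchical lexicon of this kind can be pictured as a nested dictionary, as in the Python sketch below. Only the path accommodation > hotel > reservation comes from the slide; the phrases, the "transport" branch and the `browse` helper are placeholders invented for illustration.

```python
# Categories are dictionaries; leaves are lists of phrases.
LEXICON = {
    "accommodation": {
        "hotel": {
            "reservation": [
                "I would like to book a room.",      # placeholder phrase
                "Do you have a room for tonight?",   # placeholder phrase
            ],
        },
    },
    "transport": {                                   # placeholder branch
        "taxi": ["Please take me to this address."],
    },
}


def browse(path):
    """Follow a list of category names from the root of the lexicon.

    Returns the sorted sub-category names if the path ends on a category,
    or the list of phrases if it ends on a leaf.
    """
    node = LEXICON
    for name in path:
        node = node[name]
    return sorted(node) if isinstance(node, dict) else node
```

Each menu selection appends one name to the path, so `browse(["accommodation", "hotel"])` lists the next level and `browse(["accommodation", "hotel", "reservation"])` returns the phrases to display or translate.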

Voice: multilingual recognition (Gesture and voice)
Speech recognition engine:
– Limited vocabulary, but
– Speaker-independent
– No training required
Recognition in several languages:
– Shared common acoustic models, which makes future extension to new languages easier
– Models adaptable to users and to conditions of use
[Diagram: French and Chinese, with common acoustic models and language-specific models]

Voice combined with gestures (Gesture and voice)
Spoken input is used differently depending on the context:
Map navigation, "tap and talk": a voice menu gives access to various information about the object pointed at.
Lexicon navigation:
– As a shortcut to categories, then
– To enter words or phrases
The translation is displayed or synthesized in the target language. A possible improvement is keyword spotting ("word spotting").
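The "tap and talk" combination can be sketched in Python as a fusion of two inputs: the tap picks an object and the recognized word picks an entry in that object's voice menu. The map objects, their coordinates, the menu entries and the distance threshold below are all hypothetical, and the speech recognizer itself is assumed to have already produced a single word.

```python
import math

# Hypothetical map objects: name, screen position, and the information
# a voice menu could expose for each of them.
MAP_OBJECTS = [
    {"name": "museum", "x": 40, "y": 60,
     "info": {"hours": "placeholder opening hours",
              "history": "placeholder history"}},
    {"name": "station", "x": 200, "y": 120,
     "info": {"hours": "placeholder timetable",
              "tickets": "placeholder ticket prices"}},
]


def tap_and_talk(tap_x, tap_y, spoken_word, max_distance=50.0):
    """Combine a stylus tap with one recognized word.

    The tap selects the nearest object within max_distance pixels;
    the word selects an entry of that object's voice menu.
    Returns (object name, information) or None if either part fails.
    """
    nearest = min(MAP_OBJECTS,
                  key=lambda o: math.hypot(o["x"] - tap_x, o["y"] - tap_y))
    if math.hypot(nearest["x"] - tap_x, nearest["y"] - tap_y) > max_distance:
        return None                      # the tap hit no object
    info = nearest["info"].get(spoken_word)
    if info is None:
        return None                      # word not in this object's menu
    return nearest["name"], info
```

Because the tap narrows the vocabulary to one object's menu, the recognizer only has to distinguish a few words at a time, which fits the limited-vocabulary engine described on the previous slide.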

The "intelligent" camera: see, recognize and translate (Intelligent camera)
Character recognition, for Chinese in particular, now achieves high performance.
To limit computing cost:
– Recognition is performed on a sub-part of the image
– This sub-part can be chosen semi-automatically during a preliminary delimitation and segmentation phase
Once recognized, the text can be translated:
– Locally: to help translation, a voice menu lets the user choose the context (bus stop signs, street names, monuments, etc.)
– Or by a remote server via a radio communication service
The result can also be read out by speech synthesis.

Using the camera [4] (Intelligent camera)
[Figure: capture, recognition, translation]
[4] Mao, Y., Dong, Q., Qi, Y. and Chollet, G., Realization of an Intelligent Camera capable of Character Recognition and Translation. Proc. of Sino-French Symp. on Speech and Language Processing, Beijing, October

Improving image resolution (Intelligent camera)
Difficulty:
– Images are taken from a distance, in the street
– Low-cost camera: quality and resolution insufficient for recognition
Solution: image refinement
– Correlation and reconstruction of a series of successive images
– Exploitation of the small differences caused by the natural movement of the hand holding the camera
Result: an image with a higher resolution than any single capture.

Principle of image refinement (Intelligent camera)
[Diagram: camera on the PDA, hand tremor → acquisition of an image sequence → sub-pixel motion estimation → recomposition into a single image → image of better resolution]
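The recomposition step can be illustrated with a basic shift-and-add scheme, sketched below in plain Python. This is a generic textbook technique used here as a stand-in; the slides do not say which reconstruction method the project uses. The sub-pixel shifts are assumed to be already known (the motion-estimation step is not shown) and to be whole numbers of high-resolution pixels; `shift_and_add` is a hypothetical name.

```python
def shift_and_add(frames, shifts, scale):
    """Fuse low-resolution frames on a grid `scale` times finer.

    frames: list of 2D lists (rows of pixel values), all the same size.
    shifts: list of (dy, dx) offsets of each frame, in high-resolution
            pixels, assumed to come from a separate motion-estimation step.
    Returns a 2D list; cells that no frame covered are None.
    """
    h, w = len(frames[0]), len(frames[0][0])
    H, W = h * scale, w * scale
    total = [[0.0] * W for _ in range(H)]
    count = [[0] * W for _ in range(H)]
    for frame, (dy, dx) in zip(frames, shifts):
        for r in range(h):
            for c in range(w):
                # Position of this low-resolution sample on the fine grid.
                R, C = r * scale + dy, c * scale + dx
                if 0 <= R < H and 0 <= C < W:
                    total[R][C] += frame[r][c]
                    count[R][C] += 1
    # Average the samples that landed on each fine-grid cell.
    return [[total[R][C] / count[R][C] if count[R][C] else None
             for C in range(W)] for R in range(H)]


# Toy check: a 4x4 scene seen as four 2x2 frames, each offset by one
# high-resolution pixel, is recovered exactly.
scene = [[r * 4 + c for c in range(4)] for r in range(4)]
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [[[scene[r * 2 + dy][c * 2 + dx] for c in range(2)]
           for r in range(2)] for dy, dx in shifts]
restored = shift_and_add(frames, shifts, 2)
```

In the toy check each frame alone holds only a quarter of the scene's samples. It is the small displacement between frames, here standing in for hand tremor, that lets the fused image reach the finer grid.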

Image refinement: results (Intelligent camera)
Notable improvement:
– In visual quality
– In character recognition rate

Conversational agents: benefits (Cultural agents)
A conversational agent [5] conveys information in a more attractive and user-friendly way than speech synthesis alone.
Non-verbal expressions make it possible:
– To disambiguate the meaning of an utterance
– To emphasize certain words or parts of an utterance…
The agent conveys information at several levels:
– Syntactic
– Semantic
– Emotional
In a multicultural context, a visual demonstration can also be a better vehicle for teaching certain customs.
[5] Pelachaud, C., Carofiglio, V., De Carolis, B. and de Rosis, F., Embodied Contextual Agent in Information Delivering Application, First Intl. Joint Conf. on Autonomous Agents & Multi-Agent Systems, Bologna, July 2002.

"Greta": facial animation engine (Cultural agents)
Objective: an animated model able to simulate the dynamic aspects of the human face quickly and realistically.
Implementation: a facial animation engine whose 3D model represents a young woman.
Greta is:
– The core of an MPEG-4 decoder
– Compliant with the "Simple Facial Animation Object Profile" specifications of the standard
– Able to generate the structure of an original model, animate it, and render it in real time

Adapting conversational agents (Cultural agents)
Porting animated agents to the PDA:
– The device's processing power and screen size are limited
– The complexity and level of detail of the animation have to be adapted
Adapting behaviour to users:
Despite recent advances in realism, current agents have only one type of behaviour, which often reflects Western culture.
Cultural and social adaptation to the context: the same information must be delivered differently, for example:
– To a French person and to a Chinese person
– To a journalist and to a private individual

Conversational and cultural agents: semantic representation
Base: a language-independent semantic representation, based on the XML-XSD standard
– Description of the communicative function of gestures and of the signals that compose them
Overlay of culture-specific attributes, which influence:
– The choice of a gesture (smile, head shake or nod)
– The duration of a gaze…
More generally, these influences can concern:
– The definition of a signal (masking of one signal by another)
– Sound intensity
– Sound duration, etc.
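The idea of a base description with a culture-specific overlay can be sketched in Python with two dictionaries standing in for the XML documents. The communicative functions, attribute names, culture labels and every value below are placeholders; none is taken from the project's actual schema or from any study of real cultures.

```python
# Language-independent base layer: communicative function -> default signals.
BASE_BEHAVIOUR = {
    "agree": {"gesture": "head_nod", "gaze_seconds": 2.0, "sound_intensity": 1.0},
    "greet": {"gesture": "smile", "gaze_seconds": 1.5, "sound_intensity": 1.0},
}

# Culture-specific overlay: only the attributes that differ are listed.
# "culture_a", "culture_b" and all values are placeholders.
CULTURE_OVERLAY = {
    "culture_a": {"greet": {"gaze_seconds": 3.0}},
    "culture_b": {"greet": {"gesture": "head_nod", "sound_intensity": 0.8}},
}


def behaviour_for(function, culture):
    """Return the signals for a communicative function in a given culture."""
    signals = dict(BASE_BEHAVIOUR[function])          # copy the base layer
    # Overlay attributes replace base attributes of the same name.
    signals.update(CULTURE_OVERLAY.get(culture, {}).get(function, {}))
    return signals
```

Keeping the overlay separate means the base description is written once, and supporting another culture only requires listing the attributes that change.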

Conversational and cultural agents (Cultural agents)
… In some cultures, not looking at the person you are talking to can be interpreted as a lack of attention or interest…
In other cultures, looking someone straight in the eye can be interpreted as a form of aggression…

Results and next steps (Gesture and voice, Intelligent camera, Cultural agents)
At the end of the work this project has made it possible to start, we hope to be able to demonstrate:
1) That the various interfaces presented here can be integrated on a mobile terminal (PDA, smartphone…):
– 2D control menus
– Capture and recognition of text
– Conversational agents
2) The benefits of the improvements we recommend for each of these functions:
– Integration of voice commands in the menus
– Image refinement by spatio-temporal correlation
– Enrichment of the agents with cultural attributes

Evaluating this work within the EURO-CHINA programme…
– Collaboration under way with Peer2Phone (voice over IP via Wi-Fi)
– Presentation at the end of April in Beijing
– A proposal with our Chinese partners for the Olympics in Beijing