Published by Preston Ramsey. Modified over 9 years ago.
1
Preparing for the 2008 Beijing Olympics: The LingTour and KNOWLISTICS projects
MAO Yuhang, DING Xiao-Qing, NI Yang, LIN Shiuan-Sung, Laurence LIKFORMAN, Gérard CHOLLET
Presented here by Gérard CHOLLET, chollet@tsi.enst.fr
ENST/CNRS-LTCI
http://www.tsi.enst.fr/~chollet@
2
Outline
Rationale of the proposition
Objectives
The Beijing 2008 Olympics
Approaches
Intelligent Camera
Bilingual Voice Communicator
Need and relevance
A PDA for tourists and travelling businessmen
Conclusions and Perspectives
3
Rationale for the IP-KNOWLISTICS
Logistics for knowledge
Language-independent knowledge representation and management
Multimedia (text, speech, image, video)
Multimodal access (text, speech, visual I/O)
Distributed multimedia server accessible from mobile terminals (phone, PDA, PC, …)
Initially targeted primarily at tourist applications
2008 Beijing Olympics as a field trial
4
Technical developments
Language-independent knowledge representation (using conceptual graphs and an Intermediate Representation Language like ‘Universal Networking Language’)
Summarisation and reformulation of texts
Generation in 12 target languages
Speech synthesis and recognition
VoiceXML-based interactive dialog agent
‘Intelligent camera’ with Chinese character recognition
Cross-language ‘Multimodal communicator’ on a PDA
Cross-language lexical access
5
Chinese character recognition
6
Intelligent camera from Tsinghua Univ.
Processing chain: capture, recognition, translation
7
Extracting text in scene images
Complex color images
Uncontrolled illumination
Variations: size, fonts, orientation, texture
Complex backgrounds, shadows
8
Text extraction
Searching for character regions (text has uniform color)
Multi-channel decomposition
Connected components analysis
Grouping of components
Alignment analysis (number of horizontally or vertically aligned components)
Text identification (language-independent features: size, alignment, …)
Detection rate: 84 %
False alarm rate: 5.6 %
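The connected components and alignment steps listed on this slide can be illustrated with a minimal sketch. It is not the Tsinghua implementation: it assumes a single binary mask (one decomposed channel) given as a list of lists, 4-connectivity, and a hypothetical `tolerance` parameter for deciding that two components share a text line.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions of 1s."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def horizontally_aligned_groups(boxes, tolerance=1):
    """Group boxes whose vertical centre lies within `tolerance` of a group's first box."""
    groups = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        centre = (box[0] + box[2]) / 2
        for group in groups:
            reference = (group[0][0] + group[0][2]) / 2
            if abs(centre - reference) <= tolerance:
                group.append(box)
                break
        else:
            groups.append([box])
    return groups


# Toy mask: two blobs on the same line, one isolated pixel lower down.
mask = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
]
boxes = connected_components(mask)
groups = horizontally_aligned_groups(boxes)
```

In this toy case the two aligned blobs end up in one group and the isolated pixel in another; the number of aligned components per group is the kind of feature the alignment analysis counts.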
9
Cross-language Multimodal Communicator
Use of a visual display (for example on a PDA) to mediate the dialogue between two people speaking different languages.
Recognition of short utterances, display of a word graph, selection of keywords, visualisation (and synthesis) of the translation of keywords and groups of words.
Specialised lexicon for dialog acts in typical tourist situations (in a restaurant, at the hotel, in the street, in public transport, about the Olympic games, …)
UMTS access to an information server with maps, photographs, video sequences, …
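The keyword selection and translation step can be sketched as follows. Everything here is an assumption made for illustration: the word graph is reduced to a list of slots holding scored alternatives, and `LEXICON`, `candidate_keywords` and `translate_keywords` are hypothetical names, not part of the project's software.

```python
# Hypothetical situation-specific lexicon (English keyword -> Chinese).
LEXICON = {
    "restaurant": {"menu": "菜单", "water": "水", "bill": "账单"},
    "hotel": {"room": "房间", "key": "钥匙"},
}


def candidate_keywords(word_graph, situation, lexicon=LEXICON):
    """Return lexicon keywords found in the word graph, in slot order.

    `word_graph` is a list of slots; each slot is a list of (word, score)
    alternatives proposed by the recogniser.
    """
    known = lexicon.get(situation, {})
    found = []
    for slot in word_graph:
        # Consider the best-scoring alternatives first within each slot.
        for word, _score in sorted(slot, key=lambda ws: ws[1], reverse=True):
            if word in known and word not in found:
                found.append(word)
    return found


def translate_keywords(keywords, situation, lexicon=LEXICON):
    """Pair each selected keyword with its translation for display."""
    known = lexicon.get(situation, {})
    return [(word, known[word]) for word in keywords if word in known]


word_graph = [
    [("the", 0.9)],
    [("menu", 0.6), ("venue", 0.4)],
    [("and", 0.8)],
    [("water", 0.7), ("waiter", 0.3)],
]
keywords = candidate_keywords(word_graph, "restaurant")
pairs = translate_keywords(keywords, "restaurant")
```

Restricting candidates to a small situation lexicon is one way to keep the display short enough for a PDA screen; the user would then pick from `keywords` rather than from the full word graph.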
10
Generation in target languages
Sharing of acoustic models between languages to simplify extension to other languages.
Combination of models for phones with small amounts of data.
Adaptation of the models to the user and to the acoustic environment.
Figure labels: French, Chinese, shared acoustic models, language-specific models
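A minimal sketch of the split between shared and language-specific models, under stated assumptions: the two phone inventories are toy subsets written in IPA, and the fallback rule (use the shared model when a language has fewer than `min_frames` training frames for a phone) is a hypothetical policy chosen for illustration, not the one used in the project.

```python
# Toy phone inventories (illustrative subsets only).
FRENCH_PHONES = {"a", "i", "u", "y", "p", "t", "k", "m", "n", "ʁ"}
MANDARIN_PHONES = {"a", "i", "u", "y", "p", "t", "k", "m", "n", "ʂ"}


def split_inventory(lang_a, lang_b):
    """Return (shared, only_in_a, only_in_b) phone sets."""
    shared = lang_a & lang_b
    return shared, lang_a - shared, lang_b - shared


def pick_model(phone, language, frames_seen, shared, min_frames=1000):
    """Choose which acoustic model to use for `phone` in `language`.

    `frames_seen` maps (language, phone) to the amount of training data.
    Phones present in both languages fall back to the shared model when
    the language-specific data is scarce.
    """
    if phone in shared and frames_seen.get((language, phone), 0) < min_frames:
        return f"shared/{phone}"
    return f"{language}/{phone}"


shared, french_only, mandarin_only = split_inventory(FRENCH_PHONES, MANDARIN_PHONES)
frames_seen = {("fr", "a"): 5000, ("zh", "y"): 200}
```

With these toy numbers, `pick_model("y", "zh", frames_seen, shared)` selects the shared model, while a well-trained phone such as French "a" keeps its language-specific model.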
11
Knowledge representation
A formal language for representing the meaning of natural language sentences.
UNL (Universal Networking Language) was introduced to describe natural language semantics.
Language-independent context indexing, which makes cross-language information retrieval possible.
Use of the conceptual hierarchy of UNL to address the inherent ambiguity of natural languages.
A set of semantic relations (linking concepts together) for a structured information pattern.
12
UNL representation
“The cat has drunk the milk” can be encoded by:
agt(drink(icl>do(agt>thing, obj>liquid)).@past.@entry, cat(icl>mammal>animal).@def)
obj(drink(icl>do(agt>thing, obj>liquid)).@past.@entry, milk(icl>beverage>food).@def)
agt, obj are binary semantic relations
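One way to hold such binary relations in memory is sketched below. The `Concept` and `Relation` classes are hypothetical helpers written for this example, not part of any UNL toolkit; `render` produces a notation close to the one on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept: headword, constraint list, and attributes such as @past."""
    headword: str
    constraint: str
    attributes: tuple = ()

    def render(self):
        attrs = "".join(f".{a}" for a in self.attributes)
        return f"{self.headword}({self.constraint}){attrs}"


@dataclass(frozen=True)
class Relation:
    """A binary semantic relation (for example agt or obj) between two concepts."""
    label: str
    source: Concept
    target: Concept

    def render(self):
        return f"{self.label}({self.source.render()}, {self.target.render()})"


drink = Concept("drink", "icl>do(agt>thing, obj>liquid)", ("@past", "@entry"))
cat = Concept("cat", "icl>mammal>animal", ("@def",))
milk = Concept("milk", "icl>beverage>food", ("@def",))

# "The cat has drunk the milk" as a small graph of two relations.
graph = [Relation("agt", drink, cat), Relation("obj", drink, milk)]

# Simple query on the graph: who is the agent of the entry node?
agents = [r.target.headword for r in graph
          if r.label == "agt" and "@entry" in r.source.attributes]
```

Because the graph stores concepts rather than words, the same query works whatever the language of the original sentence, which is the property the indexing and retrieval slides rely on.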
13
Role of semantic content representation in indexing
Diagram labels: digital audio, video, textual; multimedia platform; user’s request; UNL encoding; user-specific information; UNL decoding
14
Application architecture
Diagram labels: UMTS server; speech synthesis; access information; a word graph, list of keywords; translation
15
Digital Olympic Multi-Language Information Network Service System Project
16
Conclusions and Perspectives
UNL represents the meaning of natural language sentences in a form directly available for retrieval, indexing and knowledge extraction.
UNL combined with multimedia content (text, speech, image, video) and multimodal access (text, speech, visual I/O) enriches the communication service.
Comprehensive and extensive information service on a PDA with UMTS access.