A Brief Summary of MISS Project
Weiquan Liu, Feiyu Xu
The multilingual world before MISO

Even Before MISS...
NLP Institutions
–13: now involved in MISS
–5: Peking local
Projects
–2 international cooperations
–Supported by the 863 national high-tech research and development program
Themes
–Scattered research
–No applications
–No real-world applications
Chinese Funding
–863: every two years, a small amount of funding is given to many different institutes/groups

Role of CapInfo for MISS
Key player in Beijing's information technology system for the society
Founded in 1998 by government agencies

First Term of MISS (2002-2004)
No more small projects (863)
CapInfo as coordinator
–Driving partners to work out real-world, application-oriented technologies
–Leading the effort of building linguistic resources
–Asking a neutral partner (three times) for evaluation of project results
Evaluation method similar to DARPA's
After the two years, the language technologies showed encouraging improvement, according to the official evaluations

Partners of MISS and their Performances
ASR
–Automation Inst., Prof. Bo Xu, Beijing: command-oriented approach for Chinese
TTS
–iFlytek, Anhui Province, Science & Technology Univ.: Chinese/English for PDA
–SinoVoice, Beijing (SME): talking browser for blind people
–Tsinghua Univ.: embedded TTS for PDA
Dialogue Management
–Beijing University of Posts and Telecommunications
MT (Chinese – English): knowledge-based approach
–ChinaSoft, Beijing
–Huajian MT (Chinese Academy of Sciences): best of the group
–Nanjing Univ., Nanjing
–Xiamen Univ., Xiamen: second
–Evaluation criteria: lexical coverage, precision, speed (hundred sentences/second)
Linguistic Resources
–Harbin Inst. of Tech., Harbin: speech data, parallel text data
–Peking Univ., Beijing: lexicons
IR
–TRS, Beijing: more than 1000 customers (mainly governments)
Evaluation
–Computing Tech. Inst.
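
The MT evaluation criteria are named on the slide but not defined. As a rough sketch, lexical coverage could be read as the share of input tokens found in a system's lexicon, and speed as sentences translated per second. Both readings, the function names, and the toy data below are assumptions for illustration, not the procedure used in the MISS evaluations. Precision is left out because the slides do not say how it was judged.

```python
# Illustrative definitions only; the actual MISS evaluation procedure
# is not described in the slides.
import time
from typing import Callable, Iterable, List, Set


def lexical_coverage(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Fraction of input tokens that appear in the lexicon (0.0 for empty input)."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in lexicon)
    return known / len(tokens)


def sentences_per_second(translate: Callable[[str], str], sentences: List[str]) -> float:
    """Throughput of a translation function over a list of sentences."""
    start = time.perf_counter()
    for s in sentences:
        translate(s)
    elapsed = time.perf_counter() - start
    return len(sentences) / elapsed if elapsed > 0 else float("inf")


# Hypothetical toy data: 3 of the 4 tokens are in the lexicon.
toy_lexicon = {"hotel", "weather", "restaurant"}
toy_tokens = ["hotel", "weather", "stadium", "restaurant"]
coverage = lexical_coverage(toy_tokens, toy_lexicon)
print(f"lexical coverage: {coverage:.2f}")
```

Under this reading, the slide's "hundred sentences/second" would be the value returned by `sentences_per_second` divided by 100.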

State of the Art of MISS
Aided MT: offline
–CN → EN, FR
Speech-enabled PDA
–Providing Location-based Services (LBS)
–CN, EN
Kiosk (TTS)
–Input: touch screen
–CN, EN
SMS
–Dialogue management
–Domains: weather, hotel, restaurant, (game schedule)...
–CN, EN
Multilingual IR
–Search
–Result classification/clustering
–CN, EN

ASR
The ASR in the MISS project comes from the Institute of Automation (Prof. Xu Bo) and is embedded for PPC 2003.
The footprint is less than 3 MB.
It is a single engine for both Chinese and English (no German in the near future).
It handles continuous speech, and the models can be changed on the fly.

TTS
The TTS comes from SinoVoice and iFlytek, both MISS partners.
Chinese, English and Japanese; no plan for German.
PPC 2003.

Machine Translation
The translation software is from Huajian, another MISS partner
–The PPC 2003 engine is for Chinese-English
–On the server side, our partners have more solutions, with limited German->Chinese support
–All are ready for Chinese, English and Japanese

Constraints
–Only the TTS supports SAPI
–Both ASR and TTS support a client/server structure, though it needs bandwidth
–A Wi-Fi wireless connection is OK, while GPRS is not enough
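
The bandwidth constraint can be illustrated with a rough calculation. The sketch below is illustrative only. The slides do not say what audio format the client sends to the server, so the uncompressed 16 kHz, 16-bit mono stream, the nominal link rates (about 40 kbit/s for a typical GPRS session, 11 Mbit/s for 802.11b), and the headroom factor are all assumptions rather than project figures.

```python
# Rough check of whether a link can carry uncompressed speech to a server.
# All numbers are illustrative assumptions, not values from the MISS project.


def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Bit rate of raw PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def link_is_sufficient(link_kbps: float, stream_kbps: float, headroom: float = 1.5) -> bool:
    """True if the link offers at least the stream rate times a safety headroom."""
    return link_kbps >= stream_kbps * headroom


# Assumed nominal link rates in kbit/s.
ASSUMED_LINKS_KBPS = {"GPRS": 40.0, "Wi-Fi (802.11b)": 11_000.0}

stream = pcm_bitrate_kbps(16_000, 16)  # 16 kHz, 16-bit, mono
for name, rate in ASSUMED_LINKS_KBPS.items():
    verdict = "ok" if link_is_sufficient(rate, stream) else "not enough"
    print(f"{name}: {verdict}")
```

A compressed speech codec or server-side feature extraction would lower the required rate considerably, so the comparison only shows why raw audio streaming favours Wi-Fi over GPRS.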

Access
Hot spots that are available to the public. It is true that China Mobile, the biggest mobile operator, has hot spots at selected hotels and office buildings, but they are not popular. The operator has no plan to cover the city. The Wi-Fi solution is going to be abandoned, since 3G is cheaper for covering a large area and represents the trend.

Funding
When the Chinese government funded the COMPASS project, the engines could be used by the demo system. Since the engines need to be trained or fine-tuned, a rough estimate is that each ASR/TTS/MT engine requires 200k Yuan, paid to the partner who provides it.