A Brief Summary of MISS Project Weiquan Liu Feiyu Xu the multilingual world before MISO.


1 A Brief Summary of MISS Project Weiquan Liu Feiyu Xu the multilingual world before MISO

2 Even Before MISS...
NLP Institutions
–13: now involved in MISS
–5: local to Peking
Projects
–2 international cooperations
–Supported by the 863 national high-tech research and development program
Themes
–Scattered research
–No applications
–No real-world applications
Chinese Funding
–Every two years, the 863 program gives a small amount of funding to many different institutes/groups

3 Role of CapInfo for MISS
Key player in Beijing's information technology system for the society
Founded in 1998 by government agencies

4 First Term of MISS (2002-2004)
No more small projects (863)
CapInfo as coordinator
–Driving partners to work out technologies oriented toward real-world applications
–Leading the effort of building linguistic resources
–Asking a neutral partner (three times) to evaluate the project results
Evaluation method similar to DARPA's
After the two years, the language technologies showed encouraging improvement, according to the official evaluations

5 Partners of MISS and their Performances
ASR
–Automation Inst., Prof. Bo Xu, Beijing: command-oriented approach for Chinese
TTS
–iFlytek, Anhui Province, Science & Technology Uni.: Chinese/English for PDA
–SinoVoice, Beijing (SME): talking browser for blind people
–Tsinghua Univ.: embedded TTS for PDA
Dialogue Management
–Beijing University of Post and Telecom
MT (Chinese – English): knowledge-based approach
–ChinaSoft, Beijing
–Huajian MT (Chinese Academy of Sciences): best of the group
–Nanjing Univ., Nanjing
–Xiamen Univ., Xiamen: second
–Evaluation criteria: lexical coverage, precision, speed (hundred sentences/second)
Linguistic Resources
–Harbin Inst. of Tech, Harbin: speech data, parallel text data
–Peking Univ., Beijing: lexicons
IR
–TRS, Beijing: more than 1000 customers (mainly governments)
Evaluation
–Computing Tech. Inst.

6 State of the Art of MISS
Aided MT: offline
–CN  EN, FR
Speech-enabled PDA
–Providing Location-based Services (LBS)
–CN, EN
Kiosk (TTS)
–Input: touch screen
–CN, EN
SMS
–Dialogue management
–Domains: weather, hotel, restaurant, (game schedule)...
–CN, EN
Multilingual IR
–Search
–Result classification/clustering
–CN, EN

7 ASR
The ASR in the MISS project comes from the Institute of Automation (Prof. Xu Bo) and is embedded for PPC 2003.
The footprint is less than 3 MB.
It is a single engine for both Chinese and English (no German in the near future).
It handles continuous speech, and the models can be changed on the fly.

8 TTS
The TTS comes from SinoVoice and iFlytek, both MISS partners.
Chinese, English and Japanese; no plan for German.
PPC 2003.

9 Machine Translation
The translation software is from
–Huajian, another MISS partner
–The PPC 2003 engine is for Chinese-English
–On the server side, our partners have more solutions
–Limited German->Chinese support
–All are ready for Chinese, English and Japanese

10 Constraints
–Only the TTS supports SAPI
–Both ASR and TTS support a client/server structure, though it needs bandwidth
–A Wi-Fi wireless connection is adequate, while GPRS is not

11 Access hot spots that are available to the public. It's true that China Mobile, the biggest mobile operator, has hot spots at selected hotels and office buildings, but they are not popular. It has no plan to cover the city. The Wi-Fi solution is going to be abandoned, since 3G is cheaper for covering a large area and represents the trend.

12 Funding
When the Chinese government funded the COMPASS project, the engines could be used by the demo system.
Since the engines need to be trained or fine-tuned, a rough estimate is that each ASR/TTS/MT engine requires 200k Yuan, paid to the partner who provides it.

