Centro per la Ricerca Scientifica e Tecnologica Spoken language technologies: recent advances and future challenges Gianni Lazzari VIENNA July 26.

1 Centro per la Ricerca Scientifica e Tecnologica Spoken language technologies: recent advances and future challenges Gianni Lazzari VIENNA July 26

2 Focus on the use of Spoken Language Technologies for multilingual transcription and reporting tasks
SUMMARY
- Short introduction on SLT
- Where are we today?
- TC-STAR and RAI projects
- Outlook for the future

3 Typical tasks in Human Language Technologies (HLT)
- speech recognition (voice commands and speech transcription)
- character recognition
- object and gesture recognition
- (spoken and written) language understanding
- spoken dialog systems
- speech synthesis
- text summarization
- document classification and information retrieval
- syntactic analysis of natural language
- speech and text translation
- ...

4 General Spoken Language System Architecture
Processing chain: input, Recognition, Understanding and dialog, Generation and Synthesis, answer
Models: acoustic, language, semantic, dialog, synthesis

5 Speech Transcription System Architecture
Input audio: noise, speech, music, ...
Recognition results: enriched text
Models: acoustic, language, speakers, speech, music, noise

6 Typical Transcription System

7 Standard Automatic Speech Recognition Architecture

8 Word error rate of different speech recognition tasks
- Dictation: 7%, well formed, computer, FWB
- Broadcast news: 12%, various, audience, FWB
- Switchboard: 20-30%, spontaneous, person, TWB
- Voicemail: 30%, spontaneous, person, TWB
- Meetings: 50-60%, spontaneous, person, FF
The features characterizing these tasks are:
- type of speech: well formed vs spontaneous
- target of communication: computer, audience, person
- bandwidth: FWB (full bandwidth), TWB (telephone bandwidth), FF (far field)
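The slides report word error rates without defining the metric. As a minimal sketch, assuming the usual definition (word-level edit distance between reference and hypothesis, divided by the number of reference words), it could be computed as follows. The function name and the example sentences are illustrative and do not come from the presentation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: one deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, this measure can exceed 100% when the hypothesis is much longer than the reference.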

9 RAI Italian Broadcast news Transcription

10 Evaluation of the Italian broadcast news transcription task.
- Acoustic models are trained through a speaker-adaptive acoustic modelling procedure.
- Two sets of acoustic models were trained, for wideband and narrowband speech, each on about 140 hours of speech.
- The LM was estimated on a 226M-word corpus consisting mostly of newspaper articles, plus BN transcripts.
- The LM is compiled into a static network with a shared-tail topology.

11 Word error rate on the Italian broadcast news transcription task.
                     Wideband                  Narrowband                Overall
                     First Pass  Second Pass   First Pass  Second Pass   First Pass  Second Pass
Old                  15.5        14.2          25.2        22.4          17.6        16.0
New                  14.6        11.7          21.0        17.1          16.0        12.9
Relative reduction   5.8%        17.6%         16.7%       23.7%         9.1%        19.4%
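The "Relative reduction" row is consistent with the usual formula (old - new) / old. A short sketch that recomputes it from the Old and New rows of the table above; the dictionary keys and function name are labels chosen here for illustration.

```python
def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (old - new) / old."""
    return 100.0 * (old_wer - new_wer) / old_wer


# WER values copied from the Old and New rows of the slide.
OLD = {
    "wideband_first": 15.5, "wideband_second": 14.2,
    "narrowband_first": 25.2, "narrowband_second": 22.4,
    "overall_first": 17.6, "overall_second": 16.0,
}
NEW = {
    "wideband_first": 14.6, "wideband_second": 11.7,
    "narrowband_first": 21.0, "narrowband_second": 17.1,
    "overall_first": 16.0, "overall_second": 12.9,
}

REDUCTIONS = {key: relative_reduction(OLD[key], NEW[key]) for key in OLD}

if __name__ == "__main__":
    for key, value in REDUCTIONS.items():
        print(f"{key}: {value:.1f}%")
```

Rounded to one decimal place, the recomputed values agree with the percentages shown on the slide.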

12 STATISTICAL TRANSLATION BASED ON BAYESIAN DECISION RULE
Diagram components: Speech recognition, Transformation, Source language text, Global Search, Transformation, Speech synthesis, Target language text
Models: Lexicon model, Alignment model, Language model
Example: "Vorrei prenotare un albergo a Francoforte" / "I want to reserve a hotel room in Frankfurt"
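The slide names the Bayesian decision rule but the formula itself did not survive in the transcript. In its standard textbook form, with f the source sentence and e a candidate target sentence (these symbols are an assumption, not taken from the slide), the rule reads:

```latex
\hat{e} \;=\; \operatorname*{arg\,max}_{e} \Pr(e \mid f)
        \;=\; \operatorname*{arg\,max}_{e} \Pr(e)\,\Pr(f \mid e)
```

Here Pr(e) is the target language model and Pr(f | e) is the translation model, which in this family of systems is typically built from the lexicon and alignment models listed on the slide. The "Global Search" component performs the maximization over e.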

13 Statistical Translation System

14 Research Cycle
Stages: Basic Research, Technology Development, Application Development, Bottleneck Identification
Links: research results in quantitative evaluation; technologies needed for applications; technologies validated for applications; research needed for improving technology
Attributes: long term, high risk, large ROI; evolutionary; usability, acceptability
Evaluation: Quantitative Evaluation, Usage Evaluation

16 Experimental findings in HLT research (1973-2004)
- statistical methods most successful, in particular: speech recognition, language translation, parsing, dialog systems, ...
- scientific foundations: methods of computer science, statistical modelling, information theory
- handling huge amounts of data: 200 hours of speech recordings, 100 million running words, ...
- learning from data: fully automatic procedures; more data than can be processed by human experts
- efficient algorithms: search/decision algorithms for heuristic search ...

17 Research on HLT: 1973-2004
- speech recognition (1973-2004)
  - most of the progress: by pure statistical modelling
  - some progress: by weak acoustic-phonetic-linguistic knowledge, i.e. domain-specific knowledge
  - virtually no progress: by classical rule-based and AI methods
- similar recent experience (1993-2004): machine translation, information extraction, dialog systems, ...
- expectation for future progress in HLT
  - most important: methodology: computer science, statistical modelling, information theory
  - domain-specific knowledge: acoustics, phonetics, linguistics, ...

18 Spoken language translation
- joint projects (national, European, international): ATR, C-Star, Verbmobil, Eutrans, Nespole!, Fame, LC-Star, PF-Star, TC-STAR
- restricted domains: appointment scheduling, conference registration, travelling, tourism information, ...
- vocabulary size: 3 000 to 10 000 words
- best performing systems and approaches: data-driven (example-based methods, finite-state transducers, statistical approaches)
- e.g. Verbmobil evaluation [June 2000]: better by a factor of 2
Written language translation: US Tides project 2001-2004
- unrestricted domain: press news, vocabulary size about 50 000 words
- language pairs: Chinese → English, Arabic → English
- performance [July 2003]: best statistical systems are better than conventional/commercial systems

19 TC-STAR: Technology and Corpora for Speech to Speech Translation
Contract Nr. FP6 506738
VI FRAMEWORK PROGRAM, PRIORITY Multimodal Interfaces IST-2002-2.3.1.6

20 PARTNERS

21 TC-STAR
The TC-STAR project focuses on advanced research in key technologies for speech to speech translation:
- speech recognition (ASR)
- spoken language translation (SLT)
- speech synthesis (TTS)
- Start: April 2004
- End: March 2007
- Grant: 11 M Euro
- METHODOLOGY: COMPETITIVE EVALUATION, COOPERATION

22 Vision
Transcription and translation of broadcast news, speeches, lectures and interviews
- Vocal access
- Web access
- Simultaneous translation

23 Application Scenario
A selection of unconstrained conversational speech domains:
- Broadcast news
- European Parliament Plenary Session
A few languages important for Europe's society and economy:
- European Accented English
- European Spanish
- Chinese

24 2005 FIRST EVALUATION RESULTS ON THE EUROPEAN PARLIAMENT PLENARY SESSION TASK
The Evaluation Tasks and Databases
Translation tasks:
- English to Spanish: EPPS (European Parliament Plenary Sessions)
- Spanish to English: EPPS (European Parliament Plenary Sessions)
Three types of input to SLT:
- output of automatic speech recognition
- verbatim manual transcriptions
- final text editions (with punctuation marks)

25 2005 FIRST EVALUATION RESULTS ON THE EUROPEAN PARLIAMENT PLENARY SESSION TASK
Training data: sentence-aligned speeches and their translations
- Final text editions: from April 1996 to Oct. 4th, 2004
- Verbatim transcriptions: from May 2004 to Oct. 4th, 2004
Development data: Oct. 26, 2004
Evaluation data: Nov. 14, 2004

26 2005 FIRST EVALUATION RESULTS ON THE EUROPEAN PARLIAMENT PLENARY SESSION TASK

27 2005 FIRST EVALUATION RESULTS ON THE EUROPEAN PARLIAMENT PLENARY SESSION TASK
ASR, EPPS data, word error rate (WER):
- European Accented English: 9.5% (best TC-STAR)
- European Spanish: 10.1% (best TC-STAR)
SLT, EPPS data, position-independent word error rate:
- English to Spanish: 49% (best partner result)
- Spanish to English: 46% (best partner result)
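The translation results are given as position-independent word error rate, which compares hypothesis and reference as bags of words and ignores word order. The slide does not state the exact formula used in the evaluation, and definitions vary slightly between toolkits. The sketch below assumes one common variant: errors are the longer sentence length minus the number of matched words, normalized by the reference length. The function name and example strings are illustrative.

```python
from collections import Counter


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Bag-of-words error rate: word order is ignored (assumed variant)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Multiset intersection: each word counts as matched at most as many
    # times as it occurs in both sentences.
    matches = sum((Counter(ref) & Counter(hyp)).values())

    # Unmatched words on the longer side are counted as errors.
    errors = max(len(ref), len(hyp)) - matches
    return errors / len(ref)


if __name__ == "__main__":
    # Invented example: same words in a different order give zero error.
    print(position_independent_error_rate("a hotel room in Frankfurt",
                                          "in Frankfurt a hotel room"))
```

Since reordering is never penalized, this measure is always less than or equal to the ordinary word error rate for the same sentence pair.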

28 "The spoken translation problem ... is still a significant challenge: Good text translation was hard enough to pull off. Speech to speech MT was beyond going to the Moon – it was Mars..." [Steve Silberman, Wired Magazine].