ATraNoS Workshop
12 April 2002
Patrick Wambacq

ATraNoS workshop, Leuven, 12 April 2002

ATraNoS
• Automatic Transcription and Normalization of Speech
• IWT-STWW TOP project, 2 × 2 years, €1.25M
• Started 1 October 2000
• Partners: ESAT/KULeuven, ELIS/UGent, CCL/KULeuven, CNTS/UIA

ATraNoS user commission
• Function of mentor: guidance, feedback
• Right to inspect results, not (co-)owner
• Six-monthly meetings
• Members:
  – originally: L&H (now ScanSoft), Philips, T&I, (FLV-CELE)
  – added later: VRT, L&C

Project aims
• Automatic transcription of spontaneous speech
• Conversion of transcriptions according to application, e.g. subtitling (test vehicle in this project)

Work packages
• WP1: segmentation of the audio stream into homogeneous segments (ELIS):
  – preprocessor for the speech decoder
  – segments contain a single type of signal (wideband speech, telephone speech, background, etc.)
  – label segments, cluster speakers
  – introduce only a small delay
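
To make the idea of "homogeneous segments" concrete, here is a minimal sketch, not the WP1 system. It assumes a hypothetical upstream classifier has already assigned one signal-type label to each audio frame. The function name `frames_to_segments` and the label strings are illustrative only. Consecutive frames with the same label are merged into one segment.

```python
from itertools import groupby


def frames_to_segments(labels):
    """Merge runs of identical frame labels into (start, end, label) segments.

    `labels` is a sequence with one signal-type label per frame (assumed to
    come from some frame-level classifier, which is not shown here).
    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start, start + length, label))
        start += length
    return segments


# Illustrative labels only; a real system would define its own classes.
frame_labels = ["wideband"] * 3 + ["background"] * 2 + ["telephone"]
segments = frames_to_segments(frame_labels)
```

Multiplying the frame indices by the frame shift (for example 10 ms, another assumption) gives segment boundaries in seconds.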

Work packages (cont’d)
• WP2: detection and handling of out-of-vocabulary (OOV) words:
  – extension of the lexicon (CCL): compounding module to reduce the OOV rate
  – augment recognition results with confidence measures (ESAT): OOV detection
  – phoneme-to-grapheme conversion (CNTS): transcribe OOV words
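
The effect of a compounding module on the OOV rate can be illustrated with a small sketch. This is not the CCL module: it assumes that a compound is a plain concatenation of lexicon words, and it ignores linking morphemes and any linguistic constraints. The function names, the `min_part` parameter and the example words are hypothetical.

```python
def oov_rate(tokens, lexicon):
    """Fraction of running tokens that are not in the lexicon."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in lexicon) / len(tokens)


def split_compound(word, lexicon, min_part=3):
    """Return lexicon words that concatenate to `word`, or None if none exist.

    Naive recursive search: try every split point that leaves at least
    `min_part` characters on each side and accept the first that works.
    """
    if word in lexicon:
        return [word]
    for i in range(min_part, len(word) - min_part + 1):
        head = word[:i]
        if head in lexicon:
            tail = split_compound(word[i:], lexicon, min_part)
            if tail is not None:
                return [head] + tail
    return None


def oov_rate_with_compounding(tokens, lexicon):
    """OOV rate when decomposable compounds count as in-vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for t in tokens if split_compound(t, lexicon) is None)
    return missing / len(tokens)


# Illustrative toy data, not taken from the project.
lexicon = {"de", "is", "spraak", "herkenning"}
tokens = ["de", "spraakherkenning", "is", "moeilijk"]
baseline = oov_rate(tokens, lexicon)
with_compounds = oov_rate_with_compounding(tokens, lexicon)
```

In this toy example the compound "spraakherkenning" is covered by its two parts, so the OOV rate drops from 0.5 to 0.25, while "moeilijk" remains out of vocabulary.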

Work packages (cont’d)
• WP3: spontaneous speech problems:
  – detection of disfluencies (ELIS): use acoustic/prosodic features; supply info to the HMM recognizer
  – statistical language model (ESAT): extend the traditional trigram LM to incorporate hesitations, filled pauses, self-corrections, repetitions → sequence of clean speech islands
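
The notion of a "sequence of clean speech islands" can be pictured at the text level with a rough sketch. It is not the ESAT language model, which handles disfluencies statistically inside the recognizer. The sketch assumes a fixed, hypothetical list of filled pauses, treats each filled pause as an island boundary, and drops only immediate word repetitions. Self-corrections are not handled.

```python
# Hypothetical filled-pause inventory; a real system would derive it from data.
FILLED_PAUSES = frozenset({"uh", "um", "eh"})


def clean_islands(tokens, fillers=FILLED_PAUSES):
    """Split a token stream into disfluency-free islands.

    A filled pause closes the current island; a word that immediately
    repeats the previous word of the island is dropped.
    """
    islands = []
    current = []
    for tok in tokens:
        if tok in fillers:
            if current:
                islands.append(current)
                current = []
            continue
        if current and tok == current[-1]:
            continue  # immediate repetition, e.g. "to to"
        current.append(tok)
    if current:
        islands.append(current)
    return islands


islands = clean_islands("i i want uh to to go um home".split())
```

Here the input "i i want uh to to go um home" is reduced to three islands: "i want", "to go" and "home".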

Work packages (cont’d)
• WP4: subtitling:
  – data collection and automatic alignment (CNTS)
  – input/output specifications (CCL): linguistic characteristics
  – subtitling: statistical approach (CNTS)
  – subtitling: linguistic approach (CCL)
  – hybrid system possible?

Where are we?
• WP1: baseline segmentation ready
• WP2: compounding module for lexicon, confidence measures, p2g conversion ready
• WP3: acoustic model and baseline statistical language model for Switchboard corpus ready
• WP4: data collection and alignment nearly finished, I/O specs determined