Published by Joel Oliver. Modified over 9 years ago.
1
Notes on ICASSP 2004
Arthur Chan
May 24, 2004
2
This Presentation (5 pages)
  Brief notes on ICASSP 2004
  NIST RT 04 evaluation results
  Other interesting things related to CALO
3
NIST RT 04 Meeting Transcription – Headlines
  Meeting transcription: a challenge to core technology, evaluation, and resource preparation.
  Core technology
    Speaker segmentation
    Speech to Text (STT)
  Evaluation
    A new evaluation scheme has been devised for overlapped speech.
  Resource preparation
    LDC has a big headache in preparing the data.
4
Speaker Segmentation
  Segmenting the speech
    Search for the number of speakers.
    Get speaker turns.
  Measured by diarization rate.
  Insights (from ISL):
    The more speakers, the harder the task.
    A new measure called speaker speaking time entropy is proposed.
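The slide names the speaker speaking time entropy measure but does not define it. The sketch below is only an assumption about what such a measure could look like: the Shannon entropy (in bits) of the fraction of total speaking time taken by each speaker. The function name and the input format (a mapping from speaker label to seconds spoken) are hypothetical and are not taken from the ISL work.

```python
import math


def speaking_time_entropy(speaking_seconds):
    """Shannon entropy (bits) of the speaking-time distribution.

    ASSUMPTION: the slide does not define the measure; this treats it as
    the entropy of each speaker's share of the total speaking time.
    speaking_seconds maps a speaker label to seconds spoken.
    """
    total = sum(speaking_seconds.values())
    if total <= 0:
        raise ValueError("total speaking time must be positive")

    entropy = 0.0
    for seconds in speaking_seconds.values():
        if seconds > 0:  # speakers with no speech contribute nothing
            share = seconds / total
            entropy -= share * math.log2(share)
    return entropy


if __name__ == "__main__":
    # A meeting dominated by one speaker scores lower than a balanced one.
    print(speaking_time_entropy({"A": 540, "B": 30, "C": 30}))
    print(speaking_time_entropy({"A": 200, "B": 200, "C": 200}))
```

Under this reading, a meeting where one person does all the talking scores 0, and a meeting split evenly among N speakers scores log2(N), which fits the observation that more speakers make the task harder.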
5
STT
  Very hard task
  ICSI and ISL use state-of-the-art technology:
    + Constrained linear transform
    + Discriminative training (DT-MAP)
    + Speaker adaptive training
  Individual headphone results
    WER: 34.8% for non-overlapping speech.
    Some meetings are very hard: many people are speaking at the same time.
  Trained on 4 different subsets of data; ICSI data is just one of them (70% of the total).
  Insights (ICSI):
    Feature-based techniques don't help much.
    Multiple-distance microphone and microphone-array techniques help.
  Conclusion: we will also have a hard time.
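For reference, the WER figure quoted above is the word error rate: substitutions, deletions, and insertions divided by the number of reference words. The sketch below computes it with a standard word-level edit distance. It is a minimal illustration, not the NIST scoring tool, and it assumes whitespace-tokenized, already-normalized transcripts; the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.

    ASSUMPTION: both inputs are plain strings split on whitespace, with
    no text normalization beyond what the caller has already done.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, which is one reason overlapped speech is hard to score with a single reference string.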
6
Evaluation and Resource Preparation
  Evaluation
    Overlapped speech requires different evaluation schemes.
    Will require multiple string matching. (Details unknown yet.)
  Resource preparation
    Currently, no tool satisfies the need to transcribe multiple channels of speech with interaction.
    Professional transcribers failed.
7
Other Interesting News from ICASSP Related to CALO
  Project EARS
    Lightly supervised training: 3000 hours of closed-captioned speech is used.
    Discriminative training is found to be useful for some sites.
  Others