1
Sausage
Lidia Mangu, Eric Brill, Andreas Stolcke
Presenter: Jen-Wei Kuo
2004/9/24
2
References
CSL '00: Finding Consensus in Speech Recognition: Word Error Minimization and Other Applications of Confusion Networks
Eurospeech '99: Finding Consensus among Words: Lattice-Based Word Error Minimization
Eurospeech '97: Explicit Word Error Minimization in N-Best List Rescoring
3
Motivation
There is a mismatch between the standard scoring paradigm (MAP) and the evaluation metric (WER): MAP maximizes the sentence posterior probability, and thus minimizes sentence-level error, while performance is measured by word-level error.
4
An Example
Correct answer: I’M DOING FINE
5
Word Error Minimization
Choose the hypothesis that minimizes the expected word error under the posterior distribution over potential hypotheses:

$\hat{W} = \operatorname{argmin}_{W} E_{P(W'|A)}[\mathrm{WE}(W, W')] = \operatorname{argmin}_{W} \sum_{W'} P(W' \mid A)\, \mathrm{WE}(W, W')$

where $A$ is the acoustic input and $\mathrm{WE}(W, W')$ is the word error (edit distance) between $W$ and $W'$.
6
N-best Approximation
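The expectation is approximated by restricting both the candidate hypotheses and the posterior distribution to the N best entries. A minimal Python sketch of this rescoring, with made-up hypotheses and posteriors (the function names are illustrative, not from the paper):

def word_errors(hyp, ref):
    # Levenshtein (edit) distance between two word sequences.
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

def min_expected_wer(nbest):
    # nbest: list of (word_sequence, posterior). Score each hypothesis by
    # its posterior-weighted word error against every entry in the list.
    return min(nbest,
               key=lambda h: sum(p * word_errors(h[0], w) for w, p in nbest))[0]

nbest = [("I'M DONE FINE".split(), 0.40),
         ("I'M DOING FINE".split(), 0.35),
         ("I'M DOING FIND".split(), 0.25)]
print(" ".join(min_expected_wer(nbest)))  # I'M DOING FINE, not the MAP pick

With these toy posteriors the expected-error criterion picks the correct second hypothesis even though MAP would pick the first, which is exactly the mismatch described on the Motivation slide.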
7
Lattice-Based Word Error Minimization
Computational Problem
The hypothesis set of a lattice is several orders of magnitude larger than an N-best list of practical size, and no efficient algorithm for this search is known.
Fundamental Difficulty
The objective function is based on pairwise string distance, a nonlocal measure.
Solution
Replace the pairwise string alignment with a modified multiple string alignment, i.e. replace WE (word error) with MWE (modified word error).
8
Lattice to Confusion Network
Multiple Alignment
9
Multiple Alignment
Finding the optimal alignment is a problem for which no efficient solution is known (Gusfield, 1992). We resort to a heuristic approach based on lattice topology.
10
Algorithms
Step 1. Arc Pruning
Step 2. Same-Arc Clustering
Step 3. Intra-Word Clustering
Step 4*. Same-Phones Clustering
Step 5. Inter-Word Clustering
Step 6. Adding Null Hypothesis
Step 7. Consensus-based Lattice Pruning
11
Arc Pruning
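A hedged sketch of this step, assuming each arc is a dict carrying a precomputed forward-backward posterior (the representation and threshold are illustrative assumptions, not the paper's data structures):

def prune_arcs(arcs, beam=1e-3):
    # Keep only arcs whose posterior is within a relative beam of the
    # best arc's posterior; everything else is discarded up front.
    best = max(a["posterior"] for a in arcs)
    return [a for a in arcs if a["posterior"] >= beam * best]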
12
Intra-Word Clustering
Same-Arc Clustering: arcs with the same word_id, start frame, and end frame are merged first.
Intra-Word Clustering: arcs with the same word_id are then merged.
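A rough sketch of Steps 2 and 3 under the same assumed dict-based arc representation; a cluster's posterior is simply the sum of its members' posteriors:

from collections import defaultdict

def same_arc_clustering(arcs):
    # Step 2: arcs identical in (word_id, start, end) fall into one cluster.
    groups = defaultdict(list)
    for a in arcs:
        groups[(a["word_id"], a["start"], a["end"])].append(a)
    return list(groups.values())

def intra_word_clustering(clusters):
    # Step 3: clusters sharing a word_id are merged. (The full algorithm
    # also respects time overlap and the partial order between clusters.)
    by_word = defaultdict(list)
    for c in clusters:
        by_word[c[0]["word_id"]].extend(c)
    return list(by_word.values())

def cluster_posterior(cluster):
    return sum(a["posterior"] for a in cluster)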
13
Same-Phones Clustering
Arcs with the same phone sequence are clustered at this stage.
14
Inter-Word Clustering
The remaining arcs are finally clustered at this stage.
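One way to picture this stage: repeatedly merge the most similar pair of clusters that the partial order still permits. The paper scores similarity phonetically, weighted by word posteriors; the sketch below substitutes difflib string similarity as a crude stand-in for that phonetic measure:

from difflib import SequenceMatcher

def cluster_similarity(c1, c2):
    # Best posterior-weighted word-pair similarity between two clusters.
    # SequenceMatcher.ratio() stands in for phonetic similarity here.
    return max(a["posterior"] * b["posterior"] *
               SequenceMatcher(None, a["word_id"], b["word_id"]).ratio()
               for a in c1 for b in c2)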
15
Adding null hypothesis
For each equivalence class, if the sum of the posterior probabilities is less than a threshold (0.6), the null (deletion) hypothesis is added to the class.
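In code this step is tiny. The sketch assumes an equivalence class is a word-to-posterior dict and writes the null word as '-' (both are notational assumptions):

def add_null_hypothesis(slot, threshold=0.6):
    # If the class's total posterior mass falls short of the threshold,
    # the missing mass becomes a null hypothesis, i.e. the option of
    # deleting a word at this position.
    mass = sum(slot.values())
    if mass < threshold:
        slot["-"] = 1.0 - mass
    return slot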
16
Consensus-based Lattice Pruning
Standard Method (likelihood-based)
Paths whose overall score differs by more than a threshold from the best-scoring path are removed from the word graph.
Proposed Method (consensus-based)
First construct a pruned confusion network, then intersect the original lattice with it.
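A sketch of the proposed method under the same assumed representations; slot_of is a hypothetical helper that maps a lattice arc to its confusion-network position:

def prune_confusion_network(slots, keep_ratio=0.01):
    # Keep, in each slot, only words near the slot's posterior maximum.
    return [{w: p for w, p in slot.items()
             if p >= keep_ratio * max(slot.values())}
            for slot in slots]

def intersect_lattice(arcs, pruned_slots, slot_of):
    # Keep only arcs whose word survived pruning in its own slot, which
    # removes every lattice path that uses a pruned word.
    return [a for a in arcs if a["word_id"] in pruned_slots[slot_of(a)]]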
17
Algorithm
18
An Example
How should these arcs be merged? [Lattice figure over the Chinese words 我 'I', 是 'is', 誰 'who', with several competing 我 and 是 arcs.]
19
Computational Issues
Partial order: two clusters may be merged only if neither is ordered before the other, so the order must be maintained throughout. A naive method that re-derives it on every merge is too expensive.
History-based Look-ahead: a first-pass search finds the history arcs of each arc and generates the initial partial ordering. As clusters are merged, many (recursive) updates are needed, and with thousands of arcs this also demands a lot of memory.
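A sketch of the bookkeeping involved, assuming every cluster keeps the transitively closed set of clusters ordered before it (a representation assumed for illustration):

def can_merge(ancestors, x, y):
    # Two clusters may merge only if neither precedes the other.
    return x not in ancestors[y] and y not in ancestors[x]

def merge(ancestors, x, y):
    # Merge cluster y into cluster x. Anything ordered before either is
    # now ordered before the merged cluster, and every cluster that
    # followed x or y inherits the enlarged ancestor set: the (recursive)
    # update cost the slide warns about.
    ancestors[x] |= ancestors.pop(y)
    for c, anc in ancestors.items():
        if y in anc:
            anc.discard(y)
            anc.add(x)
        if c != x and x in anc:
            anc |= ancestors[x]
    return ancestors

ancestors = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"A", "B", "C"}}
if can_merge(ancestors, "B", "C"):
    merge(ancestors, "B", "C")  # D's ancestors become {"A", "B"}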
20
Computational Issues: An Example
[Figure: a cluster graph over nodes A through N.] If we merge clusters B and C, what happens to the partial order?
21
Experimental Set-up
Lattices were built using HTK.
Training corpus: about 60 hours of Switchboard speech; the LM is a backoff trigram model trained on 2.2 million words of Switchboard transcripts.
Testing corpus: the test set of the 1997 JHU workshop.
22
Experimental Results
23
Experimental Results: WER (%) by focus condition

Hypothesis            F0    F1    F2    F3    F4    F5    FX   Overall  Short utt.  Long utt.
MAP                  13.0  30.8  42.1  31.0  22.8  52.3  53.9   33.1      33.3        31.5
N-best (center)        -   30.6    -   31.1  22.6  52.4    -    33.0        -           -
Lattice (consensus)  11.9  30.5    -   30.7  22.3  51.8  52.7   32.5        -           -

(Cells not recoverable from the slide are marked "-".)
24
Confusion Network Analyses
29
Other Approaches
ROVER (Recognizer Output Voting Error Reduction)
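For contrast, a minimal sketch of ROVER-style voting, assuming the outputs of several recognizers have already been aligned into slots (the alignment itself, done by ROVER's iterative procedure, is the hard part and is skipped here):

from collections import Counter

def rover_vote(aligned):
    # aligned: one list per slot, holding each system's word (or '-').
    # The most frequent word in each slot wins.
    return [Counter(slot).most_common(1)[0][0] for slot in aligned]

aligned = [["I'M", "I'M", "I'M"],
           ["DONE", "DOING", "DOING"],
           ["FINE", "FINE", "FIND"]]
print(" ".join(rover_vote(aligned)))  # I'M DOING FINE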