Progress Report of Sphinx in Q4 2004 (Sep 1st to Dec 30th)


1 Progress Report of Sphinx in Q4 2004 (Sep 1st to Dec 30th)
By Arthur Chan

2 Highlights
Release of Sphinx 3.5
Further speed-up of Sphinx 3.5
Adaptation progress
New tools:
  decode_anytopo: the slow decoder
  ep: high-performance model-based end-pointer
  cepview: versatile cepstral viewer
  wave2feat: stand-alone tool for converting a wave file to its features

3 Release of Sphinx 3.5
Release candidates progress:
  Sphinx 3.5 RC II released on Oct 8, 2004
  Sphinx 3.5 RC V released on Dec 19
  Expected time of official release: beginning of January
Features:
  Stable live-mode APIs
  MLLR
  Further speed-up from Sphinx 3.4:
    Absolute CI GMM Selection
    Approximate computation of CI senones
  Complete merging of the s3.0 and s3.5 codebases. Both slow and fast decoders are available in the same codebase.

4 Further technical detail of Sphinx 3.5
Portable across OSes: Linux/Solaris/Mac OSX/Windows/BSD
Platforms: Alpha/x86/Solaris/PPC
Performance is tested extensively with vocabulary sizes varying from 10 to 10K
  Tests were carried out on Linux only
  Results are repeatable on Windows/Linux.

5 Performance so far...
Live-mode recognition results:
  TIDIGITS: 0.651%
  WSJ 5K: 7.73%
  Communicator: 13.0%

6 Further Speed-up of Sphinx 3.5
Absolute CI GMM Selection
  Instead of using a beam, only compute a certain number of CD senones
  Results in a 20% speed gain on top of the 3.4 improvement, without loss of accuracy
Speed-up of CI GMM computation
  Uses the fast GMM computation technique on CI GMMs as well
  Gives a better lower bound on speed
Outlook
  Implement tricks such as best GMM index and LDA (ETA: February)
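To make the idea on slide 6 concrete, here is a minimal Python sketch of selecting CD senones by an absolute count rather than a beam. It is an illustration under stated assumptions, not the Sphinx 3.5 implementation: each senone is reduced to a single diagonal Gaussian (real senones are mixtures), CD senones are ranked by the score of their parent CI senone, and unselected CD senones back off to the parent CI score. The names `score_senones`, `cd_parent`, and `max_cd` are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def score_senones(x, ci_models, cd_models, cd_parent, max_cd):
    """Score all CI senones, then at most max_cd CD senones.

    ci_models, cd_models: lists of (mean, var) pairs (assumed single Gaussians).
    cd_parent[i]: index of the CI senone that CD senone i is derived from.
    max_cd: absolute budget of CD senones to compute for this frame.
    """
    # CI senones are few, so all of them are always computed.
    ci_scores = [log_gaussian(x, m, v) for m, v in ci_models]

    # Rank CD senones by how well their parent CI senone scored.
    ranked = sorted(
        range(len(cd_models)),
        key=lambda i: ci_scores[cd_parent[i]],
        reverse=True,
    )
    selected = set(ranked[:max_cd])  # absolute count, not a beam

    cd_scores = []
    for i, (m, v) in enumerate(cd_models):
        if i in selected:
            cd_scores.append(log_gaussian(x, m, v))  # full computation
        else:
            cd_scores.append(ci_scores[cd_parent[i]])  # assumed back-off
    return ci_scores, cd_scores, selected
```

With a beam, the number of CD senones computed per frame varies with how sharply the CI scores separate; with an absolute count, the per-frame cost has a fixed upper limit, which is one plausible reading of why the slide reports a speed gain.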

7 Adaptation Progress
MLLR is proven to work in
  RM1
  WSJ (NAB)
With 10-20% gain
Experiments in unsupervised speaker adaptation were also performed on the RM1 task.
For details, please read David Huggins-Daines' report.
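For readers unfamiliar with MLLR, the core operation is an affine transform of the Gaussian means, mean' = A * mean + b, where A and b are estimated from the adaptation data. The sketch below only shows how an already-estimated transform is applied. It assumes a single global transform shared by all Gaussians, and the function names are hypothetical rather than taken from the Sphinx code.

```python
def mllr_adapt_mean(mean, A, b):
    """Return the adapted mean A * mean + b, using plain Python lists."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def mllr_adapt_all(means, A, b):
    """Apply one global transform (an assumption) to every Gaussian mean."""
    return [mllr_adapt_mean(m, A, b) for m in means]


# Toy 2-dimensional example with an illustrative, made-up transform.
A = [[1.0, 0.0],
     [0.0, 2.0]]
b = [0.5, -1.0]
adapted = mllr_adapt_all([[1.0, 2.0], [0.0, 0.0]], A, b)
```

Estimating A and b, which is the harder part of MLLR, is not shown here.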

8 New tools for Sphinx 3.5
Merging of Sphinx 3.0 and Sphinx 3.5 is completed.
S3.0 family:
  Very accurate batch-mode decoder: decode_anytopo
    5% better than the non-optimized fast decoder
    Very slow: 4xRT to 10xRT
    Useful for batch-mode processing, as in the CALO scenario
  Auxiliary:
    align: aligner
    allphone: phoneme recognition
    dag: best-path finder in a lattice
    astar: n-best generator
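The xRT figures quoted for decode_anytopo are real-time factors: decoding time divided by the duration of the audio. A short sketch of the arithmetic follows; the function name and the numbers in the example are illustrative, not measurements from the report.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return decoding time divided by audio duration (the xRT figure)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


# One minute of audio decoded in four minutes runs at 4xRT.
rtf = real_time_factor(240.0, 60.0)
```

At 4xRT to 10xRT, an hour of recorded speech would take roughly 4 to 10 hours to decode, which is why the slide positions this decoder for batch use rather than live mode.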

9 New tools for Sphinx 3.5 (cont.)
Other tools: (New!)
  ep: sped-up version of Ziad's end-pointer implementation
    For details of ep, please read Ziad's report.
  cepview: cepstral viewer
    Originally a standalone tool, now distributed with s3.5
  wave2feat: stand-alone feature extraction routine
    Exactly the same as the one in the live-mode decoder.

10 Outlook in Q1 2005
Official release of s3.5 (ETA January)
Further improvement in speed (ETA February)
Refactoring of Sphinx3.X and SphinxTrain (ETA March)
It is time to turn our focus to accuracy.

