Published by Ethelbert Miles. Modified over 9 years ago.
1
Xkl: A Tool For Speech Analysis Eric Truslow Adviser: Helen Hanson
2
Outline Introduction to speech analysis – Production mechanism – Models of speech production Background about Xkl Design – Pitch Detection – Labeling – Portability Future Work
4
Speech Production Vocal Tract Frequency Response
5
Speech Production Periodic Source Vocal Tract Frequency Response
6
Speech Production Periodic Source Vocal Tract Frequency Response Nasal cavities contribute too Output
7
Speech Model: Basic [Block diagram] Blocks: Impulse Train Generator (Pitch Period) – Glottal Pulse Model – Random Noise Generator – Voiced/Unvoiced Decision – Gain (× multipliers) – Vocal Tract Model (Vocal Tract Parameters)
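The basic model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Xkl code: the function names, the parameter values, and the use of a generic all-pole filter as the vocal tract model are assumptions, and the glottal pulse model block is left out for brevity.

```python
import random

def excitation(n_samples, sample_rate, pitch_hz, voiced, gain, seed=0):
    """Gain-scaled source: impulse train if voiced, white noise if unvoiced."""
    if voiced:
        period = round(sample_rate / pitch_hz)  # pitch period in samples
        return [gain if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def all_pole_filter(x, a):
    """Stand-in vocal tract model: y[n] = x[n] - sum_k a[k] * y[n-1-k]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y

# Hypothetical usage: 50 ms of a 100 Hz voiced source at 8 kHz through a
# one-pole filter (coefficient chosen arbitrarily for illustration).
source = excitation(400, 8000, 100.0, voiced=True, gain=1.0)
speech_like = all_pole_filter(source, [-0.9])
```

The voiced/unvoiced decision is just the boolean switch between the two generators, and the gain scales whichever source is selected before it reaches the filter.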
8
Speech Model: Klatt
9
Parameters Source characterization – Voiced or unvoiced – Frequency of the periodic source – Energy distribution of a noise source Vocal tract model – Resonant frequencies (formants), antiresonant frequencies, and their bandwidths
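To show how a formant frequency and bandwidth become filter parameters, here is a sketch of the two-pole digital resonator commonly used in Klatt-style formant synthesizers. The function names and the example values (500 Hz formant, 60 Hz bandwidth, 10 kHz sampling) are assumptions for illustration; nothing here is taken from the Xkl source.

```python
import math

def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c

def gain_at(f_hz, coeffs, sample_rate):
    """Magnitude of H(z) = a / (1 - b*z^-1 - c*z^-2) at frequency f_hz."""
    a, b, c = coeffs
    w = 2.0 * math.pi * f_hz / sample_rate
    z_inv = complex(math.cos(w), -math.sin(w))  # z^-1 on the unit circle
    return abs(a / (1.0 - b * z_inv - c * z_inv * z_inv))

# Hypothetical first formant: 500 Hz with a 60 Hz bandwidth.
f1 = resonator_coeffs(500.0, 60.0, 10000.0)
```

A narrower bandwidth moves the poles closer to the unit circle and sharpens the peak. An antiresonance can be modeled as the inverse of such a filter.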
11
Background - Xkl Developed in-house at MIT by Dennis Klatt in the 1980s, originally as a command-line tool on VAX systems. Later ported to UNIX, with an X11/Motif GUI added. Currently runs on Linux. Praat has become a very versatile alternative to Xkl, but Xkl has functionality that Praat does not.
12
Xkl – Features Allows users to easily examine speech signals in fine detail. Automatically computes the DFT and spectrogram. Can perform a variety of computations not available in other packages: – Averages spectra over time or waveforms – Smooths spectra
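The spectral computations listed on this slide can be illustrated with a small sketch. It assumes a direct O(N²) DFT and a moving-average smoother, chosen only for clarity. The slides do not say which smoothing method Xkl uses, so the function names and the smoother are hypothetical.

```python
import cmath
import math

def dft_magnitude(frame):
    """Magnitude spectrum for bins 0..N/2 of a real-valued frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def smooth(spectrum, half_width):
    """Moving average over neighboring bins (window shrinks at the edges)."""
    out = []
    for k in range(len(spectrum)):
        lo = max(0, k - half_width)
        hi = min(len(spectrum), k + half_width + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out

def average_spectra(frames):
    """Average the magnitude spectra of several equal-length frames."""
    spectra = [dft_magnitude(f) for f in frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

# Hypothetical usage: a sinusoid with 4 cycles in a 32-sample frame.
frame = [math.sin(2.0 * math.pi * 4 * t / 32) for t in range(32)]
smoothed = smooth(dft_magnitude(frame), half_width=1)
```

In practice an FFT and a tapered analysis window would replace the direct DFT; the sketch only shows what each operation computes.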
13
Spectrogram and DFT in Xkl Spectrogram DFT and smoothed spectrum
15
Design Requirements Users surveyed wanted: 1. Pitch period estimator 2. An improved labeling system 3. Portability – Compatibility with multiple operating systems – Support for more audio file formats
16
Pitch Detection Pitch reflects how rapidly the vocal tract is excited with periodic pulses. It carries lexical and prosodic information. During computation we must decide whether speech is voiced or unvoiced. – Errors in computation often occur during transitions between sounds. – Errors depend on the type of pitch detector being used.
17
Pitch Detection: Design There are many different pitch detectors. Praat's was chosen because it: – Outperforms other detectors (SNR, HNR) – Is readily available
18
Pitch Detection: Algorithm [Figure: Tone 4] [Flowchart: Praat Pitch Detector] Blocks: Remove Hanning Window Sidelobe – Compute Global Peak Value – Process Frame To Obtain Local Optimal Choices – Find Path with Globally Minimum Cost Notes: Time-domain, autocorrelation method. Frame processing determines the strongest pitch candidates, including unvoiced. The Viterbi algorithm minimizes the global cost over the candidates.
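The two stages described here, per-frame candidates followed by a global path search, can be sketched as follows. This is a heavily simplified illustration, not Praat's implementation: it skips the window-sidelobe correction and peak interpolation, and the function names, thresholds, and cost values are assumptions picked for the example.

```python
import math

def frame_candidates(frame, sample_rate, fmin=75.0, fmax=500.0,
                     unvoiced_strength=0.45):
    """Return [(freq_hz, strength), ...]; freq 0.0 is the unvoiced candidate."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]
    r0 = sum(v * v for v in x)
    cands = [(0.0, unvoiced_strength)]
    if r0 == 0.0:
        return cands  # silent frame: only the unvoiced candidate
    lo = max(1, int(sample_rate / fmax))
    hi = min(n - 2, int(sample_rate / fmin))
    # Normalized autocorrelation over the lag range of interest.
    r = {lag: sum(x[i] * x[i + lag] for i in range(n - lag)) / r0
         for lag in range(lo - 1, hi + 2)}
    for lag in range(lo, hi + 1):
        if r[lag] > 0 and r[lag] > r[lag - 1] and r[lag] >= r[lag + 1]:
            cands.append((sample_rate / lag, r[lag]))  # local peak
    return cands

def transition_cost(f1, f2, octave_jump_cost=0.35, voicing_cost=0.14):
    if f1 == 0.0 and f2 == 0.0:
        return 0.0
    if f1 == 0.0 or f2 == 0.0:
        return voicing_cost  # voiced/unvoiced switch
    return octave_jump_cost * abs(math.log2(f1 / f2))

def best_path(cands_per_frame):
    """Viterbi search: maximize strength minus transition costs,
    which is the same as minimizing the negated total as a cost."""
    scores = [s for _, s in cands_per_frame[0]]
    back = []
    for prev, cur in zip(cands_per_frame, cands_per_frame[1:]):
        new_scores, pointers = [], []
        for f2, s2 in cur:
            opts = [scores[i] - transition_cost(f1, f2)
                    for i, (f1, _) in enumerate(prev)]
            i_best = max(range(len(opts)), key=opts.__getitem__)
            new_scores.append(opts[i_best] + s2)
            pointers.append(i_best)
        scores = new_scores
        back.append(pointers)
    idx = max(range(len(scores)), key=scores.__getitem__)
    path = [idx]
    for pointers in reversed(back):  # trace the best path backwards
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [cands[i][0] for cands, i in zip(cands_per_frame, path)]
```

Because the unvoiced option competes as an ordinary candidate in every frame, the voiced/unvoiced decision falls out of the same path search instead of needing a separate classifier.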
20
Labeling Support for reading and saving TextGrid files, for interaction with Praat [1]. – Tiers for grouping labels Labels should be displayed in the same window as the waveform. – Different from Xkl's separate-window layout
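As an illustration of what saving labels in tiers involves, the sketch below writes interval tiers in what is understood to be Praat's long TextGrid text layout. The layout is reproduced from memory and should be checked against the Praat manual before relying on it. The function name and the example labels are hypothetical, and this is not the Xkl implementation.

```python
def textgrid_text(xmin, xmax, tiers):
    """tiers: {tier_name: [(start, end, label), ...]}, intervals contiguous."""
    def quoted(s):
        # Double quotes inside a label are escaped by doubling them.
        return '"' + s.replace('"', '""') + '"'

    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        f'xmin = {xmin}',
        f'xmax = {xmax}',
        'tiers? <exists>',
        f'size = {len(tiers)}',
        'item []:',
    ]
    for t, (name, intervals) in enumerate(tiers.items(), start=1):
        lines += [
            f'    item [{t}]:',
            '        class = "IntervalTier"',
            f'        name = {quoted(name)}',
            f'        xmin = {xmin}',
            f'        xmax = {xmax}',
            f'        intervals: size = {len(intervals)}',
        ]
        for i, (start, end, label) in enumerate(intervals, start=1):
            lines += [
                f'        intervals [{i}]:',
                f'            xmin = {start}',
                f'            xmax = {end}',
                f'            text = {quoted(label)}',
            ]
    return '\n'.join(lines) + '\n'

# Hypothetical usage: one tier with two labeled intervals.
text = textgrid_text(0.0, 1.0, {"words": [(0.0, 0.4, "speech"), (0.4, 1.0, "")]})
```

Each tier groups one kind of label (for example words or phones), which is what lets labels be exchanged between tools.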
21
Labeling
23
Portability PortAudio – A cross-platform audio library – Supports most operating systems – Simplifies software maintenance Runs on OS X – Since OS X natively runs X11 Added support for opening Microsoft .wav files.
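For readers unfamiliar with the .wav format, the sketch below round-trips 16-bit mono PCM samples using Python's standard `wave` module. It only illustrates the file-format side of portability. It does not use PortAudio and is not how Xkl reads files; the function names and the 16-bit mono restriction are assumptions made to keep the example short.

```python
import struct
import wave

def write_wav_mono16(path, sample_rate, samples):
    """Write floats in [-1, 1] as 16-bit little-endian mono PCM."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # bytes per sample
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))

def read_wav_mono16(path):
    """Return (sample_rate, samples scaled to roughly [-1, 1])."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [v / 32768.0 for v in ints]
```

A real reader would also handle stereo data and other sample widths, which the header fields queried above make possible to detect.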
24
Outline Introduction to speech analysis – Production mechanism – Models of speech production Background about Xkl Design – Requirements – Alternatives – Final Design Future Work
25
Deploy to users for feedback Finalize – Labeling – Pitch Contour Fix bugs and add small features
26
Software Used Eclipse – An integrated development environment. Doxygen – A documentation generation system. SVN – A version control system. Open Motif – A window manager and widget library for the X Window System. GDB – The GNU debugger. GNU build system on OS X. PortAudio – A multiplatform audio library.
27
Thank you for your attention. Special thanks to: Professor Helen Hanson Dr. Stefanie Shattuck-Hufnagel (MIT) Dennis H. Klatt Survey Participants ECE Department
28
Questions?
29
References
1: Paul Boersma & David Weenink (2009): Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. Retrieved May 1, 2009, from http://www.praat.org/
2: Paul Boersma (1993): Accurate Short-term analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound. http://www.fon.hum.uva.nl/paul/papers/Proceedings_1993.pdf