Xkl: A Tool For Speech Analysis Eric Truslow Adviser: Helen Hanson.


Outline Introduction to speech analysis – Production mechanism – Models of speech production Background about Xkl Design – Pitch Detection – Labeling – Portability Future Work

Speech Production A periodic source excites the vocal tract, whose frequency response shapes the output. Nasal cavities contribute too.

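Speech Model: Basic [Block diagram: Impulse Train Generator (input: Pitch Period) and Glottal Pulse Model form the voiced source; Random Noise Generator forms the unvoiced source; a Voiced/Unvoiced Decision selects the source; multipliers (X) apply the Gain; the result drives the Vocal Tract Model (input: Vocal Tract Parameters).]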
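
The basic model above can be sketched in a few lines of Python. This is a minimal illustration, not Xkl's code: the function names, the default `f0_hz`, and the filter coefficients are hypothetical, the glottal pulse model is omitted, and a generic all-pole filter stands in for the vocal tract model.

```python
import random


def excitation(n_samples, sample_rate, voiced, f0_hz=100.0, seed=0):
    """Source signal: an impulse train if voiced, white noise otherwise."""
    if voiced:
        period = max(1, round(sample_rate / f0_hz))  # pitch period in samples
        return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(x, a):
    """y[n] = x[n] - sum_k a[k] * y[n-1-k]; a stand-in vocal tract model."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y


def synthesize(n_samples, sample_rate, voiced, gain, a, f0_hz=100.0):
    """Voiced/unvoiced switch, then gain, then the vocal tract filter."""
    source = excitation(n_samples, sample_rate, voiced, f0_hz)
    return all_pole_filter([gain * s for s in source], a)
```

For example, `synthesize(800, 8000, True, 0.5, [-0.9])` would produce 100 ms of a buzz-like signal at 100 Hz; real vocal tract coefficients would come from analysis of recorded speech.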

Speech Model: Klatt

Parameters Source characterization – Voiced or unvoiced – Frequency of periodic source – Energy distribution of a noise source Vocal tract model – Resonant frequencies (formants), antiresonant frequencies, and their bandwidths
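
To show how a resonant frequency and bandwidth become a filter, here is a sketch of the second-order digital resonator commonly used in Klatt-style formant synthesizers. The function names are hypothetical and the slides do not show Xkl's own implementation; the coefficient formulas are the standard ones for this resonator, normalized to unity gain at 0 Hz.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # makes the gain at 0 Hz equal to 1
    return a, b, c


def resonate(x, freq_hz, bandwidth_hz, sample_rate):
    """Filter the sequence x through one formant resonator."""
    a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for xn in x:
        yn = a * xn + b * y1 + c * y2
        out.append(yn)
        y2, y1 = y1, yn
    return out
```

Several such resonators, one per formant, can be chained to approximate a vocal tract transfer function; antiresonances would need a separate antiresonator, which is not shown here.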

Background - Xkl Developed in-house at MIT by Dennis Klatt in the 1980s; originally a command-line tool on VAX systems. Later ported to UNIX, with an X11/Motif GUI added. Currently runs on Linux. Praat has become a very versatile alternative to Xkl, but Xkl has functionality that Praat does not.

Xkl – Features Allows users to easily examine speech signals in fine detail. Automatically computes the DFT and spectrogram. Can perform a variety of computations not available in other packages: – Averages spectra over time or waveforms – Smooths the spectrum
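
The computations named above can be illustrated with a small, deliberately naive Python sketch. Everything here is an assumption for illustration: the Hamming window, the moving-average smoother, and the function names are not taken from Xkl, whose smoothing method the slides do not describe.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitude(frame):
    """Magnitude of the windowed DFT for bins 0..N/2 (direct O(N^2) sum)."""
    n = len(frame)
    xw = [s * w for s, w in zip(frame, hamming(n))]
    return [
        abs(sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def average_spectra(signal, frame_len, hop):
    """Average the magnitude spectra of successive frames over time."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [dft_magnitude(signal[i:i + frame_len]) for i in starts]
    if not spectra:
        raise ValueError("signal is shorter than one frame")
    return [sum(col) / len(spectra) for col in zip(*spectra)]


def smooth(spectrum, width=3):
    """Moving-average smoothing; edges use the samples that are available."""
    half = width // 2
    out = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out
```

A spectrogram is essentially the list of per-frame spectra before averaging, displayed with time on one axis and frequency on the other.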

Spectrogram and DFT in Xkl [Figure panels: spectrogram; DFT and smoothed spectrum]

Design Requirements Users surveyed wanted: 1. A pitch period estimator 2. An improved labeling system 3. Portability – Compatibility with multiple operating systems – Support for more audio file formats

Pitch Detection Pitch reflects how rapidly the vocal tract is excited by periodic pulses. It carries lexical and prosodic information. During computation we must decide whether speech is voiced or unvoiced. – Errors often occur during transitions between sounds. – Errors depend on the type of pitch detector being used.
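
A minimal sketch of a single-frame pitch estimate with a voiced/unvoiced decision follows. It uses plain normalized autocorrelation, which is one common approach and much simpler than a production detector; the function name, the search range, and the threshold value are hypothetical choices for illustration.

```python
def estimate_f0(frame, sample_rate, fmin=75.0, fmax=500.0, voicing_threshold=0.45):
    """Return an F0 estimate in Hz, or None if the frame looks unvoiced."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]           # remove any DC offset
    r0 = sum(s * s for s in x)              # energy = autocorrelation at lag 0
    if r0 == 0.0:
        return None                         # silence: nothing to measure
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), n - 1)
    best_lag, best_r = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / r0
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None or best_r < voicing_threshold:
        return None                         # weak periodicity: call it unvoiced
    return sample_rate / best_lag
```

A frame cut from a transition between sounds tends to have a weak or ambiguous autocorrelation peak, which is where this kind of estimator is most likely to make the errors described above.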

Pitch Detection: Design There are many different pitch detectors. Praat's was chosen because it: – Outperforms other detectors (SNR, HNR) – Is readily available

Pitch Detection: Algorithm [Figure: Tone 4] [Flowchart, Praat Pitch Detector: Remove Hanning Window Sidelobe; Compute Global Peak Value; Process Frame To Obtain Local Optimal Choices; Find Path with Globally Minimum Cost] – Time-domain, autocorrelation method. – Frame processing determines the strongest pitch candidates, including unvoiced. – The Viterbi algorithm minimizes the global cost over the candidates.
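
The path-finding step can be sketched as a small Viterbi search. This is an illustration under assumptions, not Praat's or Xkl's code: each frame is assumed to supply a list of `(frequency_hz, strength)` candidates with 0.0 Hz meaning unvoiced, the path cost is taken as the sum of transition costs minus the sum of candidate strengths, and the two cost constants are illustrative values.

```python
import math


def transition_cost(f_prev, f_next, octave_jump_cost=0.35, voiced_unvoiced_cost=0.14):
    """Cost of moving between two candidates in adjacent frames."""
    if f_prev == 0.0 and f_next == 0.0:
        return 0.0                          # unvoiced to unvoiced
    if f_prev == 0.0 or f_next == 0.0:
        return voiced_unvoiced_cost         # voicing change
    return octave_jump_cost * abs(math.log2(f_prev / f_next))


def best_path(frames):
    """Pick one candidate per frame so that the total cost is minimal."""
    if not frames:
        return []
    cost = [-strength for _, strength in frames[0]]
    back = []                               # back[t-1][j]: best predecessor of j
    for t in range(1, len(frames)):
        new_cost, pointers = [], []
        for f_next, s_next in frames[t]:
            options = [
                cost[j] + transition_cost(f_prev, f_next)
                for j, (f_prev, _) in enumerate(frames[t - 1])
            ]
            j_best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[j_best] - s_next)
            pointers.append(j_best)
        cost = new_cost
        back.append(pointers)
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):         # trace the best path backwards
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [frames[t][j][0] for t, j in enumerate(path)]
```

The point of the global search is that a locally strongest candidate, such as an octave error in a single frame, can be rejected when its neighbours make it expensive.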

Labeling Support for reading and saving TextGrid files, for interaction with Praat [1]. – Tiers for grouping labels Labels should be displayed in the same window as the waveform. – Different from Xkl's separate-window layout
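
As an illustration of what saving labels in tiers might involve, the sketch below builds the text of a TextGrid with interval tiers. It assumes Praat's long text format and should be checked against Praat's own documentation before use; the function names and the tuple layout are hypothetical and are not Xkl's API.

```python
def _quote(text):
    """Wrap text in double quotes, doubling any embedded double quotes."""
    return '"' + text.replace('"', '""') + '"'


def textgrid_text(xmin, xmax, tiers):
    """tiers: list of (tier_name, [(start, end, label), ...]) interval tiers."""
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        f'xmin = {xmin}',
        f'xmax = {xmax}',
        'tiers? <exists>',
        f'size = {len(tiers)}',
        'item []:',
    ]
    for t, (name, intervals) in enumerate(tiers, start=1):
        lines += [
            f'    item [{t}]:',
            '        class = "IntervalTier"',
            f'        name = {_quote(name)}',
            f'        xmin = {xmin}',
            f'        xmax = {xmax}',
            f'        intervals: size = {len(intervals)}',
        ]
        for i, (start, end, label) in enumerate(intervals, start=1):
            lines += [
                f'        intervals [{i}]:',
                f'            xmin = {start}',
                f'            xmax = {end}',
                f'            text = {_quote(label)}',
            ]
    return '\n'.join(lines) + '\n'
```

Grouping labels by tier, for example one tier for words and one for phones, keeps each level of annotation independent while sharing the same time axis.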

Labeling

Portability PortAudio – a cross-platform audio library – supports most operating systems – simplifies software maintenance Runs on OS X – since OS X natively runs X11 Added support for opening Microsoft .wav files.
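
To illustrate what opening a Microsoft .wav file involves, here is a short sketch using Python's standard `wave` module. Xkl itself is not written in Python, so this only shows the idea; the helper names are hypothetical and the reader handles 16-bit PCM only.

```python
import io
import struct
import wave


def write_wav_bytes(samples, sample_rate):
    """Encode 16-bit mono samples as an in-memory WAV file (for the demo)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                   # 2 bytes = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def read_wav(source):
    """Return (sample_rate, samples of the first channel) from a WAV source."""
    with wave.open(source, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = w.getnchannels()
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, list(values[::n_channels])  # keep channel 0 only
```

`read_wav` accepts either a file name or an open binary file object, so the same function can be exercised on `io.BytesIO(write_wav_bytes(...))` without touching the disk.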

Future Work Deploy to users for feedback. Finalize: – Labeling – Pitch Contour Fix bugs and add small features.

Software Used • Eclipse – Integrated Development Environment. • Doxygen – A documentation generation system. • SVN – A version control system. • Open Motif – X Window System window manager and widget library. • GDB – The GNU debugger. • GNU build system on OS X. • PortAudio – A multiplatform audio library.

Thank you for your attention. Special thanks to: Professor Helen Hanson Dr. Stefanie Shattuck-Hufnagel (MIT) Dennis H. Klatt Survey Participants ECE Department

Questions?

References 1: Paul Boersma & David Weenink (2009). Praat: doing phonetics by computer (Version ) [Computer program]. Retrieved May 1, 2009. 2: Paul Boersma (1993). Accurate Short-term Analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound.