Topic: Pitch Extraction


Kishore Prahallad
Email: skishore@cs.cmu.edu
Carnegie Mellon University & International Institute of Information Technology Hyderabad
Speech Technology - Kishore Prahallad (skishore@cs.cmu.edu)

Objective of this Lecture
Describe how pitch is extracted using the autocorrelation function.

Pitch Period
The time taken to complete one cycle of vibration of the vocal folds.
The reciprocal of the pitch period is the fundamental frequency, F0; "pitch" is often used loosely for either quantity.
Measured as the time difference between two successive major peaks in the voiced speech signal.
Pitch is observed only in voiced regions.

Pitch Marks
Pitch extraction is done with an autocorrelation-based algorithm.
Knowing the implementation details may be necessary to tune the pitch extraction.
Tune the parameters of pitch extraction to the specific speaker (your voice talent).

Algorithm: Autocorrelation-Based Pitch Extraction
1. Filter the speech signal.
The pitch range is 40–400 Hz (40–200 Hz for male and 200–400 Hz for female speakers).
Use a low-pass filter to retain the frequency components below 800 Hz.
Use a high-pass filter to retain the frequency components above 40 Hz.
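The slide does not say which kind of filter to use. As one possible sketch, the low-pass and high-pass stages can be combined into a single FIR band-pass built from two Hamming-windowed sinc low-pass kernels. The function names, the 201-tap length and the Hamming window are assumptions made for illustration, and plain Python is used to keep the example dependency-free.

```python
import math

def lowpass_kernel(cutoff_hz, fs, num_taps):
    """Hamming-windowed sinc low-pass kernel, normalised to unity gain at DC.

    num_taps is assumed to be odd so the kernel has a centre tap.
    """
    mid = (num_taps - 1) / 2.0
    fc = cutoff_hz / fs  # cutoff as a fraction of the sampling rate
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response; the x == 0 limit is 2 * fc.
        ideal = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def bandpass_kernel(low_hz, high_hz, fs, num_taps=201):
    """Band-pass kernel formed as low-pass(high_hz) minus low-pass(low_hz)."""
    upper = lowpass_kernel(high_hz, fs, num_taps)
    lower = lowpass_kernel(low_hz, fs, num_taps)
    return [u - l for u, l in zip(upper, lower)]

def fir_filter(signal, kernel):
    """Same-length convolution, treating samples outside the signal as zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, h in enumerate(kernel):
            k = i + half - j
            if 0 <= k < len(signal):
                acc += h * signal[k]
        out.append(acc)
    return out

# Assumed setup: 16000 Hz sampling, 40-800 Hz pass band as on the slide.
kernel = bandpass_kernel(40.0, 800.0, 16000)
```

Because both low-pass kernels are normalised to the same DC gain, their difference rejects DC exactly. With only 201 taps at 16000 Hz the roll-off around the 40 Hz edge is gradual, so a longer kernel or a different design may be preferable when low-frequency noise matters.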

Algorithm (continued)
2. Divide the signal into shorter analysis windows.
3. For each short analysis window, take the autocorrelation of the signal.
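Step 2 can be sketched as follows. The 40 ms window and 10 ms hop are assumed values, not taken from the slides; the window is chosen to be longer than the longest pitch period in the search range (25 ms at 40 Hz). `split_into_frames` is a hypothetical helper name.

```python
def split_into_frames(signal, frame_len, hop_len):
    """Return overlapping frames of frame_len samples, hop_len samples apart.

    A trailing partial frame is dropped.
    """
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop_len
    return frames

fs = 16000                     # sampling frequency used in the slides
frame_len = fs * 40 // 1000    # assumed 40 ms window -> 640 samples
hop_len = fs * 10 // 1000      # assumed 10 ms hop    -> 160 samples
```

Each frame then yields one pitch estimate, so the hop length sets the time resolution of the resulting pitch track.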

Autocorrelation Function r[k]
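The slide names r[k] without spelling it out here. A common short-time definition, given as an assumption rather than as the slide's exact form, for an analysis window s[n] of N samples is:

```latex
r[k] = \sum_{n=0}^{N-1-k} s[n]\, s[n+k], \qquad 0 \le k \le N-1
```

Under this definition r[0] is the energy of the window and |r[k]| never exceeds r[0], which is why the first major peak always sits at lag 0.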

Autocorrelation of a Vowel Signal

Autocorrelation of a Vowel Signal
Remember: the pitch period is the time difference between two major peaks.
Here the first peak is at lag 0 (r[0]) and the second major peak is at lag 110 (r[110]), so the pitch period is 110 samples.
Pitch period in seconds: t = 110/16000 ≈ 0.0069 s, where 16000 Hz is the sampling frequency.
Pitch in Hz = 1/t = 16000/110 ≈ 145 Hz.
A pitch of 145 Hz falls in the 40–200 Hz range, which suggests male speech.
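The arithmetic on this slide can be written as a small sketch; `lag_to_pitch` is a hypothetical helper name.

```python
def lag_to_pitch(lag_samples, fs):
    """Convert an autocorrelation peak lag (in samples) to (period in s, pitch in Hz)."""
    period_s = lag_samples / fs
    f0_hz = 1.0 / period_s
    return period_s, f0_hz

# Values from the slide: peak at lag 110, sampling frequency 16000 Hz.
period_s, f0_hz = lag_to_pitch(110, 16000)
print(f"period = {period_s * 1000:.3f} ms, pitch = {f0_hz:.1f} Hz")
```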

How do you know r[110] is the second major peak?
To find the major peak closest to r[0], we run a peak-picking algorithm on the autocorrelation function.
The peak-picking algorithm searches for a peak in a *specified* region of lags. This region is the tunable parameter.
40–200 Hz range: r[400] to r[80] for male speakers.
200–400 Hz range: r[80] to r[40] for female speakers.
The indices in r[] are calculated for a 16000 Hz sampling frequency (lag = sampling frequency / pitch in Hz).
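A minimal sketch of range-restricted peak picking follows. It simplifies "peak" to "largest autocorrelation value inside the lag range", and the function names and the synthetic three-harmonic test frame are assumptions made for illustration.

```python
import math

def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[k] = sum_n frame[n] * frame[n + k], k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def pick_pitch_lag(r, fs, f_min_hz, f_max_hz):
    """Return the lag with the largest r[k] in the lag range implied by the pitch range."""
    lag_min = math.ceil(fs / f_max_hz)                     # highest pitch -> shortest lag
    lag_max = min(math.floor(fs / f_min_hz), len(r) - 1)   # lowest pitch -> longest lag
    return max(range(lag_min, lag_max + 1), key=lambda k: r[k])

fs = 16000
period = 110  # samples, matching the vowel example in the slides
w = 2.0 * math.pi / period
# Synthetic voiced-like frame: six periods of a three-harmonic waveform.
frame = [math.sin(w * n) + 0.5 * math.sin(2 * w * n) + 0.25 * math.sin(3 * w * n)
         for n in range(6 * period)]

r = autocorrelation(frame, max_lag=fs // 40)       # lags 0..400
lag = pick_pitch_lag(r, fs, 40.0, 200.0)           # male range: r[80] to r[400]
f0_hz = fs / lag
print(lag, round(f0_hz, 1))
```

For this synthetic frame the search is expected to land on lag 110, i.e. about 145 Hz. Restricting the range matters: on real speech, a search that started at lag 1 would tend to lock onto the slope next to r[0] rather than the pitch peak.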

Algorithm (continued)
2. Divide the signal into shorter analysis windows.
3. For each short analysis window:
3.1 Take the autocorrelation of the signal.
3.2 Pick the second major peak and obtain the pitch period in seconds.
Note: This is a simple yet good pitch extraction algorithm. (We use this method for all our discussion.)
There are *many* other ways of extracting pitch from the speech signal, e.g. FFT-based and linear-prediction-based methods.