Automatic Pitch Tracking January 16, 2013

The Plan for Today
One announcement: Starting on Monday of next week, we’ll meet in Craigie Hall D 428. We’ll be working on intonation transcription…
The plan for today: Automatic Pitch Tracking
On Friday:
1. (Brief) suprasegmentals review
2. The basics of English intonation

The Digitization of Pitch
Praat can give us a representation of speech that looks like:
The blue line represents the fundamental frequency (F0) of the speaker’s voice. Also known as a pitch track.
How can we automatically “track” F0 in a sample of speech?

Pitch Tracking
Voicing: Air flow through vocal folds
Rapid opening and closing due to Bernoulli Effect
Each cycle sends an acoustic shockwave through the vocal tract…
…which takes the form of a complex wave.
The rate at which the vocal folds open and close becomes the fundamental frequency (F0) of a voiced sound.

Voicing Bars

Individual glottal pulses

Voicing = Complex Wave
Note: voicing is not perfectly periodic.
…there is always some random variation from one cycle to the next.
How can we measure the fundamental frequency of a complex wave?

The basic idea: figure out the period between successive cycles of the complex wave. Fundamental frequency = 1 / period duration = ???
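As a quick arithmetic check of the relation above, here is a minimal Python sketch. The function name and the 5 ms example period are illustrative assumptions, not values from the lecture.

```python
def f0_from_period(period_seconds):
    """Return the fundamental frequency in Hz for one cycle lasting period_seconds."""
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_seconds


# Hypothetical example: a glottal cycle that lasts 5 milliseconds.
example_f0 = f0_from_period(0.005)
print(round(example_f0, 3), "Hz")
```

A shorter period gives a higher F0, so a 2.5 ms cycle would come out at roughly twice the frequency of the 5 ms one.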

Measuring F0
To figure out where one cycle ends and the next begins…
The basic idea is to find how well successive “chunks” of a waveform match up with each other.
One period = the length of the chunk that matches up best with the next chunk.
Automatic Pitch Tracking parameters to think about:
1. Window size (i.e., chunk size)
2. Step size
3. Frequency range (= period range)

Window (Chunk) Size Here’s an example of a small window

Window (Chunk) Size Here’s an example of a large(r) window

Initial window of the waveform is compared to another window (of the same duration) at a later point in the waveform

Matching The waveforms in the two windows are compared to see how well they match up. Correlation = measure of how well the two windows match ???

Autocorrelation
The measure of correlation = Sum of the point-by-point products of the two chunks.
The technical name for this is autocorrelation… because two parts of the same wave are being matched up against each other. (“auto” = self)

Autocorrelation Example
Ex: consider window x, with n samples…
What’s its correlation with window y? (Note: window y must also have n samples)
$x_1$ = first sample of window x
$x_2$ = second sample of window x
…
$x_n$ = nth (final) sample of window x
$y_1$ = first sample of window y, etc.
Correlation: $R = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$
The larger R is, the better the correlation.
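The sum of products above can be sketched in a few lines of Python. The function name and the sample values are assumptions made up for illustration; they are not the numbers used in the lecture.

```python
def correlation(x, y):
    """Sum of the point-by-point products of two windows with the same length."""
    if len(x) != len(y):
        raise ValueError("windows must have the same number of samples")
    return sum(a * b for a, b in zip(x, y))


# Hypothetical sample values, chosen only for illustration.
x = [0.5, 1.0, 0.5, -0.5]
y = [0.4, 0.9, 0.6, -0.4]
r = correlation(x, y)
print(round(r, 2))
```

The length check mirrors the note that window y must have the same number of samples as window x.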

By the Numbers
(table: Sample, x, y, product; the sample values are missing here)
Sum of products = -0.48
These two chunks are poorly correlated with each other.

By the Numbers, part 2
(table: Sample, x, z, product; the sample values are missing here)
Sum of products = 1.26
These two chunks are well correlated with each other (or at least better than the previous pair).
Note: matching peaks count for more than matches close to 0.
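To see why matching peaks count for more than matches close to 0, it may help to look at the individual products. This is a small Python sketch with invented sample values (the lecture's own numbers are not reproduced here); the variable names are hypothetical.

```python
def products(x, y):
    """Point-by-point products of two equal-length windows."""
    return [a * b for a, b in zip(x, y)]


# Invented values: x has two samples near 0 and two near a peak.
x = [0.1, 0.9, 0.1, -0.9]
z = [0.1, 0.9, 0.1, -0.9]    # lines up with x
y = [-0.1, -0.9, -0.1, 0.9]  # x turned upside down, so nothing lines up

matched = products(x, z)
mismatched = products(x, y)

print("matched products:   ", [round(p, 2) for p in matched], "sum =", round(sum(matched), 2))
print("mismatched products:", [round(p, 2) for p in mismatched], "sum =", round(sum(mismatched), 2))
```

In the matched pair, each sample near a peak contributes about 0.81 to the sum while each sample near 0 contributes about 0.01, so the peaks dominate the result. The mismatched pair gives a negative sum, in the same spirit as the poorly correlated chunks.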

Back to (Digital) Reality The waveforms in the two windows are compared to see how well they match up. Correlation = measure of how well the two windows match ??? These two windows are poorly correlated

Next: the pitch tracking algorithm moves further down the waveform and grabs a new window

The distance the algorithm moves forward in the waveform is called the step size “step”

Matching, again The next window gets compared to the original. ??? These two windows are also poorly correlated

The algorithm keeps chugging and, eventually… another “step”

Matching, again The best match is found. ??? These two windows are highly correlated

The fundamental period can be determined by calculating the length of time between the start of window 1 and the start of (well correlated) window 2.
period
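The search described over the last few slides can be sketched for a single starting window. This is a simplified illustration, not Praat's actual implementation: the function names, the synthetic 200 Hz sine wave, the 8000 Hz sampling rate, and the lag limits are all assumptions, and practical pitch trackers usually normalize the correlation as well.

```python
import math


def correlation(x, y):
    """Sum of the point-by-point products of two windows."""
    return sum(a * b for a, b in zip(x, y))


def best_lag(samples, start, window, min_lag, max_lag):
    """Return the offset (in samples) of the later window that best matches the first one."""
    reference = samples[start:start + window]
    best, best_r = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        candidate = samples[start + lag:start + lag + window]
        if len(candidate) < window:
            break  # ran off the end of the waveform
        r = correlation(reference, candidate)
        if r > best_r:
            best, best_r = lag, r
    return best


# Assumed test signal: a pure 200 Hz sine sampled at 8000 Hz (period = 40 samples).
SAMPLE_RATE = 8000
samples = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]

lag = best_lag(samples, start=0, window=120, min_lag=20, max_lag=60)
period_seconds = lag / SAMPLE_RATE
f0 = 1.0 / period_seconds
print(lag, "samples;", round(period_seconds, 4), "s;", round(f0, 1), "Hz")
```

The distance between the start of the first window and the start of its best match is the period estimate, and F0 follows as 1 / period.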

Mopping up
Frequency is 1 / period
Q: How many possible periods does the algorithm need to check?
Frequency range (default in Praat: 75 to 600 Hz)
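One way to make the question concrete is to turn the frequency range into a range of candidate periods measured in samples. The sketch below assumes that the algorithm tries every whole-sample offset between the shortest and longest period; the function name and the 44100 Hz sampling rate are illustrative assumptions.

```python
import math


def lag_range(sample_rate, f0_min=75.0, f0_max=600.0):
    """Convert a frequency range in Hz into (shortest, longest) candidate period in samples."""
    shortest = math.ceil(sample_rate / f0_max)   # highest frequency = shortest period
    longest = math.floor(sample_rate / f0_min)   # lowest frequency = longest period
    return shortest, longest


shortest, longest = lag_range(44100)
n_candidates = longest - shortest + 1
print(shortest, longest, n_candidates)
```

Narrowing the frequency range shrinks the number of candidate periods, which is why the range also controls how much work the tracker has to do.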

Moving on Another comparison window is selected and the whole process starts over again.

The algorithm ultimately spits out a pitch track. This one shows you the F0 value at each step.
(figure: pitch track of “Uhm, I would like a flight to Seattle from Albuquerque”)
Thanks to Chilin Shih for making these materials available
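Putting the pieces together, a pitch track is just one F0 estimate per step. The following Python sketch makes that loop explicit under several assumptions: the function names are hypothetical, the test signal is a synthetic sine that jumps from 100 Hz to 160 Hz, and the window, step, and lag limits were picked to suit that signal rather than taken from Praat.

```python
import math


def correlation(x, y):
    """Sum of the point-by-point products of two windows."""
    return sum(a * b for a, b in zip(x, y))


def best_lag(samples, start, window, min_lag, max_lag):
    """Offset (in samples) of the later window that best matches the window at start."""
    reference = samples[start:start + window]
    best, best_r = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        candidate = samples[start + lag:start + lag + window]
        if len(candidate) < window:
            break
        r = correlation(reference, candidate)
        if r > best_r:
            best, best_r = lag, r
    return best


def pitch_track(samples, sample_rate, window, step, min_lag, max_lag):
    """Return a list of (time in seconds, F0 in Hz), one entry per step."""
    track = []
    start = 0
    # Stop once the longest candidate comparison would run past the end.
    while start + max_lag + window <= len(samples):
        lag = best_lag(samples, start, window, min_lag, max_lag)
        track.append((start / sample_rate, sample_rate / lag))
        start += step  # move forward by one step and start over
    return track


# Assumed test signal: 0.2 s at 100 Hz followed by 0.2 s at 160 Hz, sampled at 8000 Hz.
SAMPLE_RATE = 8000
samples = [
    math.sin(2 * math.pi * (100 if n < 1600 else 160) * n / SAMPLE_RATE)
    for n in range(3200)
]

track = pitch_track(samples, SAMPLE_RATE, window=400, step=160, min_lag=45, max_lag=90)
for time, f0 in track:
    print(round(time, 2), round(f0, 1))
```

Frames that straddle the jump between the two frequencies have no single clean period, so their values should be read with caution; the frames that sit entirely inside one half of the signal recover that half's frequency.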

Pitch Tracking in Praat Play with F0 range. Create Pitch Object. Also go To Manipulation…Pitch. Also check out:

Summing Up
Pitch tracking uses three parameters:
1. Window size
   Ensures reliability
   In Praat, the window size is always three times the longest possible period.
   E.g.: 3 × 1/75 = 0.04 sec.
2. Step size
   For temporal precision
3. Frequency range
   Reduces computational load
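The window-size rule stated above is easy to check with a short calculation. This Python sketch only restates the arithmetic from the slide; the function name and the 44100 Hz sampling rate are assumptions, and it does not call Praat itself.

```python
def window_size(lowest_f0_hz, sample_rate, n_periods=3):
    """Window length as n_periods times the longest possible period.

    Returns (duration in seconds, length in samples).
    """
    duration = n_periods / lowest_f0_hz
    n_samples = round(n_periods * sample_rate / lowest_f0_hz)
    return duration, n_samples


duration, n_samples = window_size(75, 44100)
print(round(duration, 3), "sec;", n_samples, "samples")
```

Lowering the bottom of the frequency range makes the longest possible period longer, so the window grows with it.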

Deep Thought Questions
What might happen if:
The shortest period checked is longer than the fundamental period?
AND two fundamental periods fit inside a window?
Potential Problem #1: Pitch Halving
The pitch tracker thinks the fundamental period is twice as long as it is in reality.
→ It estimates F0 to be half of its actual value
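Pitch halving can be reproduced with a toy example. In this Python sketch, everything is an assumption made for illustration: the function names, the synthetic 200 Hz sine wave, the 8000 Hz sampling rate, and both search ranges. The second search never checks the true period of 40 samples, but it does reach two periods (80 samples), which also matches well.

```python
import math


def correlation(x, y):
    """Sum of the point-by-point products of two windows."""
    return sum(a * b for a, b in zip(x, y))


def estimate_f0(samples, sample_rate, window, min_lag, max_lag):
    """Pick the best-matching offset within [min_lag, max_lag] and convert it to Hz."""
    reference = samples[:window]
    best, best_r = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        candidate = samples[lag:lag + window]
        if len(candidate) < window:
            break
        r = correlation(reference, candidate)
        if r > best_r:
            best, best_r = lag, r
    return sample_rate / best


# Assumed test signal: a pure 200 Hz sine sampled at 8000 Hz (true period = 40 samples).
SAMPLE_RATE = 8000
samples = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]

# Search range that includes the true period of 40 samples.
good = estimate_f0(samples, SAMPLE_RATE, window=120, min_lag=20, max_lag=60)

# Shortest period checked (60 samples) is longer than the true period,
# so the best match is found two periods away, at 80 samples.
halved = estimate_f0(samples, SAMPLE_RATE, window=120, min_lag=60, max_lag=100)

print(round(good, 1), "Hz vs.", round(halved, 1), "Hz")
```

The second estimate comes out at half the true F0 because a periodic wave matches itself just as well after two cycles as after one.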

Pitch Halving
(figure label: pitch is halved)
Check out normal file in Praat.