1
Polyphonic Music Transcription Using a Dynamic Graphical Model
Barry Rafkind
E6820 Speech and Audio Signal Processing
Wednesday, March 9th, 2005
2
Presentation Outline
–Motivating Music Transcription
–My Project Proposal
–Project Timeline
3
Motivating Music Transcription
Given a musical recording, we wish to obtain a MIDI score for:
–Performance (convert the MIDI to a music score)
–Analysis (evaluate intonation, or count missed or incorrect notes; useful for music education)
–Comparison with other music (copyright infringement / search)
–Replay on MIDI synthesizers (use different instruments, change settings, overlay tracks, etc.)
4
Recent Previous Work
–Multi-Instrument Musical Transcription Using A Dynamic Graphical Model, Michael Jordan, 2004
–Automatic Transcription of Piano Music, Christopher Raphael, 2002, Univ. of Massachusetts, Amherst
–Polyphonic Pitch Extraction, Graham Poliner, E6820 Speech & Audio Signal Processing, Spring 2004
–Many more: try searching Google for PDF documents with the keywords "music transcription"
6
My Project Proposal
Jordan presents a multi-instrument transcription system capable of listening to a recording in which two or more instruments are playing, and identifying both the notes that were played and the instruments that played them. The system models two musical instruments, each capable of playing at most one note at a time.
My goal: implement and improve upon Jordan’s Dynamic Graphical Model (DGM) approach.
–Whereas he made assumptions about how to model each instrument, I want to let the system learn what to look for by starting with a general model.
–Jordan uses a reduced set of states and parameters for efficiency. I will try to use a larger model if possible.
7
Dynamic Graphical Model (DGM): what is it?
Hidden state variables correspond to a discrete set of allowable intensity and pitch values.
8
Key Points in Jordan’s Approach
–Use of a note-event timbre model that includes both a spectral model (in frequency) and a dynamic intensity-versus-time model (a “time envelope model”).
–We will perform inference (using the Viterbi algorithm) on the DGM to compute the path of maximum posterior probability, and from it find explicit note-on events (note locations).
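The Viterbi step can be illustrated with a minimal sketch. It assumes the DGM has been flattened into a single chain of discrete states, with log-probability tables supplied by the caller. The names `viterbi`, `note_onsets`, and `silent_state` are illustrative and do not come from Jordan's system.

```python
import math

def viterbi(log_init, log_trans, log_obs):
    """Return the maximum-posterior state path for a discrete-state chain.

    log_init[s]     : log P(state_0 = s)
    log_trans[r][s] : log P(state_t = s | state_{t-1} = r)
    log_obs[t][s]   : log P(observation_t | state_t = s)
    """
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_obs)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_obs[t][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def note_onsets(path, silent_state=0):
    """Frames where the decoded path leaves the (assumed) silent state."""
    return [t for t in range(len(path))
            if path[t] != silent_state
            and (t == 0 or path[t - 1] == silent_state)]
```

Working in the log domain avoids numerical underflow on long recordings. Reading note-on events off the decoded path as transitions out of a silent state is one possible convention, assumed here for illustration.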
9
Intensity Transition Model for Violin
10
Intensity Transition Model for Piano
11
General Intensity Transition Model
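One way a general intensity transition model might be learned rather than hand-designed is sketched below. It assumes labeled sequences of discrete intensity levels are available, allows every level-to-level transition, and estimates probabilities by smoothed counting. The function name, the `pseudocount` parameter, and the counting scheme are assumptions for illustration; the slides do not specify how the general model is trained.

```python
def estimate_transitions(sequences, n_levels, pseudocount=1.0):
    """Estimate P(intensity_t | intensity_{t-1}) from labeled sequences.

    sequences   : list of lists of integer intensity levels in [0, n_levels)
    pseudocount : smoothing mass added to every transition, so the model
                  starts out fully connected (the "general" model)
    """
    counts = [[pseudocount] * n_levels for _ in range(n_levels)]
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1.0
    # Normalize each row into a probability distribution.
    return [[c / sum(row) for c in row] for row in counts]
```

With no training data every row is uniform. As data accumulates, instrument-specific structure such as a piano's decaying envelope would show up as rows that concentrate on a few successor levels.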
12
Pitch Transition Model
Build a conditional probability distribution for the pitch state as a function of both the previous pitch state and the previous intensity state. Transition probabilities are also based on Shepard's pitch helix, which defines a psychoacoustic distance between pitches.
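A minimal sketch of such a distribution follows. It assumes pitches are MIDI note numbers placed on a helix (one turn per octave) and that transition weights decay exponentially with helix distance. The helix geometry parameters, the `hold_prob` rule for a note that is still sounding, and the `scale` constant are illustrative assumptions, not values from Jordan's model.

```python
import math

def helix_point(midi_pitch, radius=1.0, rise_per_octave=1.0):
    """Place a pitch on a helix: angle encodes chroma, height encodes register."""
    angle = 2.0 * math.pi * (midi_pitch % 12) / 12.0
    return (radius * math.cos(angle),
            radius * math.sin(angle),
            rise_per_octave * midi_pitch / 12.0)

def helix_distance(p, q):
    """Euclidean distance between two pitches on the helix."""
    a, b = helix_point(p), helix_point(q)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pitch_transition_row(prev_pitch, prev_intensity, pitches,
                         hold_prob=0.95, scale=1.0):
    """P(pitch_t | prev_pitch, prev_intensity) over the list `pitches`.

    prev_pitch must be an element of `pitches`.
    """
    weights = [math.exp(-helix_distance(prev_pitch, q) / scale) for q in pitches]
    total = sum(weights)
    helix = [w / total for w in weights]
    if prev_intensity == 0:
        # Previous frame was silent: a new note may start on any pitch.
        return helix
    # Note still sounding: mostly stay on the same pitch (assumed rule).
    return [hold_prob * (1.0 if q == prev_pitch else 0.0) + (1.0 - hold_prob) * h
            for q, h in zip(pitches, helix)]
```

With these default geometry values an octave leap is closer than a tritone, which is the kind of perceptual relationship the helix is meant to capture.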
13
Observation Model (explains the sound)
Model the spectrum of a harmonic musical signal as a series of narrow bump functions that are harmonically spaced. That is, conditional on the fundamental frequency Pitch(t) of the musical signal, the spectrum consists of a series of bump functions located at integer multiples of Pitch(t). Each bump function is given a scale parameter alpha(n) that can depend on Pitch(t), because the relative spectral content of an instrument can depend on what pitch is being played. The intensity envelope at time t scales all of the harmonics.
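The expected spectrum described above can be sketched as follows. The Gaussian bump shape and the `bump_width_hz` parameter are assumptions; the slides say only that the bumps are narrow. The caller is expected to pass the list of `alphas` appropriate for the current pitch.

```python
import math

def harmonic_spectrum(freqs_hz, f0_hz, alphas, intensity, bump_width_hz=10.0):
    """Expected magnitude spectrum for one note.

    freqs_hz  : frequencies (Hz) at which to evaluate the model
    f0_hz     : fundamental frequency, i.e. Pitch(t)
    alphas    : alphas[n-1] is the scale alpha(n) of the n-th harmonic
    intensity : envelope value at time t; scales every harmonic
    """
    spectrum = []
    for f in freqs_hz:
        value = 0.0
        for n, alpha in enumerate(alphas, start=1):
            # Gaussian bump centered on the n-th integer multiple of f0.
            z = (f - n * f0_hz) / bump_width_hz
            value += alpha * math.exp(-0.5 * z * z)
        spectrum.append(intensity * value)
    return spectrum
```

In a full system this expected spectrum would be compared with the observed short-time spectrum to score each hidden state. The comparison function is left out because the slides do not describe it.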
18
Evaluation Metrics
–Note Error Rate (based on “minimum edit distance” in speech) = 100 x (Insertions + Substitutions + Deletions) / (Total Number of Notes in Score). We want to minimize this.
–Dixon Success Score = 100 x Correct Notes / (Correct Notes + False Positives + Deletions). We want to maximize this.
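Both metrics can be computed with a short sketch, assuming the reference score and the transcription are each reduced to a sequence of note labels (for example MIDI note numbers in onset order). The function names are illustrative. The minimum edit distance equals the smallest total of insertions, substitutions, and deletions needed to turn the transcription into the reference.

```python
def edit_distance(reference, hypothesis):
    """Minimum number of insertions, substitutions and deletions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_note in enumerate(reference, start=1):
        cur = [i]
        for j, hyp_note in enumerate(hypothesis, start=1):
            cost = 0 if ref_note == hyp_note else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]

def note_error_rate(reference, hypothesis):
    """100 x (insertions + substitutions + deletions) / notes in score."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)

def dixon_score(correct, false_positives, deletions):
    """100 x correct / (correct + false positives + deletions)."""
    return 100.0 * correct / (correct + false_positives + deletions)
```

For example, a four-note reference transcribed with one wrong note and one missed note gives a note error rate of 50. The note error rate can exceed 100 when there are many insertions, whereas the Dixon score always lies between 0 and 100.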
20
Seven Weeks Left
3/14 - Collect MIDI data and convert to WAV audio; understand DGM
3/21 - Start building / understanding graphical models
3/28 - Continue building / understanding graphical models
4/04 - Finish building / understanding graphical models
4/11 - Evaluate results / fix bugs
4/18 - Try new data / fix bugs; begin preparing final presentation
4/25 - Finish preparing final presentation
4/27 - Final presentation in class