Polyphonic Music Transcription Using A Dynamic Graphical Model Barry Rafkind E6820 Speech and Audio Signal Processing Wednesday, March 9th, 2005.


Presentation Outline Motivating Music Transcription My Project Proposal Project Timeline

Motivating Music Transcription Given a musical recording, we wish to obtain a MIDI score for: –Performance (convert MIDI to a music score) –Analysis (evaluate intonation or count missed or incorrect notes - useful for music education) –Comparison with other music (copyright infringement / search) –Replay on MIDI synthesizers (use different instruments / change settings / overlay tracks, etc.)

Recent Previous Work –Multi-Instrument Musical Transcription Using A Dynamic Graphical Model, Michael Jordan, 2004 –Automatic Transcription of Piano Music, Christopher Raphael, 2002, Univ. of Massachusetts, Amherst –Polyphonic Pitch Extraction, Graham Poliner, E6820 Speech & Audio Signal Processing, Spring 2004 –Many, many more; try searching Google for PDF documents with the keywords “music transcription”

Presentation Outline Motivating Music Transcription My Project Proposal Project Timeline

My Project Proposal Jordan presents a multi-instrument transcription system capable of listening to a recording in which two or more instruments are playing and identifying both the notes that were played and the instruments that played them. The system models two musical instruments, each capable of playing at most one note at a time. My goal: implement and improve upon Jordan’s Dynamic Graphical Model (DGM) approach. Whereas he made assumptions about how to model each instrument, I want to let the system learn what to look for by starting with a general model. Jordan uses a reduced set of states and parameters for efficiency; I will try to use a larger model if computationally feasible.

Dynamic Graphical Model (DGM) - what is it? Hidden state variables correspond to a discrete set of allowable intensity and pitch values.

Key Points in Jordan’s Approach –Use of a note-event timbre model that includes both a spectral model (in frequency) and a dynamic intensity-versus-time model (a “time envelope” model). –Perform inference on the DGM using the Viterbi algorithm to compute the path of maximum posterior probability, yielding explicit note-on events (note locations).
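The Viterbi decoding step can be sketched as follows. This is a minimal, illustrative implementation for a plain HMM-style chain in log space (Jordan's actual DGM has a richer, factored state space); the function and variable names here are my own, not from the paper.

```python
import numpy as np

def viterbi(log_init, log_trans, log_obs):
    """Most-probable hidden state path through a chain-structured model.

    log_init:  (S,)   log prior over states
    log_trans: (S, S) log transition probs, log_trans[i, j] = log p(j | i)
    log_obs:   (T, S) log observation likelihoods per time frame
    Returns the state sequence of maximum posterior probability.
    """
    T, S = log_obs.shape
    delta = log_init + log_obs[0]        # best log-prob of any path ending in each state
    back = np.zeros((T, S), dtype=int)   # backpointers to the best predecessor
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # (from-state, to-state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_obs[t]
    # Trace the best path backwards from the final frame.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

For transcription, each state would encode a (pitch, intensity) pair per instrument, and the decoded path gives the note-on locations directly.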

Intensity Transition Model for Violin (figure)

Intensity Transition Model for Piano (figure)

General Intensity Transition Model (figure)

Pitch Transition Model Build a pitch-state conditional probability distribution as a function of both the previous pitch state and the previous intensity state. Transition probabilities are also based on Shepard’s pitch helix, which defines a psychoacoustic distance between pitches.
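A hedged sketch of how helix-based transition probabilities might be built. The helix weights `r` and `h`, the Gaussian decay, and all function names are illustrative assumptions (and the dependence on the previous intensity state is omitted here): the idea is only that transitions become less likely as helix distance grows.

```python
import numpy as np

def helix_distance(p, q, r=1.0, h=0.1):
    """Distance between MIDI pitches p and q on a Shepard-style pitch helix.

    One turn of the helix per octave: chroma (pitch class) is the angle,
    pitch height is the vertical axis. r (radius) and h (height per
    semitone) are illustrative weights, not values from the paper.
    """
    tp, tq = 2 * np.pi * (p % 12) / 12, 2 * np.pi * (q % 12) / 12
    xp = (r * np.cos(tp), r * np.sin(tp), h * p)
    xq = (r * np.cos(tq), r * np.sin(tq), h * q)
    return float(np.linalg.norm(np.subtract(xp, xq)))

def pitch_transition_matrix(pitches, sigma=2.0):
    """Transition probabilities that decay with helix distance; rows sum to 1."""
    d = np.array([[helix_distance(p, q) for q in pitches] for p in pitches])
    w = np.exp(-d ** 2 / (2 * sigma ** 2))
    return w / w.sum(axis=1, keepdims=True)
```

With a small height weight, the octave (same chroma) comes out psychoacoustically closer than the tritone, which is the property the helix is meant to capture.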

Observation Model - explains the sound Model the spectrum of a harmonic musical signal as a series of narrow bump functions that are harmonically spaced. That is, conditional on the fundamental frequency Pitch(t) of the musical signal, we model the spectrum as a series of bump functions located at integer multiples of Pitch(t). Each bump function is given a scale parameter alpha(n) that can depend on Pitch(t), since the relative spectral content of an instrument can depend on what pitch is being played. The intensity envelope at time t scales all of the harmonics.
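A minimal sketch of this bump-function spectrum, assuming Gaussian bumps and a placeholder 1/n weight in place of the learned, pitch-dependent alpha(n):

```python
import numpy as np

def harmonic_spectrum(freqs, f0, intensity, n_harmonics=10, width=10.0):
    """Expected magnitude spectrum under the bump-function observation model.

    Gaussian "bumps" sit at integer multiples of the fundamental f0 (Hz),
    each scaled by a per-harmonic weight alpha_n; the frame intensity
    scales the whole spectrum. alpha_n decays as 1/n here -- a stand-in
    for a learned, pitch-dependent spectral profile.
    """
    spec = np.zeros_like(freqs, dtype=float)
    for n in range(1, n_harmonics + 1):
        alpha_n = 1.0 / n  # hypothetical spectral weights, not learned values
        spec += alpha_n * np.exp(-((freqs - n * f0) ** 2) / (2 * width ** 2))
    return intensity * spec
```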

Evaluation Metrics Note Error Rate (based on “minimum edit distance” in speech recognition) = 100 x (Insertions + Substitutions + Deletions) / Total Number of Notes in Score. We want to minimize this. Dixon Success Score = 100 x Correct Notes / (Correct + False Positives + Deletions). We want to maximize this.
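Both metrics are simple ratios over counted note events; a sketch (function names are mine):

```python
def note_error_rate(insertions, substitutions, deletions, total_notes):
    """NER = 100 * (I + S + D) / total notes in the reference score.
    Lower is better; can exceed 100 when insertions are numerous."""
    return 100.0 * (insertions + substitutions + deletions) / total_notes

def dixon_score(correct, false_positives, deletions):
    """Dixon success score = 100 * correct / (correct + FP + deletions).
    Higher is better; 100 means a perfect transcription."""
    return 100.0 * correct / (correct + false_positives + deletions)
```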

Presentation Outline Motivating Music Transcription My Project Proposal Project Timeline

Seven Weeks Left
3/14 - Collect MIDI data + convert to WAV audio; understand the DGM
3/21 - Start building / understanding graphical models
3/28 - Continue building / understanding graphical models
4/04 - Finish building / understanding graphical models
4/11 - Evaluate results / fix bugs
4/18 - Try new data / fix bugs; begin preparing final presentation
4/25 - Finish preparing final presentation
4/27 - Final presentation in class