Harmonically Informed Multi-pitch Tracking

Harmonically Informed Multi-pitch Tracking
Zhiyao Duan, Jinyu Han and Bryan Pardo
EECS Dept., Northwestern Univ., Interactive Audio Lab, http://music.cs.northwestern.edu
Presented at ISMIR 2009, Kobe, Japan.

The Multi-pitch Tracking Task
- Given: polyphonic music played by several monophonic harmonic instruments
- Goal: estimate a pitch trajectory for each instrument

Potential Applications
- Automatic music transcription
- Harmonic source separation
- Other applications: melody-based music search, chord recognition, music education, ...

The 2-stage Standard Approach
- Stage 1: Multi-pitch Estimation (MPE) in each single frame
- Stage 2: Connect the per-frame pitch estimates across frames into pitch trajectories
[Figure: pitch estimates in the time-frequency plane]

State of the Art
- How far has existing work gone?
  - MPE is not very robust
  - Short pitch trajectories (within a note) are formed according to the local time-frequency proximity of pitch estimates
- Our contribution
  - A new MPE algorithm
  - A constrained clustering approach to estimate pitch trajectories across multiple notes

System Overview

Multi-pitch Estimation in a Single Frame
- A maximum likelihood estimation method
- The power spectrum is represented by its peaks and the non-peak region
- The best F0 estimate (a set of F0s) is the F0 hypothesis (also a set of F0s) that maximizes the likelihood of the observed power spectrum (a greedy search sketch follows below)
[Figure: observed power spectrum, amplitude vs. frequency]
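The slides do not spell out the search procedure, so the following is a minimal sketch of maximum-likelihood F0-set estimation for one frame. The `likelihood` callable stands in for the peak / non-peak-region model, and the greedy search strategy is my assumption, not necessarily the authors' exact algorithm.

```python
# Minimal sketch of ML multi-pitch estimation in one frame (greedy search is
# an assumption; `likelihood` stands in for the peak / non-peak-region model).
def greedy_mpe(peaks, f0_candidates, max_polyphony, likelihood):
    """peaks: spectral peaks of this frame, e.g. a list of (freq_hz, amplitude).
    f0_candidates: candidate F0 values in Hz.
    likelihood: callable(f0_set, peaks) -> log-likelihood of the observed spectrum.
    Returns the estimated set of F0s for the frame."""
    estimate = []
    best_ll = likelihood(estimate, peaks)
    for _ in range(max_polyphony):
        # Try adding each remaining candidate F0 and keep the best one.
        scored = [(likelihood(estimate + [f0], peaks), f0)
                  for f0 in f0_candidates if f0 not in estimate]
        if not scored:
            break
        ll, f0 = max(scored)
        if ll <= best_ll:   # adding another F0 no longer helps: polyphony found
            break
        estimate.append(f0)
        best_ll = ll
    return estimate
```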

Likelihood Definition
- The likelihood has two parts: the likelihood of observing the detected peaks, and the likelihood of not having any harmonics in the non-peak (NP) region
- When the F0 hypothesis matches the true F0, both terms are large; when it does not, they are small (a reconstruction of the factorization is written out below)
[Figure: spectra comparing a correct and an incorrect F0 hypothesis]

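Written out, the two-part likelihood named on the slide could read as follows; the notation is my reconstruction from the slide text, not copied from the paper.

```latex
% O: observed spectrum (peaks + non-peak region), \theta: hypothesized set of F0s
\mathcal{L}(\theta)
  = \underbrace{p(\text{peaks}\mid\theta)}_{\text{peak likelihood}}
    \cdot
    \underbrace{p(\text{no harmonics in the NP region}\mid\theta)}_{\text{non-peak-region likelihood}},
\qquad
\hat{\theta} = \arg\max_{\theta}\ \mathcal{L}(\theta)
```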

Pitch Trajectory Formation
- How do we form pitch trajectories? View it as a constrained clustering problem!
- We use two clustering cues:
  - Global timbre consistency
  - Local time-frequency locality

Global Timbre Consistency
- Objective function: minimize the intra-cluster distance
- Harmonic structure feature: normalized relative amplitudes of the harmonics (sketched in code below)
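A small sketch of how the harmonic-structure feature and the intra-cluster objective could be computed; the harmonic count, matching tolerance, and function names are my assumptions, not values from the paper.

```python
import numpy as np

def harmonic_structure(peak_freqs, peak_amps, f0, n_harmonics=10, tol=0.03):
    """Normalized relative amplitudes of the first n_harmonics of f0.
    A harmonic counts as present if some peak lies within `tol` (relative)
    of h * f0; missing harmonics get amplitude 0."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    peak_amps = np.asarray(peak_amps, dtype=float)
    amps = np.zeros(n_harmonics)
    for h in range(1, n_harmonics + 1):
        target = h * f0
        if peak_freqs.size:
            i = np.argmin(np.abs(peak_freqs - target))
            if abs(peak_freqs[i] - target) <= tol * target:
                amps[h - 1] = peak_amps[i]
    norm = np.linalg.norm(amps)
    return amps / norm if norm > 0 else amps

def intra_cluster_distance(features, labels):
    """Objective to minimize: summed squared distance of every harmonic-structure
    vector to the centroid of its cluster (one cluster per instrument)."""
    features, labels = np.asarray(features), np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = features[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total
```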

Local Time-frequency Locality
- Constraints:
  - Must-link: similar pitches in adjacent frames
  - Cannot-link: simultaneous pitches
- Finding a feasible clustering under these constraints is NP-hard!
[Figure: must-link and cannot-link constraints in the time-frequency plane]

Our Constrained Clustering Process
1) Find an initial clustering
- Label the pitches according to their pitch order in each frame: first, second, third, fourth (see the sketch below)
[Figure: pitches in the time-frequency plane before and after the initial labeling]
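A direct reading of this step as code (the data layout is assumed): in each frame the estimated pitches are sorted and labeled by their pitch order.

```python
# Initial clustering: in each frame, label pitches by pitch order
# (0 = lowest, 1 = next, ...).
def initial_labels(frames):
    """frames: list of lists, frames[t] = pitches (Hz) estimated in frame t.
    Returns labels with labels[t][k] = cluster index assigned to frames[t][k]."""
    labels = []
    for pitches in frames:
        order = sorted(range(len(pitches)), key=lambda k: pitches[k])
        frame_labels = [0] * len(pitches)
        for rank, k in enumerate(order):
            frame_labels[k] = rank
        labels.append(frame_labels)
    return labels
```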

Our Constrained Clustering Process
2) Define constraints
- Must-link: similar pitches in adjacent frames that share the same initial cluster are chained into a "notelet"
- Cannot-link: simultaneous notelets (a sketch of both constraint types follows below)
[Figure: notelets and cannot-link constraints in the time-frequency plane]
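One way the constraints could be materialized, under assumptions of mine about the data layout and the pitch-similarity threshold: a notelet is a maximal chain of pitches in consecutive frames that are close in frequency and share the same initial label, and cannot-links connect notelets that overlap in time.

```python
import math

def build_notelets(frames, labels, max_semitone_gap=0.5):
    """Chain similar pitches in adjacent frames with the same initial label
    into notelets (the must-link constraint). Each notelet is a list of
    (frame_index, pitch_hz) pairs."""
    notelets, active = [], {}   # active: initial label -> index of its open notelet
    for t, pitches in enumerate(frames):
        next_active = {}
        for k, f in enumerate(pitches):
            lab = labels[t][k]
            prev = active.get(lab)
            if prev is not None:
                t_prev, f_prev = notelets[prev][-1]
                close = abs(12 * math.log2(f / f_prev)) <= max_semitone_gap
                if t_prev == t - 1 and close:   # must-link: extend the notelet
                    notelets[prev].append((t, f))
                    next_active[lab] = prev
                    continue
            notelets.append([(t, f)])           # otherwise start a new notelet
            next_active[lab] = len(notelets) - 1
        active = next_active
    return notelets

def cannot_links(notelets):
    """Pairs of notelets that overlap in time (simultaneous notelets)."""
    spans = [(min(t for t, _ in n), max(t for t, _ in n)) for n in notelets]
    return [(i, j) for i in range(len(spans)) for j in range(i + 1, len(spans))
            if spans[i][0] <= spans[j][1] and spans[j][0] <= spans[i][1]]
```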

Our Constrained Clustering Process
3) Update the clusters to minimize the objective function
- Swap set: a set of notelets in two clusters connected by cannot-links
- Swap the notelets of a swap set between the two clusters if doing so reduces the objective function
- Iteratively traverse all swap sets (a sketch of this update loop follows below)
[Figure: swapping notelets between two clusters in the time-frequency plane]
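A sketch of the swap step. The construction of the swap sets is taken as given, and the acceptance rule shown here, keeping a swap only when the objective decreases, is my reading of the slide rather than a verified reimplementation.

```python
# Iterative label swapping: exchange the notelets of each swap set between its
# two clusters whenever the exchange reduces the objective.
def refine(assignment, swap_sets, objective, max_iters=50):
    """assignment: dict notelet_id -> cluster_id (modified in place).
    swap_sets: list of (cluster_a, cluster_b, notelet_ids) tuples.
    objective: callable(assignment) -> value to minimize."""
    best = objective(assignment)
    for _ in range(max_iters):
        improved = False
        for a, b, notelet_ids in swap_sets:
            trial = dict(assignment)
            for n in notelet_ids:          # exchange between the two clusters
                if trial[n] == a:
                    trial[n] = b
                elif trial[n] == b:
                    trial[n] = a
            value = objective(trial)
            if value < best:               # accept only improving swaps
                assignment.update(trial)
                best, improved = value, True
        if not improved:                   # converged: no swap set helps anymore
            break
    return assignment
```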

Data Set
- 10 J.S. Bach chorales (quartets), played by violin, clarinet, saxophone and bassoon
- Each instrument is recorded individually, then mixed
- Ground-truth pitch trajectories: run YIN on each monophonic track before mixing (see the sketch below)
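A possible way to reproduce the ground-truth step with librosa's YIN implementation; the slides only state that YIN was run on each monophonic track before mixing, so librosa and the parameter values here are my choices, not the authors'.

```python
import librosa

def ground_truth_trajectory(path, fmin=55.0, fmax=2000.0, hop_length=512):
    """Estimate a ground-truth pitch trajectory for one monophonic track with YIN."""
    y, sr = librosa.load(path, sr=None, mono=True)
    f0 = librosa.yin(y, fmin=fmin, fmax=fmax, sr=sr, hop_length=hop_length)
    times = librosa.times_like(f0, sr=sr, hop_length=hop_length)
    return times, f0
```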

Experimental Results (mean ± std)

How many pitches are correctly estimated?
  Klapuri (ISMIR 2006): Precision 87.2 ± 2.0 %, Recall 66.2 ± 3.4 %
  Ours:                 Precision 88.6 ± 1.7 %, Recall 77.0 ± 3.5 %

How many pitches are correctly estimated and put into the correct trajectory?
  Chance: approx. 0.0
  Ours:   Precision 76.9 ± 11.0 %, Recall 67.1 ± 11.9 %

How many notes are correctly estimated?
  Precision 46.0 ± 5.5 %, Recall 54.3 ± 5.5 %

(A sketch of how frame-level precision and recall can be computed follows.)
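For concreteness, frame-level precision and recall could be computed as below. The half-semitone matching tolerance is an assumption for illustration; the slides do not state the matching criterion that was actually used.

```python
import math

def precision_recall(est_frames, ref_frames, tol_semitones=0.5):
    """est_frames / ref_frames: per-frame lists of estimated / reference pitches (Hz).
    A pitch counts as correct if it matches an unused reference pitch within
    tol_semitones (assumed tolerance)."""
    tp = fp = fn = 0
    for est, ref in zip(est_frames, ref_frames):
        remaining = list(ref)
        for f in est:
            match = next((r for r in remaining
                          if abs(12 * math.log2(f / r)) <= tol_semitones), None)
            if match is not None:
                tp += 1
                remaining.remove(match)
            else:
                fp += 1
        fn += len(remaining)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```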

Ground-Truth Pitch Trajectories
J.S. Bach, "Ach lieben Christen, seid getrost"

Our System's Output
J.S. Bach, "Ach lieben Christen, seid getrost"

Conclusion
- Our multi-pitch tracking system:
  - Multi-pitch estimation in a single frame
    - Estimate F0s by modeling the peaks and the non-peak region
    - Estimate the polyphony and refine the F0 estimates
  - Pitch trajectory formation by constrained clustering
    - Objective: timbre (harmonic structure) consistency
    - Constraints: local time-frequency locality of pitches
    - A clustering algorithm that swaps labels
- Results on real music recordings are promising

Thank you! Q & A

Possible Questions
- How much does our constrained clustering algorithm improve over the initial pitch trajectories (pitches labeled by pitch order)?