Audio Beat Tracking
David Sears
MUMT 621
12 November 2009

Outline
- Introduction
- BeatRoot
- Context-Dependent Beat Tracking
- MIREX 2009 Beat Tracking Contest
- Analytical Applications

Introduction
Beat tracking in audio is the task of identifying and synchronizing with the basic pulse of a piece of music (Dixon 2007).

Goals
Davies et al. (2007) suggest four main desirable properties for a beat tracker:
- Both audio and symbolic musical signals can be processed.
- No a priori knowledge of the input is required.
- Perceptually accurate beat locations can be identified efficiently and in real time.
- Changes in tempo can be followed, thereby indicating the rate of change of tempo.

BeatRoot
BeatRoot preprocesses the input audio file with an onset detection function. Initial iterations of BeatRoot used the “surfboard” method, which smooths the signal to produce an amplitude envelope and then finds peaks in its slope using linear regression (Dixon 2003). Because this method is necessarily lossy (it fails to detect onsets in which simultaneously sounding notes mask one another), the current iteration of BeatRoot instead employs a spectral flux onset detection function, which sums the change in magnitude in each frequency bin where the energy is increasing (Dixon 2007).

Peak picking is then performed using a set of empirically determined constraints. Spectral flux was chosen over the complex-domain detection function and the weighted phase deviation for its simplicity of implementation, its speed of execution, and its accuracy in detecting correct onsets (Dixon 2003).
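A minimal sketch of a half-wave rectified spectral flux function of this kind, in Python; the frame and hop sizes are illustrative assumptions rather than BeatRoot's exact parameters:

```python
import numpy as np

def spectral_flux(x, frame_size=2048, hop_size=441):
    """Sum, per frame, the magnitude increase over the previous
    frame in every frequency bin (half-wave rectification)."""
    window = np.hanning(frame_size)
    n_frames = max(0, 1 + (len(x) - frame_size) // hop_size)
    flux = np.zeros(n_frames)
    prev_mag = np.zeros(frame_size // 2 + 1)
    for n in range(n_frames):
        frame = x[n * hop_size : n * hop_size + frame_size] * window
        mag = np.abs(np.fft.rfft(frame))
        diff = mag - prev_mag
        flux[n] = diff[diff > 0].sum()  # only bins where energy increases
        prev_mag = mag
    return flux
```

Onsets would then be taken as local maxima of the flux curve that survive the empirically determined peak-picking constraints mentioned above.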

The tempo induction algorithm computes clusters of inter-onset intervals and then combines them by recognizing the approximate integer relationships between clusters (Dixon 2007). Each cluster is then weighted according to these integer relationships, and a ranked list of tempo hypotheses is produced and passed to the beat tracking subsystem.
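A toy version of this clustering step; the cluster tolerance and ratio bounds below are assumptions for illustration, not Dixon's published parameters:

```python
def cluster_iois(onsets, width=0.025):
    """Greedily cluster inter-onset intervals (IOIs), then reinforce
    clusters whose means are approximate integer multiples."""
    iois = [b - a for i, a in enumerate(onsets)
            for b in onsets[i + 1:] if 0.07 < b - a < 2.5]
    clusters = []  # each cluster: [mean interval in seconds, score]
    for ioi in sorted(iois):
        for c in clusters:
            if abs(c[0] - ioi) < width:
                c[0] = (c[0] * c[1] + ioi) / (c[1] + 1)  # update running mean
                c[1] += 1
                break
        else:
            clusters.append([ioi, 1])
    base = [c[1] for c in clusters]
    for i, ci in enumerate(clusters):            # reinforce integer relations
        for j, cj in enumerate(clusters):
            if i == j:
                continue
            ratio = cj[0] / ci[0]
            k = round(ratio)
            if 2 <= k <= 8 and abs(ratio - k) < 0.1:
                ci[1] += base[j] / k
    # each cluster mean is a beat-period (tempo) hypothesis
    return sorted(clusters, key=lambda c: -c[1])
```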

Beat tracking is then performed using a tempo hypothesis derived from the tempo induction stage, with the first onset of a given window used to predict further beats. Subsequent onsets that fall within the prediction window are treated as beats by BeatRoot, while those falling outside the window are either taken to be possible beats or disregarded altogether (Dixon 2007).
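A simplified, single-agent version of this predict-and-confirm loop (BeatRoot proper runs many competing agents and scores them; the tolerance window here is an assumed fraction of the beat period):

```python
def track_beats(onsets, period, tol=0.1):
    """Predict each next beat one period after the last accepted beat
    and accept the first onset inside the tolerance window."""
    if not onsets:
        return []
    beats = [onsets[0]]                # seed with the first onset
    prediction = beats[0] + period
    for onset in onsets[1:]:
        while onset > prediction + tol * period:
            prediction += period       # a predicted beat passed unsupported
        if abs(onset - prediction) <= tol * period:
            beats.append(onset)        # onset confirms the prediction
            prediction = onset + period
        # onsets outside the window are simply ignored in this sketch
    return beats
```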

MIREX 2006

Context-Dependent Beat Tracking
Onset detection uses a complex spectral difference detection function: note onsets are emphasized either as a result of a significant change in energy in the magnitude spectrum, and/or a deviation from the expected phase values in the phase spectrum (i.e., a change in pitch).
Beat tracking then uses a two-state model:
- Stage 1: General state. Its role is to infer the time between successive beats, the beat period (IOI), and the offset between the start of the frame and the locations of the beats, the beat alignment. Limitations: it switches metrical levels and switches between the on-beat and the off-beat.
- Stage 2: Context-dependent state. Limitation: it is unable to react to tempo changes.
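A sketch of a complex-domain onset detection function in this spirit: each STFT frame is compared with a prediction that keeps the previous frame's magnitude and extrapolates its phase linearly, so both energy changes and pitch (phase) changes contribute to the detection function. Frame and hop sizes are assumptions:

```python
import numpy as np

def complex_spectral_difference(x, frame_size=1024, hop_size=512):
    """Complex-domain onset detection: distance between each spectrum
    and a constant-magnitude, linear-phase prediction of it."""
    window = np.hanning(frame_size)
    n_frames = max(0, 1 + (len(x) - frame_size) // hop_size)
    X = np.array([np.fft.rfft(x[n * hop_size : n * hop_size + frame_size] * window)
                  for n in range(n_frames)])
    odf = np.zeros(n_frames)
    for n in range(2, n_frames):
        phase_pred = 2 * np.angle(X[n - 1]) - np.angle(X[n - 2])  # linear phase extrapolation
        target = np.abs(X[n - 1]) * np.exp(1j * phase_pred)       # predicted spectrum
        odf[n] = np.abs(X[n] - target).sum()
    return odf
```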

Davies et al. 2007

MIREX 2009
Two test sets were used for evaluation:
- McKinney Collection: a collection of 160 musical excerpts, each annotated by 40 different listeners.
- Sapp's Mazurka Collection: 322 files.

Results—McKinney Collection
(All values in %, except D and Dg in bits. Cells missing from the source are marked —.)

TeamID   F-Measure   Cemgil   Goto   P-Score   CMLC   CMLT   AMLC   AMLT   D       Dg
DRP1     24.6        17.5     0.4    37.4      3.8    9.6    7.8    19.2   0.625   0.008
DRP2     35.4        26.1     3.1    46.5      7.5    16.9   13.8   31.7   1.073   0.062
DRP3     47.0        35.6     16.2   52.8      19.4   27.6   45.3   61.3   1.671   0.245
DRP4     48.0        36.3     19.5   53.9      22.2   29.1   50.8   64.0   1.739   0.255
GP1      54.8        41.0     —      59.0      26.0   35.5   49.1   66.6   1.680   0.294
GP2      53.7        40.1     20.9   57.9      25.1   33.7   47.6   63.8   1.649   0.281
GP3      54.6        40.9     22.0   —         —      —      48.8   66.3   1.695   0.287
GP4      54.5        40.8     21.6   59.2      26.4   —      48.2   64.7   1.685   0.289
OGM1     39.7        —        6.5    47.8      11.9   19.9   29.5   47.4   1.283   0.100
OGM2     41.5        30.4     5.7    49.2      11.7   —      28.2   46.4   1.340   0.092
TL       50.0        37.0     14.1   53.6      17.9   24.3   —      35.9   1.480   0.150

Results—Sapp Collection
(All values in %, except D and Dg in bits. Cells missing from the source are marked —.)

TeamID   F-Measure   Cemgil   Goto   P-Score   CMLC   CMLT   AMLC   AMLT   D       Dg
DRP1     27.9        19.9     0.0    37.3      1.5    10.3   2.3    13.8   0.126   0.008
DRP2     48.4        32.5     —      55.5      2.6    26.0   3.1    28.1   0.683   0.183
DRP3     67.8        61.5     2.5    65.3      7.8    42.1   9.2    45.6   1.424   0.993
DRP4     45.3        37.6     0.6    50.1      3.9    24.9   5.3    28.5   0.393   0.198
GP1      54.2        42.5     —      59.5      4.4    33.2   5.6    36.6   0.447   0.252
GP2      54.7        43.1     —      59.9      4.5    33.7   5.7    37.0   0.467   0.269
GP3      44.7        34.8     —      51.2      3.2    24.7   4.0    —      0.262   0.111
GP4      45.4        35.6     —      51.9      3.3    25.2   —      28.4   0.284   0.124
OGM1     30.7        21.9     —      38.6      1.7    11.0   2.8    15.4   0.134   0.014
OGM2     32.1        23.1     —      39.7      —      11.5   2.7    15.6   —       0.016
TL       44.9        35.7     0.3    51.0      —      20.5   4.7    22.5   0.440   0.095
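For reference, a minimal version of how the F-Measure column above is typically computed for beat tracking; the ±70 ms tolerance is the commonly cited MIREX value, assumed here:

```python
def beat_f_measure(estimated, reference, tol=0.07):
    """An estimated beat is a hit if it falls within ±tol seconds of a
    not-yet-matched reference beat; F-measure combines the resulting
    precision and recall."""
    matched = set()
    hits = 0
    for b in estimated:
        for i, r in enumerate(reference):
            if i not in matched and abs(r - b) <= tol:
                matched.add(i)
                hits += 1
                break
    if hits == 0:
        return 0.0
    precision = hits / len(estimated)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)
```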

Analytical Applications
Automatic extraction of expressive timing information can provide a means for studying individual differences in performance nuance, as well as insights into note-level performance rules that can be uncovered using machine learning techniques.

References

Davies, M., A. Robertson, and M. Plumbley. 2009. MIREX 2009 audio beat tracking evaluation: Davies, Robertson and Plumbley. In Proceedings of the International Society for Music Information Retrieval.

Davies, M. E. P., and M. D. Plumbley. 2007. Context-dependent beat tracking of musical audio. IEEE Transactions on Audio, Speech, and Language Processing 15 (3): 1009–20.

Dixon, S. 2003. On the analysis of musical expression in audio signals. Draft for SPIE/IS&T Conference on Storage and Retrieval for Media Databases. Online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15.5379&rep=rep1&type=pdf.

Dixon, S. 2007. Evaluation of the audio beat tracking system BeatRoot. Preprint for Journal of New Music Research 36. Online: http://www.elec.qmul.ac.uk/people/simond/pub/2007/jnmr07.pdf.

Lee, T. 2009. Audio beat tracking. In Proceedings of the International Society for Music Information Retrieval.

Peeters, G. 2009. MIREX-09 “Audio beat tracking” task: IRCAMBEAT submission. In Proceedings of the International Society for Music Information Retrieval.

Tzanetakis, G. 2009. Marsyas submissions to MIREX 2009. In Proceedings of the International Society for Music Information Retrieval.

Widmer, G., S. Dixon, W. Goebl, E. Pampalk, and A. Tobudic. 2003. In search of the Horowitz factor. AI Magazine 24 (3): 111–29.