Presentation transcript:

Musical Instrument Identification based on F0-dependent Multivariate Normal Distribution

Tetsuro Kitahara*, Masataka Goto** and Hiroshi G. Okuno*
(*Graduate School of Informatics, Kyoto University, Japan; **PRESTO JST / National Institute of Advanced Industrial Science and Technology, Japan)
The 4th IEEE Int'l Conf. on Multimedia & Expo (6th-9th July 2003, Baltimore, MD, USA)

1. What is musical instrument identification?

- It is the task of obtaining the names of musical instruments from sounds (acoustic signals).
- It is a kind of pattern recognition.
- It is useful for various applications, e.g. automatic music transcription, music information retrieval, MPEG-7 annotation, human-robot interaction via music, and many entertainment applications.
- Research on it began relatively recently (in the 1990s).

Processing flow: feature extraction (e.g. decay speed, spectral centroid) yields a feature vector X, which is evaluated against the class-conditional distributions p(X|w_flute) and p(X|w_piano). The instrument is then chosen by the Bayes decision rule (in the poster's example, the result is "piano"):

    ŵ = argmax_w p(w|X) = argmax_w p(X|w) p(w)

[Figure: piano and flute tones in a feature space; axis labels "Spectral centroid" and "decayed" / "not decayed".]

2. What is difficult in musical instrument identification?

The pitch dependency of timbre, e.g.
- Low-pitched piano sound = slow decay
- High-pitched piano sound = fast decay

[Figure: piano tones plotted against time [s]: (a) Pitch = C2 (65.5 Hz), (b) Pitch = C6 (1048 Hz).]

In previous studies, the pitch dependency of timbre was pointed out but was NOT dealt with explicitly.

3. How is the pitch dependency coped with?

  (1) Approximate the pitch dependency of each feature as a function of fundamental frequency (F0).
  (2) Estimate the feature distribution at each F0 using this function.

The result is the F0-dependent multivariate normal distribution.

[Figure: the pitch dependency of timbre and its function approximation.]

4. F0-dependent multivariate normal distribution

It is a distribution for representing musical sound features that depend on pitch. It has the following two parameters:
- F0-dependent mean function: obtained by function approximation of the pitch dependency of each feature. It captures the pitch dependency (i.e. the position of the distribution at each F0).
- F0-normalized covariance: obtained by normalizing out the F0-dependent mean, which eliminates the pitch dependency. It captures the non-pitch dependency.

The pitch dependency and the non-pitch dependency of timbre can be separated by estimating these parameters.

5. A musical instrument identification method using the F0-dependent multivariate normal distribution

1st step: Feature extraction
  129 features, defined with reference to the literature, are extracted, e.g.
  - Spectral centroid (which captures the brightness of tones)
  - Decay speed of power

2nd step: Dimensionality reduction
  First: PCA (principal component analysis), 129 dimensions → 79 dimensions (cumulative proportion value of 99%)
  Second: LDA (linear discriminant analysis), 79 dimensions → 18 dimensions

3rd step: Parameter estimation of the F0-dependent multivariate normal distribution
  First: the F0-dependent mean function is approximated as a cubic polynomial.
  Second: the F0-normalized covariance is obtained by normalizing out the F0-dependent mean, which eliminates the pitch dependency.

Final step: Applying the Bayes decision rule
  The instrument ŵ satisfying
      ŵ = argmax_w [log p(X|w; f) + log p(w; f)]
  is determined as the result, where f denotes the F0 of the input tone.

6. Experiments

Experimental conditions:
- Database: a subset of RWC-MDB-I-2001.
  - Consists of solo tones of 19 real instruments over their full pitch ranges.
  - Contains 3 individuals and 3 intensities for each instrument.
  - Contains normal articulation only.
  - The total number of sounds is 6,247.
- Evaluation uses 10-fold cross-validation.
- Performance is evaluated both at the individual-instrument level and at the category level.

The following categorization is adopted for evaluating the performance at the category level:

  Category       Instruments
  Piano          Piano
  Guitars        Classical Guitar, Ukulele, Acoustic Guitar
  Strings        Violin, Viola, Cello
  Brass          Trumpet, Trombone
  Saxophones     Soprano Sax, Alto Sax, Tenor Sax, Baritone Sax
  Double Reeds   Oboe, Fagotto
  Clarinet       Clarinet
  Air Reeds      Piccolo, Flute, Recorder

Experimental results (recognition rates):
- The proposed method improved recognition rates:
  - 75.73% → 79.73% at the individual level (error reduction rate: 16.48%)
  - 88.20% → 90.65% at the category level (error reduction rate: 20.67%)
- Recognition rates of 6 instruments were improved by more than 7%.
- The recognition rate of the piano improved the most (74.21% → 83.27%), because the piano has a wide pitch range.

The Bayes decision rule vs. the k-NN rule:
Compared configurations: Bayes (18 dim; PCA+LDA), Bayes (18 dim; PCA only), Bayes (79 dim; PCA only), 3-NN (18 dim; PCA+LDA), 3-NN (18 dim; PCA only), 3-NN (79 dim; PCA only). We adopted Bayes (18 dim; PCA+LDA).
- PCA+LDA+Bayes achieved the best performance.
- LDA improved the performance.
- Bayes with 79 dimensions showed poor performance (the amount of training data is not enough).

7. Conclusions

- To cope with the pitch dependency of timbre in musical instrument identification, the F0-dependent multivariate normal distribution is proposed.
- Experimental results of identifying 6,247 solo tones of 19 instruments show that the proposed method improved the recognition rate (75.73% → 79.73%).
- Future work includes evaluation on mixtures of sounds and development of application systems using the proposed method.
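
Illustrative sketch (not the authors' implementation): the parameter estimation and Bayes decision steps of the F0-dependent multivariate normal distribution might look roughly like the Python below. Several points are assumptions made only for this sketch: the cubic polynomial is fitted against log2(F0) rather than raw F0 for numerical conditioning, "normalizing" is taken to mean subtracting the F0-dependent mean from each feature vector, a small ridge term is added to the covariance, the prior p(w; f) is treated as a constant, and the data are synthetic two-dimensional toy features (the class names "piano" and "flute" are labels for the toy classes, not RWC data). The function names are hypothetical.

```python
import numpy as np

def fit_f0_gaussian(X, f0, degree=3, ridge=1e-6):
    """Estimate one class's F0-dependent mean function and F0-normalized covariance."""
    x = np.log2(f0)                                  # assumption: polynomial in log2(F0)
    V = np.vander(x, degree + 1)                     # columns x^3, x^2, x, 1 when degree=3
    coefs, *_ = np.linalg.lstsq(V, X, rcond=None)    # one polynomial per feature dimension
    resid = X - V @ coefs                            # features with the pitch-dependent mean removed
    cov = resid.T @ resid / len(X) + ridge * np.eye(X.shape[1])
    return coefs, cov

def log_likelihood(model, X, f0):
    """log p(X|w; f) under a normal distribution whose mean depends on F0."""
    coefs, cov = model
    V = np.vander(np.log2(f0), coefs.shape[0])
    diff = X - V @ coefs
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    d = X.shape[1]
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def classify(models, log_priors, X, f0):
    """Bayes decision rule: argmax over w of log p(X|w; f) + log p(w)."""
    names = list(models)
    scores = np.stack(
        [log_likelihood(models[w], X, f0) + log_priors[w] for w in names], axis=1
    )
    return [names[i] for i in scores.argmax(axis=1)]

# Synthetic toy data: feature 0 drifts with pitch in opposite directions per class.
rng = np.random.default_rng(0)

def make_tones(slope, offset, n):
    f0 = 2.0 ** rng.uniform(6.0, 10.0, size=n)       # roughly 64 Hz to 1024 Hz
    t = np.log2(f0) - 8.0
    mean = np.column_stack([slope * t, np.full(n, offset)])
    return mean + rng.normal(0.0, 0.2, size=(n, 2)), f0

train = {"piano": make_tones(1.0, 0.0, 300), "flute": make_tones(-1.0, 0.5, 300)}
models = {w: fit_f0_gaussian(X, f0) for w, (X, f0) in train.items()}
log_priors = {w: np.log(0.5) for w in models}        # assumption: equal, pitch-independent priors

test = {"piano": make_tones(1.0, 0.0, 200), "flute": make_tones(-1.0, 0.5, 200)}
X_test = np.vstack([test[w][0] for w in test])
f0_test = np.concatenate([test[w][1] for w in test])
truth = [w for w in test for _ in range(200)]

predicted = classify(models, log_priors, X_test, f0_test)
accuracy = np.mean(np.array(predicted) == np.array(truth))
print(f"toy accuracy: {accuracy:.3f}")
```

In this toy setup the two classes overlap heavily when pitch is ignored, because feature 0 sweeps across the same range for both; conditioning the mean on F0 is what separates them. A real system would apply the same steps to the 18-dimensional features obtained after PCA and LDA.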