Published by Leon Goodwin. Modified over 9 years ago.
1
A Regression Approach to Music Emotion Recognition
Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen, Fellow, IEEE
IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 2, February 2008
Sung Eun Park, 2009-11-20
Intelligent Database Systems Lab, School of Computer Science & Engineering, Seoul National University, Seoul, Korea
2
Copyright 2008 by CEBT
Contents
Introduction: Simple concept of the model
Body: Regression approach, Model explanation, Evaluation
Conclusion: Discussion, Contribution
Q&A
3
Brief Concept of the Model
Thayer's arousal-valence emotion plane.
[Figure: songs placed as points on the arousal-valence plane.]
4
An Application Using This Concept
Musicovery is based on the same concept as this model: clicking a point on the emotion plane finds music relevant to that point.
5
Regression Approach
Many good regressors (regression algorithms) are readily available.
Given $N$ inputs $(x_i, y_i)$, $1 \le i \le N$, where $x_i$ is the feature vector of the $i$th input sample and $y_i \in \mathbb{R}$ is the real value to be predicted for the $i$th sample, the regression system trains a regression algorithm (regressor) $R(\cdot)$ such that the mean squared error $\varepsilon$ is minimized:
$\varepsilon = \frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - R(x_i)\bigr)^2$
Here $x_i$ is a feature vector, $y_i$ is the real value, and $R(x_i)$ is the predicted value. Training means finding the $R$ that minimizes $\varepsilon$.
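The objective on this slide can be sketched in a few lines. This is a minimal illustration only, assuming a linear regressor fitted by ordinary least squares as a stand-in for R(·). The synthetic data and the names `mse` and `fit_linear` are hypothetical and do not come from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of (y_i - R(x_i))^2 over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def fit_linear(X, y):
    """Least-squares fit of y ~ X w + b; returns a callable regressor R."""
    X = np.asarray(X, dtype=float)
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)

    def R(X_new):
        X_new = np.asarray(X_new, dtype=float)
        return np.hstack([X_new, np.ones((X_new.shape[0], 1))]) @ coef

    return R

# Synthetic data (assumption): 50 samples, 3 features, noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.05 * rng.normal(size=50)

R = fit_linear(X, y)
eps_fit = mse(y, R(X))                             # error of the trained regressor
eps_baseline = mse(y, np.full_like(y, y.mean()))   # error of a constant predictor
print(eps_fit, eps_baseline)
```

The constant predictor is included only as a reference point: a trained regressor should reach a lower ε than always predicting the mean.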
6
The Model
[Diagram: music samples go through a subjective test, which yields the ground truth, and through feature extraction, which yields the musical features. Regression trains the regressors Reg.A and Reg.V, whose outputs are used for emotion visualization.]
7
The Model in Detail
Training: training data, preprocessing, subjective test and feature extraction, regressor training, Reg.A and Reg.V.
Testing: test data, preprocessing, feature extraction, Reg.A and Reg.V, emotion visualization.
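The training and testing paths can be sketched as two independently trained regressors whose outputs form one point on the emotion plane. This is a hedged sketch, not the authors' implementation: `LinearRegressor` is a hypothetical least-squares stand-in for Reg.A and Reg.V, the data is synthetic, and clipping predictions to [-1, 1] is an assumption made so that points stay inside the labeled range.

```python
import numpy as np

class LinearRegressor:
    """Stand-in regressor (assumption): least squares with a bias term."""

    def fit(self, X, y):
        A = np.hstack([X, np.ones((len(X), 1))])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.hstack([X, np.ones((len(X), 1))]) @ self.coef_

def train_av_regressors(features, arousal, valence):
    """Train Reg.A and Reg.V separately on the same feature matrix."""
    reg_a = LinearRegressor().fit(features, arousal)
    reg_v = LinearRegressor().fit(features, valence)
    return reg_a, reg_v

def predict_av(reg_a, reg_v, features):
    """Map each song to a point (arousal, valence), clipped to [-1, 1]."""
    av = np.column_stack([reg_a.predict(features), reg_v.predict(features)])
    return np.clip(av, -1.0, 1.0)

# Synthetic stand-ins for extracted features and subjective-test labels.
rng = np.random.default_rng(1)
train_X = rng.normal(size=(40, 4))
train_a = np.tanh(train_X[:, 0])
train_v = np.tanh(train_X[:, 1])
test_X = rng.normal(size=(5, 4))

reg_a, reg_v = train_av_regressors(train_X, train_a, train_v)
points = predict_av(reg_a, reg_v, test_X)  # one (arousal, valence) row per test song
print(points)
```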
8
An Issue of the Continuous Perspective
The two dimensions, arousal and valence, depend on each other. What is positive music? Then what is energetic music?
Principal component analysis (PCA) is a common way of reducing the correlation between variables.
[Figure: original data along an energetic-calm axis, with the principal components computed by PCA.]
9
Reducing Correlation Between Variables
AV plane: some dependency exists.
PC plane: no dependency exists.
Train the regressors R_p and R_q, test in the PQ plane, and compare with the AV plane.
Details follow later in the presentation.
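A small sketch of the decorrelation step, assuming PCA is computed from the eigenvectors of the sample covariance of the AV labels. The synthetic labels and the helper names `pca_fit`, `to_pc`, and `to_av` are hypothetical.

```python
import numpy as np

def pca_fit(av):
    """Return the mean and the orthonormal principal axes (as columns)."""
    mean = av.mean(axis=0)
    cov = np.cov(av - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return mean, eigvecs[:, order]

def to_pc(av, mean, axes):
    """Project AV labels onto the principal axes (p, q)."""
    return (av - mean) @ axes

def to_av(pq, mean, axes):
    """Map points in the PC plane back to the AV plane."""
    return pq @ axes.T + mean

# Synthetic AV labels (assumption) in which valence is correlated with arousal.
rng = np.random.default_rng(0)
arousal = rng.uniform(-1, 1, size=200)
valence = 0.6 * arousal + 0.3 * rng.normal(size=200)
av = np.column_stack([arousal, valence])

mean, axes = pca_fit(av)
pq = to_pc(av, mean, axes)
corr_av = np.corrcoef(av, rowvar=False)[0, 1]   # clearly nonzero
corr_pq = np.corrcoef(pq, rowvar=False)[0, 1]   # zero up to rounding error
av_back = to_av(pq, mean, axes)                 # recovers the original labels
print(corr_av, corr_pq)
```

Because the axes are orthonormal, predictions made in the PC plane can be mapped back to the AV plane with `to_av` for comparison.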
10
Dataset
195 popular songs selected from a number of Western, Chinese, and Japanese albums, according to two criteria:
1) The songs should be distributed uniformly across the quadrants of the emotion plane.
2) Each music sample should express a certain dominant emotion.
11
Subjective Test
253 volunteers from the campus.
Each volunteer is asked to listen to ten music samples randomly drawn from the music database and to label the AV values from -1.0 to 1.0 in 11 ordinal levels.
Volunteers label the evoked emotion rather than the perceived one.
The standard deviation of the ratings given to the same song is 0.3 (which is acceptable).
The same person tends to give the same label to the same music.
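The labeling scale can be made concrete with a short sketch. Eleven levels from -1.0 to 1.0 are 0.2 apart. The slide does not state how ratings are aggregated, so taking the per-song mean as ground truth is an assumption here, and the ratings and the names `snap_to_level` and `aggregate` are hypothetical.

```python
import numpy as np

LEVELS = np.linspace(-1.0, 1.0, 11)  # the 11 ordinal levels, 0.2 apart

def snap_to_level(value):
    """Snap a raw rating to the nearest of the 11 levels."""
    nearest = LEVELS[np.argmin(np.abs(LEVELS - value))]
    return round(float(nearest), 1)

def aggregate(ratings):
    """ratings: {song_id: [(arousal, valence), ...]} -> {song_id: (mean, std)}."""
    summary = {}
    for song_id, labels in ratings.items():
        arr = np.asarray(labels, dtype=float)
        summary[song_id] = (arr.mean(axis=0), arr.std(axis=0))
    return summary

# Made-up ratings from three listeners per song (assumption).
ratings = {
    "song_a": [(0.8, 0.6), (0.6, 0.4), (1.0, 0.2)],
    "song_b": [(-0.4, -0.2), (-0.6, 0.0), (-0.2, -0.4)],
}
summary = aggregate(ratings)
mean_a, std_a = summary["song_a"]  # per-song mean and spread of (arousal, valence)
print(mean_a, std_a)
```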
12
Feature Extraction
13
Feature Extraction
PsySound aims to model parameters of auditory sensation based on psychoacoustic models.
Earlier research found that 15 of its features are more closely related to emotion perception.
14
Feature Extraction
From all extracted features, select those that are related to emotion.
RReliefF is used as the feature selection algorithm (FSA).
RRF_{m,n} is the feature space formed by the top-m and top-n selected features.
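The "score every feature, keep the top k" idea can be sketched as follows. This is not RReliefF: the absolute Pearson correlation is swapped in as a simple relevance score for illustration. Reading m and n as the number of features kept for arousal and for valence is also an assumption, and the data is synthetic.

```python
import numpy as np

def relevance_scores(X, y):
    """Stand-in relevance score (NOT RReliefF): |Pearson correlation| per feature."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)

def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:k]

# Synthetic data (assumption): arousal depends on features 0 and 3, valence on 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
arousal = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=100)
valence = 1.0 * X[:, 5] + 0.1 * rng.normal(size=100)

m, n = 2, 1  # hypothetical feature counts for arousal and valence
feats_a = top_k(relevance_scores(X, arousal), m)
feats_v = top_k(relevance_scores(X, valence), n)
X_a, X_v = X[:, feats_a], X[:, feats_v]  # reduced feature matrices
print(feats_a, feats_v)
```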
15
Regression Algorithms
Three regression algorithms:
1. Multiple linear regression (MLR): assumes a linear relationship; a simple method.
2. Support vector regression (SVR): nonlinearly maps the input features into a higher-dimensional feature space; in many cases superior to existing machine learning methods.
3. AdaBoost.RT (BoostR): a nonlinear regression algorithm; a number of regression trees are trained iteratively and weighted according to their prediction accuracy.
16
Evaluation
Method: the R² statistic, which shows how close the predictions are to the real values.
AV and PC plane comparison (the effect of the dependency between the variables): no significant difference.
[Table: results per combination, with the best combination highlighted.]
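For reference, a minimal sketch of the R² statistic, assuming the usual coefficient-of-determination definition R² = 1 - SS_res / SS_tot. The example values are made up.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """R^2 = 1 - sum((y - y_hat)^2) / sum((y - mean(y))^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

y = np.array([0.2, -0.4, 0.8, 0.0])
r2_perfect = r2_score(y, y)                          # predictions equal the labels
r2_mean = r2_score(y, np.full_like(y, y.mean()))     # always predicting the mean
print(r2_perfect, r2_mean)
```

Under this definition a perfect prediction scores 1, always predicting the mean scores 0, and a regressor worse than the mean predictor scores below 0.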
17
Evaluation: Regressor Comparison
[Table: regressors compared in a plane with no correlation, using the selected feature space.]
18
Evaluation: The Prediction Accuracy
[Figure: ground truth and prediction results plotted together.]
The best performance of the regression approach, in terms of the R² statistic, reaches 58.3% for arousal and 28.1% for valence, using the PC plane, RRF feature selection, and SVR.
19
Performance Evaluation
Using the same ground truth data and feature data.
[Figure: two results computed on the same data, with values 100.3 and 117.7; the quantity names are missing from the transcript.]
20
Discussion
Subjectivity issue
Individual differences are influenced by many factors: cultural background, generation, sex, and personality.
GWMER (group-wise MER scheme)
[Diagram: users are divided into groups G1, G2, G3, G4, ..., each with its own regressor R1, R2, R3, R4, ...; the regressor is chosen according to the user's group.]
Personalization can be an alternative.
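The regressor-choosing step of the group-wise scheme can be sketched as a lookup from user group to regressor. Everything here is hypothetical: the group names, the toy regressors, and the fallback to a default regressor for unknown groups are illustration choices, not details from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Regressor = Callable[[Features], Tuple[float, float]]  # features -> (arousal, valence)

def choose_regressor(user_group: str,
                     group_regressors: Dict[str, Regressor],
                     default: Regressor) -> Regressor:
    """Pick the regressor trained for the user's group, else a general one."""
    return group_regressors.get(user_group, default)

# Toy regressors standing in for R1, R2, and a general regressor (assumption).
def r1(f: Features) -> Tuple[float, float]:
    return (0.5 * f[0], 0.1 * f[1])

def r2(f: Features) -> Tuple[float, float]:
    return (0.3 * f[0], 0.4 * f[1])

def r_general(f: Features) -> Tuple[float, float]:
    return (0.4 * f[0], 0.25 * f[1])

group_regressors = {"G1": r1, "G2": r2}

song_features = [1.0, 2.0]
point_g1 = choose_regressor("G1", group_regressors, r_general)(song_features)
point_unknown = choose_regressor("G9", group_regressors, r_general)(song_features)
print(point_g1, point_unknown)
```

The same song can thus map to different points on the emotion plane for different groups, which is the point of the group-wise scheme.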
21
Contribution
One of the first attempts to develop an MER system from a continuous perspective (each song maps to a point in the emotion plane).
A sound theoretical foundation is proposed: regression theory.
Extensive performance study: several algorithms are tested.
Dealing with the subjectivity issues of music emotion recognition (MER): emotion differs from person to person.
The two dimensions of the emotion plane are not independent.
22
Thank you… Q&A