Computational analysis on folk music of Cyprus
Internal report
Andreas Neocleous
University of Groningen / University of Cyprus
April 2013
Folk music classification: predicting the class label of an unseen folk tune [1] (Hillewaere, Manderick, Conklin, 2009)

Objective of the study:
- Compare global feature models and event feature models for the task of folk song classification.
- Draw conclusions on the robustness of each feature model.

A global feature set summarizes a piece as a feature vector, which can be viewed as a data point in a feature space.
Global features:
- The Alicante set of 28 global features (12 selected).
- 92 features computed by the FANTASTIC program (Feature ANalysis Technology Accessing STatistics) [2] (37 selected).
- The Jesser set of 40 pitch and duration statistics [3].
- The McKay set of 101 global features, developed for the classification of orchestrated MIDI files [4].
Event features: excerpt of the Scottish jig "With a hundred pipers", illustrating the difference between global features and event features.
Event features: classification with n-gram models
n-gram models are used in probability, communication theory and computational linguistics.
1) The probability of a piece is obtained by computing the joint probability of the individual events in the piece: p(e_1, …, e_m) = ∏_i p(e_i | e_{i−n+1}, …, e_{i−1}).
2) For each class a separate model is built.
3) The predicted class of a piece is the class whose model generates the piece with the highest probability.
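The scheme above can be sketched in Python. This is an illustrative toy, not the paper's implementation: it uses a bigram (n = 2) model with add-alpha smoothing, and all function names are assumptions.

```python
import math
from collections import defaultdict

def train_bigram(pieces):
    """Count bigrams over event sequences (each piece is a list of symbols)."""
    counts = defaultdict(lambda: defaultdict(int))
    for piece in pieces:
        seq = ["<s>"] + piece          # sentinel marks the start of a piece
        for prev, ev in zip(seq, seq[1:]):
            counts[prev][ev] += 1
    return counts

def log_prob(piece, counts, vocab_size, alpha=1.0):
    """Joint log-probability of a piece under an add-alpha smoothed bigram model."""
    seq = ["<s>"] + piece
    lp = 0.0
    for prev, ev in zip(seq, seq[1:]):
        c = counts[prev]
        lp += math.log((c[ev] + alpha) / (sum(c.values()) + alpha * vocab_size))
    return lp

def classify(piece, models, vocab_size):
    """Predict the class whose model generates the piece with highest probability."""
    return max(models, key=lambda cl: log_prob(piece, models[cl], vocab_size))
```

A longer context (e.g. the 5-gram model mentioned later) follows the same pattern with a longer conditioning history.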
Europa-6 collection
Classification accuracies of the global feature sets on the Europa-6 collection were obtained by 10-fold cross validation. With a 5-gram (pentagram) model of a linked viewpoint of melodic interval and duration, the obtained classification accuracy is 72.7%.
Folk music classification: predicting the class label of an unseen folk tune [5] (Hillewaere, Manderick, Conklin, 2012)

Objective of the study:
- Investigate the performance of three string methods.
- Compare the performance of the string methods with the global feature models and event feature models.
- Draw conclusions on the robustness of each model.

String methods rely on a sequential music representation which views a piece as a string of symbols. A pairwise similarity measure between the strings is computed and used to classify unlabeled pieces.
Excerpt of the Scottish jig "With a hundred pipers", illustrating the difference between global features, event features and the string representation.
String methods: (1) Sequence alignment
Estimation of the minimal cost of transforming one sequence into the other by means of edit operations such as substitution, insertion and deletion. Often referred to as the "edit distance", which is in fact the Levenshtein distance.
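The Levenshtein distance has a standard dynamic-programming formulation, sketched here with unit cost for every edit operation (real alignment schemes for music often use weighted costs):

```python
def levenshtein(s, t):
    """Minimal number of substitutions, insertions and deletions
    transforming string s into string t."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i              # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j              # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]
```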
String methods: (2) Compression-based distance
K(x) is the Kolmogorov complexity of string x; K(x|y) is the conditional complexity of string x given string y. The normalized information distance, NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}, measures how much information is not shared between the two strings relative to the information that they could maximally share.
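Kolmogorov complexity is uncomputable, so in practice it is approximated by the output length of a real compressor, giving the normalized compression distance (NCD). A minimal sketch with zlib (the choice of compressor is an assumption; any off-the-shelf compressor works the same way):

```python
import zlib

def c(s):
    """Approximate the complexity of a string by its compressed length."""
    return len(zlib.compress(s.encode("utf-8"), 9))

def ncd(x, y):
    """Normalized compression distance: near 0 for very similar strings,
    around 1 (or slightly above) for unrelated ones."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```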
String methods: (3) String subsequence kernel (SSK)
Computes a similarity measure between strings based on the number and form of their common subsequences. Given any pair of strings, SSK finds all common subsequences of a specified length k, also allowing non-contiguous matches, although these are penalized with a decay factor λ.
Example: SSK(k = 2, 'ismir', 'music') = λ^5 + λ^6, since the only common length-2 subsequences are 'si' (spans of 3 and 2 symbols) and 'mi' (spans of 2 and 4 symbols).
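For short strings, the kernel can be computed by brute force, assuming the usual convention that λ is raised to the combined lengths (first to last index, inclusive) of the two matching spans. Efficient implementations use dynamic programming instead; this sketch only illustrates the definition:

```python
from itertools import combinations

def ssk(k, s, t, lam):
    """Brute-force string subsequence kernel: sum lam**(span_s + span_t)
    over every pair of matching (possibly non-contiguous) length-k
    subsequences of s and t."""
    total = 0.0
    for i in combinations(range(len(s)), k):       # index tuples into s
        for j in combinations(range(len(t)), k):   # index tuples into t
            if all(s[a] == t[b] for a, b in zip(i, j)):
                total += lam ** ((i[-1] - i[0] + 1) + (j[-1] - j[0] + 1))
    return total
```

With λ = 0.5 this reproduces the slide's example value λ^5 + λ^6.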
Dance-9 collection
Results
Segmentation: popular music vs. folk music
Popular music:
- Complex structure: intro, verse, bridge, chorus
- Professional productions
Folk music:
- Repeating parts (stanzas); similar stanzas, repetitions
- Inaccurate singing of performers
- Variable tempo throughout the song
- Presence of noise
- Performers forget parts of lyrics or melody, or switch to speaking
Finding repeating stanzas in folk songs [6] Bohak C. and Marolt M., 2012
Preprocessing: detecting vocal pauses
- According to signal energy
- According to the signal envelope
- According to the relative difference of pitch
Preprocessing:
- The input audio signal is mixed from stereo to a single channel.
- The sample rate is reduced to 11025 Hz.
- The amplitude is normalized.
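These three steps can be sketched as follows. This is a naive illustration: the downsampling simply keeps every fourth sample, whereas a real pipeline would low-pass filter first to avoid aliasing, and the function name and signature are assumptions.

```python
def preprocess(left, right, src_rate=44100, dst_rate=11025):
    """Mix stereo to mono, naively downsample, and peak-normalize."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]   # stereo -> mono
    step = src_rate // dst_rate                           # 44100 / 11025 = 4
    down = mono[::step]                                   # crude decimation
    peak = max(abs(x) for x in down) or 1.0               # avoid divide-by-zero
    return [x / peak for x in down]                       # normalize amplitude
```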
Detecting vocal pauses according to signal energy:
- Energy is computed on 200 ms long frames; a frame belongs to a pause when its energy is below an experimentally determined threshold, set to ξ1 · Ē, where Ē is the average energy of the signal.
- Consecutive frames with energy values below the threshold are merged into one vocal pause.
- Vocal pauses shorter than ξ2 times the average detected vocal pause length are ignored.
- The parameters ξ1 and ξ2 were determined experimentally.
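A sketch of this procedure; the values of ξ1 and ξ2 below are illustrative placeholders, since the experimentally determined values are not given here:

```python
def vocal_pauses(signal, rate, xi1=0.1, xi2=0.5, frame_ms=200):
    """Return (start_frame, end_frame) pairs of detected vocal pauses."""
    n = int(rate * frame_ms / 1000)                       # samples per 200 ms frame
    frames = [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
    energy = [sum(x * x for x in f) for f in frames]
    thresh = xi1 * (sum(energy) / len(energy))            # xi1 * average energy
    # merge consecutive sub-threshold frames into single pauses
    pauses, start = [], None
    for i, e in enumerate(energy):
        if e < thresh and start is None:
            start = i
        elif e >= thresh and start is not None:
            pauses.append((start, i))
            start = None
    if start is not None:
        pauses.append((start, len(energy)))
    if not pauses:
        return []
    # discard pauses shorter than xi2 times the average pause length
    avg_len = sum(e - s for s, e in pauses) / len(pauses)
    return [(s, e) for s, e in pauses if e - s >= xi2 * avg_len]
```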
Detecting vocal pauses according to the signal envelope:
- The amplitude envelope of the signal is obtained by filtering the full-wave rectified signal with a 4th-order Butterworth filter with a normalized cutoff frequency of 0.001.
- Vocal pauses are parts of the signal where the envelope falls below the threshold ξ3 = −60 dB.
Detecting vocal pauses according to the relative difference of pitch:
- Detect the fundamental frequency (YIN algorithm [7]).
- Smooth the fundamental frequencies with a low-pass filter.
- Select as vocal pauses the parts of the signal whose pitch differs by more than 20 semitones from the average signal frequency.
- The endings of vocal pauses are used as candidates for stanza beginnings.
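The pitch-deviation criterion can be sketched on a frame-by-frame pitch track (treating unvoiced frames, f0 ≤ 0, as pauses directly is an assumption of this sketch):

```python
import math

def pitch_pauses(f0, max_dev_semitones=20.0):
    """Mark frames whose pitch deviates more than max_dev_semitones from
    the average signal frequency as vocal pauses (True = pause)."""
    voiced = [f for f in f0 if f > 0]
    avg = sum(voiced) / len(voiced)
    def dev(f):
        # deviation from the average frequency, in semitones
        return abs(12.0 * math.log2(f / avg))
    return [f <= 0 or dev(f) > max_dev_semitones for f in f0]
```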
Finding candidates for stanza boundaries:
- Calculate 12-dimensional chromagrams.
- Define a root-mean-square (RMS) distance between each pair of 12-dimensional chroma vectors a and b: c(a, b) = sqrt( (1/12) Σ_{i=1..12} (a_i − b_i)^2 ), where a_i and b_i are the i-th elements of the chroma vectors.
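A direct translation of the distance function (the 1/12 normalization follows the RMS reading above; if the paper uses a plain Euclidean distance, drop it):

```python
import math

def chroma_rms(a, b):
    """RMS distance between two 12-dimensional chroma vectors."""
    assert len(a) == len(b) == 12
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / 12.0)
```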
The defined distance function is used by the Dynamic Time Warping (DTW) algorithm to calculate the total distance between the selected stanzas as d(p1, p2) = Σ_{l=1..L} c(p1(l), p2(l)), where p1 and p2 are candidate stanza beginnings, p1(l) and p2(l) are the corresponding chroma vectors, and the index l runs from the first (1) to the last (L) chroma vector in the selected audio part.
The DTW distance between two stanza candidates is the minimal cumulative alignment cost cmin between the parts d0 and dj.
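The minimal alignment cost can be computed with the standard DTW recursion; the sketch below takes any frame distance function (e.g. the chroma RMS distance above) as a parameter:

```python
def dtw(seq1, seq2, dist):
    """Minimal cumulative alignment cost between two sequences under the
    given frame distance, with the standard match/insert/delete recursion."""
    inf = float("inf")
    n, m = len(seq1), len(seq2)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(seq1[i - 1], seq2[j - 1])
            D[i][j] = c + min(D[i - 1][j],      # skip a frame of seq1
                              D[i][j - 1],      # skip a frame of seq2
                              D[i - 1][j - 1])  # align the two frames
    return D[n][m]
```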
To compensate for out-of-tune singing, the chroma vectors are circularly shifted up to two semitones up and down, and the lowest DTW distance is selected: D = min_{r ∈ {−2,…,+2}} DTW(d0, rot_r(dj)), where rot_r rotates the chroma vectors of the selected stanza candidate by r semitones, from two semitones downwards to two semitones upwards in steps of one semitone.
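The rotation-and-minimize step can be sketched as follows; to keep the sketch self-contained, a plain frame-wise sum of distances stands in for the DTW distance the method actually minimizes:

```python
def best_rotation_distance(seq_ref, seq_cand, dist_fn):
    """Try circular shifts of the candidate's chroma vectors from -2 to +2
    semitones and keep the smallest total distance, compensating for
    out-of-tune singing. dist_fn compares two 12-d chroma vectors."""
    def rotate(v, s):
        # circular shift of a 12-bin chroma vector by s semitones
        return v[-s:] + v[:-s] if s else v
    best = float("inf")
    for s in range(-2, 3):
        shifted = [rotate(v, s) for v in seq_cand]
        total = sum(dist_fn(a, b) for a, b in zip(seq_ref, shifted))
        best = min(best, total)
    return best
```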
A fitness function is defined for scoring the candidate stanza beginnings ki.
In the defined fitness function, peaks represent the most likely stanza beginnings, so all peaks above a global threshold, corresponding to the average value of the fitness function, are picked as the actual boundaries between stanzas.
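The peak-picking rule can be sketched directly (the local-maximum test at interior points is an implementation assumption of this sketch):

```python
def pick_stanza_boundaries(fitness):
    """Pick local maxima of the fitness function that exceed its global
    average; these are taken as stanza beginnings."""
    avg = sum(fitness) / len(fitness)
    peaks = []
    for i in range(1, len(fitness) - 1):
        is_peak = fitness[i] >= fitness[i - 1] and fitness[i] >= fitness[i + 1]
        if is_peak and fitness[i] > avg:
            peaks.append(i)
    return peaks
```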
Music of Cyprus: categories
- Fones: Karpasitissa, Avgoritissa, Paphididji, Lyshiotissa, Mariniotou, Tyllirkotissa, Ishia, Komitissa, Akathkiotissa, Nekalisti, Pegiotoua
- Dances: Zeimpekikos, Kartsilamas, Kalamatianos, Syrtos, Arapies
- Religious: a weak category with no sub-categories
Music of Cyprus: preprocessing
- Fundamental frequency detection (YIN).
- Eliminate noise with an aperiodicity threshold.
- Eliminate silence with a loudness threshold.
- Octave/fifth errors: a common problem of frequency detection algorithms is wrong octave detection, also referred to as octave errors, in which the fundamental frequency is confused with its multiples and/or other harmonics. To correct these errors, a moving window was applied to detect and correct unexpected melodic jumps in the estimated pitch trajectory.
- Smoothing.
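A minimal sketch of moving-window octave correction, not the report's exact procedure: the window size and the 7-semitone tolerance are illustrative choices, and the correction pulls outliers back by whole octaves toward the local median.

```python
import math
import statistics

def fix_octave_errors(f0, window=5, max_jump=7.0):
    """Shift pitch estimates that deviate implausibly from the local median
    back by whole octaves (factors of 2)."""
    out = list(f0)
    half = window // 2
    for i, f in enumerate(f0):
        if f <= 0:
            continue  # unvoiced frame, nothing to correct
        ctx = [x for x in f0[max(0, i - half):i + half + 1] if x > 0]
        ref = statistics.median(ctx)
        # halve / double until within the tolerance of the local median
        while 12 * math.log2(out[i] / ref) > max_jump:
            out[i] /= 2.0
        while 12 * math.log2(out[i] / ref) < -max_jump:
            out[i] *= 2.0
    return out
```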
Music of Cyprus Preprocessing Pitch track before pre-processing
Music of Cyprus Preprocessing Pitch track after pre-processing
Music of Cyprus Segmentation Detection of vocal pauses
Music of Cyprus Segmentation Detection of all peaks
Music of Cyprus Segmentation Detection of notes based on the difference of the peaks
Music of Cyprus Repetition
References
[1] Hillewaere R., Manderick B., Conklin D., Global feature versus event models for folk song classification. 10th International Society for Music Information Retrieval Conference (ISMIR), 2009.
[2] Müllensiefen D., FANTASTIC: Feature ANalysis Technology Accessing STatistics (In a Corpus), Technical Report v0.9, 2009.
[3] Jesser B., Interaktive Melodieanalyse, Peter Lang, Bern, 1991.
[4] McKay C. and Fujinaga I., Automatic genre classification using large high-level musical feature sets. Proceedings of the International Conference on Music Information Retrieval, pp. 525–530, 2004.
[5] Hillewaere R., Manderick B., Conklin D., String methods for folk tune genre classification. 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
[6] Bohak C. and Marolt M., Finding repeating stanzas in folk songs. 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
[7] de Cheveigné A. and Kawahara H., YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4):1917–1930, 2002.