Ionian University Department of Informatics


Introducing the Greek Music Dataset
Dimos Makris, Ioanis Karydis, and Spyros Sioutas
Ionian University, Department of Informatics

Music Information Retrieval (MIR)
MIR is the interdisciplinary research field concerned with retrieving information from music. It involves musicology, psychology, academic music study, signal processing and machine learning.
Applications: recommender systems, track separation and instrument recognition, automatic music transcription (MIDI), automatic categorization, and music generation.

Why do we need musical data?
What is a dataset? A collection of sound recordings, sheet music and lyrics, together with information associated with the musical content (e.g. metadata, social tags).
Why do we need datasets? Experimenting with methods on real musical data is a central requirement. Datasets allow researchers to compare and contrast their methods by testing them on commonly available collections of musical data.

Greek Music in MIR
MIR requires data for all kinds of music. Although a number of widely used datasets exist, most of them are collections of mainstream English-language music. Local music differs in many ways, for example in its instruments and rhythms, and has unique genres such as “Ρεμπέτικο” (Rembetiko), “Λαϊκό” (Laiko) and “Έντεχνο” (Entexno).
The Greek Music Dataset does not start from scratch: it is a continuation and extension of the Greek Audio Dataset [1].
[1] D. Makris, K. Kermanidis, and I. Karydis. The Greek Audio Dataset. In Artificial Intelligence Applications and Innovations, volume 437 of IFIP Advances in Information and Communication Technology, pages 165-173. Springer Berlin Heidelberg, 2014.

Related Work on Datasets
Constructing a music dataset is a tedious and demanding effort. Existing datasets tend to avoid distributing the music itself and provide only metadata and derived information, because of data size and copyright.

Contribution and Motivation
The Greek Music Dataset contains 1400 songs and provides:
Audio, lyric and symbolic features for immediate use in MIR tasks.
Manually annotated labels for mood and genre.
Metadata.
Manually selected MIDI files (currently available for 500 of the tracks).
A manually selected link to a performance (audio content) on YouTube for each song, provided for further research.

Greek Music Dataset vs Greek Audio Dataset
+400 songs, focused on traditional, uniquely Greek genres.
500 MIDI files with symbolic feature sets.
Manual multi-label annotation of genre tags.
Updated audio feature sets.
Lyric feature sets.
Last.fm ID tags for further extraction.

Gathering the Content
Audio: A broad range of Greek music, from traditional to modern. 100 songs were removed and 500 new songs were added. Audio was sourced from the best available YouTube links (judged by number of views, number of responses and audio quality).
Lyrics: Retrieved from various sources, mainly stixoi.info [2], and checked to match the audio performance.
Symbolic: MIDI files were collected from the Greek Midi Database [3], preprocessed, and checked manually for precise correspondence between the MIDI content and the performance.
[2] stixoi.info: Greek lyrics for songs and poetry, http://www.stixoi.info/
[3] Greek Midi Database: George's Greek MIDI Site, http://www.greekmidi.com/

Genre Annotation
Greek genre tags were taken from MyGreek.fm [4]. The tags are oriented towards Greek musical culture: Rembetiko, Laiko, Entexno, Modern Laiko, Rock, Hip-Hop/R&B, Pop, Alternative.
Multi-label assignment, based on listening tests per song. 2421 annotations in total:
521 single-label annotations from the 8 genre classes
748 double-label annotations from 17 different combinations
119 triple-label annotations from 15 different combinations
12 quadruple-label annotations from 8 different combinations
[4] Mygreek.fm: The biggest collection of Greek music on the Internet, with different styles and genres, http://www.mygreek.fm/
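A multi-label genre assignment of this kind is commonly stored as one binary indicator per genre class. The following is a minimal sketch of that encoding, not the dataset's actual storage format; the function names and the example songs are hypothetical.

```python
# The 8 genre classes listed on the slide.
GENRES = ["Rembetiko", "Laiko", "Entexno", "Modern Laiko",
          "Rock", "Hip-Hop/R&B", "Pop", "Alternative"]


def encode_genres(tags):
    """Turn a list of genre tags into a 0/1 indicator vector over GENRES."""
    unknown = set(tags) - set(GENRES)
    if unknown:
        raise ValueError(f"unknown genre tags: {sorted(unknown)}")
    return [1 if genre in tags else 0 for genre in GENRES]


def label_cardinality(vectors):
    """Average number of labels per song (a common multi-label statistic)."""
    return sum(sum(v) for v in vectors) / len(vectors)


# Hypothetical annotations for three songs (not taken from the dataset).
songs = [["Rembetiko"], ["Laiko", "Entexno"], ["Rock", "Pop", "Alternative"]]
vectors = [encode_genres(tags) for tags in songs]
print(vectors[1])                  # indicator vector for the second song
print(label_cardinality(vectors))  # average labels per song
```

An indicator representation like this makes single-, double-, triple- and quadruple-label songs directly comparable, since the number of labels is simply the row sum.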

Mood Annotation
Single-label annotation, measuring Valence (A-D) and Arousal (1-4).
Mood information: The model of Thayer is adopted, a two-dimensional emotive plane with Valence (tension) and Arousal (energy). "Arousal" is the level or amount of physical response, and "Valence" is the emotional "direction" of that emotion.
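With four valence levels (A-D) and four arousal levels (1-4), each song falls into one of 16 cells of the plane. The sketch below assumes a mood label is written as a letter followed by a digit, such as "B3"; that string format and the function names are assumptions made for illustration.

```python
VALENCE_LEVELS = "ABCD"   # valence axis, as on the slide
AROUSAL_LEVELS = "1234"   # arousal axis, as on the slide


def parse_mood(label):
    """Split a label such as 'B3' into zero-based (valence, arousal) indices."""
    label = label.strip().upper()
    if (len(label) != 2
            or label[0] not in VALENCE_LEVELS
            or label[1] not in AROUSAL_LEVELS):
        raise ValueError(f"not a valid mood label: {label!r}")
    return VALENCE_LEVELS.index(label[0]), AROUSAL_LEVELS.index(label[1])


def mood_class(label):
    """Map a label to a single class id in the range 0..15."""
    valence, arousal = parse_mood(label)
    return valence * len(AROUSAL_LEVELS) + arousal


print(parse_mood("B3"))
print(mood_class("D4"))
```

Keeping the two axes separate supports regression or per-axis classification, while the single class id suits a plain 16-way classifier.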

Audio Features
Extracted from CD-quality wave files (44.1 kHz, 16 bit) using the Marsyas software. 454 features divided into 4 sets.
Timbral Texture Feature Sets
Standard Timbral Set (68 features): The most commonly used feature set (MFCCs, zero crossings, spectral features).
Other Timbral Features (264 features): A combination of features that focus on the magnitude spectrum.
Rhythm Features
Beat Histogram (18 features): A vector containing the most commonly used rhythmic features (detecting and measuring peaks, bpm, etc.).
Pitch (Chroma) Content Features
Chroma Set (104 features): A combination of Chroma and Linear Prediction Cepstral Coefficients (LPC) features.
Notes: Mel-frequency cepstral coefficients are a set of perceptual features that have been widely used in speech recognition. The Method of Moments consists of the first five statistical moments of the spectrogram.
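To make one of the timbral features concrete, the sketch below computes a zero-crossing rate for a single frame of samples in plain Python. It illustrates the general idea only: it is not Marsyas code, and Marsyas's exact definition and normalization may differ.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    `frame` is a sequence of at least two audio samples. A sample equal
    to zero is treated as non-negative here, which is one of several
    possible conventions.
    """
    if len(frame) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for previous, current in zip(frame, frame[1:])
        if (previous >= 0) != (current >= 0)
    )
    return crossings / (len(frame) - 1)


# A rapidly alternating frame crosses zero at every step,
# while a frame that stays positive never does.
print(zero_crossing_rate([0.5, -0.5, 0.5, -0.5]))
print(zero_crossing_rate([0.1, 0.2, 0.3]))
```

Noisy or percussive frames tend to give high values and tonal, low-pitched frames low values, which is why the measure is often grouped with timbral descriptors.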

Lyric Features
A selection of 5 feature sets based on the bag-of-words (BOW) model, extracted from Greek song lyrics. The most popular BOW features are various unigram, bigram and trigram representations.
Metrics: GMD includes TF (term frequency) and TF-IDF term weighting.
1. A unigram set of the 250 most frequent words, including function words.
2. A unigram set of the 60 most frequent words, excluding function words.
3. A bigram set of the 100 most frequent bigrams.
4. A trigram set of the 60 most frequent trigrams.
5. A unigram set of the 60 most frequent function words.
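The feature sets above can be sketched with a few lines of standard-library Python: count n-grams over all lyrics, keep the top k as the vocabulary, then weight each song's counts. This is a minimal illustration, not the authors' pipeline; it assumes lyrics are already tokenized, uses raw counts as TF and log(N/df) as IDF (one common variant among several), and the toy lyrics and function-word list are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_k_vocabulary(docs, n, k, function_words=frozenset()):
    """The k most frequent n-grams over all documents.

    For unigrams, entries found in `function_words` are skipped, which
    mirrors the "without function words" set.
    """
    counts = Counter()
    for tokens in docs:
        for gram in ngrams(tokens, n):
            if n == 1 and gram in function_words:
                continue
            counts[gram] += 1
    return [gram for gram, _ in counts.most_common(k)]


def tf_idf_vectors(docs, vocabulary, n):
    """One TF-IDF vector per document over the given vocabulary."""
    per_doc = [Counter(ngrams(tokens, n)) for tokens in docs]
    total = len(docs)
    idf = {}
    for gram in vocabulary:
        doc_freq = sum(1 for counts in per_doc if gram in counts)
        idf[gram] = math.log(total / doc_freq) if doc_freq else 0.0
    return [[counts[gram] * idf[gram] for gram in vocabulary]
            for counts in per_doc]


# Invented, transliterated toy lyrics (two "songs").
lyrics = [["agapi", "mou", "agapi"], ["thalassa", "mou"]]
vocabulary = top_k_vocabulary(lyrics, n=1, k=2, function_words={"mou"})
print(vocabulary)
print(tf_idf_vectors(lyrics, vocabulary, n=1))
```

A word that occurs in every song gets an IDF of zero under this weighting, which is why the dataset's plain TF variant remains useful for function-word features.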

Symbolic Features
High-level features that emphasize musical characteristics. Examples: instruments present, melodic contour, chord frequencies and rhythmic density. They are considered more powerful than audio features, but are rarely used because few symbolic datasets exist.
Feature extraction was done with Music21, producing 2 different feature sets.
jSymbolic Set (78 features): Includes features describing instrumentation, rhythm, dynamics (loudness) and chords, and features that detect melodic variations or patterns.
Native Music21 Set (17 features): A specialized, very high-level feature set that requires a high level of knowledge of musical harmony.

Available Data + Metadata
For 621 of its tracks, the GMD additionally includes the corresponding Last.fm ID, to make it easier to collect further information (such as social tags) from Last.fm. The IDs were collected manually.
GMD also offers YouTube links, lyrics and MIDI files for further feature extraction.

Dataset Format
The data is available in two formats, HDF5 and CSV.
HDF5: Efficient for handling heterogeneous types of information, such as audio features in arrays of variable length and names stored as strings, and easy to extend with new types of features. It follows the structure of the Million Song Dataset (MSD).
CSV: Compatible with Weka, RapidMiner and similar data mining platforms. GMD provides the audio feature sets commonly used in MIR as separate CSV files.
Available for download from the webpage of the Informatics in Humanistic and Social Sciences Lab: http://di.ionio.gr/hilab/gmd
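A CSV feature file of this kind can be read with the Python standard library alone. The sketch below assumes a header row, numeric feature columns and one label column; the column names `zcr_mean`, `mfcc1_mean` and `genre` are hypothetical, so check the header of the downloaded files before relying on them.

```python
import csv
import io


def load_feature_table(handle, label_column):
    """Read a CSV with a header row into (feature dicts, labels).

    Every column except `label_column` is parsed as a float.
    """
    features, labels = [], []
    for row in csv.DictReader(handle):
        labels.append(row.pop(label_column))
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


# An in-memory stand-in for a downloaded feature file (invented values).
sample = io.StringIO(
    "zcr_mean,mfcc1_mean,genre\n"
    "0.081,-12.5,Laiko\n"
    "0.134,-9.75,Rock\n"
)
features, labels = load_feature_table(sample, label_column="genre")
print(features[0])
print(labels)
```

For a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object.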

Future Directions
Addition of symbolic information for the remaining tracks.
MIDI and audio alignment.
Incorporation of contextual information for each track from social networks.
Addition of Last.fm ID tags (or similar) for further social tag extraction.
Experimentation on data mining tasks using the dataset.

The Greek Music Dataset Thank you for your attention!