Automatic Transcription of Polyphonic Music


Automatic Transcription of Polyphonic Music Jana Eggink Supervisor: Guy J. Brown University of Sheffield {j.eggink g.brown}@dcs.shef.ac.uk

Me
• Master's degree in systematic musicology with computer science as a second subject, University of Hamburg, Germany, 2001
• Interested in perception and computational auditory scene analysis
• And, of course, music
Jana Eggink, Sheffield, UK. Automatic Transcription of Music

Automatic Transcription of Music
• General idea: get a score from an audio file
• Useful for: automatic music indexing and analysis, detection of copyright infringement, ‘query-by-humming’ systems...
• Many subtasks: F0 estimation, onset detection, instrument recognition
[Figure label: flute]

Missing Data
• Ideas from speech recognition and speaker identification in noisy environments
• Missing data: ignore time-frequency regions dominated by interfering sound sources
• Unreliable or missing features are determined from the F0s of interfering tones
• Works well with two simultaneous sounds
• But F0 estimation already runs into problems with 3 or 4 concurrent tones
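A minimal sketch of the missing data idea, not the system described in the slides: frequency bins close to a harmonic of a known interfering F0 are flagged as unreliable so that a recogniser can ignore them. The function name `harmonic_mask`, the fixed tolerance in Hz and the number of harmonics are assumptions made for illustration only.

```python
def harmonic_mask(bin_freqs, interfering_f0s, n_harmonics=15, tolerance_hz=20.0):
    """Return one boolean per frequency bin: True = reliable, False = likely masked.

    A bin counts as masked when it lies within tolerance_hz of any of the
    first n_harmonics harmonics of any interfering F0 (illustrative rule).
    """
    mask = []
    for f in bin_freqs:
        masked = any(
            abs(f - h * f0) <= tolerance_hz
            for f0 in interfering_f0s
            for h in range(1, n_harmonics + 1)
        )
        mask.append(not masked)
    return mask


# Toy example: bins at 100, 200, ..., 1000 Hz and one interfering tone at 200 Hz.
bin_freqs = [100.0 * k for k in range(1, 11)]
mask = harmonic_mask(bin_freqs, interfering_f0s=[200.0], n_harmonics=5)
```

In this toy case every second bin (200, 400, ... Hz) is flagged as masked. The sketch also shows why the approach depends on F0 estimation: with 3 or 4 concurrent tones, a wrong interfering F0 puts the mask in the wrong place.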

Change of Focus…
• Instead of using missing data masks based on interfering sound sources, concentrate on the signal you want to identify
• Task: identify the solo instrument in accompanied sonatas and concertos

Identify Solo Instrument
• Instrument sounds are harmonic; energy is concentrated in the partials, which are the regions least likely to be masked by other sounds
• Only the most dominant F0 needs to be identified
• Features are based only on the frequency position and power of the lowest 15 partials
• Statistical recogniser (GMMs) trained on monophonic music
[Figure: block diagram with labels: audio signal, F0 and partials, features, recogniser; output classes: flute, clarinet, oboe, violin, cello]
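The feature extraction step can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the input is assumed to be a list of spectral peaks as (frequency, power) pairs, the dominant F0 is assumed to be known, and the name `partial_features`, the search window and the handling of missing partials are hypothetical. The GMM recogniser itself is not shown.

```python
def partial_features(peaks, f0, n_partials=15, search_hz=30.0):
    """Collect (frequency position, power) for the lowest n_partials partials.

    peaks: list of (frequency_hz, power) spectral peaks.
    For each expected partial at h * f0, the strongest peak within search_hz
    is taken. A missing partial is reported at its expected frequency with
    zero power (an illustrative choice).
    """
    features = []
    for h in range(1, n_partials + 1):
        target = h * f0
        nearby = [(f, p) for f, p in peaks if abs(f - target) <= search_hz]
        if nearby:
            features.append(max(nearby, key=lambda fp: fp[1]))
        else:
            features.append((target, 0.0))
    return features


# Toy example: a 440 Hz tone plus one strong peak from the accompaniment at 1000 Hz.
peaks = [(440.0, 1.0), (882.0, 0.5), (1000.0, 0.9), (1320.0, 0.25)]
features = partial_features(peaks, f0=440.0, n_partials=4)
```

The strong 1000 Hz peak does not enter the feature vector because it is not near any partial of the solo tone, which is the point of concentrating on the target signal instead of on the interfering sources.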

Results: Instrument Identification
• Solo instrument with accompaniment (piano or orchestra), commercially available CDs, 90 examples, 2-3 min. each
• Instrument identification 86% correct, as good as other systems that only work with monophonic music

Find the Melody (assuming the solo instrument is known)
• Extract multiple F0 candidates
• Include additional knowledge about instrument range, tone duration and likely interval transitions to pick the correct candidate
• Local knowledge: F0 strength (~loudness), F0 likelihood (absolute frequency | instrument range), instrument likelihood (recogniser output)
• Temporal knowledge: tone length, interval transitions
• Find the most likely ‘path’ through the time-frequency space of F0 candidates
[Figure: block diagram with labels: audio, F0 candidates, local knowledge, temporal knowledge, silence estimation (only accompaniment?), melody]
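The path search can be illustrated with a small dynamic programming (Viterbi-style) sketch. It is a simplified stand-in for the method on the slide: each candidate carries a single local score (assumed here to already combine F0 strength, F0 likelihood and instrument likelihood), and the only temporal knowledge is a penalty proportional to the interval size in semitones. Tone length and silence estimation are left out, and the names `best_path` and `jump_penalty` are hypothetical.

```python
import math


def semitones(f_a, f_b):
    """Absolute interval between two frequencies in semitones."""
    return abs(12.0 * math.log2(f_b / f_a))


def best_path(frames, jump_penalty=0.5):
    """Pick one F0 per frame so that the summed score is maximal.

    frames: list of frames, each a non-empty list of (f0_hz, local_score).
    Score of a path = sum of local scores - jump_penalty * interval size
    (in semitones) for every transition between consecutive frames.
    """
    scores = [s for _, s in frames[0]]  # best score ending at each candidate
    back = []                           # back-pointers, one list per transition
    for prev, cur in zip(frames, frames[1:]):
        new_scores, pointers = [], []
        for f_cur, s_cur in cur:
            options = [
                scores[i] - jump_penalty * semitones(f_prev, f_cur)
                for i, (f_prev, _) in enumerate(prev)
            ]
            best_i = max(range(len(options)), key=options.__getitem__)
            new_scores.append(options[best_i] + s_cur)
            pointers.append(best_i)
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final candidate.
    i = max(range(len(scores)), key=scores.__getitem__)
    path = [i]
    for pointers in reversed(back):
        i = pointers[i]
        path.append(i)
    path.reverse()
    return [frames[t][i][0] for t, i in enumerate(path)]


# Toy example: in the middle frame a spurious octave (880 Hz) is locally strongest.
frames = [
    [(440.0, 1.0), (220.0, 0.8)],
    [(440.0, 0.6), (880.0, 0.9)],
    [(440.0, 1.0), (220.0, 0.2)],
]
strongest = [max(frame, key=lambda c: c[1])[0] for frame in frames]
melody = best_path(frames)
```

Picking the strongest candidate per frame follows the octave jump in the middle frame, while the path search stays on 440 Hz because two octave leaps cost more than the small gain in local score.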

Results
• The number of correct frames improves from 40% to 54%, and the number of spurious tones is greatly reduced
• Beginning of Mozart’s Clarinet Concerto, manually annotated F0s (gray) and estimated melody (black)
[Figure: two panels, "Melody based on strongest F0" and "Melody based on knowledge integrating path finding"; axes: time (frames) and F0 (Hz)]

Experiences in the Network
• Many techniques used for speech recognition are (with modifications) transferable to musical applications, but are often not well known among people who work with music; very good training in Sheffield
• Insight into other related areas in the HOARSE meetings, which are also great for personal contacts