Automated Transcription of Polyphonic Piano Music: A Brief Literature Review
Catherine Lai, MUMT-611 MIR, February 17, 2005

Presentation transcript:


Outline of Presentation
Introduction
– transcription of polyphonic music
– targeted on specific instruments
Current state of the art: various approaches
Recently published piano transcription systems
– Dixon, 2000
– Raphael, 2002
– Monti and Sandler, 2002
– Marolt, 2004
Discussion and Conclusion
Links to examples of transcription of piano music recordings
Bibliography

Introduction
Transcription of polyphonic music
– acoustical waveform --> parametric representation
– extract pitches, starting times, durations
First attempt by Moorer, 1975
– limited note range
– constrained to two voices
Martin, 1996
– piano transcription system for up to four voices
– chorale style of J.S. Bach (long durations with block chords)
Later systems tackled these limitations
– systems targeted on specific instruments
Focus of this literature review
– automated transcription of polyphonic piano music

Current State of the Art: Various Approaches
Automated transcription of polyphonic piano music
– input: audio files containing polyphonic piano music
– output: MIDI representing pitch, timing, volume
Simon Dixon, 2000, “On the Computer Recognition of Solo Piano Music”
– standard signal processing (SP), adaptive peak-picking, pattern matching
Christopher Raphael, 2002, “Automatic Transcription of Piano Music”
– hidden Markov model (HMM)
Monti and Sandler, 2002, “Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System”
– blackboard algorithm
Matija Marolt, 2004, “A connectionist approach to automatic transcription of polyphonic piano music”
– neural network models
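The output described above (pitch, timing, volume) can be pictured as a list of note events. The following is a minimal sketch only: the `NoteEvent` class, its field names, and the example notes are hypothetical and do not come from any of the reviewed systems. The pitch-to-frequency conversion assumes equal temperament with A4 = 440 Hz.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """One transcribed note (hypothetical structure, for illustration)."""
    pitch: int        # MIDI note number (60 = middle C)
    onset: float      # start time in seconds
    duration: float   # length in seconds
    velocity: int     # MIDI velocity 0-127, a proxy for volume


def midi_to_hz(pitch: int) -> float:
    """Equal-tempered frequency, assuming A4 (MIDI 69) is tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)


# A made-up two-note fragment: middle C followed by the E above it.
notes = [NoteEvent(60, 0.0, 0.5, 80), NoteEvent(64, 0.5, 0.5, 72)]
for note in notes:
    print(note.pitch, round(midi_to_hz(note.pitch), 2), note.onset)
```

A transcription system would fill such a list from audio; here the values are entered by hand.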

Published Piano Transcription System: Simon Dixon, “On the computer recognition of solo piano music” [standard SP approach]
1st processing stage
– low-pass filtering --> down-sampling the signal (12 kHz)
Time-frequency representation
– STFT --> power spectrum --> spectral peak extraction (local maxima > threshold, adaptive peak-picking algorithm)
– frequency tracks --> grouping partials --> musical notes
Evaluation: 13 Mozart piano sonatas performed by a concert pianist
– Bösendorfer SE290 computer-monitored piano --> MIDI
Results: N = no. of correctly identified notes; FP = no. of notes reported but not played; FN = no. of notes played but not reported by the system; an incorrectly identified note counts as both FP and FN
– score = N/(FP + FN + N)
– recognition accuracy of 70-80%
Future development
– accuracy of dynamics and offset times
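As a rough illustration of two steps on this slide, the sketch below computes the power spectrum of one frame, picks local maxima above a threshold, and evaluates the score formula. It is not Dixon's implementation: the fixed threshold stands in for the adaptive peak-picking algorithm, and the function names, the Hann window, and the example numbers are assumptions.

```python
import numpy as np


def frame_power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame (one STFT column)."""
    frame = np.asarray(frame, dtype=float)
    return np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2


def spectral_peaks(power, threshold):
    """Indices of bins that are local maxima and exceed `threshold`.

    Simplification: the threshold is fixed here, whereas the slide
    describes an adaptive peak-picking algorithm.
    """
    power = np.asarray(power, dtype=float)
    inner = power[1:-1]
    is_peak = (inner > power[:-2]) & (inner > power[2:]) & (inner > threshold)
    return np.nonzero(is_peak)[0] + 1  # +1 restores the original bin index


def transcription_score(n_correct, false_pos, false_neg):
    """score = N / (FP + FN + N), as given on the slide."""
    return n_correct / (false_pos + false_neg + n_correct)


# Made-up counts: 70 correct notes, 20 false positives, 10 false negatives.
print(transcription_score(70, 20, 10))
```

Grouping the picked peaks into frequency tracks and then into notes is the harder part of the pipeline and is not attempted here.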

Published Piano Transcription System: Christopher Raphael, “Automatic transcription of piano music” [HMM]
HMM: trained likelihood model
– statistical pattern recognition and machine learning for structures
Process
– segment the signal into frames; extract a feature vector from each frame; assign each frame a label describing its content
Feature vector components
– total energy (playing or silent)
– local “burstiness” (attack, steady behavior)
– pitch configuration
Label
– collection of sounding pitches and articulation state (attack, sustain, rest)
Model setup
– hidden process (label process); observable process (feature vector)
– generate reasonable hypotheses for each frame and construct a search graph of the hypotheses
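The hidden/observable setup above is commonly decoded with the Viterbi algorithm, which finds the most likely label sequence given per-frame likelihoods. The sketch below is a generic textbook version over a tiny, fixed state set; it is an assumption that it resembles the search used in Raphael's system, which builds its graph from per-frame hypotheses. The function name and argument layout are illustrative.

```python
import numpy as np


def viterbi(log_init, log_trans, log_obs):
    """Most likely hidden-state path through an HMM.

    log_init:  (S,)   log prior over states
    log_trans: (S, S) log_trans[i, j] = log P(next state j | state i)
    log_obs:   (T, S) log likelihood of frame t's features under state s
    """
    n_frames, n_states = log_obs.shape
    score = log_init + log_obs[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans      # cand[i, j]: arrive in j from i
        back[t] = np.argmax(cand, axis=0)      # best predecessor of each j
        score = cand[back[t], np.arange(n_states)] + log_obs[t]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(score))]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With states standing for rest, attack, and sustain, `log_obs` would come from the trained likelihood model applied to each frame's feature vector.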

Published Piano Transcription System: Christopher Raphael, “Automatic transcription of piano music” [HMM]
Experiment
– Mozart piano sonata
– limited range (from the C two octaves below middle C to the F two and a half octaves above middle C)
– four voices or fewer
Evaluation
– borrowed from the speech recognition measure “Word Error Recognition Rate”
– Error Rate = 100 * (Insertions + Deletions + Substitutions) / (Total Words in Truth Sentence)
– preliminary results have a “Note Error Rate” of 39%
  • 184 substitutions, 241 deletions, 108 insertions out of 1360 notes
Future improvement
– simple additions may yield better results
  • likelihood of chord sequences
  • informative acoustic cues for note onsets
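The reported figure can be checked directly from the counts on the slide. This is a small worked example; the function name is illustrative.

```python
def note_error_rate(substitutions, deletions, insertions, total_notes):
    """Error Rate = 100 * (insertions + deletions + substitutions) / total."""
    return 100.0 * (insertions + deletions + substitutions) / total_notes


# Counts from the slide: 184 substitutions, 241 deletions, 108 insertions
# out of 1360 notes, i.e. 533 errors in total.
rate = note_error_rate(184, 241, 108, 1360)
print(f"{rate:.1f}%")  # 39.2%
```

The result rounds to the 39% quoted in the preliminary results.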

Published Piano Transcription System: Monti and Sandler, “Automatic polyphonic piano note extraction using fuzzy logic in a blackboard system” [Blackboard algorithm]
Implementation
– polyphonic note recognition using a Fuzzy Inference System (FIS) as one of the Knowledge Sources (KSs) in a blackboard system
Blackboard model arrangement
– hierarchy of data abstraction levels
– KSs drive advancement and are activated by the Scheduler
FIS
– takes the spectral peaks not yet selected
– creates new Note Candidates
– evaluates each Candidate by its features
  • fundamental of the note
  • harmonic rate
  • difference between the maximum peak in the spectrum and the energy of the Candidate's fundamental
[Figure: Blackboard system (Monti and Sandler, 2002)]

Published Piano Transcription System: Monti and Sandler, “Automatic polyphonic piano note extraction using fuzzy logic in a blackboard system” [Blackboard algorithm]
Evaluation
– 14 piano pieces by various composers, including Beethoven, Mozart, Debussy, Ravel, and Scarlatti
Results: N = correctly identified notes; FP = notes reported but not played; FN = notes played but not reported by the system
– score = N/(FP + FN + N) (Dixon's formula)
– detection success rate = 45% correct
– 75% = correctly detected notes / total transcribed notes

Published Piano Transcription System: Matija Marolt, 2004, “A connectionist approach to automatic transcription of polyphonic piano music” [Neural networks approach]
A new model based on networks of adaptive oscillators was proposed and implemented in SONIC for partial tracking and note recognition
Marolt, neural networks; other models tested: multilayer perceptron, radial basis function, etc.
1. acoustical waveform --> time-frequency space with an auditory model
2. the auditory model outputs a set of frequency channels
3. periodicity in the frequency channels is related to pitch perception
4. adaptive oscillators are used to calculate periodicity in the frequency channels
5. each adaptive oscillator tries to synchronize to the signal in an output frequency channel of the auditory model by adjusting its phase and frequency
6. when an oscillator synchronizes, the channel output is periodic and a partial with a frequency similar to that of the filter is present

Published Piano Transcription System: Matija Marolt, 2004, “A connectionist approach to automatic transcription of polyphonic piano music” [Neural networks approach]
Evaluation
– tested on synthesized and real recordings of various genres
Results
– synthesized recordings: around 90% of all notes
– real recordings: results not as good (figures not available)
– most common errors (> 50%): octaves and rapidly played notes (e.g. arpeggios, trills)
– greatest challenge: very expressive playing
  • Chopin's Nocturnes
  • quiet and almost inaudible left hand
Further development
– detecting repeated notes
[Figure: Marolt, 2004]

Discussion and Conclusion
Various approaches proposed
– standard SP techniques; HMM; blackboard algorithm; neural networks
Common mistakes
– octaves, rapid passages, and quiet notes
Difficulties
– lack of a standard set of test examples
– evaluation function
  • various constraints and formulas --> comparison is difficult

Piano transcription system    Performance results
Dixon                         70-80% correct
SONIC                         80-95% correct
Raphael                       39% wrong
Monti and Sandler             74% correct

Links to examples of transcription of piano music recordings
– (Marolt)
– (Dixon)

Bibliography
Dixon, S. 2000. On the Computer Recognition of Solo Piano Music. Australasian Computer Music Conference.
Marolt, M. 2004. A connectionist approach to automatic transcription of polyphonic piano music. IEEE Transactions on Multimedia 6, no. 3 (June).
Martin, K. 1996. A blackboard system for automatic transcription of simple polyphonic music. MIT Media Laboratory Perceptual Computing Section Technical Report.
Monti, G., and M. Sandler. 2002. Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System. Proceedings of the International Conference on Digital Audio Effects.
Moorer, J. 1975. On the segmentation and analysis of continuous musical sound by digital computer. Ph.D. thesis, Stanford University, CCRMA.
Raphael, C. 2002. Automatic Transcription of Piano Music. Proceedings of the International Conference on Music Information Retrieval.