Term Project Presentation By: Keerthi C Nagaraj Dated: 30th April 2003

Slides:

Advertisements

Similar presentations

Evidential modeling for pose estimation Fabio Cuzzolin, Ruggero Frezza Computer Science Department UCLA.

Advertisements

Speech Enhancement through Noise Reduction By Yating & Kundan.

Multipitch Tracking for Noisy Speech

Content-based retrieval of audio Francois Thibault MUMT 614B McGill University.

Enabling Access to Sound Archives through Integration, Enrichment and Retrieval Report about polyphonic music transcription.

1 Slides for the book: Probabilistic Robotics Authors: Sebastian Thrun Wolfram Burgard Dieter Fox Publisher: MIT Press, Web site for the book & more.

Foreground Modeling The Shape of Things that Came Nathan Jacobs Advisor: Robert Pless Computer Science Washington University in St. Louis.

Multiple People Detection and Tracking with Occlusion Presenter: Feifei Huo Supervisor: Dr. Emile A. Hendriks Dr. A. H. J. Stijn Oomes Information and.

Communications & Multimedia Signal Processing Meeting 6 Esfandiar Zavarehei Department of Electronic and Computer Engineering Brunel University 6 July,

Single-Channel Speech Enhancement in Both White and Colored Noise Xin Lei Xiao Li Han Yan June 5, 2002.

A Bayesian algorithm for tracking multiple moving objects in outdoor surveillance video Department of Electrical Engineering and Computer Science The University.

16722 Sensing and Sensors Mel Siegel )

Probabilistic Robotics Introduction Probabilities Bayes rule Bayes filters.

Communications & Multimedia Signal Processing Analysis of Effects of Train/Car noise in Formant Track Estimation Qin Yan Department of Electronic and Computer.

IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING MARCH 2010 Lan-Ying Yeh

/14 Automated Transcription of Polyphonic Piano Music A Brief Literature Review Catherine Lai MUMT-611 MIR February 17,

Adaptive Signal Processing Class Project Adaptive Interacting Multiple Model Technique for Tracking Maneuvering Targets Viji Paul, Sahay Shishir Brijendra,

EE513 Audio Signals and Systems Statistical Pattern Classification Kevin D. Donohue Electrical and Computer Engineering University of Kentucky.

GCT731 Fall 2014 Topics in Music Technology - Music Information Retrieval Audio and Music Representations (Part 2) 1.

GCT731 Fall 2014 Topics in Music Technology - Music Information Retrieval Overview of MIR Systems Audio and Music Representations (Part 1) 1.

A VOICE ACTIVITY DETECTOR USING THE CHI-SQUARE TEST

Instrument Recognition in Polyphonic Music Jana Eggink Supervisor: Guy J. Brown University of Sheffield

Multimodal Interaction Dr. Mike Spann

Prakash Chockalingam Clemson University Non-Rigid Multi-Modal Object Tracking Using Gaussian Mixture Models Committee Members Dr Stan Birchfield (chair)

BraMBLe: The Bayesian Multiple-BLob Tracker By Michael Isard and John MacCormick Presented by Kristin Branson CSE 252C, Fall 2003.

1. Introduction Motion Segmentation The Affine Motion Model Contour Extraction & Shape Estimation Recursive Shape Estimation & Motion Estimation Occlusion.

University of Southern California Department Computer Science Bayesian Logistic Regression Model (Final Report) Graduate Student Teawon Han Professor Schweighofer,

Mapping and Localization with RFID Technology Matthai Philipose, Kenneth P Fishkin, Dieter Fox, Dirk Hahnel, Wolfram Burgard Presenter: Aniket Shah.

Crowdsourcing with Multi- Dimensional Trust Xiangyang Liu 1, He He 2, and John S. Baras 1 1 Institute for Systems Research and Department of Electrical.

1 Robot Environment Interaction Environment perception provides information about the environment’s state, and it tends to increase the robot’s knowledge.

Jacob Zurasky ECE5526 – Spring 2011

Audio Thumbnailing of Popular Music Using Chroma-Based Representations Matt Williamson Chris Scharf Implementation based on: IEEE Transactions on Multimedia,

Name : Arum Tri Iswari Purwanti NPM :

Experimental Results ■ Observations:  Overall detection accuracy increases as the length of observation window increases.  An observation window of 100.

Basics of Neural Networks Neural Network Topologies.

Structure Discovery of Pop Music Using HHMM E6820 Project Jessie Hsu 03/09/05.

Jun-Won Suh Intelligent Electronic Systems Human and Systems Engineering Department of Electrical and Computer Engineering Speaker Verification System.

Using Feed Forward NN for EEG Signal Classification Amin Fazel April 2006 Department of Computer Science and Electrical Engineering University of Missouri.

Chapter 5 Multi-Cue 3D Model- Based Object Tracking Geoffrey Taylor Lindsay Kleeman Intelligent Robotics Research Centre (IRRC) Department of Electrical.

A Trust Based Distributed Kalman Filtering Approach for Mode Estimation in Power Systems Tao Jiang, Ion Matei and John S. Baras Institute for Systems Research.

Performance Comparison of Speaker and Emotion Recognition

Chapter 20 Classification and Estimation Classification – Feature selection Good feature have four characteristics: –Discrimination. Features.

Ch.9 Bayesian Models of Sensory Cue Integration (Mon) Summarized and Presented by J.W. Ha 1.

Tracking with dynamics

Automatic Transcription System of Kashino et al. MUMT 611 Doug Van Nort.

EE 551/451, Fall, 2006 Communication Systems Zhu Han Department of Electrical and Computer Engineering Class 15 Oct. 10 th, 2006.

Probabilistic Robotics Introduction Probabilities Bayes rule Bayes filters.

Piano Music Transcription Wes “Crusher” Hatch MUMT-614 Thurs., Feb.13.

Bayesian Brain Probabilistic Approaches to Neural Coding 1.1 A Probability Primer Bayesian Brain Probabilistic Approaches to Neural Coding 1.1 A Probability.

This research was funded by a generous gift from the Xerox Corporation. Beat-Detection: Keeping Tabs on Tempo Using Auto-Correlation David J. Heid

Genre Classification of Music by Tonal Harmony Carlos Pérez-Sancho, David Rizo Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante,

Descriptive Statistics The means for all but the C 3 features exhibit a significant difference between both classes. On the other hand, the variances for.

Automatic Transcription of Polyphonic Music

A 2 veto for Continuous Wave Searches

Catherine Lai MUMT-611 MIR February 17, 2005

ARTIFICIAL NEURAL NETWORKS

Outlier Processing via L1-Principal Subspaces

Department of Civil and Environmental Engineering

Inconsistent Constraints

Markov ó Kalman Filter Localization

Dynamical Statistical Shape Priors for Level Set Based Tracking

Information-Theoretic Listening

Linear Predictive Coding Methods

PRAKASH CHOCKALINGAM, NALIN PRADEEP, AND STAN BIRCHFIELD

Graduate School of Information Sciences, Tohoku University

EE513 Audio Signals and Systems

Presenter: Simon de Leon Date: March 2, 2006 Course: MUMT611

Dealing with Acoustic Noise Part 1: Spectral Estimation

Presenter: Shih-Hsiang(士翔)

Measuring the Similarity of Rhythmic Patterns

Presentation transcript:

Term Project Presentation By: Keerthi C Nagaraj Dated: 30th April 2003 Toward Automatic Transcription -- Pitch Tracking In Polyphonic Environment Term Project Presentation By: Keerthi C Nagaraj Dated: 30th April 2003

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Outline Introduction Background problems in polyphonic pitch tracking Previous approaches Sinusoidal Modeling Auditory Modeling Current Approach Use of Prior Knowledge - Bayesian Probability Network Implementation Results Conclusion Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Introduction What do we have? What do we need? Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Pitch estimation Process: Segmentation /Rhythm tracking “pitch info” extraction Feature analysis Tone model Most probable F0 Candidates Eliminate interfering harmonics Best pitch estimate Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Problems with Polyphonic Pitch extraction Mathematically ambiguous problem Overlapping partials expressionist performance, not traceable Onset asynchronies Percussion sounds in “real world” signals Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Past work Sinusoidal Model: STFT, Constant Q transforms, Bounded Q transforms More focussed on forming a mathematical model of pitch perception Auditory Model: Lyon’s Cochlear Model, Meddis & Hewitt Model More focussed on laying a perceptual background Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Encountered Problems They do not eliminate the confusion due to overlapping partials Frame to Frame independent calculation Approach: Use higher level knowledge Cross frame data integration Probabilistic/ ‘belief based’ approach Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Current approach Step 1: Using Auditory model to extract sound as perceived by the ear Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Current Approach ( Contd. ) Step 2: Extract pertinent features of the sound ( Loudness, F0 & color)--use of Summary Auto-Correlation Function (SACF) Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Bayesian modeling Step3: use of the features as knowledge base Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Implementation Assign a priori pdfs to the parameters => The joint posterior probabilities are obtained as: Where M= 2q Q Hq, q ={q , Hq },  = N/2 +  , p(MQ) = p ( {q } q =1:Q 2 represents the expected SNR , Gc Composite basis matrix Reference :Wamsley Godsill & Rayner -1999 Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Implementation (Contd.) Avg frequency over the block and its variance For each multi-frame, Collect the peaks,multiply with the reliability vector pass the output through a weighted median filter Find error by comparing the evolving model and the observed data Update the reliability vector repeat the process to minimize the error Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Results Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Keerthi C Nagaraj, Department of Electrical & Computer Engineering Results (Contd.) Keerthi C Nagaraj, Department of Electrical & Computer Engineering

Conclusion & Future work Auditory model for pitch perception was implemented Hierarchy of music information was modeled as a simple Bayesian probability network Pitch tracking was done using auditory model front end processing and knowledge based resolving of partials Beat tracking can be done to shorten focus of pitch detection to the steady state areas of sound Other auditory “cues” can be added to the BPN. Musical Instrument models can be used to enhance the transcription process Feasibility of adding new parameters can be tested for impact on transcription. Keerthi C Nagaraj, Department of Electrical & Computer Engineering