Machine-Learning Based Classification Of Speech And Music
Khan, MKS; Al-Khatib, WG


SPRINGER, MULTIMEDIA SYSTEMS; Vol: 12; pp: 55-67
King Fahd University of Petroleum & Minerals
Copyright: King Fahd University of Petroleum & Minerals

Summary

Classifying audio into categories such as speech or music is an important part of many multimedia document retrieval systems. In this paper, we investigate audio features that have not previously been used in music-speech classification, such as the mean and variance of the discrete wavelet transform, the variance of Mel-frequency cepstral coefficients, the root mean square of a lowpass signal, and the difference between the maximum and minimum zero-crossings. We then apply fuzzy C-means clustering to select a viable set of features that enables better classification accuracy.

Three classification frameworks have been studied: Multi-Layer Perceptron (MLP) neural networks, Radial Basis Function (RBF) neural networks, and Hidden Markov Models (HMM). Results for each framework are reported and compared. Our extensive experimentation has identified a subset of features that contributes most to accurate classification, and has shown that MLP networks are the most suitable classification framework for the problem at hand.
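Two of the features named in the summary, the difference between the maximum and minimum zero-crossings and the root mean square of a lowpass signal, can be illustrated with a short sketch. This is not the authors' implementation: the summary does not give frame sizes or the lowpass filter design, so the non-overlapping framing, the moving-average filter, and every function and parameter name below are assumptions chosen for illustration only.

```python
import math


def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (assumed framing).

    Any trailing samples that do not fill a whole frame are dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def zero_crossings(frame):
    """Count sign changes between consecutive samples (zero is treated as non-negative)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def zcr_range(samples, frame_len):
    """Difference between the maximum and minimum per-frame zero-crossing counts.

    Requires at least one full frame, otherwise max() raises ValueError.
    """
    counts = [zero_crossings(f) for f in frame_signal(samples, frame_len)]
    return max(counts) - min(counts)


def moving_average(samples, width):
    """Crude lowpass filter (an assumption): mean over a sliding window of `width` samples."""
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]


def lowpass_rms(samples, width):
    """Root mean square of the moving-average-filtered signal."""
    filtered = moving_average(samples, width)
    return math.sqrt(sum(x * x for x in filtered) / len(filtered))


# Toy signal: one rapidly alternating frame followed by one constant frame.
toy = [1, -1, 1, -1, 1, 1, 1, 1]
toy_zcr_range = zcr_range(toy, frame_len=4)
toy_lowpass_rms = lowpass_rms([2, 2, 2, 2], width=2)
```

On the toy signal, the alternating frame has three zero-crossings and the constant frame has none, so a large spread between frames is what this feature captures. In practice the resulting values would be stacked into a feature vector and passed to a classifier such as the MLP, RBF, or HMM frameworks the summary compares.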