Hsin-Ju Hsieh 謝欣汝, Wen-hsiang Tu 杜文祥, Jeih-weih Hung 洪志偉, Department of Electrical Engineering, National Chi Nan University. Presenter: Yi-Ting Wang 汪逸婷, 2012/03/20.

Presentation transcript:


1. Introduction
2. Introduction to the DCT and the effect of noise on the DCT of speech feature streams
3. The proposed DCT-based compensation approaches
4. Recognition experiment results and discussion
5. Summary
6. Conclusion and future work

• Two types of environmental distortion:
  ◦ Channel distortion
  ◦ Additive noise
• Feature compensation
  ◦ Normalize the statistics of the temporal-domain feature sequence.
  ◦ Reduce the mismatch by enhancing components that are not easily affected by noise.
• DFT
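As a rough illustration of normalizing the statistics of a temporal-domain feature sequence, the sketch below applies mean and variance normalization to a single feature stream. The slides do not say which statistics are normalized, so the choice of mean and variance, the function name `mean_variance_normalize`, and the toy data are all assumptions.

```python
import math

def mean_variance_normalize(stream):
    """Scale one feature stream (floats over time) to zero mean and unit variance."""
    n = len(stream)
    mean = sum(stream) / n
    var = sum((x - mean) ** 2 for x in stream) / n
    std = math.sqrt(var) if var > 0 else 1.0  # guard against a constant stream
    return [(x - mean) / std for x in stream]

# Hypothetical clean stream and a distorted copy (constant gain and offset).
clean = [1.0, 2.0, 3.0, 4.0]
distorted = [2.0 * x + 5.0 for x in clean]

norm_clean = mean_variance_normalize(clean)
norm_distorted = mean_variance_normalize(distorted)
```

In this toy case the gain and offset are removed entirely, so both normalized streams coincide. Real channel distortion and additive noise are not this simple, which is why normalization only reduces the mismatch.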

• Discrete Cosine Transform (DCT)
  ◦ A Fourier-related transform similar to the DFT.
  ◦ One of the most powerful analysis tools in signal processing.
  ◦ Used in transform coding and speech feature extraction.
• Transforms the signal from the time domain into the frequency domain (periodicity).
• Reduces the correlation among features, resulting in a more compact feature representation.
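The compaction property can be seen with a small sketch that applies an orthonormal DCT-II along the time axis of one feature stream. The DCT type, the helper names `dct2` and `idct2`, and the 8-frame stream are assumptions made for illustration; the slides do not state which DCT variant is used.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence x."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
            for k in range(1, n)
        )
        out.append(s)
    return out

# Hypothetical slowly varying stream, e.g. one cepstral coefficient over 8 frames.
stream = [1.0, 1.2, 1.5, 1.7, 1.8, 1.7, 1.5, 1.2]
coeffs = dct2(stream)
magnitudes = [abs(c) for c in coeffs]   # DCT magnitudes of the stream
reconstructed = idct2(coeffs)           # recovers the original stream

total_energy = sum(c * c for c in coeffs)
low_energy = sum(c * c for c in coeffs[:2])  # energy in the two lowest coefficients
```

For a slowly varying stream like this one, nearly all of the energy sits in the lowest-order coefficients, which is the compact representation the slide refers to.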

• AURORA 2.0 database
• 13-dimensional MFCC (c0 to c12) sequence
• 13 new features + 13 first-order derivatives + 13 second-order derivatives = final 39-dimensional feature vector
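The 39-dimensional layout can be sketched as follows, with 13 static features per frame followed by their first- and second-order derivatives. The slides do not specify how the derivatives are computed, so the central difference with edge replication, the helper names `deltas` and `stack_39`, and the toy frames are assumptions.

```python
def deltas(frames):
    """Central-difference derivative along time, replicating the edge frames (assumed scheme)."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def stack_39(frames):
    """Concatenate static, first-order, and second-order features per frame."""
    d1 = deltas(frames)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]

# Hypothetical input: 5 frames, each with 13 static features.
frames = [[float(t + i) for i in range(13)] for t in range(5)]
features = stack_39(frames)  # 5 frames, 39 values each
```

Each output frame holds the static features in positions 0 to 12, the first-order derivatives in 13 to 25, and the second-order derivatives in 26 to 38.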

1. Performing DCT-magnitude substitution adaptively
2. Integrating the proposed methods with other feature normalization techniques
3. Investigating how to determine the optimal trade-off between noise reduction and speech distortion, which is present in all noise-robustness techniques