Sustained Phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)

Second Order Kolmogorov Entropy Estimation of Speech Data – The Setup

Speech data, sampled at 22.5 kHz:
  Sustained phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)
  Short (single-utterance) phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)
  CVC words (bat, beat, boat, boy)
Output: second-order Kolmogorov entropy (K2)
Using these estimates, we wish to analyze:
  The presence or absence of chaos in a time series.
  How well the estimates discriminate between different attractors, for classification.
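The slides do not say which estimator was used for K2. A minimal sketch, assuming the common correlation-sum approach: embed the signal with time delays, compute the correlation sum C_d(eps) (the fraction of point pairs closer than eps) for embedding dimensions d and d+1, and take K2 ≈ ln(C_d(eps) / C_{d+1}(eps)) / (tau · dt). The function names, the max-norm distance, and the default delay tau=1 are illustrative assumptions, not the authors' code.

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Time-delay embedding: row t is (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 1:
        raise ValueError("time series too short for this embedding")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

def correlation_sum(x, dim, eps, tau=1):
    """Fraction of distinct point pairs within eps (max-norm) in the embedding."""
    pts = delay_embed(x, dim, tau)
    # All pairwise max-norm distances; O(n^2) memory, so keep n modest.
    dist = np.abs(pts[:, None, :] - pts[None, :, :]).max(axis=2)
    upper = np.triu_indices(len(pts), k=1)  # each pair once, no self-pairs
    return float(np.mean(dist[upper] < eps))

def k2_estimate(x, dim, eps, tau=1, dt=1.0):
    """Assumed estimator: K2 ~ ln(C_dim / C_{dim+1}) / (tau * dt)."""
    c_d = correlation_sum(x, dim, eps, tau)
    c_d1 = correlation_sum(x, dim + 1, eps, tau)
    if c_d == 0.0 or c_d1 == 0.0:
        return float("nan")  # eps too small for the amount of data
    return float(np.log(c_d / c_d1) / (tau * dt))
```

With dt=1.0 the result is in nats per sample; passing dt = 1 / sampling rate converts it to nats per second. A periodic signal should give a value near zero, while white noise should give a clearly positive value that depends on eps.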

The analysis setup

The analysis currently includes estimates of K2 for different embedding dimensions.
Variation in the entropy estimates with the neighborhood radius, epsilon, was studied.
Variation in the entropy estimates with the SNR of the signal was studied.
So far, the analysis has been performed on 3 vowels, 2 nasals and 2 fricatives.
Results show that vowels and nasals have a much smaller entropy than fricatives.
Further, K2 consistently decreases with embedding dimension for vowels and nasals, while for fricatives it consistently increases.
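The SNR study requires corrupting each signal at a controlled noise level. The slides do not state the noise type, so the sketch below assumes additive white Gaussian noise; the function name add_noise_at_snr is hypothetical. The noise is rescaled so that the realized signal-to-noise power ratio matches the requested value in dB.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power needed so that 10*log10(signal_power / target_power) == snr_db.
    target_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + noise * np.sqrt(target_power / noise_power)
```

Repeating the entropy estimate on the outputs for a range of snr_db values, and for several embedding dimensions, would give curves of the kind named in the plot titles that follow.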

The analysis setup (in progress / coming soon)

Data size (length of the time series): this is crucial for our purpose, since we wish to extract information from short time series (sample data from utterances).
Speaker variation: we wish to study variations in the Kolmogorov entropy of phone- or word-level attractors
  across different speakers,
  across different phones/words,
  across different broad phone classes.

Correlation Entropy vs. Embedding Dimension – various epsilons


Correlation Entropy vs. Embedding Dimension – various SNRs
