6-Speech Quality Assessment: Quality Levels, Intelligibility, Naturalness, Subjective Tests, Objective Tests

Presentation transcript:

1 6-Speech Quality Assessment
– Quality Levels
– Intelligibility and Naturalness
– Subjective Tests
– Objective Tests

2 Quality Levels
– Synthetic Quality (below 4.8 kbps)
– Communication Quality (4.8 to 13 kbps)
– Toll Quality (13 to 64 kbps)
– Broadcast Quality (above 64 kbps)

3 Test Types
– What is tested: Intelligibility or Naturalness
– Who tests: Subjective (test by user) or Objective (test by system)

4 First Class: Subjective Intelligibility Tests
Diagnostic Rhyme Test (DRT)
– The listener selects between two CVC (consonant-vowel-consonant) words that differ in the first consonant
– The first consonant should have specific properties
– Ex. hop - fop and than - dan
Modified Rhyme Test (MRT)
– The listener selects among several CVC words that differ in the first consonant
– Ex. cat, bat, rat, mat, fat, sat

5 First Class (Cont’d): Subjective Intelligibility Tests
– DRT is widely applicable and credible
– In this test the listener can hear the speech only once

6 Second Class: Subjective Naturalness Tests
Mean Opinion Score (MOS)
– MOS is widely applicable and credible
– In this test the listener can hear the speech many times
Diagnostic Acceptability Measure (DAM)
– This test is very complex

7 Mean Opinion Score (MOS)
MOS scores are assigned as follows:
Score   Speech Quality
1       Not Acceptable
2       Weak
3       Medium
4       Good
5       Excellent
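As a minimal sketch of how a MOS value is obtained, the listeners' individual ratings are simply averaged. The function name, the example ratings, and the 1 to 5 numbering of the five categories (the conventional MOS scale) are assumptions made for illustration.

```python
from statistics import mean

# The five quality categories from the slide; numbering them 1..5 follows
# the conventional MOS scale and is an assumption here.
MOS_LABELS = {
    1: "Not Acceptable",
    2: "Weak",
    3: "Medium",
    4: "Good",
    5: "Excellent",
}


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each an integer from 1 to 5."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in MOS_LABELS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(ratings)


# Hypothetical ratings from five listeners for one speech sample.
ratings = [4, 5, 3, 4, 4]
print(mean_opinion_score(ratings))
```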

8 Diagnostic Acceptability Measure (DAM)
– This test is very complex
– The score is built from 19 different parameters
– These parameters are divided into 3 main groups:
  – Signal Quality
  – Background Quality
  – Total Quality

9 Objective Tests
– These tests cannot be used for intelligibility, because a system cannot judge how intelligible the speech is
– Objective tests can only be used for speech naturalness

10 Objective Tests (Cont’d)
Articulation Index (AI)
Signal to Noise Ratio (SNR)
– Global (Classic) SNR
– Segmental SNR
– Frequency Weighted Segmental SNR

11 Articulation Index (AI)
– AI assumes that the distortions in different frequency bands are independent, and measures the signal quality in each band separately
– In each band it determines the percentage of the signal that is perceptible to the listener
[Table: bands (Hz)]

12 Articulation Index (Cont’d)
A signal is perceptible to the listener when it is:
– 1- Above the human hearing threshold
– 2- Below the human pain threshold
– 3- Above the masking noise level
– In each case, either condition 1 or condition 3 prevails

13 Articulation Index (Cont’d)
In AI, the SNR is measured separately in each band
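The per-band idea can be sketched as follows. This uses one common textbook form of the AI, in which each band's SNR is clipped to a 0 to 30 dB range, normalised, and averaged over the bands; the slide's own formula is not preserved in this transcript, so the clipping range, the function name, and the example values are assumptions.

```python
def articulation_index(band_snr_db, max_db=30.0):
    """Average of per-band SNRs, each clipped to [0, max_db] and scaled to [0, 1].

    band_snr_db: one SNR value in dB per frequency band.
    max_db: assumed saturation level (30 dB in the common textbook form).
    """
    if not band_snr_db:
        raise ValueError("at least one band is required")
    contributions = [min(max(snr, 0.0), max_db) / max_db for snr in band_snr_db]
    return sum(contributions) / len(contributions)


# Hypothetical per-band SNRs in dB for a five-band analysis.
print(articulation_index([30.0, 15.0, 0.0, -5.0, 45.0]))
```

Clipping at 0 dB reflects a band that is fully masked by noise, and clipping at the upper limit reflects a band that is already fully perceptible.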

14 Signal to Noise Ratio (SNR)
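A minimal sketch of the global (classic) SNR, assuming the usual definition: ten times the base-10 logarithm of the clean-signal energy divided by the energy of the difference between the clean and processed signals, taken over the whole utterance. The function and variable names are illustrative, not taken from the slides.

```python
import math


def global_snr_db(clean, processed):
    """Classic SNR in dB over the whole signal.

    clean: reference samples; processed: samples of the signal under test.
    The "noise" is taken to be the sample-wise difference between the two.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((s - p) ** 2 for s, p in zip(clean, processed))
    if noise_energy == 0:
        return math.inf   # identical signals: no measurable noise
    if signal_energy == 0:
        return -math.inf  # silent reference with a noisy output
    return 10.0 * math.log10(signal_energy / noise_energy)


# Hypothetical example: the processed signal is the clean one scaled by 0.9.
clean = [1.0, -1.0, 1.0, -1.0]
processed = [0.9, -0.9, 0.9, -0.9]
print(global_snr_db(clean, processed))
```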

15 Segmental SNR
– SNR of the j’th frame
– M: number of frames
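A minimal sketch of the segmental SNR, assuming it is the mean of the per-frame SNRs (in dB) over M non-overlapping frames. Limiting each frame's SNR to a fixed range is a common practical safeguard against silent frames; the limits used here (-10 dB and 35 dB), like the function name, are assumptions rather than values from the slides.

```python
import math


def segmental_snr_db(clean, processed, frame_len, min_db=-10.0, max_db=35.0):
    """Mean of per-frame SNRs in dB over all complete frames."""
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    n_frames = len(clean) // frame_len  # M: trailing partial frame is dropped
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    total = 0.0
    for j in range(n_frames):
        start = j * frame_len
        frame_c = clean[start:start + frame_len]
        frame_p = processed[start:start + frame_len]
        signal_energy = sum(s * s for s in frame_c)
        noise_energy = sum((s - p) ** 2 for s, p in zip(frame_c, frame_p))
        if noise_energy == 0:
            snr = max_db
        elif signal_energy == 0:
            snr = min_db
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        total += min(max(snr, min_db), max_db)  # assumed per-frame limits
    return total / n_frames


# Hypothetical example: first frame lightly distorted, second frame zeroed out.
clean = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
processed = [0.9, -0.9, 0.9, -0.9, 0.0, 0.0, 0.0, 0.0]
print(segmental_snr_db(clean, processed, frame_len=4))
```

Because the average is taken over frame SNRs in dB rather than over energies, a badly degraded frame pulls the result down more than it would in the global SNR.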

16 Frequency Weighted Segmental SNR
– K: number of frequency bands
– M: number of frames
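A minimal sketch of a frequency-weighted segmental SNR, assuming the usual form: within each of the M frames, the SNRs of the K bands (in dB) are combined as a weighted average, and the result is averaged over frames. The band energies are assumed to be precomputed (for example from a filter bank), and the weights and names are illustrative only.

```python
import math


def fw_segmental_snr_db(signal_energy, noise_energy, weights):
    """Frequency-weighted segmental SNR in dB.

    signal_energy[j][k], noise_energy[j][k]: positive energies in band k of frame j.
    weights[k]: non-negative weight of band k (assumed identical for all frames).
    """
    n_frames = len(signal_energy)  # M
    n_bands = len(weights)         # K
    weight_sum = sum(weights)
    if n_frames == 0 or weight_sum <= 0:
        raise ValueError("need at least one frame and a positive weight sum")
    total = 0.0
    for j in range(n_frames):
        weighted = 0.0
        for k in range(n_bands):
            s, n = signal_energy[j][k], noise_energy[j][k]
            if s <= 0 or n <= 0:
                raise ValueError("band energies must be positive")
            weighted += weights[k] * 10.0 * math.log10(s / n)
        total += weighted / weight_sum
    return total / n_frames


# Hypothetical example: one frame, two bands with SNRs of 20 dB and 10 dB.
print(fw_segmental_snr_db([[100.0, 10.0]], [[1.0, 1.0]], weights=[1.0, 3.0]))
```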

17 Itakura Measure
– Works on the spectral envelope
– The envelope is obtained from an all-pole (AR) model

18 Itakura Measure (Cont’d)
This measure is based on the spectral difference between the original signal and the signal under assessment
– Autoregressive Coefficients
– Reflection Coefficients
– Autocorrelation Coefficients

19 Itakura Measure (Cont’d)
– m: index of frame
– l: number of coefficients

20 Itakura Measure (Cont’d)
– The l’th parameter of the frame that contains the m’th sample
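The formulas on these slides are not preserved in the transcript, so the following is only a sketch of one widely used form of the Itakura measure, the log-likelihood ratio between the all-pole models of a reference frame and a test frame. The autocorrelation method, the Levinson-Durbin recursion, the model order, and all function names are assumptions chosen for illustration; conventions for which frame supplies the autocorrelation matrix vary between texts.

```python
import math


def autocorrelation(frame, order):
    """Autocorrelation lags 0..order of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for the AR polynomial a = [1, a1, ..., ap] of A(z) = 1 + sum a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def quadratic_form(a, r):
    """a^T R a, where R is the Toeplitz matrix built from the lags r."""
    p = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p) for j in range(p))


def itakura_distance(ref_frame, test_frame, order=10):
    """Log-likelihood-ratio form of the Itakura measure for one frame pair."""
    r_ref = autocorrelation(ref_frame, order)
    a_ref = levinson_durbin(r_ref, order)
    a_test = levinson_durbin(autocorrelation(test_frame, order), order)
    return math.log(quadratic_form(a_test, r_ref) / quadratic_form(a_ref, r_ref))


# Hypothetical frames: two different deterministic test signals.
ref = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n + 0.4) for n in range(64)]
tst = [math.cos(0.9 * n) + 0.3 * math.sin(2.3 * n) for n in range(64)]
print(itakura_distance(ref, tst, order=2))
```

Since the reference coefficients minimise the prediction error on the reference frame, the ratio is never below one, so the distance is zero for identical frames and non-negative otherwise.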

21 Weighted Spectral Slope Measure (WSSM)
– The STFT of the k’th band of the frame that contains the m’th sample
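The slide's formula is not preserved here, so this is a simplified sketch of the idea behind a weighted spectral slope measure: for one frame, compare the band-to-band slopes of the reference and test spectra (band levels in dB) and sum the weighted squared slope differences. The band levels are assumed to be precomputed from the STFT, the weights default to 1, and the perceptually derived weights of the full measure are not reproduced; all names are hypothetical.

```python
def spectral_slopes(band_levels_db):
    """Difference between adjacent band levels (dB) for one frame."""
    return [band_levels_db[k + 1] - band_levels_db[k]
            for k in range(len(band_levels_db) - 1)]


def weighted_spectral_slope(ref_levels_db, test_levels_db, weights=None):
    """Weighted sum of squared slope differences for one frame.

    ref_levels_db, test_levels_db: per-band spectral levels in dB.
    weights: one weight per slope (K - 1 values); uniform if omitted.
    """
    if len(ref_levels_db) != len(test_levels_db):
        raise ValueError("both spectra must have the same number of bands")
    ref_slopes = spectral_slopes(ref_levels_db)
    test_slopes = spectral_slopes(test_levels_db)
    if weights is None:
        weights = [1.0] * len(ref_slopes)  # assumed uniform weighting
    return sum(w * (r - t) ** 2
               for w, r, t in zip(weights, ref_slopes, test_slopes))


# Hypothetical three-band spectra in dB.
print(weighted_spectral_slope([10.0, 20.0, 15.0], [10.0, 18.0, 15.0]))
```

Because only slopes are compared, a test spectrum that differs from the reference by a constant level offset gives a distance of zero.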

22 7-Speech Recognition
– Speech Recognition Concepts
– Speech Recognition Approaches
– Recognition Theories
– Bayes Rule
– Simple Language Model
– P(A|W) Network Types