Speech Communications

Speech Communications (ECE 796/896, Chapter 7)

Defining Speech: Types of speech sounds. Phoneme: the shortest segment of speech which, if changed, would change the meaning of the word; English has 13 vowel sounds and 25 consonants. Diphthongs: special phonemes that combine two vowel sounds (e.g., the "oy" in "boy").

Depicting Speech: Speech can be shown as waveforms, spectra, and sound spectrograms.
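As a minimal illustration of these three depictions (not from the chapter), the Python sketch below plots the waveform, the overall spectrum, and a spectrogram; it assumes SciPy and Matplotlib are available and uses a hypothetical mono recording named speech.wav.

```python
# A sketch of the three depictions of speech: waveform, spectrum, spectrogram.
# Assumes a mono WAV recording in a hypothetical file "speech.wav".
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

fs, x = wavfile.read("speech.wav")   # fs = sample rate in Hz
x = x.astype(float)
t = np.arange(len(x)) / fs

fig, axes = plt.subplots(3, 1, figsize=(8, 9))

# 1. Waveform: amplitude versus time.
axes[0].plot(t, x)
axes[0].set(title="Waveform", xlabel="Time (s)", ylabel="Amplitude")

# 2. Spectrum: magnitude versus frequency for the whole utterance.
X = np.abs(np.fft.rfft(x))
f = np.fft.rfftfreq(len(x), d=1 / fs)
axes[1].plot(f, 20 * np.log10(X + 1e-12))
axes[1].set(title="Spectrum", xlabel="Frequency (Hz)", ylabel="Level (dB)")

# 3. Spectrogram: how the frequency content changes over time.
fspec, tspec, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=384)
axes[2].pcolormesh(tspec, fspec, 10 * np.log10(Sxx + 1e-12), shading="gouraud")
axes[2].set(title="Spectrogram", xlabel="Time (s)", ylabel="Frequency (Hz)")

plt.tight_layout()
plt.show()
```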

Criteria for Evaluating Speech: 1. Intelligibility, 2. Quality. Components of Speech Communication Systems: speaker, message, transmission system, noise environment, and hearer.

Speaker Enunciation: Superior speakers have longer syllable duration, speak with greater intensity, use more of the total speaking time for speech sounds, and vary their speech more in fundamental frequency.

Message. Phoneme confusions: certain speech sounds are more easily confused than others, for example the letter groups D-V-P-B-G-C-E-T, F-X-S-H, K-J-A, and M-N; noise tends to confuse M-N-D-G-B-V-Z and T-K-P-F-S. Avoid using single letters as codes.

Message (cont.). Word characteristics: intelligibility is better with familiar words, and better with word-spelling alphabet words than with single letters. Context features: keep the vocabulary as small as possible, use standard sentence constructions, avoid short words (use a word-spelling alphabet for them), and familiarize the receiver with the words and sentence structure.

Transmission System: forms of degradation include frequency distortion (e.g., filtering), amplitude distortion, and modifications of the time scale. When intelligibility is what matters, a system with amplitude distortion can still give acceptable intelligibility.

Transmission System (cont.). Effects of filtering on speech: low-pass and high-pass filters. Effects of amplitude distortion on speech: peak clipping.
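The sketch below illustrates these two kinds of transmission distortion: band limiting by low-pass/high-pass filtering and amplitude distortion by peak clipping. The filter orders, cutoff frequencies, clipping level, and the synthetic test signal are illustrative assumptions, not values from the chapter.

```python
# Sketch of two transmission distortions: band limiting and peak clipping.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def lowpass(x, fs, cutoff_hz, order=4):
    """Remove energy above cutoff_hz (frequency distortion)."""
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def highpass(x, fs, cutoff_hz, order=4):
    """Remove energy below cutoff_hz (frequency distortion)."""
    sos = butter(order, cutoff_hz, btype="high", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def peak_clip(x, fraction_of_peak=0.1):
    """Amplitude distortion: clip the waveform at a fraction of its peak."""
    limit = fraction_of_peak * np.max(np.abs(x))
    return np.clip(x, -limit, limit)

# Synthetic "speech-like" test signal (a few harmonics of 120 Hz), since no
# recording accompanies the slides.
fs = 16000
t = np.arange(0, 1.0, 1 / fs)
x = sum(np.sin(2 * np.pi * f0 * t) / k for k, f0 in enumerate([120, 240, 360], start=1))

x_lp = lowpass(x, fs, cutoff_hz=1000)            # telephone-like band limiting
x_hp = highpass(x, fs, cutoff_hz=300)
x_clipped = peak_clip(x, fraction_of_peak=0.1)   # heavy peak clipping
```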

Noise Environment: indices that evaluate the effects of noise on speech intelligibility. AI: Articulation Index. PSIL: Preferred-Octave Speech Interference Level. PNC: Preferred Noise Criteria.

Articulation Index: The Articulation Index was developed to predict speech intelligibility given knowledge of the noise environment. AI is a good predictor of speech intelligibility for normal-hearing listeners and for hearing-impaired listeners with mild to moderate hearing loss.

AI (cont.). One technique for computing AI:
1. For each 1/3-octave band, plot the band level of the speech peaks reaching the listener's ear.
2. Plot the band level of the steady-state noise.
3. Compute the difference between the two in each band; if the noise exceeds the speech, take the difference as 0, and if the speech exceeds the noise by more than 30 dB, take it as 30.
4. Multiply each band's difference by that band's weighting value.
5. Add the weighted values; the sum is the AI.
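These steps map onto a short function. The band levels and weights below are made-up placeholders (a real AI calculation uses the published one-third-octave importance weights), so this is only a sketch of the arithmetic:

```python
# Sketch of the AI band-weighting procedure. Band levels and weights are
# hypothetical placeholders, not the published one-third-octave table.
def articulation_index(speech_peak_db, noise_db, weights):
    """Per-band lists of equal length. Each weight is the band's importance
    divided by 30 dB, so a 30 dB speech-over-noise margin in every band
    yields AI = 1.0."""
    ai = 0.0
    for s, n, w in zip(speech_peak_db, noise_db, weights):
        diff = s - n
        diff = max(0.0, min(30.0, diff))  # noise >= speech -> 0; cap at 30 dB
        ai += w * diff
    return ai

# Hypothetical 5-band example (a real AI uses ~15-20 one-third-octave bands):
speech = [65, 70, 62, 50, 80]                                 # speech peaks at the ear, dB
noise = [55, 50, 52, 60, 40]                                  # steady-state noise, dB
w = [0.20 / 30, 0.25 / 30, 0.25 / 30, 0.20 / 30, 0.10 / 30]   # importance / 30 dB
print(f"AI = {articulation_index(speech, noise, w):.2f}")     # about 0.42 for these made-up levels
```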

AI (cont.). An AI of 0.47 means about 75% of a 1000-word vocabulary would be understood, while almost 96% of a 256-word vocabulary would be understood. As a rough interpretation: AI < 0.3 is poor, 0.3 to 0.5 is acceptable, above 0.5 is good, and above 0.7 is excellent.

PSIL (Preferred-Octave Speech Interference Level): the numeric average of the noise levels in the three octave bands centered at 500, 1000, and 2000 Hz. Example: levels of 70, 80, and 75 dB give PSIL = 75 dB. PSIL is not an effective measure when there are noise peaks outside these bands.

Speech Interference Level (SIL): the average of the decibel levels in the 600-1200, 1200-2400, and 2400-4800 Hz bands.
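Both PSIL and SIL are plain arithmetic means of measured band levels; a minimal sketch (the band measurements themselves would come from an octave-band sound-level meter):

```python
# PSIL and SIL are simple averages of band levels in dB.
def psil(db_500, db_1000, db_2000):
    """Preferred-Octave Speech Interference Level: mean of the octave-band
    levels centred at 500, 1000, and 2000 Hz."""
    return (db_500 + db_1000 + db_2000) / 3.0

def sil(db_600_1200, db_1200_2400, db_2400_4800):
    """Speech Interference Level: mean of the levels in the 600-1200,
    1200-2400, and 2400-4800 Hz bands."""
    return (db_600_1200 + db_1200_2400 + db_2400_4800) / 3.0

print(psil(70, 80, 75))   # reproduces the slide's example: 75.0
```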

Preferred Noise Criteria (PNC) Curves: developed to evaluate the noise environment inside office buildings; based on the NC curves.

Guidelines for Synthesized Speech
1. Voice warnings should be presented in a voice that is qualitatively different from other voices in the environment.
2. There should be no alerting tones.
3. Direct attention to the voice warning.
4. Maximize the intelligibility of the message.
5. Make the voice as natural as possible.
6. Provide a replay mode.
7. If a spelling mode is provided, its quality may need to be better than that used for the rest of the system.
8. Provide the ability to interrupt; this is important for experienced users.
9. Provide a training message.
10. Use synthesized speech sparingly and only where it is acceptable.