Speech Communications

ECE 796/896 Chapter 7 Speech Communications

Defining Speech
Types of speech sounds
Phoneme: the shortest segment of speech which, if changed, would change the meaning of the word. English has 13 vowel sounds and 25 consonants.
Diphthongs: special phonemes (oy in boy).

Depicting Speech
Speech can be shown as a waveform, a spectrum, or a sound spectrogram.

Criteria for Evaluating Speech
1. Intelligibility
2. Quality

Components of Speech Communication Systems
Speaker, message, transmission system, noise environment, hearer

Speaker
Enunciation: superior speakers have a longer “syllable duration”, speak with greater intensity, use more of the total time with speech sounds, and vary their speech more in fundamental frequency.

Message
Phoneme confusions: certain speech sounds are more easily confused than others.
Easily confused letter groups: DVPBGCET, FXSH, KJA, MN.
Noise tends to cause confusions within the groups MNDGBVZ and TKPFS.
Avoid using single letters as codes.

Message Cont.
Word Characteristics
Intelligibility is better with familiar words.
Single letters vs. word-spelling alphabet.
Context Features
Keep the vocabulary as small as possible.
Use standard sentence constructions.
Avoid short words; use a word-spelling alphabet.
Familiarize the receiver with the words and sentence structure.

Transmission System
Forms of distortion: frequency distortion, filtering, amplitude distortion, and modifications of the time scale.
When intelligibility is what matters, a system with amplitude distortion can still result in acceptable intelligibility.

Transmission Cont.
Effects of filtering on speech: low-pass and high-pass filters.
Effects of amplitude distortion on speech: peak clipping.
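
Peak clipping, named above as a form of amplitude distortion, can be pictured as limiting every sample of the waveform to a fixed maximum magnitude. The following is a minimal Python sketch of that idea only; the function name `peak_clip`, the `limit` value, and the eight-sample sine wave are illustrative assumptions and do not come from the slides.

```python
import math


def peak_clip(samples, limit):
    """Clamp each sample to the range [-limit, +limit] (hypothetical helper)."""
    return [max(-limit, min(limit, s)) for s in samples]


# One cycle of a unit-amplitude sine wave, 8 samples (illustrative signal).
wave = [math.sin(2 * math.pi * n / 8) for n in range(8)]

# Clip at half the original peak amplitude (assumed value for illustration).
clipped = peak_clip(wave, limit=0.5)
```

Samples whose magnitude is already below `limit` pass through unchanged; only the peaks are flattened.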

Noise Environment
Indices that evaluate the effects of noise on speech intelligibility:
AI: Articulation index
PSIL: Preferred-octave speech interference level
PNC: Preferred noise criteria

Articulation Index
The articulation index (AI) was developed to predict speech intelligibility given a knowledge of the noise environment. AI is a good predictor of speech intelligibility for listeners with normal hearing and for hearing-impaired listeners with mild to moderate hearing loss.

AI Cont.
One technique for computing AI:
1. For each 1/3-octave band, plot the band level of the speech peaks reaching the listener’s ear.
2. Plot the steady-state noise level in each band.
3. Compute the difference in level between the two for each band. If the noise exceeds the speech peak, use 0; if the speech peak exceeds the noise by more than 30 dB, use 30.
4. Multiply each difference by the weight for that band.
5. Add the weighted values; the sum is the AI.
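
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `articulation_index` is hypothetical, the three bands are made up, and the weights are placeholders chosen only to make the arithmetic easy to follow. A real calculation would use the full set of 1/3-octave bands with their published band weights.

```python
def articulation_index(speech_peaks_db, noise_db, weights):
    """Weighted sum of clamped speech-minus-noise differences, one per band."""
    if not (len(speech_peaks_db) == len(noise_db) == len(weights)):
        raise ValueError("need one speech level, noise level, and weight per band")
    total = 0.0
    for speech, noise, weight in zip(speech_peaks_db, noise_db, weights):
        diff = speech - noise
        # Noise above the speech peak counts as 0; differences are capped at 30 dB.
        diff = max(0.0, min(30.0, diff))
        total += diff * weight
    return total


# Hypothetical three-band example (levels in dB).
speech = [70.0, 65.0, 60.0]
noise = [75.0, 50.0, 20.0]
weights = [0.010, 0.015, 0.008]  # placeholder weights, NOT the published values

ai = articulation_index(speech, noise, weights)
# Band differences after clamping: 0, 15, 30
# AI = 0 * 0.010 + 15 * 0.015 + 30 * 0.008 = 0.465
```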

AI Cont.
At a given AI, about 75% of a 1000-word vocabulary would be understood, while almost 96% of a 256-word vocabulary would be understood.
Rating scale: AI < 0.3 poor; 0.3 to 0.5 acceptable; > 0.5 good; > 0.7 excellent.
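
The rating scale above maps directly onto a small lookup. In this Python sketch the function name `rate_ai` is hypothetical, and the treatment of values that fall exactly on 0.3, 0.5, or 0.7 is an assumption read off the inequalities as written (below 0.3 is poor, above 0.5 is good, above 0.7 is excellent).

```python
def rate_ai(ai):
    """Return the qualitative rating for an articulation index value."""
    if ai < 0.3:
        return "poor"
    if ai <= 0.5:
        return "acceptable"
    if ai <= 0.7:
        return "good"
    return "excellent"


ratings = [rate_ai(value) for value in (0.2, 0.4, 0.6, 0.8)]
```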

PSIL - Preferred Octave Speech Interference Level
PSIL is the numeric average of the noise levels in three octave bands (centered at 500, 1000, and 2000 Hz).
Example: band levels of 70, 80, and 75 dB give PSIL = 75.
This is not an effective measure when there are noise peaks outside these bands.
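
The PSIL arithmetic is a plain three-value mean, shown here as a short Python sketch that reproduces the slide's example. The function and parameter names are illustrative assumptions.

```python
def psil(level_500, level_1000, level_2000):
    """Average the noise levels (dB) in the 500, 1000, and 2000 Hz bands."""
    return (level_500 + level_1000 + level_2000) / 3


example = psil(70, 80, 75)  # (70 + 80 + 75) / 3 = 75.0
```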

Speech Interference Level
SIL: an average of decibel level in the bands( , , ).

Preferred Noise Criteria Curves
PNC curves were developed to evaluate the noise environment inside office buildings. They are based on the NC curves.

Guidelines of Synthesized Speech
1. Voice warnings should be presented in a voice that is qualitatively different from other voices in the environment.
2. There should be no alerting tones.
3. Direct attention to the voice warning.
4. Maximize the intelligibility of the message.
5. Make the voice as natural as possible.
6. Provide a replay mode.
7. If a spelling mode is provided, its quality may need to be better than that used for the rest of the system.
8. Provide the ability to interrupt, which is important for experienced users.
9. Provide a training message.
10. Use sparingly and where it is acceptable.
