Introduction to Digital Speech Processing 69451 Presented by Dr. Allam Mousa 1 An Najah National University SP_1_intro.



• Speech is the primary method of human communication.
• A central goal of speech coding is to transmit or store a speech waveform using as few bits as possible while retaining high quality.
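To make "as few bits as possible" concrete, here is a minimal sketch of the bit-rate arithmetic. The sampling rates and bit depths are typical values assumed for illustration (8 kHz, 8-bit is the usual narrowband telephone format); the 8 kbit/s coder is hypothetical, and none of these numbers come from the slides.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate (bits per second) of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(original_bps: float, coded_bps: float) -> float:
    """How many times smaller the coded stream is than the original."""
    return original_bps / coded_bps


# Assumed example formats (not from the slides).
telephone_bps = pcm_bit_rate(8000, 8)     # narrowband telephone speech
wideband_bps = pcm_bit_rate(16000, 16)    # wideband speech, 16-bit samples

# Hypothetical low-rate speech coder running at 8 kbit/s.
hypothetical_coder_bps = 8000

print("telephone PCM:", telephone_bps, "bit/s")
print("wideband PCM:", wideband_bps, "bit/s")
print("ratio vs. hypothetical coder:",
      compression_ratio(telephone_bps, hypothetical_coder_bps))
```

The ratio only describes the saving in bits; whether quality is retained at that rate has to be judged separately, for example by listening tests.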

Speech processing aims at modeling and manipulating the speech signal in order to:
• transmit (code) speech efficiently
• produce (synthesize) natural-sounding voice
• recognize (decode) spoken words

Speech is a natural form of communication between humans, and it reflects much of the variability and complexity of humans. This makes modeling speech an interesting and challenging task.

The speech signal contains information from many levels: the speaker and the acoustic channel, the words and their pronunciation, the syntax and semantics of the language, and so on.

Speech technology is becoming increasingly well established, with quite sophisticated technology now incorporated into many widely deployed applications, and speech technologists are much in demand.

Speech Processing

Speech processing is the study of speech signals and of the methods used to process them.

• Speech is the way of choice for humans to communicate:
– no special equipment required
– no physical contact required
– no visibility required
– can communicate while doing something else


1- Production:

2- Propagation: the sound waves propagate through the air at a speed of about 340 m/s, reaching the listener's ears.
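As a rough illustration of the propagation step, the sketch below estimates how long speech takes to reach a listener. It assumes the common linear approximation for the speed of sound in dry air, c ≈ 331.3 + 0.606·T m/s with T in °C (roughly 343 m/s at 20 °C); the function names and the example distances are illustrative, not from the slides.

```python
def speed_of_sound(temp_c: float = 20.0) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c


def propagation_delay(distance_m: float, temp_c: float = 20.0) -> float:
    """Time in seconds for sound to travel distance_m through air."""
    return distance_m / speed_of_sound(temp_c)


# Illustrative speaker-to-listener distances in metres.
for distance in (1.0, 10.0, 100.0):
    delay_ms = 1000.0 * propagation_delay(distance)
    print(f"{distance:6.1f} m -> {delay_ms:7.2f} ms")
```

At conversational distances the delay is only a few milliseconds, which is why it goes unnoticed in face-to-face speech.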

3- Perception: the incoming sounds are deciphered by the listener into a received message, thereby completing the chain of events that culminated in the transfer of information from the speaker to the listener.

• Coding
• Compression
• Synthesis
• Automatic Speech Recognition (ASR)
• Speaker Recognition
• Speech Recognition
• Spoken Language Recognition
• Speech Enhancement
• Echo Cancellation
• Noise Cancellation
… and more

Speech processing draws on several fields: signal processing, information theory, phonetics, acoustics, and algorithms (programming).

Related topics: Fourier transforms, discrete-time filters, AR(MA) models, entropy, communication theory, rate-distortion theory, statistical SP, stochastic models, psychoacoustics, room acoustics, and speech production.


– Voiced: speech sounds where the vocal folds vibrate.
– Vowels: no blockage of the vocal tract and no turbulence (e.g. “e”)
– Consonants: non-vowels (e.g. “s”)
– Plosives: consonants involving an explosion (e.g. “p”)

Speech Waveforms (figure): extracts from “my speech”: (a) start of “y” vowel, (b) “ee” vowel, (c) “s” consonant


1- Turbulence: air moving quickly through a small hole (e.g. /s/ in “size”)
2- Explosion: pressure built up behind a blockage is suddenly released (e.g. /p/ in “pop”)
3- Vocal fold vibration: like the neck of a balloon (e.g. /a/ in “hard”)
– airflow through the vocal folds (vocal cords) reduces the pressure and they snap shut (Bernoulli effect)
– muscle tension and the build-up of air pressure force the folds open again, and the process repeats
– the frequency of vibration (fx) is determined by the tension in the vocal folds and the pressure from the lungs
– for normal breathing and voiceless sounds (e.g. /s/) the vocal folds are held wide open and don’t vibrate
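The three sound sources above can be imitated with very simple synthetic signals. The sketch below is a toy illustration under stated assumptions, not a model taken from the slides: turbulence is approximated by white noise, an explosion by silence followed by a short decaying noise burst, and vocal fold vibration by an impulse train repeating at the vibration frequency fx. All function names and parameter values are hypothetical.

```python
import math
import random


def turbulence(n: int, seed: int = 0) -> list:
    """Noise-like source (as in /s/): n Gaussian white-noise samples."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def explosion(n: int, burst_start: int, burst_len: int, seed: int = 0) -> list:
    """Plosive-like source (as in /p/): silence during the blockage,
    then a short noise burst that decays exponentially."""
    rng = random.Random(seed)
    out = [0.0] * n
    for i in range(burst_start, min(n, burst_start + burst_len)):
        decay = math.exp(-5.0 * (i - burst_start) / burst_len)
        out[i] = decay * rng.gauss(0.0, 1.0)
    return out


def voiced(n: int, sample_rate_hz: int, fx_hz: float) -> list:
    """Voiced source (as in /a/): one unit impulse per vibration period."""
    period = round(sample_rate_hz / fx_hz)  # samples between glottal pulses
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


# Assumed values: 8 kHz sampling, 100 ms of signal, fx = 100 Hz.
fs = 8000
pulses = voiced(800, fs, 100.0)
print("glottal pulses in 100 ms:", int(sum(pulses)))
print("burst is silent before release:", all(x == 0.0 for x in explosion(800, 400, 80)[:400]))
print("noise samples:", len(turbulence(800)))
```

A real glottal waveform is smoother than an impulse train, so this only captures the periodic versus noisy versus transient distinction between the three sources.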

1- Voiced: speech sounds where the vocal folds vibrate.
2- Vowels: no blockage of the vocal tract and no turbulence.
3- Consonants: non-vowels.
4- Plosives: consonants involving an explosion.


Example: “my speech”

The sound spectrum is modified by the shape of the vocal tract, which is determined by movements of the jaw, tongue and lips. The resonant frequencies of the vocal tract cause peaks in the spectrum called formants. The first two formant frequencies are roughly determined by the distances from the tongue hump to the larynx and to the lips, respectively.
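To get a feel for where vocal tract resonances fall, a common textbook simplification (not the tongue-hump description above) treats the tract as a uniform tube closed at the larynx and open at the lips. Such a tube resonates at odd multiples of a quarter wavelength, F_n = (2n − 1)·c / (4·L). The sketch below assumes a tract length of 17 cm and a sound speed of 340 m/s; both values and the function name are illustrative assumptions.

```python
def tube_resonances(length_m: float = 0.17,
                    speed_m_s: float = 340.0,
                    count: int = 3) -> list:
    """Resonant frequencies (Hz) of a uniform tube closed at one end and
    open at the other: F_n = (2n - 1) * c / (4 * L), n = 1..count."""
    return [(2 * n - 1) * speed_m_s / (4.0 * length_m)
            for n in range(1, count + 1)]


# Neutral tract under the assumed values: roughly 500, 1500 and 2500 Hz.
for n, f in enumerate(tube_resonances(), start=1):
    print(f"F{n} ~ {f:.0f} Hz")

# A shorter tract (assumed 14 cm) shifts every resonance upward.
print([round(f) for f in tube_resonances(length_m=0.14)])
```

Real vowels depart from these evenly spaced values because the tongue, jaw and lips make the tube non-uniform, which is what moves the first two formants for different vowels.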
