Speech Coding: PCM, DPCM, ADPCM, LPC, CELP. A road map

Presentation transcript:

Speech Coding Basics: A Tutorial. Mahdi Amiri. Supervisor: Dr. H. R. Rabiee. April 2009, Sharif University of Technology

Speech Coding PCM DPCM ADPCM LPC CELP A road map Page 1 of 30 Speech Coding Basics

Pulse-code Modulation (PCM): Basics. PCM is a digital representation of an analog signal, obtained by sampling and quantization. Parameters: sampling rate (samples per second) and quantization levels (bits per sample).
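
The two PCM parameters can be illustrated with a minimal sketch. It assumes input samples normalized to [-1.0, 1.0] and a plain uniform quantizer that reconstructs each code as the midpoint of its bin; the function names and the 440 Hz test tone are hypothetical and not taken from the slides.

```python
import math

def pcm_encode(samples, bits):
    """Map samples in [-1.0, 1.0] to integer codes 0 .. 2**bits - 1."""
    levels = 2 ** bits
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))            # clip out-of-range input
        code = int((x + 1.0) / 2.0 * levels)  # floor into one of `levels` bins
        codes.append(min(code, levels - 1))   # x == 1.0 belongs to the top bin
    return codes

def pcm_decode(codes, bits):
    """Reconstruct each code as the midpoint of its quantization bin."""
    levels = 2 ** bits
    return [(c + 0.5) / levels * 2.0 - 1.0 for c in codes]

def sample_sine(freq_hz, rate_hz, n):
    """Sampling step: evaluate a sine wave at n instants spaced 1/rate_hz apart."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

# Illustrative values: a 440 Hz tone sampled at 8000 Hz, quantized to 4 bits.
signal = sample_sine(440.0, 8000, 16)
codes = pcm_encode(signal, 4)
decoded = pcm_decode(codes, 4)
```

With 4 bits the step size is 2/16 = 0.125, so each decoded sample lies within half a step (0.0625) of the original.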

Pulse-code Modulation (PCM): Why Call it PCM? (Figure: 4-bit PCM.)

Pulse-code Modulation (PCM): Bits per Second (bit/s). How to choose the proper… Sampling rate: 8 kHz? Quantization level: 8 bit/sample? Bit rate for 8000 Hz, 8-bit PCM: 8000 × 8 = 64 kbit/s.
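
The bit-rate arithmetic is just a product of the PCM parameters. A small sketch follows; the function name and the `channels` parameter are assumptions added for illustration, and the CD figure uses the well-known 44,100 Hz, 16-bit stereo format.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

telephone = pcm_bit_rate(8000, 8)          # 64,000 bit/s = 64 kbit/s
cd_audio = pcm_bit_rate(44100, 16, 2)      # 1,411,200 bit/s, about 1411 kbit/s
```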

Pulse-code Modulation (PCM): Sampling Rate. Human hearing frequency range: 20 Hz to 20 kHz. Play with “HearTest” to test your hearing. Most people will find that their hearing is most sensitive around 1-4 kHz and that it is less sensitive at high and low frequencies.

Pulse-code Modulation (PCM): Hearing Range. Animal names with their Persian equivalents: Ferret (Persian: Raasoo); Gerbil (Persian: Moosh Sahraaiee); Hedgehog (Persian: Joojeh Tighi); Possum (like opossums; Persian: Saarigh); Seal (Persian: Fok); Porpoise (Persian: Khook Daryaayee; similar to dolphins and whales).

Pulse-code Modulation (PCM): Sampling Rate. Human vocal range:
Normal: 80 Hz to 1100 Hz
Charles Kellogg: 14 kHz (not verified)
Guinness Book of Records, female: Georgia Brown (eight octaves, 25087 Hz)
Guinness Book of Records, male: Tim Storms (six octaves)
Georgia Brown's High Notes: Georgia Brown incredibly screams the high notes that made her the woman with the largest vocal range on the planet. www.youtube.com/watch?v=P6wSyIdwCFM
Tim Storms Sings Eight Hertz: Tim Storms demonstrates his low range. He sings so low you can't even hear it. www.youtube.com/watch?v=___sG3AJaNc

Pulse-code Modulation (PCM): Common Sampling Rates
8,000 Hz: telephone, adequate for human speech
11,025 Hz
22,050 Hz: radio
32,000 Hz: miniDV digital video camcorder, DAT (LP mode)
44,100 Hz: audio CD, also most commonly used with MPEG-1 audio (VCD, SVCD, MP3)
48,000 Hz: digital sound used for miniDV, digital TV, DVD, DAT, films and professional audio
96,000 or 192,000 Hz: DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD-DVD (High-Definition DVD) audio tracks
2.8224 MHz: SACD, 1-bit sigma-delta modulation process known as Direct Stream Digital, co-developed by Sony and Philips

Pulse-code Modulation (PCM): Quantization Levels. We want to prevent human ear fatigue by minimizing quantization noise. Signal-to-noise ratio: SNR = 6.02B dB for B bits per sample, i.e. approximately 6 dB per bit. 16-bit => 96 dB. Above 36 dB is required.
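
The 6.02 dB-per-bit figure is 20·log10(2): each extra bit halves the quantization step. A short sketch of the rule of thumb, with hypothetical helper names, shows both directions of the calculation, including the number of bits needed to clear the 36 dB threshold mentioned above.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB gained per extra bit

def snr_db(bits):
    """Rule-of-thumb quantization SNR for uniform PCM with `bits` per sample."""
    return DB_PER_BIT * bits

def bits_for_snr(target_db):
    """Smallest whole number of bits whose rule-of-thumb SNR reaches target_db."""
    return math.ceil(target_db / DB_PER_BIT)

snr_16 = snr_db(16)          # roughly 96.3 dB
bits_36 = bits_for_snr(36)   # 6 bits
```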

Pulse-code Modulation (PCM): Good to Know. The average person cannot tell the difference between audio encoded at a bitrate above 192 kbit/s and the original CD/WAV. Even if your headphones seal really well around your ears, they will probably only give you about 20 to 25 dB of insulation from external sound.

Pulse-code Modulation (PCM): Images

Pulse-code Modulation (PCM): μ-law, A-law. Nonuniform quantizers are difficult to make and expensive. Solution: Companding → Uniform Q. → Expanding (compress the signal, quantize it uniformly, then expand it again).
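
The compress, quantize, expand chain can be sketched with the continuous μ-law curve. This is an illustration under stated assumptions: μ = 255, samples normalized to [-1, 1], and the textbook logarithmic formula rather than the segmented tables used in telephone codecs. All function names are hypothetical.

```python
import math

MU = 255.0  # assumed value; the common choice for 8-bit mu-law

def mu_compress(x):
    """Logarithmic compression: boosts small amplitudes before quantizing."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def uniform_quantize(y, bits):
    """Uniform quantizer on [-1, 1] with midpoint reconstruction."""
    levels = 2 ** bits
    code = min(int((y + 1.0) / 2.0 * levels), levels - 1)
    return (code + 0.5) / levels * 2.0 - 1.0

def companded_quantize(x, bits=8):
    """Compress, quantize uniformly, then expand."""
    return mu_expand(uniform_quantize(mu_compress(x), bits))

quiet = 0.01  # a low-amplitude sample
error_uniform = abs(uniform_quantize(quiet, 8) - quiet)
error_companded = abs(companded_quantize(quiet, 8) - quiet)
```

For the quiet sample the companded path gives a much smaller error than the plain uniform quantizer at the same 8 bits, which is the effect a nonuniform quantizer is meant to deliver.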

Pulse-code Modulation (PCM): μ-law, A-law

Pulse-code Modulation (PCM): μ-law, A-law. μ-law: North America and Japan. A-law: Europe.

Differential PCM (DPCM): Idea. Unfortunately, this does not work on analog sources since $d_n \neq \hat{d}_n$, and thus $P_n \neq \hat{P}_n$. This leads to error accumulation!

Differential PCM (DPCM): Basic Scheme. General predictive coding. Problem?

Differential PCM (DPCM): Better Structure
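
The error accumulation noted for the DPCM idea, and the usual remedy, can be sketched as follows. This assumes the simplest predictor (the previous sample) and a uniform quantizer for the difference; the slides' diagrams are not in the transcript, so the closed-loop structure below is the standard textbook arrangement rather than a reproduction of the slide. Function names are hypothetical.

```python
def quantize(d, step):
    """Uniform quantizer for the prediction difference."""
    return step * round(d / step)

def dpcm_open_loop(samples, step):
    """Encoder predicts from the ORIGINAL previous sample.
    The decoder only has quantized differences, so its errors add up."""
    decoded, prev_input, recon = [], 0.0, 0.0
    for x in samples:
        dq = quantize(x - prev_input, step)
        recon += dq                 # decoder: running sum of quantized differences
        decoded.append(recon)
        prev_input = x
    return decoded

def dpcm_closed_loop(samples, step):
    """Encoder predicts from the RECONSTRUCTED previous sample,
    i.e. from exactly what the decoder will have."""
    decoded, recon = [], 0.0
    for x in samples:
        dq = quantize(x - recon, step)
        recon += dq
        decoded.append(recon)
    return decoded

ramp = [0.3 * k for k in range(1, 21)]   # slow ramp: every true difference is 0.3
open_out = dpcm_open_loop(ramp, 1.0)
closed_out = dpcm_closed_loop(ramp, 1.0)
open_error = abs(ramp[-1] - open_out[-1])
closed_error = max(abs(a - b) for a, b in zip(ramp, closed_out))
```

On the ramp, the open-loop coder rounds every 0.3 difference to zero and drifts further from the input with each sample, while the closed-loop coder keeps the error within half a quantization step.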

Adaptive DPCM (ADPCM): Idea. Problem? Unfortunately, this does not work on analog sources since $d_n \neq \hat{d}_n$, and thus $P_n \neq \hat{P}_n$. This leads to error accumulation!

Adaptive DPCM (ADPCM): Size of Quantization Step
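
One way to adapt the quantization step is to grow it when the quantizer saturates and shrink it when the codes are small, using only the transmitted code so the decoder can follow without side information. The sketch below illustrates that idea only: the multipliers 1.5 and 0.9, the step limits, and all function names are assumptions and do not come from the slides or from any ADPCM standard.

```python
def adapt_step(step, code, max_code, min_step=0.01, max_step=10.0):
    """Update the step size from the last transmitted code only."""
    if abs(code) == max_code:     # quantizer saturated: signal is moving fast
        step *= 1.5
    elif abs(code) <= 1:          # tiny code: signal is quiet, refine the step
        step *= 0.9
    return min(max(step, min_step), max_step)

def adpcm_encode(samples, bits=3, step=0.1):
    max_code = 2 ** (bits - 1) - 1          # symmetric codes -max_code..max_code
    codes, recon = [], 0.0
    for x in samples:
        code = round((x - recon) / step)
        code = max(-max_code, min(max_code, code))
        codes.append(code)
        recon += code * step                # track what the decoder will rebuild
        step = adapt_step(step, code, max_code)
    return codes

def adpcm_decode(codes, bits=3, step=0.1):
    max_code = 2 ** (bits - 1) - 1
    out, recon = [], 0.0
    for code in codes:
        recon += code * step
        out.append(recon)
        step = adapt_step(step, code, max_code)   # same rule as the encoder
    return out

jump = [0.0] * 5 + [5.0] * 30     # silence followed by a sudden level change
codes = adpcm_encode(jump)
decoded = adpcm_decode(codes)
```

Because both sides apply `adapt_step` to the same code sequence, their step sizes stay in lockstep; after the jump the step grows until the reconstruction catches up, then shrinks again.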

Speech Compression Concepts: Spectrogram, STFT. (Figure: 3D surface spectrogram of a part from a music piece.)

Speech Compression Concepts: Spectrogram. (Figure: spectrogram of a male voice saying ‘nineteenth century’.)

Speech Compression Concepts: Spectrogram, Demonstration. Examples: Bat Echolocation Call; Flute by Jean Pierre Rampal; Face!; Singing Voice.

Speech Compression Concepts: Formant

Linear Predictive Coding (LPC): Modeling

Linear Predictive Coding (LPC): Modeling (Hiss or Buzz). Buzzer → Filter. Chunks: 30 to 50 frames/sec. Speech = Formants + Residue. Predictor for each frame: (equation shown on slide)
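
A per-frame predictor is commonly obtained with the autocorrelation method and the Levinson-Durbin recursion. The slide's own equation is not in the transcript, so the sketch below is one standard way to do it, not necessarily the author's. It assumes the predictor form x[n] ≈ a1·x[n-1] + … + ap·x[n-p]; the synthetic two-pole test frame and all names are hypothetical.

```python
def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (coefficients a[1..order], residual energy) for the
    predictor x[n] ~ sum(a[k] * x[n-k])."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                     # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)              # energy left in the residue
    return a[1:], err

def two_pole_frame(n, a1=1.3, a2=-0.6):
    """Synthetic frame: impulse response of x[n] = a1*x[n-1] + a2*x[n-2] + impulse."""
    x = []
    for i in range(n):
        prev1 = x[i - 1] if i >= 1 else 0.0
        prev2 = x[i - 2] if i >= 2 else 0.0
        x.append((1.0 if i == 0 else 0.0) + a1 * prev1 + a2 * prev2)
    return x

# 200 samples at 8000 Hz is a 25 ms frame, i.e. 40 frames per second.
frame = two_pole_frame(200)
coeffs, residual_energy = levinson_durbin(autocorr(frame, 2), 2)
```

On this synthetic frame the recursion recovers the filter that generated it (about 1.3 and -0.6). On real speech the coefficients describe the formant filter, and whatever the predictor cannot explain is the residue.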

Linear Predictive Coding (LPC): Modeling (Hiss or Buzz)

Code Excited Linear Prediction (CELP). Problem of LPC: cases where there is both hiss and buzz. Solution: encode the residue. Method: vector quantization (codebook).
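
The codebook idea can be sketched as a nearest-neighbour search: the encoder sends only the index of the codebook entry closest to the residue, and the decoder looks the entry up in its own copy. This is plain vector quantization under simplifying assumptions; practical CELP coders also search gains and compare candidates after synthesis filtering. The tiny codebook, the residue values and the function name are made up for illustration.

```python
def nearest_codeword(vector, codebook):
    """Return (index, squared error) of the codebook entry closest to `vector`."""
    best_index, best_error = 0, float("inf")
    for index, word in enumerate(codebook):
        error = sum((v - w) ** 2 for v, w in zip(vector, word))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error

# Hypothetical 4-entry codebook shared by encoder and decoder.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
    [1.0, -1.0, 1.0, -1.0],
]
residue = [0.9, 0.1, -0.8, 0.05]
index, error = nearest_codeword(residue, codebook)
approximation = codebook[index]   # what the decoder reconstructs from the index
```

With four entries the index costs 2 bits, compared with four separately quantized residue samples.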

Comparison: Sample Speech. “A lathe is a big tool. Grab every dish of sugar.”

Comparison: Demonstration. Original, ADPCM, LPC, CELP.

Thank You. Speech Coding Basics: A Tutorial. Find out more at: 1. http://ce.sharif.edu/~m_amiri/ 2. http://www.aictct.ir/dml/
