Speech Processing: Speech Coding
Veton Këpuska

Speech Coding
Definition: speech coding is a process that leads to the representation of analog waveforms by sequences of binary digits.
- Even though the availability of high-bandwidth communication channels has increased, speech coding for bit-rate reduction has retained its importance: reduced bit-rate transmission is required for cellular networks and Voice over IP.
- Coded speech is less sensitive than analog signals to transmission noise, and is easier to protect against bit errors, encrypt, multiplex, and packetize.
- A typical scenario is depicted in the next slide (Figure 12.1).

Digital Telephone Communication System
(Figure: block diagram of a digital telephone communication system.)

Categorization of Speech Coders
- Waveform coders: quantize speech samples directly and operate at high bit rates, in the range of 16-64 kbps (bps = bits per second).
- Vocoders: largely model-based and operating at a low bit-rate range of 1.2-4.8 kbps; they tend to be of lower quality than waveform and hybrid coders.
- Hybrid coders: partly waveform coders and partly speech-model-based coders, operating in the mid bit-rate range of 2.4-16 kbps.

Quality Measurements
Quality of coding is viewed as the closeness of the processed speech to the original speech or some other desired speech waveform:
- Naturalness
- Degree of background artifacts
- Intelligibility
- Speaker identifiability
- Etc.

Quality Measurements
Subjective measurement:
- The Diagnostic Rhyme Test (DRT) measures intelligibility.
- The Diagnostic Acceptability Measure (DAM) and Mean Opinion Score (MOS) tests provide a more complete quality judgment.
Objective measurement:
- Segmental signal-to-noise ratio (SNR): average SNR over short-time segments.
- Articulation Index: relies on an average SNR across frequency bands.

Quality Measurements
A more complete list and definition of subjective and objective measures can be found in:
- J.R. Deller, J.G. Proakis, and J.H.L. Hansen, Discrete-Time Processing of Speech Signals, Macmillan Publishing Co., New York, NY, 1993.
- S.R. Quackenbush, T.P. Barnwell, and M.A. Clements, Objective Measures of Speech Quality, Prentice Hall, Englewood Cliffs, NJ, 1988.

Sampling and Reconstruction of Signals

Analog-to-Digital Conversion
- Continuous signal xa(t).
- Sampled signal x[n] = xa(nT), with sampling period T satisfying the Nyquist rate as specified by the sampling theorem.
- Digital sequence x̂[n] obtained after sampling and quantization.

Digital-to-Analog Conversion
- Processed digital signal y[n].
- Continuous signal representation ya(nT).
- Low-pass filtered continuous signal y(t).

Statistical Models for Quantization

Statistical Models
- The speech waveform is viewed as a random process. Various estimates are important from this statistical perspective: probability density; mean, variance, and autocorrelation.
- One approach to estimating the probability density function (pdf) of x[n] is through a histogram:
  - Count the number of occurrences of the value of each speech sample in pre-defined ranges, for many speech samples over a long time duration.
  - Normalize the area of the resulting curve to unity.
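As a concrete illustration of the histogram approach, here is a minimal NumPy sketch; the bin count and the Laplacian-distributed stand-in for long-duration speech samples are illustrative assumptions, not from the text.

```python
import numpy as np

def estimate_pdf(x, n_bins=101):
    """Histogram-based pdf estimate: count occurrences of sample values
    in pre-defined amplitude ranges, then normalize the area to unity."""
    counts, edges = np.histogram(x, bins=n_bins)
    widths = np.diff(edges)
    pdf = counts / (counts.sum() * widths)      # area under the curve = 1
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, pdf

# Laplacian-distributed samples stand in for a long speech recording:
x = np.random.laplace(scale=1.0 / np.sqrt(2), size=100_000)
centers, pdf = estimate_pdf(x)
print(np.trapz(pdf, centers))                   # ~1.0, as required
```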

Statistical Models
The histogram of speech (Davenport; Paez and Glisson) was shown to approximate a gamma density:
  p_x(x) = [ √3 / (8π σ_x |x|) ]^{1/2} · exp( −√3 |x| / (2σ_x) )
where σ_x is the standard deviation of the pdf. A simpler approximation is given by the Laplacian pdf of the form:
  p_x(x) = ( 1 / (√2 σ_x) ) · exp( −√2 |x| / σ_x )

PDF of Speech (figure)

PDF Models of Speech

Example of Distributions
(Figure: gamma distribution and Laplacian distribution.)

Example of Distributions
Gaussian distribution:
  p_x(x) = ( 1 / √(2π σ_x²) ) · exp( −x² / (2σ_x²) )
where σ_x in the above relations is the standard deviation.

Scalar Quantization

Scalar Quantization
- Assume a sequence x[n] obtained from a speech waveform that has been lowpass-filtered and sampled at a suitable rate, with infinite amplitude precision.
- The samples x[n] are quantized to a finite set of amplitudes denoted by x̂[n]. Associated with the quantizer is a quantization step size.
- Quantization allows the amplitudes to be represented by a finite set of bit patterns, i.e., symbols.
- Encoding: mapping of x̂[n] to a finite set of symbols; this mapping yields a sequence of codewords denoted by c[n] (Figure 12.3a).
- Decoding: the inverse process, whereby the transmitted sequence of codewords c′[n] is transformed back to a sequence of quantized samples (Figure 12.3b).

Scalar Quantization (figure)

Fundamentals
- Assume the signal amplitude is quantized into M levels. The quantizer operator is denoted by Q(x); thus
  x̂[n] = Q( x[n] )
where x̂_i, 1 ≤ i ≤ M, denotes the M possible reconstruction levels (quantization levels), and x_i, 0 ≤ i ≤ M, denotes the M+1 possible decision levels.
- If x_{i−1} < x[n] ≤ x_i, then x[n] is quantized to the reconstruction level x̂_i; x̂[n] is the quantized sample of x[n].

Fundamentals
Scalar quantization example:
- Assume there are M = 4 reconstruction levels.
- The amplitude of the input signal x[n] falls in the range [0, 1].
- Decision levels and reconstruction levels are equally spaced: Δ = 1/4.

Fundamentals
- With M = 4, the decision levels are [0, 1/4, 1/2, 3/4, 1] and the reconstruction levels are [1/8, 3/8, 5/8, 7/8].
- See Figure 12.4 in the next slide.

(Figure 12.4: reconstruction levels labeled with the 2-bit codewords 11, 10, 01, 00.)

Example of Uniform 2-bit Quantizer (figure)

Uniform Quantizer
- A uniform quantizer is one whose decision and reconstruction levels are uniformly spaced. Specifically:
  x_i − x_{i−1} = Δ,  1 ≤ i ≤ M
  x̂_i = ( x_i + x_{i−1} ) / 2,  1 ≤ i ≤ M
- Δ is the step size, equal to the spacing between two consecutive decision levels, which is the same as the spacing between two consecutive reconstruction levels (Exercise 12.1).
- Each reconstruction level is attached to a symbol, the codeword. Binary numbers are typically used to represent the quantized samples (Figure 12.4).

Example 2.2
- Assume there are L = 16 reconstruction levels, that input values fall within the range [x_min = −1, x_max = 1], and that each value in this range is equally likely.
- Decision levels and reconstruction levels are equally spaced, with step size
  Δ = ( x_max − x_min ) / L = 1/8
- Decision levels: x_i = x_min + iΔ, i = 0, …, L.
- Reconstruction levels: x̂_i = x_min + ( i − 1/2 )Δ, i = 1, …, L.
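A minimal Python sketch of this quantizer (a mid-rise convention is assumed, with reconstruction levels at the midpoints of the decision intervals; the test values are hypothetical):

```python
import numpy as np

# Uniform quantizer for the example above: L = 16 levels on [-1, 1].
x_min, x_max, L = -1.0, 1.0, 16
delta = (x_max - x_min) / L                          # step size = 1/8

decision = x_min + delta * np.arange(L + 1)          # L + 1 decision levels
recon = x_min + delta * (np.arange(L) + 0.5)         # L reconstruction levels

def quantize(x):
    """Map each sample to the reconstruction level of its decision interval."""
    i = np.clip(np.floor((x - x_min) / delta).astype(int), 0, L - 1)
    return recon[i]

print(quantize(np.array([-0.93, -0.2, 0.0, 0.41, 0.99])))
# [-0.9375 -0.1875  0.0625  0.4375  0.9375]
```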


Uniform Quantizer
- Codebook: the collection of codewords. In general, with a B-bit binary codebook there are 2^B different quantization (or reconstruction) levels.
- Bit rate is defined as the number of bits B per sample multiplied by the sample rate f_s: I = B·f_s.
- The decoder inverts the coder operation, taking the codeword back to a quantized amplitude value (e.g., 01 → x̂_1).
- Often the goal of speech coding/decoding is to maintain the bit rate as low as possible while maintaining a required level of quality.
- Because the sampling rate is fixed for most applications, this goal implies that the bit rate be reduced by decreasing the number of bits per sample.

Uniform Quantizer
- Designing a uniform scalar quantizer requires knowledge of the maximum value of the sequence. Typically the range of the speech signal is expressed in terms of the standard deviation of the signal.
- Specifically, it is often assumed that −4σ_x ≤ x[n] ≤ 4σ_x, where σ_x is the signal's standard deviation.
- Under the assumption that speech samples obey a Laplacian pdf, approximately 0.35% of the speech samples fall outside the range −4σ_x ≤ x[n] ≤ 4σ_x.
- Assume a B-bit binary codebook ⇒ 2^B levels, and maximum signal value x_max = 4σ_x.

Uniform Quantizer
For the uniform quantization step size Δ we get:
  Δ = 2·x_max / 2^B = 8σ_x / 2^B
The quantization step size Δ relates directly to the notion of quantization noise.

Quantization Noise
Two classes of quantization noise: granular distortion and overload distortion.
- Granular distortion: with x[n] the unquantized signal and e[n] the quantization noise,
  x̂[n] = x[n] + e[n]
- For a given step size Δ, the magnitude of the quantization noise e[n] can be no greater than Δ/2, that is:
  −Δ/2 ≤ e[n] ≤ Δ/2
- Figure 12.5 depicts this property, where e[n] = x̂[n] − x[n].

Quantization Noise (figure)

Quantization Noise
Overload distortion:
- Maximum-value constant: x_max = 4σ_x (−4σ_x ≤ x[n] ≤ 4σ_x).
- For a Laplacian pdf, 0.35% of the speech samples fall outside the range of the quantizer.
- Clipped samples incur a quantization error in excess of Δ/2. Due to the small number of clipped samples, it is common to neglect the infrequent large errors in theoretical calculations.

Quantization Noise
Statistical model of quantization noise:
- The desired approach in analyzing the quantization error in numerous applications.
- The quantization error is considered an ergodic white-noise random process. The autocorrelation function of such a process is expressed as:
  r_e[m] = E( e[n] e[n+m] ) = σ_e² δ[m]

Quantization Error
- The previous expression states that the process is uncorrelated from sample to sample.
- Furthermore, it is also assumed that the quantization noise and the input signal are uncorrelated, i.e., E( x[n] e[n+m] ) = 0 for all m.
- The final assumption is that the pdf of the quantization noise is uniform over the quantization interval:
  p_e(e) = 1/Δ for −Δ/2 ≤ e ≤ Δ/2, and 0 otherwise
so that the noise variance is σ_e² = Δ²/12.
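The white, uniform noise model can be checked empirically; a sketch assuming an 8-bit mid-rise quantizer and a uniformly distributed input (so the signal exercises all quantization levels, as the model requires):

```python
import numpy as np

rng = np.random.default_rng(0)

B = 8                                   # bits per sample (assumed)
x_max = 1.0
delta = 2 * x_max / 2**B                # quantization step size
x = rng.uniform(-x_max, x_max, 200_000) # input that exercises all levels

x_hat = delta * (np.floor(x / delta) + 0.5)   # uniform mid-rise quantizer
e = x_hat - x                                 # quantization noise

print(e.var(), delta**2 / 12)           # the two nearly match
r = np.array([np.dot(e[: len(e) - t], e[t:]) for t in range(4)])
print(r / r[0])                         # ~[1, 0, 0, 0]: impulse-like, white
```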

Quantization Error
- The stated assumptions are not always valid. Consider a slowly varying (e.g., linearly varying) signal: then e[n] also changes linearly and is signal-dependent (see Figure 12.5 in the previous slide). Correlated quantization noise can be annoying.
- When the quantization step Δ is small, the assumptions that the noise is uncorrelated with itself and with the signal are roughly valid when the signal fluctuates rapidly among all quantization levels. The quantization error then approaches a white-noise process with an impulsive autocorrelation and a flat spectrum.
- One can force e[n] to be white noise and uncorrelated with x[n] by adding white noise to x[n] prior to quantization.

Example
- For a periodic sine-wave input signal, compare 3-bit and 8-bit quantization.
- The MATLAB fix function is used to simulate quantization. The following figure depicts the result of the analysis.

Example
In the figure: (a) the sequence x[n] with infinite precision; (b) the quantized version x̂[n]; (c) the quantization error e[n] for B = 3 bits (L = 8 quantization levels); and (d) the quantization error for B = 8 bits (L = 256 quantization levels).
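A Python approximation of this experiment (the text uses MATLAB's fix; np.fix is its NumPy counterpart, rounding toward zero). The amplitude, frequency, and signal range are assumed for illustration, as the original expression is not preserved here.

```python
import numpy as np

# Quantize a sinusoid with B = 3 and B = 8 bits, mimicking the
# fix-based (truncation toward zero) quantizer of the example.
n = np.arange(200)
x = np.sin(2 * np.pi * n / 50)            # assumed test frequency

for B in (3, 8):
    L = 2**B                              # number of quantization levels
    delta = 2.0 / L                       # signal range [-1, 1] assumed
    x_hat = delta * np.fix(x / delta)     # truncation-style quantizer
    e = x - x_hat
    print(B, np.abs(e).max())             # error magnitude bounded by ~delta
```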

Example
Quantizing π = 3.1415926535…, with x[n] = π and x_{i−1} < x[n] < x_i, using successively finer decision levels:
  x_{i−1} = 3, x_i = 4;  x_{i−1} = 3.1, x_i = 3.2;  x_{i−1} = 3.14, x_i = 3.15;  x_{i−1} = 3.141, x_i = 3.142;  …;  x_{i−1} = 3.1415926535, x_i = 3.1415926536.

First 10,000 digits of π:
3.1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679 … (remaining digits omitted)

Quantization Error
- The process of adding white noise prior to quantization is known as dithering. This decorrelation technique was shown to be useful not only in improving the perceptual quality of the quantization noise but also with image signals.
- Signal-to-noise ratio (SNR): a measure to quantify the severity of the quantization noise, relating the strength of the signal to the strength of the quantization noise.

Quantization Error
SNR is defined as:
  SNR = σ_x² / σ_e²
Given the assumptions of quantizer range 2·x_max, quantization interval Δ = 2·x_max / 2^B for a B-bit quantizer, and a uniform noise pdf, it can be shown (Exercise 12.2) that:
  σ_e² = Δ²/12 = x_max² / (3·2^{2B})

Quantization Error
Thus the SNR can be expressed as:
  SNR = σ_x² / σ_e² = 3·2^{2B} · ( σ_x / x_max )²
or in decibels (dB) as:
  SNR(dB) ≈ 6.02·B + 4.77 − 20·log₁₀( x_max / σ_x )
Because x_max = 4σ_x, SNR(dB) ≈ 6B − 7.2.
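A quick numerical check of the "6 dB per bit" rule under the x_max = 4σ_x assumption (the bit depths chosen below are just the ones discussed in the surrounding slides):

```python
import math

# SNR(dB) ~ 6.02*B + 4.77 - 20*log10(x_max / sigma_x), with x_max = 4*sigma_x:
for B in (8, 11, 16):
    snr_db = 6.02 * B + 4.77 - 20 * math.log10(4)
    print(B, round(snr_db, 1))   # 8 -> 40.9, 11 -> 59.0, 16 -> 89.1 dB
```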

Quantization Error
- The presented quantization scheme is called pulse code modulation (PCM): B bits per sample are transmitted as a codeword.
- Advantages: it is instantaneous (no coding delay) and independent of the signal content (voice, music, etc.).
- Disadvantages: it requires a minimum of 11 bits per sample to achieve "toll quality" (equivalent to typical telephone quality).
- For a 10,000 Hz sampling rate, the required bit rate is (11 bits/sample) × (10,000 samples/sec) = 110,000 bps = 110 kbps.
- For a CD-quality signal with a sample rate of 20,000 Hz and 16 bits/sample, SNR(dB) = 96 − 7.2 = 88.8 dB and the bit rate is 320 kbps.

Nonuniform Quantization

Nonuniform Quantization
- Uniform quantization may not be optimal: the quantization error may not be as small as possible for a given number of decision and reconstruction levels.
- Consider, for example, a speech signal for which x[n] is much more likely to be in one particular region than in others (low values occurring much more often than high values). This implies that decision and reconstruction levels are not being utilized effectively with uniform intervals over x_max.
- A nonuniform quantizer that is optimal (in a least-squared-error sense) for a particular pdf is referred to as the Max quantizer.
- An example of a nonuniform quantizer is given in the figure in the next slide.

Nonuniform Quantization (figure)

Nonuniform Quantization
Max quantizer problem definition:
- For a random variable x with a known pdf, find the set of M quantizer levels that minimizes the quantization error: that is, find the decision levels x_i and reconstruction levels x̂_i that minimize the mean-squared error (MSE) distortion measure
  D = E[ ( x − x̂ )² ]
where E denotes expected value and x̂ is the quantized version of x.
- It turns out that the optimal decision level x_k is given by:
  x_k = ( x̂_k + x̂_{k+1} ) / 2,  1 ≤ k ≤ M−1

Nonuniform Quantization
Max quantizer (cont.):
- The optimal reconstruction level x̂_k is the centroid of p_x(x) over the interval x_{k−1} ≤ x ≤ x_k:
  x̂_k = ( ∫_{x_{k−1}}^{x_k} x·p_x(x) dx ) / ( ∫_{x_{k−1}}^{x_k} p_x(x) dx )
It is interpreted as the mean value of x over the interval x_{k−1} ≤ x ≤ x_k for the normalized pdf p̃(x).
- Solving the last two equations for x_k and x̂_k is a nonlinear problem in these two variables. An iterative solution is required, which in turn requires obtaining the pdf (which can be difficult).
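A sketch of the iterative solution, in the Lloyd style, run on empirical data so that no closed-form pdf is needed; the Laplacian input and M = 4 are illustrative assumptions, not from the text.

```python
import numpy as np

def lloyd_max(samples, M, n_iter=100):
    """Lloyd-style iteration for an M-level Max quantizer on empirical data:
    reconstruction levels are centroids of their decision regions, and
    decision levels are midpoints between adjacent reconstruction levels."""
    recon = np.quantile(samples, (np.arange(M) + 0.5) / M)  # initial guess
    for _ in range(n_iter):
        edges = 0.5 * (recon[:-1] + recon[1:])   # optimal decision levels
        idx = np.searchsorted(edges, samples)    # region of each sample
        for k in range(M):
            sel = samples[idx == k]
            if sel.size:
                recon[k] = sel.mean()            # centroid update
    return recon, edges

x = np.random.laplace(scale=1 / np.sqrt(2), size=50_000)  # speech-like pdf
levels, edges = lloyd_max(x, M=4)
print(levels)   # nonuniformly spaced: denser near zero, where p(x) is large
```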

Nonuniform Quantization (figure)

Companding
- An alternative to the nonuniform quantizer is companding. It is based on the fact that a uniform quantizer is optimal for a uniform pdf.
- Thus, if a nonlinearity is applied to the waveform x[n] to form a new sequence g[n] whose pdf is uniform, then a uniform quantizer can be applied to g[n] to obtain ĝ[n], as depicted in Figure 12.10 in the next slide.

Companding (figure)

Companding
- A number of nonlinear approximations of the transformation that achieves a uniform density are used in practice and do not require pdf measurement: specifically, A-law and μ-law companding.
- μ-law coding is given by:
  T( x[n] ) = x_max · ( log( 1 + μ|x[n]|/x_max ) / log( 1 + μ ) ) · sign( x[n] )
- The CCITT international standard coder at 64 kbps is an example application of μ-law coding: a μ-law transformation followed by 7-bit uniform quantization, giving toll-quality speech. Equivalent quality with straight uniform quantization would require 11 bits.
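A sketch of the μ-law transformation and its inverse; μ = 255 is the common telephony choice, assumed here rather than stated in the text.

```python
import numpy as np

MU = 255.0   # common mu value in PCM telephony (assumed here)

def mu_compress(x, x_max=1.0, mu=MU):
    """mu-law compression T(x[n]) as given above."""
    return x_max * np.log1p(mu * np.abs(x) / x_max) / np.log1p(mu) * np.sign(x)

def mu_expand(y, x_max=1.0, mu=MU):
    """Inverse mu-law transformation (expansion)."""
    return x_max / mu * np.expm1(np.abs(y) / x_max * np.log1p(mu)) * np.sign(y)

x = np.linspace(-1, 1, 5)
print(np.allclose(mu_expand(mu_compress(x)), x))   # True: exact inversion
```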

Adaptive Coding
- Nonuniform quantizers are optimal for the long-term pdf of the speech signal. However, considering that speech is a highly time-varying signal, one has to question whether a single pdf derived from a long-time speech waveform is a reasonable assumption.
- Changes in the speech waveform: temporal and spectral variations due to transitions from unvoiced to voiced speech; rapid volume changes.
- Approach: estimate a short-time pdf derived over 20-40 ms intervals. Short-time pdf estimates are more accurately described by a Gaussian pdf, regardless of the speech class.

Adaptive Coding
- A pdf derived from a short-time speech segment more accurately represents the speech nonstationarity.
- One approach is to assume a pdf of a specific shape, in particular a Gaussian, with unknown variance σ². Measure the local variance, then adapt a nonuniform quantizer to the resulting local pdf. This approach is referred to as adaptive quantization.
- For a Gaussian we have:
  p_x(x) = ( 1 / √(2πσ²) ) · exp( −x² / (2σ²) )

Adaptive Coding
- Measure the variance σ_x² of a sequence x[n] and use the resulting pdf to design an optimal Max quantizer.
- Note that a change in the variance simply scales the time signal: if E( x²[n] ) = σ_x², then E[ ( βx[n] )² ] = β²σ_x².
- One thus needs to design only one nonuniform quantizer with unity variance and scale its decision and reconstruction levels according to a particular variance; alternatively, fix the quantizer and apply a time-varying gain to the signal according to the estimated variance (scale the signal to match the quantizer).

Adaptive Coding (figure)

Adaptive Coding
There are two possible approaches for estimating a time-varying variance σ²[n]:
- Feed-forward method (shown in Figure 12.11), where the variance (or gain) estimate is obtained from the input.
- Feed-back method, where the estimate is obtained from the quantizer output. Advantage: no need to transmit extra side information (the quantized variance). Disadvantage: additional sensitivity to transmission errors in codewords.
Adaptive quantizers can achieve higher SNR than μ-law companding. μ-law companding is generally preferred for high-rate waveform coding because of its lower background noise when the transmission channel is idle. Adaptive quantization is useful in a variety of other coding schemes. A sketch of the feed-forward variant follows.
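A minimal sketch of feed-forward adaptive quantization as described above: one fixed unit-variance uniform quantizer plus a per-frame gain that would be transmitted as side information. The frame length and bit depth are illustrative assumptions.

```python
import numpy as np

def adaptive_quantize(x, frame=160, B=4):
    """Feed-forward adaptive quantization sketch: per frame, estimate the
    local gain (std dev), normalize, and apply one fixed B-bit uniform
    quantizer designed for unit variance with range +/- 4 sigma."""
    delta = 8.0 / 2**B                   # step of the unit-variance quantizer
    out = np.empty_like(x)
    for start in range(0, len(x), frame):
        seg = x[start:start + frame]
        gain = seg.std() + 1e-12         # side information sent to the decoder
        g = np.clip(seg / gain, -4.0, 4.0)
        q = delta * (np.floor(g / delta) + 0.5)
        out[start:start + frame] = gain * q   # decoder rescales by the gain
    return out
```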

Differential and Residual Quantization
- The presented methods are examples of instantaneous quantization. Those approaches do not take advantage of the fact that speech, music, etc. are highly correlated signals, over both short time spans (10-15 samples) and long time spans (over a pitch period).
- In this section, methods that exploit short-time correlation are investigated.

Differential and Residual Quantization
Short-time correlation: neighboring samples are "self-similar", that is, not changing too rapidly from one another.
- The difference of adjacent samples should have a lower variance than the variance of the signal itself. This difference would thus make more effective use of quantization levels: higher SNR for a fixed number of quantization levels.
- The next sample is predicted from previous ones (finding the best prediction coefficients to yield a minimum mean-squared prediction error; the same methodology as in LPC of Chapter 5). Two approaches:
  - Use a fixed prediction filter that reflects the average local correlation of the signal.
  - Allow the predictor to adapt over short time spans to the signal's local correlation; this requires transmission of the quantized prediction coefficients as well as the prediction error.

Differential and Residual Quantization
An illustration of a particular error-encoding scheme is presented in Figure 12.12 of the next slide. In this scheme the following sequences are required:
- x̃[n]: prediction of the input sample x[n]; this is the output of the predictor P(z), whose input is the quantized version x̂[n] of the input signal x[n].
- r[n]: the prediction error (residual) signal, r[n] = x[n] − x̃[n].
- r̂[n]: the quantized prediction error signal.
This approach is sometimes referred to as residual coding.

Differential and Residual Quantization (figure)

Differential and Residual Quantization
- The quantizer in the previous scheme can be of any type: fixed, adaptive, uniform, or nonuniform. Whatever the case, the parameters of the quantizer are determined so as to match the variance of r[n].
- Differential quantization can also be applied to speech, music, etc., or to parameters that represent them: LPC (linear prediction) coefficients, cepstral coefficients obtained from homomorphic filtering, sinewave parameters, etc.

Differential and Residual Quantization
Consider the quantization error of the quantized residual:
  e_r[n] = r[n] − r̂[n]
From Figure 12.12, we can express the quantized input x̂[n] as:
  x̂[n] = x̃[n] + r̂[n] = x̃[n] + r[n] − e_r[n] = x[n] − e_r[n]

Differential and Residual Quantization
- The quantized signal samples differ from the input only by the quantization error e_r[n], which is the quantization error of the residual.
- If the prediction of the signal is accurate, then the variance of r[n] will be smaller than the variance of x[n] ⇒ a quantizer with a given number of levels can be adjusted to give a smaller quantization error than would be possible when quantizing the signal directly.

Differential and Residual Quantization
The differential coder of Figure 12.12 is referred to as:
- Differential PCM (DPCM) when used with a fixed predictor and fixed quantization.
- Adaptive Differential PCM (ADPCM) when used with adaptive prediction (adapting the predictor to the local correlation) and adaptive quantization (adapting the quantizer to the local variance of r[n]). ADPCM yields the greatest gains in SNR for a fixed bit rate.
- The international coding standard CCITT G.721, with toll-quality speech at 32 kbps (8000 samples/sec × 4 bits/sample), was designed based on ADPCM techniques.
- To achieve higher quality at lower rates, one must rely on speech model-based techniques and exploit long-time prediction as well as short-time prediction.
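A first-order DPCM sketch in the spirit of Figure 12.12: the predictor runs on quantized samples, so the decoder (which sees only r̂[n]) forms identical predictions. The predictor coefficient, step size, and test input are illustrative assumptions.

```python
import numpy as np

def dpcm(x, a=0.9, delta=0.05):
    """First-order DPCM: fixed predictor P(z) = a*z^-1 and a fixed uniform
    quantizer applied to the residual."""
    x_hat = np.zeros(len(x))
    r_hat = np.zeros(len(x))
    prev = 0.0                                   # last quantized sample
    for n in range(len(x)):
        pred = a * prev                          # x_tilde[n]
        r = x[n] - pred                          # residual r[n]
        r_hat[n] = delta * np.round(r / delta)   # quantized residual
        x_hat[n] = pred + r_hat[n]               # x_hat[n] = x[n] - e_r[n]
        prev = x_hat[n]
    return r_hat, x_hat                          # transmit r_hat only

x = np.sin(2 * np.pi * np.arange(400) / 80)      # slowly varying test input
r_hat, x_hat = dpcm(x)
print(np.abs(x - x_hat).max())                   # bounded by delta/2 = 0.025
```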

Differential and Residual Quantization
An important variation of the differential quantization scheme of Figure 12.12:
- The prediction so far has assumed an all-pole (autoregressive) model, in which the signal value is predicted from its past samples. Any error in a codeword, due for example to bit errors over a degraded channel, propagates over considerable time during decoding. Such error propagation is severe when the signal values represent speech model parameters computed frame-by-frame (as opposed to sample-by-sample).
- An alternative approach is to use a finite-order moving-average predictor derived from the residual. One common use of the moving-average predictor is illustrated in Figure 12.13 in the next slide.

Differential and Residual Quantization (figure)

Differential and Residual Quantization
Coder stage of the system in Figure 12.13: the residual is the difference between the true value and the value predicted from the moving average of K quantized residuals:
  r[n] = x[n] − Σ_{k=1}^{K} p[k]·r̂[n−k]
where p[k] are the coefficients of P(z).
Decoder stage: the predicted value is given by:
  x̂[n] = r̂[n] + Σ_{k=1}^{K} p[k]·r̂[n−k]
Error propagation is thus limited to only K samples (or K analysis frames in the case of model parameters).

Model Estimation
Goal: estimate the filter coefficients {a_1, a_2, …, a_p}, for a particular order p, and the gain A, over a short time span of the speech signal (typically 20 ms) for which the signal is considered quasi-stationary.
Use the linear prediction method: each speech sample is approximated as a linear combination of past speech samples ⇒ a set of analysis techniques for estimating the parameters of the all-pole model.

Model Estimation
Consider the z-transform of the vocal tract model:
  H(z) = S(z) / U_g(z) = A / ( 1 − Σ_{k=1}^{p} a_k z^{−k} )
which can be transformed into:
  S(z) = Σ_{k=1}^{p} a_k S(z) z^{−k} + A·U_g(z)
In the time domain it can be written as:
  s[n] = Σ_{k=1}^{p} a_k s[n−k] + A·u_g[n]
where s[n] is the current sample, s[n−k] the past samples, u_g[n] the input, A the scaling factor, and a_k the linear prediction coefficients. This is referred to as an autoregressive (AR) model.
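The AR model is easy to exercise numerically; a sketch with assumed coefficients (not from the text) synthesizing s[n] from an impulse-train source:

```python
import numpy as np
from scipy.signal import lfilter

# s[n] = a1*s[n-1] + a2*s[n-2] + A*ug[n], with illustrative coefficients.
a = np.array([1.3, -0.8])             # poles inside the unit circle (stable)
A = 0.1
ug = np.zeros(300)
ug[::75] = 1.0                        # impulse train as a crude glottal source

# 1/(1 - a1*z^-1 - a2*z^-2) is realized with denominator [1, -a1, -a2]:
s = lfilter([A], np.concatenate(([1.0], -a)), ug)
print(s[:5])
```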

Model Estimation
- The method used to predict the current sample from a linear combination of past samples is called linear prediction analysis.
- LPC: quantization of the linear prediction coefficients, or of a transformed version of these coefficients, is called linear prediction coding.
- For u_g[n] = 0:
  s[n] = Σ_{k=1}^{p} a_k s[n−k]
This observation motivates the analysis technique of linear prediction.

Model Estimation: Definitions
A linear predictor of order p is defined by:
  s̃[n] = Σ_{k=1}^{p} α_k s[n−k]
where s̃[n] is the estimate of s[n] and α_k is the estimate of a_k; in the z-domain, P(z) = Σ_{k=1}^{p} α_k z^{−k}.

Model Estimation: Definitions
The prediction error sequence is given as the difference of the original sequence and its prediction:
  e[n] = s[n] − s̃[n] = s[n] − Σ_{k=1}^{p} α_k s[n−k]
The associated prediction error filter is defined as:
  A(z) = 1 − Σ_{k=1}^{p} α_k z^{−k} = 1 − P(z)
If {α_k} = {a_k}, then e[n] = A·u_g[n].

Model Estimation: Definitions
Note 1: the input sequence A·u_g[n] can be recovered by passing s[n] through A(z). For the condition α_k ≈ a_k, the prediction error filter A(z) is called the inverse filter.

Model Estimation: Definitions
Recovery of s[n]: if s[n] is periodic speech and α_k = a_k, then e[n] is an impulse train (A·u_g[n]), zero most of the time.

Model Estimation: Definitions
Note 2: if the vocal tract contains a finite number of poles and no zeros, and the prediction order is correct, then {α_k} = {a_k}, and e[n] is an impulse train for voiced speech; for an impulse input, e[n] will be just an impulse.

Model Estimation: Definitions
Note 3: in a stochastic context, when A·u_g[n] is white noise, the inverse filter "whitens" the input signal.

Example 5.1
Consider an exponentially decaying impulse response of the form h[n] = aⁿu[n], where u[n] is the unit step. The response to the scaled unit sample A·δ[n] is:
  s[n] = A·aⁿu[n]
Consider the prediction of s[n] using a linear predictor of order p = 1:
  s̃[n] = α_1 s[n−1]
It is a good fit since s[n] = a·s[n−1] for n ≥ 1. The prediction error sequence with α_1 = a is:
  e[n] = s[n] − a·s[n−1] = A·δ[n]
The prediction of the signal is exact except at the time origin.

Error Minimization
- An important question is how to derive an estimate of the prediction coefficients α_k, for a particular order p, that would be optimal in some sense. Optimality is measured based on a criterion; an appropriate measure of optimality is the mean-squared error (MSE).
- The goal is to minimize the mean-squared prediction error E, defined as:
  E = Σ_{m=−∞}^{∞} e²[m]
- In reality, a model must be valid over some short-time interval, say M samples on either side of n.

Error Minimization
Thus in practice the MSE is time-dependent and is formed over a finite interval, [n−M, n+M], the prediction error interval:
  E_n = Σ_{m=n−M}^{n+M} e_n²[m]
Alternatively:
  E_n = Σ_m e_n²[m],  where  e_n[m] = s_n[m] − Σ_{k=1}^{p} α_k s_n[m−k]
and s_n[m] denotes the short-time segment of s[m] in the neighborhood of n.

Error Minimization
Determine the {α_k} for which E_n is minimal:
  ∂E_n / ∂α_i = 0,  i = 1, 2, …, p
which results in:
  Σ_m ( s_n[m] − Σ_{k=1}^{p} α_k s_n[m−k] ) s_n[m−i] = 0,  i = 1, 2, …, p

Error Minimization
The last equation can be rewritten by multiplying through:
  Σ_m s_n[m] s_n[m−i] = Σ_{k=1}^{p} α_k Σ_m s_n[m−k] s_n[m−i]
Define the function:
  φ_n[i, k] = Σ_m s_n[m−i] s_n[m−k]
which gives the following:
  Σ_{k=1}^{p} α_k φ_n[i, k] = φ_n[i, 0],  i = 1, 2, …, p
referred to as the normal equations, given in matrix form as Φ_n α = ψ_n.

Error Minimization
The minimum error for the optimal solution can be derived as follows:
  E_n = Σ_m e_n²[m] = Σ_m s_n²[m] − 2 Σ_{k=1}^{p} α_k φ_n[0, k] + Σ_{k=1}^{p} Σ_{j=1}^{p} α_k α_j φ_n[k, j]
The last term in the equation above can be rewritten, using the normal equations, as:
  Σ_{k=1}^{p} Σ_{j=1}^{p} α_k α_j φ_n[k, j] = Σ_{k=1}^{p} α_k φ_n[k, 0] = Σ_{k=1}^{p} α_k φ_n[0, k]

Error Minimization
Thus the error can be expressed as:
  E_n = φ_n[0, 0] − Σ_{k=1}^{p} α_k φ_n[0, k]

Error Minimization
Remarks:
- The order p of the actual underlying all-pole transfer function is not known. The order can be estimated by observing that a pth-order predictor in theory equals a (p+1)th-order predictor, and that predictor coefficients for k > p equal zero (or, in practice, are close to zero and model only noise/random effects).
- The prediction error e_n[m] is non-zero only "in the vicinity" of time n: [n−M, n+M].
- In predicting values of the short-time sequence s_n[m], p values outside the prediction error interval [n−M, n+M] are required:
  - Covariance method: uses values outside the interval to predict values inside the interval.
  - Autocorrelation method: assumes that speech samples are zero outside the interval.

Error Minimization
Matrix formulation (projection theorem):
- The columns of S_n are basis vectors.
- The error vector e_n = s_n − S_n·α is orthogonal to each basis vector: S_nᵀ e_n = 0.
- Orthogonality leads to the normal equations: S_nᵀ S_n α = S_nᵀ s_n.

Autocorrelation Method for LP

Autocorrelation Method
- The previous section described a general method of linear prediction that uses samples outside the prediction error interval, referred to as the covariance method.
- An alternative approach that does not consider samples outside the analysis interval, referred to as the autocorrelation method, is presented next. This method is suboptimal; however, it leads to an efficient and stable solution of the normal equations.

Autocorrelation Method
- Assumes that the samples outside the time interval [n−M, n+M] are all zero, and extends the prediction error interval, i.e., the range over which the mean-squared error is minimized, to ±∞.
- Conventions: the short-time interval is [n, n+N_w−1], where N_w = 2M+1 (note: it is not centered around sample n as in the previous derivation). The segment is shifted to the left by n samples so that the first nonzero sample falls at m = 0. This operation is equivalent to shifting the speech sequence s[m] by n samples to the left and windowing by an N_w-point rectangular window.

Autocorrelation Method
The windowed sequence can be expressed as:
  s_n[m] = s[m+n]·w[m],  where w[m] = 1 for 0 ≤ m ≤ N_w−1 and 0 otherwise
This operation is depicted in the accompanying figure.

Autocorrelation Method
Important observations that are a consequence of zeroing the signal outside the interval:
- The prediction error is nonzero only in the interval [0, N_w+p−1], where N_w is the window length and p the predictor order.
- The prediction error is largest at the left and right ends of the segment. This is due to edge effects caused by the way the prediction is done: from zeros (to the left of the window) and to zeros (to the right of the window).

Autocorrelation Method
- To compensate for edge effects, a tapered window (e.g., Hamming) is typically used. This removes the possibility that the mean-squared error is dominated by end (edge) effects, though the data becomes distorted, hence biasing the estimates α_k.
- Let the mean-squared prediction error be given by:
  E_n = Σ_{m=0}^{N_w+p−1} e_n²[m]
The limits of summation refer to the new time origin, and the prediction error outside this interval is zero.

Autocorrelation Method
The normal equations take the following form (Exercise 5.1):
  Σ_{k=1}^{p} α_k φ_n[i, k] = φ_n[i, 0],  1 ≤ i ≤ p
where
  φ_n[i, k] = Σ_{m=0}^{N_w+p−1} s_n[m−i] s_n[m−k]

Autocorrelation Method
Due to the summation limits (depicted in the figure), the function φ_n[i, k] can be simplified. Recognizing that only samples in the interval [i, k+N_w−1] contribute to the sum, and changing variable m ⇒ m−i:
  φ_n[i, k] = Σ_{m=0}^{N_w−1−(i−k)} s_n[m] s_n[m+i−k]

Autocorrelation Method
Since the above expression is a function only of the difference i−k, we denote it as:
  φ_n[i, k] = r_n[i−k]
Letting τ = i−k, referred to as the correlation "lag", leads to the short-time autocorrelation function:
  r_n[τ] = Σ_{m=0}^{N_w−1−τ} s_n[m] s_n[m+τ]

Autocorrelation Method
  r_n[τ] = s_n[τ] * s_n[−τ]
- The autocorrelation method thus leads to computation of the short-time sequence s_n[m] convolved with itself flipped in time.
- The autocorrelation function is a measure of the "self-similarity" of the signal at different lags τ. When r_n[τ] is large, signal samples spaced by τ are said to be highly correlated.

Autocorrelation Method
Properties of rn[τ]:
- For an N-point sequence, rn[τ] is zero outside the interval [-(N-1), N-1].
- rn[τ] is an even function of τ.
- rn[0] ≥ |rn[τ]|.
- rn[0] is the energy of sn[m]: rn[0] = Σ_m sn[m]^2.

Autocorrelation Method
- If sn[m] is a segment of a periodic sequence, then rn[τ] is periodic-like with the same period. Because sn[m] is short-time, the overlapping data in the correlation decreases as τ increases, so the amplitude of rn[τ] decreases as τ increases; with a rectangular window the envelope of rn[τ] decays linearly.
- If sn[m] is a random white-noise sequence, then rn[τ] is impulse-like, reflecting self-similarity only within a small neighborhood.
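These two behaviors are easy to check numerically; here is a small demo with made-up signals (a 50-sample-period sinusoid and unit-variance white noise):

```python
import numpy as np

rng = np.random.default_rng(0)
Nw = 400
periodic = np.sin(2 * np.pi * np.arange(Nw) / 50.0)   # period of 50 samples
noise = rng.standard_normal(Nw)

def full_autocorr(x):
    r = np.correlate(x, x, mode='full')[len(x) - 1:]  # lags 0..Nw-1
    return r / r[0]                                    # normalize by energy

# full_autocorr(periodic) peaks near lags 0, 50, 100, ... under a
# linearly decaying envelope (rectangular window); full_autocorr(noise)
# is impulse-like: large at lag 0, small everywhere else.
```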


Autocorrelation Method
Letting φn[i,k] = rn[i-k], the normal equations take the form:
    Σ_{k=1}^{p} αk rn[i-k] = rn[i],  1 ≤ i ≤ p.
The expression represents p linear equations in the p unknowns αk, 1 ≤ k ≤ p. Using the normal-equation solution, it can be shown that the corresponding minimum mean-squared prediction error is given by
    En = rn[0] - Σ_{k=1}^{p} αk rn[k].
Matrix form of the normal equations: Rn α = rn.

Autocorrelation Method
In expanded form, Rn is the p×p matrix with elements rn[|i-k|], α is the vector of predictor coefficients, and rn is the vector (rn[1], ..., rn[p])T. The matrix Rn is Toeplitz:
- Symmetric about the diagonal.
- All elements along each diagonal are equal.
- The matrix is (always) invertible.
A Toeplitz system can be solved in O(p^2) operations, which implies an efficient solution, e.g., the Levinson-Durbin recursion sketched below.
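A sketch of the Levinson-Durbin recursion for these Toeplitz normal equations (a standard algorithm rather than code from the text; the sign convention assumes the inverse filter A(z) = 1 + Σ a_k z^-k):

```python
import numpy as np

def levinson_durbin(r):
    """Solve the symmetric Toeplitz normal equations in O(p^2) operations.
    r: autocorrelation values r[0..p]. Returns (alpha, k, E): predictor
    coefficients alpha_1..alpha_p (s[n] ~ sum_k alpha_k s[n-k]), the PARCOR
    (reflection) coefficients, and the minimum mean-squared prediction
    error E = r[0] - sum_k alpha_k r[k]."""
    r = np.asarray(r, dtype=float)
    p = len(r) - 1
    a = np.zeros(p + 1)
    a[0] = 1.0                               # inverse-filter coefficients
    k = np.zeros(p)
    E = r[0]
    for i in range(1, p + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k[i - 1] = -acc / E
        a[1:i + 1] = a[1:i + 1] + k[i - 1] * a[i - 1::-1]
        E *= 1.0 - k[i - 1] ** 2             # error shrinks each order
    return -a[1:], k, E
```

Used with the earlier sketch, e.g. `alphas, ks, E = levinson_durbin(short_time_autocorr(sn, p))`; the reflection coefficients ks reappear later as the PARCOR representation.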

Example 5.3
Consider a system with an exponentially decaying impulse response of the form h[n] = a^n u[n], with u[n] being the unit step function, driven by a unit impulse so that s[n] = h[n]. Estimate a using the autocorrelation method of linear prediction.

Example 5.3
Apply an N-point rectangular window [0, N-1] at n = 0 and compute r0[0] and r0[1]:
    r0[0] = Σ_{m=0}^{N-1} a^{2m} = (1 - a^{2N}) / (1 - a^2)
    r0[1] = Σ_{m=0}^{N-2} a^m a^{m+1} = a (1 - a^{2(N-1)}) / (1 - a^2)
Using the normal equations (p = 1):
    α1 = r0[1] / r0[0] = a (1 - a^{2(N-1)}) / (1 - a^{2N}) → a as N → ∞.

Example 5.3
The minimum squared error (from the expression derived above) is thus (Exercise 5.5):
    E = r0[0] - α1 r0[1].
For a 1st-order predictor, as in this example, the prediction error sequence for the true predictor (i.e., α1 = a) is given by e[n] = s[n] - a s[n-1] = δ[n] (see Example 5.1 presented earlier). Thus the prediction of the signal is exact except at the time origin. This example illustrates that, with enough data, the autocorrelation method yields a solution close to the true single-pole model for an impulse input.
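A quick numeric check of this example (the values a = 0.9 and N = 100 are illustrative):

```python
import numpy as np

a_true, N = 0.9, 100                 # illustrative values
h = a_true ** np.arange(N)           # a^n u[n] under an N-point window
r0 = np.dot(h, h)                    # r_0[0]
r1 = np.dot(h[:-1], h[1:])           # r_0[1]
alpha1 = r1 / r0                     # first-order normal-equation solution
print(alpha1)                        # -> 0.8999..., very close to a_true
```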

Vector Quantization

Vector Quantization (VQ)
The investigation of scalar quantization techniques was the topic of previous sections. This section investigates a generalization of scalar quantization referred to as vector quantization, in which a block of scalars is coded as a vector rather than individually. As with scalar quantization, an optimal quantization strategy can be derived based on a mean-squared-error distortion metric.

Vector Quantization (VQ)
Motivation: assume the vocal tract transfer function is characterized by only two resonances, thus requiring four reflection coefficients. Furthermore, suppose that the vocal tract can take on only one of four possible shapes. This implies that there exist only four possible sets of the four reflection coefficients, as illustrated in Figure 12.14 on the next slide.

Vector Quantization (VQ)
[Figure 12.14: the four possible reflection-coefficient vectors]

- Scalar quantization considers each reflection coefficient individually: each coefficient can take on 4 different values ⇒ 2 bits are required to encode each coefficient, so the 4 reflection coefficients require 4×2 = 8 bits per analysis frame to code the vocal tract transfer function.
- Vector quantization: since there are only four possible shapes of the vocal tract, there are only four possible vectors of reflection coefficients. The scalar values within each vector are highly correlated, so 2 bits suffice to encode all 4 reflection coefficients.
Note: if the scalars were independent of each other, treating them together as a vector would have no advantage over treating them individually.

Vector Quantization (VQ)
Consider a vector of N continuous scalars, x = (x1, x2, ..., xN)T. With VQ, the vector x is mapped into another N-dimensional vector x̂, chosen from M possible reconstruction (quantization) levels.


Vector Quantization (VQ)
- VQ – the vector quantization operator.
- ri – the M possible reconstruction levels, 1 ≤ i ≤ M.
- Ci – the ith "cell", or cell boundary.
- If x is in the cell Ci, then x is mapped to ri.
- ri is a codeword; {ri}, the set of all codewords, is the codebook.
A nearest-codeword mapping is sketched below.
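A minimal sketch of the encode/decode mapping under the squared-error distortion (the helper names are mine, not the text's):

```python
import numpy as np

def vq_encode(x, codebook):
    """Map x to the index of the nearest codeword r_i under the
    squared-error distortion; the cells C_i never need to be stored."""
    d = np.sum((codebook - x) ** 2, axis=1)   # distortion to every r_i
    return int(np.argmin(d))

def vq_decode(i, codebook):
    return codebook[i]                        # x_hat = r_i
```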

Vector Quantization (VQ)
Properties of VQ:
- P1: In vector quantization a cell can have an arbitrary size and shape. In scalar quantization a "cell" (the region between two decision levels) can have an arbitrary size, but its shape is fixed.
- P2: As in scalar quantization, the distortion measure D(x, x̂) is a measure of the dissimilarity or error between x and x̂.

VQ Distortion Measure
Vector quantization noise is represented by the vector e = x - x̂. The distortion is the average of the sum of squares of its scalar components:
    D = E[(x - x̂)T (x - x̂)]
which, for the multi-dimensional pdf px(x), is
    D = ∫ (x - x̂)T (x - x̂) px(x) dx.

VQ Distortion Measure
The goal is to minimize D. Two conditions formulated by Lim:
- C1: A vector x must be quantized to a reconstruction level ri that gives the smallest distortion between x and ri.
- C2: Each reconstruction level ri must be the centroid of the corresponding decision region (cell Ci).
Condition C1 implies that, given the reconstruction levels, we can quantize without explicit need for the cell boundaries: to quantize a given vector, the reconstruction level minimizing its distortion is found. This process requires a large search – an active area of research. Condition C2 specifies how to obtain a reconstruction level from the selected cell.

VQ Distortion Measure
The two stated conditions provide the basis for an iterative solution for obtaining the VQ codebook:
1. Start with an initial estimate of the ri.
2. Apply condition C1, by which all the vectors from a set that get quantized to a given ri can be determined.
3. Apply condition C2 to obtain a new estimate of the reconstruction levels (i.e., the centroid of each cell).
The problem with this approach is that it requires estimating the joint pdf of all x in order to compute the distortion measure and the multi-dimensional centroid. Solution: the k-means algorithm (Lloyd for 1-D, Forgy for multi-D).

k-Means Algorithm
1. Compute the ensemble average D as D = (1/K) Σ_{k=1}^{K} (xk - x̂k)T (xk - x̂k), where the xk are the training vectors and the x̂k the quantized vectors.
2. Pick an initial guess at the reconstruction levels {ri}.
3. For each xk select the closest ri; the set of all xk nearest to a given ri forms a cluster (see Figure 12.16) – hence "clustering algorithm".
4. Compute the mean of the xk in each cluster, which gives the new ri's.
5. Calculate D; stop when the change in D over two consecutive iterations is insignificant.
This algorithm converges to a local minimum of D. (A sketch follows.)
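A compact sketch of this procedure in numpy (initialization from random training vectors and the stopping threshold are my assumptions):

```python
import numpy as np

def train_codebook(X, M, iters=50, seed=0):
    """k-means VQ design: X is (num_vectors, N) training data, M the
    codebook size. Alternates condition C1 (nearest-codeword assignment)
    and C2 (centroid update) until D stops decreasing."""
    rng = np.random.default_rng(seed)
    codebook = X[rng.choice(len(X), M, replace=False)].astype(float)
    prev_D = np.inf
    for _ in range(iters):
        # C1: assign each training vector to its nearest codeword
        d = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        D = d[np.arange(len(X)), labels].mean()   # ensemble-average distortion
        if prev_D - D < 1e-9:                     # insignificant change: stop
            break
        prev_D = D
        # C2: move each codeword to the centroid of its cluster
        for i in range(M):
            members = X[labels == i]
            if len(members):
                codebook[i] = members.mean(axis=0)
    return codebook
```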

k-Means Algorithm
[Figure 12.16: clustering of the training vectors]

Clustering Using Gaussian Mixture Models Example

Neural Networks Based Clustering Algorithms
Kohonen's SOFM (self-organizing feature map): the topological ordering of the SOFM offers potential for further reduction in bit rate.

Example of SOFM

Example of SOFM
SOMDemo\SOMDemo\executable\SOMDemo.exe

SOFM

SOFM Weights

Use of VQ in Speech Transmission
1. Obtain the VQ codebook from the training vectors; all transmitters and receivers must have identical copies of the VQ codebook.
2. The analysis procedure generates a vector xi.
3. The transmitter sends the index of the centroid ri of the cluster closest to the given vector xi; this step involves a search.
4. The receiving end decodes the information by looking up the codeword at the received index and performing the synthesis operation.
(A toy end-to-end sketch follows.)
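Putting the hypothetical helpers above together, a toy transmission path might look like this (the codebook size, training data, and test vector are made up):

```python
import numpy as np

# Shared by transmitter and receiver, trained offline as sketched earlier:
codebook = train_codebook(np.random.default_rng(2).standard_normal((5000, 4)), M=16)

x_i = np.array([0.3, -0.1, 0.25, 0.4])   # analysis vector (illustrative)
index = vq_encode(x_i, codebook)          # transmitter: search, send log2(16) = 4 bits
x_hat = vq_decode(index, codebook)        # receiver: table lookup, then synthesis
```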

Model-Based Coding
The purpose of model-based speech coding is to increase bit efficiency to achieve either higher quality at the same bit rate or a lower bit rate at the same quality. A chronological perspective of model-based coding:
- All-pole speech representation used for coding, first with scalar quantization and then with vector quantization.
- Mixed Excitation Linear Prediction (MELP) coder: removes deficiencies of the binary source representation.
- Code-Excited Linear Prediction (CELP) coder: does not require the explicit multi-band decision and source characterization of MELP.

Basic Linear Prediction Coder (LPC)
Recall the basic speech production model of the form H(z) = A / (1 - P(z)), where the predictor polynomial is given as P(z) = Σ_{k=1}^{p} αk z^{-k}. Suppose linear prediction analysis is performed at 100 frames/s and 13 parameters are used per frame: 10 all-pole spectrum parameters, pitch, a voicing decision, and gain. This results in 1300 parameters/s. Compared to a telephone-quality signal of 4000 Hz bandwidth ⇒ 8000 samples/s (at 8 bits per sample): 1300 parameters/s < 8000 samples/s.

Basic Linear Prediction Coder (LPC)
Instead of the prediction coefficients ai, use:
- The corresponding poles bi,
- Partial correlation coefficients ki (PARCOR),
- Reflection coefficients ri, or
- Another equivalent representation.
The behavior of the prediction coefficients is difficult to characterize: they have a large dynamic range (large variance), and quantization errors can lead to an unstable system function at synthesis (poles may move outside the unit circle). The alternative equivalent representations have a limited dynamic range, and stability can easily be enforced because |bi| < 1 and |ki| < 1.

Basic Linear Prediction Coder (LPC)
There are many ways to code the linear prediction parameters. Ideally, optimal quantization uses the Max quantizer based on known or estimated pdfs of each parameter. Example of 7200 bps coding:
- Voiced/unvoiced decision: 1 bit (on or off)
- Pitch (if voiced): 6 bits (uniform)
- Gain: 5 bits (nonuniform)
- Each pole bi: 10 bits (nonuniform; 5 bits for bandwidth, 5 bits for center frequency), with a total of 6 poles
At 100 frames/s: 1+6+5+6×10 = 72 bits per frame ⇒ 7200 bps. Quality is limited by the simple impulse/noise excitation model.

Basic Linear Prediction Coder (LPC)
Improvements are possible by replacing the poles with PARCOR coefficients:
- Higher-order PARCOR coefficients have pdfs closer to Gaussian, centered around zero ⇒ nonuniform quantization.
- Companding is effective with PARCOR: the transformed pdfs are close to uniform.
- The original PARCOR coefficients do not have good spectral sensitivity (the change in spectrum with a change in spectral parameters, which we want to minimize). An empirical finding is that a more desirable transformation in this sense uses the logarithm of the vocal tract area function ratio:
    gi = log[(1 - ki) / (1 + ki)]
(A conversion sketch follows.)
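A sketch of the PARCOR ⇔ log-area-ratio conversion; note that the sign convention shown here is one of two that appear in the literature:

```python
import numpy as np

def parcor_to_log_area_ratio(k):
    """g_i = log[(1 - k_i) / (1 + k_i)], the log of the vocal tract area
    function ratio; |k_i| < 1 keeps the argument positive."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 - k) / (1.0 + k))

def log_area_ratio_to_parcor(g):
    """Inverse mapping back to the reflection coefficients."""
    g = np.asarray(g, dtype=float)
    return (1.0 - np.exp(g)) / (1.0 + np.exp(g))
```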

Basic Linear Prediction Coder (LPC)
The parameters gi:
- Have a pdf close to uniform.
- Have smaller spectral sensitivity than PARCOR: the all-pole spectrum changes less with a change in gi than with a change in ki. (Note that the spectrum also changes less with a change in ki than with a change in pole positions.)
Typically these parameters can be coded at 5-6 bits each (a significant improvement over 10 bits). At 100 frames/s with a predictor of order 6 (6 poles): (1+6+5+6×6)×100 bps = 4800 bps, giving the same quality as the 7200 bps obtained by coding pole positions, for telephone-bandwidth speech.

Basic Linear Prediction Coder (LPC)
A government standard for secure communications at 2.4 kbps used this basic LPC scheme at 50 frames/s for about a decade. Demand for higher-quality standards opened up research on two primary problems with speech coders based on all-pole linear prediction analysis:
- The inadequacy of the basic source/filter speech production model.
- The restriction to one-dimensional scalar quantization techniques, which cannot account for possible parameter correlation.

A VQ LPC Coder
A k-means-algorithm, VQ-based LPC PARCOR coder.

A VQ LPC Coder
Use a VQ LPC coder to achieve the same quality of speech at a lower bit rate: a 10-bit codebook (1024 codewords) gives 800 bps versus 2400 bps for scalar quantization. At 44.4 frames/s:
- ≈444 bits/s to code the PARCOR coefficients (10 bits per frame),
- 8 bits per frame for pitch, gain, and voicing,
- 1 bit/s for frame synchronization.

A VQ LPC Coder
Maintaining the 2400 bps bit rate with higher-quality speech coding (early 1980s) would require a 22-bit codebook ⇒ 2^22 ≈ 4,200,000 codewords. Problems:
- An intractable solution due to computational requirements (large VQ search) and memory (large codebook size).
- The VQ-based spectrum is characterized by a "wobble", due to the LPC-based spectrum being quantized: a spectral representation near a cell boundary "wobbles" to and from neighboring cells ⇒ an insufficient number of codewords.
Emphasis changed from improved VQ of the spectrum to better excitation models, and ultimately back to VQ on the excitation.

Mixed Excitation LPC (MELP)
Multi-band voicing decision (introduced as a concept in Section 12.5.2 – not covered in these slides). MELP addresses shortcomings of conventional linear prediction analysis/synthesis:
- A realistic excitation signal.
- Time-varying vocal tract formant bandwidths.
- Production principles of the "anomalous" voice.

Mixed Excitation LPC (MELP)
Model: different mixtures of impulses and noise are generated in different frequency bands (4-10 bands). The impulse train and the noise in the MELP model are each passed through time-varying spectral shaping filters and added together to form a full-band signal. MELP's unique components:
- An auditory-based approach to multi-band voicing estimation for the mixed impulse/noise excitation.
- Aperiodic impulses due to pitch jitter, the creaky voice, and the diplophonic voice.
- Time-varying resonance bandwidth within a pitch period, accounting for nonlinear source/system interaction and introducing truncation effects.
- A more accurate shape of the glottal flow velocity source.
(A toy per-band mixing sketch follows.)
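A toy sketch of per-band impulse/noise mixing in the spirit of the MELP model; the band edges, filter order, and simple voicing-strength weighting are my simplifications, not the standard's values:

```python
import numpy as np
from scipy.signal import butter, lfilter

def mixed_excitation(pitch_period, voicing, n, fs=8000,
                     bands=((100, 500), (500, 1000), (1000, 2000), (2000, 3500))):
    """Per band, mix a bandpassed impulse train and bandpassed white noise
    according to a voicing strength in [0, 1], then sum to full band."""
    pulses = np.zeros(n)
    pulses[::pitch_period] = 1.0                 # impulse train at the pitch rate
    noise = np.random.default_rng(0).standard_normal(n)
    out = np.zeros(n)
    for (lo, hi), v in zip(bands, voicing):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype='band')
        out += v * lfilter(b, a, pulses) + (1.0 - v) * lfilter(b, a, noise)
    return out

# e.g., excitation = mixed_excitation(80, voicing=[1.0, 0.8, 0.4, 0.1], n=1600)
```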

Mixed Excitation LPC (MELP)
A 2.4 kbps coder has been implemented based on the MELP model and has been selected as the government standard for secure telephone communications. The original version of MELP uses:
- 34 bits for scalar quantization of the LPC coefficients (specifically the line spectral frequencies, LSFs).
- 8 bits for gain.
- 7 bits for pitch and overall voicing, using the autocorrelation technique on the lowpass-filtered LPC residual.
- 5 bits for multi-band voicing.
- 1 bit for the jittery-state (aperiodic) flag.
That is 54 bits per 22.5 ms frame ⇒ 2.4 kbps. In the actual 2.4 kbps standard, greater efficiency is achieved with vector quantization of the LSF coefficients.

Mixed Excitation LPC (MELP)
Line spectral frequencies (LSFs) are a more efficient parameter set for coding the all-pole model of linear prediction. The LSFs for a pth-order all-pole model are defined as follows: two polynomials of order p+1 are created from the pth-order inverse filter A(z) according to
    P(z) = A(z) + z^-(p+1) A(z^-1)
    Q(z) = A(z) - z^-(p+1) A(z^-1)
and the LSFs are the frequencies of the unit-circle roots of P(z) and Q(z). LSFs can be coded efficiently, and the stability of the resulting synthesis filter can be guaranteed when they are quantized. They have better quantization and interpolation properties than the corresponding PARCOR coefficients. A disadvantage is that solving for the roots of P(z) and Q(z) can be more computationally intensive than computing the PARCOR coefficients. The polynomial A(z) is easily recovered from the LSFs (Exercise 12.18). (A root-finding sketch follows.)
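A sketch that forms P(z) and Q(z) from A(z) and reads the LSFs off their unit-circle root angles, assuming the standard definition above:

```python
import numpy as np

def lsf_from_inverse_filter(a):
    """Compute LSFs from inverse-filter coefficients a = [1, a_1, ..., a_p]
    of A(z) (in powers of z^-1). Returns the sorted root angles in (0, pi),
    excluding the trivial roots at z = +/-1."""
    a = np.asarray(a, dtype=float)
    P = np.concatenate([a, [0.0]]) + np.concatenate([[0.0], a[::-1]])
    Q = np.concatenate([a, [0.0]]) - np.concatenate([[0.0], a[::-1]])
    angles = np.angle(np.concatenate([np.roots(P), np.roots(Q)]))
    eps = 1e-6
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])
```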

Code-Excited Linear Prediction (CELP)
Concept: the basic idea in CELP is to represent the residual from long-term prediction on each frame by codewords from a VQ-generated codebook (as opposed to multi-pulses). On each frame a codeword is chosen from a codebook of residuals so as to minimize the mean-squared error between the synthesized and original speech waveforms. The length of a codeword sequence is determined by the analysis frame length: for a 10 ms frame interval split into 2 inner frames of 5 ms each, a codeword sequence is 40 samples long at an 8000 Hz sampling rate. The residual and long-term predictor are estimated with twice the time resolution (a 5 ms frame) of the short-term predictor (10 ms frame), because the excitation is more nonstationary than the vocal tract. (A toy search sketch follows.)
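A toy closed-loop codeword search in this spirit; real CELP coders add perceptual weighting and an adaptive/long-term predictor loop, both omitted here:

```python
import numpy as np
from scipy.signal import lfilter

def celp_search(target, codebook, a):
    """Pass every codeword through the short-term synthesis filter 1/A(z),
    with a = [1, a_1, ..., a_p], and keep the gain-scaled codeword that
    minimizes the squared error against the target frame."""
    best = (-1, 0.0, np.inf)
    for i, c in enumerate(codebook):
        y = lfilter([1.0], a, c)                 # synthesized contribution
        g = np.dot(target, y) / np.dot(y, y)     # optimal gain for this codeword
        err = np.sum((target - g * y) ** 2)
        if err < best[2]:
            best = (i, g, err)
    return best                                  # (index, gain, error)
```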

Code-Excited Linear Prediction (CELP)
Two approaches to forming the codebook: deterministic and stochastic.
- Deterministic codebook: formed by applying the k-means clustering algorithm to a large set of residual training vectors; it suffers from channel mismatch.
- Stochastic codebook: the histogram of the residual from the long-term predictor roughly follows a Gaussian pdf (a valid assumption except for plosives and voiced/unvoiced transitions), and its cumulative distributions are nearly identical to those of white Gaussian random variables ⇒ an alternative codebook is constructed from white Gaussian random variables with unit variance (see the snippet below).
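Constructing such a stochastic codebook is a one-liner; the sizes here are illustrative:

```python
import numpy as np

# M unit-variance white Gaussian codewords of L samples each
# (e.g., L = 40 for a 5 ms inner frame at 8 kHz).
M, L = 1024, 40
stochastic_codebook = np.random.default_rng(1).standard_normal((M, L))
```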

CELP Coders
A variety of government and international standard coders use CELP. The 1990s government standard for secure communications at 4.8 kbps over a 4000 Hz bandwidth (FED-STD-1016) uses a CELP coder with three bit rates:
- 9.6 kbps (multi-pulse)
- 4.8 kbps (CELP)
- 2.4 kbps (LPC)
Short-term predictor: a 30 ms frame interval coded with 34 bits per frame; a 10th-order vocal tract spectrum obtained from the prediction coefficients, transformed to LSFs, and coded with nonuniform quantization. The short-term and long-term predictors are estimated in open-loop form, while the residual codewords are determined in closed-loop form. Current international standards also use CELP-based coding: G.729, G.723.1.

END