EE2F1 Speech & Audio Technology, Sept. 26, 2002
THE UNIVERSITY OF BIRMINGHAM, ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING
Digital Systems & Vision Processing

SLIDE 1: EE2F1 Speech & Audio Technology, Lecture 2
Martin Russell
Electronic, Electrical & Computer Engineering, School of Engineering, The University of Birmingham

SLIDE 2: Pulse Code Modulation (PCM)
How many quantization points?
How many samples per second (sample rate)?
[Figure labels: quantization error, sample point, quantization point]

SLIDE 3: Quantisation
Each sample is stored as a computer “word” with a given number of bits
More bits give more levels:

Number of bits   Number of levels   Quality
6                64                 Poor – just intelligible
8                256                Tolerable speech
13               8192               FM Radio
16               65536              Good - CD
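
The relation between bits and levels in the table is levels = 2^bits. A minimal sketch of a uniform quantiser follows. The function names and the input range of -1 to 1 are assumptions made for illustration; they are not taken from the lecture.

```python
import math

def num_levels(bits: int) -> int:
    """Number of quantisation levels available with the given word length."""
    return 2 ** bits

def quantise(x: float, bits: int) -> float:
    """Map x (assumed to lie in [-1, 1)) to the centre of its quantisation cell."""
    levels = num_levels(bits)
    step = 2.0 / levels                        # width of one quantisation cell
    index = math.floor((x + 1.0) / step)       # which cell x falls into
    index = max(0, min(levels - 1, index))     # clip out-of-range input
    return -1.0 + (index + 0.5) * step

for bits in (6, 8, 13, 16):
    print(bits, num_levels(bits), quantise(0.3, bits))
```

With this layout the quantisation error of an in-range sample is never more than half a step, so each extra bit halves the worst-case error.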

SLIDE 4: Sample Rate
All sound so far sampled at 44.1 kHz (why?)
Hence Nyquist frequency = 22.05 kHz
44.1 kHz and 16 bit quantisation used on audio CDs (we’ll talk about CDs later)

SLIDE 5: Effect of sample rates

Samples per second   Nyquist frequency
44,100               22,050 Hz
22,050               11,025 Hz
16,000               8,000 Hz
8,000                4,000 Hz
4,000                2,000 Hz
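
Each row of the table is the sample rate divided by two. A small sketch that reproduces the table is given here. The helper names are hypothetical, and `can_represent` simply checks whether a tone lies below the Nyquist frequency.

```python
SAMPLE_RATES_HZ = [44_100, 22_050, 16_000, 8_000, 4_000]

def nyquist_hz(sample_rate_hz: float) -> float:
    """The Nyquist frequency is half the sample rate."""
    return sample_rate_hz / 2

def can_represent(tone_hz: float, sample_rate_hz: float) -> bool:
    """True if the tone lies below the Nyquist frequency for this sample rate."""
    return tone_hz < nyquist_hz(sample_rate_hz)

table = [(fs, nyquist_hz(fs)) for fs in SAMPLE_RATES_HZ]
for fs, fn in table:
    print(f"{fs:>6} samples/s -> Nyquist {fn:>8.0f} Hz")
```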

SLIDE 6: Speech sounds

SLIDE 7: Speech sample rates

SLIDE 8: Audio compression
CD quality audio sampled at 44.1 kHz, 16 bits
Hence 176 kbytes per second (2 bytes per sample, 2 stereo channels)
3 minute song requires about 30 megabytes
So, an ISDN line at 128 kbits per second is roughly ten times too slow for CD quality audio
Hence need for compression
– Lossless (e.g. for computer data files)
– Lossy (e.g. for images and audio)
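
The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes two channels (stereo), which is what makes the 176 kbytes per second figure work out. The variable names are illustrative.

```python
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2        # 16 bits
CHANNELS = 2                # assumption: stereo
SONG_SECONDS = 3 * 60
ISDN_KBITS_PER_SECOND = 128

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
song_bytes = bytes_per_second * SONG_SECONDS
cd_kbits_per_second = bytes_per_second * 8 / 1000
slowdown = cd_kbits_per_second / ISDN_KBITS_PER_SECOND

print(bytes_per_second)       # bytes per second of CD audio
print(song_bytes / 1e6)       # megabytes for a 3 minute song
print(slowdown)               # how many times too slow the ISDN line is
```

The exact values are 176,400 bytes per second, about 31.8 megabytes for the song, and a factor of about 11 for the ISDN line, which the slide rounds to 176 kbytes, 30 megabytes and ten times.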

SLIDE 9: μ-Law Compression
Use non-linear quantisation
– Quantisation noise roughly constant relative to signal amplitude
Most common scheme called μ-Law compression
8 bit non-linear μ-law achieves same performance as 12 bit linear
8 bit μ-law at 8 kHz sample rate used in US digital phone lines
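
A minimal sketch of the idea follows, using the continuous μ-law companding formula with μ = 255. This is an assumption made for illustration: telephone codecs use a piecewise-linear approximation of this curve, not the formula directly. The function names are hypothetical and samples are assumed to lie between -1 and 1.

```python
import math

MU = 255.0  # assumed value of mu

def mu_compress(x: float) -> float:
    """Compress x in [-1, 1] so that small amplitudes get more of the output range."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y: float) -> float:
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def uniform_quantise(y: float, bits: int) -> float:
    """Uniform quantiser on [-1, 1): return the centre of the cell containing y."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = max(0, min(levels - 1, math.floor((y + 1.0) / step)))
    return -1.0 + (index + 0.5) * step

def mu_law_quantise(x: float, bits: int = 8) -> float:
    """Compress, quantise uniformly, then expand: a non-linear quantiser."""
    return mu_expand(uniform_quantise(mu_compress(x), bits))

quiet_sample = 0.01
linear_error = abs(uniform_quantise(quiet_sample, 8) - quiet_sample)
mu_law_error = abs(mu_law_quantise(quiet_sample, 8) - quiet_sample)
print(linear_error, mu_law_error)
```

For a quiet sample the μ-law error is much smaller than the 8 bit linear error, because the quantisation points are packed more densely near zero.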

SLIDE 10: Non-linear quantisation
Quantization points arranged non-linearly
[Figure labels: sample point, quantization point]

SLIDE 11: Adaptive Delta PCM (ADPCM)
Delta PCM
– Code changes in signal, not absolute values
Adaptive Delta PCM (ADPCM)
– Adapt quantisation step size
– Compute change in signal value
– Compare with previous step size
– Increase or decrease step size accordingly
ADPCM applications
– Speech coders
– Quality too poor for music or other quality-critical applications
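
The steps above can be sketched as a toy adaptive delta coder. This is not a standard ADPCM codec. The 3 bit code range, the growth and shrink factors, the step limits and all function names are assumptions chosen for illustration.

```python
import math

MIN_STEP, MAX_STEP = 1e-4, 0.5   # assumed limits on the step size

def _adapt(step: float, code: int) -> float:
    """Grow the step when the change was large relative to it, shrink when small."""
    if abs(code) >= 3:
        step *= 1.5
    elif abs(code) <= 1:
        step *= 0.75
    return min(MAX_STEP, max(MIN_STEP, step))

def adpcm_encode(samples, step=0.01):
    """Code each change in the signal as a 3 bit integer in the range -4..3."""
    codes, predicted = [], 0.0
    for x in samples:
        delta = x - predicted                        # change since last reconstruction
        code = max(-4, min(3, round(delta / step)))  # quantise the change
        codes.append(code)
        predicted += code * step                     # track what the decoder will see
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes, step=0.01):
    """Rebuild the signal from the codes using the same step adaptation rule."""
    out, predicted = [], 0.0
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out

# 200 Hz sine wave sampled at 8 kHz
signal = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
codes = adpcm_encode(signal)
decoded = adpcm_decode(codes)
```

Only the small integer codes are transmitted. The decoder never sees the step size, because it applies the same adaptation rule to the same codes and so arrives at the same sequence of steps.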

SLIDE 12: MPEG
Exploits knowledge of human audio perception…

SLIDE 13: Frequency analysis
[Figure: 1 kHz square wave]

SLIDE 14: Spectrum cross section
1 kHz square wave
Fundamental at 1 kHz
3rd harmonic at 3 kHz
5th harmonic at 5 kHz
7th harmonic at 7 kHz

SLIDE 15: Sums of sine waves
[Figure: sine wave]
[Figure: original, 3rd harmonic]
[Figure: original + 3rd harmonic]

SLIDE 16: Sums of sine waves
[Figure: original, 3rd harmonic, 5th harmonic]
[Figure: original + 3rd harmonic + 5th harmonic]

SLIDE 17: Sums of sine waves (contd.)
[Figure: original, 3rd harmonic, 5th harmonic, 7th harmonic]
[Figure: original + 3rd harmonic + 5th harmonic + 7th harmonic]
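
The build-up shown in these figures can be reproduced numerically. The sketch below adds odd harmonics with weights proportional to 1/k. The scale factor 4/π is the value that makes the sum converge to a square wave of height 1. The function name and parameters are illustrative.

```python
import math

def square_wave_partial_sum(t, f_hz, num_harmonics, a=4 / math.pi):
    """Sum the first num_harmonics odd harmonics (f, 3f, 5f, ...) with weights a/k."""
    total = 0.0
    for i in range(num_harmonics):
        k = 2 * i + 1                                   # 1, 3, 5, 7, ...
        total += (a / k) * math.sin(2 * math.pi * k * f_hz * t)
    return total

quarter_period = 0.00025   # a quarter of the way through one cycle of a 1 kHz wave
for n in (1, 2, 3, 4, 1000):
    print(n, square_wave_partial_sum(quarter_period, 1000, n))
```

With one term the value at the quarter-period point is 4/π, which overshoots 1. As more harmonics are added the value settles towards 1, the height of the square wave.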

SLIDE 18: More formally…
[Equation: an f Hz square wave written as a sum of an f Hz sine wave, a 3f Hz sine wave, …, each multiplied by a ‘weight’]

SLIDE 19: Fourier analysis
Adding a set of weighted sine waves results in a complex periodic waveform
In fact, any periodic waveform can be expressed as a sum of weighted sine waves
The mathematical technique which calculates which sine waves are needed, and what each should be multiplied by, is called Fourier Analysis
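
As a rough illustration of what Fourier analysis computes, the sketch below recovers the weights of the first few harmonics from one period of a sampled square wave. It uses a direct, slow discrete Fourier sum. The function name and the choice of 1000 samples per period are assumptions.

```python
import cmath
import math

def harmonic_weights(samples, max_harmonic):
    """Amplitude of harmonics 1..max_harmonic, given exactly one period of samples."""
    n = len(samples)
    weights = []
    for k in range(1, max_harmonic + 1):
        # Correlate the signal with a complex sinusoid at k times the fundamental
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        weights.append(2 * abs(coeff) / n)
    return weights

square = [1.0] * 500 + [-1.0] * 500      # one period of a square wave
weights = harmonic_weights(square, 5)
for k, w in enumerate(weights, start=1):
    print(f"harmonic {k}: weight {w:.4f}")
```

The odd harmonics come out close to 4/π, 4/(3π) and 4/(5π), and the even harmonics are essentially zero.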

SLIDE 20: Spectrum of a square wave
[Figure: spectral lines at f, 3f, 5f, 7f, 9f, … with heights a, a/3, a/5, …]
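
Reading the weights off this spectrum, the square wave can be written out as a sum. This is a sketch: the symbols x(t) and t do not appear in the slides, and the harmonics are assumed to be sine waves with no phase offset.

```latex
\[
\begin{aligned}
x(t) &= a\sin(2\pi f t) + \frac{a}{3}\sin(2\pi \cdot 3f \cdot t) + \frac{a}{5}\sin(2\pi \cdot 5f \cdot t) + \dots \\
     &= a\sum_{k=0}^{\infty} \frac{\sin\bigl(2\pi (2k+1) f t\bigr)}{2k+1}
\end{aligned}
\]
```

For a square wave that swings between -1 and +1, the weight of the fundamental is a = 4/π.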

SLIDE 21: Recorder

SLIDE 22: Phase
Additional information needed to encode phase differences
In fact, the values in a spectrum are complex numbers – the magnitude gives the amplitude and the angle (argument) encodes phase
We shall ignore phase in most of what follows
[Figure label: phase difference]
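
A short sketch of how phase shows up in the complex spectrum values. Two sine waves of the same frequency and amplitude, one shifted by a quarter cycle, give spectrum values with the same magnitude but different angles. The helper `dft_bin` and the choice of 64 samples are assumptions for illustration.

```python
import cmath
import math

def dft_bin(samples, k):
    """Complex spectrum value at harmonic k for one period of samples."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))

N = 64
wave_a = [math.sin(2 * math.pi * i / N) for i in range(N)]
wave_b = [math.sin(2 * math.pi * i / N + math.pi / 2) for i in range(N)]  # shifted

A = dft_bin(wave_a, 1)
B = dft_bin(wave_b, 1)

print(abs(A), abs(B))                       # same magnitude
phase_difference = cmath.phase(B / A)       # angle between the two values
print(phase_difference)
```

The magnitudes are equal, so a display that shows only magnitude cannot tell the two waves apart. The quarter-cycle shift appears only in the angle.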

SLIDE 23: Aperiodic signals
For signals which are not periodic, calculate spectrum over a short time ‘window’
‘Slide’ analysis window forward in time
Spectrogram display
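
A minimal sketch of the sliding-window idea. The function name, the rectangular (unweighted) window and the test signal are assumptions. Practical spectrogram tools usually apply a tapered window and use a fast Fourier transform.

```python
import cmath
import math

def spectrogram(samples, window_len, hop):
    """Magnitude spectrum of each window as it slides forward by hop samples."""
    frames = []
    for start in range(0, len(samples) - window_len + 1, hop):
        frame = samples[start:start + window_len]
        magnitudes = []
        for k in range(window_len // 2 + 1):         # bins up to the Nyquist frequency
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / window_len)
                        for i, x in enumerate(frame))
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames

# Aperiodic test signal: a low tone followed by a higher tone
low = [math.sin(2 * math.pi * 4 * n / 64) for n in range(256)]
high = [math.sin(2 * math.pi * 12 * n / 64) for n in range(256)]
frames = spectrogram(low + high, window_len=64, hop=64)

peak_bins = [frame.index(max(frame)) for frame in frames]
print(peak_bins)   # strongest frequency bin in each window
```

The strongest bin changes partway through the signal, which is the change over time that a spectrogram display makes visible.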

SLIDE 24: Wide-band and narrow-band
Long analysis window:
– Good frequency resolution
– Poor temporal resolution
– ‘Narrow-band’ spectrogram
Short analysis window:
– Poor frequency resolution
– Good temporal resolution
– ‘Wide-band’ spectrogram
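
The trade-off can be put into rough numbers with the rule of thumb that frequency resolution is about one over the window duration. The exact figure depends on the window shape, and the two window lengths below are illustrative values, not taken from the lecture.

```python
def frequency_resolution_hz(window_ms: float) -> float:
    """Rule of thumb: frequency resolution is about 1 / window duration."""
    return 1000.0 / window_ms

for label, window_ms in (("wide-band (short window)", 5.0),
                         ("narrow-band (long window)", 40.0)):
    print(f"{label}: {window_ms} ms window, "
          f"about {frequency_resolution_hz(window_ms):.0f} Hz resolution")
```

Under this rule a 5 ms window resolves frequency to about 200 Hz, while a 40 ms window resolves it to about 25 Hz but blurs events that are closer together than 40 ms.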

SLIDE 25: Wide-band & narrow-band spectrograms
[Figure panels: wide-band, narrow-band]

SLIDE 26: Next week
Filters
Human hearing