EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision.


SLIDE 1: EE2F1 Speech & Audio Technology, Lecture 3
Sept. 26, 2002
Martin Russell
Digital Systems & Vision Processing
Electronic, Electrical & Computer Engineering, School of Engineering, The University of Birmingham

SLIDE 2: Reminder from last week: Fourier Transform
[Figure: Fourier transform of a signal; frequency axis marked f, 3f, 5f, 7f]

SLIDE 3: Low Pass Filter (1)
[Figure: spectrum with components at f, 3f, 5f, 7f passed through a low pass "brick-wall" filter; the cut-off frequency is marked]

SLIDE 4: Low Pass Filter (2)
[Figure: spectrum with components at f, 3f, 5f, 7f before and after the low pass "brick-wall" filter]
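A minimal sketch of the brick-wall low pass idea, not taken from the lecture: build a signal from components at f, 3f, 5f and 7f, zero every frequency bin above the cut-off, and transform back. The 1/h amplitudes, the signal length N, the fundamental f and all function names are assumptions chosen for illustration. A naive DFT is used so the example needs only the standard library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(N^2); fine for short signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def brick_wall_low_pass(x, cutoff_bin):
    """Zero every DFT bin whose frequency index exceeds cutoff_bin."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        # For a real signal, bins k and n - k carry the same frequency.
        if min(k, n - k) > cutoff_bin:
            X[k] = 0
    return idft(X)

N = 64  # number of samples (hypothetical)
f = 2   # fundamental, in cycles per N samples (hypothetical)

# Input with components at f, 3f, 5f and 7f; the 1/h weights are an assumption.
signal = [sum(math.sin(2 * math.pi * h * f * t / N) / h for h in (1, 3, 5, 7))
          for t in range(N)]

# Cut-off placed between 3f and 5f, so only f and 3f should survive.
filtered = brick_wall_low_pass(signal, cutoff_bin=4 * f)
expected = [sum(math.sin(2 * math.pi * h * f * t / N) / h for h in (1, 3))
            for t in range(N)]
```

Under these assumptions, `filtered` matches `expected` to within rounding error: the components above the cut-off are removed and those below it pass unchanged.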

SLIDE 5: High Pass Filter
[Figure: spectrum with components at f, 3f, 5f, 7f before and after a high pass filter]

SLIDE 6: Band Pass Filter
[Figure: spectrum with components at f, 3f, 5f, 7f before and after a band pass filter]
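The three idealised filter types differ only in which frequencies they keep. The sketch below treats a spectrum as a mapping from frequency to amplitude and each ideal filter as a yes/no test on frequency. The fundamental of 100 Hz, the amplitudes and the function names are hypothetical, chosen only to illustrate the slides.

```python
def apply_ideal_filter(spectrum, passes):
    """Keep only the components whose frequency satisfies passes(freq)."""
    return {freq: amp for freq, amp in spectrum.items() if passes(freq)}

def low_pass(cutoff_hz):
    """Pass everything at or below the cut-off frequency."""
    return lambda freq: freq <= cutoff_hz

def high_pass(cutoff_hz):
    """Pass everything at or above the cut-off frequency."""
    return lambda freq: freq >= cutoff_hz

def band_pass(low_hz, high_hz):
    """Pass everything between the two band edges."""
    return lambda freq: low_hz <= freq <= high_hz

f = 100.0  # hypothetical fundamental in Hz
spectrum = {f: 1.0, 3 * f: 1 / 3, 5 * f: 1 / 5, 7 * f: 1 / 7}

low = apply_ideal_filter(spectrum, low_pass(4 * f))           # keeps f and 3f
high = apply_ideal_filter(spectrum, high_pass(4 * f))         # keeps 5f and 7f
band = apply_ideal_filter(spectrum, band_pass(2 * f, 6 * f))  # keeps 3f and 5f
```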

SLIDE 7: Demonstration

SLIDE 8: Effect of filtering (1)

SLIDE 9: Effect of filtering (2)

SLIDE 10: Implementation of filters
- Practical filters are approximations to the idealised "brick-wall" filters described earlier
- Example of a linear system
[Figure: frequency response plotted against frequency]
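To make the "approximation" point concrete, the sketch below evaluates the magnitude of the frequency response of a simple three-point weighted average. The coefficients are a hypothetical choice, not a filter from the lecture. Its response falls away gradually with frequency instead of dropping abruptly at a cut-off.

```python
import cmath
import math

def magnitude_response(coeffs, omega):
    """|H(e^{j*omega})| for a filter y(n) = sum_k coeffs[k] * x(n - k).

    omega is in radians per sample: 0 is DC, pi is half the sampling rate.
    """
    return abs(sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs)))

coeffs = [0.25, 0.5, 0.25]  # hypothetical three-point weighted average

# Sample the response at five frequencies from DC to half the sampling rate.
response = [magnitude_response(coeffs, math.pi * i / 4) for i in range(5)]
```

For these coefficients the magnitude works out to (1 + cos(omega)) / 2, so it falls smoothly from 1 at DC to 0 at half the sampling rate, with no sharp edge.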

SLIDE 11: Linear System
x1(n) -> y1(n)
x2(n) -> y2(n)
x1(n) + x2(n) -> y1(n) + y2(n)
g*x1(n) -> g*y1(n)
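The two properties on this slide can be checked numerically for a candidate system. The sketch below does this for one pair of test inputs. Both example systems and all names are hypothetical, and passing the check for particular inputs is evidence of linearity, not a proof.

```python
def two_point_average(x):
    """Hypothetical linear system: y(n) = 0.5*x(n) + 0.5*x(n-1), with x(-1) = 0."""
    return [0.5 * x[n] + 0.5 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def squarer(x):
    """Hypothetical non-linear system: y(n) = x(n)^2."""
    return [v * v for v in x]

def passes_linearity_check(system, x1, x2, g, tol=1e-12):
    """Test additivity and scaling for the given inputs only."""
    y1, y2 = system(x1), system(x2)
    y_sum = system([p + q for p, q in zip(x1, x2)])
    y_scaled = system([g * p for p in x1])
    additive = all(abs(s - (a + b)) <= tol for s, a, b in zip(y_sum, y1, y2))
    scales = all(abs(s - g * a) <= tol for s, a in zip(y_scaled, y1))
    return additive and scales

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.5, -1.0, 0.0, 2.0]
g = 3.0

average_ok = passes_linearity_check(two_point_average, x1, x2, g)  # True
squarer_ok = passes_linearity_check(squarer, x1, x2, g)            # False
```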

SLIDE 12: Impulse response of a linear system (LS)
[Figure: a unit impulse i(n) at n = 0 applied to the system; the output is the impulse response r(n)]

SLIDE 13: Response of a LS
- Compute output for general input from impulse response
[Figure: a general input x(n) and the responses that are summed (+) to form the output]
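One way to compute the output for a general input from the impulse response is to treat each input sample as a scaled, delayed impulse and add up the resulting scaled, delayed copies of r(n), which is a convolution. The sketch below assumes finite-length sequences that start at n = 0. The function name and the sample values are illustrative.

```python
def convolve(x, r):
    """Output of a linear, time-invariant system with impulse response r for input x.

    Each sample x[k] contributes a copy of r scaled by x[k] and delayed by k samples.
    """
    y = [0.0] * (len(x) + len(r) - 1)
    for k, xk in enumerate(x):
        for m, rm in enumerate(r):
            y[k + m] += xk * rm
    return y

r = [1.0, 0.5, 0.25]                   # hypothetical impulse response

impulse_out = convolve([1.0, 0.0], r)  # a unit impulse returns r itself (plus padding)
general_out = convolve([1.0, 2.0], r)  # r plus 2*r delayed by one sample
```

Here `general_out` is `[1.0, 2.5, 1.25, 0.5]`: the first copy of r, plus a second copy doubled and shifted one sample later.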

SLIDE 14: Digital filters
- Output of LS is sum of weighted, delayed inputs
[Figure: impulse i(n) and impulse response r(n); block diagram in which x(n) passes through a unit delay Z^-1, the signals are weighted by a1 and a2 and summed to give y(n)]

SLIDE 15: Finite Impulse Response (FIR) digital filter
[Figure: block diagram in which x(n) passes through a chain of unit delays Z^-1; the taps are weighted by a1, a2, ..., aN and summed to give y(n)]
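A minimal sketch of an FIR filter as a weighted sum of the current and delayed inputs. The indexing (the first coefficient multiplies the undelayed input) and the coefficient values are assumptions, since the slide's diagram does not survive in the transcript.

```python
def fir_filter(x, a):
    """y(n) = a[0]*x(n) + a[1]*x(n-1) + ... + a[N-1]*x(n-N+1).

    Samples before n = 0 are taken to be zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, ak in enumerate(a):
            if n - k >= 0:
                acc += ak * x[n - k]
        y.append(acc)
    return y

coeffs = [0.5, 0.3, 0.2]                  # hypothetical tap weights
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # unit impulse at n = 0
response = fir_filter(impulse, coeffs)
```

The impulse response is just the list of coefficients followed by zeros, which is why the filter is called "finite impulse response".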

SLIDE 16: General digital filters
- In a general digital filter, output is a sum of delayed inputs and outputs
- Recursive filter
- Infinite Impulse Response (IIR) filter
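A sketch of a recursive filter in which the output depends on delayed outputs as well as inputs. The sign convention (feedback terms are added) and the coefficient values are assumptions; some textbooks subtract the feedback terms instead.

```python
def recursive_filter(x, a, b):
    """y(n) = sum_k a[k]*x(n-k) + sum_k b[k]*y(n-1-k).

    a weights the current and delayed inputs, b weights the delayed outputs.
    Samples before n = 0 are taken to be zero.
    """
    y = []
    for n in range(len(x)):
        acc = sum(ak * x[n - k] for k, ak in enumerate(a) if n - k >= 0)
        acc += sum(bk * y[n - 1 - k] for k, bk in enumerate(b) if n - 1 - k >= 0)
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 7  # unit impulse at n = 0
response = recursive_filter(impulse, a=[1.0], b=[0.5])  # y(n) = x(n) + 0.5*y(n-1)
```

With a single feedback coefficient of 0.5, the impulse response is 1, 0.5, 0.25, ... and halves at every step without ever reaching zero, hence "infinite impulse response".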

SLIDE 17: The human auditory system
[Figure taken from J N Holmes, "Speech Synthesis and Recognition", Van Nostrand Reinhold (1988)]

SLIDE 18: The cochlea
[Figure source: Australian National University, http://online.anu.edu.au/ITA/ACAT/drw/PPofM/hearing/hearing3.html]

SLIDE 19: The basilar membrane
[Figure: frequency sensitivity of the basilar membrane. Source: School for Advanced Studies, Trieste, Italy, http://poirot.sissa.it/multidisc/cochlea/utils/basilar.htm]

SLIDE 20: Basilar membrane dynamics
[Figure source: School for Advanced Studies, Trieste, Italy, http://poirot.sissa.it/multidisc/cochlea/utils/basilar.htm]

SLIDE 21: Masking
- Frequency resolution of the ear
- Loud sounds mask perception of quieter sounds with similar frequency
- Many different psycho-acoustic experiments
- Exploited in MP3 coding

SLIDE 22: Masking Experiment
- Low level pure tone (sinusoid) mixed with a narrow band of random noise with higher level and the same centre frequency
- Perception of the tone is masked by the noise
- Now move the centre frequency of the noise
- How loud does the noise need to be to mask the tone?
[Figure: tone and noise band shown on a frequency axis]

SLIDE 23: Masking experiment
[Figure: psycho-physical tuning curve; level (dB SPL) against frequency, with 1 kHz marked]

SLIDE 24: An experiment
- First play two tones: A and B
- Then play a third and fourth tone: C and D_i
- Vary D_i
- When do you perceive the difference between C and D_i to be the same as between A and B?

SLIDE 25: Experiment
- A B C D_1
- A B C D_2
- A B C D_3
- A B C D_4

SLIDE 26: Answer
- In theory, should have chosen:
  - A (500 Hz), B (600 Hz), C (1500 Hz), D_2 (1680 Hz)
- Equal distance between A and B and between C and D_2 on the perceptual mel frequency scale
- A B C D_2
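The claim can be checked with a mel conversion formula. The slides do not say which formula the lecture uses, so the sketch below assumes one common form, mel = 2595 * log10(1 + f/700). Other definitions of the mel scale exist and give slightly different numbers.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (assumed formula, see the note above)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel()."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

a_hz, b_hz, c_hz, d2_hz = 500.0, 600.0, 1500.0, 1680.0  # values from the slide

d_ab = hz_to_mel(b_hz) - hz_to_mel(a_hz)   # perceptual distance from A to B
d_cd = hz_to_mel(d2_hz) - hz_to_mel(c_hz)  # perceptual distance from C to D_2

# Frequency that sits exactly d_ab mel above C under this formula.
d_equal_hz = mel_to_hz(hz_to_mel(c_hz) + d_ab)
```

Under this formula the two distances come out close to each other (both roughly 90 mel), even though the gaps in Hz are 100 and 180, and `d_equal_hz` lands close to the 1680 Hz given on the slide.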

SLIDE 27: The mel scale
[Figure: the mel scale, with tones A, B, C and D_2 marked]

SLIDE 28: Lessons from psycho-acoustics
- Human speech perception begins with frequency analysis on the basilar membrane
- An individual point on the basilar membrane can be modelled as a band-pass filter (critical bandwidths)
- Frequency is not perceived on a linear scale, hence the use of non-linear perceptual frequency scales: mel scale, bark scale, ...
- Loudness is perceived on a logarithmic scale

