EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision.

Presentation transcript:

SLIDE 1: EE2F1 Speech & Audio Technology, Lecture 3. Martin Russell, Electronic, Electrical & Computer Engineering, School of Engineering, The University of Birmingham. Sept. 26, 2002. Digital Systems & Vision Processing.

SLIDE 2: Reminder from last week: Fourier Transform
[Figure: components at f, 3f, 5f and 7f summed (+), and the Fourier transform of the result.]
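
The reminder above can be tried numerically. The following is a minimal sketch, not the lecturer's demonstration: the sampling rate, the fundamental of 100 Hz and the 1/k amplitudes are all assumptions chosen for illustration. It builds a signal from components at f, 3f, 5f and 7f and checks that the Fourier transform has peaks at exactly those frequencies.

```python
import numpy as np

fs = 8000          # sampling rate in Hz (assumed)
f0 = 100           # fundamental frequency f in Hz (assumed)
n = np.arange(fs)  # one second of samples, so the FFT bins are 1 Hz apart

# Sum of sinusoids at f, 3f, 5f and 7f with amplitudes 1/k (assumed weights).
harmonics = [1, 3, 5, 7]
x = sum(np.sin(2 * np.pi * k * f0 * n / fs) / k for k in harmonics)

# Magnitude spectrum, scaled so a unit-amplitude sinusoid gives a peak of 1.
spectrum = np.abs(np.fft.rfft(x)) * 2 / len(x)
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

# Frequencies of the four largest spectral peaks, in ascending order.
peak_freqs = sorted(float(f) for f in freqs[np.argsort(spectrum)[-4:]])
print(peak_freqs)
```

The four peaks should fall at the four component frequencies, with heights matching the assumed amplitudes.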

SLIDE 3: Low Pass Filter (1)
[Figure: spectrum with components at f, 3f, 5f and 7f before and after a low-pass "brick-wall" filter; the cut-off frequency is marked.]

SLIDE 4: Low Pass Filter (2)
[Figure: spectrum with components at f, 3f, 5f and 7f before and after a low-pass "brick-wall" filter.]

SLIDE 5: High Pass Filter
[Figure: spectrum with components at f, 3f, 5f and 7f before and after a high-pass filter.]

SLIDE 6: Band Pass Filter
[Figure: spectrum with components at f, 3f, 5f and 7f before and after a band-pass filter.]
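
The three idealised filter types can be sketched by zeroing Fourier coefficients outside the pass band. This is only an illustration of the brick-wall idea: the helper name `brick_wall`, the test signal and the cut-off values are assumptions, and zeroing FFT bins is not how practical filters are usually built.

```python
import numpy as np

def brick_wall(x, fs, f_lo=0.0, f_hi=None):
    """Keep only components with f_lo <= frequency <= f_hi (hypothetical helper)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    if f_hi is None:
        f_hi = fs / 2
    X[(freqs < f_lo) | (freqs > f_hi)] = 0   # zero everything outside the pass band
    return np.fft.irfft(X, n=len(x))

fs, f0 = 8000, 100                           # assumed sampling rate and fundamental
n = np.arange(fs)
x = sum(np.sin(2 * np.pi * k * f0 * n / fs) / k for k in (1, 3, 5, 7))

low = brick_wall(x, fs, f_hi=400)             # low pass: keeps f and 3f
high = brick_wall(x, fs, f_lo=400)            # high pass: keeps 5f and 7f
band = brick_wall(x, fs, f_lo=200, f_hi=600)  # band pass: keeps 3f and 5f
```

With these assumed cut-offs the low-pass and high-pass outputs partition the components, so adding them back together recovers the original signal.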

SLIDE 7: Demonstration

SLIDE 8: Effect of filtering (1)

SLIDE 9: Effect of filtering (2)

SLIDE 10: Implementation of filters
• Practical filters are approximations to the idealised "brick-wall" filters described on the previous slides
• A filter is an example of a linear system
[Figure: frequency response plotted against frequency.]
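
One common way to approximate a brick-wall low-pass filter is the windowed-sinc design. The slides do not name a design method, so this is a swapped-in illustration; the sampling rate, cut-off, filter length and Hamming window are all assumptions.

```python
import numpy as np

fs = 8000        # sampling rate in Hz (assumed)
cutoff = 1000    # desired cut-off frequency in Hz (assumed)
num_taps = 101   # filter length (assumed); longer filters get closer to the brick wall

# Ideal low-pass impulse response (a sinc), truncated and tapered by a Hamming window.
m = np.arange(num_taps) - (num_taps - 1) / 2
h = 2 * cutoff / fs * np.sinc(2 * cutoff / fs * m) * np.hamming(num_taps)

# Magnitude of the frequency response on a dense grid from 0 to fs/2.
H = np.abs(np.fft.rfft(h, n=4096))
freqs = np.fft.rfftfreq(4096, d=1 / fs)

passband = H[freqs < 800]    # well below the cut-off: gain should be close to 1
stopband = H[freqs > 1200]   # well above the cut-off: gain should be close to 0
print(passband.min(), stopband.max())
```

The response is near 1 below the cut-off and near 0 above it, but with a transition region of finite width in between, which is the sense in which a practical filter only approximates the brick wall.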

SLIDE 11: Linear System
• $x_1(n) \rightarrow y_1(n)$
• $x_2(n) \rightarrow y_2(n)$
• $x_1(n) + x_2(n) \rightarrow y_1(n) + y_2(n)$
• $g \cdot x_1(n) \rightarrow g \cdot y_1(n)$
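
The two defining properties, additivity and scaling, can be checked numerically. In this sketch the two-point moving average stands in for "a linear system"; that choice, the random test inputs and the gain value are assumptions made for illustration.

```python
import numpy as np

def system(x):
    """Example linear system (an assumption): a two-point moving average."""
    x = np.asarray(x, dtype=float)
    delayed = np.concatenate(([0.0], x[:-1]))   # x(n-1), with x(-1) taken as 0
    return 0.5 * x + 0.5 * delayed

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(16), rng.standard_normal(16)
g = 3.0

# x1 + x2 -> y1 + y2 and g*x1 -> g*y1 both hold for the moving average.
additive = np.allclose(system(x1 + x2), system(x1) + system(x2))
homogeneous = np.allclose(system(g * x1), g * system(x1))

# A squaring device, by contrast, fails the additivity test.
square_additive = np.allclose(np.square(x1 + x2), np.square(x1) + np.square(x2))
print(additive, homogeneous, square_additive)
```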

SLIDE 12: Impulse response of a linear system (LS)
[Figure: an impulse $i(n)$ at $n = 0$ applied to the system produces the impulse response $r(n)$.]

SLIDE 13: Response of a LS
• Compute the output for a general input from the impulse response
[Figure: a general input $x(n)$ and the sum (+) of the responses to its individual samples.]
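
As a sketch of this idea: each input sample launches a copy of the impulse response, delayed to that sample's position and scaled by its value, and the output is the sum of those copies (a convolution). The particular values of `r` and `x` below are made up for illustration.

```python
import numpy as np

r = np.array([1.0, 0.5, 0.25])        # assumed impulse response r(n)
x = np.array([2.0, 0.0, -1.0, 3.0])   # assumed input x(n)

# Superposition: sample x(k) contributes x(k) * r(n - k) to the output.
y = np.zeros(len(x) + len(r) - 1)
for k, xk in enumerate(x):
    y[k:k + len(r)] += xk * r

# The same result from NumPy's built-in convolution.
y_check = np.convolve(x, r)
print(y, y_check)
```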

SLIDE 14: Digital filters
• The output of a LS is a sum of weighted, delayed inputs
[Figure: impulse $i(n)$ and impulse response $r(n)$; block diagram in which $x(n)$ passes through a delay $z^{-1}$, is weighted by $a_1$ and $a_2$, and summed ($\Sigma$) to give $y(n)$.]

SLIDE 15: Finite Impulse Response (FIR) digital filter
[Figure: block diagram in which $x(n)$ passes through a chain of delays $z^{-1}$; the taps are weighted by $a_1, a_2, \ldots, a_N$ and summed ($\Sigma$) to give $y(n)$.]
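
A direct, unoptimised sketch of such a filter follows. The weights are assumed example values, the function name is hypothetical, and the code indexes the weights from 0 where the slide labels them from 1.

```python
import numpy as np

def fir_filter(x, a):
    """y(n) = a[0]*x(n) + a[1]*x(n-1) + ...; inputs before n = 0 are taken as zero."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k, ak in enumerate(a):
            if n - k >= 0:
                y[n] += ak * x[n - k]   # weighted, delayed input
    return y

a = [0.25, 0.5, 0.25]                   # assumed example weights
impulse = np.zeros(8)
impulse[0] = 1.0
r = fir_filter(impulse, a)
print(r)
```

Feeding in an impulse returns the weights themselves followed by zeros, which is why the impulse response is finite: it can be no longer than the number of weights.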

SLIDE 16: General digital filters
• In a general digital filter, the output is a sum of delayed inputs and delayed outputs
• Recursive filter
• Infinite Impulse Response (IIR) filter
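
A minimal sketch of a recursive filter, assuming the simplest possible case: one input weight and one feedback weight. The function name and the weight values are illustrative choices, not taken from the slides.

```python
import numpy as np

def recursive_filter(x, a, b):
    """y(n) = sum_k a[k]*x(n-k) + sum_k b[k]*y(n-1-k); a and b are assumed weights."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k, ak in enumerate(a):
            if n - k >= 0:
                y[n] += ak * x[n - k]       # delayed inputs
        for k, bk in enumerate(b):
            if n - 1 - k >= 0:
                y[n] += bk * y[n - 1 - k]   # delayed outputs (feedback)
    return y

impulse = np.zeros(6)
impulse[0] = 1.0
r = recursive_filter(impulse, a=[1.0], b=[0.5])   # y(n) = x(n) + 0.5*y(n-1)
print(r)
```

Because each output feeds back into the next, the response to a single impulse halves at every step but never becomes exactly zero in exact arithmetic, hence "infinite impulse response".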

SLIDE 17: The human auditory system
[Figure taken from J N Holmes, "Speech Synthesis and Recognition", Van Nostrand Reinhold (1988).]

SLIDE 18: The cochlea
[Figure source: Australian National University]

SLIDE 19: The basilar membrane
[Figure: frequency sensitivity of the basilar membrane. Source: School for Advanced Studies, Trieste, Italy]

SLIDE 20: Basilar membrane dynamics
[Figure source: School for Advanced Studies, Trieste, Italy]

SLIDE 21: Masking
• Frequency resolution of the ear
• Loud sounds mask perception of quieter sounds with similar frequency
• Many different psycho-acoustic experiments
• Exploited in MP3 coding

SLIDE 22: Masking Experiment
• A low-level pure tone (sinusoid) is mixed with a narrow band of random noise with a higher level and the same centre frequency
• Perception of the tone is masked by the noise
• Now move the centre frequency of the noise
• How loud does the noise need to be to mask the tone?
[Figure: frequency axis with a "?" annotation.]

SLIDE 23: Masking experiment
[Figure: psycho-physical tuning curve. Axes: frequency (1 kHz marked) and level (dB SPL).]

SLIDE 24: An experiment
• First play two tones: A and B
• Then play a third and fourth tone: C and $D_i$
• Vary $D_i$
• When do you perceive the difference between C and $D_i$ to be the same as that between A and B?

SLIDE 25: Experiment
• A B C $D_1$
• A B C $D_2$
• A B C $D_3$
• A B C $D_4$

SLIDE 26: Answer
• In theory, you should have chosen: A (500 Hz), B (600 Hz), C (1500 Hz), $D_2$ (1680 Hz)
• The distance between A and B equals the distance between C and $D_2$ on the perceptual mel frequency scale
• A B C $D_2$

SLIDE 27: The mel scale
[Figure: the mel scale, with tones A, B, C and $D_2$ marked.]
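
The answer to the experiment can be checked against a mel formula. The slides do not state which formula they use, so the widely used approximation mel = 2595 log10(1 + f/700) is assumed here, and the function name is hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Common mel-scale approximation (an assumption; the slides give no formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# Tone frequencies from the experiment.
ab = hz_to_mel(600) - hz_to_mel(500)     # A to B: 100 Hz apart
cd = hz_to_mel(1680) - hz_to_mel(1500)   # C to D2: 180 Hz apart
print(round(ab, 1), round(cd, 1))
```

Under this formula the two intervals come out at roughly 90 mel each, although they are not exactly equal, while the intervals in hertz differ by almost a factor of two.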

SLIDE 28: Lessons from psycho-acoustics
• Human speech perception begins with frequency analysis on the basilar membrane
• An individual point on the basilar membrane can be modelled as a band-pass filter (critical bandwidths)
• Frequency is not perceived on a linear scale, hence the use of non-linear perceptual frequency scales: mel scale, Bark scale, …
• Loudness is perceived on a logarithmic scale
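
The last point is one reason sound levels are quoted in decibels. As a sketch, the level in dB SPL (the unit on the tuning-curve axis earlier) is 20 log10(p / p_ref) with the standard reference pressure of 20 µPa. The example pressures are arbitrary, and dB SPL is a measure of physical level, not a full model of perceived loudness.

```python
import math

P_REF = 20e-6   # reference pressure in pascals (20 µPa), the standard dB SPL reference

def db_spl(p_rms):
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20.0 * math.log10(p_rms / P_REF)

# Each tenfold increase in pressure adds the same 20 dB step.
levels = [db_spl(p) for p in (20e-6, 200e-6, 2e-3, 20e-3)]
print([round(level) for level in levels])
```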