Week 6 – Psychoacoustics ESE 250 S’13 DeHon Kadric Kod Wilson-Shah 1 ESE250: Digital Audio Basics Week 6 February 19, 2013 Human Psychoacoustics.

Presentation transcript:

1 ESE250: Digital Audio Basics
Week 6, February 19, 2013
Human Psychoacoustics

2 Course Map

3 Where are we?
- Week 2: Received signal is sampled & quantized: q = PCM[ r ]
- Week 4: Sampled signal first transformed into frequency domain: Q = DFT[ q ]
- Week 3: Quantized signal is coded: c = code[ q ]
- Week 5: Signal oversampled & low-pass filtered: Q = LPF[ DFT(q+n) ]
- Week 6: Transformed signal analyzed using human psychoacoustic models
- Week 7: Acoustically interesting signal is "perceptually coded": C = MP3[ Q ]
[Figure: processing-pipeline block diagram with blocks Over Sample, DFT, LPF, Perceptual Coding, Store / Transmit, Decode, Produce; signals r(t), q + n, Q + N, Q, C, p(t); stages annotated Week 3 through Week 6]
[Painter & Spanias. Proc. IEEE, 88(4):451–512, 2000]

4 The Physical Ear
External Sound Waves
- Guided by outer ear
- into auditory canal
Excite Inner Ear
- Through mechanical linkage
- connecting ear drum
- to cochlea
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]

5 The Physical Ear
Initiates signal processing
- frequency domain analysis
- via analog computation
- Video: Cochlea
What part of the cochlea vibrates for an 800 Hz square wave?
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]

6 The Cognitive Ear
Modern Psychoacoustics
- Benefits greatly from
  o decades of neural recording
  o contemporary brain imaging technology
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]

7 Power Spectrum Model of Hearing
Rough picture (main content of today's lecture):
- Critical Bands: Auditory system contains a finite array of adaptively tunable, overlapping bandpass filters
- Frequency Bins: humans process a signal's component (against noisy background) in the one filter with the closest center frequency
- Masking: certain signal components in a given band are "favored" and others are filtered out
Established through decades of psychoacoustic experiments
[B.C.J. Moore. Int. Rev. Neurobiol., 70:49–86]

8 Auditory Thresholds
In the lab, you varied the frequency, amplitude and phase of signals.
What was the effect of each, if any, on the sound you heard?
- Frequency
- Amplitude
- Phase

9 Auditory Thresholds
Harvey Fletcher (1940)
- Played pure tones varying
  o frequency, f [Hz]
  o intensity, I [dyn·cm^-2], where 1 dyn·cm^-2 = 10^-5 N·cm^-2 = 0.1 Pa
  o phase changes tend to be inaudible
- Large listener population
  o Young
  o Acute
Recorded extreme thresholds
- faintest audible
- greatest tolerable

10 Auditory Thresholds
Results:
- pain-free hearing range extends at most over 20 Hz – 20 kHz
- with sensitivity $\approx 2 \times 10^{-4} \times 0.1\ \mathrm{Pa} = 20\ \mu\mathrm{Pa}$
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]

11 The decibel unit
Define standard pressure: $p_0 = 2 \times 10^{-4} \times 0.1\ \mathrm{Pa} = 20\ \mu\mathrm{Pa}$
- Threshold of human hearing
Compute Sound Pressure Level as: $L_{SPL} = 20 \log_{10}(p/p_0)$ dB
- $L_{SPL}$ for $p_1 = 20\ \mu\mathrm{Pa}$, for $p_2 = 200\ \mu\mathrm{Pa}$, for $p_3 = 20\ \mathrm{mPa}$
Compare to ambient sea-level pressure: 1 atmosphere $\approx 10^5$ Pa
Q: why use log-log scale?
- A1: dynamic range
- A2: "loudness" is a power function
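The sound pressure level formula on this slide can be checked numerically. The following is a minimal sketch, assuming the 20 µPa reference pressure stated above; the function names spl_db and dyn_per_cm2_to_pa are illustrative and do not come from the lecture.

```python
import math

P0_PA = 20e-6  # reference pressure p_0 = 20 micropascals (threshold of hearing)


def dyn_per_cm2_to_pa(pressure_dyn_cm2):
    """Convert a pressure in dyn/cm^2 to pascals (1 dyn/cm^2 = 0.1 Pa)."""
    return 0.1 * pressure_dyn_cm2


def spl_db(pressure_pa, p0=P0_PA):
    """Sound pressure level L_SPL = 20 * log10(p / p0), in dB."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / p0)


if __name__ == "__main__":
    # The three pressures posed on the slide, plus one atmosphere for scale.
    cases = [("p1 = 20 uPa", 20e-6), ("p2 = 200 uPa", 200e-6),
             ("p3 = 20 mPa", 20e-3), ("1 atm ~ 1e5 Pa", 1e5)]
    for label, p in cases:
        print(f"{label}: {spl_db(p):.1f} dB SPL")
```

Each factor of 10 in pressure adds 20 dB, so the three slide pressures come out at 0, 20 and 60 dB SPL, while one atmosphere sits near 194 dB. That spread is the dynamic-range argument for a logarithmic scale.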

12 The decibel unit – Hearing intensity

13 Let's try to reproduce these results!
We will listen to single sine tones starting at a frequency of 10 kHz, all the way up to 20 kHz, so each student can figure out their cut-off frequency.
Suggestions to improve this experiment?
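One way to prepare test tones for this kind of listening experiment is sketched below using only the Python standard library. The 44.1 kHz sample rate, 1 kHz step, one-second duration, amplitude, and file names are all assumptions made for illustration; the lecture does not specify how its tones were generated.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # Hz; assumed, more than twice the highest tone (20 kHz)


def test_frequencies(start_hz=10000, stop_hz=20000, step_hz=1000):
    """Frequencies of the test tones, inclusive of both end points."""
    return list(range(start_hz, stop_hz + 1, step_hz))


def sine_samples(freq_hz, duration_s=1.0, amplitude=0.3, rate=SAMPLE_RATE):
    """Samples of a pure sine tone, as floats in [-amplitude, amplitude]."""
    n = int(duration_s * rate)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / rate)
            for i in range(n)]


def write_wav(path, samples, rate=SAMPLE_RATE):
    """Write float samples in [-1, 1] as a 16-bit mono WAV file."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(frames)


if __name__ == "__main__":
    for f in test_frequencies():
        write_wav(f"tone_{f}Hz.wav", sine_samples(f))
```

A moderate amplitude is used deliberately: the listener cannot tell whether a tone near their cut-off is playing, so the level should stay the same for every frequency rather than being turned up.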

14 Animal hearing ranges
Dogs:
- Greater hearing range: 40 Hz to 60 kHz
- Ultrasonic dog whistles
Mice:
- Large ears in comparison to their bodies
- Hearing range: 1 kHz to 70 kHz
- Can't hear low-frequency noises
- Communicate with high frequency
- Distress call (40 kHz), alert of predator
[Pictures from Wikipedia]

15 Why Sinusoids?
Why not some other harmonic series?
- Fourier's analysis shows
- harmonic analysis could be based on
- arbitrary smooth periodic fundamental
Why does the animal receiver use sinusoids?
Hamiltonian Mechanics
- Simplest physical model of vibrating masses
- Coupled spring-mass-damper mechanics
- Produce sinusoidal harmonics
- Video: Cochlea
[Figure: spring-mass-damper diagram labeled m, x, b, k]
… all sound is produced by vibrating masses …
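The claim that spring-mass-damper mechanics produce sinusoids can be made concrete with the standard single-mass equation of motion. This is a textbook sketch, assuming the figure labels stand for mass m, displacement x, damping coefficient b and spring constant k; the amplitude A and phase φ are integration constants introduced here.

```latex
% Free motion of one mass on a spring with a damper
m\,\ddot{x} + b\,\dot{x} + k\,x = 0

% Underdamped case (b^2 < 4mk): a sinusoid under a decaying envelope
x(t) = A\,e^{-\frac{b}{2m}t}\cos(\omega_d t + \varphi),
\qquad
\omega_d = \sqrt{\frac{k}{m} - \frac{b^2}{4m^2}}

% Undamped limit (b = 0): a pure sinusoid at the natural frequency
x(t) = A\cos(\omega_0 t + \varphi),
\qquad
\omega_0 = \sqrt{\frac{k}{m}}
```

Coupling several such masses gives one sinusoidal mode per mass, which is the sense in which vibrating physical systems naturally produce sinusoidal harmonics.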

16 Masking - Spatial
Masking Paradigms
- "Masker" masking "maskee"
- Tone Masking Noise
  o pure tone of 80 dB SPL at 1 kHz
  o just masks "critical band" noise of 56 dB SPL centered at 1 kHz
- Masker-to-Maskee ratio
  o Constant for fixed relative frequency and varying amplitude
  o Changes with varying relative frequency
[Figure annotation: 1 "Bark" frequency interval]
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000]
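The tone-masking-noise numbers above imply a masker-to-maskee ratio of 80 - 56 = 24 dB. The sketch below applies that ratio at other masker levels, relying on the slide's statement that the ratio stays constant when only amplitude changes. The function names are hypothetical, and treating the "just masked" level as masked is an assumption of this sketch.

```python
def masker_to_maskee_ratio_db(masker_spl_db, just_masked_spl_db):
    """Ratio in dB between a masker and the loudest maskee it just masks."""
    return masker_spl_db - just_masked_spl_db


def is_masked(maskee_spl_db, masker_spl_db, ratio_db):
    """True if the maskee lies at or below the just-masked level.

    Only meaningful for the same masker/maskee frequency relationship
    at which ratio_db was measured.
    """
    return maskee_spl_db <= masker_spl_db - ratio_db


if __name__ == "__main__":
    # Slide example: 80 dB SPL tone just masks 56 dB SPL critical-band noise.
    ratio = masker_to_maskee_ratio_db(80.0, 56.0)
    print("masker-to-maskee ratio:", ratio, "dB")
    # Same relative frequency, quieter 70 dB masker.
    for noise_level in (40.0, 46.0, 50.0):
        print(noise_level, "dB noise masked:", is_masked(noise_level, 70.0, ratio))
```

Because the ratio changes with relative frequency, a real psychoacoustic model needs a ratio per frequency offset rather than the single constant used here.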

17 Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
The first graph shows the masking pattern for a 200 Hz tone
- Mostly masks tones around 200 Hz, but also at harmonics
The second graph shows the same plot for different frequencies, but only the fundamental part
- Notice that the band gets wider for increasing frequencies
… masker at fundamental can somewhat mask maskees at the harmonics …
… but the "spreading curve" is traditionally depicted over the fundamental only

18 Tone Masking Noise
Are the following signals masked?
- 200 Hz tone at 80 dB
- 200 Hz tone at 40 dB
- 300 Hz tone at 40 dB
- 400 Hz tone at 40 dB
- 700 Hz tone at 30 dB

19 Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
Tone Masking Noise (Fig 12)
- value above quiet threshold
- such that a signal at the abscissa frequency
- can be heard in presence of
  o top: 200 Hz tone
  o bottom: various frequencies
Noise Masking Tone (Fig 13)
- dots show pure tone magnitude (in dB)
- required to be audible above noise
  o of the magnitude on the middle curve
  o centered at that frequency
  o with bandwidth at least wider than the bars of Fig 12

20 Noise Masking Tone
Are the following signals masked by the noise?
- 200 Hz at 60 dB
- 1 kHz at 60 dB

21 Noise Masking Tone
Are the following signals masked by the noise?
- 200 Hz at 60 dB
  o Yes!
- 1 kHz at 60 dB
[Figure label: noise]

22 Noise Masking Tone
Are the following signals masked by the noise?
- 200 Hz at 60 dB
  o No!
- 1 kHz at 60 dB

23 Noise Masking Tone
Are the following signals masked by the noise?
- 200 Hz at 60 dB
- 1 kHz at 60 dB
  o No!

24 Noise Masking Tone
Are the following signals masked by the noise?
- 200 Hz at 60 dB
- 1 kHz at 60 dB
  o No!

25 Masking - Temporal
Temporal Masking
- Masker effect persists for tenths of a second
- Masker effect is "acausal"
  o on timescales of ~2/100 s

26 Pitch JND
JND = "just noticeable difference"
- change in stimulus that "just" elicits perceptual notice
- where "just" means that a smaller variation of stimulus cannot be discerned
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
What can you say about the JND:
- Below 1000 Hz?
  o roughly constant
  o ~ 3 Hz
- Above 1000 Hz?
  o roughly log-log linear
  o $\log[\mathrm{JND}(f_2)] - \log[\mathrm{JND}(f_1)] \approx n\,(\log f_2 - \log f_1)$
Suggests that as frequency increases
- broader frequency bands
- "assigned" to same length of cochlear tissue
- Remember cochlea model
What is n?
- e.g. $f_1 = 2000$, $f_2 = [\ldots]$: $10 - 4 \approx n \log_{10}(2) \Rightarrow n \approx 20$

27 JND experiment
The following audio files contain a single tone playing for 10 seconds. The sine starts at 200 Hz, then changes to a higher frequency (201, 202, 203, 205, 210 Hz). This change occurs after a number of "noises": 1, 2, 3, 4, 5, 6, 7, 8 or 9. Can you notice when the change happens?

28 Critical Bands
Decades of empirical study reveal that human audio frequency perception is quantized into
- < 30 "critical bands" of perceptually near-identical pitch classes
- corresponding to ~equal length bands of cochlear tissue (neurons)

29 Critical Bands: Evidence
- Tone masking Noise (Fig. a & c)
  o noise audibility threshold
  o for small bandwidth noise
  o remains constant
  o until tone frequency locus
  o falls away from critical bandwidth
- Noise masking Tone (Fig. b & d)
  o same effect
  o with masker and maskee roles reversed
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000]

30 The Bark Scale
"Bark" units: Uniform JND scale for frequency
- Maps frequency intervals into their respective critical band number
[E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]

31 The Bark Scale
Frequency-to-Bark function
- First Principles vs. Empirical Modeling
[E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
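The frequency-to-Bark curve from the slide is not reproduced in this transcript. As a stand-in, the sketch below uses one widely quoted empirical fit, the Zwicker and Terhardt (1980) approximation z = 13 arctan(0.00076 f) + 3.5 arctan((f/7500)^2). It may not be the exact function shown in the lecture, and the helper band_number (floor of the Bark value plus one) is a hypothetical convenience, approximate near band edges.

```python
import math


def hz_to_bark(freq_hz):
    """Approximate Bark value for a frequency in Hz (Zwicker & Terhardt fit)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def band_number(freq_hz):
    """1-based critical band number: floor of the Bark value, plus one."""
    return int(math.floor(hz_to_bark(freq_hz))) + 1


if __name__ == "__main__":
    for f in (100, 200, 500, 1000, 2000, 5000, 10000, 20000):
        print(f"{f:>6} Hz -> {hz_to_bark(f):5.2f} Bark, band {band_number(f)}")
```

Equal steps in Bark correspond to progressively wider steps in Hz as frequency rises, matching the observation that critical bands widen at higher frequencies. The whole audible range maps to fewer than 30 Bark under this fit.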

32 Compression opportunities
Consider the following recording
Any ways to improve the compression?

33 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: spectrum, dB vs. frequency, marked at 195 Hz, 200 Hz and 205 Hz; annotation: Masked]

34 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: spectrum, dB vs. frequency, marked at 195 Hz, 200 Hz and 205 Hz; annotation: JND: could only represent integer frequency values]

35 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: spectrum, dB vs. frequency, marked at 195 Hz, 200 Hz and 205 Hz]

36 Next Week
How can we use what we know about human perception to compress music?
- Frequency hearing range
- Masking
  o Temporal
  o Spatial
  o JND
  o Barks

37 Big Ideas
Sound is a pressure wave that makes the cochlea vibrate
- with frequencies from ~20 Hz (at the tip) to ~20 kHz (at the base)
This vibration is sinusoidal (physics)
- This is why sound harmonics are best represented as sinusoidal signals
Masking
- Temporal – A masker tone can mask another tone that is present either right before or a little after the masker
- Spatial – A single tone can mask an entire frequency band (that contains the tone) if its intensity is high enough
- There are <30 such bands (Bark scale), and they are wider for higher frequencies

38 Admin
Lab 5 report due tomorrow
On Thursday: Lab 6
- You will be designing your own experiments
  o To measure the range of frequencies you can hear
  o To perform spatial masking experiments

39 ESE250: Digital Audio Basics
End Week 6 Lecture
Human Psychoacoustics