What do we hear for?
Seeing is knowing what is where by looking (David Marr)
Seeing is predicting what is where, verified by looking, in order to drink that cup of coffee (Reza Shadmehr)
Hearing is predicting what will happen next, verified by listening, in order to know as much as possible about what’s out there (Eli Nelken)
Even simple sounds tell stories
A stupid story
The calm of the sea Vox balaenae (Voice of the whale) For flute, cello and piano (cello and piano playing) George Crumb
A shout of despair Wozzeck, orchestral transition between scenes 2 and 3 of act 3 Alban Berg
Auditory worlds What are sounds? What do we hear? How do we hear?
Sound As a Pressure Wave
Vibrations of objects set up pressure waves in the surrounding air. The “elastic” property of air allows these pressure waves to propagate (spread).
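A minimal sketch of a propagating pressure wave may help here. It assumes an idealized plane sinusoidal wave and a speed of sound of 343 m/s (dry air at about 20 °C); both assumptions, and all function names, are illustrative and not taken from the slides.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value: dry air at about 20 degrees C


def wavelength_m(frequency_hz, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance the wave travels during one period of vibration."""
    return speed_m_s / frequency_hz


def pressure_pa(x_m, t_s, frequency_hz, amplitude_pa=1.0,
                speed_m_s=SPEED_OF_SOUND_M_S):
    """Pressure deviation of an idealized plane wave moving in the +x direction.

    The value at position x and time t equals the value at the source
    at the earlier time t - x / speed: the vibration simply spreads outward.
    """
    phase = 2.0 * math.pi * frequency_hz * (t_s - x_m / speed_m_s)
    return amplitude_pa * math.sin(phase)


# Time the wave needs to travel 1 m under the assumed speed of sound.
travel_time_1m_s = 1.0 / SPEED_OF_SOUND_M_S
```

Under these assumptions, whatever happens at the source is repeated 1 m away after `travel_time_1m_s`, and higher frequencies have proportionally shorter wavelengths.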
Structure of sounds
What happens without structure?
Introducing structure
The bird and Chopin © Gabriel J. Arsante
Structure of sounds © Gabriel J. Arsante
What are sounds?
Structure at many time scales
Perceptual correlates:
–Melodies (1 s)
–Notes (0.1 s)
–Pitch (much faster than 0.01 s)
Peripheral processing of sounds
Inner Ear Middle Ear Outer Ear
Cross Section of Cochlea
“Travelling Wave” Along the Basilar Membrane Von Békésy
Travelling Wave Peaks at Different Locations As the Frequency Changes
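One way to make the place-frequency relationship concrete is the Greenwood function, a standard empirical fit that is not part of these slides. The sketch below uses the commonly quoted human constants (A = 165.4 Hz, a = 2.1, k = 0.88) as assumptions, with position expressed as a fraction of basilar membrane length measured from the apex.

```python
import math

# Greenwood-function constants for the human cochlea.
# These are assumed values from the published fit, not from the slides.
A_HZ = 165.4
ALPHA = 2.1
K = 0.88


def place_to_frequency_hz(x):
    """Frequency at which the travelling wave peaks at position x.

    x is the fractional distance along the basilar membrane,
    from 0.0 (apex, low frequencies) to 1.0 (base, high frequencies).
    """
    return A_HZ * (10.0 ** (ALPHA * x) - K)


def frequency_to_place(frequency_hz):
    """Inverse map: fractional position of the peak for a given frequency."""
    return math.log10(frequency_hz / A_HZ + K) / ALPHA
```

With these constants the map runs from roughly 20 Hz at the apex to roughly 20 kHz at the base, and equal steps along the membrane correspond to approximately equal frequency ratios over most of the range.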
Outer Hair Cells Inner Hair Cells
A simple neuron in the auditory system BF
The auditory pathways
Responses of simple neurons to complex sounds
A set of complex sounds
In consequence…
The neurogram
We get a very rich and precise representation of the incoming sound at the level of the auditory nerve
The sound and its components
Brahms, Geistliches Wiegenlied, Op. 91 no. 2
Kathleen Ferrier, Phyllis Spurr, Max Gilbert
Is that enough? (do we hear the spectrogram?)
What are the perceptual qualities of sounds? “The basic elements of any sound are loudness, pitch, contour, duration (or rhythm), tempo, timbre, spatial location, and reverberation.” (D.J. Levitin, This is Your Brain on Music: The Science of a Human Obsession, p.14)
The Long Road from Spectrogram to Perception How do we go from the ‘neurogram’ to ‘loudness, pitch, contour, duration (or rhythm), tempo, timbre, spatial location, and reverberation’?
Relationships with low-level features…
Loudness with sound intensity
–Encoded by some population-averaged activity
Pitch with periodicity
Pitch: examples
[Figure panels, plotted against time: pure tones; filtered clicks; iterated ripple noise (IRN); AM (3 kHz), SAM]
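Of the examples listed, iterated ripple noise is perhaps the easiest to sketch in code. The version below is a minimal delay-and-add construction under stated assumptions: Gaussian white noise, unit gain, and the "add-same" variant in which the running output is delayed and added back on each iteration. The function names, the 16 kHz sampling rate, and the 64-sample delay (4 ms, a 250 Hz pitch) are all illustrative choices, not details from the slides.

```python
import random


def iterated_ripple_noise(n_samples, delay_samples, n_iterations,
                          gain=1.0, seed=0):
    """Delay-and-add noise: each iteration adds a delayed copy of the signal."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iterations):
        # Shift the current signal by delay_samples, padding the start with zeros.
        delayed = [0.0] * delay_samples + signal[:-delay_samples]
        signal = [s + gain * d for s, d in zip(signal, delayed)]
    return signal


def normalized_autocorrelation(signal, lag):
    """Autocorrelation at the given lag, divided by the signal energy."""
    overlap = len(signal) - lag
    numerator = sum(signal[i] * signal[i + lag] for i in range(overlap))
    energy = sum(s * s for s in signal)
    return numerator / energy


# Assumed parameters: 16 kHz sampling, 64-sample delay (4 ms, i.e. 250 Hz).
DELAY = 64
plain_noise = iterated_ripple_noise(20000, DELAY, n_iterations=0)
irn = iterated_ripple_noise(20000, DELAY, n_iterations=4)
```

Plain noise has essentially no correlation at the delay, whereas the iterated version has a strong one. That added temporal regularity is what carries the pitch, even though the waveform remains noisy and never exactly repeats.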
Relationships with low-level features…
Loudness with sound intensity
–Encoded by some population-averaged activity
Pitch with periodicity
–Periodicity IS NOT frequency!
Contour with slow amplitude modulations
–Encoded in the range of 1-10 Hz very clearly at the level of A1 (e.g. Shamma and collaborators)
–But not slower than that (probably)
Duration/rhythm with ???
Tempo with ???
Timbre with spatial activation patterns (e.g. in A1)
Spatial location with ITD/ILD/spectral activation patterns
–Low-level information available at the CN/SOC
–But requires integration
Reverberation with ???????
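The point that periodicity is not frequency can be illustrated with a "missing fundamental" stimulus. The sketch below is a toy example under assumed parameters (8 kHz sampling, components at 600, 800, and 1000 Hz): the waveform repeats every 5 ms, corresponding to 200 Hz, although it contains no energy at 200 Hz. Autocorrelation is used here only as a simple periodicity estimate; it is not offered as a model of how the auditory system extracts pitch.

```python
import math

SAMPLE_RATE_HZ = 8000                      # assumed sampling rate
COMPONENTS_HZ = (600.0, 800.0, 1000.0)     # harmonics 3, 4, 5 of 200 Hz
WINDOW = 320                               # 8 full periods of 200 Hz

signal = [
    sum(math.sin(2.0 * math.pi * f * n / SAMPLE_RATE_HZ) for f in COMPONENTS_HZ)
    for n in range(400)
]


def autocorrelation(x, lag, window=WINDOW):
    """Sum of products of the signal with a copy of itself shifted by lag."""
    return sum(x[n] * x[n + lag] for n in range(window))


def spectrum_magnitude(x, frequency_hz, window=WINDOW):
    """Magnitude of the Fourier component at one frequency."""
    w = 2.0 * math.pi * frequency_hz / SAMPLE_RATE_HZ
    re = sum(x[n] * math.cos(w * n) for n in range(window))
    im = sum(x[n] * math.sin(w * n) for n in range(window))
    return math.hypot(re, im)


# Periodicity: the lag (searched from 5 to 60 samples) with the largest autocorrelation.
best_lag = max(range(5, 61), key=lambda lag: autocorrelation(signal, lag))
periodicity_hz = SAMPLE_RATE_HZ / best_lag

# Frequency content: compare the energy at 200 Hz with that at a real component.
energy_at_200 = spectrum_magnitude(signal, 200.0)
energy_at_800 = spectrum_magnitude(signal, 800.0)
```

In this example `periodicity_hz` comes out at the 200 Hz repetition rate, while `energy_at_200` is numerically zero and `energy_at_800` is large. The periodicity and the spectral content therefore give different answers to the question "what is the frequency of this sound?".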
The Long Road from Spectrogram to Perception
Pitch, timbre, phonemic identity, and so on are ‘separable’ – they are independent of each other
They represent high-level generalizations
–Many different sounds have the same pitch (violin and trumpet), same timbre (trumpet on two different tones), same phonemic identity (two different people talking)
–The neurograms of these pairs of sounds are very different from each other
The generalizations should be derivable from the neurogram, but are not explicitly represented at that level
The Long Road from Spectrogram to Perception Problem no. 1: we do not hear the physics of sounds, but rather their derived properties (Reverse hierarchies – we perceive high representation levels unless we make serious efforts to go down into the details)
The Long Road from Spectrogram to Perception
Problem no. 2: In natural conditions, sounds rarely occur by themselves We have to group and segregate ‘bits of sounds’ in order to form representations of ‘auditory objects’
What comes first, the sound or its properties? We may need to start by forming objects (solve problem no. 2) and only later assign properties to them (solve problem no. 1)
Hypothesis: the early auditory system (presumably up to the level of primary auditory cortex) deals with the formation of auditory objects
Evidence A: Object representation in primary auditory cortex
The auditory pathways
Primary auditory cortex is a higher brain area!
Visual system: photoreceptors, bipolar cells, retinal ganglion cells, LGN, V1, IT, face cells
Auditory system: hair cells, auditory nerve fibers, cochlear nucleus, superior olive, inferior colliculus, MGB, auditory cortex
Labels: frequency; sound level; localization and binaural detection; species-specific calls? auditory scene analysis?
A1 neurons have a large variety of frequency response areas (FRAs)
Memory in primary auditory cortex
Neurons in auditory cortex represent the weak components of sounds (evidence for the representation of auditory objects in primary auditory cortex)
Strong effects of weak backgrounds…
[Figure: axes labeled kHz and dB Attn; three panels with a time axis from 0 to 100 ms]
Some cortical neurons respond to weak noise in mixture with high-level tones
Tones in modulated and unmodulated background
Weak tones in strong noise (Las et al. 2005)
[Figure panels: Noise (bandwidth: BF, 10 Hz trapezoidal envelope); Tone (BF); Tone+Noise]
Responses to high-level tones in silence and to low-level tones in noise are similar
Evidence B: coding of surprising events in primary auditory cortex
[Figure: sequences of low- and high-frequency tones over time; panels labeled 95%, 50%, and 5%]
[Figure: responses at the low and high frequencies, deviant vs. standard; SSA = (formula missing)]
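The formula on the slide is not recoverable, so the sketch below should not be read as the author's definition. It assumes the contrast index commonly used in the stimulus-specific adaptation literature, (deviant − standard) / (deviant + standard), and a simple oddball sequence generator; the function names and the spike counts in the usage note are hypothetical.

```python
import random


def oddball_sequence(n_trials, deviant_probability, seed=0):
    """Random sequence in which 'deviant' tones are rare and 'standard' tones common."""
    rng = random.Random(seed)
    return [
        "deviant" if rng.random() < deviant_probability else "standard"
        for _ in range(n_trials)
    ]


def ssa_index(deviant_response, standard_response):
    """Assumed contrast index: positive when the same tone evokes a larger
    response as a deviant than as a standard, zero when there is no difference."""
    total = deviant_response + standard_response
    if total == 0:
        return 0.0  # no response in either condition
    return (deviant_response - standard_response) / total
```

As a usage example with hypothetical numbers, a neuron that fires 30 spikes to a tone when it is the deviant and 10 spikes to the same tone when it is the standard has `ssa_index(30, 10)` equal to 0.5.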
…Also with spikes…
Evidence C: Perceptual qualities such as pitch are coded outside primary auditory cortex
Activation of auditory cortex by noise and pitched stimuli
Activation by intelligible speech
Take-home messages
–Auditory perception is far removed from the ‘physical’, low-level representation of sounds
–A major problem of early processing is the definition of the ‘objects’ to which properties will be assigned
–There is evidence that objects are defined first, and that properties are assigned in higher brain areas
Reverse Hierarchy Theory
The hierarchical trade-offs that dictate the relations between processing and perception
We perceive the high-order constructs rather than the low-level physics
Interactions between high- and low-level representations
From Hochstein and Ahissar 2002
Change blindness
Name the color of the letters
נשר (eagle)
אדום (red)
כחול (blue)
Visual Reverse Hierarchy Theory (RHT) (Ahissar & Hochstein, 1997; Hochstein & Ahissar, 2002)
Feed-forward hierarchy; feedback reverse hierarchy
Low levels are sensitive to fine temporal cues, at μs resolution
Phonological/semantic level: … day, bay, night, dream
Initial perception is based on high levels, which represent phonological entities
See: Nahum, Nelken and Ahissar, PLoS 2008
We can either hear the sounds or understand the words, but not both at the same time