Auditory, Tactile & Vestibular Systems
Human Factors PSYC 2200 Michael J. Kalsher © 2017 Department of Cognitive Science
Sound: The Auditory Stimulus
Sound waves are rhythmic vibrations of air molecules that produce alternating compression and expansion of the air. This stimulus can be represented as a sine wave with several basic characteristics, including frequency, amplitude, and timbre.
Frequency (perceived pitch)
- Measured in hertz (Hz; cycles per second).
- Range of human hearing: roughly 20 Hz to 20,000 Hz.
- Sensitivity: humans are most sensitive to sounds around 4,000 Hz (within the range of speech frequencies).
- Threshold for hearing: age- and experience-dependent.
Amplitude (perceived loudness)
Expressed as a ratio of sound pressures, measured in decibels (dB).
Sound intensity (dB) = 20 log10(P1/P2)
Where:
- P1 = pressure of the sound wave of interest
- P2 = reference pressure (0.0002 dynes/cm² at 1 kHz)
Based on a log scale. Therefore, it takes a large increase in power to increase (perceived) loudness a little.
Timbre (complexity of sound)
The Decibel Scale
Amplitude can be construed in two ways:
- In absolute terms: a reference sound (P2) is fixed at a value near the threshold of hearing. The reference sound is a pure tone of 1,000 Hz at 20 micronewtons per square meter.
- In relative terms: characterized as the ratio of two sounds (e.g., an alerting signal contrasted with ambient sound).
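The decibel formula can be tried out with a short sketch. The function name and the example pressures below are illustrative assumptions, not values from the slides; the reference pressure is the 20 micronewtons per square meter (20 micropascals) given above.

```python
import math

# Reference pressure near the threshold of hearing (20 micropascals).
REFERENCE_PRESSURE_PA = 20e-6

def sound_level_db(pressure, reference=REFERENCE_PRESSURE_PA):
    """Level in dB of `pressure` relative to `reference` (same units)."""
    return 20 * math.log10(pressure / reference)

# Absolute use: a hypothetical 0.02 Pa sound against the fixed reference.
absolute_db = sound_level_db(0.02)

# Relative use: a hypothetical alerting signal with 10x the pressure
# of the ambient sound.
relative_db = sound_level_db(0.2, reference=0.02)

print(round(absolute_db, 1), round(relative_db, 1))
```

Because the scale is logarithmic, every tenfold increase in pressure adds only 20 dB.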
Decibel Ratings for Selected Sounds
- 180 dB: Rocket launching pad (hearing loss)
- 140 dB: Gunshot at close range
- 120 dB: Rock concert in front of speakers (immediate danger)
- 110 dB: Loud thunder
- 90 dB: Truck or bus
- 75-85 dB: Noisy restaurant (critical level begins here)
- 60 dB: Normal conversation
- 50 dB: Calm restaurant
- 40 dB: Quiet office, household sounds
- 30 dB: Library
- 20 dB: Whisper
- 10 dB: Normal breathing
- 0 dB: Threshold of hearing
Measuring Sound: Sound Intensity Meters
Sound intensity meters possess different scales that enable sound to be measured more specifically within particular frequency ranges.
- A-scale: differentially weights sounds to reflect the characteristics of human hearing, giving the greatest weight to those frequencies where we are most sensitive.
- C-scale: weights all frequencies nearly equally.
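To make the difference between the two scales concrete, the sketch below evaluates the commonly published analog approximations of the A- and C-weighting curves. The constants come from the standard frequency-weighting formulas (IEC 61672), not from these slides, and the function names are illustrative.

```python
import math

def a_weighting_db(f):
    """Approximate A-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20 * math.log10(num / den) + 2.00

def c_weighting_db(f):
    """Approximate C-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    num = (12194.0 ** 2) * f2
    den = (f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2)
    return 20 * math.log10(num / den) + 0.06

# Both curves are normalized to about 0 dB at 1,000 Hz; the A-scale
# discounts low frequencies heavily, the C-scale hardly at all.
for freq in (100, 1000, 4000):
    print(freq, round(a_weighting_db(freq), 1), round(c_weighting_db(freq), 1))
```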
Sound Measurement in the “Real World”: The Context
Sound Measurement in the “Real World”: Measurement Tools
Other Sound Characteristics: Temporal Characteristics
The sound envelope helps us distinguish the sound of a foghorn from that of a gunshot or a school bell.
Other Sound Characteristics: Location … in front of, in back of, omnidirectional
The Ear: The Sensory Transducer
The 3 primary components of the ear: Pinna; Outer and Middle Ear; Inner Ear
Pinna, Outer & Middle Ear
The Pinna
- Composed of cartilage, the pinna helps to collect sound.
- Due to its asymmetrical shape, it provides directional information.
The Outer and Middle Ear
- Purpose is to conduct and amplify sounds.
- Primary components: tympanic membrane (ear drum) and the ossicles: malleus (hammer), incus (anvil), and stapes (stirrup).
- Muscles of the middle ear respond to loud noises by reflexively contracting (termed the aural reflex), attenuating the amplitude of vibration before it is conveyed to the inner ear.
- Sources of breakdown or deafness at this level include wax build-up and ear drum rupture.
Inner Ear: Structure and Function
Structure
- The cochlea
- Basilar membrane
- Inner and outer hair cells (tectorial membrane)
- Auditory nerve
Function
- Pressure changes applied to the oval window by the stapes cause pressure changes inside the cochlea, which set the basilar membrane into up-and-down motion.
- Sound is converted (transduced) to neural form by the differential bending of hair cells against the tectorial membrane.
- Neural signals are compared between the two ears to determine the delay and amplitude differences between them, which provide location cues. These features will be identical only if a sound is presented directly along the midsagittal plane of the listener.
- Cochlear cilia are arranged in groups that look like a "V." Each "V" is tuned to a specific sound frequency. The smaller, thinner, most sensitive "V" groups, tuned to the highest sound frequencies (and the ones most easily damaged), are at one end of the cochlea; the largest, strongest "V" groups, tuned to the lowest frequencies, are at the helicotrema end.
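The delay cue described above can be illustrated with a deliberately simple geometric sketch. It assumes a straight-line path difference between the ears, an ear spacing of 0.18 m, and a speed of sound of 343 m/s; none of these values or names come from the slides, and real heads are better described by more detailed models.

```python
import math

def interaural_time_difference(azimuth_deg, ear_spacing_m=0.18,
                               speed_of_sound_m_s=343.0):
    """Arrival-time difference (seconds) between the two ears.

    azimuth_deg is the source direction measured from straight ahead;
    0 degrees lies on the midsagittal plane.
    """
    path_difference = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference / speed_of_sound_m_s

for angle in (0, 30, 90):
    microseconds = interaural_time_difference(angle) * 1e6
    print(angle, round(microseconds))
```

A source on the midsagittal plane gives zero delay, which matches the statement that the two ears' signals are identical only in that case.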
How do we perceive pitch? Place Theory
Suggests that sounds of different frequencies cause maximal displacement at different places along the floor of the basilar membrane. The structure of the basilar membrane is the key:
- High-frequency sounds cause maximal displacement closer to the oval window (near the stapes). The stapes end of the basilar membrane is stiffer and thinner, and contains hair cells that seem better at detecting higher-frequency sounds.
- Low-frequency sounds cause maximal displacement at the helicotrema end of the basilar membrane. The helicotrema end is looser and wider, and contains hair cells that seem better at detecting lower-frequency sounds.
Place Theory: Strengths and Weaknesses
Research verifies the fact that maximal deformation of the basilar membrane correlates with frequency. High frequency sounds selectively vibrate the basilar membrane of the inner ear near the entrance port (the oval window). Lower frequencies travel further along the membrane before causing appreciable excitation of the membrane. The basic pitch-determining mechanism is based on the location along the membrane where the hair cells are stimulated. But … our ability to make fine discriminations among very low frequency sounds (ones below 500 Hz) cannot be accounted for by the decreasing changes in place of maximal displacement.
How do we perceive pitch? Frequency Theory
Suggests that sounds of different frequency (pitch) cause different rates of neural firing:
- High pitch: high rates of neural firing
- Low pitch: low rates of neural firing
This theory is accurate for sound frequencies up to 1,000 Hz, the maximum rate of firing for individual neurons. Frequencies above 1,000 Hz require a modification termed the “volley principle.”
Volley principle: patterned neural firing among clusters of neurons to match higher sound frequencies.
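A toy sketch of the volley principle follows. It assumes that neurons in a cluster simply take turns, one stimulus cycle each, so that no single neuron exceeds the 1,000 Hz ceiling stated above; the round-robin scheme and function names are illustrative simplifications, not a physiological model.

```python
import math

MAX_FIRING_RATE_HZ = 1000  # per-neuron ceiling stated in the slides

def neurons_needed(frequency_hz, max_rate_hz=MAX_FIRING_RATE_HZ):
    """Smallest cluster size that keeps each neuron at or below its ceiling."""
    return max(1, math.ceil(frequency_hz / max_rate_hz))

def volley_schedule(frequency_hz, n_cycles):
    """Assign each stimulus cycle to a neuron index in round-robin order."""
    cluster_size = neurons_needed(frequency_hz)
    return [cycle % cluster_size for cycle in range(n_cycles)]

# A 3,000 Hz tone: three neurons each fire on every third cycle
# (1,000 Hz apiece), and together they mark every cycle.
print(neurons_needed(3000), volley_schedule(3000, 6))
```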
The Auditory Experience: Overall Quality of Sound
Determined by: The set of frequencies that comprise the sound stimulus. The sound envelope. Timbre (the characteristic that makes a trumpet sound different from a flute; determined by the combination of harmonic frequencies that lie above the fundamental frequency of the sound). Temporal characteristics of the sound envelope and rhythm of successive sounds.
Loudness and Pitch
Loudness correlates with sound intensity, but imperfectly so. Perceived loudness is better predicted through psychophysical scaling:
- An experimental approach to discover the relationship between physical intensity and psychological experience.
- The basic finding is that equal increases in sound intensity on a decibel scale do not create equal increases in loudness.
- The scale that relates physical intensity to the psychological experience of loudness is expressed in units called sones.
Psychophysical Scaling
- One sone is established arbitrarily as the loudness of a 40 dB tone of 1,000 Hz.
- A tone perceived to be twice as loud = 2 sones.
- Research shows that perceived loudness doubles (approximately) with each 10 dB increase in sound intensity.
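The doubling rule above can be written as a one-line formula. The sketch assumes a 1,000 Hz tone and treats the approximate doubling per 10 dB as exact; the function name is illustrative.

```python
def sones_from_db(level_db_at_1khz):
    """Loudness in sones for a 1,000 Hz tone.

    Anchored at 1 sone = 40 dB, doubling with every 10 dB increase.
    """
    return 2 ** ((level_db_at_1khz - 40) / 10)

for level in (30, 40, 50, 60, 70):
    print(level, sones_from_db(level))
```

Going from 40 dB to 70 dB is a 30 dB increase but only an eightfold increase in loudness, which is why equal decibel steps do not feel like equal loudness steps.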
Frequency Influence: Equal Loudness Curves
- Equal loudness curves are described in phons.
- 1 phon = 1 dB of loudness of a 1,000 Hz tone (the standard for calibration).
- At low pressure levels, humans are most sensitive to the 500-5,000 Hz range (where most speech sounds are).
Figure: equal loudness contours showing the intensity of different variables as a function of frequency. All points lying on a single curve are perceived as equally loud.
Sound Masking
Sounds can be masked by other sounds.
Important design principles to counteract the effects of masking:
- The minimum intensity difference necessary to ensure that a sound can be heard is about 15 dB (above the mask).
- Sounds tend to be masked most by sounds in a critical frequency band surrounding the sound that is masked.
- Low-pitch (low-frequency) sounds tend to mask high-pitch sounds more than the converse.
Auditory Alarms: Advantages
- The auditory system is omnidirectional: a person doesn’t have to be looking in order to benefit from the warning (it is harder to close the ears than to close the eyes).
- Under certain circumstances, auditory alarms induce a greater level of compliance than visual alarms (e.g., Wogalter, Kalsher, & Racicot, 1993).
- Redundancy across visual and/or tactile modalities can enhance the effectiveness of alarms.
- If the volume of the auditory warning is set appropriately, it is almost guaranteed to get the attention of the operator, whereas visual signals may be missed (especially in high-workload environments).
- Sound can be used when sight may be degraded (e.g., night time, bright sunlight, glare, impaired vision).
- Auditory perception is not affected as much as visual perception during periods of high g-forces or anoxia.
Auditory Alarms: Disadvantages
- Can cause a panicked reaction and can make it hard for the crew to communicate. This can result in the crew directing their efforts to cancelling the alarm rather than to the problem that caused it.
- There may be too many warning sounds for the pilot to comprehend (as many as 15 on a Boeing aircraft).
- Warning sounds may not be designed as a set, so different alarms may sound very similar.
- The sounds may be too loud (levels over 100 dB at the pilot’s ear), and they start sounding at full intensity to overcome ambient noise.
- If two warning sounds come on at the same time, the combined sound can make it difficult to identify either one.
- High-frequency tones are often used, which are not localizable by the human ear.
Alarms: An illustration of some of the problems with auditory alarms
I was flying in a Jetstream at night when my peaceful reverie was shattered by the stall audio warning, the stick shaker, and several warning lights. The effect was exactly what was not intended; I was frightened numb for several seconds and drawn off instruments trying to work out how to cancel the audio/visual assault, rather than taking what should be instinctive actions. The combined assault is so loud and bright that it is impossible to talk to the other crew member and action is invariably taken to cancel the cacophony before getting on with the actual problem (Patterson, 1990).
Criteria for Alarms
- Must be heard above background ambient noise.
  - Should be at least 15 dB above the threshold of hearing in the presence of the noise. This typically requires about a 30 dB difference above the noise level to guarantee detection.
  - Sound components should be distributed across several frequencies to avoid masking of the alarm by the noise of the malfunctioning equipment or system.
- Should not be above danger levels for hearing whenever possible.
  - Danger level begins at ... dB.
  - Careful selection of frequencies can often satisfy both this criterion and the 15 dB-above-noise criterion.
- Should not be overly startling.
  - There is a trade-off between “too loud” and “too soft.”
  - Can be addressed by tuning the rise time of the alarm pulse.
- Should be informative.
  - Signal the nature of the emergency.
  - Signal the appropriate action to be taken (ideally).
  - Too many types of alarms can produce confusion.
- Should not disrupt the processing of other signals or any background speech communications that may be essential to deal with the alarm (aircraft, medical equipment, alarm systems).
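The level criteria above can be expressed as a simple check. This is a sketch only: the function name and example values are hypothetical, and the hearing-danger level is left as a required parameter because the slide does not state it.

```python
def alarm_level_check(alarm_db, ambient_db, danger_db):
    """Compare an alarm level against the slide's level criteria.

    danger_db must be supplied by the caller; the example value used
    below is a placeholder, not a figure taken from the slides.
    """
    margin = alarm_db - ambient_db
    return {
        "meets_15_db_minimum": margin >= 15,
        "meets_30_db_for_detection": margin >= 30,
        "below_danger_level": alarm_db < danger_db,
    }

# Hypothetical case: 85 dB alarm in 60 dB ambient noise.
print(alarm_level_check(alarm_db=85, ambient_db=60, danger_db=100))
```

In a loud environment the 30 dB margin and the danger ceiling can conflict, which is where spreading the alarm across several frequencies becomes useful.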
Designing Alarms: Some Guidelines
- Conduct an environmental and task analysis. The goal: to understand the quality and intensity of other sounds present, so as to guarantee detectability while minimizing disruption of other tasks.
- Make alarms maximally discriminable along four important dimensions:
  - Rhythm (synchronous vs. asynchronous)
  - Pitch (high vs. low)
  - Envelope (rising vs. falling)
  - Timbre (quality of sound; flute vs. horn)
- Enhance alarm effectiveness through the design of individual sounds (next slide).
Designing Alarms: Individual Sounds
- The rise envelope should not be too abrupt (at least a 20 ms ramp-up).
- The inter-pulse interval within a pulse train can be used to create unique and distinctive rhythms that avoid the confusability problem.
- Changing intensity can be used to produce “perceived urgency.”
- Technology can be incorporated to produce “smart alarms” that sense when action has, or has not, been taken.
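The ramp-up guideline can be sketched as an amplitude envelope for a single alarm pulse. The linear ramp shape, the matching fall ramp, the 1,000 samples-per-second rate, and the function name are all assumptions made for illustration; the slide specifies only a minimum 20 ms rise.

```python
def pulse_envelope(duration_ms, ramp_ms=20, sample_rate_hz=1000):
    """Amplitude samples (0 to 1) for one pulse with linear on/off ramps."""
    n_samples = int(duration_ms * sample_rate_hz / 1000)
    ramp_samples = int(ramp_ms * sample_rate_hz / 1000)
    envelope = []
    for i in range(n_samples):
        # Distance from pulse start and pulse end, scaled by ramp length.
        rise = (i + 1) / ramp_samples if ramp_samples else 1.0
        fall = (n_samples - i) / ramp_samples if ramp_samples else 1.0
        envelope.append(min(1.0, rise, fall))
    return envelope

env = pulse_envelope(duration_ms=100)
# Amplitude climbs gradually over the first 20 ms instead of starting
# at full intensity.
print(env[0], env[9], env[19], env[50])
```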
Voice Alarms
Advantages
- Compared to “symbolic” alarm sounds, voice alarms are not dependent on learning (e.g., “Engine Fire” or “Stall!”).
Disadvantages
- Can be confused with, and are less discriminable from, a background of other voice communications.
- May be more susceptible to frequency-specific masking noise.
- Problematic for “non-native” speakers.
* Advisable to use a redundant system that combines the distinctive features of the non-speech alarm sound with the more informative features of synthetic voice (redundancy gain).
False Alarms: The “Cry Wolf” Problem
When sensing low-intensity signals from the environment, alarm systems sometimes make mistakes: inferring that nothing happened when something did (miss), or that something happened when it did not (false alarm). Too many false alarms can cause users to distrust alarms, ignore them, or try to disable them.
Steps to Avoid False Alarms
- The alarm criterion should not be overly sensitive!
- Technology can play a role; more sophisticated design algorithms may be developed to improve the overall sensitivity of an alarm system.
- Users can be trained about the inevitable tradeoff between misses and false alarms; frame-of-reference training can help them to accept false alarm rates as part of an automated protection “system.”
- Consider the use of graded or likelihood alarm systems in which more than a single level of alert is provided. Example: burning toast would trigger an alarm of reduced intensity compared to a larger fire.
Signal Detection Theory (SDT)
Assumes that sensitivity is a function of: (a) sensory capabilities and (b) the signal-to-noise ratio. Developed to separate sensitivity from motivational factors (e.g., response bias). Derives from the fact that we have noisy nervous systems. Our willingness to say “I see it” or “I hear it” depends on both sensitivity and motivational factors. A new variant--Fuzzy SDT--speaks in terms of degree of signal present or the degree of danger or threat. Implies that the variable can take on a continuous range of values.
Possible Outcomes of SDT
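The four possible outcomes in signal detection theory are the hit, miss, false alarm, and correct rejection. As a hedged illustration of how sensitivity is separated from response bias, the sketch below computes the standard equal-variance Gaussian indices d′ (sensitivity) and c (criterion) from outcome counts. The function name and the example counts are hypothetical.

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from counts of the four SDT outcomes.

    Assumes the equal-variance Gaussian model. Hit and false-alarm rates
    of exactly 0 or 1 are not handled (inv_cdf is undefined there).
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf  # standard normal quantile function
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Hypothetical alarm log: 84 hits, 16 misses, 16 false alarms,
# 84 correct rejections.
d_prime, criterion = sdt_measures(84, 16, 16, 84)
print(round(d_prime, 2), round(criterion, 2))
```

Making the alarm criterion more liberal raises hits and false alarms together; d′ stays roughly constant while c moves, which is the miss/false-alarm tradeoff described in the “Cry Wolf” slide.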
The Sound-Transmission Problem
In 1977, a collision occurred between two jets at Tenerife airport in the Canary Islands. One of the jets, a KLM 747, was poised at the end of the runway, engines primed, and the pilot was in a hurry to take off because of deteriorating weather conditions. Meanwhile, a Pan American plane that had just landed was still on the same runway trying to find its way off. The air traffic controller instructed the pilot of the KLM as follows: “Okay, stand by for takeoff and I will call.” Because of the quality of the radio transmission and his desire to proceed with the takeoff, the pilot instead heard “Okay ... take off.” What role did bottom-up processing play in the incident just described? Top-down?
The Sound Transmission Problem: Speech Components
- Waveform: variation in air pressure (intensity) over time.
- Spectrum: intensity of varying frequencies across time, per phoneme, or per word.
- Speech spectrograph (see figures on the following page): shows speech along three dimensions: (1) time; (2) frequency; and (3) amplitude.
The Nature of the Speech Stimulus
The Speech Spectrograph: the sound waves of a typical speech signal.
[Figure panels: Voice Time Signal; Voice Spectrum]
Speech Spectrograph: Time Dependency and the Speech Envelope
Spectral Representation of Speech
[Figure panels: speech spectrograph of the letter “d”; speech spectrograph of the words “human factors”]
Many key properties are captured in time-dependent changes in the spectrum (in the envelope of the sound).
Masking Effects of Noise
The potential of an auditory signal to be masked by other sounds depends on:
- Intensity (power) of the signal
- Frequency of the signal
Circumstances likely to lead to masking:
- Power/intensity is much greater for vowels than for consonants; therefore, consonants are more susceptible to the effects of masking. This is problematic because consonants convey more information than vowels (e.g., “fly to” vs. “fly through”).
- Female voices (typically at a higher frequency than male voices) are more vulnerable to masking.
Voice Synthesizers
The level of fidelity of voice synthesizers must be sufficient to:
- Produce recognizable speech that can be heard in noise.
- Support “easy listening.”
  - Listening to synthetic speech takes more mental resources than listening to natural speech.
  - It can produce greater interference with other ongoing tasks that must be accomplished concurrently with the listening task.
  - Memory is worse for synthesized speech since more processing is required.
Synthesized Speech Performance
Intelligibility: correct word identification within meaningful sentences as the task.

Speech source        Intelligibility   Error rate
Human speech         99.2%             1%
Best synthesized     %                 3%
Worst synthesized    83.7%             35%
Guidelines for Synthesized Speech
- Voice warnings should be presented in a voice that is qualitatively different from other voices that will be heard in the situation.
- If synthesized speech is used for other types of information in addition to warnings, some means of directing attention to the voice warning might be required.
- Maximize the intelligibility of the messages.
- Maximize user acceptance by making the voice as natural as possible.
- Consider providing a replay mode in the system so users can replay the message if desired.
- Give the user the ability to interrupt the message; this is especially important for experienced users who do not need to listen to the entire message each time the system is used.
- Provide an introductory or training message to familiarize the user with the system’s voice.
- Use synthesized speech sparingly and only where it is appropriate and acceptable to the users.
Advantages of Speech Displays
- Relatively fast transmission rate (250 wpm) compared to other established systems; transmission by a person skilled in Morse code is about 30 wpm.
- Does not require extensive training because people have considerable prior experience with speech.
- Good for poor readers, children, and people who cannot read.
- Can tell the listener directly what the problem or situation is.
Disadvantages of Speech Displays
- Can’t have multiple speech displays at once.
- Can’t be a long message (too long to communicate).
Problems might be overcome by:
- making the voice distinguishable.
- prioritizing the messages.
- using brief speech messages to capture attention, describe the problem concisely, and tell the receiver to refer to a more extensive visual print description.
- designing redundant print and speech warnings where practical.
Echoic Memory
Voice is transient: once spoken, it is gone.
The human information-processing system (STM) is designed to prolong the duration of the spoken word for only a few seconds. Beyond this time, spoken information must be actively rehearsed.
Sources of Noise-Induced Hearing Loss
- Masking: loss of sensitivity to a signal while the noise is still present.
- Temporary threshold shift (TTS): large immediately after the noise is terminated, then declines over the next few minutes. Typically expressed as the loss in hearing 2 minutes after the noise is terminated.
- Permanent threshold shift (PTS): also termed occupational deafness. Stems from louder and longer exposure to noise. Tends to be most pronounced at higher frequencies, usually greatest at around 4,000 Hz.
Preventing Hearing Loss
Aging is responsible for a large portion of hearing loss, particularly in high frequency regions. Hearing loss also results from noisy work environments. In the U.S., OSHA has taken steps to prevent noise-induced hearing loss among workers by establishing standards that trigger remedial action. These standards are based on a time-weighted average (TWA) of noise experienced in the workplace which trades off the intensity of exposure against its duration.
Time Weighted Average
TWAs are typically computed on the basis of noise dose meters that can be worn by individual workers (over the course of a workday).
- TWA > 85 decibels = Action Level: employers are required to implement a hearing protection plan.
- TWA > 90 decibels = Permissible Exposure Level (PEL): employers are required to take steps toward noise reduction.
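The intensity-versus-duration tradeoff behind the TWA can be sketched with the computation described in the OSHA occupational noise standard (29 CFR 1910.95, Appendix A): 90 dBA is permitted for 8 hours, and the permitted time halves for every 5 dB increase. The formulas are not given in the slides, the function names are illustrative, and the sketch ignores the threshold levels below which exposure is not counted.

```python
import math

def permissible_hours(level_dba):
    """Permitted exposure time at a level (90 dBA for 8 h, 5 dB exchange rate)."""
    return 8 / (2 ** ((level_dba - 90) / 5))

def noise_dose_percent(exposures):
    """Daily dose in percent from (level_dba, hours) pairs."""
    return 100 * sum(hours / permissible_hours(level)
                     for level, hours in exposures)

def twa_from_dose(dose_percent):
    """Eight-hour time-weighted average level (dBA) for a given dose."""
    return 16.61 * math.log10(dose_percent / 100) + 90

# Hypothetical workday: 4 h at 95 dBA, then 4 h at 85 dBA.
dose = noise_dose_percent([(95, 4), (85, 4)])
print(round(dose), round(twa_from_dose(dose), 1))
```

A dose of 100% corresponds to a TWA of 90 dBA (the PEL), and a dose of 50% to about 85 dBA (the action level).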
Noise Remediation: Signal Enhancement (if TWA is < 85 dB)
Bottom-up Solutions:
- Analyze the spectral content of the masking noise; then use signal spectra that have the least overlap with the noise content.
- Use lower-frequency sounds, or earphones to bring the sound closer to the operator’s ear.
Top-down Solutions:
- Use redundancy. Face-to-face mode provides redundant cues (lip movement) that are not available when the listener cannot see the speaker.
- Use the phonetic alphabet (“alpha, bravo, charlie ...”).
- Use common words or standardized communications procedures to decrease the chances of error.
Reducing Noise in the Workplace: Source and Environment
Source (Equipment and Tool Selection)
- Ventilation systems, fans, and tools vary in the sound they produce (an important point to consider before buying).
- Many sources of noise can be alleviated through the use of damping materials.
- The irritating effects of noise are greater at high frequencies.
Environmental Noise
- Change the environment near the source: sound-absorbing walls, ceilings, and floors can be effective in decreasing sound from reverberations.
- Reposition workers relative to the noise source; this is more effective when there is only a single sound source.
Reducing Noise in the Workplace: Listener
Listener (Workers)
- Ear protection devices must be made available to workers if the noise level exceeds the action level (TWA > 85 dB).
- Types of ear protection:
  - Ear plugs: fit inside the ear (most likely to be worn improperly)
  - Ear muffs: fit over the top of the ear
- Need to consider the device’s Noise Reduction Rating (NRR). The manufacturer’s “specified” NRR is typically greater than the noise reduction actually experienced by users, in part because testing is done under laboratory conditions.
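One common way to account for the gap between the labeled NRR and real-world protection is the adjustment described in OSHA guidance for A-weighted measurements: subtract 7 dB from the NRR, then optionally apply a 50% safety factor. This adjustment is not stated in the slides; the function name and example values below are illustrative.

```python
def protected_exposure_dba(twa_dba, nrr, derate=0.5):
    """Estimated exposure under a hearing protector.

    Subtracts (NRR - 7) from an A-weighted TWA, scaled by `derate`
    (0.5 applies a 50% safety factor; 1.0 applies none).
    """
    return twa_dba - (nrr - 7) * derate

# Hypothetical worker: 95 dBA TWA, ear plugs labeled NRR 29.
print(protected_exposure_dba(95, 29))              # with the 50% safety factor
print(protected_exposure_dba(95, 29, derate=1.0))  # label value minus 7 only
```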
Ear Protection
Ear Muffs
- Can double as headphones through which critical signals can be delivered.
Ear Plugs
- Offer greater overall protection if worn properly.
- More likely than ear muffs to be worn improperly.
For both, comfort becomes an issue; uncomfortable devices may be disregarded despite their effectiveness.
Understanding Speech
- Intelligibility: the degree to which a speech message is correctly recognized.
- Quality is also important, but is usually defined by personal preference and prior experience.
- Intelligibility is context dependent.
Human Communication Contributing Factors to Speaker Effectiveness
- Longer syllable duration
- Greater speech intensity
- Infrequent pauses
- Greater variation observed in fundamental frequency
When it comes to communication, humans are adaptive:
- In noisy environments, we automatically adjust our voice levels to ensure listeners hear us, and we adjust our bodies and ears to ensure we hear the “target” message.
- Sometimes we are unaware of our propensity to do this. Example: speaking loudly to others while listening to music through headphones.
The Other Senses Tactile and Haptic Senses
Sensory receptors under skin respond to pressure and relay information to the brain regarding subtle changes in force applied. Senses also provide haptic info regarding the shape of manipulated objects and things.
Proprioception and Kinesthesis
Proprioceptive Channel
- Receptor system located within all joints.
- Conveys a representation of all joint angles to the brain (perception of limb position in space).
Kinesthetic Channel
- Conveys the sense of motion of the limbs as exercised by the muscles.
Vestibular Senses: Contributors to our sense of balance
Semicircular canals and vestibular sacs are receptors located deep within the inner ear. The head can rotate on three axes; the three semicircular canals are aligned one to each axis.
Motion Sickness
Normally, visual and vestibular senses convey compatible and redundant information. Motion sickness tends to occur when the information reaching different sensory systems conflicts.
