Auralization Lauri Savioja (Tapio Lokki) Helsinki University of Technology, TKK.

AGENDA, 8:45 – 9:20
- Auralization, i.e., sound rendering
- Impulse response
  - Basic principle + Marienkirche demo
- Source signals and modeling of source directivity
- Modeling from a perceptual point of view
- Dynamic auralization
- Evaluation of auralization quality
- Spatial sound reproduction
  - Headphones
  - Loudspeakers

Impulse response of a room (10 meters × 7 meters)

Impulse response of a room

Impulse response
- A linear time-invariant (LTI) system can be modeled with an impulse response
- The output y(t) is the convolution of the input x(t) and the impulse response h(t): y(t) = x(t) * h(t)
- In discrete form the convolution becomes a sum: y[n] = Σ_k x[k] h[n − k]
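As an illustration, the discrete convolution sum can be written directly in a few lines of Python. This is a naive O(N·M) sketch, fine for toy signals but far too slow for real room responses, where FFT-based convolution is used instead:

```python
def convolve(x, h):
    """Discrete convolution y[n] = sum_k x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

# A unit impulse passed through the "room" returns the impulse response itself.
ir = [1.0, 0.0, 0.5, 0.25]          # toy room impulse response
print(convolve([1.0], ir))          # -> [1.0, 0.0, 0.5, 0.25]
```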

Measured (binaural) impulse response of Tapiola concert hall

Two goals of room acoustics modeling
- Goal 1: room acoustics prediction
  - Static source and receiver positions
  - No real-time requirement
- Goal 2: auralization, sound rendering
  - Possibly moving source(s) and listener, even moving geometry
  - Both off-line and interactive (real-time) applications
  - Requires anechoic stimulus signals (Binaural rendering; Lokki, 2002)

Goal 2: Auralization / sound rendering
- "Auralization is the process of rendering audible, by physical or mathematical modeling, the sound field of a source in a space, in such a way as to simulate the binaural listening experience at a given position in the modeled space." (Kleiner et al. 1993, JAES)
- Sound rendering: plausible 3-D sound, e.g., in games
- 3-D model → spatial IR * dry signal = auralization

Auralization
- Goal: plausible 3-D sound, authentic auralization
- The most intuitive way to study room acoustics prediction results; not only for experts
- Requires an anechoic stimulus signal
- Reproduction with binaural or multichannel techniques
- The impulse response must also contain spatial information

Auralization, input
- Anechoic stimulus signal(s)!
- Geometry + material data
- Source(s) and receiver(s) locations and orientations

Auralization, modeling  Source(s): omnidirectional, sometimes directional  Medium:  physically-based sound propagation in a room  perceptual models, i.e., artificial reverb  Receiver: spatial sound reproduction (binaural or multichannel)

Marienkirche, concert hall in Neubrandenburg (Germany)

source – medium – receiver (Savioja et al. 1999, Väänänen 2003)

Source Modeling – stimulus signal
- Stimulus
  - Sound signal synthesis
  - Anechoic recordings

Source Modeling – Radiation
- Directivity is a measure of the directional characteristics of a sound source
- Point sources: omnidirectional, or with frequency-dependent directivity characteristics
- Line and volume sources
- Database of loudspeakers
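As a minimal illustration of a frequency-independent point-source directivity pattern, an ideal cardioid (an assumed example pattern, not a model taken from these slides) can be evaluated as:

```python
import math

def cardioid_gain(theta):
    """Directivity gain of an ideal cardioid source:
    1.0 on-axis (theta = 0), 0.0 directly behind (theta = pi)."""
    return 0.5 * (1.0 + math.cos(theta))

print(cardioid_gain(0.0))        # full gain on-axis
print(cardioid_gain(math.pi))    # no radiation directly behind
```

A real source model would store such gains per frequency band and per direction, typically from a measured database.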

Anechoic stimulus signals
- In a concert hall the typical sound source is an orchestra
- Anechoic recordings are needed, including the directivity of the instruments
- We have just completed such recordings (demo)
- All recordings were made with 22 microphones
- The recordings are publicly available for academic purposes
- Contact:

Sound field decomposition (Svensson, AES 22nd Int. Conf. 2002): diffuse reflections handled by surface sources

Computation vs. human perception: frequency resolution and time resolution (Svensson & Kristiansen 2002)

Two approaches: perceptually-based and physically-based (Väänänen, 2003)

Auralization: Two approaches (1)
- Perceptually-based modeling
  - The impulse response is not computed from a geometry; a "statistical" response is applied
  - Psychoacoustic (subjective) parameters are used to tune the response, e.g. reverberation time, clarity, warmth, spaciousness
- Applications: music production, teleconferencing, computer games, ...

Auralization: Two approaches (2)
- Physically-based modeling
  - Sound propagation and reflections from boundaries are modeled based on physics
  - The impulse response is predicted from the geometry; its properties depend on the surface materials, on the directivity and position of the sound source(s), and on the position and orientation of the listener
- Applications: prediction of acoustics, concert hall design, virtual auditory environments for games and virtual reality, education, ...

Dynamic auralization (≈ sound rendering)
- Method 1: a grid of impulse responses is computed and convolution is performed with interpolated responses
  - Applied in the CATT software
- Method 2: "parametric rendering"

Typical Auralization System
1. Scene definition
2. Parametric representation of sound paths
3. Auralization with a parametric DSP structure

Auralization parameters
For the direct sound and each image source the following set of auralization parameters is provided:
- Distance from the listener
- Azimuth and elevation angles with respect to the listener
- Source orientation with respect to the listener
- Reflection data, e.g. a set of filter coefficients describing the material properties of the reflections
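The geometric part of these parameters can be sketched as follows. The helper below is hypothetical and assumes, for brevity, a listener facing the +x axis; a real system would also apply the listener's tracked orientation:

```python
import math

def auralization_params(src, listener):
    """Distance, azimuth, and elevation of a (possibly image) source,
    for a listener at `listener` assumed to face the +x axis."""
    dx, dy, dz = (s - l for s, l in zip(src, listener))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.asin(dz / dist)) if dist > 0 else 0.0
    return dist, azimuth, elevation

d, az, el = auralization_params((3.0, 3.0, 0.0), (0.0, 0.0, 0.0))
# Source 45 degrees to the left, in the horizontal plane.
```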

Treatment of one image source – a DSP view
- Directivity
- Air absorption
- Distance attenuation
- Reflection filters
- Listener modeling
- Since each block is a linear system, the filters can be commuted and cascaded
(Adapted from Strauss, 1998)
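Treating every stage as a single broadband gain (a strong simplification: in practice each stage is a frequency-dependent filter, and the air-absorption constant below is a placeholder value), the cascade might be sketched as:

```python
import math

def image_source_gain(distance_m, reflection_coeffs, directivity_gain=1.0,
                      air_absorption_db_per_m=0.02):
    """Broadband gain of one image source: directivity x 1/r distance
    attenuation x air absorption x accumulated reflection coefficients.
    All values here are illustrative, not measured data."""
    distance_gain = 1.0 / max(distance_m, 1.0)     # 1/r law, clamped near source
    air_gain = 10.0 ** (-air_absorption_db_per_m * distance_m / 20.0)
    reflection_gain = math.prod(reflection_coeffs)  # one coefficient per bounce
    return directivity_gain * distance_gain * air_gain * reflection_gain

# Second-order image source 10 m away, two wall reflections of 0.8 each:
print(image_source_gain(10.0, [0.8, 0.8]))
```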

Auralization block diagram

Treatment of each image source

Late reverberation algorithm: a special version of a feedback delay network (Väänänen et al. 1997)
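The cited design is not reproduced here; as an illustration, a minimal generic feedback delay network with a scaled Hadamard (orthogonal) feedback matrix and made-up delay lengths might look like:

```python
class FeedbackDelayNetwork:
    """Minimal 4-line feedback delay network (a generic textbook form,
    not the exact Väänänen et al. 1997 design).  The Hadamard matrix
    scaled by 0.5 is orthogonal, so a loop gain < 1 guarantees a
    stable, exponentially decaying reverb tail."""

    HADAMARD = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

    def __init__(self, delays=(149, 211, 263, 293), gain=0.7):
        self.lines = [[0.0] * d for d in delays]  # circular delay buffers
        self.idx = [0] * len(delays)
        self.gain = gain

    def tick(self, x):
        # Read the oldest sample of every delay line.
        outs = [line[i] for line, i in zip(self.lines, self.idx)]
        # Mix outputs through the scaled Hadamard matrix, feed back, advance.
        for j, (line, i) in enumerate(zip(self.lines, self.idx)):
            mixed = sum(h * o for h, o in zip(self.HADAMARD[j], outs)) * 0.5
            line[i] = x + self.gain * mixed
            self.idx[j] = (i + 1) % len(line)
        return sum(outs)

fdn = FeedbackDelayNetwork()
impulse_response = [fdn.tick(1.0 if n == 0 else 0.0) for n in range(2000)]
```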

A Case Study: a Lecture Room

Image sources 1st order

Image sources up to 2nd order

Image sources up to 3rd order
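The construction behind these image-source figures is a mirroring of the source position across each reflecting plane; higher orders mirror the images again. A first-order sketch:

```python
def mirror_point(p, plane_point, plane_normal):
    """Mirror point p across a plane given by a point on it and a unit
    normal: the basic operation of the image-source method."""
    d = sum((pi - qi) * ni for pi, qi, ni in zip(p, plane_point, plane_normal))
    return tuple(pi - 2.0 * d * ni for pi, ni in zip(p, plane_normal))

# First-order image of a source across the wall x = 0 (unit normal +x):
src = (2.0, 3.0, 1.5)
print(mirror_point(src, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))  # (-2.0, 3.0, 1.5)
```

Second- and third-order images are obtained by applying `mirror_point` repeatedly to already-mirrored positions (with a visibility check per path, omitted here).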

Distance attenuation

Distance attenuation (zoomed)

Gain + air absorption

Gain + air and material absorption

All monaural filtering

All monaural filtering (zoomed)

Treatment of each image source

Only ITD for pure impulse

Only ITD for pure impulse (zoom)

ITD + minimum phase HRTF

Monaural filterings + ITD

Monaural filterings + ITD + HRTF

Auralization block diagram

Reverb

Image sources + reverberation

Dynamic Sound Rendering
- In dynamic rendering the properties of the image sources are time-variant: the filter coefficients change all the time
- Every single parameter has to be interpolated
- At delay-line pick-ups, fractional delay filters must be used to avoid clicks and artifacts
- Late reverberation is static
- Update rate → latency
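A first-order (linear-interpolation) fractional delay read, the simplest of the fractional delay filters mentioned above, can be sketched as:

```python
def read_fractional(buffer, write_idx, delay):
    """Read from a circular buffer at a non-integer delay (in samples)
    using linear interpolation.  Smoothly varying `delay` avoids the
    clicks produced by integer-sample jumps."""
    n = len(buffer)
    i = int(delay)
    frac = delay - i
    a = buffer[(write_idx - i) % n]       # sample at the integer delay
    b = buffer[(write_idx - i - 1) % n]   # one sample older
    return (1.0 - frac) * a + frac * b

buf = [0.0, 1.0, 2.0, 3.0]                # newest sample written at index 3
print(read_fractional(buf, 3, 1.5))       # halfway between delays 1 and 2
```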

Auralization quality
- What is the desired quality?
- Assessment of quality is possible only through case studies
- Objectively: acoustical attributes; auditory modeling
- Subjectively: listening tests

A case study, lecture hall T3

Quality of auralization (Lokki, 2002). Stimuli: clarinet and drum; the results compare recordings against auralizations for both stimuli.

Spatial auditory display Nicolas Tsingos Lauri Savioja

Spatial Sound Reproduction Techniques
- Reproduce the correct perceived location/direction of a virtual sound source at the ears of the listener
- Headphone based (binaural stereo) or speaker based (multiple speakers)

Binaural and Transaural Stereophony
- Mimic the natural filtering of the ears and torso
- Apply directional filtering to the signal: Head-Related Transfer Functions (HRTFs)
- Headphones (binaural) or a speaker pair (transaural)

Head-Related Transfer Functions
- Modeling: finite element techniques
- Measuring: dummy heads or human listeners
- HRTFs depend strongly on the listener (morphological differences)
- Adaptation by scaling in the frequency domain

HRTF filter design
- Filters separated into two parts:
  1. Interaural time difference (ITD)
  2. Minimum-phase FIR filter
- During movements:
  - Linear interpolation of the ITD
  - Bilinear interpolation of the FIR coefficients
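The ITD part is often approximated analytically. A sketch using the classical Woodworth spherical-head formula ITD = (r/c)(θ + sin θ), with typical assumed values for head radius and speed of sound (this particular formula is an illustration, not necessarily the method used in these slides):

```python
import math

def itd_woodworth(azimuth_rad, head_radius=0.0875, c=343.0):
    """Woodworth spherical-head ITD approximation, in seconds.
    Head radius and speed of sound are typical assumed values."""
    theta = abs(azimuth_rad)
    return head_radius / c * (theta + math.sin(theta))

print(itd_woodworth(0.0))           # zero for a frontal source
print(itd_woodworth(math.pi / 2))   # maximal for a fully lateral source
```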

Implementing HRTFs
- Principal component analysis: each HRTF is a linear combination of eigenfilters
- Allows smooth interpolation
- Allows reducing the number of operations

Transaural Stereophony
- Cross-talk cancellation
- H_ll and H_rr are the direct-path HRTFs; what about the cross-talk paths H_rl and H_lr?
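Cross-talk cancellation amounts to inverting the 2×2 acoustic transfer matrix in each frequency bin, so that each ear receives only its intended binaural signal. A single-bin sketch with made-up complex HRTF values (regularization, causality, and robustness issues are ignored):

```python
def crosstalk_cancel_bin(h_ll, h_rl, h_lr, h_rr, ear_l, ear_r):
    """Solve for the speaker signals (s_l, s_r) in one frequency bin so
    that the matrix [[h_ll, h_rl], [h_lr, h_rr]] maps them to the desired
    ear signals (ear_l, ear_r), via the 2x2 matrix inverse."""
    det = h_ll * h_rr - h_rl * h_lr
    s_l = (h_rr * ear_l - h_rl * ear_r) / det
    s_r = (h_ll * ear_r - h_lr * ear_l) / det
    return s_l, s_r

# Made-up values: strong direct paths, weaker phase-shifted cross-talk.
h_ll = h_rr = 1.0 + 0.0j
h_rl = h_lr = 0.3 - 0.2j
s_l, s_r = crosstalk_cancel_bin(h_ll, h_rl, h_lr, h_rr, 1.0 + 0j, 0.0 + 0j)
# The right ear now receives (numerically) zero despite the cross-talk.
print(h_lr * s_l + h_rr * s_r)
```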

Amplitude/Intensity Panning
- The common "surround sound" approach
- Apply the proper gain to each speaker to reproduce the intended perceived direction
  - In 2D: a pair of loudspeakers
  - In 3D: a loudspeaker triangle
- Vector-Base Amplitude Panning (VBAP) (image from Ville Pulkki, TKK)
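In 2D, the VBAP gains for a speaker pair follow from inverting the 2×2 matrix of speaker direction vectors and normalizing for constant power; a sketch:

```python
import math

def vbap_2d(source_az, spk1_az, spk2_az):
    """2-D vector-base amplitude panning: solve g1*l1 + g2*l2 = p for the
    speaker gains, then normalize to constant power (g1^2 + g2^2 = 1).
    Angles in radians; the source should lie between the two speakers."""
    p = (math.cos(source_az), math.sin(source_az))
    l1 = (math.cos(spk1_az), math.sin(spk1_az))
    l2 = (math.cos(spk2_az), math.sin(spk2_az))
    det = l1[0] * l2[1] - l2[0] * l1[1]
    g1 = (p[0] * l2[1] - l2[0] * p[1]) / det
    g2 = (l1[0] * p[1] - p[0] * l1[1]) / det
    norm = math.sqrt(g1 * g1 + g2 * g2)
    return g1 / norm, g2 / norm

# Standard stereo pair at +/-30 degrees, source dead center:
g1, g2 = vbap_2d(0.0, math.radians(30), math.radians(-30))
# A centered source gets equal gains on both speakers.
```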

Ambisonics
- Spherical-harmonic decomposition of the pressure field at a given point
- With first-order spherical harmonics, the sound field can be reproduced from 4 components: 1 omnidirectional and 3 orthogonal figure-of-8 patterns
- Allows manipulating the sound field: rotations, etc.
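Encoding a mono signal into the four first-order components (using the common B-format convention with W scaled by 1/√2; other normalization conventions exist) can be sketched per sample as:

```python
import math

def encode_first_order(sample, azimuth, elevation):
    """Encode a mono sample into first-order Ambisonic components:
    W (omni, conventionally scaled by 1/sqrt(2)) and X, Y, Z
    (three orthogonal figure-of-8 patterns)."""
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z

# A source straight ahead excites only W and X:
print(encode_first_order(1.0, 0.0, 0.0))
```

Sound-field rotation then reduces to a rotation matrix applied to (X, Y, Z), which is one of the manipulations the slide mentions.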

Wave Field Synthesis
- Reproduce the exact wave field in the reproduction region
- Use loudspeakers on the region's boundary (Kirchhoff integral theorem)
- The sound field is then valid everywhere in the room
- Requires heavy resources; in practice limited to a planar loudspeaker configuration

Comparison

Technique          Setup (# chans)   DSP        Elevation        Imaging   Sweet spot   Recording
HRTF (binaural)    light (2)         moderate   yes              v. good   n/a          yes
Transaural         light (2+)        moderate   yes              good      small        yes
Amplitude panning  average (5+)      low        yes (3D array)   average   medium       no
Ambisonics         average (4+)      moderate   yes (3D array)   good      small        yes
WFS                heavy (100+)      high       ?                v. good   n/a          ?

Which Setup for which Environment?
- Binaural systems for desktop use (includes transaural stereo)
- Multi-speaker systems for multi-user settings
  - Well suited to immersive projection-based VR systems
  - Projection screens act as low-pass filters
  - Video projection imposes placement constraints

Other Issues for Immersive Environments
- Overall system latency: less than 100 ms is OK
- Tracking the user's head: update the binaural/transaural filters
- Correction of loudspeaker gains
- Room problems: reflective surfaces

Summary
- Auralization
  - Direct convolution with full directional impulse responses is computationally too heavy in practice
  - Parametric impulse-response rendering: early reflections treated separately, statistical late reverberation
- Spatial sound reproduction
  - Headphones: HRTFs
  - Loudspeakers: VBAP, Ambisonics, Wave Field Synthesis

Thank you for your attention! Contact: