

A Bayesian System for Noise Robust Binaural Speaker Counting for Humanoid Robots
Matthew Tata, Austin Kothig, and Francesco Rea

Computational Implementation

To describe a biologically inspired computational model for localizing speakers [2,3] with the binaural humanoid robot iCub [1].
To extend the algorithm to test whether the 5 Hz envelope dynamics of speech can help to reject distractor sound sources.
To enrich a Bayesian active hearing algorithm [4] that uses instantaneous egocentric evidence to update an allocentric posterior map.

Two approaches are compared. In the Amplitude Only condition, the RMS amplitude of each band x beam signal is computed. In the Envelope condition, the 5 Hz envelope modulations due to speech are extracted from each band x beam signal by computing the absolute value of the Hilbert transform, band-pass filtering the resulting envelope, and collapsing across time using RMS.

The acoustic Bayesian map (ABM) is the product of all the allocentric acoustic maps (AMallo) and approximates the output of the inferior colliculus of the mammalian auditory pathway. Thus the Amplitude Only ABM describes the spectrospatial scene unmixed on the basis of signal amplitude, whereas the Envelope ABM represents the spectrospatial scene unmixed on the basis of 5 Hz envelope dynamics.

To arrive at a single posterior distribution of sound sources across the azimuthal plane, we averaged across frequency bands, yielding a distribution of belief that a sound source occupied a particular azimuthal angle. Each peak in this distribution can be considered a sound source and a candidate for target selection.
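The processing chain on this slide can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the (bands, beams, samples) and (time, bands, angles) array layouts, the 3 to 7 Hz passband around 5 Hz, the filter order, the per-band normalization, and the peak prominence threshold are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.signal import butter, find_peaks, hilbert, sosfiltfilt


def rms(x, axis=-1):
    """Root-mean-square along one axis."""
    return np.sqrt(np.mean(np.square(x), axis=axis))


def amplitude_feature(signals):
    """Amplitude Only condition: RMS of each band x beam signal.

    signals has shape (bands, beams, samples); the result is (bands, beams).
    """
    return rms(signals)


def envelope_feature(signals, fs, low_hz=3.0, high_hz=7.0, order=2):
    """Envelope condition: Hilbert envelope, band-pass near 5 Hz, then RMS.

    The 3-7 Hz passband and the filter order are assumptions; the slide only
    states that the 5 Hz envelope modulations are extracted.
    """
    # scipy.signal.hilbert returns the analytic signal; its magnitude is
    # the amplitude envelope of each band x beam signal.
    envelope = np.abs(hilbert(signals, axis=-1))
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    modulation = sosfiltfilt(sos, envelope, axis=-1)
    return rms(modulation)


def fuse_maps(allocentric_maps):
    """Multiply allocentric maps over time into one map per frequency band.

    allocentric_maps has shape (time, bands, angles). The product is taken in
    the log domain for numerical stability, and each band is normalized to
    sum to 1 (the normalization is an assumption of this sketch).
    """
    log_map = np.sum(np.log(allocentric_maps + 1e-12), axis=0)
    log_map -= log_map.max(axis=-1, keepdims=True)
    fused = np.exp(log_map)
    return fused / fused.sum(axis=-1, keepdims=True)


def count_sources(abm, min_prominence=0.01):
    """Average the map across bands and treat each peak as a candidate source.

    Returns the belief over azimuth bins and the indices of its peaks.
    """
    belief = abm.mean(axis=0)
    peaks, _ = find_peaks(belief, prominence=min_prominence)
    return belief, peaks
```

With features of this kind, a steady noise source can have a large RMS amplitude but almost no 5 Hz modulation energy, which is the property the Envelope condition relies on to reject distractors.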

Experiment

Does the system reliably report the presence and location of a human voice regardless of competing noise sources? We reproduced auditory targets and distractors (pink noise) in free field in the auditory virtual-reality lab.

Result

When counting and localizing a single candidate target, the Envelope approach is unaffected by an increasing number of distractors. A repeated-measures ANOVA with set size and approach (Amplitude Only vs. Envelope) as factors supported this significant interaction (F(6,1668) = 4.9; p < 0.001).

[Figure panels: Localization error; Count targets]

Human Voice Target / Noise Distractor

[Figure panels: Count targets; Localization error]

Human Voice Target / Urban Sounds Distractors

[Figure panels: Count targets; Localization error]

5 Hz AM Target / Noise Distractors

Counting
Source             SumSq        DF     MeanSq       F            p          Greenhouse-Geisser
SetSize            60.1908163   6      10.0318027   9.58306412   2.20E-10   1.62E-09
Filter x SetSize   30.8459184          5.14098639   4.91102184   5.47E-05   0.000117306
error              1746.10612   1668   1.04682621   1            0.5

Error
Source             SumSq        DF     MeanSq       F            p          Greenhouse-Geisser
SetSize            42300.3357          7050.05595   3.73882988   1.06E-03   1.41E-03
Filter x SetSize   1597.21939          266.203231   0.14117457   9.91E-01   0.988299429
error              3145233.59          1885.63165

Group Main Effect ('amp' vs. 'env')
Measure    Mean difference   StdError      p
Counting   -0.298979592      0.047968867   5.63E-10
Error      4.137755102       2.048018716   0.04334505
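The columns of the Counting table for the 5 Hz AM target are linked by the usual ANOVA arithmetic: each mean square is the sum of squares divided by its degrees of freedom, and F is the effect mean square divided by the error mean square. The short Python check below reproduces the SetSize row from the reported values; the variable and function names are illustrative only, and the decimal commas of the slide are written as decimal points.

```python
# Values copied from the Counting table (SetSize and error rows).
sum_sq_set_size = 60.1908163
df_set_size = 6
sum_sq_error = 1746.10612
df_error = 1668


def mean_square(sum_sq, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_sq / df


def f_ratio(ms_effect, ms_error):
    """F statistic = effect mean square / error mean square."""
    return ms_effect / ms_error


ms_set_size = mean_square(sum_sq_set_size, df_set_size)
ms_error = mean_square(sum_sq_error, df_error)
f_set_size = f_ratio(ms_set_size, ms_error)

# Compare with the MeanSq and F columns reported on the slide.
print(f"MeanSq(SetSize) = {ms_set_size:.7f}")
print(f"MeanSq(error)   = {ms_error:.8f}")
print(f"F(SetSize)      = {f_set_size:.6f}")
```

The same two steps apply to the Filter x SetSize row, whose F value is the interaction statistic quoted on the Result slide.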

Human Voice Target / Noise Distractors

Counting
Source             SumSq        DF     MeanSq       F            p           Greenhouse-Geisser
SetSize            1049.6       6      174.933333   146.077742   4.60E-149   4.46E-108
Filter x SetSize   264.338776          44.0564626   36.7892641   4.69E-42    5.47E-31
error              1997.4898    1668   1.19753585   1            0.5

Error
Source             SumSq        DF     MeanSq       F            p           Greenhouse-Geisser
SetSize            168020.332          28003.3886   13.1549126   1.34E-14    4.58E-14
Filter x SetSize   11648.2214          1941.37024   0.91198091   4.85E-01    0.482046551
error              3550738.3           2128.73999

Group Main Effect ('amp' vs. 'env')
Measure    Mean difference   StdError      p
Counting   0.170408163       0.046117612   0.00021982
Error      15.91428571       2.127831594   1.06E-10

Human Voice Target / Urban Sounds Distractors

Counting
Source             SumSq        DF     MeanSq       F            p          Greenhouse-Geisser
SetSize            102.638776   6      17.1064626   24.6305867   4.24E-28   1.06E-23
Filter x SetSize   5.75714286          0.95952381   1.38156175   2.18E-01   0.22863497
error              1158.46122   1668   0.69452112   1            0.5

Error
Source             SumSq        DF     MeanSq       F            p          Greenhouse-Geisser
SetSize            398902.542          66483.757    28.6750963   8.67E-33   9.64E-32
Filter x SetSize   6813.69082          1135.61514   0.48980194   8.16E-01   0.810225562
error              3867289.77          2318.51905

Group Main Effect ('amp' vs. 'env')
Measure    Mean difference   StdError      p
Counting   -0.1              0.037153803   0.00711284
Error      9.269387755       2.012262111   4.10E-06