Position Calibration of Acoustic Sensors and Actuators on Distributed General Purpose Computing Platforms Vikas Chandrakant Raykar | University of Maryland,


Position Calibration of Acoustic Sensors and Actuators on Distributed General Purpose Computing Platforms Vikas Chandrakant Raykar | University of Maryland, College Park

Motivation
• Many emerging multimedia applications use multiple audio/video sensors and actuators.
[Figure labels: Microphones, Cameras (distributed capture); Speakers, Displays (distributed rendering); Other Applications; Number Crunching; Current Thesis]

What can you do with multiple microphones…
• Speaker localization and tracking.
• Beamforming or spatial filtering.

Some Applications… Audio/video surveillance; smart conference rooms; audio/image-based rendering; meeting recording; source separation and dereverberation; speech recognition; hands-free voice communication; speaker localization and tracking; multichannel speech enhancement; multichannel echo cancellation; novel interactive audio-visual interfaces.

More Motivation…
• Current work has focused on setting up all the sensors and actuators on a single dedicated computing platform.
• This requires dedicated infrastructure in terms of the sensors, multi-channel interface cards, and computing power.
On the other hand:
• Computing devices such as laptops, PDAs, tablets, cellular phones, and camcorders have become pervasive.
• Audio/video sensors on different laptops can be used to form a distributed network of sensors.

Common TIME and SPACE
• Put all the distributed audio/visual input/output capabilities of all the laptops into a common TIME and SPACE.
• This thesis deals with common SPACE, i.e. estimating the 3D positions of the sensors and actuators.
Why common SPACE?
• Most array processing algorithms require that the precise positions of the microphones be known.
• Manual measurement is painful, tedious, and imprecise.

This thesis is about… [Figure with X, Y, Z coordinate axes]

If we know the positions of speakers… [Figure: unknown position (?) in the X-Y plane] If the distances are not exact, or if we have more speakers, solve in the least squares sense.
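A minimal sketch of the "solve in the least squares sense" step, assuming a 2D layout with four speakers at known positions. The function name `locate_from_ranges` and the example coordinates are hypothetical, and the linearization used here (subtracting the first range equation from the others) is one common way to set up the problem, not necessarily the one used in the thesis.

```python
import numpy as np

def locate_from_ranges(anchors, dists):
    """Least-squares position of one point from distances to known anchors."""
    anchors = np.asarray(anchors, dtype=float)
    dists = np.asarray(dists, dtype=float)
    # Each range gives |x - a_i|^2 = d_i^2. Subtracting the first equation
    # from the others cancels |x|^2 and leaves a linear system A x = b.
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (d0**2 - dists[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(a0**2))
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Hypothetical example: four speakers (metres) and one microphone.
speakers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [4.0, 3.0]])
true_mic = np.array([1.5, 1.0])
ranges = np.linalg.norm(speakers - true_mic, axis=1)
estimate = locate_from_ranges(speakers, ranges)
```

With more speakers than unknown coordinates the system is overdetermined, which is what lets the least squares solution average out errors in the measured distances.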

If positions of speakers unknown…
• Consider M microphones and S speakers.
• What can we measure? The distance between each speaker and all microphones, or the Time Of Flight (TOF) of a calibration signal. This gives an M x S TOF matrix.
• Assume the TOF is corrupted by Gaussian noise. We can then derive the maximum likelihood (ML) estimate.

Nonlinear Least Squares… More formally, we can derive the ML estimate using a Gaussian noise model. Find the coordinates that minimize the resulting sum of squared errors.

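Maximum Likelihood (ML) Estimate… We can define a noise model and derive the ML estimate, i.e. maximize the likelihood of the measurements. Gaussian noise: if the noise is Gaussian and independent, the ML estimate is the same as the least squares estimate (weighted by the inverse noise variances when they differ).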
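The equivalence between ML and least squares can be written out as follows. The notation is an assumption introduced for illustration: \hat{\tau}_{ij} is the measured TOF between microphone i and speaker j, \tau_{ij}(\Theta) is the TOF predicted by the unknown parameters \Theta, and \sigma_{ij}^2 is the noise variance.

```latex
\begin{align}
\hat{\tau}_{ij} &= \tau_{ij}(\Theta) + n_{ij},
  \qquad n_{ij} \sim \mathcal{N}(0, \sigma_{ij}^2) \text{ independent} \\
p(\hat{\tau} \mid \Theta) &= \prod_{i=1}^{M} \prod_{j=1}^{S}
  \frac{1}{\sqrt{2\pi\sigma_{ij}^2}}
  \exp\!\left( -\frac{\bigl(\hat{\tau}_{ij} - \tau_{ij}(\Theta)\bigr)^2}{2\sigma_{ij}^2} \right) \\
\hat{\Theta}_{ML} &= \arg\max_{\Theta} \, \log p(\hat{\tau} \mid \Theta)
  = \arg\min_{\Theta} \sum_{i=1}^{M} \sum_{j=1}^{S}
  \frac{\bigl(\hat{\tau}_{ij} - \tau_{ij}(\Theta)\bigr)^2}{\sigma_{ij}^2}
\end{align}
```

The normalizing constants do not depend on \Theta, so only the sum of squared, variance-weighted residuals remains; with equal variances it reduces to ordinary nonlinear least squares.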

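Reference Coordinate System | Multiple Global Minima
The error function depends only on distances, so any translation, rotation, or reflection of a solution is also a global minimum. A reference coordinate system removes this ambiguity.
In 2D: fix the origin, the X axis, and the positive Y axis. Similarly in 3D:
1. Fix the origin (0,0,0)
2. Fix the X axis (x1,0,0)
3. Fix the Y axis (x2,y2,0)
4. Fix the positive Z axis
x1, x2, y2 > 0
Which to choose? Later…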

On a synchronized platform all is well..

However On a Distributed system..

The journey of an audio sample… This laptop wants to play a calibration signal on the other laptop over the network. A play command is issued in software. When will the sound actually be played out from the loudspeaker? [Figure labels: Operating system; Multimedia/multistream applications; Audio/video I/O devices; I/O bus]

On a distributed system… [Timing diagram on two time axes t: time origin; playback started; signal emitted by source j; capture started; signal received by microphone i]
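One way to read the timing diagram, written as a sketch under assumed notation: ts_j is the unknown time at which playback actually starts on speaker j, tm_i is the unknown time at which capture actually starts on microphone i (both measured from the common time origin), m_i and s_j are the positions, and c is the speed of sound. The signal arrives at time ts_j plus the true TOF, but the microphone measures arrival relative to its own capture start.

```latex
\begin{equation}
\widehat{TOF}_{ij} \;=\; \frac{\lVert m_i - s_j \rVert}{c} \;+\; ts_j \;-\; tm_i
\end{equation}
```

On a synchronized platform ts_j and tm_i are known (or zero), so the measured TOF gives the distance directly; on a distributed system they become extra unknowns.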

Joint Estimation…
Parameters to estimate:
• Speaker emission start times: S
• Microphone capture start times: M - 1 (assume tm_1 = 0)
• Microphone and speaker coordinates: 3(M+S) - 6
In total, 4M + 4S - 7 parameters to estimate from MS TOF measurements (observations). Can reduce the number of parameters.

Use Time Difference of Arrival (TDOA)… The formulation is the same as above but with fewer parameters.
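A short worked equation showing why TDOA needs fewer parameters. It assumes, as a modeling choice for illustration, that the measured TOF between microphone i and speaker j equals the true TOF plus the speaker start time ts_j minus the microphone capture start time tm_i; the symbols m_i, s_j, and c (speed of sound) are likewise assumed notation.

```latex
\begin{align}
\widehat{TDOA}_{ik}^{\,j}
  &= \widehat{TOF}_{ij} - \widehat{TOF}_{kj} \\
  &= \left( \frac{\lVert m_i - s_j \rVert}{c} + ts_j - tm_i \right)
   - \left( \frac{\lVert m_k - s_j \rVert}{c} + ts_j - tm_k \right) \\
  &= \frac{\lVert m_i - s_j \rVert - \lVert m_k - s_j \rVert}{c} \;+\; tm_k - tm_i
\end{align}
```

The speaker emission start time ts_j cancels in the difference, so the S emission start times no longer have to be estimated.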

Assuming M=S=K Minimum K required..
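A small counting sketch for the TOF-based joint estimation in 3D only, using the parameter count 4M + 4S - 7 and the MS observations stated on the joint estimation slide. It assumes that "minimum K" means the smallest K with at least as many observations as unknowns; the function name is hypothetical, and the TDOA case is not covered because the transcript does not give its parameter count.

```python
def min_devices_tof(max_k=50):
    """Smallest K (with M = S = K) giving at least as many TOF
    observations (M*S) as unknowns (4M + 4S - 7) in 3D."""
    # Start at K = 2: with a single microphone and speaker there is no
    # geometry to calibrate, so K = 1 is excluded as degenerate.
    for k in range(2, max_k + 1):
        unknowns = 4 * k + 4 * k - 7
        observations = k * k
        if observations >= unknowns:
            return k
    return None

k_min = min_devices_tof()
```

Under this count the condition K^2 >= 8K - 7 factors as (K - 1)(K - 7) >= 0, so the search stops at K = 7, where there are 49 observations for 49 unknowns.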

Nonlinear least squares… Levenberg-Marquardt method. The error is a function of a large number of parameters. Unless we have a good initial guess, the minimization may not converge to the global minimum. An approximate initial guess is required.
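A minimal Levenberg-Marquardt sketch, assuming a deliberately small toy problem (one microphone, four speakers at known 2D positions) rather than the full joint calibration. The function name, the forward-difference Jacobian, and the damping schedule are illustrative choices, not the thesis implementation.

```python
import numpy as np

def levenberg_marquardt(residual, x0, n_iter=100, lam=1e-2, eps=1e-6):
    """Minimise sum(residual(x)**2) with a basic damped Gauss-Newton loop."""
    x = np.asarray(x0, dtype=float)
    r = residual(x)
    for _ in range(n_iter):
        # Forward-difference Jacobian of the residual vector.
        J = np.empty((r.size, x.size))
        for k in range(x.size):
            step = np.zeros_like(x)
            step[k] = eps
            J[:, k] = (residual(x + step) - r) / eps
        H = J.T @ J
        g = J.T @ r
        # Damped normal equations; a tiny ridge keeps the solve well posed.
        damped = H + lam * np.diag(np.diag(H)) + 1e-12 * np.eye(x.size)
        delta = np.linalg.solve(damped, -g)
        r_new = residual(x + delta)
        if r_new @ r_new < r @ r:
            x, r, lam = x + delta, r_new, lam * 0.5  # accept, damp less
        else:
            lam *= 10.0                               # reject, damp more
    return x

# Hypothetical toy problem: ranges from one microphone to four speakers.
speakers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [4.0, 3.0]])
true_mic = np.array([1.5, 1.0])
ranges = np.linalg.norm(speakers - true_mic, axis=1)

def range_residual(x):
    return np.linalg.norm(speakers - x, axis=1) - ranges

estimate = levenberg_marquardt(range_residual, x0=[2.0, 2.0])
```

The starting point `x0` here is already close to the answer. In the full problem, with many coupled unknowns, the same loop can settle in a local minimum, which is why an approximate initial guess matters.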

Closed form Solution… Given all pairwise distances between N points, can we get the coordinates? [Table: 4 x 4 matrix of pairwise distances between points 1 to 4, with every entry known (X)]

Classical Metric Multidimensional Scaling. The dot product matrix B = X X^T is symmetric, positive semidefinite, and of rank 3. Given B, can you get X? Yes, by Singular Value Decomposition, which is the same as Principal Component Analysis. But we can measure only the pairwise distance matrix.

How to get the dot product matrix from the pairwise distance matrix… [Figure: triangle formed by points k, i, and j]
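The triangle with vertices k, i, and j suggests the law of cosines. As a sketch, taking point k as the origin and writing d_{ij} for the distance between points i and j (assumed notation), the dot product of the vectors from k to i and from k to j is:

```latex
\begin{align}
d_{ij}^2 &= d_{ki}^2 + d_{kj}^2 - 2\, d_{ki}\, d_{kj} \cos\theta_{ij} \\
b_{ij} &= d_{ki}\, d_{kj} \cos\theta_{ij}
        = \tfrac{1}{2} \left( d_{ki}^2 + d_{kj}^2 - d_{ij}^2 \right)
\end{align}
```

Every entry of the dot product matrix can therefore be computed from pairwise distances alone, once a reference point has been chosen as the origin.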

Centroid as the origin… Later shift it to our original reference. Slightly perturb each GPC (general purpose computer) location into two points to get the initial guess for the microphone and speaker coordinates.
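A minimal sketch of classical metric MDS with the centroid as origin, assuming a complete and noise-free matrix of pairwise Euclidean distances. The function name `classical_mds` and the random test points are hypothetical; the double-centering step is the standard way to obtain the centroid-referenced dot product matrix.

```python
import numpy as np

def classical_mds(dist, dim=3):
    """Recover coordinates (up to rotation and reflection) from a complete
    matrix of pairwise Euclidean distances, with the centroid as origin."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering yields the dot product matrix about the centroid.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(b)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:dim]       # keep the largest `dim`
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical example: six points in a 4 m cube.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 4.0, size=(6, 3))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
recovered = classical_mds(dist, dim=3)
dist_recovered = np.linalg.norm(
    recovered[:, None, :] - recovered[None, :, :], axis=-1)
```

The recovered coordinates reproduce the pairwise distances but sit in an arbitrary orientation, so a final rigid transformation is still needed to move them into the chosen reference coordinate system.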

Example of MDS…

MDS is more general… Instead of pairwise distances we can use pairwise "dissimilarities", e.g. in face recognition or wine tasting, and get the significant cognitive dimensions. When the distances are Euclidean, MDS is equivalent to PCA.

Can we use MDS? Two problems.

      s1  s2  s3  s4  m1  m2  m3  m4
s1    ?   ?   ?   ?   X   X   X   X
s2    ?   ?   ?   ?   X   X   X   X
s3    ?   ?   ?   ?   X   X   X   X
s4    ?   ?   ?   ?   X   X   X   X
m1    X   X   X   X   ?   ?   ?   ?
m2    X   X   X   X   ?   ?   ?   ?
m3    X   X   X   X   ?   ?   ?   ?
m4    X   X   X   X   ?   ?   ?   ?

(X = measured, ? = unknown)
1. We do not have the complete pairwise distances.
2. The measured distances include the effect of the lack of synchronization.

Clustering approximation…

Finally the complete algorithm…
1. TOF matrix.
2. Clustering approximation: approximate distance matrix between GPCs, approximate ts, approximate tm.
3. Dot product matrix.
4. MDS to get approximate GPC locations (dimension and coordinate system).
5. Perturb to get approximate microphone and speaker locations.
6. TDOA-based nonlinear minimization.
7. Output: microphone and speaker locations, tm.

Sample result in 2D…

Algorithm Performance…
The performance of our algorithm depends on:
• Noise variance in the estimated distances.
• Number of microphones and speakers.
• Microphone and speaker geometry.
One way to study the dependence is to run a large number of Monte Carlo simulations. Alternatively, we can derive the covariance matrix and bias of the estimator. The ML estimate is implicitly defined as the minimum of a certain error function, so we cannot get an exact analytical expression for its mean and variance. Or, given a noise model, we can derive a bound on the best variance any unbiased estimator can achieve: the Cramer-Rao bound.

Estimator Variance…
• Can use the implicit function theorem and a Taylor series expansion to get approximate expressions for bias and variance.
  J. A. Fessler. Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): Applications to tomography. IEEE Tr. Im. Proc., 5(3).
  Amit Roy Chowdhury and Rama Chellappa. "Statistical Bias and the Accuracy of 3D Reconstruction from Video". Submitted to International Journal of Computer Vision.
• Using a first-order Taylor series expansion. The Jacobian is rank deficient, so remove the known parameters.

The Cramer-Rao (CR) bound:
• Gives the lower bound on the variance of any unbiased estimator.
• Does not depend on the estimator, just on the data and the noise model.
• Basically tells us to what extent the noise limits our performance, i.e. you cannot get a variance lower than the CR bound.
The Jacobian is rank deficient, so remove the known parameters.
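For reference, the bound takes a compact form under a Gaussian noise model. The notation is assumed for illustration: \Theta is the vector of unknown parameters after removing the ones fixed by the reference coordinate system, \tau(\Theta) is the vector of noise-free measurements, J is its Jacobian with respect to \Theta, and \Sigma is the noise covariance.

```latex
\begin{align}
J &= \frac{\partial \tau(\Theta)}{\partial \Theta}, \qquad
F(\Theta) = J^{T} \Sigma^{-1} J \\
\operatorname{Cov}(\hat{\Theta}) &\succeq F(\Theta)^{-1}
  \quad \text{for any unbiased estimator } \hat{\Theta}
\end{align}
```

If the parameters fixed by the coordinate system are left in, J loses rank and the Fisher information matrix F cannot be inverted, which is why the known parameters are removed first.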

Different Estimators..

Number of sensors matter…

Geometry also matters…

Calibration Signal…

Time Delay Estimation…
Compute the cross-correlation between the signals received at the two microphones. The location of the peak in the cross-correlation gives an estimate of the delay. The task is complicated for two reasons:
1. Background noise.
2. Channel multipath due to room reverberations.
Use the Generalized Cross Correlation (GCC), where W(ω) is the weighting function, with PHAT (Phase Transform) weighting.
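A minimal GCC-PHAT sketch, assuming two equally sampled real signals and an integer-sample delay. The function name, the sampling rate, and the synthetic white-noise test signal are hypothetical; real recordings would need sub-sample interpolation and care with reverberation, which this sketch leaves out.

```python
import numpy as np

def gcc_phat(sig, ref, fs=1.0):
    """Estimate the delay of `sig` relative to `ref`, in seconds."""
    n = len(sig) + len(ref)          # zero-pad to avoid circular wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Hypothetical example: white noise delayed by 25 samples, plus sensor noise.
rng = np.random.default_rng(0)
fs = 16000.0
ref = rng.standard_normal(1024)
true_delay = 25
sig = np.concatenate((np.zeros(true_delay), ref))[:1024]
sig = sig + 0.05 * rng.standard_normal(1024)
delay_est = gcc_phat(sig, ref, fs=fs)
```

Dividing the cross-spectrum by its magnitude flattens the spectrum so that the correlation peak becomes sharper, which is the motivation for the PHAT weighting in reverberant rooms.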

Synchronized setup | bias 0.08 cm, sigma 3.8 cm. Room length = 4.22 m, room width = 2.55 m, room height = 2.03 m. [Figure: Mic 1 to Mic 4 and Speaker 1 to Speaker 4 plotted on X and Z axes]

Distributed Setup…
Initialization phase: scan the network and find the number of GPCs and the UPnP services available.
[Figure: a master connected to GPC 1, GPC 2, …, GPC M. For a pair such as GPC 1 (speaker) and GPC 2 (mic): calibration signal parameters; play calibration signal (play ML sequence); TOA computation; TOA; TOA matrix; position estimation]

Experimental results using real data

Related Previous work…
J. M. Sachar, H. F. Silverman, and W. R. Patterson III. Position calibration of large-aperture microphone arrays. ICASSP 2002.
Y. Rockah and P. M. Schultheiss. Array shape calibration using sources in unknown locations Part II: Near-field sources and estimator implementation. IEEE Trans. Acoust., Speech, Signal Processing, ASSP-35(6), June.
J. Weiss and B. Friedlander. Array shape calibration using sources in unknown locations: a maximum-likelihood approach. IEEE Trans. Acoust., Speech, Signal Processing, 37(12), December.
R. Moses, D. Krishnamurthy, and R. Patterson. A self-localization method for wireless sensor networks. Eurasip Journal on Applied Signal Processing, Special Issue on Sensor Networks, 2003(4):348-358, March.

Our Contributions…
• Novel setup for array processing.
• Position calibration in a distributed scenario.
• Closed-form approximate solution to initialize the nonlinear minimization routine.
• Expressions for the mean and variance of the estimators.
• Study of the effect of sensor geometry.

Acknowledgements… Dr. Ramani Duraiswami and Prof. Rama Chellappa Prof. Yegnanarayana Dr. Igor Kozintsev and Dr. Rainer Lienhart, Intel Research Prof. Min Wu and Prof. Shihab Shamma Prof. Larry Davis

Thank You ! | Questions ?