A Low-Complexity Universal Architecture for Distributed Rate-Constrained Nonparametric Statistical Learning in Sensor Networks
Avon Loy Fernandes, Maxim Raginsky & Todd P. Coleman

Presentation transcript:

Introduction

1. We consider the problem of fitting a function of sensor location to a set of noisy observations.
2. Characteristics of the problem:
   - Regression: the ultimate goal is for the fitted function to be a good predictor, in mean squared error, of the response of a sensor placed at random in the field.
   - Nonparametric: we only assume that the function lies in some class of sufficiently smooth functions.
   - Universal: the algorithm works for any joint distribution of sensor locations and measurements, as long as the regression function is sufficiently smooth.
   - No assumptions are made a priori about the noise distribution; it may be unbounded and need not be additive.
3. This setup corresponds to real-world scenarios in which a large number of cheap sensors are deployed in a field and little can be assumed about the ambient noise. In many such scenarios the underlying object of interest is a smooth function (e.g., a temperature or pressure gradient).

Learning & Rate Constraints

The sensors communicate wirelessly with a fusion center and must therefore digitize their observations. The fusion center sees only quantized versions of the sensor observations, but it has complete knowledge of the sensor locations; each sensor knows its own location. In the proposed algorithm, each sensor quantizes its observation and passes a single message to one neighbor; this message drives a universal sequential probability estimator. Each sensor then compresses its quantizer output according to its probability estimate and sends the resulting bits to the fusion center. The fusion center runs the matching decompression algorithm, decodes the quantizer indices losslessly, and recovers the quantized representation of the sensor outputs; a statistical learning algorithm then approximates the function. Only low-complexity operations are performed at the encoder: each sensor quantizes its observation with a uniform scalar quantizer, and the indices are losslessly compressed with arithmetic or Huffman codes whose probability model is supplied by the simple message-passing scheme. This approach demonstrates a novel linkage of principles from data compression and information theory with principles from nonparametric estimation and statistical learning theory.

Proposed Approach

1. Field model: N sensors are placed uniformly at random on the unit square [0,1]^2. The output of the i-th sensor is of the form Y_i = f(X_i) + Z_i, where f: [0,1]^2 → R is some smooth function, the X_i are the sensor locations, and Z_1, ..., Z_N are i.i.d. Gaussian random variables distributed as N(0, σ^2). This model is standard, but the scheme applies to an arbitrary relation between X and Y.

2. Quantization: Given a quantizer step size ε > 0, the encoder mapping is defined as follows. Let U_1, ..., U_N be i.i.d. random variables drawn uniformly from the interval [-ε/2, ε/2]. This is the dither signal, which is known at the fusion center; dithering is necessary to make the estimator robust. The i-th sensor applies the uniform scalar quantizer with step ε to Y_i + U_i and obtains the index M_i = round((Y_i + U_i)/ε). The unit square [0,1]^2 is divided into n ordered squares, where n is chosen adaptively depending on the quantizer outputs M_i, and L_i ∈ {1, ..., n} is the index of the square containing X_i. Similarly, the dither interval is divided into n subintervals (also chosen adaptively), and K_i ∈ {1, ..., n} is the index of the subinterval containing U_i. (A sketch of this sensing and quantization front end is given immediately below.)
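As a concrete illustration of steps 1 and 2, the following Python sketch simulates the field model and the dithered uniform scalar quantizer. It is a minimal sketch under the assumptions stated above (uniform placement on [0,1]^2, additive Gaussian noise, dither uniform on [-ε/2, ε/2]); the function names and the test field f are illustrative rather than taken from the poster, and the adaptive partition indices (L_i, K_i) are omitted.

```python
# Minimal sketch of the sensing and dithered-quantization front end (steps 1-2).
import numpy as np

def generate_field(n_sensors, f, noise_std, rng):
    """Place sensors uniformly at random on [0,1]^2 and observe Y_i = f(X_i) + Z_i."""
    x = rng.uniform(0.0, 1.0, size=(n_sensors, 2))   # sensor locations X_i
    z = rng.normal(0.0, noise_std, size=n_sensors)   # i.i.d. Gaussian noise Z_i
    y = f(x[:, 0], x[:, 1]) + z                       # noisy observations Y_i
    return x, y

def dithered_quantize(y, eps, rng):
    """Uniform scalar quantizer with step eps and dither U_i ~ Unif[-eps/2, eps/2].
    Returns integer indices M_i and the dither (known to the fusion center)."""
    u = rng.uniform(-eps / 2, eps / 2, size=y.shape)  # dither signal U_i
    m = np.round((y + u) / eps).astype(int)           # quantizer index M_i
    return m, u

# Example usage with a smooth test field (illustrative choice of f).
rng = np.random.default_rng(0)
f = lambda x1, x2: np.sin(2 * np.pi * x1) * np.cos(2 * np.pi * x2)
X, Y = generate_field(1000, f, noise_std=0.1, rng=rng)
M, U = dithered_quantize(Y, eps=0.25, rng=rng)
```

The indices M produced here are what each sensor subsequently compresses with the K-T-driven lossless code described in step 3.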
The i-th sensor knows (M_i, K_i, L_i); the fusion center knows all of the (K_i, L_i).

3. Krichevsky-Trofimov estimator: Given the index M_i = m_i, each sensor calculates the conditional probability of m_i given the previously encoded indices as a ratio of Krichevsky-Trofimov (K-T) estimates: the numerator and denominator are the K-T (add-1/2) probability estimates of the index sequence with and without m_i, respectively. The K-T estimator induces a lossless code whose redundancy converges to zero as N → ∞.

4. Decoder: Upon losslessly decoding the indices M_i, the fusion center subtracts the known dither to obtain the reconstructions Ŷ_i = ε M_i − U_i.

5. Estimation of the function: f is estimated by a truncated Fourier series whose coefficients are empirical averages of the reconstructed observations Ŷ_i against the basis functions evaluated at the sensor locations X_i. The cutoffs J_{1,n} and J_{2,n} can either be chosen adaptively, or prior knowledge of the smoothness of f can be incorporated into their choice. (A sketch of this decoder-side pipeline is given after the references.)

Theoretical Guarantees

1. Bit rate: For a given quantizer step size ε, the average number of bits per sensor is bounded from above by the conditional rate-distortion function of Y given X, evaluated at the distortion level set by ε, plus a constant of roughly 0.754 bits (Ziv's bound with side information). In practice, the rate is slightly higher than Ziv's bound.
2. Estimator performance: Using Ziv's entropy-coded scalar quantization with dither, the proposed estimator is an unbiased, efficient estimator of the Fourier coefficient vector θ.
3. The algorithm attains minimax convergence rates of the MSE for regression functions in standard smoothness classes (Lipschitz, Sobolev, etc.). Scalar quantization of the sensor observations affects the multiplicative constants, but not the convergence rates.

Results

(Simulation plots: MSE versus ε and comparison with Ziv's bound; the findings are summarized in the conclusions below.)

Conclusions

The algorithm was able to learn the function well, i.e., it achieved low MSE. Empirically, the algorithm performs very close to Ziv's bound with side information, and the MSE curves reproduce the expected linear relation between the MSE and ε. The scheme is attractive because a huge number of sensors can communicate their observations at very low per-sensor rates while the MSE still converges at the minimax rate. The approach also generalizes to multiplicative noise and Poisson noise.

References

[1] M. Raginsky, "Learning from Compressed Observations," Proc. 2007 IEEE Information Theory Workshop, Lake Tahoe, CA, 2007.
[2] Y. Wang and P. Ishwar, "On Non-Parametric Field Estimation Using Randomly Deployed, Noisy, Binary Sensors," Proc. IEEE ISIT 2007, Nice, France, 2007.
[3] J. Ziv, "On Universal Quantization," IEEE Transactions on Information Theory, vol. IT-31, no. 3, May 1985.
[4] V. N. Vapnik, "An Overview of Statistical Learning Theory," IEEE Transactions on Neural Networks, vol. 10, no. 5, Sept. 1999.
[5] R. Zamir and M. Feder, "On Universal Quantization by Randomized Uniform/Lattice Quantizers," IEEE Transactions on Information Theory, vol. 38, no. 2, Mar. 1992.
[6] S. N. Simić, "A Learning-Theory Approach to Sensor Networks," IEEE Pervasive Computing, vol. 2, no. 4, Oct.-Dec. 2003.

Illinois Center for Wireless Systems
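To complement the decoder-side description in steps 3 to 5, here is a minimal Python sketch of a standard Krichevsky-Trofimov (add-1/2) sequential probability model, the subtractive-dither reconstruction, and a trigonometric-series regression estimate. The poster's exact formulas and basis choice are not reproduced; this is one standard instantiation under the stated assumptions, and all function names are illustrative.

```python
import numpy as np

def kt_sequential_probs(symbols, alphabet_size):
    """Standard K-T estimate: P(m | past) = (count(m) + 1/2) / (t + alphabet_size/2).
    Assumes symbols have been offset to lie in {0, ..., alphabet_size - 1}."""
    counts = np.zeros(alphabet_size)
    probs = np.empty(len(symbols))
    for t, m in enumerate(symbols):
        probs[t] = (counts[m] + 0.5) / (t + alphabet_size / 2.0)
        counts[m] += 1
    return probs  # driving an arithmetic coder with these probabilities gives a lossless code

def dequantize(m, u, eps):
    """Fusion-center reconstruction with subtractive dither: Y_hat_i = eps * M_i - U_i."""
    return eps * m - u

def fourier_regression(x, y_hat, J1, J2):
    """Fit f by empirical Fourier coefficients on [0,1]^2, truncated at cutoffs J1, J2."""
    k1, k2 = np.meshgrid(np.arange(-J1, J1 + 1), np.arange(-J2, J2 + 1), indexing="ij")
    f1, f2 = k1.ravel(), k2.ravel()
    # theta_hat_k = (1/N) * sum_i Y_hat_i * exp(-2*pi*j * k . X_i)
    phase = np.exp(-2j * np.pi * (np.outer(x[:, 0], f1) + np.outer(x[:, 1], f2)))
    theta = (y_hat @ phase) / len(y_hat)

    def f_hat(x1, x2):
        x1, x2 = np.atleast_1d(x1), np.atleast_1d(x2)
        basis = np.exp(2j * np.pi * (np.outer(x1, f1) + np.outer(x2, f2)))
        return np.real(basis @ theta)

    return f_hat

# Example usage, continuing from the front-end sketch (X, M, U, f, and eps = 0.25):
# probs = kt_sequential_probs(M - M.min(), M.max() - M.min() + 1)
# Y_hat = dequantize(M, U, eps=0.25)
# f_hat = fourier_regression(X, Y_hat, J1=3, J2=3)
# mse = np.mean((f_hat(X[:, 0], X[:, 1]) - f(X[:, 0], X[:, 1])) ** 2)
```

Because the dither U is known at the fusion center, the reconstruction error Ŷ_i − Y_i is uniform on [-ε/2, ε/2] and independent of the observation, which is what keeps the empirical Fourier coefficient estimates unbiased.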