Distributed Regression: an Efficient Framework for Modeling Sensor Network Data
Carlos Guestrin, Peter Bodik, Romain Thibaux, Mark Paskin, Samuel Madden

Data collection paradigm
The base station issues an SQL-style query, distributes it to the network, and collects the data. Each new query redoes the whole process.
Goal: push beyond the paradigm of sensors as simple data-gathering devices.

Example: temperature data from 10 nearby sensors
Over 4 hours of data, the temperature changes slowly over time and the measurements are correlated.
Collect all measurements: send 500 numbers.
Versus using regression to approximate the measurements: send 5 numbers, yet still get a very good approximation.
The data is highly correlated, with redundancy and structure, so build a lower-dimensional representation:
Compression for data transmission.
Provide nodes with a local view of the global state.
…

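The regression problem
Given basis functions $h_1, \dots, h_k$ and measurements $f(x_1), \dots, f(x_N)$ from N sensors, find coefficients $w = \{w_1, \dots, w_k\}$ such that $f(x) \approx \sum_{j=1}^{k} w_j h_j(x)$.
Precisely, minimize the residual error: $\min_w \sum_{i=1}^{N} \big( f(x_i) - \sum_{j=1}^{k} w_j h_j(x_i) \big)^2$.
In matrix form: $H$ is the N×k matrix of basis functions evaluated at the sensors (N sensors by k basis functions), $w$ is the k×1 vector of weights, and $f$ is the N×1 vector of measurements.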

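Regression solution
The optimal weights solve the linear system $A w = b$, where $A = H^T H$ is a k×k matrix for k basis functions and $b = H^T f$ is a k×1 vector.
Problems: inverting A is too expensive on a single mote, and "gathering" matrix A takes $N k^2$ messages.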
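The solution above can be sketched centrally in a few lines of Python. This is only an illustration, assuming the standard least-squares normal equations (A as the sum of outer products of basis values, b as the sum of basis values times measurements). The names `normal_equations` and `solve`, the two basis functions, and the data are hypothetical.

```python
def normal_equations(xs, fs, basis):
    """Build A (k x k) and b (k x 1) from N measurements and k basis functions."""
    k = len(basis)
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for x, f in zip(xs, fs):
        h = [h_j(x) for h_j in basis]          # basis values at this sensor
        for i in range(k):
            b[i] += h[i] * f
            for j in range(k):
                A[i][j] += h[i] * h[j]
    return A, b


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented system [A|b]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [v - factor * u for v, u in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


# Hypothetical data: four sensors on a line, temperature exactly 2 + 3x.
basis = [lambda x: 1.0, lambda x: x]
xs = [0.0, 1.0, 2.0, 3.0]
fs = [2.0 + 3.0 * x for x in xs]
A, b = normal_equations(xs, fs, basis)
w = solve(A, b)
```

Here the four measurements are summarized by two weights. Doing this on one node needs every measurement in one place, which is the cost the slide objects to.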

Global temperature is complex
The temperature surface is complex. Does that mean we need complex basis functions? Lots of communication?

What are we missing?
The temperature surface is complex, but it has lots of local structure!
Model local temperature regions, and do the right thing in the overlaps.

Kernel regression
Local basis functions for each region.
Kernels average between regions.
Distributed algorithm for obtaining coefficients: simple communication along a spanning tree, robust to lost messages.
Need global optimization to find optimal coefficients.

Kernel regression ⇒ sparse matrices
Kernel basis functions have local support, so the basis is sparse: in the matrix of sensors by basis functions, a basis function such as h_1 is 0 for every sensor outside its region.
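A small sketch of why local support gives a sparse system. It assumes one-dimensional sensor positions and triangular kernels, neither of which comes from the slides; `kernel`, `basis_row`, the centers, and the radius are all hypothetical choices.

```python
def kernel(center, radius):
    """Hypothetical kernel: a triangular bump that is zero outside its region."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / radius)


centers = [0.0, 2.0, 4.0, 6.0]
kernels = [kernel(c, 1.5) for c in centers]   # only neighbouring regions overlap
sensors = [0.25 * i for i in range(25)]       # sensor positions 0.0 .. 6.0


def basis_row(x):
    """Normalized kernel weights at position x (the kernels average between regions)."""
    values = [kern(x) for kern in kernels]
    total = sum(values)
    return [v / total for v in values]


rows = [basis_row(x) for x in sensors]        # sensors x basis functions, mostly zeros
k = len(kernels)
# A[i][j] sums products of basis functions i and j over all sensors, so it is
# non-zero only when regions i and j share at least one sensor.
A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
```

With these choices A comes out tridiagonal: regions that do not overlap contribute exact zeros.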

Gaussian elimination
A is sparse ⇒ efficient Gaussian elimination.
Start from the complete system [A|b] and subtract rows to eliminate variables.
After Gaussian elimination, solve the linear system by k simple divisions.
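A sketch of how sparsity reduces the work, assuming a symmetric positive definite A so that no row swaps are needed. The function `eliminate` and the tridiagonal example are hypothetical; the code skips every entry that is already zero and counts the row subtractions it performs.

```python
def eliminate(A, b):
    """Solve A w = b by row subtraction on [A|b], skipping entries already zero.

    Assumes non-zero pivots without row swaps (true for a symmetric positive
    definite A). Returns the solution and the number of row subtractions.
    """
    k = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # complete system [A|b]
    subtractions = 0

    def clear(r, c):
        # Subtract a multiple of row c from row r so that M[r][c] becomes zero.
        factor = M[r][c] / M[c][c]
        M[r] = [v - factor * u for v, u in zip(M[r], M[c])]
        M[r][c] = 0.0

    for c in range(k):                 # forward pass: zero out below the diagonal
        for r in range(c + 1, k):
            if M[r][c] != 0.0:
                clear(r, c)
                subtractions += 1
    for c in range(k - 1, 0, -1):      # backward pass: zero out above the diagonal
        for r in range(c):
            if M[r][c] != 0.0:
                clear(r, c)
                subtractions += 1
    # The matrix is now diagonal, so k divisions finish the job.
    return [M[i][k] / M[i][i] for i in range(k)], subtractions


# Hypothetical sparse (tridiagonal) system whose exact solution is [1, 2, 3, 4].
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
b = [0.0, 0.0, 0.0, 5.0]
w, subtractions = eliminate(A, b)
```

For this 4×4 tridiagonal system only 6 row subtractions are needed, against 12 for a dense matrix of the same size.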

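Distributed regression
Gaussian elimination and distributed regression work on the same matrices of the complete system [A|b].
Sensor 2 adds the message M_12 from node 1, which corresponds to one step of Gaussian elimination.
The resulting subsystem is enough to compute w_2 and w_3, so sensor 2 can compute them locally.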
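The step above can be sketched for a chain of two nodes. The setup is an assumption made for illustration: node 1 holds a local 2×2 system over (w_1, w_2), node 2 holds one over (w_2, w_3), and the message is what remains of node 1's system after eliminating w_1. The functions `message` and `solve_at_node2` and all numbers are hypothetical.

```python
def message(A_local, b_local):
    """Node 1 eliminates its private variable w1 from its local system over
    (w1, w2) and returns what remains for the shared variable w2."""
    (a11, a12), (a21, a22) = A_local
    b1, b2 = b_local
    factor = a21 / a11                      # one step of Gaussian elimination
    return a22 - factor * a12, b2 - factor * b1


def solve_at_node2(A_local, b_local, msg):
    """Node 2 adds the message to its local system over (w2, w3) and solves it."""
    m_a, m_b = msg
    a, b_ = A_local[0][0] + m_a, A_local[0][1]
    c, d = A_local[1][0], A_local[1][1]
    r2, r3 = b_local[0] + m_b, b_local[1]
    det = a * d - b_ * c
    return (r2 * d - b_ * r3) / det, (a * r3 - c * r2) / det


# Hypothetical local systems. Added together they give the global system
# [[2, 1, 0], [1, 2, 1], [0, 1, 3]] w = [4, 8, 11], whose solution is [1, 2, 3].
node1_A, node1_b = [[2.0, 1.0], [1.0, 1.0]], [4.0, 3.0]
node2_A, node2_b = [[1.0, 1.0], [1.0, 3.0]], [5.0, 11.0]

m12 = message(node1_A, node1_b)
w2, w3 = solve_at_node2(node2_A, node2_b, m12)
```

Node 2 recovers the same w_2 and w_3 as the global solution without ever seeing node 1's measurements, only the two numbers in the message.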

Distributed Regression: solve the global kernel regression problem with simple local communication
1. Specify regions.
2. Sensors compute small matrices that add up to [A|b].
3. Message passing.
4. Solve local systems.
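A minimal sketch of the "small matrices that add up to [A|b]" step. It assumes each sensor contributes the outer product of its basis values together with the basis values times its measurement, and it naively sums whole systems up a tree rather than using the restricted junction tree messages of the actual algorithm. The tree, the basis values, and the readings are hypothetical.

```python
def local_system(h, f):
    """One sensor's contribution, stored as rows of [A_i | b_i]."""
    return [[hi * hj for hj in h] + [hi * f] for hi in h]


def add(P, Q):
    """Entry-wise sum of two systems of the same shape."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def aggregate(node, children, local):
    """Sum the local systems in the subtree rooted at `node` (one message per edge)."""
    total = local[node]
    for child in children.get(node, []):
        total = add(total, aggregate(child, children, local))
    return total


# Hypothetical network: node 0 is the root, nodes 1 and 2 are its children,
# and node 3 hangs off node 2. Two basis functions per sensor.
children = {0: [1, 2], 2: [3]}
basis_values = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [0.0, 1.0], 3: [1.0, 2.0]}
readings = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

local = {n: local_system(basis_values[n], readings[n]) for n in basis_values}
total = aggregate(0, children, local)   # the complete system [A|b] at the root
```

Because addition is order-independent, any spanning tree yields the same total, so the tree can follow whichever links happen to be reliable.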

Communication pattern
If the kernels form a tree structure: communication along a spanning tree.
In practice, kernels may not form a tree structure, and high-quality links may not align with the kernel topology.
Communication then runs along a spanning tree using the junction tree data structure.

Distributed junction trees
[Figure: a spanning tree whose nodes hold kernels K_1 to K_6, and the junction trees derived from it.]
Any spanning tree can be transformed into a junction tree.
Communication along the junction tree is guaranteed to obtain the optimal parameters.
Different spanning trees lead to different junction trees, with different computation and communication complexity.
See Paskin and Guestrin ’04 for spanning tree optimization.

Robustness
Robustness is key in sensor networks: nodes may be added to the network or fail, communication is unreliable, and link qualities change over time.
Distributed regression messages are robust: lost messages correspond to lost measurements.
The spanning tree and junction tree algorithms must also be made robust. See Paskin and Guestrin ’04 for details.
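The claim that lost messages correspond to lost measurements can be illustrated with a toy example. It assumes that each sensor's contribution enters the system additively, so a contribution that never arrives leaves exactly the regression over the surviving sensors. The helper names, the noise-free line, and the deterministic 50% loss pattern are all hypothetical.

```python
def system(readings):
    """Sum per-sensor contributions into (A, b) for two basis functions."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for h, f in readings:
        for i in range(2):
            b[i] += h[i] * f
            for j in range(2):
                A[i][j] += h[i] * h[j]
    return A, b


def solve2(A, b):
    """Solve a 2x2 system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


# Hypothetical noise-free readings on the line f = 20 + 0.5 x, basis {1, x}.
readings = [([1.0, float(x)], 20.0 + 0.5 * x) for x in range(10)]
delivered = readings[::2]             # every other contribution is lost

w_full = solve2(*system(readings))
w_lossy = solve2(*system(delivered))  # still a valid regression, on fewer points
```

With noisy data the two answers would differ slightly, as they would for any regression fitted to half the measurements, but the lossy system remains well formed.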

Locally, nodes obtain global view
[Figure panels: view from node 1; view from node 17; view from node 46; global solution.]

Temperature model for lab data

Convergence and robustness
[Plot legend: distributed regression with reliable communication; distributed regression with 50% of packets lost; offline solution.]

Incremental changes
[Plot legend: distributed regression with reliable communication; distributed regression with 50% of packets lost; offline solution.]
[Plot panels: initializing with noon temperatures; at 6pm, initializing from noon results.]

Residual error varies over time
[Plot legend: average over regions; regression with linear spatial components that is quadratic in time, linear in time, or constant in time.]

Effect of time window

Communication complexity

Extensions and applications
Adaptive sampling.
Outlier and faulty sensor detection.
Contour finding.
Adaptive data modeling.
Basis function selection.
Model-based bit compression.
Bounds on bit precision for Gaussian elimination are applicable.
Hierarchical models.
Unifying with wavelet-based approaches.
Currently applying similar ideas to probabilistic inference, actuator control, …
See Paskin and Guestrin ’04 for details.

Conclusions
A general distributed regression algorithm for sensor networks.
Robust to node and message losses.
Kernel regression is an effective model for a wide range of sensor network data.
Provides a basis for new, more complex sensor network applications.