Minimizing Data Uncertainty through System Design

Laura Balzano, Nabil Hajj Chehade, Sheela Nair, Nithya Ramanathan, Abhishek Sharma, Deborah Estrin, Leana Golubchik, Ramesh Govindan, Mark Hansen, Eddie Kohler, Greg Pottie, Mani Srivastava
Integrity Group, Center for Embedded Networked Sensing (UCLA – UCR – Caltech – USC – UC Merced)

Introduction: There are Many Sources of Uncertainty in Interpreting Data

In many applications, wireless sensing systems are used for inference and prediction of environmental phenomena, and statistical models are widely used to represent those phenomena: a model characterizes how unknown quantities (the phenomena) are related to known quantities (the measurements). Choosing a model involves a great deal of uncertainty. Often a single model M is used; if M does not characterize the phenomenon correctly, the resulting inferences and predictions will be inaccurate. It is better to start with multiple plausible models and select among them by collecting measurements at informative locations.

Uncertainty enters at several points: environment modeling, sensor calibration, and hardware functionality. Data faults are common in practice:

  Deployment       | Data quality indicators
  Bangladesh       | 45%
  GDI              | Sensors reported 3–60% faulty data
  Ecuador Volcano  | 82% false negative rate / 13% false positive rate
  Macroscope       | 8 of 33 temperature sensors faulty

The work below addresses three of these sources of uncertainty: hardware functionality (fault detection/diagnosis), sensor calibration, and model selection. Data uncertainty can be reduced through careful system design!
Reducing Uncertainty in Hardware Functionality (Fault Detection/Diagnosis)

Wireless sensing systems utilize low-cost, unreliable hardware, so faults are common. An accurate calibration function is required to translate data from sensors, and the calibration parameters of most sensors drift non-deterministically over time.

Problem Description: online fault detection and diagnosis. By detecting faults when they occur, instead of after the fact, users can take action in the field to validate questionable data and fix hardware faults. Our system, Confidence, makes the following assumptions: faults can be common; an initial fault-free training period is not always available; and environmental phenomena are hard to predict, so tight bounds on expected behavior are not possible.

Outlier detection: Readings are mapped into a multi-dimensional space defined by carefully chosen features: gradient, distance from LDR, distance from NLDR, and standard deviation. Points far from the origin are considered faulty. Assuming a normal distribution of distances for good points, points outside 2 standard deviations of the mean distance are treated as outliers and rejected; all other points are used to continually update the distribution parameters. Using a continually updated distribution, in place of statically defined thresholds, makes Confidence resilient to human configuration error and adaptable to dynamic environments.

Diagnosis: Points are clustered using an online K-means algorithm, and each cluster is associated with a previously successful remediating action (e.g., replace the sensor).

Evaluated in real-world deployments: Confidence detects faults with low false positive and negative rates, although it is difficult to validate what is truly a fault without ground truth. On a real-world data trace captured in Bangladesh, Confidence detects 85% of faulty data even though over one third of the data are faulty. We also ran Confidence in a deployment of 20 sensors on the San Joaquin River, where we validated the data by analyzing soil samples taken from each sensor.
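The 2-standard-deviation outlier rule can be sketched as follows. This is an illustrative simplification, not Confidence's actual implementation: it uses only two of the four features (gradient and windowed standard deviation, omitting the LDR/NLDR distances), and the class and parameter names are invented.

```python
import numpy as np

class OutlierDetector:
    """Illustrative sketch of the online 2-sigma outlier rule.

    Each reading is mapped to a feature vector; its distance from the
    origin is compared against a continually updated distribution of
    distances for points previously accepted as good.
    """

    def __init__(self, window=5):
        self.window = window
        self.readings = []   # raw reading history
        self.dists = []      # feature-space distances of accepted points

    def _features(self, value):
        prev = self.readings[-1] if self.readings else value
        gradient = value - prev
        spread = float(np.std((self.readings + [value])[-self.window:]))
        return np.array([gradient, spread])

    def update(self, value):
        """Process one reading; return True if it is flagged as an outlier."""
        d = float(np.linalg.norm(self._features(value)))
        self.readings.append(value)
        if len(self.dists) >= self.window:
            mu, sigma = np.mean(self.dists), np.std(self.dists)
            if d > mu + 2.0 * sigma:
                return True          # rejected: not folded into the model
        self.dists.append(d)         # accepted points keep the model current
        return False
```

Because the distance distribution is re-estimated from accepted points only, the threshold tracks slow environmental change without letting faulty readings contaminate it.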
Confidence accurately detected all 4 faults that occurred in the San Joaquin deployment and correctly diagnosed 3 of the 4, with no false positives or negatives.

Data-driven techniques for identifying faulty sensor readings

1) Rule/heuristic-based methods:
– SHORT rule: compute the rate of change between two successive samples. If it is above a threshold, this is an instance of a SHORT fault.
– NOISE rule: compute the standard deviation of samples within a time window W. If it is above a threshold, the samples are corrupted by a NOISE fault.

2) Linear least-squares estimation (LLSE): exploits correlation in the data measured at different sensors. The LLSE of one sensor's reading x given the readings y of correlated sensors is x̂ = μ_x + Σ_xy Σ_yy⁻¹ (y − μ_y); a large discrepancy between x̂ and the measured value indicates a fault.

3) Learned data models (hidden Markov models): an HMM is specified by the number of states, the transition probabilities, and the conditional observation probabilities Pr[O | S].

Results: We analyzed data sets from real-world deployments to characterize the prevalence of data faults using these 3 methods.
– NAMOS deployment: CONSTANT + NOISE faults; up to 30% of samples affected.
– Intel Lab, Berkeley deployment: CONSTANT + NOISE faults; up to 20% of samples affected.
– Great Duck Island deployment: SHORT + NOISE faults; 10–15% of samples affected.
– SensorScope deployment: SHORT faults; very few samples affected.

Signatures for modeling normal and faulty behavior: We summarize sensor and fault behaviors using a signature: a multivariate probability density of features (Cahill, Lambert, Pinheiro, and Sun; 2000), with features chosen to exploit differences between faulty and normal behavior. It is difficult to initialize a sensor signature without a learning period that is guaranteed to be fault-free; a stricter threshold can be used during the learning period to decrease the chance of incorporating faults into the signature. The method also depends on accurately representing fault models, which is difficult without labeled training data.
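The SHORT and NOISE rules reduce to a few lines of code. A sketch, with hypothetical threshold values that would in practice be tuned per deployment:

```python
import numpy as np

SHORT_THRESHOLD = 5.0   # illustrative: max plausible jump between samples
NOISE_THRESHOLD = 2.0   # illustrative: max plausible std. dev. in a window
WINDOW = 10             # illustrative window length W

def short_faults(samples, threshold=SHORT_THRESHOLD):
    """SHORT rule: flag sample i when the change from sample i-1
    exceeds the threshold."""
    x = np.asarray(samples, dtype=float)
    return [i + 1 for i, jump in enumerate(np.abs(np.diff(x)))
            if jump > threshold]

def noise_faults(samples, window=WINDOW, threshold=NOISE_THRESHOLD):
    """NOISE rule: flag each window whose standard deviation exceeds
    the threshold; returns (start, end) index pairs."""
    x = np.asarray(samples, dtype=float)
    return [(s, s + window) for s in range(0, len(x) - window + 1, window)
            if np.std(x[s:s + window]) > threshold]
```

Both rules are cheap enough to run online at each node; their main weakness is sensitivity to the chosen thresholds.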
The current features summarize temporal and spatial information:
– Temporal: the actual reading, the change between successive readings, and battery voltage.
– Spatial: differences from neighboring sensors.

Fault Detection Algorithm (adapted from "Detecting Fraud in the Real World"; Cahill, Lambert, Pinheiro, and Sun; 2000): for each new reading, calculate the feature vector X_t and score it using the log-likelihood ratio of the fault signature F to the sensor signature S_t; higher scores are more suspicious. If the score exceeds a threshold, update the fault signature; otherwise update the sensor signature. The use of per-sensor signatures allows sensor-specific fault detection.

We tested the method on one week of Cold-Air Drainage data (4/06–4/12), which contained a stuck-at fault, unusually noisy readings, and a low-voltage condition. Sensor 2 was malfunctioning at the start of the deployment, so its noisy readings were learned as "normal" sensor behavior.

Signature updates require online density estimation: the density estimate must be updated sequentially with each new reading, since we are unable to store historical data and must represent the density compactly. No single parametric family is flexible enough to represent all feature distributions, so we are developing a new method based on log-splines.

Reducing Uncertainty in Sensor Calibration

Problem Description: blind calibration. Manual calibration is not a scalable practice, so we calibrate sensor responses blindly, from routine measurements collected by the sensor network. Consider a network with n sensors. Let x be the vector of true signals at the n sensors and y the vector of measured signals. Assume the measurements are a linear (gain and offset) function of the truth, y = diag(α) x + β, and assume the true signals x lie in a known r-dimensional subspace of Rⁿ, so that P x = 0, where P is the orthogonal projection matrix onto the orthogonal complement of that subspace. Then, under certain conditions on P, with no noise and exact knowledge of the subspace, we can perfectly recover the gain factors α and partially recover the offset factors β.
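To make the subspace constraint concrete, here is a small numerical sketch of the gain-recovery step, with offsets omitted for brevity. Writing the per-sensor gains as α and γ_i = 1/α_i, each snapshot y_t satisfies P diag(y_t) γ = 0, so stacking these linear constraints and taking the null space recovers the gains up to a global scale. All dimensions and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, T = 6, 2, 50                  # sensors, subspace dimension, snapshots

# True signals lie in an r-dimensional subspace spanned by U
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
X = U @ rng.standard_normal((r, T))      # one column of true signals per snapshot

alpha = rng.uniform(0.5, 2.0, size=n)    # unknown per-sensor gains
Y = alpha[:, None] * X                   # measured signals (no offsets, no noise)

P = np.eye(n) - U @ U.T                  # projector onto the orthogonal complement

# Stack the constraints P @ diag(y_t) @ gamma = 0 and find the null space
A = np.vstack([P @ np.diag(Y[:, t]) for t in range(T)])
gamma_hat = np.linalg.svd(A)[2][-1]      # null direction, defined up to scale

# Agreement with the true inverse gains, up to scale and sign
gamma_true = 1.0 / alpha
cosine = abs(gamma_hat @ gamma_true) / (
    np.linalg.norm(gamma_hat) * np.linalg.norm(gamma_true))
```

In the noiseless, exactly modeled case the cosine similarity is essentially 1, matching the perfect gain recovery claimed above.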
The method is robust to noise (error at 2% noise in the measured signal: gain < 0.01%, offset < 2.4%) and robust to mismodeling (error with 10% of the true signal outside the assumed subspace: gain < 1%, offset < 4%).

Evaluation: In a deployment with all sensors in a styrofoam box, and thus a 1-dimensional signal subspace, the algorithm recovers the gains and offsets almost exactly. In a deployment with sensors spread across a valley at the James Reserve, using a 4-dimensional signal subspace constructed from the calibrated data, the gain calibration was quite accurate; the offset calibration, as expected, captured some of the non-zero-mean signal, and it was also sensitive to the model.

Reducing Uncertainty in Model Selection

Problem Description: optimal sensor placement. Where should we collect measurements to optimally choose a model that represents the field? We assume two plausible models and Gaussian noise. The idea is to find the locations where the "difference" between the two models is largest: at each step, measure at the location that maximizes the squared difference between the predictions of the two fitted models.

Algorithm (T-designs): A sequential algorithm is used to iteratively collect measurements that maximize the discrimination between the two models [1].

Evaluation on real data: Comparing the likelihoods of the two fitted models, M2 fits the data better.

Generalization: In the case of multiple models, apply the same algorithm at each iteration to the two models that best fit the data (the worst case).

[1] A. C. Atkinson and V. V. Fedorov. Optimal design: experiments for discriminating between several models. Biometrika 62, 1975.
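A simplified sketch of the sequential discrimination idea, using two polynomial mean models over a one-dimensional field. The models, field, and noise level are invented for illustration; the real system discriminates spatial models of an environmental field.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.1                                  # Gaussian measurement noise
grid = np.linspace(0.0, 1.0, 101)            # candidate measurement locations

def field(s):                                # "true" phenomenon (quadratic here)
    return 1.0 + 2.0 * s + 1.5 * s**2

# Two plausible models for the mean of the field
basis = {
    "M1 (linear)":    lambda s: np.column_stack([np.ones_like(s), s]),
    "M2 (quadratic)": lambda s: np.column_stack([np.ones_like(s), s, s**2]),
}

def fit_predict(name, s_obs, y_obs, s_new):
    theta, *_ = np.linalg.lstsq(basis[name](s_obs), y_obs, rcond=None)
    return basis[name](s_new) @ theta

# Seed with a few arbitrary locations, then measure where the models disagree most
s_obs = np.array([0.1, 0.5, 0.9])
y_obs = field(s_obs) + sigma * rng.standard_normal(3)
for _ in range(10):
    gap = (fit_predict("M1 (linear)", s_obs, y_obs, grid)
           - fit_predict("M2 (quadratic)", s_obs, y_obs, grid)) ** 2
    s_next = grid[np.argmax(gap)]
    s_obs = np.append(s_obs, s_next)
    y_obs = np.append(y_obs, field(s_next) + sigma * rng.standard_normal())

# Under Gaussian noise, residual sum of squares orders the models by likelihood
rss = {name: float(np.sum((fit_predict(name, s_obs, y_obs, s_obs) - y_obs) ** 2))
       for name in basis}
```

After a handful of adaptively placed measurements, the quadratic model attains the lower residual, mirroring the "M2 fits better" outcome reported above.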