Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj,

Presentation transcript:

Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj, June, 2004

Outline
- Background
- Motivation
- Statistical properties of real-life data streams
- Problems of existing approaches
- Our approach
  - Reduce communication overhead
  - Recover from loss
- Evaluation
- Conclusion and future work

Background
- Aggregate functions
  - MIN, MAX, AVG, COUNT, etc.
- In-network hierarchical processing
  - Query propagation
  - Tree construction
  - Aggregates computed epoch by epoch
- Addressing fault tolerance
  - Multi-root
  - Multi-tree
  - Reliable transmission
[Figure: nodes A, B, C, D, E with a COUNT query]
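The in-network processing listed above can be sketched as a recursive pass over an aggregation tree, in which each node combines its own reading with the partial results of its children. This is a minimal illustration, not the authors' implementation: the topology, the readings, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # parent -> list of children

def count_nodes(tree: Tree, node: str) -> int:
    """COUNT: each node adds 1 to the sum of its children's partial counts."""
    return 1 + sum(count_nodes(tree, child) for child in tree.get(node, []))

def max_reading(tree: Tree, readings: Dict[str, float], node: str) -> float:
    """MAX: each node forwards the largest value seen in its subtree."""
    partials = [max_reading(tree, readings, child) for child in tree.get(node, [])]
    return max([readings[node]] + partials)

def sum_and_count(tree: Tree, readings: Dict[str, float], node: str) -> Tuple[float, int]:
    """AVG needs a (sum, count) partial state; the root divides at the end."""
    total, n = readings[node], 1
    for child in tree.get(node, []):
        child_sum, child_n = sum_and_count(tree, readings, child)
        total += child_sum
        n += child_n
    return total, n

# Hypothetical topology and readings for one epoch (not taken from the slides).
tree = {"A": ["B", "C"], "C": ["D", "E"]}
readings = {"A": 3.0, "B": 7.0, "C": 1.0, "D": 9.0, "E": 5.0}

total, n = sum_and_count(tree, readings, "A")
print(count_nodes(tree, "A"), max_reading(tree, readings, "A"), total / n)
```

Carrying (sum, count) pairs up the tree rather than averages keeps AVG correct when subtrees differ in size.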

Motivation
- Data aggregation is an important function for all network infrastructures
  - Sensor networks
  - P2P networks
  - Network monitoring and intrusion detection systems
- Exact results are not achievable in the face of loss and faults
  - Adding fault tolerance is costly
- Accurate approximation with low communication overhead is crucial, but difficult to achieve

Observation: Comparison of Data Streams
[Figure: three real-world data traces and a random trace]

Statistical Properties of Data Streams
- Density estimation for the relative increment
- Real data streams exhibit temporal correlation, which we can leverage to maintain aggregate accuracy while reducing communication overhead and recovering from data loss.
- Relative increment is defined as: [formula missing from the transcript]
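The defining formula for the relative increment did not survive in this transcript. As a sketch only, the code below assumes one common definition, r_t = (x_t - x_{t-1}) / x_{t-1}. That definition is an assumption, not something the slides confirm, and the function name is hypothetical.

```python
from typing import List

def relative_increments(series: List[float]) -> List[float]:
    """Relative change between consecutive samples.

    ASSUMED definition: r_t = (x_t - x_{t-1}) / x_{t-1}.
    Pairs whose previous value is zero are skipped to avoid division by zero.
    """
    return [
        (cur - prev) / prev
        for prev, cur in zip(series, series[1:])
        if prev != 0
    ]

# A smooth trace yields small increments; a jumpy trace yields large ones.
smooth = relative_increments([100.0, 101.0, 102.0, 101.5])
jumpy = relative_increments([100.0, 10.0, 250.0, 40.0])
print(max(abs(r) for r in smooth), max(abs(r) for r in jumpy))
```

Under this assumed definition, a temporally correlated stream concentrates its increments near zero, which is the kind of property a density estimate would expose.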

Problems in Existing Approaches
- Few approaches exploit temporal properties or are designed to handle data loss
  - TAG uses a simple last-value algorithm for data loss recovery
  - Multi-root and multi-tree schemes make things worse by consuming more resources
- Fragile for large process groups
  - All relevant nodes must participate
- Difficult to trade accuracy for communication overhead
  - Good applications need this tradeoff
  - They only need an approximation
  - But they must minimize resource consumption
  - Olston et al. proposed a centralized adaptive-filtering solution

Our Approach
- Probabilistic data aggregation: a scalable and robust approach
  - Exploit the statistical properties of data streams in the temporal domain
  - Apply statistical algorithms to data aggregation
  - Develop a protocol that handles loss and failures as an essential part of normal operation
- Nodes participate in aggregation and communication according to a statistical sampling algorithm
- In the absence of data, estimate values using time-series algorithms
- Differentiate between voluntary and involuntary loss

Reducing Communication Overhead
- Trade off accuracy against resource consumption
  - Allow selective participation of nodes while maintaining aggregate accuracy
  - Each node participates in the operation with a certain probability, which is the design parameter of the algorithm
- Sampling strategies:
  - Uniform sampling: all nodes use an identical sampling rate
  - Subtree-size-based sampling: a node's sampling rate is proportional to the size of its subtree
  - Variance-based sampling: a sensor reports a new value only if it is above or below its last reported value by more than a threshold percentage
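The three sampling strategies can be sketched as per-node decision rules. This is an illustration under stated assumptions, not the authors' protocol: the function names, the normalization by the largest subtree, and the handling of a zero or missing last report are all hypothetical choices.

```python
import random
from typing import Optional

def uniform_participates(rate: float, rng: random.Random) -> bool:
    """Uniform sampling: every node uses the same participation probability."""
    return rng.random() < rate

def subtree_rate(subtree_size: int, max_subtree_size: int, base_rate: float) -> float:
    """Subtree-size-based sampling: rate proportional to subtree size.

    ASSUMPTION: sizes are normalized by the largest subtree and capped at 1.0.
    """
    return min(1.0, base_rate * subtree_size / max_subtree_size)

def variance_reports(new_value: float, last_reported: Optional[float],
                     threshold: float) -> bool:
    """Variance-based sampling: report only if the value moved by more than
    `threshold` (a fraction, e.g. 0.1 for 10%) relative to the last report."""
    if last_reported is None:      # nothing reported yet: always report
        return True
    if last_reported == 0:         # ASSUMPTION: any change from zero is reported
        return new_value != 0
    return abs(new_value - last_reported) / abs(last_reported) > threshold

rng = random.Random(0)             # seeded for repeatability
participating = sum(uniform_participates(0.3, rng) for _ in range(10_000))
print(participating / 10_000)      # close to the 0.3 sampling rate
print(subtree_rate(5, 10, 0.8))
print(variance_reports(105.0, 100.0, 0.1), variance_reports(120.0, 100.0, 0.1))
```

Of the three rules, only the variance-based one looks at the data itself, which is consistent with it being reported as the most accurate strategy.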

Performance of Sampling Algorithms
- As fewer nodes participate, overall accuracy decreases for all algorithms.
- Uniform sampling performs worst.
- Variance-based sampling is most accurate.
[Figure panels: MAX operation; AVG operation]

Observation: Long-Term Pattern in Data
- Data source: bandwidth measurements for the CUDI network interface on an Abilene router, averaged over 5-minute intervals.
[Figure panels: daily patterns in a weekly data stream; long-term trend]

Two-Level Representation of Data
The data stream can be decomposed into two layers:
- the long-term trend (pattern), which changes slowly;
- the residual, which is high-frequency but low-amplitude.
[Figure: Monday data with its long-term trend]
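The decomposition can be illustrated with a centered moving average standing in for the trend estimator. This is a deliberate simplification: the talk itself uses B-spline regression for the trend, and the window size and function names here are hypothetical.

```python
from typing import List, Tuple

def moving_average_trend(series: List[float], window: int) -> List[float]:
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend

def decompose(series: List[float], window: int = 5) -> Tuple[List[float], List[float]]:
    """Split a stream into a slow trend and a residual, so that
    series[i] == trend[i] + residual[i] up to floating-point rounding."""
    trend = moving_average_trend(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual

# Hypothetical trace: a slow ramp with a small alternating wiggle on top.
trace = [10.0 + 0.5 * i + (0.3 if i % 2 else -0.3) for i in range(20)]
trend, residual = decompose(trace, window=5)
print(max(abs(r) for r in residual))   # residual stays small relative to the trend
```

Any smoother could replace the moving average. What matters for the two-level scheme is that the residual has low amplitude and can be modeled separately.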

Recovering From Loss
- Traditional approaches
  - Last seen value as the approximation for the current epoch
  - Linear prediction
- Two-level data representation and prediction
  - Long-term trend: B-spline estimation
  - High-frequency residual: ARMA modeling
  - ARMA (autoregressive moving average) is a standard time-series technique, used here to model chaotic data streams
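The two traditional approaches can be sketched as simple predictors that fill in epochs whose value was lost. The slides do not specify the form of linear prediction, so the two-point extrapolation below is an assumption. The function names and the use of `None` to mark a lost epoch are also hypothetical.

```python
from typing import Callable, List, Optional

def last_value(history: List[float]) -> float:
    """Reuse the most recent known value for the missing epoch."""
    return history[-1]

def linear_prediction(history: List[float]) -> float:
    """ASSUMED form: extrapolate the line through the last two known values."""
    if len(history) < 2:
        return history[-1]
    return 2 * history[-1] - history[-2]

def fill_losses(stream: List[Optional[float]],
                predictor: Callable[[List[float]], float]) -> List[Optional[float]]:
    """Replace lost epochs (None) with predictions from the values so far.
    Leading losses stay None because there is no history to predict from."""
    filled: List[Optional[float]] = []
    history: List[float] = []
    for value in stream:
        if value is None and history:
            value = predictor(history)
        if value is not None:
            history.append(value)
        filled.append(value)
    return filled

stream = [1.0, 2.0, None, 4.0]
print(fill_losses(stream, last_value))
print(fill_losses(stream, linear_prediction))
```

Predicted values are appended to the history in this sketch, so a run of consecutive losses keeps extrapolating from earlier predictions.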

Two-Level Data Prediction
- B-spline modeling for the long-term trend
  - Piecewise-continuous, low-degree B-splines can represent complex shapes
  - Least-squares B-spline regression for the two-level decomposition
  - B-spline extension for future forecasting
- ARMA forecasting for transient oscillation
  - System identification to determine the order of the model
  - Parameter estimation by an optimization algorithm
  - Low-complexity recursive equation for future forecasting
- Statistical properties for the calibration of prediction results
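The residual-forecasting step can be illustrated with an AR(1) model, the simplest special case of ARMA (one autoregressive term, no moving-average term). This is a sketch under assumptions rather than the method from the talk: the order is fixed at 1 instead of being chosen by system identification, the coefficient is fitted by closed-form least squares, and the function names are hypothetical.

```python
from typing import List

def fit_ar1(residual: List[float]) -> float:
    """Least-squares estimate of phi in r[t] = phi * r[t-1] + noise."""
    num = sum(cur * prev for prev, cur in zip(residual, residual[1:]))
    den = sum(prev * prev for prev in residual[:-1])
    return num / den if den else 0.0

def forecast_ar1(last: float, phi: float, steps: int) -> List[float]:
    """Recursive forecast: each step applies r[t+1] = phi * r[t]."""
    out = []
    value = last
    for _ in range(steps):
        value = phi * value
        out.append(value)
    return out

# Hypothetical residual that decays by half each epoch.
residual = [1.0, 0.5, 0.25, 0.125]
phi = fit_ar1(residual)
print(phi, forecast_ar1(residual[-1], phi, 2))
```

In a two-level scheme, a forecast like this would be added to the extrapolated trend to produce the predicted value for a missing epoch.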

Performance of Prediction Algorithms
[Figure: performance of prediction algorithms for the MAX operation in a lossless environment]

Performance of Prediction Algorithms
[Figure: performance of prediction algorithms in lossy environments]
- The average loss rate of the network is 20%.
- The ratio of loss rates between wide-area links and local links is 3:1.

Summary of Results
- All prediction algorithms are effective in improving the accuracy of aggregation results
- The two-level prediction approach performs best in all situations
  - Achieves more than 90% accuracy even when each node's nonparticipation rate is as high as 60%
  - Is effective even in a high-loss environment

Conclusion and Future Work
- Apply statistical algorithms to data aggregation systems
  - Quantify the statistical properties of real-world measurement data
  - Propose the concept of probabilistic participation of nodes
  - Propose a multi-level prediction mechanism to recover from sampling and data loss
- Uniqueness: multi-level prediction enables high accuracy even under high loss and voluntary non-participation
- Future work
  - Develop an online algorithm and exploit the tradeoff between prediction accuracy and computation and storage cost
  - Build a real system for applications in network health monitoring, traffic measurement, and router statistics aggregation
  - Real system implementation and deployment

The Danger of Prediction
[Figure panels: prediction without statistical calibration; prediction with statistical calibration]