Proof Sketches: Verifiable In-Network Aggregation Minos Garofalakis Yahoo! Research, UC Berkeley, Intel Research Berkeley

Presentation transcript:

Proof Sketches: Verifiable In-Network Aggregation
Minos Garofalakis, Joe Hellerstein, Petros Maniatis
Yahoo! Research, UC Berkeley, Intel Research Berkeley

Introduction & Motivation
Context: Distributed, in-network aggregation
  – Network monitoring, sensornet/p2p query processing, …
  – Data is distributed – cannot afford to warehouse!
  – Approximations are often sufficient
  – Can trade off approximation quality against communication
Querier: “How many Win-XP hosts running patch X have CPU utilization > 95%?”
  – A “predicate poll” query
More general aggregate queries (SUM, AVG) and general-purpose summaries (e.g., random samples) of (sub)populations

In-Network Aggregation
Typical assumption: Benign aggregation infrastructure
  – Aggregator nodes cannot “misbehave”
BUT, aggregators are often untrusted!
  – 3rd-party hosted operations (e.g., Akamai), shared infrastructure, viruses/worms, …
Challenge: Verifiable, efficient, in-network aggregation
  – Provide trustworthy, guaranteed-quality results with potentially malicious aggregators

Our Contributions
Proof Sketches: Family of certificates for verifiable, approximate, in-network aggregation
  – Concise sketch synopses ⇒ communication-efficient
  – Guarantee detection of malicious tampering with high probability (whp) if result is perturbed by more than a small error bound
Basic Technique: Combines FM sketch with compact Authentication Manifest (AM)
  – Prevents inflation through crypto signatures; bounds deflation through complementary deflation detection
Extensions: Verifiable random sampling; verifiable aggregates over multi-tuple nodes

Talk Outline
• Introduction and Motivation
• Overview of Contributions
• System Model
• AM-FM Proof Sketches
• Extensions
  – Verifiable random samples
  – Verifiable aggregation over multi-tuple nodes
• Experimental Results
• Conclusions

System Model
Inflation Attacks: Aggregators can manipulate or inject spurious partial state records (PSRs)
Deflation Attacks: Aggregators can suppress valid PSRs
U = size of sensor population

A Naïve Inflation Detector
Straightforward application of crypto signatures
  – Each sensor node cryptographically signs each tuple satisfying the predicate poll, and sends up the tuple + signature
  – Aggregators simply union the signed tuple sets and forward them up the tree
Aggregators cannot forge sensor tuples
  – Within the guarantees of the crypto function
BUT, size of Authentication Manifest (AM) = size of answer set: O(U) in general!

Solution: AM-FM Proof Sketches
Sketch and AM structure of size only O(log U)
Based on the FM sketch for distinct-element counting
  – Hash function h() maps each item x to a position in a bitmap of size O(log U)
  – h(x) = k with probability 1/2^(k+1): P[h(x)=0] = 1/2, P[h(x)=1] = 1/4, P[h(x)=2] = 1/8, …
  – Index of rightmost zero ~ log(Count)
  – O(log(1/δ)/ε²) sketches to get an estimate of the Count
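The FM sketch described on this slide can be illustrated with a minimal Python sketch. It assumes SHA-256 as the hash function, a 32-bit bitmap, and the standard Flajolet-Martin correction constant 0.77351; the function names (`fm_level`, `fm_sketch`, `fm_estimate`) are hypothetical and not taken from the talk. The slide's "rightmost zero" is implemented as the lowest-index zero bit, assuming bit 0 is drawn at the right.

```python
import hashlib

BITMAP_BITS = 32  # assumed bitmap width, i.e. O(log U) bits


def fm_level(item: str, seed: int = 0) -> int:
    """Map an item to level k with probability 1/2^(k+1).

    The level is the number of trailing zero bits of a seeded hash value.
    """
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    level = 0
    while level < BITMAP_BITS - 1 and (value >> level) & 1 == 0:
        level += 1
    return level


def fm_sketch(items, seed: int = 0) -> int:
    """Build one FM bitmap (as an int): set bit h(x) for every item x."""
    bitmap = 0
    for item in items:
        bitmap |= 1 << fm_level(item, seed)
    return bitmap


def lowest_zero(bitmap: int) -> int:
    """Index of the lowest-order zero bit, roughly log2(count)."""
    k = 0
    while (bitmap >> k) & 1:
        k += 1
    return k


def fm_estimate(bitmaps) -> float:
    """Average the zero index over independent sketches, then correct the bias."""
    avg = sum(lowest_zero(b) for b in bitmaps) / len(bitmaps)
    return 2 ** avg / 0.77351


# Usage: 64 independent sketches (one seed each) over 1000 distinct items.
items = [f"sensor-{i}" for i in range(1000)]
sketches = [fm_sketch(items, seed) for seed in range(64)]
print(round(fm_estimate(sketches)))
```

Because a bit is set by OR-ing, duplicates do not change the bitmap, and two bitmaps built with the same seed can be merged with a bitwise OR, which is what makes the structure suitable for aggregation up a tree.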

Adding AM to FM: Inflation Prevention
Observation: Each FM sketch bit is an independent function of the input tuples
AM = Authenticate each 1-bit in the FM sketch using a signed “witness”/exemplar sensor tuple
  – Crypto-signed tuple that turns that bit on
Aggregators: Merge input PSRs (AM-FM sketches)
  – OR the FM sketches
  – Keep a single exemplar for each 1-bit
  – Size = O(log U)
  – Cannot forge 1-bits
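The merge and verification steps on this slide might look roughly as follows. This is only an illustration under stated assumptions: HMAC-SHA256 with a per-sensor key shared with the querier stands in for the sensors' crypto signatures (the talk does not say which scheme is used), a PSR is modelled as a dictionary from bit index to one witness, and all names are hypothetical.

```python
import hashlib
import hmac


def fm_level(item: str) -> int:
    """FM hash: level k with probability 1/2^(k+1) (trailing zero bits)."""
    value = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    level = 0
    while level < 31 and (value >> level) & 1 == 0:
        level += 1
    return level


def sign(key: bytes, item: str) -> bytes:
    """Stand-in for a sensor signature (assumption: HMAC with a per-sensor key)."""
    return hmac.new(key, item.encode(), hashlib.sha256).digest()


def sensor_psr(sensor_id: str, key: bytes, item: str) -> dict:
    """PSR of one sensor: the bit its tuple turns on, with the signed witness."""
    return {fm_level(item): (sensor_id, item, sign(key, item))}


def merge(psr_a: dict, psr_b: dict) -> dict:
    """Aggregator step: OR the sketches, keep a single exemplar per 1-bit."""
    merged = dict(psr_a)
    for level, witness in psr_b.items():
        merged.setdefault(level, witness)
    return merged


def verify(psr: dict, keys: dict) -> bool:
    """Querier step: every 1-bit needs a validly signed tuple hashing to that bit."""
    for level, (sensor_id, item, sig) in psr.items():
        key = keys.get(sensor_id)
        if key is None or fm_level(item) != level:
            return False
        if not hmac.compare_digest(sig, sign(key, item)):
            return False
    return True


# Usage: 50 sensors, each contributing one tuple, merged pairwise up a chain.
keys = {f"s{i}": bytes([i]) * 16 for i in range(50)}
merged = {}
for sid, key in keys.items():
    merged = merge(merged, sensor_psr(sid, key, f"{sid}:cpu>95"))
print(len(merged), verify(merged, keys))
```

The merged structure never holds more than one witness per bit position, so its size stays logarithmic in U, while an aggregator that sets an extra bit would have to produce a valid signature it does not hold.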

AM-FM Proof Sketches: Bounding Deflation
Malicious aggregator can omit 1-bits & witnesses from sketch ⇒ underestimate predicate poll count
Approach: Complementary Deflation Detection
  – Assumes that we know sensor count U
  – Use AM-FM to estimate count for both pred and !pred
  – Check that C_pred + C_!pred is close to U (based on sketching approximation guarantees)
Adversary cannot inflate C_!pred to compensate for deflating C_pred
Sum check will catch significant deviations

More Formally…
Assume O(log(1/δ)/ε²) AM-FM proof sketches to estimate C_pred and C_!pred
Verification Condition: Flag adversarial attack if C_pred + C_!pred < (1-ε)U
Theorem: If verification step is successful, the AM-FM estimate is within ±2εU of the true C_pred whp
  – Adversary cannot deflate the result by more than 2εU without being detected whp
  – Relative error guarantees for high-selectivity predicates
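As a small worked illustration of the verification condition, the check can be written as a one-line comparison. This assumes the sketch error parameter is called epsilon (the symbol is not legible in the transcript) and uses made-up estimate values; the function names are hypothetical.

```python
def complementary_check(c_pred: float, c_not_pred: float,
                        population: int, epsilon: float) -> bool:
    """Return True if the two estimates pass the sum check.

    A False result corresponds to flagging an adversarial attack, i.e.
    C_pred + C_!pred < (1 - epsilon) * U.
    """
    return c_pred + c_not_pred >= (1 - epsilon) * population


def deflation_bound(population: int, epsilon: float) -> float:
    """Largest deflation that can go undetected per the theorem: 2 * epsilon * U."""
    return 2 * epsilon * population


# Usage with illustrative numbers (assumed, not taken from the experiments).
U = 100_000
print(complementary_check(30_500, 68_000, U, 0.15))  # estimates sum close to U
print(complementary_check(12_000, 68_000, U, 0.15))  # C_pred heavily deflated
print(deflation_bound(U, 0.15))
```

The check relies on the point made on the previous slide: the adversary cannot raise the estimate for !pred to hide a suppressed pred count, because every 1-bit in either sketch needs a signed witness.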

Verifiable Random Sampling
Build a general-purpose, verifiable synopsis of node data
  – Can support arbitrary predicates, quantile/heavy-hitter queries, …
Traditional (e.g., reservoir) sampling + authentication fails
  – Adversary can arbitrarily bias the sample
Solution: AM-Sample Proof Sketches
  – Use FM hashing to sample; retain tuples + AMs for all tuples mapping above a certain level
  – À la Distinct Sampling [Gibbons’01]: adapt level based on target sample size
  – Easily merged up the tree using max-level
  – Verification condition and error guarantees based on target sample size and knowledge of U
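The level-based sampling and max-level merge can be sketched as below. This is a simplified illustration in the spirit of distinct sampling, not the authors' algorithm: the signatures (AMs) attached to each retained tuple are omitted, SHA-256 is an assumed hash, and the names `leaf_sample`, `merge_samples` and `estimate_count` are hypothetical.

```python
import hashlib


def fm_level(item: str) -> int:
    """FM hash: an item reaches level >= L with probability 1/2^L."""
    value = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    level = 0
    while level < 31 and (value >> level) & 1 == 0:
        level += 1
    return level


def shrink(level: int, items: set, target: int):
    """Keep items at or above `level`; raise the level until the sample fits."""
    items = {x for x in items if fm_level(x) >= level}
    while len(items) > target:
        level += 1
        items = {x for x in items if fm_level(x) >= level}
    return level, items


def leaf_sample(items, target: int):
    """Sample built at a leaf: a (level, retained tuples) pair."""
    return shrink(0, set(items), target)


def merge_samples(a, b, target: int):
    """Merge two samples: take the max level, union, and re-shrink if needed."""
    return shrink(max(a[0], b[0]), a[1] | b[1], target)


def estimate_count(sample, predicate) -> int:
    """Scale the matching sample tuples by the inverse sampling rate 2^level."""
    level, items = sample
    return sum(1 for x in items if predicate(x)) * 2 ** level


# Usage: two subtrees sampled separately, then merged at their parent.
tuples = [f"s{i}" for i in range(1000)]
left = leaf_sample(tuples[:400], target=64)
right = leaf_sample(tuples[400:], target=64)
merged = merge_samples(left, right, target=64)
print(merged[0], len(merged[1]), estimate_count(merged, lambda x: x.endswith("7")))
```

Since membership in the sample is decided only by the hash of each tuple, merging partial samples gives the same result as sampling the whole population at once, which leaves an aggregator no freedom to choose which tuples to keep.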

Aggregates over Multi-Tuple Nodes
So far, focus on predicate poll queries
  – Each sensor contributes ≤ 1 tuple to result
Key Issue: Knowing the total number of tuples M
  – With known M, our earlier results and analysis apply
Approach: Verifiable approximate counting algorithm
  – Estimate M using a logarithmic number of simple AM-FM predicate polls
  – To within a given accuracy ε, using predicate polls of the form “fraction of sensors with #tuples ≥ a threshold indexed by k”
  – Detailed algorithm, analysis, … in the paper

Other Extensions / Issues
Discuss “generalized template” for proof sketches to support verifiable query results
  – E.g., Bloom-filter proof sketch
Accountability: Trace-back mechanisms for pinpointing attackers
Only approximate knowledge of population size U

Experimental Study
Study average-case behavior of AM-FM proof sketches for verifiable predicate polls
Population of 100K sensors, number of sketches fixed to 256
  – About 4% of the space of the “naïve” approach
  – ε ≈ 0.15 w.p. 0.8
Parameters
  – Predicate selectivity
  – “Coverage” of malicious aggregators
  – Two adversarial strategies (Targeted, Safe)

Some Results: Benign Population

Experimental Summary
Average-case behavior is better than (worst-case) bounds suggest
  – Adversary has even less “wiggle room” to deflate result without being detected
  – Bounds based on worst case for sketch approximation and combination of pred/!pred estimates
Adversary typically has limited coverage in the aggregation tree
  – Can only affect a small fraction of the aggregated results

Conclusions
Introduced Proof Sketches – first compact certificate structure for verifiable, in-network aggregation
Basic technique: AM-FM proof sketch
  – Adds concise AM to basic FM sketch; bounds deflation through complementary deflation detection
Extensions
  – Verifiable random sampling
  – Approximate verifiable counting for general aggregates over multi-tuple nodes
Future: Extending ideas and methodology to more general approximate in-network queries (e.g., joins)

Thank you!

Some Results: Safe Adversary