Secure and Highly-Available Aggregation Queries via Set Sampling
Haifeng Yu, National University of Singapore


Secure Aggregation Queries in Sensor Networks
• Multi-hop sensor network with a trusted base station
• Some sensors may be malicious (Byzantine)
• Goal: Count the number of sensors sensing smoke (i.e., satisfying a certain predicate)
  • Sum, Avg, and other aggregates are similar (see paper)
• Type-1 attack: Malicious sensors report fake readings
  • If the number of malicious sensors is small, the damage is limited
  • Not the focus of our work

Secure Aggregation Queries in Sensor Networks
• Type-2 attack: Malicious sensors (indirectly) corrupt the readings of other sensors, causing much larger damage
  • E.g., in tree-based aggregation
  • Focus of most research on secure aggregation, and our focus too
[Figure: tree-based aggregation showing a malicious sensor and the base station]

State of the Art and Our Goal
• Active area in recent years (e.g., [Chan et al.’06], [Frikken et al.’08], [Roy et al.’06], [Nath et al.’09])
• All these approaches focus on detection (i.e., safety only)
  • They detect whether the result is corrupted
  • But they do not produce a correct result when under attack
• Our goal:
  • Detecting attacks → Tolerating attacks
  • Safety only → Safety + Liveness
  • System made harmless → System made useful

Our Approach to Tolerating Attacks
• Previous approaches: Fix the security holes in tree-based aggregation
  • Dilemma in in-network processing
• Our novel approach: Use sampling
  • With MACs on each sample, security comes almost automatically
• Challenge with sampling: Potentially large overhead
[Figure annotations: the sampled sensor floods the sample result (with a MAC); the result cannot be modified]

Background: Estimate Count via Sampling
• n sensors, b of them sensing smoke (called black sensors)
• Goal: Output an (ε, δ) approximation b’ such that: [formula missing from transcript]
• E.g.: Sample 10 sensors and 5 are black → b’ = 0.5n
• Classic result: the number of sensors that must be sampled is [formula missing from transcript]
  • (Prohibitively) expensive for small b
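The naive estimator on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the uniform sampling with replacement, and the example values of n and b are hypothetical and do not come from the talk.

```python
import random

def scale_estimate(n, hits, num_samples):
    """Scale the sampled black fraction to the whole network: b' = n * hits / samples."""
    return n * hits / num_samples

def estimate_count_naive(is_black, num_samples, rng):
    """Sample individual sensors uniformly (with replacement) and estimate b.

    is_black is a list of booleans, one per sensor (an assumed representation).
    """
    n = len(is_black)
    hits = sum(1 for _ in range(num_samples) if is_black[rng.randrange(n)])
    return scale_estimate(n, hits, num_samples)

# Slide example: 5 black sensors out of 10 sampled gives b' = 0.5 n.
print(scale_estimate(n=100, hits=5, num_samples=10))

# Hypothetical networks: the estimate is informative when b is large,
# but with very few black sensors most runs never hit one.
n = 10_000
rng = random.Random(0)
many_black = [i < 5_000 for i in range(n)]
few_black = [i < 3 for i in range(n)]
print(estimate_count_naive(many_black, 300, rng))
print(estimate_count_naive(few_black, 300, rng))
```

The second hypothetical network shows why small b is the hard case: almost every sample misses the black sensors, so the estimate is usually 0.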

Reduce the Overhead via Set Sampling
• Challenge with small b:
  • Need many samples to encounter black sensors
• Set sampling: Sample a set of sensors together
  • The binary result tells whether any sensor in the set is black (but not how many)
  • Efficient implementation in sensor networks (later)
  • Should make it easier to hit sets containing black sensors
• How effective will this be? (How many sets do we need to sample to estimate the count?)
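A small Python sketch may help show why sampling a set hits black sensors more easily than sampling one sensor at a time. It assumes, purely for illustration, that a set is a uniformly random k-subset of the n sensors; the talk's sets are instead defined on a sampling tree. The names `sample_set` and `hit_probability` are hypothetical.

```python
import math

def sample_set(sensor_set, black_sensors):
    """Binary set-sampling result: True iff any sensor in the set is black."""
    return any(s in black_sensors for s in sensor_set)

def hit_probability(n, b, k):
    """Probability that a uniformly random k-subset of n sensors
    contains at least one of the b black sensors."""
    return 1.0 - math.comb(n - b, k) / math.comb(n, k)

# The result is binary: it does not say how many black sensors were in the set.
print(sample_set({"A", "B", "C", "D"}, {"A", "C"}))
print(sample_set({"B", "D"}, {"A", "C"}))

# With n = 10,000 and b = 5 (hypothetical values), larger sets are hit far more often.
for k in (1, 100, 1000):
    print(k, hit_probability(10_000, 5, k))
```

The price is that each sample carries only one bit, which is why the question of how many sets are needed is not obvious.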

Our Results
• Novel algorithm for estimating count using set sampling
  • Defines randomized, inter-related sets and samples them adaptively
• Number of sets that must be sampled: [formula missing from transcript]
• Previously, without set sampling: [formula missing from transcript]
• Number of samples reduced from polynomial to polylogarithmic (can be further reduced; see paper)

Our Results
• Per-sensor msg complexity: [formula missing from transcript]
  • Comparable to some detection-only protocols [Roy et al.’06]
  • Similar msg sizes
• See paper for time complexity
• See paper for other aggregates (sum, avg)
• Set sampling + novel algorithms using set sampling
  • Enables secure aggregation queries despite adversarial interference

Outline of This Talk
• Background, goal, and summary of results
• Simple implementation of set sampling in sensor networks
• Main technical results: Novel algorithm for estimating count via set sampling

Implementing Set Sampling: Non-Secure Version
• Goal: O(1) per-sensor msg complexity for sampling a set
• Example: sample the set {A, B, C, D}
• Request flooded from the base station: O(log n) bits
  • We use only O(n) (instead of O(2^n)) random sets
  • O(log n) bits to name a set
• Reply: Single bit
  • Flooded back from all black sensors in the set (e.g., A and C)
  • Each sensor forwards only the first message it receives
  • Base station sees a binary answer
• Multiple samples can be taken in one flooding
  • Our algorithm takes samples in O(log n) sequential stages
  • Only O(log n) rounds of flooding
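The reply flood can be simulated to see why each sensor sends at most one message per sampled set. The sketch below is an assumption-laden model, not the protocol from the paper: the topology, the function name `reply_flood`, and counting one local broadcast as one message are all choices made here for illustration.

```python
from collections import deque

def reply_flood(neighbors, base, repliers):
    """Simulate the one-bit reply flood.

    neighbors: dict mapping each node to its adjacent nodes (assumed topology).
    repliers:  black sensors in the sampled set, which start the flood.
    Returns (base_heard, sent), where sent counts broadcasts per node.
    """
    sent = {node: 0 for node in neighbors}
    forwarded = set(repliers)          # nodes that have already queued the bit
    queue = deque(repliers)
    base_heard = False
    while queue:
        node = queue.popleft()
        sent[node] += 1                # one local broadcast of the single bit
        for nb in neighbors[node]:
            if nb == base:
                base_heard = True      # the base station only listens
            elif nb not in forwarded:  # only the first message is forwarded
                forwarded.add(nb)
                queue.append(nb)
    return base_heard, sent

# Hypothetical topology: base - A - B - C, with D also attached to A.
neighbors = {
    "base": ["A"],
    "A": ["base", "B", "D"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["A"],
}
heard, sent = reply_flood(neighbors, "base", {"A", "C"})
print(heard, sent)
```

Because a sensor ignores everything after the first message, the per-sensor cost stays constant no matter how many black sensors reply, which is only possible because the answer is a single bit.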

Implementing Set Sampling: Secure Design
• Each set = some distinct symmetric key K
  • Preload K onto all sensors in the set
  • Each sensor should be in only a small number of sets: O(log n) in our protocol
• Request: <name of K, nonce>
• Reply: <MAC_K(nonce)>
  • Only sensors holding K can generate it
• DoS attacks are possible
  • Can be avoided with an improved design (see paper)
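The request/reply exchange can be sketched with Python's standard `hmac` module. The talk does not name a MAC algorithm, so HMAC-SHA256 truncated to 8 bytes is an assumption made here (8 bytes matches the MAC size quoted in the system model); the function names and the key name "K5" are hypothetical.

```python
import hashlib
import hmac
import os

MAC_LEN = 8  # bytes; assumed truncation length

def make_request(key_name):
    """Base station: name the key (i.e., the set) and attach a fresh nonce."""
    return key_name, os.urandom(8)

def make_reply(key, nonce):
    """Black sensor holding K: reply with MAC_K(nonce)."""
    return hmac.new(key, nonce, hashlib.sha256).digest()[:MAC_LEN]

def verify_reply(key, nonce, reply):
    """Base station: accept only a MAC that matches the key and this nonce."""
    return hmac.compare_digest(make_reply(key, nonce), reply)

key_k5 = b"hypothetical-key-K5"
name, nonce = make_request("K5")
reply = make_reply(key_k5, nonce)
print(verify_reply(key_k5, nonce, reply))            # a holder of K5 is accepted
print(verify_reply(key_k5, nonce, b"\x00" * MAC_LEN))  # a made-up reply is rejected with overwhelming probability
```

The fresh nonce keeps an old reply from being replayed for a new request, and a sensor without K cannot produce a valid reply.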

Outline of This Talk
• Background, goal, and summary of results
• Implement set sampling in sensor networks
• Main technical meat: Novel algorithm for estimating count via set sampling
  • For now, assume all sensors are honest
  • Security follows from the clean security guarantees of sampling, though some minor modifications are needed (see paper)

Random Sets on the Sampling Tree
• Basic approach:
  • Construct (related) randomized sets of different sizes and sample them adaptively
• The base station internally creates a sampling tree
  • A complete binary tree with 4n leaves
  • Each tree node = a distinct symmetric key = some set of sensors
• The sampling tree is an internal data structure, not the network topology

[Figure: sampling tree with sensors A and B attached to leaves]
• Each sensor is associated with a uniformly random leaf (independently)
• Each tree node corresponds to the set containing all the sensors in its subtree
• K1, K2, K5, K10 are loaded onto sensor A
• K1, K3, K6, K12 are loaded onto sensor B
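The key lists in this figure are consistent with heap-style numbering of the tree (root K1, children of Ki are K2i and K2i+1). Assuming that numbering, which is inferred from the example rather than stated in the talk, the keys preloaded onto a sensor are the ones on the path from the root to its leaf. The function names below are hypothetical.

```python
import random

def keys_for_leaf(leaf_node):
    """Key indices on the path from the root (1) to the given leaf, heap numbering."""
    path = []
    node = leaf_node
    while node >= 1:
        path.append(node)
        node //= 2          # parent in heap numbering
    return path[::-1]

def assign_random_leaves(sensors, num_leaves, rng):
    """Map each sensor to an independent, uniformly random leaf.

    num_leaves must be a power of two; leaves are nodes num_leaves .. 2*num_leaves - 1.
    """
    return {s: num_leaves + rng.randrange(num_leaves) for s in sensors}

# The figure's tree has 8 leaves (nodes 8..15): A sits at leaf 10, B at leaf 12.
print(keys_for_leaf(10))
print(keys_for_leaf(12))

# Random placement for a hypothetical network of 4 sensors and 4n = 16 leaves.
leaves = assign_random_leaves(["s1", "s2", "s3", "s4"], 16, random.Random(0))
print({s: keys_for_leaf(leaf) for s, leaf in leaves.items()})
```

Each sensor holds one key per tree level, which is where the O(log n) bound on the number of sets per sensor comes from.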

Properties of the Sampling Tree
• A sensor is black if it satisfies the predicate
• A key is black iff the corresponding set contains a black sensor
• Key quantity: the fraction of black keys at level i [symbol missing from transcript]

• The fraction of black keys is monotonic as we go down the tree
  • It decreases by a factor of at most 2 per level
  • At the top (assuming at least one black sensor): [value missing from transcript]
  • At the bottom (4n leaves!): [value missing from transcript]
• Lemma: There exists a level with [bound missing from transcript]
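The monotonicity and the factor-of-2 property can be checked on a small tree. The sketch below computes the fraction of black keys on every level from the set of leaves that hold a black sensor; the representation and function names are assumptions made for illustration. The properties follow because every black node has a black parent and at least one black child, so the number of black keys at most doubles from one level to the next while the number of keys exactly doubles.

```python
import random

def black_fraction_per_level(num_leaves, black_leaves):
    """Fraction of black keys on each level, index 0 = root.

    num_leaves must be a power of two; black_leaves holds the 0-based
    positions of leaves that have at least one black sensor.
    """
    level = [pos in black_leaves for pos in range(num_leaves)]
    fractions = []
    while True:
        fractions.append(sum(level) / len(level))
        if len(level) == 1:
            break
        # A parent key is black iff at least one of its children is black.
        level = [level[i] or level[i + 1] for i in range(0, len(level), 2)]
    return fractions[::-1]

def check_properties(fractions):
    """Going down: non-increasing, and never dropping by more than a factor of 2."""
    return all(nxt <= cur <= 2 * nxt for cur, nxt in zip(fractions, fractions[1:]))

# Hypothetical example: n = 8 sensors, 4n = 32 leaves, black sensors on leaves 3 and 17.
fractions = black_fraction_per_level(32, {3, 17})
print(fractions)  # [1.0, 1.0, 0.5, 0.25, 0.125, 0.0625]
print(check_properties(fractions))
```

In this example the fraction is 1 at the root and at most 1/4 at the leaves, since at most n of the 4n leaves can hold a black sensor.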

Why This Level Helps
• The fraction of black keys is not too small
  • Efficient estimation of the fraction via naive sampling: [number missing from transcript] samples on this level yield an (ε, δ) approximation of it
• The fraction of black keys is not too large
  • Can potentially estimate the final count directly from it
  • Chernoff-type occupancy tail bound for balls into bins
  • See paper for details

Additional Issues: Too Few Keys on the Level
• Challenge:
  • To estimate the final count from the fraction of black keys, the number of keys on the chosen level needs to be large enough
  • If not, we need to track down to lower levels
  • Need to leverage other interesting properties of the sampling tree
  • See paper

Additional Issues: Finding the Level
• Binary search over the O(log n) levels
  • On each level i examined, sample a small number of random keys to roughly estimate the fraction of black keys
  • Extremely efficient
• Challenges:
  • The binary search operates on estimated values (which have error and may not be monotonic)
  • When the fraction is small, the estimate has an error guarantee on one side only
  • See paper
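The binary search can be sketched as follows. This is a simplified illustration, not the algorithm from the paper: it searches for the shallowest level whose black-key fraction is at most a threshold `upper`, and the value 0.5 is a placeholder chosen here because the lemma's actual bounds are missing from the transcript. The search is run on exact fractions; the sampling-based estimator is included only to show where the estimation error mentioned on the slide would enter.

```python
import random

def estimate_fraction(black_flags, num_samples, rng):
    """Roughly estimate a level's black fraction by sampling random keys on it."""
    hits = sum(1 for _ in range(num_samples)
               if black_flags[rng.randrange(len(black_flags))])
    return hits / num_samples

def find_level(fraction_at, num_levels, upper=0.5):
    """Binary search for the shallowest level with fraction <= upper.

    fraction_at(i) must be non-increasing in i (level 0 = root) and the
    deepest level must satisfy fraction <= upper. Both hold for exact
    fractions on the sampling tree when upper >= 1/4.
    """
    lo, hi = 0, num_levels - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if fraction_at(mid) <= upper:
            hi = mid          # the answer is at mid or above it
        else:
            lo = mid + 1      # the answer is strictly below mid
    return lo

# Exact per-level fractions for a hypothetical 32-leaf tree with two black leaves.
fractions = [1.0, 1.0, 0.5, 0.25, 0.125, 0.0625]
level = find_level(lambda i: fractions[i], len(fractions))
print(level, fractions[level])
```

Because the fraction drops by at most a factor of 2 per level, the level found has a fraction above `upper / 2` whenever the level above it exceeds `upper`. With estimated values that guarantee no longer holds automatically, which is the challenge the slide points to.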

Example Numerical Results
• n = 10,000, and the count result (b) ranges from 0 to 10,000
• Overhead:
  • 5-15 sequential stages of sampling
  • Total samples: [number missing from transcript]
• Avg approximation error: (1±0.08)
  • Hard to get better accuracy even in trusted environments ([Nath et al.’09])
• Naive sampling: 300 samples give the same accuracy only when b > 2,000

Conclusions
• Making aggregation queries secure is critical for many sensor network applications
• Contribution:
  • Detecting attacks → Tolerating attacks
  • Safety only → Safety + Liveness
• Our approach:
  • Abandon in-network processing and use sampling
  • Use novel set sampling to reduce the overhead
  • Polynomial overhead → Logarithmic overhead

Related Work to Set Sampling
• Decision tree complexity for threshold-t functions (i.e., whether b ≥ t) [Ben-Asher and Newman’95] [Aspnes’09]
  • Most results are for error-free deterministic protocols
  • Large lower bound: Ω(t) (implying Ω(b) for count)
  • No prior results for general Monte Carlo randomized algorithms

Tolerating Attacks is Difficult
• Example: Byzantine consensus
  • Detection is substantially easier than tolerance
  • The n ≥ 3f + 1 lower bound applies only to tolerance, not to detection
• Pinpointing / revoking malicious sensors is hard
  • E.g., due to lack of public-key authentication
  • Active research area by itself

System Model
• Multi-hop sensor network with a trusted base station
• Performance metric: Time complexity (see paper)
• Performance metric: Per-sensor msg complexity
  • Max number of msgs sent/received by a single sensor (captures load balance)
  • msg size is either 8 bytes (size of a MAC) or log(n) bits
• Collisions ignored, as in all prior work
  • Or one can apply existing algorithms

Implementing Set Sampling: Non-Secure Version
• Goal: O(1) per-sensor msg complexity for sampling a set
• Request flooding: every sensor sends/receives one msg
• Request size: We use at most O(n) (random) sets
  • O(log(n)) bits to name a set

Implementing Set Sampling: Non-Secure Version
• Goal: O(1) per-sensor msg complexity for sampling a set
• Reply: Single bit
• Reply flooding: Only the first reply is forwarded
  • This is why set sampling is designed to be binary
[Figure: sensors A, B, C, D; B, C, and D satisfy the predicate, A does not]

(The overhead of sampling a set needs to be properly controlled; will discuss later.)

Translating to b
• We now have a good estimate of the fraction of black keys on the chosen level
• Need to produce a good estimate of b
• Let the number of keys on that level be n
• Throw b balls into n bins
  • The fraction of occupied bins has the same distribution as the fraction of black keys
  • This distribution is highly concentrated near its mean (Chernoff-type occupancy tail bound), assuming:
    • the fraction is not too close to 1
    • n is not too small
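One way to read the balls-into-bins step is as an inversion of the mean occupancy. With b balls thrown independently into m bins, a given bin stays empty with probability (1 - 1/m)^b, so the expected occupied fraction is 1 - (1 - 1/m)^b. The sketch below solves that expression for b. It is a simplified illustration under assumptions: m is this sketch's name for the number of keys on the level, the function names are hypothetical, and the paper's actual estimator may differ.

```python
import math

def expected_occupied_fraction(b, m):
    """Mean fraction of occupied bins after throwing b balls into m bins."""
    return 1.0 - (1.0 - 1.0 / m) ** b

def estimate_b(occupied_fraction, m):
    """Invert the mean occupancy to recover an estimate of b (requires m >= 2).

    Only meaningful when the fraction is below 1: if every bin is occupied,
    the observation no longer distinguishes between large values of b.
    """
    if not 0.0 <= occupied_fraction < 1.0:
        raise ValueError("occupied fraction must be in [0, 1)")
    return math.log(1.0 - occupied_fraction) / math.log(1.0 - 1.0 / m)

# Hypothetical level with m = 256 keys and b = 50 black sensors.
p = expected_occupied_fraction(50, 256)
print(p, estimate_b(p, 256))
```

The `ValueError` branch reflects the slide's caveat that the fraction must not be too close to 1: near 1 the logarithm blows up and small estimation errors in the fraction turn into large errors in b.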

Summary of Techniques to Achieve the Results
• Define randomized sets based on a complete binary tree
  • Interesting relationships among the sets
• Sample the sets adaptively
• Leverage Chernoff-type occupancy tail bounds for balls-into-bins