SIA: Secure Information Aggregation in Sensor Networks. Bartosz Przydatek, Dawn Song, Adrian Perrig, Carnegie Mellon University. Carl Hartung, CSCI 7143: Secure Sensor Networks.
Overview: Secure aggregation. What is aggregation in sensor networks? Why aggregate? Security issues with aggregation. Communication efficiency vs. accuracy. Aggregate-Commit-Prove. Computing median, min/max, average. Conclusions.
Aggregation in sensor networks: Aggregators collect information from nearby sensors, process it locally, and send the processed information to the user. This reduces communication and power consumption.
Why Aggregate? Given a query, returning all raw data collected from each sensor may be unnecessary and inefficient. Instead, the information should be processed and aggregated within the network, and only the aggregated result returned.
Security issues with aggregation: Node compromise (one or more sensor nodes, or the aggregator(s)). Denial of service. Stealth attack: make the user accept false aggregation results. Goal of the paper: prevent the user from accepting incorrect results.
Communication: Each sensor has a unique identifier and shares one key with the home server and one with the aggregator. The home server and the aggregator hold master keys K_B and K_A, respectively. Each node stores the shared keys MAC_{K_B}(node ID) and MAC_{K_A}(node ID), where MAC is a secure message authentication code.
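The key setup above can be sketched in a few lines. This is a minimal illustration, assuming HMAC-SHA256 as the MAC (the slides do not name one); the master key values, the node ID, and the helper names `derive_node_key` and `tag` are hypothetical.

```python
import hashlib
import hmac


def derive_node_key(master_key: bytes, node_id: bytes) -> bytes:
    """Per-node key computed as MAC_K(node ID); HMAC-SHA256 is an assumed choice."""
    return hmac.new(master_key, node_id, hashlib.sha256).digest()


def tag(key: bytes, message: bytes) -> bytes:
    """Authenticate a message under a node key."""
    return hmac.new(key, message, hashlib.sha256).digest()


# Illustrative values only.
K_B = b"home-server-master-key"
K_A = b"aggregator-master-key"
node_id = b"sensor-17"

# The node is provisioned with both derived keys and never sees K_B or K_A.
key_with_server = derive_node_key(K_B, node_id)
key_with_aggregator = derive_node_key(K_A, node_id)

# The node authenticates a reading for the home server.
reading = b"temp=21.5"
reading_tag = tag(key_with_server, reading)

# The home server re-derives the node key from K_B and the node ID on demand,
# so it does not need to store one key per sensor.
server_ok = hmac.compare_digest(reading_tag, tag(derive_node_key(K_B, node_id), reading))
```

One consequence of this design is that each party stores a single master key regardless of how many sensors are deployed.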
Assumptions: Uncorrupted sensors can reach each other via paths of uncorrupted sensors (including the aggregator). The base station has a mechanism to broadcast authentic messages such that each node can verify their authenticity (TESLA, other?).
More Assumptions: The attacker can corrupt the aggregator as well as some sensors, but at most a small fraction of the nodes. The attacker has complete control over the corrupted node(s).
Efficiency vs. Accuracy: Assume communication between the nodes/aggregator and the home server is expensive. Trivial solution: send all data along with the aggregated data so the home server can verify it, at the cost of linear communication. To get sub-linear communication, we must be willing to accept a small non-zero probability of error.
Efficiency vs. Accuracy: Let f be a function mapping a_1, …, a_n to the real numbers, and let y = f(a_1, …, a_n). ỹ is a multiplicative ε-approximation of y if (1 − ε)y ≤ ỹ ≤ (1 + ε)y. In addition to the approximation error ε, δ upper-bounds the probability of not detecting a cheating aggregator. This is called an (ε, δ)-approximation: it finds an ε-approximation with probability at least 1 − δ.
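A small sketch of the two parameters. The first function is a direct check of the multiplicative ε-approximation inequality, assuming y ≥ 0. The second is only an illustration of how δ can relate to sampling, under the assumption (not taken from the slides) that each sample independently exposes a cheating aggregator with probability p; the names `is_eps_approximation` and `samples_needed` are hypothetical.

```python
import math


def is_eps_approximation(y: float, y_est: float, eps: float) -> bool:
    """True if (1 - eps) * y <= y_est <= (1 + eps) * y; assumes y >= 0."""
    return (1 - eps) * y <= y_est <= (1 + eps) * y


def samples_needed(p: float, delta: float) -> int:
    """Smallest k with (1 - p)**k <= delta.

    Assumes each sample independently catches a cheat with probability p,
    so k samples all miss it with probability (1 - p)**k.
    """
    return math.ceil(math.log(delta) / math.log(1 - p))


within = is_eps_approximation(100.0, 104.0, 0.05)   # inside the 5% band
outside = is_eps_approximation(100.0, 106.0, 0.05)  # outside the 5% band
k = samples_needed(0.1, 0.01)
```

Under that independence assumption, the number of samples grows with log(1/δ) and does not depend on n, which is what makes sub-linear communication plausible.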
Aggregate-Commit-Prove: Aggregators compute an aggregate of the sensor nodes' data. They report the aggregate to the home server along with a commitment to the collected data. The home server and the aggregator then run an efficient interactive proof so that the home server can either verify the result or detect cheating.
Aggregator collects data: [Figure: sensor nodes A, B, C; Aggregator; Home Server.] Nodes share a key with the aggregator, which prevents impersonation but not flawed data from a corrupt sensor.
Aggregator commits data: [Figure: binary hash tree over the messages m_0, …, m_7, with leaf hashes v_{3,0}, …, v_{3,7}, internal nodes v_{2,0}, …, v_{2,3} and v_{1,0}, v_{1,1}, and root v_{0,0} = H(v_{1,0} || v_{1,1}).] Example: m_5 is authentic if the following holds: v_{0,0} = H(v_{1,0} || H(H(v_{3,4} || H(m_5)) || v_{2,3})).
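The hash-tree commitment and the m_5 example can be reproduced with a short sketch. It assumes SHA-256 for H and a power-of-two number of messages; the function names are illustrative, not from the paper.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).digest()


def build_tree(messages):
    """Return all levels, leaf hashes first, root last.

    Assumes len(messages) is a power of two.
    """
    level = [H(m) for m in messages]
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes from the leaf at `index` up to, but excluding, the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling differs in the lowest bit
        index //= 2
    return path


def verify(root, message, index, path):
    """Recompute the root from a message and its sibling path."""
    node = H(message)
    for sibling in path:
        # Even index: node is the left child; odd index: the right child.
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == root


messages = [f"m{i}".encode() for i in range(8)]
levels = build_tree(messages)
root = levels[-1][0]            # v_{0,0}, the commitment
path = auth_path(levels, 5)     # [v_{3,4}, v_{2,3}, v_{1,0}]
ok = verify(root, b"m5", 5, path)
```

For eight messages the path holds three hashes, so proving one element costs O(log n) values instead of the full data set.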
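Aggregator proves data: [Figure: sensor nodes A, B, C; Aggregator; Home Server. The aggregator sends the aggregated data and the commitment to the home server.] The home server checks the committed data against the aggregated data in order to verify the result.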
Computing the Median: Require the aggregator to commit to the values with a hash-tree construction AND to sort them first. There are two committed sequences: one sorted on measured values, one sorted on sensor IDs. Pick random elements from one list and verify that they are present in the other. Pick random elements from the committed sequence and check that elements picked from the left half are less than the median and elements from the right half are greater. Requires only O(log n/ε) elements to check whether the reported value is an ε-approximation.
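The left-half/right-half sampling test can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: it reads the committed sequence directly, whereas the real protocol would verify every sampled element against the hash-tree commitment and also cross-check the two committed sequences. Comparisons are non-strict so that duplicate readings do not cause a false alarm; `median_check` and its parameters are hypothetical names.

```python
import random


def median_check(committed, claimed_median, num_samples, rng):
    """Spot-check a claimed median against a sequence committed as sorted."""
    n = len(committed)
    mid = n // 2
    # The element at the centre of the committed sequence must be the claim.
    if committed[mid] != claimed_median:
        return False
    for _ in range(num_samples):
        i = rng.randrange(n)
        value = committed[i]
        if i < mid and value > claimed_median:
            return False  # left-half element larger than the median
        if i > mid and value < claimed_median:
            return False  # right-half element smaller than the median
    return True


honest = list(range(101))                # already sorted, true median is 50
honest_ok = median_check(honest, 50, 50, random.Random(0))

# A cheating aggregator claims 90 and moves it to the centre position.
forged = list(range(101))
forged[50], forged[90] = forged[90], forged[50]
forged_ok = median_check(forged, 90, 50, random.Random(0))
```

In the forged sequence 40 of the 101 positions violate the check, so 50 random samples miss all of them only with negligible probability.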
Computing the Min/Max: Construct a spanning tree in the network of sensors such that the root of the tree holds the minimum element. Each node authenticates its final state using the key it shares with the home server and sends the authenticated state to the aggregator. The aggregator checks the consistency of the tree, commits to the list of all nodes and their states, and reports the root node to the home server. The home server randomly picks a node in the committed list and traverses the path from the chosen node to the root, checking the consistency of the constructed tree. If all checks succeed, the home server accepts the value reported by the aggregator.
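The home server's path check can be sketched as below. The state layout (`parent`, `value`, `root`, `min`) and the function names are assumptions made for illustration; the sketch also omits the MAC on each state and the hash-tree proof that a state belongs to the committed list.

```python
import random


def check_path_to_root(states, start, claimed_root, claimed_min):
    """Walk from `start` to the root, checking each node's state on the way."""
    node = start
    for _ in range(len(states)):  # bounded walk, so a parent cycle is rejected
        state = states[node]
        # Every node must agree on who the root is and what the minimum is.
        if state["root"] != claimed_root or state["min"] != claimed_min:
            return False
        # No node may hold a reading below the claimed minimum.
        if state["value"] < claimed_min:
            return False
        if node == claimed_root:
            return state["value"] == claimed_min
        node = state["parent"]
        if node not in states:
            return False
    return False


def spot_check(states, claimed_root, claimed_min, num_samples, rng):
    """Check the paths from a few randomly chosen nodes."""
    nodes = sorted(states)
    return all(
        check_path_to_root(states, rng.choice(nodes), claimed_root, claimed_min)
        for _ in range(num_samples)
    )


# Hypothetical four-node network; node "c" holds the minimum reading.
states = {
    "a": {"parent": "b", "value": 7, "root": "c", "min": 1},
    "b": {"parent": "c", "value": 3, "root": "c", "min": 1},
    "c": {"parent": "c", "value": 1, "root": "c", "min": 1},
    "d": {"parent": "c", "value": 5, "root": "c", "min": 1},
}
accepted = spot_check(states, "c", 1, 3, random.Random(1))
rejected = check_path_to_root(states, "a", "b", 3)  # aggregator lies about the root
```

Because honest nodes authenticate their own states, an aggregator that reports a wrong root contradicts the states it committed to, and the walk fails at the first node that disagrees.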
Counting Distinct Elements, Random Node Selection: The home server distributes a hash function h. Sensors compute MIN using h, their ID, and the time interval. Find lower and upper bounds using sampling.
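One way to read the random node selection step is that the node whose hash value is smallest gets selected. The sketch below assumes that reading, models h as SHA-256 over a server-chosen seed, the node ID, and the interval number, and computes the minimum centrally, whereas in the network it would be found with the MIN protocol. All names and the input encoding are hypothetical.

```python
import hashlib


def selection_value(seed: bytes, node_id: bytes, interval: int) -> bytes:
    """h(ID, interval), with h modelled as SHA-256 keyed by a server-chosen seed."""
    data = seed + b"|" + node_id + b"|" + interval.to_bytes(8, "big")
    return hashlib.sha256(data).digest()


def select_node(seed: bytes, node_ids, interval: int) -> bytes:
    """Return the node whose hash value is smallest for this interval."""
    return min(node_ids, key=lambda nid: selection_value(seed, nid, interval))


node_ids = [f"sensor-{i}".encode() for i in range(10)]
seed = b"seed-broadcast-by-home-server"  # illustrative value
chosen = select_node(seed, node_ids, interval=42)
```

Since the nodes cannot predict h before the home server distributes it, the selected node is effectively random from their point of view.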
Forward Secure Authentication: Time is divided into constant-length intervals. At the beginning of each interval, each sensor updates the key it shares with the home server using a one-way function. It uses the updated key to compute the MAC on the sensing data during that interval. If an attacker compromises the sensor at a later time, the one-way function prevents the attacker from computing the MAC keys of previous intervals. Problem: how to efficiently store past data and authenticators.
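A minimal sketch of the key update, assuming SHA-256 as the one-way function and HMAC-SHA256 as the MAC (neither is named on the slide); the `Sensor` class and helper names are illustrative.

```python
import hashlib
import hmac


def next_key(key: bytes) -> bytes:
    """One-way key update; SHA-256 with a fixed label is an assumed choice."""
    return hashlib.sha256(b"key-update" + key).digest()


def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


class Sensor:
    def __init__(self, initial_key: bytes):
        self.key = initial_key
        self.interval = 0

    def advance(self) -> None:
        """Start a new interval; the old key is overwritten and not kept."""
        self.key = next_key(self.key)
        self.interval += 1

    def authenticate(self, reading: bytes):
        return self.interval, mac(self.key, reading)


def key_for_interval(initial_key: bytes, interval: int) -> bytes:
    """Home-server side: re-derive the key for any interval from the initial key."""
    key = initial_key
    for _ in range(interval):
        key = next_key(key)
    return key


initial_key = b"key-shared-with-home-server"  # illustrative value
sensor = Sensor(initial_key)
for _ in range(3):
    sensor.advance()
interval, reading_tag = sensor.authenticate(b"temp=21.5")
valid = hmac.compare_digest(
    reading_tag, mac(key_for_interval(initial_key, interval), b"temp=21.5")
)
```

An attacker who captures the sensor in interval 3 obtains only the current key; recovering the key for interval 2 would require inverting the hash.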
Hierarchical Aggregation: If the network is too big, multiple aggregators may be needed. Use regular aggregators and super aggregators: super aggregators aggregate the data from the regular aggregators.
Conclusions: It is possible to securely aggregate information using the aggregate-commit-prove framework even when some nodes (including the aggregator) are compromised. This can be done with less than linear communication: not all values from all nodes need to be sent to the home server to verify that the aggregation is correct. Forward secure authentication ensures that an attacker cannot change previous values/measurements on a node that is compromised later in time.