haowen chan cmu
Outline
  The Secure Aggregation Problem
  Algorithm Description
  Algorithm Analysis
    Proof (sketch) of correctness
    Proof (sketch) of overhead bound
In-Network Data Aggregation
  Querier Q asks: “What is the sum of all the sensor readings?”
  [Figure: the sensor readings are aggregated inside the network and the answer is returned to Q.]
Attacker Model
  The deployment area is unsecured.
  Sensor nodes are not tamper-resistant.
  The adversary may undetectably take control of sensor nodes or the base station.
Correct Data Aggregation
  [Figure: querier Q and the sensor network.]
Sensor Reading Falsification
  [Figure: querier Q and the sensor network.]
  A malicious node reports a false sensor reading (within legal bounds).
Sensor Reading Falsification
  General aggregation problem: assume no application-specific information, so the attacker's data is indistinguishable from true data.
  Sensor reading falsification is always possible in any general secure aggregation algorithm.
  The attacker's ability is limited by how many nodes are compromised.
Aggregation Result Falsification
  [Figure: querier Q and the sensor network.]
  A malicious node reports a false aggregation result.
Aggregation Result Falsification
  A single malicious node may cause unbounded deviation in the query result.
  Secure aggregation problem: can we restrict the attacker's ability to falsify aggregation results?
  Tightest possible restriction without application knowledge: the attacker can only perform sensor reading falsification attacks, or attacks equivalent to them.
Prior Related Work
  Either probabilistic detection or only for special cases.
  Single malicious node: L. Hu and D. Evans [2003]; P. Jadia and A. Mathuria [2004].
  Flat aggregator topology: B. Przydatek, A. Perrig, D. Song [2003]; W. Du, J. Deng, Y. Han, P.K. Varshney [2003].
  Probabilistic detection: B. Przydatek, A. Perrig, D. Song [2003]; Y. Yang, X. Wang, S. Zhu, G. Cao [2006].
Our Algorithm
  Handles general hierarchical (tree-based) aggregation topologies.
  Tolerates multiple compromised nodes (an unbounded number).
  Achieves the tightest possible bound on the adversary's ability to change the aggregation result.
  Low communication overhead: $O(\log^2 n)$ edge congestion.
Preventing SUM Result Deflation
  Consider only the SUM aggregate; there are straightforward reductions from COUNT, AVG, MEDIAN to SUM.
  Assume the adversary only wishes to reduce the aggregate result.
  Sensor readings are nonnegative: $v_i \in [0, m]$.
  Let $S$ be the sum of the reported sensor readings of all legitimate nodes. If the adversary reports any $S' < S$, then we detect its presence.
  The adversary gains no additional benefit from aggregation result falsification over sensor reading falsification.
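The slide does not spell out the reductions. As a minimal sketch, one standard way to obtain COUNT and AVG from SUM is shown below: COUNT is the SUM of a constant 1 reported by every node, and AVG is SUM divided by COUNT. The function `secure_sum` is a hypothetical stand-in for the secure SUM protocol, and the readings are made-up illustration values; MEDIAN is left out because its reduction is more involved.

```python
from typing import List


def secure_sum(values: List[int]) -> int:
    # Hypothetical stand-in for one run of the secure SUM protocol.
    return sum(values)


readings = [3, 5, 2, 7]  # made-up sensor readings

# COUNT: every node reports the constant 1.
count = secure_sum([1 for _ in readings])

# AVG: one SUM query over the readings, divided by COUNT.
average = secure_sum(readings) / count

print(count, average)  # 4 4.25
```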
Generating Commitments
  Require nodes to cryptographically commit to a single version of the aggregation process.
  Any aggregation result falsification causes an inconsistency at some position in the commitment structure.
  The verification process can discover the inconsistency.
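Commitment Tree
  [Figure: aggregation tree over nodes A through F, shown next to the corresponding commitment tree with vertices $M_A$, $M_B$, $M_{AB}$, $M_C$, $M_D$, $M_{ABCD}$, $M_E$, $M_F$ and root $M_R$.]
  Leaf vertices: $M_A = \langle A;\ v_A \rangle$, $M_B = \langle B;\ v_B \rangle$.
  Internal vertices: $M_{AB} = \langle h(M_A \,\|\, M_B);\ v_A + v_B \rangle$, with $v_{AB} = v_A + v_B$.
  $M_{ABCD} = \langle h(M_{AB} \,\|\, M_D \,\|\, M_C);\ v_{AB} + v_D + v_C \rangle$.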
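The commitment vertices above can be sketched in a few lines of Python. This is an illustration only: SHA-256 as the hash $h$, the byte encoding in `encode`, and the sensor readings are all assumptions that do not come from the slides.

```python
import hashlib
from typing import NamedTuple


class Vertex(NamedTuple):
    digest: bytes  # node ID for a leaf, hash of the children for an internal vertex
    value: int     # sensor reading (leaf) or partial sum (internal vertex)


def encode(v: Vertex) -> bytes:
    # Assumed encoding: length-prefixed digest followed by a 64-bit value,
    # so that concatenating several vertices is unambiguous.
    return len(v.digest).to_bytes(2, "big") + v.digest + v.value.to_bytes(8, "big")


def leaf(node_id: str, reading: int) -> Vertex:
    # M_A = <A; v_A>
    return Vertex(node_id.encode(), reading)


def combine(*children: Vertex) -> Vertex:
    # M_parent = <h(M_1 || M_2 || ...); v_1 + v_2 + ...>
    digest = hashlib.sha256(b"".join(encode(c) for c in children)).digest()
    return Vertex(digest, sum(c.value for c in children))


# Made-up readings for nodes A, B, C, D.
m_a, m_b, m_c, m_d = leaf("A", 3), leaf("B", 5), leaf("C", 2), leaf("D", 7)
m_ab = combine(m_a, m_b)          # M_AB
m_abcd = combine(m_ab, m_d, m_c)  # M_ABCD, children in the order used on the slide

print(m_ab.value, m_abcd.value)  # 8 17
```

Changing any reading or any child digest changes the parent's digest, which is what lets a later verification step detect an inconsistent aggregation.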
Main Idea
  The commitment structure is probed to verify aggregation correctness.
  Prior work: the querier performs the probing. It cannot probe every node, and probing causes too much congestion near the base station.
  New idea: distribute the verification process to the sensor nodes. Every sensor node checks that its own sensor reading was included in the aggregate.
Self-verification
  The querier disseminates the commitment tree root $M_R$ using authenticated broadcast, e.g. μTESLA [Perrig et al. ’01].
  Node A verifies its own contribution:
    Node A receives the commitment tree root $M_R$.
    Node A requests all off-path vertices for $M_A$.
    Node A verifies that the inputs to each aggregation step are non-negative.
    Node A verifies that the correct $M_R$ can be recomputed.
Self-Verification of Node C
  [Figure: commitment tree with vertices $M_A$, $M_B$, $M_{AB}$, $M_C$, $M_D$, $M_{ABCD}$, $M_E$, $M_F$ and root $M_R$.]
  Request the off-path vertices for $M_C$.
  Check that $v_{AB}$, $v_D$, $v_E$, $v_F$ are all non-negative.
  Recompute $M_{ABCD}$ and $M_R$.
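Node C's check can be sketched as follows. The sketch assumes SHA-256 for $h$, a simple byte encoding, made-up readings, and that the root is formed from $M_{ABCD}$, $M_E$, $M_F$ in that order; none of these details are fixed by the slides. Each step of the path carries the off-path sibling vertices and the position of the on-path vertex among the children.

```python
import hashlib
from typing import List, NamedTuple, Tuple


class Vertex(NamedTuple):
    digest: bytes  # node ID (leaf) or hash of the children (internal vertex)
    value: int     # sensor reading (leaf) or partial sum (internal vertex)


def encode(v: Vertex) -> bytes:
    # Assumed encoding: length-prefixed digest plus a signed 64-bit value.
    return (len(v.digest).to_bytes(2, "big") + v.digest
            + v.value.to_bytes(8, "big", signed=True))


def combine(children: List[Vertex]) -> Vertex:
    digest = hashlib.sha256(b"".join(encode(c) for c in children)).digest()
    return Vertex(digest, sum(c.value for c in children))


# (off-path siblings, index of the on-path vertex among the children)
Step = Tuple[List[Vertex], int]


def self_verify(own: Vertex, steps: List[Step], root: Vertex) -> bool:
    current = own
    for siblings, position in steps:
        # Every other input to this aggregation step must be non-negative.
        if any(s.value < 0 for s in siblings):
            return False
        # Re-insert the on-path vertex and recompute the parent.
        current = combine(siblings[:position] + [current] + siblings[position:])
    # The recomputed root must equal the broadcast root M_R.
    return current == root


m_a, m_b, m_c = Vertex(b"A", 3), Vertex(b"B", 5), Vertex(b"C", 2)
m_d, m_e, m_f = Vertex(b"D", 7), Vertex(b"E", 1), Vertex(b"F", 4)
m_ab = combine([m_a, m_b])
m_abcd = combine([m_ab, m_d, m_c])
m_r = combine([m_abcd, m_e, m_f])

steps_for_c = [([m_ab, m_d], 2), ([m_e, m_f], 0)]
print(self_verify(m_c, steps_for_c, m_r))                    # True
print(self_verify(m_c, steps_for_c, m_r._replace(value=0)))  # False
```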
Aggregating Verification Results
  Each node shares a secret key with the querier.
  Node A's “OK” bit phrase for query $k$: $\mathrm{MAC}_{K_A}(\text{Query } k \text{ verified OK by node } A)$.
  OK bit phrases are aggregated using XOR on the way to the querier.
  The querier verifies that the received aggregate bit phrase is the XOR of all nodes' bit phrases.
  If any node does not respond with OK, this test fails and the aggregation result is rejected.
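A minimal sketch of the XOR aggregation of OK bit phrases follows. HMAC-SHA256 as the MAC, the message wording, and the keys are illustrative assumptions; the slides only require a MAC under the key each node shares with the querier.

```python
import hashlib
import hmac
from functools import reduce


def ok_phrase(key: bytes, query_id: int, node_id: str) -> bytes:
    # MAC_{K_A}("query k verified OK by node A"), here using HMAC-SHA256.
    message = f"query {query_id} verified OK by node {node_id}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical per-node keys shared with the querier.
keys = {"A": b"key-A", "B": b"key-B", "C": b"key-C"}
query_id = 7

# The querier knows every key, so it can compute the expected aggregate.
expected = reduce(xor_bytes, (ok_phrase(k, query_id, n) for n, k in keys.items()))

# In the network, phrases are XORed hop by hop; the order does not matter.
phrases = {n: ok_phrase(k, query_id, n) for n, k in keys.items()}
received = xor_bytes(phrases["C"], xor_bytes(phrases["A"], phrases["B"]))
print(hmac.compare_digest(received, expected))  # True

# If node C stays silent, the aggregate no longer matches and the result is rejected.
missing_c = xor_bytes(phrases["A"], phrases["B"])
print(hmac.compare_digest(missing_c, expected))  # False
```

XOR keeps the aggregate at a constant size regardless of the number of nodes, which is why it fits the low-congestion goal.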
Aggregating with XOR
  [Figure]
Motivating Observations
  Correctness: self-verification is cumulative. The net result of all nodes performing independent self-verification is equivalent to having a central querier verify every node.
  Efficiency: the standard metric is congestion, the maximum communication load on any single edge. Self-verification incurs low congestion, even if every node performs self-verification.
Correctness
  Lemma: if two legitimate nodes A and B both pass their verifications, then the SUM aggregate has value at least $v_A + v_B$.
  Observation: intermediate sums are non-decreasing.
  [Figure: path $M_A$, $M_{AX}$, $M_{AXYZ}$ with off-path vertices $M_X$ ($v_X \ge 0$) and $M_{YZ}$ ($v_{YZ} \ge 0$), giving $v_{AX} \ge v_A$ and $v_{AXYZ} \ge v_A$.]
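Correctness
  [Figure: commitment tree with root $M_R$; $M_C$ is the lowest common ancestor (LCA) of $M_A$ and $M_B$, with vertices $M_X$ and $M_Y$ below it.]
  $v_X \ge v_A$ and $v_Y \ge v_B$.
  $M_X$ and $M_Y$ are distinct, since $h$ is collision-resistant.
  Hence $v_C \ge v_A + v_B$, and therefore $v_R \ge v_A + v_B$.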
Correctness
  Corollary: if all legitimate nodes pass their verifications, then the final aggregation result is at least $S = \sum_{i \text{ legit}} v_i$.
  Lower bound: the adversary cannot report a result less than the sum of the legitimate sensor readings.
  Upper bound?
Upper Bound
  Reduce the upper bound problem to the lower bound problem.
  Alongside $S = \sum_{i=1}^{n} v_i$, compute simultaneously the complement sum aggregate $\bar{S} = \sum_{i=1}^{n} (m - v_i)$ (recall that $v_i \in [0, m]$).
  Querier checks: $\bar{S} = nm - S$.
  Adversary: to increase $S$, it must decrease $\bar{S}$. But neither $S$ nor $\bar{S}$ can be decreased below the contribution of the legitimate nodes.
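A small worked example may help show why the complement sum pins down the upper bound. The numbers below are made up: $n = 4$ nodes, $m = 10$, three legitimate nodes and one compromised node. The helper names are hypothetical.

```python
from typing import List, Tuple


def querier_accepts(s: int, s_bar: int, n: int, m: int) -> bool:
    # The querier's consistency check: S_bar = n*m - S.
    return s_bar == n * m - s


def feasible_range(legit_readings: List[int], n: int, m: int) -> Tuple[int, int]:
    # S cannot drop below the legitimate sum, and S_bar cannot drop below the
    # legitimate complement sum, which caps S from above.
    lower = sum(legit_readings)
    legit_complement = sum(m - v for v in legit_readings)
    upper = n * m - legit_complement
    return lower, upper


n, m = 4, 10
legit = [3, 5, 2]  # readings of the three legitimate nodes

print(feasible_range(legit, n, m))  # (10, 20)

# Reporting S = 25 would require S_bar = 15, below the legitimate complement of 20.
print(querier_accepts(25, 15, n, m))  # True, but the S_bar self-verification fails
```

The accepted range $[10, 20]$ is exactly what the compromised node could reach by reporting a reading in $[0, 10]$, so the adversary gains nothing beyond sensor reading falsification.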
Efficiency
  Suppose the aggregation tree is balanced.
  When node A self-verifies, it receives all $O(\log n)$ off-path vertices in the commitment tree.
  Maximum congestion: $O(\log n)$ messages on the leaf edge of A.
Efficiency
  Self-verification of other nodes (e.g. node B) does not increase the communication load on any edge of the path between node A and the root.
  [Figure: nodes A, B, C with commitment vertices $M_X$ and $M_Y$.]
Efficiency
  Edge congestion in balanced aggregation trees: $O(\log n)$.
  For an arbitrary unbalanced aggregation topology: define a balanced logical aggregation overlay over the physical topology (details in paper). This incurs a multiplicative $\log n$ factor.
  Edge congestion for general aggregation trees: $O(\log^2 n)$.
Naive Commitment Tree
  [Figure: aggregation tree over nodes A through F, shown next to its commitment tree with vertices $M_A$, $M_B$, $M_{AB}$, $M_C$, $M_D$, $M_{ABCD}$, $M_E$, $M_F$, $M_{ABCDEF}$.]
  The topology of the commitment tree is identical to that of the aggregation tree (with the addition of pendant vertices to all internal nodes).
Balancing the Commitment Tree
  An unbalanced aggregation tree yields an unbalanced naïve commitment tree.
  Long paths in the commitment tree mean high communication overhead.
  Idea: instead of one commitment tree, keep a forest of complete commitment trees, each of height $O(\log n)$.
  Construct this forest using delayed aggregation.
Delayed Aggregation
  Only perform aggregation on subtrees of equal height.
  [Figure: nodes A, B, C, D with commitment vertices $M_A$, $M_B$, $M_{AB}$, $M_C$, $M_D$, $M_{CD}$, $M_{ABC}$, $M_{ABCD}$; one edge carries the pair $M_{AB}$; $M_C$.]
Delayed Aggregation
  All trees in the commitment forest are complete and have distinct heights.
  The tallest tree has height at most $\log n$.
  There are at most $\log n$ trees.
  Each sensor node receives (and transmits) $O(\log n)$ commitment subtree root values.
  This gives $O(\log^2 n)$ edge congestion (proof in paper).
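The merging rule behind delayed aggregation behaves like binary addition with carries: two trees are combined only when their heights are equal, and the result may trigger a further merge. The sketch below tracks only a height, a label, and a partial sum per tree; the hashing of children is omitted for brevity. It also assumes a chain topology A, B, C, D and made-up readings, which the slides do not state.

```python
from typing import Dict, List, Tuple

# (height, label, partial sum) of a complete commitment tree
Tree = Tuple[int, str, int]


def merge_forest(trees: List[Tree]) -> List[Tree]:
    """Combine trees of equal height until all remaining heights are distinct."""
    by_height: Dict[int, Tree] = {}
    for tree in trees:
        # Like a carry in binary addition: merging two trees of height h
        # yields one tree of height h + 1, which may collide again.
        while tree[0] in by_height:
            other = by_height.pop(tree[0])
            tree = (tree[0] + 1, other[1] + tree[1], other[2] + tree[2])
        by_height[tree[0]] = tree
    return sorted(by_height.values())


# Node C receives M_AB (height 1) and adds its own leaf M_C (height 0):
# heights differ, so both roots are forwarded unmerged.
at_c = merge_forest([(1, "AB", 8), (0, "C", 2)])
print(at_c)  # [(0, 'C', 2), (1, 'AB', 8)]

# Node D adds its leaf M_D: C and D merge into CD, then AB and CD into ABCD.
at_d = merge_forest(at_c + [(0, "D", 7)])
print(at_d)  # [(2, 'ABCD', 17)]
```

With six leaves the forest ends up with one tree of height 1 and one of height 2, mirroring the binary representation of 6, which is why the number of trees stays logarithmic in the number of nodes.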
Congestion Bound
  The commitment tree is an overlay network on the aggregation tree.
  Each commitment tree vertex resides at the sensor node that created it.
  For A to self-probe, send all $O(\log n)$ off-path vertices to its leaf vertex $M_A$.
  This gives $O(\log n)$ congestion at the leaf edge of $M_A$.
Congestion Bound
  In the aggregation tree, each sensor node reports $O(\log n)$ roots of subtrees to its parent.
  It is responsible for receiving traffic for the $O(\log n)$ parent edges incident to these vertices in the commitment forest.
  $O(\log n)$ edge congestion in the commitment forest $\Rightarrow$ $O(\log^2 n)$ edge congestion in the aggregation tree.
Conclusion
  Secure data aggregation algorithm:
    Suitable for general tree-based aggregation topologies.
    Resilient against multiple malicious nodes.
    Tightest possible guarantees on adversary detection (without assuming application knowledge).
    Low edge congestion: $O(\log^2 n)$.
  Limitation: the querier needs to know the set of responding nodes.
  Future work: secure versions of more sophisticated aggregation functions; defences against sensor reading falsification.
Secure Hierarchical In-network Data Aggregation for Sensor Networks
  Haowen Chan, Adrian Perrig, and Dawn Song
  Carnegie Mellon University