Optimizing Mixing in Pervasive Networks: A Graph-Theoretic Perspective

1 Optimizing Mixing in Pervasive Networks: A Graph-Theoretic Perspective
Murtuza Jadliwala, Igor Bilogrevic and Jean-Pierre Hubaux
ESORICS, 2011

2 Wireless Trends
- Always on
- Background apps
- Smart phones, vehicles, watches, cameras, passports

3 Peer-to-Peer Wireless Networks
- Peer-to-peer wireless networks: vehicular networks, delay tolerant networks, mobile social networks
- Technologies: WiFi, Bluetooth
- Location privacy problem: a third party can track the location of nodes by monitoring identifiers and so obtain location traces
- Identifiers: MAC address, authentication credentials
[Figure: nodes 1 and 2 exchanging a message that carries an identifier]

4 Examples
- VANETs
- Social networks
- Nokia Instant Community
- Urban sensing networks
- Delay tolerant networks
- Peer-to-peer file exchange

5 Location Privacy Problem
- Monitor identifiers used in peer-to-peer communications
- Easy mass surveillance of location (not by the network operator, but by anyone with a WiFi sniffer)
[Figure: nodes a, b and c]

6 Location Privacy Attacks
- Pseudonymous location traces
- Home/work location pairs are unique [1]
- Re-identification of traces through data analysis [2,3,4,5]
- Attack: spatio-temporal correlation of traces
- Linkability breaks anonymity. Need spatial and temporal decorrelation of traces => filtering based on tracking model
[Figure: a message whose identifier is a pseudonym]
[1] P. Golle and K. Partridge. On the Anonymity of Home/Work Location Pairs. Pervasive Computing, 2009
[2] A. Beresford and F. Stajano. Location Privacy in Pervasive Computing. IEEE Pervasive Computing, 2003
[3] B. Hoh et al. Enhancing Security & Privacy in Traffic Monitoring Systems. Pervasive Computing, 2006
[4] B. Hoh and M. Gruteser. Protecting location privacy through path confusion. SECURECOMM, 2005
[5] J. Krumm. Inference Attacks on Location Tracks. Pervasive Computing, 2007

7 Location Privacy with Mix Zones
Prevent long-term tracking
- Traditional solution: spatial and temporal decorrelation of location traces
- Mix zones: change identifier in mix zones [6,7]
  - Key used to sign messages is changed
  - MAC address is changed
[Figure: a mix zone with nodes labeled 1, 2 and a, b, and a "?" between them]
[6] A. Beresford and F. Stajano. Mix Zones: User Privacy in Location-aware Services. Pervasive Computing and Communications Workshop, 2004
[7] M. Gruteser and D. Grunwald. Enhancing location privacy in wireless LAN through disposable interface identifiers: a quantitative analysis. Mobile Networks and Applications, 2005

8 Mix-zone Placement in Road Networks
- Mix zone placement is most effective at intersections [8]
  - Enables mixing (covers) on the roads leading in and out of the intersection
- Mix zones incur cost
  - Communication loss due to the silent period
  - Routing delays caused by the identifier change
  - Cost varies from intersection to intersection
- How to place mix zones so that
  - all roads are covered
  - overall cost is minimized
- This is the Mix Cover problem
[8] L. Buttyan, T. Holczer, and I. Vajda. On the effectiveness of changing pseudonyms to provide location privacy in VANETs. ESAS 2007

9 Previous Work on Mix zone Placement
- Optimization approach [9]
  - Measures mixing effectiveness using a flow-based metric
  - Given an upper bound on the number of mix zones, the maximum distance between them, and the cost, decides where to place mix zones so that mixing effectiveness is maximized
  - Does not address the coverage problem
- Game-theoretic approach [10,11]
  - Game-theoretic model of optimal attack and defense strategies
  - Considers only local, not network-wide, intersection characteristics
[9] J. Freudiger, R. Shokri, and J-P. Hubaux. On the optimal placement of mix zones. PETS 2009
[10] M. Humbert, M. H. Manshaei, J. Freudiger, and J-P. Hubaux. Tracking games in mobile networks. GameSec 2010
[11] T. Alpcan and S. Buchegger. Security games for vehicular networks. IEEE Transactions on Mobile Computing,

10 Outline
- Mix Cover (MC) Problem
- Algorithms
- Evaluation and Results
What are mix zones formally? And what was done in the past with mix zones?

11 Graph-Theoretic Model
$G \equiv (V, E, w, d)$
- Intersections → vertices ($V$)
- Roads → edges ($E$)
- Mixing cost at intersection → vertex weight ($w$)
- Node intensity on road, or demand → edge weight ($d$)
  - One for each direction: for $e \equiv (u, v)$, $d(e) = (d_u^e, d_v^e)$
- Computations are done offline, assuming the mobility profiles are known
[Figure: example road graph annotated with vertex weights and edge demands]
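The model above can be held in a very small data structure. The following is only an illustrative sketch: the class name `RoadGraph`, its method names, and the sample values are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class RoadGraph:
    """G = (V, E, w, d): intersections, roads, mixing costs, per-direction demands."""
    weight: dict = field(default_factory=dict)  # w[v]: mixing cost at intersection v
    demand: dict = field(default_factory=dict)  # d[(u, v)] = (d_u, d_v) for road e = (u, v)

    def add_intersection(self, v, cost):
        self.weight[v] = cost

    def add_road(self, u, v, d_u, d_v):
        # d_u is the demand u must handle if u covers this road; d_v likewise for v.
        self.demand[(u, v)] = (d_u, d_v)


# Hypothetical two-intersection network.
g = RoadGraph()
g.add_intersection("a", 3)
g.add_intersection("b", 7)
g.add_road("a", "b", 6, 2)
```

Keeping the two demands as a pair on each edge mirrors $d(e) = (d_u^e, d_v^e)$ directly.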

12 Mix Cover (MC) Problem
Given $G \equiv (V, E, w, d)$, determine a subset $V_{MC} \subseteq V$ and a capacity $c_v$ for all $v \in V_{MC}$ such that
- for every edge $e \equiv (u, v)$, at least one of $u$ or $v$ is in $V_{MC}$
- for every $v \in V_{MC}$, $c_v \ge \max(d_v^e)$ over all edges $e$ covered by $v$ (the capacity indicates the largest demand the intersection can handle)
- the total weighted cost $\sum_{x \in V_{MC}} c_x \cdot w_x$ is minimized
[Figure: example road graph with a mix cover highlighted]
Cost of the example cover: 6×6 + 2×5 + 7×12 + 10×8 + 4×1 + 9×9 = 295
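To make the definition concrete, here is a minimal sketch of a feasibility check and cost computation. It assumes one reading of "covered by": an edge counts as covered when at least one selected endpoint has a capacity no smaller than its own side of the demand. The function names and the toy graph are hypothetical.

```python
def is_mix_cover(demand, capacity):
    """capacity maps each selected vertex v to c_v; demand[(u, v)] = (d_u, d_v)."""
    for (u, v), (d_u, d_v) in demand.items():
        u_covers = u in capacity and capacity[u] >= d_u
        v_covers = v in capacity and capacity[v] >= d_v
        if not (u_covers or v_covers):
            return False  # this road is not mixed at either end
    return True


def mix_cover_cost(weight, capacity):
    """Total weighted cost: sum of c_x * w_x over the selected vertices."""
    return sum(capacity[x] * weight[x] for x in capacity)


# Hypothetical path graph a - b - c.
weight = {"a": 1, "b": 2, "c": 3}
demand = {("a", "b"): (4, 2), ("b", "c"): (5, 1)}

middle_only = {"b": 5}        # b covers both roads
both_ends = {"a": 4, "c": 1}  # a and c cover one road each
too_small = {"b": 2}          # b cannot handle the demand of 5 on road (b, c)

# The products listed on the slide do add up to the stated total.
slide_total = sum(p * q for p, q in [(6, 6), (2, 5), (7, 12), (10, 8), (4, 1), (9, 9)])
```

In this toy instance both `middle_only` and `both_ends` are feasible, but their costs differ (10 versus 7), which is why the choice of cover matters.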

13 Why Mix Cover?
A mix zone deployment that provides two guarantees:
- Privacy guarantee
  - All roads are covered at least at one end
  - Nodes go without mixing over at most one intersection
- Cost guarantee
  - Minimum network-wide mixing cost
A mix cover provides both!

14 Combinatorial Properties
- Generalization of the Weighted Vertex Cover (WVC) problem
- Different from the Facility Terminal Cover (FTC) [13] generalization of WVC: in FTC, each edge has only a single demand
- Result 1: the Mix Cover problem is NP-hard
  - No efficient algorithm for finding an optimal solution (unless P = NP); even finding a good approximation seems hard
  - Proof by polynomial-time reduction from WVC
[13] G. Xu, Y. Yang, and J. Xu. Linear Time Algorithms for Approximating the Facility Terminal Cover Problem. Networks 2007
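One way to see why WVC is a special case of Mix Cover is sketched below. This is an illustrative argument under the problem definition given earlier, not necessarily the reduction used in the paper.

```latex
\begin{align*}
&\text{Given a WVC instance } (V, E, w), \text{ set } d_u^e = d_v^e = 1 \quad \forall e \equiv (u, v) \in E. \\
&\text{Every covering vertex then needs } c_v \ge 1, \text{ and an optimal solution picks } c_v = 1 \ (\text{for } w_v > 0), \\
&\text{so } \sum_{x \in V_{MC}} c_x \cdot w_x = \sum_{x \in V_{MC}} w_x, \text{ which is exactly the WVC objective.}
\end{align*}
```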

15 Outline
- Mix Cover (MC) Problem
- Algorithms
- Evaluation and Results
What are the possible strategies and constraints to take into account?

16 Three Algorithms
1. Optimization using Linear Programming
2. Largest Demand First (“Divide and Conquer” approach)
3. Smallest Demand First (“Divide and Conquer” approach)

17 Integer Program Formulation
[Equations: an objective (cost guarantee) subject to two constraints (privacy guarantee, capacity requirement)]
where
- $w_v$: mixing cost at vertex $v$
- $x_v$: decision variable indicating the selected capacity of vertex $v$
- $z_v^e$: decision variable for vertex $v$ covering edge $e$
Result 2: the LP relaxation of the above IP can guarantee a polynomial-time 2-approximation for the Mix Cover problem
Notes: the integer programming constraint makes sense to guarantee minimal privacy, but might not have a solution. We can relax it and consider linear programming + heuristics instead. Uses our metric to compute mix zone effectiveness a priori.
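The equations themselves did not survive on this slide. A formulation consistent with the three labels and the variable definitions could look as follows. This is a reconstruction offered as an assumption; the exact program in the paper may differ.

```latex
\begin{align*}
\min \quad & \sum_{v \in V} w_v \, x_v && \text{(cost guarantee)} \\
\text{s.t.} \quad & z_u^e + z_v^e \ge 1 && \forall e \equiv (u, v) \in E \quad \text{(privacy guarantee)} \\
& x_v \ge d_v^e \, z_v^e && \forall v \in V,\ \forall e \ni v \quad \text{(capacity requirement)} \\
& z_v^e \in \{0, 1\},\ x_v \ge 0
\end{align*}
```

Under this reading, the LP relaxation would replace $z_v^e \in \{0, 1\}$ with $0 \le z_v^e \le 1$.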

18 Largest Demand First (LDF)
$G \equiv (V, E, w, d)$ → $G' \equiv (V, E, w, d')$
1. For each edge, replace the smaller demand with the larger demand
2. Round off the demands to the closest power of 2
3. Divide into subgraphs $G_k$ based on the rounded edge demands $2^k$
4. Obtain $S(G_k) = \text{WVC-2Approx}(G_k)$ for each $G_k$
5. For all $v \in S(G_k)$, add $(v, c_v)$ to $S_{MC}$, where $c_v = \max\{2^k : v \in S(G_k)\}$
6. Output $S_{MC}$
Intuition for step 1: if a vertex is able to cover the larger demand, then it can also cover any demand smaller than or equal to it.
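The LDF steps can be sketched in a few lines of Python. Several choices here are assumptions rather than details from the slides: demands are positive integers, rounding goes up to the next power of 2 (so that the chosen capacity never falls below the demand), and WVC-2Approx is stood in for by the classic pricing (local-ratio) 2-approximation for weighted vertex cover. The names `wvc_2approx` and `ldf` are hypothetical.

```python
from collections import defaultdict


def wvc_2approx(weight, edges):
    """Pricing (local-ratio) 2-approximation for weighted vertex cover."""
    residual = {}
    for u, v in edges:
        residual.setdefault(u, weight[u])
        residual.setdefault(v, weight[v])
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge already covered
        pay = min(residual[u], residual[v])
        residual[u] -= pay
        residual[v] -= pay
        # At least one endpoint is now fully paid for and joins the cover.
        if residual[u] == 0:
            cover.add(u)
        if residual[v] == 0:
            cover.add(v)
    return cover


def ldf(weight, demand):
    """Largest Demand First; demand[(u, v)] = (d_u, d_v), positive integers."""
    groups = defaultdict(list)
    for (u, v), (d_u, d_v) in demand.items():
        larger = max(d_u, d_v)           # step 1: keep the larger demand
        k = (larger - 1).bit_length()    # step 2: round up, 2**k >= larger (assumption)
        groups[k].append((u, v))         # step 3: edge goes into subgraph G_k
    capacity = {}
    for k, edges in groups.items():
        for v in wvc_2approx(weight, edges):                  # step 4
            capacity[v] = max(capacity.get(v, 0), 2 ** k)     # step 5
    return capacity                                           # step 6


# Hypothetical path graph a - b - c.
weight = {"a": 1, "b": 2, "c": 3}
demand = {("a", "b"): (4, 2), ("b", "c"): (5, 1)}
cap = ldf(weight, demand)
ldf_cost = sum(cap[v] * weight[v] for v in cap)
```

On this toy input the edge with demand 5 is rounded to 8, so vertex b is assigned capacity 8 although a capacity of 5 would suffice; this is the kind of overshoot that rounding the larger demand can cause.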

19 LDF – Combinatorial Results
- A solution to the MC problem on $G'$ is also a solution for $G$
- Result 3: $OPT(G') \le 2\alpha \, OPT(G)$, where $OPT$ is the cost of the optimal solution and $\alpha = \max\{d_u^e - d_v^e, \forall e \in G\}$
- Result 4: LDF is a linear-time $4\alpha\beta$-approximation algorithm for mix cover, where $\beta$ is the approximation ratio of WVC-2Approx
Proofs in the paper!

20 Smallest Demand First (SDF)
- LDF is highly sub-optimal: the chosen capacity depends on the larger edge demand value
- SDF is similar to LDF, except:
  - In step 1, replace the larger edge demand value by the smaller value
  - Additional step: for each vertex, remember the largest edge demand $d_v^{max}$ incident on it
  - In $S_{MC}$, choose capacity $c_v = \max\{\max\{2^k : v \in S(G_k)\},\ d_v^{max}\}$
- Result 5: SDF is an $O(mn)$-time $4\beta$-approximation algorithm for mix cover, where $\beta$ is the approximation ratio of WVC-2Approx
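A matching sketch for SDF follows. The same caveats apply: positive integer demands, rounding up to a power of 2, and the pricing 2-approximation standing in for WVC-2Approx are all assumptions. In addition, $d_v^{max}$ is read here as the largest demand on v's own side over its incident edges, which is one possible interpretation of the slide. The names `wvc_2approx` and `sdf` are hypothetical.

```python
from collections import defaultdict


def wvc_2approx(weight, edges):
    """Pricing (local-ratio) 2-approximation for weighted vertex cover."""
    residual = {}
    for u, v in edges:
        residual.setdefault(u, weight[u])
        residual.setdefault(v, weight[v])
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge already covered
        pay = min(residual[u], residual[v])
        residual[u] -= pay
        residual[v] -= pay
        if residual[u] == 0:
            cover.add(u)
        if residual[v] == 0:
            cover.add(v)
    return cover


def sdf(weight, demand):
    """Smallest Demand First; demand[(u, v)] = (d_u, d_v), positive integers."""
    groups = defaultdict(list)
    d_max = defaultdict(int)  # largest own-side demand seen at each vertex (assumption)
    for (u, v), (d_u, d_v) in demand.items():
        d_max[u] = max(d_max[u], d_u)
        d_max[v] = max(d_max[v], d_v)
        smaller = min(d_u, d_v)          # step 1 of SDF: keep the smaller demand
        k = (smaller - 1).bit_length()   # round up to 2**k (assumption)
        groups[k].append((u, v))
    capacity = {}
    for k, edges in groups.items():
        for v in wvc_2approx(weight, edges):
            # Capacity is the larger of the rounded demand and d_v^max.
            capacity[v] = max(capacity.get(v, 0), 2 ** k, d_max[v])
    return capacity


# Same hypothetical path graph a - b - c as a running example.
weight = {"a": 1, "b": 2, "c": 3}
demand = {("a", "b"): (4, 2), ("b", "c"): (5, 1)}
cap = sdf(weight, demand)
sdf_cost = sum(cap[v] * weight[v] for v in cap)
```

Because each selected vertex ends up with a capacity of at least its largest own-side demand, every edge it covers is handled regardless of how the smaller demand was rounded.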

21 Outline
- Mix Cover (MC) Problem
- Algorithms
- Evaluation and Results
Let’s see if it makes any difference

22 Experimental Setup
- Input graph constructed using real vehicular traffic data
  - 2 US states: Florida and Virginia
  - 3 sizes of road network: 25%, 65% and 100% of total state municipalities
  - 3 distributions of vertex weight: constant (1), uniform (between 1 and 100) and Gaussian (mean = 50, sd = 10)
  - Edge demands chosen from real traffic intensities
- Algorithms implemented in MATLAB, executed on a multi-core computer
- Results averaged over 100 runs
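The three vertex weight distributions can be reproduced with the standard library. This is a sketch in Python rather than the MATLAB used for the experiments; the function name `vertex_weights` and the fixed seed are assumptions made for this example.

```python
import random


def vertex_weights(vertices, distribution, rng):
    """Draw a mixing cost for every vertex from one of the three distributions."""
    if distribution == "constant":
        return {v: 1.0 for v in vertices}
    if distribution == "uniform":
        return {v: rng.uniform(1, 100) for v in vertices}
    if distribution == "gaussian":
        # Mean 50, standard deviation 10; values are not clamped in this sketch.
        return {v: rng.gauss(50, 10) for v in vertices}
    raise ValueError(f"unknown distribution: {distribution}")


rng = random.Random(0)  # fixed seed so repeated runs are comparable
vertices = list(range(5))
constant_w = vertex_weights(vertices, "constant", rng)
uniform_w = vertex_weights(vertices, "uniform", rng)
gaussian_w = vertex_weights(vertices, "gaussian", rng)
```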

23 Solution Quality
Ratio of LDF/SDF solution cost to naïve strategy cost
- Naïve solution: select all vertices in the final solution
- SDF outperforms LDF in both states for all graph sizes
- SDF achieves as low as 34% of the cost of the naïve solution
- Performance is best for the uniform vertex weight distribution and worst for the constant distribution
[Figure: two panels (Florida, Virginia), each comparing LDF and SDF; axis label v/e]
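The ratio reported on this slide can be illustrated as follows. The slides do not say which capacities the naïve strategy assigns, so this sketch assumes each vertex takes the largest demand on its own side over its incident edges. The function names and the toy numbers are hypothetical.

```python
def naive_capacity(demand):
    """Select every vertex, each with its largest own-side demand (assumption)."""
    capacity = {}
    for (u, v), (d_u, d_v) in demand.items():
        capacity[u] = max(capacity.get(u, 0), d_u)
        capacity[v] = max(capacity.get(v, 0), d_v)
    return capacity


def cost(weight, capacity):
    """Total weighted cost: sum of c_x * w_x over the selected vertices."""
    return sum(capacity[x] * weight[x] for x in capacity)


# Hypothetical path graph a - b - c and a candidate cover.
weight = {"a": 1, "b": 2, "c": 3}
demand = {("a", "b"): (4, 2), ("b", "c"): (5, 1)}
candidate = {"a": 4, "b": 5}

naive_cost = cost(weight, naive_capacity(demand))  # 4*1 + 5*2 + 1*3
candidate_cost = cost(weight, candidate)           # 4*1 + 5*2
ratio = candidate_cost / naive_cost
```

A ratio below 1 means the candidate cover is cheaper than placing a mix zone at every intersection.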

24 Execution Efficiency
Duration (in seconds) of algorithm execution
- SDF runs slower than LDF in both states for all graph sizes
- Algorithms are fastest when the vertex weight is constant and slowest when it is drawn from a Gaussian distribution
[Figure: two panels (Florida, Virginia), each comparing LDF and SDF]

25 Results for LP-based Algorithm
- Too slow for large graphs
- Executed on reduced Florida graphs of 515 and 1024 vertices
- For 515 vertices, the ratio of solution cost to the naïve strategy improves to 0.24 (better than LDF and SDF)
  - Execution time is twice that of LDF and four times that of SDF
- For 1024 vertices, execution time increased by a factor of 20

26 Conclusion
- Mix Cover: cost-efficient mix zone placement that guarantees mixing coverage
  - Modeled as a generalization of the weighted vertex cover problem
  - Never been studied, by either the privacy or the combinatorics community
  - Model is general enough to be applicable to other scenarios
- Approximation algorithms
  - Using linear programming
  - LDF and SDF, based on a “Divide and Conquer” approach
- Results
  - Proposed algorithms provide solution quality and execution time guarantees
  - Experimentation using real data and standard computation resources shows feasibility
- Combinatorial hardness result is not surprising; dynamic road conditions reinforce the need for fast approximation algorithms

28 How to obtain mix zones?
- Silent mix zones
  - Turn off transceiver
- Passive mix zones
  - Where adversary is absent
  - Before connecting to wireless access points
- Encrypt communications
  - With help of infrastructure
  - Distributed

29 People do that for fun
- Marketing value: most popular Nokia phone, brand of GPS module in car
- Social value: how do people move in a city

30 Or people do that for malicious motives

31 Mix networks vs Mix zones
[Figure: a mix network (Alice and Bob connected through mix nodes) shown next to mix zones on a road network between Alice's home and Alice's work]
- Nodes move on road network
- Road network = restricted network
- Mix zones must be placed

32 Assumption
- A central authority periodically computes the optimal mix cover offline
  - Knows the (dynamic) node or traffic intensity on roads
  - Knows the mixing cost at each intersection
- Nodes or vehicles access the latest mix cover computation from the central authority

33 Solution Size
Number of vertices in the final solution
- SDF performs better than LDF in Florida
- LDF performs better than SDF in Virginia
- Algorithms do not optimize solution size; it depends on the road network topology
- Solution size is between 46% and 58% of the total number of vertices
[Figure: two panels (Florida, Virginia), each comparing LDF and SDF; axis label v/e]

