Published by Edmund Palmer. Modified over 9 years ago.
1
Toward Community Sensing
Andreas Krause, Carnegie Mellon University
Joint work with Eric Horvitz, Aman Kansal, Feng Zhao (Microsoft Research)
Information Processing in Sensor Networks | April 24, 2008
2
Motivation: Traffic monitoring
Deployed sensors (detector loops, traffic cameras) give high-accuracy speed data.
What about 148th Ave?
How can we get accurate road speed estimates everywhere?
3
Cars as traffic sensors
Many cars have Personal Navigation Devices (PNDs).
They know exact location and speed! They fuse GPS, map information, engine speed, …
Modern PNDs have a network connection.
Can use cars as speed sensors!
Example: Dash Express (GPS + GPRS/WiFi)
4
Community Sensing Vision
Realize the full potential of population-owned sensors.
Must respect privacy and preferences about sharing!
Common goals:
Estimate spatial phenomena (traffic, weather, …)
Construct 3D cities
News coverage
(Figure labels: Privately-held sensors, Common goal, Contribute sensor data, Request data, SenseWeb)
5
Privacy concern of GPS traces
Dense GPS traces make it possible to identify people's locations, activities, intents, etc.
Even anonymization or strong obfuscation doesn't help.
Key idea: Avoid dense sampling!
Need to predict from sparse samples.
(Images courtesy of John Krumm)
6
Phenomenon modeling
Model (normalized) speeds on the road segments as random variables $S_1, \dots, S_{12}$.
The joint distribution allows modeling correlations.
Can predict unmonitored speeds from monitored speeds, e.g., using $P(S_5 \mid S_1, S_9)$.
Which segments should we monitor?
(Figure: road network with segments s1 to s12; s1, s3, s9 and s12 are marked.)
7
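Minimizing uncertainty
Can estimate the prediction error at segment $S_i$: $\mathrm{Var}(S_i \mid S_A = s_A)$
Expected error at segment $S_i$: $\mathrm{Var}(S_i \mid S_A)$
Expected mean squared error: $\mathrm{EMSE}(A) = \sum_i \mathrm{Var}(S_i \mid S_A)$
$A^* = \arg\min_{|A| \le k} \mathrm{EMSE}(A)$
Does not take the "importance" of $S_i$ into account (frequently travelled vs. less travelled segments).
(Figure: example with $A = \{S_1, S_2, S_3, S_4\}$. Two sets of observed values are shown, $s_1 = .9, s_2 = 1, s_3 = 1, s_4 = 1$ and $s_1 = .5, s_2 = .6, s_3 = .8, s_4 = .6$, together with a plot of $P(S_5 \mid s_A)$ on $[0, 1]$ and the values $\mathrm{Var}(S_5 \mid s_A) = .01$ and $\mathrm{Var}(S_5 \mid s_A) = .1$. The expected variances shown are $\mathrm{Var}(S_5 \mid S_A) = .08$, $\mathrm{Var}(S_6 \mid S_A) = .1$ and $\mathrm{Var}(S_7 \mid S_A) = .3$, which EMSE adds up.)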
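A minimal sketch of the EMSE computation, assuming the speeds are jointly Gaussian, so that $\mathrm{Var}(S_i \mid S_A) = K_{ii} - K_{iA} K_{AA}^{-1} K_{Ai}$. The covariance matrix `COV` and all function names are made up for illustration and are not taken from the talk.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def conditional_variance(cov, i, observed):
    """Var(S_i | S_A) for a Gaussian with covariance matrix cov."""
    obs = list(observed)
    if i in obs:
        return 0.0  # an observed segment has no remaining uncertainty
    if not obs:
        return cov[i][i]  # nothing observed: prior variance
    K_AA = [[cov[a][b] for b in obs] for a in obs]
    K_Ai = [cov[a][i] for a in obs]
    weights = solve(K_AA, K_Ai)  # K_AA^{-1} K_Ai
    return cov[i][i] - sum(k * w for k, w in zip(K_Ai, weights))


def emse(cov, observed):
    """EMSE(A): sum of conditional variances over all segments."""
    return sum(conditional_variance(cov, i, observed) for i in range(len(cov)))


# Hypothetical covariance of three neighbouring segments (indices 0, 1, 2).
COV = [
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.8],
    [0.5, 0.8, 1.0],
]

emse_middle = emse(COV, [1])  # observe only the middle segment
emse_end = emse(COV, [0])     # observe only an end segment
```

With these made-up numbers the middle segment is the better single choice, because it is strongly correlated with both neighbours.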
8
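Taking demand into account
Model demand $D_i$ as random variables (e.g., Poisson), e.g., $D_i$ = number of cars on segment $S_i$.
Demand-weighted MSE: $\mathrm{DMSE}(A) = \sum_i E[D_i]\, \mathrm{Var}(S_i \mid S_A)$
Error reduction: $R(A) = \mathrm{DMSE}(\emptyset) - \mathrm{DMSE}(A)$
Want: $A^* = \arg\max_{|A| \le k} R(A)$
This is an NP-hard optimization problem.
(Figure: example with $\mathrm{Var}(S_5 \mid S_A) = .08$, $\mathrm{Var}(S_6 \mid S_A) = .1$, $\mathrm{Var}(S_7 \mid S_A) = .3$ and the demand values 50, 10 and 200 next to $D_5$, $D_6$ and $D_7$; DMSE adds up the demand-weighted variances.)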
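The demand weighting can be worked through numerically. The sketch below reuses the three conditional variances from the slide; the pairing of the demand values with segments and the prior variances of 1.0 are assumptions made only so the arithmetic can be carried out.

```python
def dmse(expected_demand, variances):
    """Demand-weighted MSE: sum_i E[D_i] * Var(S_i | S_A)."""
    return sum(d * v for d, v in zip(expected_demand, variances))


def error_reduction(expected_demand, prior_var, cond_var):
    """R(A) = DMSE(empty set) - DMSE(A)."""
    return dmse(expected_demand, prior_var) - dmse(expected_demand, cond_var)


# Assumed pairing of demand values with segments S5, S6, S7.
EXPECTED_DEMAND = [50, 10, 200]
# Conditional variances Var(S_i | S_A) as shown on the slide.
COND_VAR = [0.08, 0.1, 0.3]
# Assumed prior variances (normalized speeds), not from the slide.
PRIOR_VAR = [1.0, 1.0, 1.0]

dmse_after = dmse(EXPECTED_DEMAND, COND_VAR)
reduction = error_reduction(EXPECTED_DEMAND, PRIOR_VAR, COND_VAR)
```

Under these assumptions the heavily travelled third segment dominates the weighted error even though the unweighted variances are of similar size.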
9
Selecting informative locations
Greedy algorithm:
$A \leftarrow \emptyset$
For $i = 1, \dots, k$ do
    $s^* = \arg\max_s R(A \cup \{s\})$
    $A \leftarrow A \cup \{s^*\}$
How well does this heuristic do?
(Figure: road network with segments s1 to s12; s2, s11, s7 and s10 are marked.)
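The greedy loop can be sketched in a few lines of Python. The utility used here is a toy coverage count standing in for $R(A)$; the data in `SEGMENT_COVERS` and the function names are hypothetical.

```python
def greedy_select(candidates, utility, k):
    """Pick up to k items, each time adding the one that maximizes utility."""
    selected = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda s: utility(selected + [s]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy stand-in for R(A): each monitored segment "explains" some road segments.
SEGMENT_COVERS = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6, 7},
    "s4": {1},
}


def covered_count(selection):
    """Number of distinct road segments explained by the selection."""
    return len(set().union(*(SEGMENT_COVERS[s] for s in selection)))


chosen = greedy_select(SEGMENT_COVERS, covered_count, k=2)
```

Each round evaluates the utility once per remaining candidate, so the loop needs on the order of k times n utility evaluations for n candidates.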
10
Diminishing returns
Selection A: adding s' helps a lot! (large improvement)
Selection B: adding s' doesn't help much. (small improvement)
Submodularity: for $A \subseteq B$, $F(A \cup \{s'\}) - F(A) \ge F(B \cup \{s'\}) - F(B)$
Utility $R(A)$ is submodular!*
*See paper for details.
(Figure: the same road network under a selection A and a larger selection B, each extended by a newly observed location s'.)
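For small ground sets, the diminishing-returns inequality can be checked by brute force. This is only an illustrative sketch: the two test functions below (a coverage count and a squared set size) are hypothetical and are not the utility $R(A)$ from the talk.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(ground, F, tol=1e-12):
    """Check F(A + s) - F(A) >= F(B + s) - F(B) for all A subset of B, s not in B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for s in ground - B:
                gain_A = F(A | {s}) - F(A)
                gain_B = F(B | {s}) - F(B)
                if gain_A < gain_B - tol:
                    return False
    return True


# Hypothetical coverage data: which road segments each location explains.
COVERS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}


def coverage(selection):
    return len(set().union(*(COVERS[s] for s in selection)))


def squared_size(selection):
    return len(selection) ** 2  # gains grow with set size: not submodular


coverage_ok = is_submodular(COVERS, coverage)
squared_ok = is_submodular(COVERS, squared_size)
```

The check enumerates all nested pairs of subsets, so it is only practical for a handful of elements.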
11
Why is submodularity useful?
Theorem [Nemhauser et al. '78]: The greedy algorithm gives a constant-factor approximation, $F(A_{\text{greedy}}) \ge (1 - 1/e)\, F(A_{\text{opt}})$, i.e., ~63% of the optimal value.
The greedy algorithm gives a near-optimal set of locations to observe.
However, we have no control over where the sensors (cars, cell phones) are going to be!
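A small instance makes the bound concrete. In the sketch below, which uses a made-up coverage function rather than the talk's utility, greedy selection is not optimal but still lands well above the $(1 - 1/e)$ fraction of the best achievable value.

```python
import math
from itertools import combinations

# Hypothetical coverage sets, chosen so that greedy is suboptimal.
COVERS = {"x": {1, 2, 3, 4}, "y": {1, 2, 5}, "z": {3, 4, 6}}


def F(selection):
    """Coverage utility: number of distinct elements covered."""
    covered = set()
    for s in selection:
        covered |= COVERS[s]
    return len(covered)


def greedy(k):
    """Greedy maximization of F under a cardinality constraint."""
    A = []
    for _ in range(k):
        s_star = max((s for s in COVERS if s not in A), key=lambda s: F(A + [s]))
        A.append(s_star)
    return A


def brute_force(k):
    """Exhaustive search over all subsets of size k."""
    return max(combinations(COVERS, k), key=F)


greedy_value = F(greedy(2))
optimal_value = F(brute_force(2))
bound = (1 - 1 / math.e) * optimal_value
```

Greedy first takes the largest set and then gains little from either remaining set, while the optimum combines the two smaller sets.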
12
Querying a roving sensor
Query! Response: "I'm at $S_2$, going 55 mph" (observation $s_2 = .9$)
Query! No response (no data)
How can we cope with uncertain sensor availability?
(Figure: road network with segments s1 to s7.)
13
Modeling sensor availability
Road segments: $V = \{S_1, \dots, S_n\}$
Observations: a set $W = \{C_1, \dots, C_m\}$ of observations (cars) we can select from
If we select car $C_j$, we observe segment $S_i$ with probability $P(i \mid C_j)$.
Pick $B \subseteq W$; this yields a random set $A \subseteq V$ drawn from $P(A \mid B)$, with utility $R(A)$.
Goal: maximize expected utility, $B^* = \arg\max_{|B| \le k} \sum_A P(A \mid B)\, R(A)$
(Figure: cars C1, C2 and C3 on a road network with segments s1 to s7.)
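For a handful of cars, the expected utility $\sum_A P(A \mid B)\, R(A)$ can be computed exactly by enumerating outcomes. The sketch assumes that cars respond independently and that any probability mass not assigned to a segment means "no response"; both are modelling assumptions for this example, and the availability table is invented.

```python
from itertools import product


def expected_utility(selected_cars, availability, utility):
    """Exact F(B): enumerate every joint outcome of the selected cars.

    availability[car] maps segment -> P(segment | car); the leftover
    probability mass is treated as "no response".
    """
    per_car = []
    for car in selected_cars:
        dist = availability[car]
        outcomes = list(dist.items())
        outcomes.append((None, 1.0 - sum(dist.values())))
        per_car.append(outcomes)

    total = 0.0
    for combo in product(*per_car):
        prob = 1.0
        observed = set()
        for segment, p in combo:
            prob *= p
            if segment is not None:
                observed.add(segment)
        total += prob * utility(frozenset(observed))
    return total


# Hypothetical availability model P(i | C_j).
AVAILABILITY = {
    "c1": {"s1": 0.5, "s2": 0.3},
    "c2": {"s2": 0.6},
}

# Toy utility: number of distinct segments observed (stand-in for R(A)).
value_one = expected_utility(["c1"], AVAILABILITY, len)
value_both = expected_utility(["c1", "c2"], AVAILABILITY, len)
```

The number of joint outcomes grows exponentially with the number of selected cars, which is why exact enumeration only works for very small selections.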
14
Optimizing community sensing
Lemma: Whenever $R(A)$ is submodular, the function $F(B) = \sum_{A : |A| \le k} P(A \mid B)\, R(A)$ is submodular.
Can use the greedy algorithm to optimize the selection.
However, $F(B)$ is a sum over exponentially many terms.
Theorem: For any $\varepsilon > 0$ and $\delta > 0$, can find a set $B'$ such that $F(B') \ge (1 - 1/e) \max_{|B| \le k} F(B) - \varepsilon$ with probability $1 - \delta$, using independent samples of $R(A)$.
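The sampling idea can be sketched as a plain Monte Carlo estimate of $F(B)$: draw random observation sets $A$ given $B$ and average $R(A)$. How many samples the stated guarantee requires is not given on the slide, so `n_samples` below is an arbitrary choice; the availability table, the independence of cars and the toy utility are assumptions as well.

```python
import random


def sample_observed(selected_cars, availability, rng):
    """Draw one random set A of observed segments given the selected cars."""
    observed = set()
    for car in selected_cars:
        u = rng.random()
        cumulative = 0.0
        for segment, p in availability[car].items():
            cumulative += p
            if u < cumulative:
                observed.add(segment)
                break  # each car reports at most one segment
        # falling through the loop means the car did not respond
    return frozenset(observed)


def estimate_F(selected_cars, availability, utility, n_samples, rng):
    """Monte Carlo estimate of F(B) = E[R(A) | B]."""
    total = 0.0
    for _ in range(n_samples):
        total += utility(sample_observed(selected_cars, availability, rng))
    return total / n_samples


# Hypothetical availability model P(i | C_j).
AVAILABILITY = {
    "c1": {"s1": 0.5, "s2": 0.3},
    "c2": {"s2": 0.6},
}

rng = random.Random(0)  # fixed seed so the estimate is reproducible
estimate = estimate_F(["c1", "c2"], AVAILABILITY, len, 20000, rng)
```

Plugging such an estimate into the greedy loop replaces the exponentially large sum with a fixed number of utility evaluations per candidate.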
15
Handling user preferences
Need to respect user preferences:
"Sample my speed at most once per day"
"Don't measure my speed for the next hour"
"Never sample close to my home"
"Wait at least 10 minutes between samples"
Can accommodate preferences using constrained optimization: $B^* = \arg\max_B F(B)$ subject to $C(B) \le L$, where $C$ is a complex cost function and $L$ is the sensing budget.
Can still get near-optimal solutions (details in paper).
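One simple way to respect a budget $C(B) \le L$ is a cost-benefit greedy rule that repeatedly adds the affordable car with the best gain per unit cost. This is a generic heuristic shown for illustration, not the algorithm from the paper, and no approximation guarantee is claimed for it. An additive cost with an infinite entry is used here as a crude, assumed way of encoding a preference such as "never sample this car"; all names and numbers are hypothetical.

```python
import math


def budgeted_greedy(candidates, utility, cost, budget):
    """Add the affordable item with the best marginal gain per unit cost.

    Assumes every cost is strictly positive.
    """
    selected = []
    spent = 0.0
    remaining = set(candidates)
    while True:
        base = utility(selected)
        best, best_ratio = None, 0.0
        for c in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + cost(c) > budget:
                continue  # too expensive, or forbidden via infinite cost
            ratio = (utility(selected + [c]) - base) / cost(c)
            if ratio > best_ratio:
                best, best_ratio = c, ratio
        if best is None:
            return selected
        selected.append(best)
        spent += cost(best)
        remaining.remove(best)


# Hypothetical data: segments each car would report on, and per-car cost.
COVERS = {"c1": {1, 2}, "c2": {2, 3, 4}, "c3": {1, 2, 3, 4, 5, 6}}
COST = {"c1": 1.0, "c2": 2.0, "c3": math.inf}  # c3 opted out entirely


def covered_count(cars):
    return len(set().union(*(COVERS[c] for c in cars)))


chosen = budgeted_greedy(COVERS, covered_count, COST.__getitem__, budget=3)
```

Time-dependent preferences such as a minimum interval between probes would need a cost function that depends on the whole selection, which this additive sketch does not model.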
16
Community Sensing Summary
Optimize the value of probing roving sensors
Utility (expected error reduction)
Demand (usage: "utilitarian" impact)
Sensor availability: predict location based on history
Preferences: abide by preferences, e.g., frequency / number of probes, min. inter-probe interval
Other constraints: e.g., "Not near my home!"
(Figure labels: Phenomenon, Demand, Availability & Preferences)
17
Phenomenon modeling
3 months of data from 534 segments across 7 highways and interstates near Seattle, WA
Samples at 15-minute intervals
Use a Gaussian Process to model road speeds (covariance function based on road network topology)
Can compute utility $R(A)$ in closed form!
18
Demand modeling
Demand = number of cars on a road segment
Estimate demand based on 3166 ClearFlow route requests
(Figure: expected demand during rush hour)
19
Evaluating model accuracy
Accurate estimation of prediction error!
(Plot: demand-weighted RMS vs. number of locations; lower is better)
20
Demand driven querying
65% error reduction using only 10 (of 534) observations!
Optimized sensing requires 10x fewer samples!
(Plot: lower is better)
21
Availability modeling
Microsoft Multiperson Location Survey (MSMLS) [Krumm '06]
GPS traces from 85 drivers, 6+ days each
Associate GPS readings with road segments ("map matching")
Two models of sensor availability: spatial obfuscation and sparse querying
(Image: GPS used in MSMLS)
22
Spatial obfuscation
Motivation: privacy through enforcing uncertainty about sensor location
The Community Sensing Service requests the road speed at some location in area X from the population of sensors.
It receives an anonymized response from a random car in cell X (if available).
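The request and response pattern can be sketched as follows, assuming a square grid as the discretization and planar coordinates for the cars. The grid, the field names and the car records are illustrative assumptions; the talk does not specify them.

```python
import random


def cell_of(x, y, cell_size):
    """Map a planar position to the index of its square grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def query_cell(cars, cell, cell_size, rng):
    """Return an anonymized speed reading from a random car in the cell.

    The response carries only the cell and the speed: no car identity
    and no exact position. Returns None if no car is in the cell.
    """
    in_cell = [c for c in cars if cell_of(c["x"], c["y"], cell_size) == cell]
    if not in_cell:
        return None
    car = rng.choice(in_cell)
    return {"cell": cell, "speed": car["speed"]}


# Hypothetical car positions (arbitrary planar units) and speeds in mph.
CARS = [
    {"id": "a", "x": 3.0, "y": 4.0, "speed": 55.0},
    {"id": "b", "x": 25.0, "y": 4.0, "speed": 30.0},
]

rng = random.Random(0)
response = query_cell(CARS, (0, 0), 10.0, rng)
empty = query_cell(CARS, (5, 5), 10.0, rng)
```

A larger `cell_size` makes it harder to tell where the responding car was, at the price of a less precise location for the reading.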
23
Spatial obfuscation
Discretization ≈ utility / privacy knob
High accuracy even with coarse discretization
(Plot: lower is better)
24
Obfuscation by sparse querying
Associate roving sensors with anonymous IDs
Learn an availability model for each sensor from data
The Community Sensing Service requests road speed and location from car $C_i$ in the population of sensors.
It receives a response from car $C_i$ (if connected to the network and available).
25
Obfuscation by sparse monitoring
Biggest difference in the "important" part of the curve
50% error reduction over the mean if querying 10 "cars"
(Plot: lower is better)
26
Mobile vs. fixed sensors
When does it "pay off" to use mobile vs. fixed sensors?
Experiment: cost $C(B) = \#\text{fixed}(B) + \#\text{mobile}(B)$ under a fixed budget, $\max F(B)$ s.t. $C(B) \le L$
Mobile sensors pay off if fixed sensors are 4x as expensive
27
Extensions / Future work
Spatio-temporal models (see paper)
How to quickly learn good models (see paper)
Other applications: Population fitness? News coverage? Reconstruction of 3D cities?
Formal privacy guarantees?
28
Related work
Travel time estimation using cell phones [Wunnava et al. '07]
Privacy-aware querying of cars with GPS & cell phones [Bayen et al. '08, forthcoming]
Spatial monitoring, experimental design, etc. (see paper)
29
Conclusions
Presented an integrated approach to community sensing
Theoretical analysis: near-optimal sensing policies
Extensive empirical evaluation on a traffic monitoring case study
(Figure labels: Phenomenon, Demand, Availability & Preferences)