Toward Community Sensing

Toward Community Sensing. Andreas Krause, Carnegie Mellon University. Joint work with Eric Horvitz, Aman Kansal, Feng Zhao, Microsoft Research. Information Processing in Sensor Networks | April 24, 2008.


Presentation Transcript


  1. Toward Community Sensing. Andreas Krause, Carnegie Mellon University. Joint work with Eric Horvitz, Aman Kansal, Feng Zhao, Microsoft Research. Information Processing in Sensor Networks | April 24, 2008.

  2. Motivation: Traffic monitoring. Deployed sensors (detector loops, traffic cameras) give high-accuracy speed data. What about 148th Ave? How can we get accurate road speed estimates everywhere?

  3. Cars as traffic sensors • Many cars have Personal Navigation Devices (PNDs) • Know exact location and speed! • Fuse GPS, map information, engine speed, … • Modern PNDs have network connection → Can use cars as speed sensors! Example: Dash Express (GPS + GPRS/WiFi)

  4. Community Sensing Vision (SenseWeb). Realize full potential of population-owned sensors. Must respect privacy and preferences about sharing! Privately held sensors: contribute sensor data, request data. Common goal: estimate spatial phenomenon (traffic, weather, …), construct 3D cities, news coverage.

  5. Privacy concern of GPS traces. Dense GPS traces make it possible to identify people’s locations, activities, intents, etc. Even anonymization or strong obfuscation doesn’t help. Key idea: Avoid dense sampling! Need to predict from sparse samples. (Images courtesy of John Krumm.)

  6. Phenomenon modeling. Which segments should we monitor? • (Normalized) speeds as random variables • Joint distribution allows modeling correlations • Can predict unmonitored speeds from monitored speeds using $P(S_5 \mid S_1, S_9)$ [Figure: road network with segments s1–s12.]

  7. Minimizing uncertainty • Can estimate prediction error at segment $S_i$: $\operatorname{Var}(S_i \mid S_A = s_A)$ • Expected error at segment $S_i$: $\operatorname{Var}(S_i \mid S_A)$ • Expected mean squared error: $\mathrm{EMSE}(A) = \sum_i \operatorname{Var}(S_i \mid S_A)$ • $A^* = \operatorname{argmin}_{|A| \le k} \mathrm{EMSE}(A)$ • Drawback: does not take the “importance” of $S_i$ into account. [Figure: road network with A = {S1, S2, S3, S4} observed; two sets of example readings give Var(S5 | sA) = .01 and .1; conditional variances .08, .1 and .3 are shown for the unobserved segments S5, S6 and S7, which are marked as less travelled or frequently travelled.]
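
As a worked instance of the EMSE sum on this slide, and assuming S5, S6 and S7 are the only unobserved segments in the figure (observed segments contribute zero variance), the three conditional variances shown there add up as follows:

```latex
\mathrm{EMSE}(A) = \sum_i \operatorname{Var}(S_i \mid S_A) = 0.08 + 0.1 + 0.3 = 0.48,
\qquad A = \{S_1, S_2, S_3, S_4\}.
```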

  8. Taking demand into account • Model demand $D_i$ as random variables (e.g., Poisson), e.g., $D_i$ = #cars on segment $S_i$ • Demand-weighted MSE: $\mathrm{DMSE}(A) = \sum_i \mathbb{E}[D_i]\,\operatorname{Var}(S_i \mid S_A)$ • Error reduction: $R(A) = \mathrm{DMSE}(\emptyset) - \mathrm{DMSE}(A)$ • Want: $A^* = \operatorname{argmax}_{|A| \le k} R(A)$ • NP-hard optimization problem. [Figure: road network showing demands D5, D6, D7 (values 50, 10 and 200 appear) and conditional variances .08, .1 and .3 at the unobserved segments S5, S6 and S7.]
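
To make the demand-weighted objective concrete, here is a minimal Python sketch. It assumes the speeds are jointly Gaussian with a known covariance matrix, so conditional variances can be computed by conditioning on one observed segment at a time. The function names (`condition_on`, `dmse`, `error_reduction`) and the three-segment covariance and demand values are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def condition_on(cov: Matrix, observed: Sequence[int]) -> Matrix:
    """Covariance of a Gaussian after observing the variables in `observed`.

    Conditions on one variable at a time using the rank-one update
    cov <- cov - cov[:, j] cov[j, :] / cov[j, j].
    """
    n = len(cov)
    post = [row[:] for row in cov]
    for j in observed:
        pivot = post[j][j]
        if pivot <= 1e-12:  # variable already determined, nothing to update
            continue
        col = [post[r][j] for r in range(n)]
        for r in range(n):
            for s in range(n):
                post[r][s] -= col[r] * col[s] / pivot
    return post


def dmse(cov: Matrix, demand: Sequence[float], observed: Sequence[int]) -> float:
    """Demand-weighted MSE: sum_i E[D_i] * Var(S_i | S_A)."""
    post = condition_on(cov, observed)
    return sum(d * post[i][i] for i, d in enumerate(demand))


def error_reduction(cov: Matrix, demand: Sequence[float], observed: Sequence[int]) -> float:
    """R(A) = DMSE(empty set) - DMSE(A)."""
    return dmse(cov, demand, []) - dmse(cov, demand, observed)


# Hypothetical toy model: three segments, the third one heavily travelled.
COV = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.2],
       [0.1, 0.2, 1.0]]
EXPECTED_DEMAND = [50.0, 10.0, 200.0]

for segment in range(3):
    print(segment, round(error_reduction(COV, EXPECTED_DEMAND, [segment]), 3))
```

In this toy model, observing the heavily travelled third segment yields a much larger reduction than observing the first, even though the first is more strongly correlated with its neighbour. That is the effect demand weighting is meant to capture.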

  9. Selecting informative locations. Greedy algorithm: • $A \leftarrow \emptyset$ • For $i = 1{:}k$ do • $s^* = \operatorname{argmax}_s R(A \cup \{s\})$ • $A \leftarrow A \cup \{s^*\}$ How well does this heuristic do? [Figure: road network with segments s1–s12.]
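
The greedy loop on this slide can be sketched in a few lines of Python. The utility used here is a stand-in, not the talk's R(A): each segment is assumed to "explain" a hypothetical set of neighbouring segments, and the utility is the total demand explained. `COVERS`, `DEMAND`, `covered_demand` and `greedy_select` are illustrative names.

```python
from typing import Callable, Iterable, List, Set


def greedy_select(candidates: Iterable[str],
                  utility: Callable[[Set[str]], float],
                  k: int) -> List[str]:
    """Repeatedly add the candidate that maximizes utility(A | {s})."""
    pool = list(candidates)
    chosen: List[str] = []
    for _ in range(k):
        remaining = [s for s in pool if s not in chosen]
        if not remaining:
            break
        current = set(chosen)
        best = max(remaining, key=lambda s: utility(current | {s}))
        chosen.append(best)
    return chosen


# Hypothetical stand-in utility: which segments each observation explains,
# and how much demand each explained segment carries.
COVERS = {
    "s1": {"s1", "s2"},
    "s2": {"s1", "s2", "s3"},
    "s3": {"s2", "s3", "s4"},
    "s4": {"s4", "s5"},
    "s5": {"s5"},
}
DEMAND = {"s1": 5, "s2": 1, "s3": 2, "s4": 3, "s5": 4}


def covered_demand(selected: Set[str]) -> float:
    """Total demand on segments explained by the selected observations."""
    covered = set().union(*(COVERS[s] for s in selected))
    return float(sum(DEMAND[s] for s in covered))


picked = greedy_select(COVERS, covered_demand, k=2)
print(picked, covered_demand(set(picked)))
```

Each round re-evaluates the utility for every remaining candidate, so the cost is roughly k times the number of candidates times the cost of one utility evaluation.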

  10. Diminishing returns. Utility R(A) is submodular*! Selection A (fewer observed segments): adding s’ helps a lot (large improvement). Selection B (more observed segments): adding s’ doesn’t help much (small improvement). Submodularity: for $A \subseteq B$, $F(A \cup \{s'\}) - F(A) \ge F(B \cup \{s'\}) - F(B)$. *See store for details
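
For small ground sets, the diminishing-returns inequality can be checked by brute force. The sketch below is only a sanity check over all pairs A ⊆ B (exponential in the number of elements), and the two example functions are hypothetical: a coverage count, which is submodular, and a squared set size, which is not.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Iterator


def subsets(items: Iterable[str]) -> Iterator[FrozenSet[str]]:
    """Yield every subset of `items` as a frozenset."""
    pool = list(items)
    for size in range(len(pool) + 1):
        for combo in combinations(pool, size):
            yield frozenset(combo)


def is_submodular(f: Callable[[FrozenSet[str]], float],
                  ground: Iterable[str],
                  tol: float = 1e-9) -> bool:
    """Check F(A + s) - F(A) >= F(B + s) - F(B) for all A <= B and s not in B."""
    universe = frozenset(ground)
    for b in subsets(universe):
        for a in subsets(b):
            for s in universe - b:
                gain_a = f(a | {s}) - f(a)
                gain_b = f(b | {s}) - f(b)
                if gain_a < gain_b - tol:
                    return False
    return True


# Hypothetical examples: coverage is submodular, squared size is not.
COVERS = {"s1": {1, 2}, "s2": {2, 3}, "s3": {3, 4}}


def coverage(selected: FrozenSet[str]) -> float:
    return float(len(set().union(*(COVERS[s] for s in selected))))


def squared_size(selected: FrozenSet[str]) -> float:
    return float(len(selected) ** 2)


print(is_submodular(coverage, COVERS), is_submodular(squared_size, ["x", "y", "z"]))
```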

  11. Why is submodularity useful? Theorem [Nemhauser et al ‘78]: Greedy algorithm gives constant factor approximation, $F(A_{\text{greedy}}) \ge (1 - 1/e)\, F(A_{\text{opt}})$, i.e. ~63% of optimal. Upside: greedy algorithm gives near-optimal set of locations to observe. Limitation: we have no control over where the sensors (cars, cell phones) are going to be!

  12. Querying a roving sensor. How can we cope with uncertain sensor availability? Example 1: Query! Response: “I’m at S2, going 55 mph” (s2 = .9). Example 2: Query! No response (no data).

  13. Modeling sensor availability • Set W of observations (cars) we can select from: $W = \{C_1, \dots, C_m\}$; pick $B \subseteq W$ • Road segments $V = \{S_1, \dots, S_n\}$ • If we select car $C_j$, we observe $S_i$ with probability $P(i \mid C_j)$, so the observed set is a random $A \subseteq V$ drawn from $P(A \mid B)$ • Goal: maximize expected utility: $B^* = \operatorname{argmax}_{|B| \le k} \sum_A P(A \mid B)\, R(A)$, where R(A) is the utility
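
For a handful of cars, the expected utility can be computed exactly by enumerating outcomes. The sketch below assumes each queried car independently reports exactly one segment, or nothing at all with the leftover probability. That independence assumption, the `AVAILABILITY` table and the function names are illustrative, not taken from the talk.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, Sequence

Availability = Dict[str, Dict[str, float]]


def expected_utility(selected_cars: Sequence[str],
                     availability: Availability,
                     utility: Callable[[FrozenSet[str]], float]) -> float:
    """F(B) = sum over observed sets A of P(A | B) * R(A), by enumeration."""
    outcome_lists = []
    for car in selected_cars:
        probs = availability[car]
        outcomes = list(probs.items())
        # Leftover probability mass means the car does not respond.
        outcomes.append((None, 1.0 - sum(probs.values())))
        outcome_lists.append(outcomes)

    total = 0.0
    for joint in product(*outcome_lists):
        probability = 1.0
        observed = set()
        for segment, prob in joint:
            probability *= prob
            if segment is not None:
                observed.add(segment)
        total += probability * utility(frozenset(observed))
    return total


# Hypothetical availability model P(segment | car).
AVAILABILITY = {
    "C1": {"s1": 0.5, "s2": 0.3},
    "C2": {"s2": 0.6},
    "C3": {"s7": 0.9},
}


def num_segments(observed: FrozenSet[str]) -> float:
    """Toy utility: number of distinct segments observed."""
    return float(len(observed))


print(expected_utility(["C1", "C2"], AVAILABILITY, num_segments))
```

The enumeration visits every combination of per-car outcomes, so it grows exponentially with the number of selected cars and is only practical for tiny instances.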

  14. Optimizing community sensing. Lemma: Whenever R(A) is submodular, the function $F(B) = \sum_{A:\,|A| \le k} P(A \mid B)\, R(A)$ is submodular. Good news: can use the greedy algorithm to optimize selection. Bad news: F(B) is a sum over exponentially many terms. Theorem: For any $\varepsilon, \delta$, can find set $B'$ such that $F(B') \ge (1 - 1/e) \max_{|B| \le k} F(B) - \varepsilon$ with probability $1 - \delta$, using independent samples of R(A)
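
One way to picture the sampling idea is a plain Monte Carlo estimate of F(B) plugged into the greedy loop. This is a rough sketch under an assumed availability model (each car independently reports one segment or nothing). It does not reproduce the sample-size bounds behind the theorem, and all names and numbers are hypothetical.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Sequence

Availability = Dict[str, Dict[str, float]]


def sample_observed(selected_cars: Sequence[str],
                    availability: Availability,
                    rng: random.Random) -> FrozenSet[str]:
    """Draw one random observed set A from the assumed model P(A | B)."""
    observed = set()
    for car in selected_cars:
        u = rng.random()
        cumulative = 0.0
        for segment, prob in availability[car].items():
            cumulative += prob
            if u < cumulative:
                observed.add(segment)
                break  # otherwise the car gave no response
    return frozenset(observed)


def estimate_f(selected_cars: Sequence[str], availability: Availability,
               utility: Callable[[FrozenSet[str]], float],
               n_samples: int, rng: random.Random) -> float:
    """Average R(A) over independent samples of A."""
    total = 0.0
    for _ in range(n_samples):
        total += utility(sample_observed(selected_cars, availability, rng))
    return total / n_samples


def sampled_greedy(cars: Sequence[str], availability: Availability,
                   utility: Callable[[FrozenSet[str]], float],
                   k: int, n_samples: int = 2000, seed: int = 0) -> List[str]:
    """Greedy selection of cars using sampled estimates of F."""
    rng = random.Random(seed)
    chosen: List[str] = []
    for _ in range(k):
        remaining = [c for c in cars if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: estimate_f(
            chosen + [c], availability, utility, n_samples, rng))
        chosen.append(best)
    return chosen


# Hypothetical model: C1 and C2 usually report the same segment s1.
AVAILABILITY = {"C1": {"s1": 0.9}, "C2": {"s1": 0.8}, "C3": {"s2": 0.5}}


def num_segments(observed: FrozenSet[str]) -> float:
    return float(len(observed))


print(sampled_greedy(["C1", "C2", "C3"], AVAILABILITY, num_segments, k=2))
```

In this toy model the second pick goes to the less reliable car C3, because C2 would mostly duplicate what C1 already reports.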

  15. Handling user preferences • Need to respect user preferences • “Sample my speed at most once per day” • “Don’t measure my speed for the next hour” • “Never sample close to my home” • “Wait at least 10 minutes between samples” • Can accommodate preferences using constrained optimization: $B^* = \operatorname{argmax}_B F(B)$ subject to $C(B) \le L$, where C is a complex cost function and L is the sensing budget. Can still get near-optimal solutions (details in paper)
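
The talk defers its constrained algorithm and guarantees to the paper, so the following is only an illustrative sketch of how a budget and a preference rule might enter a selection loop. It uses a common cost-benefit greedy heuristic with an additive cost (an assumption; the slide describes the cost function as complex) and encodes "wait at least 10 minutes between samples" as a feasibility check. All names and data are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Probe = Tuple[str, int]  # (car id, minute at which the car would be queried)


def respects_min_gap(probe: Probe, chosen: Sequence[Probe], min_gap: int = 10) -> bool:
    """Preference: at least `min_gap` minutes between probes of the same car."""
    return all(other[0] != probe[0] or abs(other[1] - probe[1]) >= min_gap
               for other in chosen)


def budgeted_greedy(candidates: Sequence[Probe],
                    utility: Callable[[FrozenSet[Probe]], float],
                    cost: Callable[[Probe], float],
                    budget: float,
                    allowed: Callable[[Probe, Sequence[Probe]], bool]) -> List[Probe]:
    """Add the feasible probe with the best gain-per-cost until none fits."""
    chosen: List[Probe] = []
    spent = 0.0
    while True:
        base = utility(frozenset(chosen))
        best, best_ratio = None, 0.0
        for probe in candidates:
            if probe in chosen or not allowed(probe, chosen):
                continue  # already used, or violates a user preference
            if spent + cost(probe) > budget:
                continue  # would exceed the sensing budget L
            gain = utility(frozenset(chosen) | {probe}) - base
            ratio = gain / cost(probe)
            if ratio > best_ratio:
                best, best_ratio = probe, ratio
        if best is None:
            return chosen
        chosen.append(best)
        spent += cost(best)


# Hypothetical candidates: car C1 at minutes 0, 5 and 15, car C2 at minute 3.
CANDIDATES = [("C1", 0), ("C1", 5), ("C1", 15), ("C2", 3)]

selection = budgeted_greedy(
    CANDIDATES,
    utility=lambda probes: float(len(probes)),  # toy utility: one unit per probe
    cost=lambda probe: 1.0,                     # toy cost: one unit per probe
    budget=3.0,
    allowed=respects_min_gap,
)
print(selection)
```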

  16. Community Sensing Summary Phenomenon Demand Availability & Preferences • Optimize value of probing roving sensors • Utility (expected error reduction) • Demand (usage: “utilitarian” impact) • Sensor availability • Predict location based on history • Preferences • Abide by preferences • E.g., frequency / number of probes, min. inter-probe interval • Other constraints: e.g., “Not near my home!”

  17. Phenomenon modeling • 3 months of data from 534 segments across 7 highways and interstates near Seattle, WA • Samples at 15-minute intervals • Use Gaussian Process to model road speeds (covariance function based on road network topology) • Can compute utility R(A) in closed form!
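
For a Gaussian model, a closed form follows from the standard conditioning identity. The notation below is assumed for illustration, since the slide does not show it: $\Sigma$ is the covariance matrix of the speeds, $\Sigma_{iA}$ the row of covariances between $S_i$ and the observed set, and $\Sigma_{AA}$ the covariance among observed segments.

```latex
\operatorname{Var}(S_i \mid S_A) = \Sigma_{ii} - \Sigma_{iA}\,\Sigma_{AA}^{-1}\,\Sigma_{Ai},
\qquad
R(A) = \mathrm{DMSE}(\emptyset) - \mathrm{DMSE}(A)
     = \sum_i \mathbb{E}[D_i]\;\Sigma_{iA}\,\Sigma_{AA}^{-1}\,\Sigma_{Ai}.
```

In the Gaussian case the conditional variance does not depend on the observed values $s_A$, which is why the utility can be evaluated before any car is queried.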

  18. Demand modeling. Demand = #cars on road segment. Estimate demand based on 3166 ClearFlow route requests. [Figure: expected demand (rush hour).]

  19. Evaluating model accuracy. Accurate estimation of prediction error! [Plot: demand-weighted RMS (lower is better) vs. number of locations.]

  20. Demand driven querying. 65% error reduction using only 10 (of 534) observations! Optimized sensing requires 10x fewer samples! [Plot: lower is better.]

  21. Availability modeling • Microsoft Multiperson Location Survey (MSMLS) [Krumm ‘06] • GPS traces from 85 drivers, 6+ days each • Associate GPS readings with road segments (“map matching”) • Two models of sensor availability: • Spatial obfuscation • Sparse querying [Image: GPS used in MSMLS.]

  22. Spatial obfuscation • Motivation: Privacy through enforcing uncertainty about sensor location. Community Sensing Service to population of sensors: request road speed at some location in area X. Population of sensors to Community Sensing Service: anonymized response from random car in cell X (if available).

  23. Spatial obfuscation. Discretization ≈ Utility / Privacy knob. High accuracy even with coarse discretization. [Plot: lower is better.]

  24. Obfuscation by sparse querying • Associate roving sensors with anonymous ID • Learn availability model for each sensor from data. Community Sensing Service to population of sensors: request road speed and location from car Ci. Population of sensors to Community Sensing Service: response from car Ci (if connected to network / available).

  25. Obfuscation by sparse monitoring. Biggest difference in “important” part of the curve. 50% error reduction over mean if querying 10 “cars”. [Plot: lower is better.]

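  26. Mobile vs. fixed sensors • When does it “pay off” to use mobile vs. fixed sensors? • Experiment: cost $C(B) = \lambda \cdot \#\mathrm{fixed}(B) + \#\mathrm{mobile}(B)$, where $\lambda$ is the relative cost of a fixed sensor; maximize $F(B)$ subject to $C(B) \le L$ for a fixed budget $L$. Mobile sensors pay off if fixed sensors are 4x as expensive.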

  27. Extensions / Future work • Spatio-temporal models (see paper) • How to quickly learn good models (see paper) • Other applications: • Population fitness? • News coverage? • Reconstruction of 3D cities? • Formal privacy guarantees?

  28. Related work • Travel time estimation using cell phones [Wunnava et al ’07] • Privacy-aware querying of cars with GPS & cell phones [Bayen et al ’08, forthcoming] • Spatial monitoring, experimental design etc. (see paper)

  29. Conclusions • Presented integrated approach to community sensing (phenomenon, demand, availability & preferences) • Theoretical analysis → near-optimal sensing policies • Extensive empirical evaluation on traffic monitoring case study
