
When the Sensors Hit the Road: The CarTel Project

Sam Madden, MIT CSAIL, http://cartel.csail.mit.edu


Presentation Transcript


  1. When the Sensors Hit the Road: The CarTel Project • Sam Madden, MIT CSAIL, http://cartel.csail.mit.edu • With Hari Balakrishnan, Vladimir Bychkovsky, Jakob Eriksson, Bret Hull, Yang Zhang, Kevin Chen, Waseem Daher, Michel Goraczko, Hongyi Hu, Sejoon Lim, Allen Miu, Daniela Rus, Eugene Shih, Arvind Thiagarajan, Sivan Toledo • Stanford Database Group

  2. 1st Generation Sensor Networks (~= “Mote”) • Periodic monitoring • Wake up and sense • Sleep for minutes • Event-based monitoring • Transmit on external event • Low data rates, duty cycles • Static, small area • Extremely underpowered devices

  3. Participatory Sensing • “Crowd-sourced” sensing • Individuals contribute their data • Rather than relying on fixed deployments • Examples: • Personalized Environmental Impact Report • Microsoft SensorMap • Dartmouth BikeNet • MapMyRun, etc. CarTel

  4. Cars as a Vehicle for Participatory Sensing • Observation: • Static sensing infeasible over very wide areas • Some apps do not need high temporal fidelity • Real-world problems: • Civil infrastructure monitoring • Road-surface conditions • Visual mapping • Commute optimization • Carpool finding • Speed trap mapping

  5. Opportunistic Mobility • Rather than deploy new mobile nodes, take advantage of existing mobility • Example: cellphones w/ sensors • 1.5 billion phones worldwide • High spatial coverage • High-performance processor • Cars equipped with sensors • 650 million cars on the road • Abundance of power and space • Have >100 embedded sensors What system architecture is best suited for mobile, wide-area sensing?

  6. CarTel System Overview • Visualize: Portal • Deliver: Cabernet / QuickWifi (Carry & Forward Network) • Collect / Process: ICEDB (Intermittently Connected & Embedded DB) • (Diagram labels: clients, data collection servers, open Wi-Fi, GPRS, per-node sensing & computation hardware)

  7. Roadmap • Overview • CarTel Components • Portal • IceDB • Cabernet • Managing missing and uncertain data • Case Studies • Traffic Analysis • WiFi Mapping • Potholes

  8. Portal • Web-based visualization framework • Apps retrieve sensor data by issuing queries to ICEDB • Visualize sensor data using map overlays • Continuous queries to direct sensing • Pushed to remote nodes using Cabernet • Local ad-hoc queries read streaming results • (Diagram labels: web server, Portal applications (Traffic, Wi-Fi), ICEDB server, Portal data viz., rel. DB, CQ, Cabernet, streaming sensor data, cont. queries) Visualize / Analyze

  9. Data Collection Demo Visualize / Analyze

  10. ICEDB: Intermittently Connected Embedded DB • Relational model is a convenient abstraction for data collection • ICEDB: • Queries written in extended SQL • Continuous query processor • Distributed • Bandwidth is variable • Buffer query and DDL commands • Buffer query results • Support “drill down” queries • Prioritize results • Example query: SELECT img FROM camera, gps WHERE gps.pos in [x,y] AND camera.time = gps.time SAMPLE 1s Collect / Process

  11. ICEDB Query Processing • 2 data paths to cope with limited BW • (Diagram labels: ICEDB Server, queries, results, ICEDB Remote on the remote node, CQ, output buffers, Cabernet, adapters, sensors, ad-hoc query processor, DB) Collect / Process

  12. Inter-query Prioritization • FIFO: SELECT lat, lon FROM gps WHERE insert_time > cqtime - 5 EVERY 5 seconds BUFFER IN gpsbuf DELIVERY ORDER fifo(gpsbuf) • BISECT: SELECT lat, lon FROM gps WHERE insert_time > cqtime - 5 EVERY 5 seconds BUFFER IN gpsbuf DELIVERY ORDER bisect(gpsbuf) • (Diagram: for each order, what the remote node has delivered to the Portal at t=1, t=2, t=3) Collect / Process
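
The FIFO and BISECT delivery orders on this slide can be sketched as follows. This is a minimal illustration, not ICEDB's implementation: it assumes that bisect sends the two endpoints of the buffered trace first and then the midpoints of progressively smaller intervals, so a coarse outline of the trace arrives before the detail. The function names `fifo_order` and `bisect_order` are hypothetical.

```python
from collections import deque

def fifo_order(buf):
    """Deliver buffered tuples oldest first."""
    return list(buf)

def bisect_order(buf):
    """Deliver endpoints first, then midpoints of ever-smaller intervals."""
    n = len(buf)
    if n <= 2:
        return list(buf)
    order = [0, n - 1]
    pending = deque([(0, n - 1)])      # intervals whose interior is unsent
    while pending:
        lo, hi = pending.popleft()
        if hi - lo < 2:
            continue                   # no interior point left
        mid = (lo + hi) // 2
        order.append(mid)
        pending.append((lo, mid))
        pending.append((mid, hi))
    return [buf[i] for i in order]

gps_buffer = list(range(9))            # stand-in for 9 buffered GPS tuples
print(fifo_order(gps_buffer))          # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(bisect_order(gps_buffer))        # [0, 8, 4, 2, 6, 1, 3, 5, 7]
```

If the link drops after only three tuples, FIFO has delivered the first three points of the trace, while this bisect order has delivered its start, end, and middle.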

  13. Global Prioritization • SELECT lat, lon, image FROM camera WHERE insert_time > cqtime - 5 EVERY 5 seconds BUFFER IN cambuf SUMMARIZE AS SELECT floor(lat/100), floor(lon/100) FROM cambuf GROUP BY floor(lat/100), floor(lon/100) • (Diagram: remote nodes sending to ICEDB Server) Collect / Process

  14. Global Prioritization • (Diagram: ICEDB Server; items labeled 1, 1, 2, 2, 2, 1) Collect / Process
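
One way to read the SUMMARIZE AS flow is sketched below: the node groups its buffer into the coarse cells named in the summary query, the server ranks those cells, and the node delivers rows in the ranked order. The ranking policy (prefer cells for which the server already holds the fewest results) and all function names are assumptions made for illustration; the slides do not state how the server orders the summary.

```python
import math
from collections import defaultdict

def summarize(cambuf):
    """Group buffered rows by the coarse cell used in the SUMMARIZE AS query."""
    cells = defaultdict(list)
    for row in cambuf:
        cell = (math.floor(row["lat"] / 100), math.floor(row["lon"] / 100))
        cells[cell].append(row)
    return cells

def server_rank(summary_cells, seen_counts):
    """Hypothetical server policy: cells with the fewest stored results first."""
    return sorted(summary_cells, key=lambda c: (seen_counts.get(c, 0), c))

def delivery_order(cambuf, seen_counts):
    """Order the node's buffered rows by the server's cell ranking."""
    cells = summarize(cambuf)
    ranked = server_rank(cells.keys(), seen_counts)
    return [row for cell in ranked for row in cells[cell]]

cambuf = [
    {"lat": 150, "lon": 250, "image": "a"},
    {"lat": 320, "lon": 410, "image": "b"},
    {"lat": 160, "lon": 260, "image": "c"},
]
# The server already has 5 results for cell (1, 2) and none for cell (3, 4).
print([r["image"] for r in delivery_order(cambuf, {(1, 2): 5})])  # ['b', 'a', 'c']
```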

  15. Specifying Priorities • Three SQL language extensions: • Global prioritization: SUMMARIZE AS • Inter-query prioritization: PRIORITY • Intra-query prioritization: DELIVERY ORDER BY • For more details: ICDE ’06 Collect / Process

  16. Cabernet • Disconnection-tolerant transport layer • Buffer data until connectivity becomes available • Not connection oriented • QuickWifi: Fast connection establishment • CTP: Unlike TCP, wireless losses ≠ congestion losses • QuickWifi: Jakob Eriksson et al., Mobicom 08 Deliver

  17. Challenge: Connection Establishment is Slow • Process • Scan 11 channels (600 ms average), receive beacon • Authenticate (1 round trip, 3s timeout) • Associate (1 round trip, 3s timeout) • DHCP Discovery (1 round trip, 3s timeout) • DHCP Request (1 round trip, 3s timeout) • ARP Request (1 round trip, 2s timeout) • Many messages, loss rates high, retries often needed • Default Linux stack took 13s to associate, on average (whole connection lasts only 19s avg!) Deliver

  18. Connection Optimizations • Scan more popular channels first (1,6,11) • Since authentication in open networks always succeeds, do it in parallel with association • Set timeouts to 100 ms • DHCP via broadcast addr • Mean conn. time: 370 ms
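
A back-of-the-envelope calculation shows why setup time matters so much here. It combines the averages quoted on the two slides (19 s connection, 13 s default setup, 370 ms optimized setup); combining averages this way is only a rough estimate, and the helper name `usable_fraction` is hypothetical.

```python
def usable_fraction(connection_s, setup_s):
    """Share of a connection left for data after connection establishment."""
    return max(0.0, connection_s - setup_s) / connection_s

default_stack = usable_fraction(19.0, 13.0)   # default Linux stack
optimized = usable_fraction(19.0, 0.370)      # with the optimizations above

print(f"default:   {default_stack:.0%} of the connection carries data")
print(f"optimized: {optimized:.0%} of the connection carries data")
```

Under these assumptions roughly a third of an average connection is usable with the default stack, against about 98% after the optimizations.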

  19. Roadmap • Overview • CarTel Components • Portal • IceDB • Cabernet • Managing missing and uncertain data • Case Studies • Traffic Analysis • WiFi Mapping • Potholes

  20. Noisy Data Challenge • Challenge: how to store and query all of this data? • Discrete points don’t work well • Most apps don’t actually want raw data! • Prefer trajectories, fields, fit functions • Idea: support these as first class objects inside the DBMS

  21. Model-Based Views • Proposed in MauveDB [SIGMOD 06] • Models can be queried like database views • Basic idea: • compute model • grid modeled area • use model to compute values at each cell of grid • answer queries using grid • FunctionDB: efficient implementation for an important class of models • Continuous functions of one or more variables
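
The four steps of the basic idea (compute model, grid the area, evaluate the model at each cell, answer queries from the grid) can be sketched in a few lines. This is an illustrative sketch only: it assumes a one-dimensional straight-line model fitted by ordinary least squares, and `fit_line`, `materialize_view`, and `query_above` are hypothetical names, not MauveDB's interface.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def materialize_view(points, start, stop, step):
    """Fit the model, then evaluate it at every grid position."""
    a, b = fit_line(points)
    n_cells = int(round((stop - start) / step)) + 1
    return [(start + i * step, a + b * (start + i * step)) for i in range(n_cells)]

def query_above(view, thresh):
    """Answer 'where is the modeled value >= thresh?' from the grid alone."""
    return [x for x, y in view if y >= thresh]

readings = [(0, 10), (1, 12), (2, 14), (3, 16)]   # raw (time, temp) samples
view = materialize_view(readings, 0.0, 3.0, 0.5)
print(query_above(view, 15.0))                    # [2.5, 3.0]
```

The query returns time 2.5 even though no raw reading exists there, which is the point of querying the model rather than the discrete samples.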

  22. Benefits • Declarative Queries • No need to write procedures to manipulate models, or re-implement or re-optimize each new query • View provides intuitive SQL-like interface • If data is already in database, do not need to move data to/from math package like MATLAB

  23. FunctionDB: Key Idea • Database system that fits continuous functions to data • MauveDB-like query interface (SQL + Grids) • Algebraic query processor • Query: report when temp crosses threshold: SELECT time WHERE temp = thresh • Execution: fit regression function temp(t) to raw data (temp readings), then solve equation temp(t) = thresh • (Figure: temp plotted against time)
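
The threshold query can be answered by solving the fitted function for time instead of scanning raw readings. The sketch below assumes a piecewise-linear model stored as rows `(t_lo, t_hi, a, b)` meaning temp(t) = a + b*t on [t_lo, t_hi); this row layout and the name `crossing_times` are illustrative assumptions, not FunctionDB's actual storage format.

```python
def crossing_times(function_table, thresh):
    """Solve a + b*t = thresh on each piece and keep roots inside the piece."""
    times = []
    for t_lo, t_hi, a, b in function_table:
        if b == 0:
            continue                   # constant piece: no isolated crossing
        t = (thresh - a) / b
        if t_lo <= t < t_hi:
            times.append(t)
    return times

# temp rises from 15 to 25 over [0, 10), then falls back to 15 over [10, 20).
function_table = [(0, 10, 15.0, 1.0), (10, 20, 35.0, -1.0)]
print(crossing_times(function_table, 20.0))   # [5.0, 15.0]
```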

  24. FunctionDB: System Architecture • User query and query result pass through the Algebraic Query Processor (operates on algebraic representation) • “Function Table” (storing piecewise model) sits alongside the raw data • User fits or imports functions

  25. Query Results • Grid semantics: all queries yield discrete points sampled at user-specified interval (“grid size”) • SELECT x,y WHERE temp < 20 GRID x 8, y 8 • (Figure: region temp < 20 sampled with grid spacing 8 in x and y)

  26. Algebraic Execution: Overview • FunctionDB supports efficient algebraic execution for piecewise polynomial functions • Restriction helps achieve grid semantics even when no closed form solutions are available • E.g., Non-linear polynomials, multi-variate polynomials, complex constraints • X^2 + Y^3 < 25

  27. Efficient Algebraic Implementation • Query: SELECT * WHERE Temp < 20 GRID X 8, Y 8 • Function Table, Temp = F(X,Y): X:[0,20) , Y:[0,20) , Temp = X+Y • Symbolic Filter (Temp < 20): X:[0,20) , Y:[0,20) , Temp = X+Y , Temp < 20 • Substitute: X:[0,20) , Y:[0,20) , Temp = X+Y , X+Y-20 < 0 • Approx (not needed for linear univariate polynomials): Hypercubes + Boundary Points, X+Y < 20 • Grid: X+Y < 20, Gridded Result returned to User
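
For the linear example on this slide, the substitute step turns Temp < 20 into a constraint on X and Y that can be solved directly, so grid points are generated rather than tested one by one. The sketch below is an assumption-laden illustration: it handles a single piece Temp = cx*X + cy*Y + c0 with cy > 0, aligns grid points to multiples of the grid size, and uses the hypothetical name `gridded_filter`.

```python
import math

def gridded_filter(piece, thresh, grid):
    """Grid points of one piece where cx*X + cy*Y + c0 < thresh (needs cy > 0)."""
    (x_lo, x_hi), (y_lo, y_hi), (cx, cy, c0) = piece
    points = []
    i = math.ceil(x_lo / grid)
    while i * grid < x_hi:
        x = i * grid
        # Substitute the model into Temp < thresh and solve for Y.
        y_bound = min(y_hi, (thresh - c0 - cx * x) / cy)
        j = math.ceil(y_lo / grid)
        while j * grid < y_bound:
            points.append((x, j * grid))
            j += 1
        i += 1
    return points

# Function-table row from the slide: X:[0,20), Y:[0,20), Temp = X + Y.
piece = ((0, 20), (0, 20), (1, 1, 0))
print(gridded_filter(piece, 20, 8))
# [(0, 0), (0, 8), (0, 16), (8, 0), (8, 8), (16, 0)]
```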

  28. Hypercube Approximation • Use technique from graphics rasterization (Taubin’s test) • Tests if polynomial F(X,Y, …) can have zeros within a given hypercube H (i.e., F lies on boundary of H) • Allows efficient pruning by testing corners of hypercube • (Figure: surface Z = F(X,Y) meeting the plane Z = 0 at a zero of F inside hypercube H; axes X, Y)

  29. Subdivision Algorithm • Bounding box predicate • Check centers of all grid cells smaller than grid size • (Figure: bounding box subdivided into cells marked + or -)
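
The subdivision idea can be sketched as a recursive quadtree-style search. Note the substitution: instead of Taubin's test, this sketch classifies a box by the sign of F at its four corners, which is only exact when F is affine (as in the X+Y-20 example); for general polynomials a proper bound such as Taubin's test would be needed. The name `inside_boxes` is hypothetical.

```python
def inside_boxes(f, box, grid):
    """Return sub-boxes of `box` classified as satisfying f(x, y) < 0."""
    (x0, x1), (y0, y1) = box
    signs = [f(x, y) < 0 for x in (x0, x1) for y in (y0, y1)]
    if not any(signs):
        return []                      # false at every corner: prune the box
    if all(signs):
        return [box]                   # true at every corner: accept whole box
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    if x1 - x0 <= grid and y1 - y0 <= grid:
        # Cell is no larger than the grid size: decide by its center.
        return [box] if f(xm, ym) < 0 else []
    result = []
    for sub in (((x0, xm), (y0, ym)), ((x0, xm), (ym, y1)),
                ((xm, x1), (y0, ym)), ((xm, x1), (ym, y1))):
        result.extend(inside_boxes(f, sub, grid))
    return result

boxes = inside_boxes(lambda x, y: x + y - 16, ((0, 16), (0, 16)), grid=4)
area = sum((x1 - x0) * (y1 - y0) for (x0, x1), (y0, y1) in boxes)
print(len(boxes), area)
```

The exact area of the region x + y < 16 inside this box is 128; the coarser the grid, the further the approximated area falls from it, which is the discretization error the evaluation slides mention.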

  30. More Algebraic Operators • Solver for single-variable equations • Function Inference • E.g., X+Y = 20 → X = 20-Y • (Eliminates independent variables) • Continuous aggregates • E.g., Average → Integration
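
A continuous average replaces a sum over samples with an integral over the fitted function, divided by the length of the interval. The sketch below assumes a piecewise-linear model stored as rows `(t_lo, t_hi, a, b)` with temp(t) = a + b*t, and that the pieces cover the whole query interval; the layout and the name `continuous_average` are illustrative assumptions.

```python
def continuous_average(function_table, t0, t1):
    """Average of a piecewise-linear function over [t0, t1] by exact integration."""
    total = 0.0
    for t_lo, t_hi, a, b in function_table:
        lo, hi = max(t_lo, t0), min(t_hi, t1)
        if lo < hi:
            # Integral of a + b*t from lo to hi.
            total += a * (hi - lo) + b * (hi * hi - lo * lo) / 2
    return total / (t1 - t0)

# temp rises from 15 to 25 over [0, 10), then falls back to 15 over [10, 20).
function_table = [(0, 10, 15.0, 1.0), (10, 20, 35.0, -1.0)]
print(continuous_average(function_table, 0, 20))   # 20.0
```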

  31. Evaluation: Temperature Data • Data: <Sensor ID, X, Y, time, temp> • 54 sensors, 10 days of data, 1 million raw data points • Fit regression model temp = F(X,Y,time) • Degree-2 piecewise polynomial with 22 pieces • Evaluation queries: • Find regions where temp < threshold (Filter) • Area of region where temp < threshold (Aggregate) • Compared algebraic approach to two baselines: • Evaluating all grid points • Pruning search using a precomputed B-Tree

  32. Result 1: Benefits Of Algebraic Execution Depend On Selectivity

  33. Result 2: Algebraic Execution Wins Significantly For Aggregate Queries Wider grid size is faster, but too wide a grid size results in low accuracy owing to discretization error

  34. Roadmap • Overview • CarTel Components • Portal • IceDB • Cabernet • Managing missing and uncertain data • Case Studies • Traffic Analysis • WiFi Mapping • Potholes

  35. Route Planning • Match traces to map • Compute Gaussian delay for each segment • Assume independence • Minimize 3 Objectives • Distance • Google Maps • Expected delay • Pr(missing time goal)

  36. Max. Probability Planning • Travel time of each edge is a Gaussian • If independent, travel time of a path is also Gaussian • Goal: find path with max. probability of reaching destination by deadline • Unlike standard shortest paths, no optimal substructure • If AxCyB is best path from A to B, AxC is not necessarily the best path from A to C • Implies cannot use A* or Dijkstra • (Figure: graph with nodes A, C, B and edges labeled 1, 2, 3) • Lim et al. “Stochastic Motion Planning and Applications to Traffic.” To Appear, Workshop on Algorithmic Foundations of Robotics, 2008
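
The lack of optimal substructure is easy to check numerically. With independent Gaussian edges, a path's travel time has mean and variance equal to the sums over its edges, so the on-time probability is the normal CDF at (deadline - mean) / sqrt(variance). The edge values below are made up for illustration and the function name is hypothetical.

```python
import math

def on_time_probability(edges, deadline):
    """P(path travel time <= deadline) for independent Gaussian (mean, var) edges."""
    mean = sum(m for m, _ in edges)
    var = sum(v for _, v in edges)
    z = (deadline - mean) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

steady = (10.0, 1.0)    # A -> C: slower on average, low variance
risky = (9.0, 16.0)     # A -> C: faster on average, high variance
last = (5.0, 1.0)       # C -> B
deadline = 12.0

print(on_time_probability([steady], deadline))        # about 0.98
print(on_time_probability([risky], deadline))         # about 0.77
print(on_time_probability([steady, last], deadline))  # about 0.02
print(on_time_probability([risky, last], deadline))   # about 0.31
```

With a deadline of 12, the steady edge is the better way to reach C, yet the best path to B goes through the risky edge: when the deadline is tight, extra variance is the only chance of arriving on time.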

  37. WiFi As A Sensor • Is WiFi feasible as an uplink from cars? • Study with taxis; 30,000 distinct APs • Answer: • Significant connectivity “in the wild” • ~ 200 kB / minute • Even at normal driving speeds • Connections last about 20 seconds • See an access point about every 20 seconds! See MOBICOM 2006

  38. Conclusion • Mobile and participatory sensornets can sense at much higher scale over larger areas than static networks • Applications: traffic, fleet management, automotive diagnostics, wireless network monitoring, civil/environmental monitoring, traffic planning,… • CarTel technologies: Portal, IceDB, Cabernet/QuickWifi, FunctionDB • Platform enabled research results: Pothole Patrol, Traffic, WiFi as Uplink • For more info: http://cartel.csail.mit.edu
