
Processing Continuous Join Queries in Sensor Networks: A Filtering Approach by Mirco Stern, Klemens Böhm, Erik Buchmann

SIGMOD Conference 2010: 267-278. Presenter: Bryan Guthrie.




Presentation Transcript


  1. Processing Continuous Join Queries in Sensor Networks: A Filtering Approach by Mirco Stern, Klemens Böhm, Erik Buchmann SIGMOD Conference 2010: 267-278 Presenter: Bryan Guthrie

  2. Wireless Sensor Networks • Consist of battery-operated nodes equipped with sensors • Constrained communication/computation capabilities

  3. WSN Query Processing • Abstract the network into a relation, with one tuple per node and attributes representing the sensors of the node • Previous work: processing selections and projections well understood, but less attention to joins (until recently)

  4. Continuous queries • Report current sensor readings periodically • Joins allow us to combine data from different nodes • Applications: monitoring and surveillance

  5. Example query • Acquire data from nodes observing similar temperature and humidity conditions

  6. Goal: Minimize Energy Costs • Sensing and communication costs dominate other areas – thus, goal is to minimize communications • IDEAL: Each node discards non-joining tuples, then sends remaining tuples to base station (where computation occurs) • Infeasible because each node would need to know if its tuple joins - expensive!

  7. Prior work • Precompute the set of tuples that join • Not optimal for continuous queries, because of the cost of updating this set prior to each execution

  8. Continuous Join Filtering • Maintain filters at nodes • Discard tuples whose attribute value is within filter interval, send the rest • Filter size needs to be optimized for efficiency

  9. Maintaining filters • Base station continuously computes filters that minimize communication costs • For each execution, the base station decides which filters to update • Updates require sending them to nodes, so small changes may not pay off

  10. Some filter definitions • A filter is a multidimensional interval, with one range [ai, bi] per dimension i • Node j's filter is filterj • If the attribute values of node j are within the interval of filterj in all dimensions of the filter, then j does not send its data
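The node-side test described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the name `is_filtered`, the closed-interval boundaries, and the humidity dimension with its values are assumptions made for the example.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # one range [a_i, b_i] per dimension (assumed closed)


def is_filtered(reading: Sequence[float], node_filter: Sequence[Interval]) -> bool:
    """Return True when every attribute value lies inside its filter range."""
    if len(reading) != len(node_filter):
        raise ValueError("reading and filter must have the same number of dimensions")
    return all(low <= value <= high
               for value, (low, high) in zip(reading, node_filter))


# Hypothetical two-dimensional filter: temperature (deg C) and relative humidity (%).
filter_j = [(22.0, 23.0), (40.0, 45.0)]

print(is_filtered([22.4, 41.0], filter_j))  # inside in both dimensions: node stays silent
print(is_filtered([22.4, 47.0], filter_j))  # humidity outside its range: tuple is sent
```

A tuple is suppressed only when it is inside the filter in every dimension; leaving the range in a single dimension is enough to trigger a transmission.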

  11. Ensuring correctness • If node j has filtered its tuple tj, note that this means tj must be within filterj • Base station can check if any values in filterj would join with data from other nodes, and if so retrieves tj from node j (ensuring correctness)

  12. Example • Say node j's filter is [22ºC, 23ºC] • Node h sends a tuple with temperature value 24ºC → cannot join with j's tuple • Node h sends a tuple with temperature value 23.1ºC → could join with j's tuple, so we need to retrieve it
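The check in this example can be sketched as follows. The slide does not state the join condition it uses, so the sketch assumes the predicate |A.temp – B.temp| < 0.3 that appears in a later example; the function name `may_join` is likewise hypothetical.

```python
def may_join(value: float, filter_range: tuple, threshold: float) -> bool:
    """True if some point of the closed filter range is closer than threshold to value."""
    low, high = filter_range
    # Distance from value to the nearest point of the interval (0 if value is inside).
    distance = max(low - value, 0.0, value - high)
    return distance < threshold


filter_j = (22.0, 23.0)
THRESHOLD = 0.3  # assumed join condition |A.temp - B.temp| < 0.3

for temp_h in (24.0, 23.1):
    if may_join(temp_h, filter_j, THRESHOLD):
        print(temp_h, "could join with t_j: retrieve t_j from node j")
    else:
        print(temp_h, "cannot join with t_j: nothing to retrieve")
```

The base station never sees the filtered tuple itself, so it has to reason about the whole interval: if any value in filterj could satisfy the predicate, the tuple is fetched.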

  13. Filter size • How big should filters be? • Not so small that unneeded tuples are sent, not so large that needed tuples are filtered out and must be retrieved • Avoid collisions between filters, which happen when some of the values in the filters join • Optimal filter size is not uniform across nodes • Use smaller filters if there are more potential join partners for a node

  14. Optimizing filters • Goal: minimize communication costs for next query execution • Therefore, we want to find the filter size that minimizes communications • If there are several minima, pick the one that has the smallest filter size (less risk of collisions) • Continuously updated based on previous filter size and projected sensor readings
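The tie-breaking rule on this slide (among several minima, take the smallest filter) can be sketched as a lexicographic minimum. The candidate sizes and the cost table below are made-up illustration values, and `choose_filter_size` is a hypothetical helper, not part of CJF as published.

```python
from typing import Callable, Iterable


def choose_filter_size(candidates: Iterable[float],
                       expected_cost: Callable[[float], float]) -> float:
    """Pick the size with the lowest expected cost; break ties by the smaller size."""
    return min(candidates, key=lambda size: (expected_cost(size), size))


# Hypothetical expected number of messages for each candidate filter width.
cost_table = {0.2: 5, 0.5: 3, 1.0: 3, 2.0: 6}

best = choose_filter_size(cost_table, cost_table.__getitem__)
print(best)  # 0.5 and 1.0 tie on cost, so the smaller width is chosen
```

With real-valued cost estimates, exact ties are unlikely, so a practical version would probably compare costs up to a tolerance; that refinement is omitted here.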

  15. Predicting measurements • CJF is not tied to any particular model for predicting measurements • For evaluation purposes, the authors used a linear regression model • Known to work well with sensor data sets • No need to fit model to data • Low maintenance costs
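As a rough illustration of a linear-regression predictor, the sketch below fits a least-squares line through a node's most recent readings and extrapolates one step ahead. The window contents, the assumption of equally spaced samples, and the name `predict_next` are choices made for this example; the slides do not specify how the authors configured their model.

```python
from typing import Sequence


def predict_next(readings: Sequence[float]) -> float:
    """Fit y = intercept + slope * t by least squares and evaluate it at the next step."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to fit a line")
    times = range(n)  # assumes equally spaced samples at t = 0, 1, ..., n-1
    mean_t = sum(times) / n
    mean_y = sum(readings) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, readings))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_y - slope * mean_t
    return intercept + slope * n


recent_temps = [22.0, 22.1, 22.2, 22.3]  # hypothetical readings in deg C
print(round(predict_next(recent_temps), 2))
```

The predicted value is what a filter would be centered on or sized against for the next execution; any other predictor could be substituted, since CJF is not tied to a particular model.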

  16. Updating filters • Redistributing new filter sizes every execution will cost more than it saves • Therefore, only send updates if the expected savings outweigh update costs • This can't be done in isolation; whether filterj is updated or not affects all nodes depending on filterj

  17. Example • filterj = [22ºC, 23ºC], filterh = [23.5ºC, 23.9ºC] • filterj wants to shrink to [22ºC, 22.7ºC], filterh wants to grow to [23.1ºC, 23.9ºC] • Assuming the join condition |A.temp – B.temp| < 0.3ºC, h can't update unless j updates, because the old filterj and new filterh can contain joining tuples • j is a blocking node, h is a dependent node
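The dependency in this example can be verified with a small collision test on one-dimensional filters. The intervals and the 0.3 threshold come from the slide; the function name `filters_collide` and the closed-interval treatment are assumptions of this sketch.

```python
def filters_collide(f1: tuple, f2: tuple, threshold: float) -> bool:
    """True if some value in f1 and some value in f2 differ by less than threshold."""
    # Gap between the two closed intervals (0 if they overlap).
    gap = max(f1[0] - f2[1], f2[0] - f1[1], 0.0)
    return gap < threshold


THRESHOLD = 0.3  # join condition |A.temp - B.temp| < 0.3

old_j, new_j = (22.0, 23.0), (22.0, 22.7)
old_h, new_h = (23.5, 23.9), (23.1, 23.9)

print(filters_collide(old_j, old_h, THRESHOLD))  # current filters: gap 0.5, no collision
print(filters_collide(old_j, new_h, THRESHOLD))  # h grows alone: gap 0.1, collision
print(filters_collide(new_j, new_h, THRESHOLD))  # both update: gap 0.4, no collision
```

Only the middle combination collides, which is why h's enlargement has to wait for j's shrink.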

  18. Updating filters • Must be done in the following order: • First, resolve filter collisions (because these double communication costs, since the base station must retrieve the colliding tuples) • Second, shrink filters, but only if the cost of a suboptimal filter + the cost of suboptimal filters on blocked nodes is greater than the update cost • Third, enlarge filters, if the cost of a suboptimal filter is greater than the update cost
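The ordering and the two cost tests on this slide can be sketched as a simple planning function. This is a simplification under stated assumptions: the `PendingUpdate` record, its field names, and all cost numbers are hypothetical; collisions are assumed to always be resolved; and the sketch does not re-check whether an enlargement is still blocked after the shrink decisions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingUpdate:
    node: str
    kind: str                   # "collision", "shrink", or "enlarge"
    suboptimal_cost: float      # expected cost of keeping the current filter
    blocked_cost: float = 0.0   # expected cost on nodes blocked by this filter
    update_cost: float = 1.0    # cost of sending the new filter to the node


# Processing order taken from the slide: collisions, then shrinks, then enlargements.
ORDER = {"collision": 0, "shrink": 1, "enlarge": 2}


def plan_updates(pending: List[PendingUpdate]) -> List[str]:
    """Return the nodes whose filters should be updated, in processing order."""
    plan = []
    for update in sorted(pending, key=lambda u: ORDER[u.kind]):
        if update.kind == "collision":
            send = True  # assumption: collisions are always worth resolving
        elif update.kind == "shrink":
            send = update.suboptimal_cost + update.blocked_cost > update.update_cost
        else:
            send = update.suboptimal_cost > update.update_cost
        if send:
            plan.append(update.node)
    return plan


pending = [
    PendingUpdate("h", "enlarge", suboptimal_cost=1.5),
    PendingUpdate("m", "enlarge", suboptimal_cost=0.2),
    PendingUpdate("j", "shrink", suboptimal_cost=0.4, blocked_cost=0.9),
    PendingUpdate("k", "collision", suboptimal_cost=2.0),
]
print(plan_updates(pending))
```

In this made-up input, node j's shrink would not pay off on its own (0.4 is below the update cost of 1.0), but it is sent because the cost it imposes on blocked nodes tips the balance.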

  19. Evaluation • Used publicly available LUCE data set (environmental sensors) • Compared against 5 alternatives • External joins • SENS-Join (uses precomputation) • IDEAL • Adaptive Precision Setting – like CJF but does not directly account for dependencies • UNIFORM (all filters are the same size)

  20. Eval. – Communications Needed • CJF outperforms other methods and is closest to IDEAL

  21. Evaluation - Dependencies • Considering dependencies reduces collisions, and thus reduces communications

  22. Eval. – Individual optimization • Using individual filter sizes is better than uniform sizes in all cases

  23. Conclusions • Continuous Join Filtering minimizes communications (and therefore energy) costs compared to other models • CJF is closest to optimal for continuous queries • Considering dependencies improves performance
