Wireless Sensor Networks Data and Databases Professor Jack Stankovic Department of Computer Science University of Virginia
Outline • Overview of Database Perspective for WSN • Storage Issues • General Architectures • Queries (what they look like) • TinyDB/TAG • Example Protocol: SEAD
Classical DB Schema (Personnel Records) Diagram: a query goes through a query optimizer, which produces an execution plan over the database; the database holds data, indices, and streams (e.g., stock market quotes, news feeds), and the data carries a confidence measure.
Ad Hoc WSN – DB View Diagram: a query (e.g., for a temperature map) goes through a query optimizer to produce a plan; data items Data(i) reside at sensor nodes, with cluster heads providing more storage.
Why Different? • Amount of memory is small • No disks; storage is RAM (and flash) • Highly decentralized • Volatile: nodes sleep/wake, nodes fail • Data is transient • Data is uncertain (range queries) • Query on time/location/area
Why Different? • Multiple queries that follow each other • Real-time streams • Cost models for optimizing query execution plans are difficult to build • Goal: answer the query to a specified confidence level at minimum cost • Minimize energy, messages, time, …
Why Different? • Data is correlated • Example: average temperature over an area of 10 nodes • Expensive to query all 10 • Nodes near each other have similar temperatures • Learn the correlation • e.g., a sensor on a window sill reads x degrees warmer than the center of the room on sunny days
WSN - Data Perspective • Raw sensor readings – data • Process data into information • Example • Magnetic + acoustic + motion => vehicle
WSN – Data Perspective • In-network aggregation • Minimize energy used • Reduce end-to-end delay • Archive all data ?? • Handle (dynamic, periodic) queries • Disseminate queries into WSN • Raise level of abstraction • View as a database
Data Storage • Collect physical measurements, including data streams • Store the data - where? Diagram: raw data flows from the sensor node through detection, classification, and situation assessment; motes have small storage (flash, no disk), so data is cached or logged append-only in memory, or communicated onward.
Data Storage • In-network processing to reduce storage requirements • Send results of queries back to (multiple) users • Can be mobile • Replicate in network stored data for • Efficiency • Reliability
Data Storage • Tag data with confidence level • Encrypt data • Compress data • Drop data • Age data • Aggregate data (min, max, mean, …) • Blur data => privacy
Data Storage • Data consists of real world measurements and is inherently noisy • Exact match queries not always useful • Range-based queries more appropriate • Real-time queries • Sample rates • Deadlines • Data freshness (temporal validity) • Continuous, long running queries
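The contrast between exact-match and range queries on noisy readings can be sketched as follows (a minimal illustration; the node IDs, field names, and threshold values are invented for the example):

```python
# Sketch: why range queries suit noisy sensor data better than exact match.
# All names and values here are illustrative, not from any real deployment.
readings = [
    {"node": 1, "temp": 21.7},
    {"node": 2, "temp": 22.4},
    {"node": 3, "temp": 25.1},
]

def exact_match(data, value):
    """Exact match on a noisy reading rarely hits anything."""
    return [r for r in data if r["temp"] == value]

def range_query(data, low, high):
    """A range tolerates sensor noise and returns useful answers."""
    return [r for r in data if low <= r["temp"] <= high]

print(exact_match(readings, 22.0))        # [] -- noise defeats exact match
print(range_query(readings, 21.0, 23.0))  # nodes 1 and 2
```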
More Complicated Scenario Tree construction: • Hierarchical Structure • Subscription Requests • Replica Placement • Mobility Management
Data Association • Tracking “N” targets • People, vehicles, animals • RFID tags • Known/friendly targets
Architecture (1) Diagram: sensor nodes send their data to the base station; data is stored and queries are performed at the base station.
Applications • Monitor soil moisture • Create temperature maps • …
Architecture (2) Diagram: the base station floods queries into the network; data is stored decentralized at each node.
Applications • Number of horses in a meadow • A tank appears • … • Example protocols: DD, RAP
Architecture (3) Diagram: a hierarchical network; queries go to rendezvous points (Stargates/log motes) rather than individual sensors, with results returned to the base station.
Applications • Medical • Environmental • …
Architecture (4) Diagram: a disconnected system; data is stored decentralized at each node and collected by data mules, which deliver it to a distant workstation.
Applications • Environmental Studies • Bridge Analysis • Structural Assessments • Difficult-to-access areas – use helicopters • …
Example of SQL Query • Retrieve, every 45 seconds, the rainfall level if it is greater than 50 mm

```sql
SELECT R.Sensor.getRainfallLevel()
FROM   RFSensors R
WHERE  R.Sensor.getRainfallLevel() > 50
AND    $every(45);
```
Queries - Extensions • Choose area • Choose lifetime • Aggregate data over a group of sensors • Set conditions restricting which sensors can contribute data • Correlate data from different sensors • Sound alarm whenever two sensors within 10 m of each other detect abnormality • Specify probabilities for equality tests • Ask for range data • Ask for confidence level on answer
Examples of Queries • Military Surveillance • ??? • Medical Domain • Assisted Living Spaces • ??? • Nursing Homes • ??? • Environmental • ???
Disseminating the Query • Flooding • Selective flooding (to an area) • Spanning tree • Multiple trees needed if there are multiple base stations • Multiple trees needed for different queries at the same base station • Store data by name: hash the name to a location, and hash the same name again to retrieve the data
Geographic Hash Table (GHT) • Translate from an attribute to a storage location • Distribute data evenly over the network • Example: GHT system (A Geographic Hash Table for Data-Centric Storage; see Ch. 6.6 in the text)
GHT • Events are named with keys • Storage and retrieval of an event are performed with its key • The key is hashed to a geographic position • Locate the node closest to this position Diagram: the key hashes to position x; the node closest to x holds the data.
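The hash-then-route idea above can be sketched in a few lines of Python (an illustrative model only; the field dimensions, node positions, and key name are assumptions, and real GHT routes via GPSR rather than computing the closest node centrally):

```python
# Sketch of GHT-style data-centric storage: hash an event name to a
# geographic position, then store at the node closest to that position.
import hashlib
import math

WIDTH, HEIGHT = 100.0, 100.0          # assumed deployment area

def hash_key(key):
    """Hash an event name to a geographic position in the field."""
    h = hashlib.sha1(key.encode()).digest()
    x = int.from_bytes(h[:4], "big") / 2**32 * WIDTH
    y = int.from_bytes(h[4:8], "big") / 2**32 * HEIGHT
    return (x, y)

def closest_node(nodes, pos):
    """The node geographically closest to the hashed position is the home node."""
    return min(nodes, key=lambda n: math.dist(n, pos))

nodes = [(10, 10), (50, 50), (90, 20), (30, 80)]
home = closest_node(nodes, hash_key("tank-detection"))
# A put() stores the event at `home`; a get() hashes the same key,
# routes toward the same position, and so reaches the same home node.
```

Because both storage and retrieval hash the same key, they rendezvous at the same node without any directory lookup.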
GHT Diagram: the base station issues a query; the query is routed to the node whose position the key hashes to ("store tank info here"), which holds the matching events.
Disseminating the Query • Given a cost model for using the WSN • Given the request with a confidence level • Create a plan to disseminate the query at minimum cost to obtain the answer AND meet confidence in the answer
TinyDB • For Periodic (Environmental) Applications • Integrates query and query response with power management by scheduling sleep/wake-up times depending on the depth of the tree • Coordinate sleep/wake-up with sensing • Note the need for clock sync
TAG of TinyDB • 2 phases (sleep when possible) • disseminate the periodic query • collect data (scheduled) Diagram: epochs are pipelined up the aggregation tree toward the base station.
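The collection phase can be sketched as in-network aggregation up the tree (an illustrative model; the tree shape, readings, and the choice of AVG are assumptions, and a real TAG node merges its children's partial states before one scheduled transmission to its parent):

```python
# Sketch of TAG-style in-network aggregation for AVG: each node combines its
# children's partial states with its own reading, so only one small
# (sum, count) record flows up each link per epoch instead of raw readings.

tree = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}   # parent -> children
reading = {0: 20.0, 1: 22.0, 2: 24.0, 3: 21.0, 4: 23.0}

def collect(node):
    """Return the partial state (sum, count) for the subtree rooted at node."""
    s, c = reading[node], 1
    for child in tree[node]:
        cs, cc = collect(child)
        s += cs
        c += cc
    return s, c

total, count = collect(0)          # base station is node 0
print(total / count)               # network-wide average: 22.0
```

Note that AVG needs the (sum, count) pair as its partial state; sending per-subtree averages alone would weight subtrees incorrectly.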
Another Issue - Indexing • The cost of building and maintaining an index may be too high for WSN • More likely worthwhile when nodes begin to have more storage/memory • Example system: DIFS (A Distributed Index for Features in Sensor Networks) • Low average search costs • The hash chooses a location within a region, not over the whole network as in GHT
Underlying Support • ELF: An Efficient Log-Structured Flash File System • Persistent storage • Appends data to a file • Delivery to the base station later • Supports garbage collection • Accounts for the limited number of writes flash supports (e.g., 10,000 writes) • Wear leveling • API: open(); read(); write(); delete()
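The append-only discipline behind a log-structured flash file system can be sketched as below (a toy in-memory model with invented names; the real ELF works on mote flash pages and handles wear leveling and garbage collection, none of which this sketch implements):

```python
# Toy sketch of an append-only log abstraction in the spirit of ELF.
# Flash forbids cheap in-place updates, so writes only ever append,
# and deletion just marks data for later garbage collection.

class LogFile:
    def __init__(self, name):
        self.name = name
        self.records = []          # stands in for flash pages

    def write(self, data):
        """Append-only write: never overwrite an existing record."""
        self.records.append(data)

    def read(self):
        """Return all records in append order."""
        return list(self.records)

    def delete(self):
        """Mark everything reclaimable; GC would erase pages later."""
        self.records = []

log = LogFile("samples")
log.write(b"t=21.7")
log.write(b"t=22.4")
# read() now returns both records in append order, ready for later
# delivery to the base station.
```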
Content Distribution Diagram: mobile monitoring agents receive information over time from an ad hoc wireless sensor network taking environmental measurements; the information has quality dimensions (refresh rate, accuracy, …), and energy (computation, communication) must be minimized.
Content Dissemination Goal: find the optimal communication path to send sensory data from a monitored source to multiple mobile sinks such that energy usage is minimized and requirements are met. Diagram: an information source (aperiodic or periodic updates) feeds receivers with individual refresh-rate and accuracy requirements; where should data replicas be placed?
Applications • Soldiers with PDAs monitoring for chemical contamination • Note: • Current focus: 1 source to n sinks • Multiple sinks and multiple sources are possible
Issues • Building the dissemination tree (self-organizing) • Maintaining it as nodes enter/leave • Disseminating the data • Maintaining linkage to mobile sinks • Save energy!!! • Meet end-to-end delay • Meet refresh rate
Dissemination Trees Diagram: a dense sensor network with unicast delivery (geographic forwarding) from the source to each sink.
Dissemination Trees Diagram: the same dense network with a minimum spanning tree replacing per-sink unicast (geographic forwarding).
Dissemination Trees Diagram: a Steiner tree further reduces cost over the minimum spanning tree; Steiner points (where branches meet) hold replicas.
Sending the Data Diagram: regular multicast over the Steiner tree; every link, including those through Steiner points, carries the full update rate R.
Sending the Data Diagram: asynchronous multicast over a weighted Steiner tree; receivers have individual refresh rates r1, r2, r3, r4, data is cached at Steiner points, and each branch carries only the rate its downstream receivers require instead of the full update rate R.
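The energy saving of the weighted tree comes from a simple rule: a link only needs to carry updates at the fastest refresh rate required anywhere downstream of it. A minimal sketch (the tree shape, node names, and rates are invented for illustration):

```python
# Sketch of asynchronous multicast rate assignment: each tree link carries
# updates only at the maximum refresh rate required downstream, and Steiner
# points cache data to serve slower branches, instead of pushing the full
# source update rate R down every link.

tree = {"src": ["s1"], "s1": ["r1", "s2"], "s2": ["r2", "r3"]}
refresh = {"r1": 1.0, "r2": 0.5, "r3": 0.25}   # receiver refresh rates (Hz)

def link_rate(node):
    """Rate a node needs from its parent = max rate needed in its subtree."""
    if node in refresh:                        # leaf receiver
        return refresh[node]
    return max(link_rate(child) for child in tree[node])

print(link_rate("s2"))   # 0.5 -- this branch never sees the full source rate
print(link_rate("s1"))   # 1.0 -- driven by the fastest receiver, r1
```

Slower receivers are then served from the cached copy at the nearest Steiner point rather than by extra transmissions from the source.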
SEAD: Scalable Energy Efficient Asynchronous Dissemination Protocol • An asynchronous content-distribution multicast tree is maintained • The tree is modified when • a sink joins • a sink leaves • a sink moves beyond some threshold • The cost of building the tree is minimized Diagram: dissemination to mobile sinks; each mobile node attaches through an access node and a forwarding chain, with data cached at Steiner points.
Assumptions • WSN is static and then mobile nodes (e.g., PDAs) enter the network • Dissemination trees are among the static nodes • Important – mobile nodes are NOT part of the tree • SEAD works with an overlay network • Source, sink representatives (access points) and Steiner points (see previous slide)
4 Phases • Phase 1 - Subscription query • The mobile node attaches to the nearest node as its access node • The access node sends a join query to the source
Subscription Query (1) Diagram: Sink 1 and Sink 2 each attach through an access node; join queries travel from the access nodes to the source.
4 Phases • Phase 1 - Subscription query • The mobile node attaches to the nearest node as its access node • The access node sends a join query to the source • Phase 2 - Gate replica search • Attach the new node to the current tree at the best gate replica