Data-centric view of sensornets: An Overview


Puru Kulkarni

Vijay Sundaram

Bhuvan Urgaonkar

Motivation
  • Ubiquitous presence of sensor networks
    • Communication, computation, limited storage, sensing capabilities
    • Used to sense, actuate, control
    • Sensors everywhere = Data everywhere!
  • Require an infrastructure for data access and storage
Overview
  • Sensors sense/generate data
  • Users/Applications interested in data or some measure of data
  • Common user operations are:
    • Queries and Monitoring
    • Actuate and Control
Typical Queries
  • Historical
    • What is the average rainfall over past 2 days?
  • Current
    • What is the current temperature in Rm# 226?
  • Long Running
    • Temperature in Rm# 226 over the next 4 hours every 30 seconds
Issues
  • How to identify relevant sensors?
  • Computation vs. Communication tradeoff
    • Where to process query?
      • inside the sensor network (route query)
        • Need new techniques
      • at a centralized location (route data)
        • Large amounts of data transfer (not efficient)
        • Data gathering may not reflect query rate
    • How to process query?
      • queries on streaming data
DataSpace: Querying and Monitoring Deeply Networked Collections in Physical Space
T. Imielinski and S. Goel, Rutgers University
  • Billions of objects populate space
  • Each produces and locally stores data
  • Location aware
  • Can be selectively monitored, queried and controlled
  • Physical world enhanced with data
Characteristics
  • Dataspace
    • Data lives on the object
    • Users access not only “local” information but can navigate entire dataspace
    • Spatial world divided in 3-D datacubes
      • CS Bldg., street, block, etc.
    • Communication, messaging and computation techniques for querying and monitoring required
Querying and Monitoring
  • Queries are spatially driven
  • Steps:
    • Identify relevant datacubes
    • Identify relevant nodes (dataflocks)
      • Datacube directory service
    • Aggregation for queries on several datacubes
      • e.g.: Information about Manhattan taxi cabs
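The two identification steps above can be sketched as a lookup in a datacube directory. This is only an illustration, not the DataSpace implementation: the `Datacube` class, the axis-aligned bounding boxes and the contents of `directory` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Datacube:
    """Hypothetical datacube: an axis-aligned 3-D box plus the nodes inside it."""
    name: str
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner
    nodes: dict = field(default_factory=dict)  # node id -> set of sensed attributes

    def intersects(self, lo, hi):
        # Two boxes overlap if their intervals overlap on every axis.
        return all(a_lo <= b_hi and b_lo <= a_hi
                   for a_lo, a_hi, b_lo, b_hi in zip(self.lo, self.hi, lo, hi))


def relevant_nodes(directory, lo, hi, attribute):
    """Step 1: find datacubes overlapping the query region.
    Step 2: within each, keep the nodes that sense the requested attribute."""
    result = {}
    for cube in directory:
        if cube.intersects(lo, hi):
            flock = sorted(n for n, attrs in cube.nodes.items() if attribute in attrs)
            if flock:
                result[cube.name] = flock
    return result


directory = [
    Datacube("cs-bldg", (0, 0, 0), (10, 10, 10),
             {"n1": {"temperature"}, "n2": {"light"}}),
    Datacube("street", (20, 0, 0), (30, 10, 10),
             {"n3": {"temperature", "vehicles"}}),
]
flocks = relevant_nodes(directory, (5, 5, 0), (8, 8, 5), "temperature")
print(flocks)
```

A query over several datacubes would then aggregate the answers returned by each flock.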
Architecting DataSpace
  • Network as DataSpace engine
    • multicast mechanisms (each node has an IP address!)
    • group membership based on
      • physical location
      • attribute (temperature, #vehicles etc)
    • multicast fits selective node addressing criteria to access relevant data
      • e.g.: what is average temperature in CS Bldg?
      • Query reaches only the sensors that are in the CS Bldg datacube and have the corresponding group address
Network as DataSpace engine
  • Space handle encodes datacube information (based on the location of the datacube)
  • Subject handle encodes the attributes that are part of a multicast group (based on the attribute of interest)
  • DataSpace address = <space-handle> + <subject-handle>
  • The DataSpace address is an IPv6 multicast address
  • E.g.:
    • Space handle: 224.4.5
    • Subject handle: 8
    • DataSpace address: 224.4.5.8
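The address composition in the example can be sketched as follows. The function names are hypothetical, and the dotted notation simply mirrors the slide's example; how the handles are actually packed into an IPv6 multicast address is not specified here.

```python
def dataspace_address(space_handle: str, subject_handle: int) -> str:
    """Append the subject handle to the space handle, as in the slide's example."""
    return f"{space_handle}.{subject_handle}"


def split_address(address: str):
    """Inverse operation: recover (space handle, subject handle) from an address."""
    space, _, subject = address.rpartition(".")
    return space, int(subject)


address = dataspace_address("224.4.5", 8)
print(address)  # 224.4.5.8
```

A node would join the group for every (datacube, attribute) pair it belongs to, so a query sent to one address reaches only the matching nodes.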

Geographic Routing Infrastructure
  • Route message based on physical location rather than IP address
    • Use GPS coordinates for locations
  • Avoids use of multicast for routing queries to datacubes
  • Once the query reaches a region, use multicast
Geographic Routing Infrastructure
  • Geo-router (routes based on datacube location)
  • Geo-node (issues query to nodes in datacube)
  • Geo-host (processes geographic messages)
  • Approach
    • Route query to datacube
    • Geo-nodes route query within datacube
      • multicast with a TTL of 1
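The two-phase approach can be sketched as below. The greedy forwarding rule, the router coordinates and the function names are assumptions for illustration; the slides do not say how geo-routers choose the next hop.

```python
import math


def route_to_datacube(routers, links, start, target):
    """Greedy forwarding (an assumption): each geo-router hands the query to the
    neighbour closest to the target location until no neighbour is closer."""
    path = [start]
    current = start
    while True:
        best = min(links[current],
                   key=lambda r: math.dist(routers[r], target),
                   default=current)
        if math.dist(routers[best], target) >= math.dist(routers[current], target):
            return path  # no neighbour is closer, so stop here
        current = best
        path.append(current)


def deliver_in_datacube(geo_node, one_hop_members):
    """The geo-node multicasts with TTL 1, so only its direct neighbours receive it."""
    return list(one_hop_members[geo_node])


routers = {"A": (0, 0), "B": (5, 0), "C": (10, 0)}   # router id -> (x, y)
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}    # router adjacency
path = route_to_datacube(routers, links, "A", target=(10, 1))
receivers = deliver_in_datacube(path[-1], {"C": ["s1", "s2"]})
print(path, receivers)
```

Greedy forwarding can get stuck at a router with no closer neighbour; a real geographic routing scheme needs a recovery rule for that case, which this sketch omits.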
The Sensor Network as a Database
  • Govindan, Hellerstein, Hong, Madden, Franklin, Shenker
Querying the Physical World
  • Bonnet, Gehrke, Seshadri
Sensornet Database Architecture
  • Given a routing and access mechanism, how to process queries?
  • Provide a DB-view to users/apps
    • well understood programming interface
    • common data operations use computation in network
      • help energy-efficiency
    • allow users to be unaware of actual network, but treat it as a database
    • Sensor Network + Data => Sensor Network Database
What is required?
  • Core DB operations tailored for sensor networks
  • Design appropriate building blocks for DB operations
    • Join, aggregation, grouping, selection etc
Sensornet Database Architecture
  • Two important ideas:
  • in-network implementations of primitive database query operators such as grouping, aggregation, and joins
    • group communication and routing protocols, with possible processing at intermediate nodes, implement the operator in an application-independent way
Sensornet Database Architecture
  • Relax the semantics of database queries to allow approximate results
    • relaxation enables energy-efficient implementations even given the expected high level of network dynamics
  • A sensor network is a proxy for a continuous real-world phenomenon, and by nature samples that phenomenon discretely at some rate, with some degree of error.
In-network Implementation
  • JOIN operator
    • selection over cross-product of a pair of tables
    • Tuples generated at different nodes might be joined at a single node
    • Some JOIN implementations are blocking
  • Blocking is infeasible in sensor networks
    • tables can contain unbounded streams of data
    • amount of memory available is limited
  • Need to retool these operations
    • Pipelining
    • Partitioning
Non-Blocking Pipelined Joins
  • Symmetric hash-join:
    • Maintains two hash tables (keyed by the column(s) used for the join)
    • On an input tuple, looks up matching tuples from other input’s hash table
    • Outputs any matching results
  • Ripple joins:
    • Statistically sample the two tables to be joined, in order to produce a stream of joined tuples
    • Relative rates at which the two tables are sampled adapt to match the variance produced by the data in each
    • Low-energy approach to obtaining approximate answers
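A minimal sketch of the symmetric hash-join described above, assuming tuples are Python dicts and that each input joins on one named column. The class and method names are illustrative only.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Non-blocking join: every arriving tuple is stored in its own side's hash
    table and immediately probed against the other side's table."""

    def __init__(self, left_key, right_key):
        self.keys = {"left": left_key, "right": right_key}
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, tup):
        other = "right" if side == "left" else "left"
        k = tup[self.keys[side]]
        self.tables[side][k].append(tup)          # remember for future probes
        matches = self.tables[other].get(k, [])   # probe the other input
        if side == "left":
            return [(tup, m) for m in matches]
        return [(m, tup) for m in matches]


join = SymmetricHashJoin("room", "room")
out = []
out += join.insert("left", {"room": 226, "temp": 21.5})
out += join.insert("right", {"room": 226, "light": 300})
out += join.insert("right", {"room": 101, "light": 120})
out += join.insert("left", {"room": 226, "temp": 22.0})
print(len(out))  # 2
```

Results are emitted as soon as a match exists, so neither input has to be read completely first. Both hash tables grow with the stream, so on a memory-limited node they would have to be bounded, for example by a time window; that part is not shown.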
Partitioning
  • Partitioning:
    • tuples are partitioned based on their join-column values and redistributed on the fly across multiple nodes;
    • the work of joining the individual partitions is done in parallel by each of the nodes
  • Partitions can be defined by value, geographically, or by sensor type, and a node (or nodes) can be designated to perform the join for the partition
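Value-based partitioning can be sketched as follows. The modulo rule for assigning tuples to nodes and the integer join values are assumptions; the lists stand in for nodes that would each join their partition in parallel.

```python
def partition(tuples, key, num_nodes):
    """Assign each tuple to a node by its join-column value."""
    parts = [[] for _ in range(num_nodes)]
    for t in tuples:
        parts[t[key] % num_nodes].append(t)  # assumes integer join values
    return parts


def local_join(r_part, s_part, key):
    """Join performed by one node on its own partition only."""
    return [(r, s) for r in r_part for s in s_part if r[key] == s[key]]


def partitioned_join(r, s, key, num_nodes):
    r_parts = partition(r, key, num_nodes)
    s_parts = partition(s, key, num_nodes)
    out = []
    # Matching tuples always land in the same partition, so the partitions
    # can be joined independently of one another.
    for r_part, s_part in zip(r_parts, s_parts):
        out.extend(local_join(r_part, s_part, key))
    return out


R = [{"room": i, "temp": 20 + i} for i in range(6)]
S = [{"room": i, "light": 100 * i} for i in (1, 3, 5, 7)]
joined = partitioned_join(R, S, "room", num_nodes=3)
print(sorted(r["room"] for r, _ in joined))  # [1, 3, 5]
```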
In-network Implementation
  • Aggregation operators
      • summarization of one or more columns into a single numerical value, e.g. SUM, COUNT, AVERAGE, MIN, MAX
      • query is flooded through the network and the responses are routed back along the reverse-path tree
      • results are aggregated across several nodes
      • E.g.: to calculate AVERAGE, each node returns (SUM, COUNT) values to its parent
      • Can be a very common operator
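The AVERAGE example can be sketched as a recursive walk over the reverse-path tree. The tree shape and readings are made up for illustration, and the recursion stands in for messages sent from child to parent.

```python
def partial_average(node, children, readings):
    """Return the (SUM, COUNT) pair for the subtree rooted at `node`."""
    total, count = readings[node], 1
    for child in children.get(node, []):
        child_sum, child_count = partial_average(child, children, readings)
        total += child_sum      # merge the child's partial state
        count += child_count
    return total, count


children = {"root": ["a", "b"], "a": ["c", "d"]}   # parent -> children
readings = {"root": 20.0, "a": 22.0, "b": 24.0, "c": 26.0, "d": 28.0}
total, count = partial_average("root", children, readings)
average = total / count
print(average)  # 24.0
```

Nodes forward (SUM, COUNT) rather than a local average because averages of subtrees of different sizes cannot be combined correctly, while sums and counts can simply be added.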
Distributed Sensnet DBs
  • How to represent devices in DBs on sensornets?
    • ADTs (Abstract Data Types)
    • Methods correspond to sensing functionality
    • Virtual Relations (VRs) store local data
    • Network used for query operations
Virtual Relation
  • A VR has the following attributes:
    • Inputs to an ADT (device) function
    • Arguments to an ADT function
    • Output of the function
    • Timestamp of the function
Virtual Relation
  • Some VR properties
    • records are never updated or deleted
    • a VR is naturally partitioned over the sensnet (each device takes care of its own set of VR records)
  • What does this mean? – a distributed DB
  • Records from the VRs (distributed over the devices) are processed using distributed query execution plans
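One way to picture the fragment of a virtual relation held by a single device is an append-only log of function invocations. The class names, the `device_id` field and the explicit `timestamp` argument are assumptions for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a record cannot be modified once created
class VRRecord:
    device_id: str    # hypothetical identifier of the device
    args: tuple       # arguments passed to the ADT function
    output: float     # value returned by the function
    timestamp: float  # when the function was invoked


class VirtualRelationFragment:
    """The part of a virtual relation stored on one device (append-only)."""

    def __init__(self, device_id, function):
        self.device_id = device_id
        self._function = function
        self._records = []

    def invoke(self, *args, timestamp):
        record = VRRecord(self.device_id, args, self._function(*args), timestamp)
        self._records.append(record)  # records are only ever appended
        return record

    def scan(self, since=float("-inf")):
        """Local part of a distributed query plan: return records from `since` on."""
        return [r for r in self._records if r.timestamp >= since]


fragment = VirtualRelationFragment("s1", lambda unit: 21.5)
fragment.invoke("celsius", timestamp=0.0)
fragment.invoke("celsius", timestamp=30.0)
recent = fragment.scan(since=10.0)
print(len(recent))  # 1
```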
Approximate Results
  • Energy-efficiency can be achieved using approximate aggregates
  • Uniform sampling:
    • Tuples are uniformly sampled and the resulting average is assumed to represent the actual average
    • Packet loss might invalidate the statistical assumptions that the resulting confidence intervals depend on.
  • Logarithmic sampling
    • The number of respondents (or the size of memory needed for the count) scales logarithmically with the size of the network
    • Provides looser error bounds but uses significantly less memory or communication.
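Uniform sampling for an approximate AVERAGE can be sketched as below. The readings are synthetic and the fixed seed is only there to make the example repeatable.

```python
import random


def sampled_average(readings, sample_size, seed=0):
    """Estimate the average from a uniform random sample (without replacement)."""
    rng = random.Random(seed)
    sample = rng.sample(readings, sample_size)
    return sum(sample) / len(sample)


readings = [20.0 + (i % 10) for i in range(1000)]  # synthetic temperatures
exact = sum(readings) / len(readings)              # needs all 1000 readings
approx = sampled_average(readings, sample_size=50) # needs only 50 readings
print(exact, approx)
```

Only the sampled nodes have to transmit, which is where the energy saving comes from. The estimate is only as good as the assumption that the sample is uniform, which lost packets can break.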
Complex query evaluation
  • R x S x T
    • What order to follow?
      • (RxS)xT, Rx(SxT), or (RxT)xS
    • Decided by query optimizer
      • Usually depends on table size
  • With Sensornet DB
      • Need adaptive policy to route tuples based on
        • Energy consumption
        • Topology
        • Loss rates
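A size-based choice of join order can be sketched as follows. The cost model (product of the two table sizes times one uniform selectivity) is a deliberate simplification assumed for the example, and the function name is hypothetical.

```python
from itertools import combinations


def best_first_join(sizes, selectivity=1.0):
    """Pick the pair to join first by minimising the estimated intermediate size."""
    def cost(pair):
        a, b = pair
        return sizes[a] * sizes[b] * selectivity

    first = min(combinations(sorted(sizes), 2), key=cost)
    last = next(t for t in sizes if t not in first)
    return f"({first[0]}x{first[1]})x{last}"


plan = best_first_join({"R": 1000, "S": 10, "T": 50})
print(plan)  # (SxT)xR
```

In a sensornet the cost function would also have to reflect energy consumption, topology and loss rates, and would need to be re-evaluated as those change; none of that is modelled here.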
Conclusions
  • Explosion of data from sensor networks needs an infrastructure for access, storage etc
  • Organizing sensors
    • Datacubes
    • Other techniques ?
  • Identifying relevant sensors is a prerequisite for fetching data
    • DataSpace provides two solutions (multicast groups and geographic routing)
    • Other approaches ?
Conclusions
  • Sensornets as Distributed DB
    • Provide a database view to sensornet data
    • Pros
      • App development is easy
      • In-network processing helps resource usage
    • Cons
      • Distributed DB can be difficult
      • Requires retooling DB operations for sensornets
      • Other approaches?
Representations for Device Functions
  • Internal Representation
  • We can’t use traditional OO DB methods
    • they all demand immediate access
    • given the asynchronous nature of sensnets, this is unacceptable
Overview
  • Direction of sensor networks progress
    • Small form-factor devices
    • On-board computation
    • Wireless communication
    • Increased sensing capabilities
    • Improved OS and networking functionalities
  • Prediction:
    • Every device (> $1) will have some sensor
    • Ubiquitous presence of sensor networks
Overview
  • Typical sensor networks usage:
    • Sense, collect and convey data
    • Provides a ubiquitous computing platform
    • Applications query/monitor sensed data
      • Ecosystem dynamics
      • Temperature/weather sensing
      • Automobile traffic analysis
    • Data-centric network: generated data is more important than node identity
Requirements
  • Addressing
    • Identify relevant sensors
  • How to access/process data?
    • Communicate data and process centrally
    • Compute query at node and perform DB operations
  • Interface for querying/monitoring and control
What to do with data?
  • Answer queries/give useful info
  • How ??
    • Centralized approach
      • Communicate data
      • Store and process all data at central location (traditional DB approach)
      • Is all temporal data to be stored?
      • Communication overhead?
What to do with data?
  • De-centralized approach
    • Communicate query (query routing)
    • Required data is an attribute of the node
    • Node stores and communicates data to queries
    • Processing at node
    • Computation overhead
      • Computation overhead smaller than communication!
    • How to aggregate data?
    • How to route queries?
    • How to map nodes to addresses for communication purposes?
Need for Decentralization
  • Centralized (Traditional databases)
    • Inefficient use of resources
      • Large amounts of data communicated to central location
      • All sensors send data all the time
    • Dissociates access to device from query load
    • Communication more expensive than computation
  • Decentralized (Distributed DBs)
    • Data on devices
    • In-network query processing
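The communication argument can be made concrete by counting messages for one round of an aggregate query over a routing tree. The topology is invented for the example, and a "message" here means one transmission over one hop.

```python
def depth_of(node, parent):
    """Number of hops from `node` up to the root."""
    depth = 0
    while parent[node] is not None:
        node = parent[node]
        depth += 1
    return depth


def centralized_messages(parent):
    """Every raw reading is forwarded hop by hop to the root."""
    return sum(depth_of(node, parent) for node in parent)


def in_network_messages(parent):
    """Each non-root node sends one partial aggregate to its parent."""
    return sum(1 for node in parent if parent[node] is not None)


# A chain of five nodes: root <- n1 <- n2 <- n3 <- n4
parent = {"root": None, "n1": "root", "n2": "n1", "n3": "n2", "n4": "n3"}
central = centralized_messages(parent)
in_network = in_network_messages(parent)
print(central, in_network)  # 10 4
```

The gap widens as the tree gets deeper, since the centralized count grows with the sum of node depths while the in-network count grows only with the number of nodes.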
Pipelining Benefits
  • Provide streamed partial answers, hence, can enable query refinement
  • Schemes like ripple joins form a low energy approach to obtain approximate answers and can be used together with sampling