Data dissemination
Peyman Teymoori


Introduction

  • Data dissemination: the process by which queries or data are routed in the network

  • Source: the node generating data

  • Event: the information to be reported

  • Sink: the node interested in an event

  • Two general steps:

    • Interest propagation: temperature, intrusion

    • Data propagation: routing, aggregation


Routing Models

  • Address Centric

    • Each source independently sends its data to the sink

  • Data Centric

    • Nodes en route inspect the content of the data they forward

[Figure: AC routing versus DC routing. Source 1 and Source 2 send data to the sink through nodes A and B.]


Differences with Current Networks

  • Difficult to pay special attention to any individual node:

    • Collecting information within the specified region

    • Collaboration between neighbors

  • Sensors may be inaccessible:

    • embedded in physical structures.

    • thrown into inhospitable terrain.

  • Sensor networks are deployed at very large scale in an ad hoc manner

    • No static infrastructure


Differences with Current Networks

  • Networks change substantially as nodes fail or join:

    • battery exhaustion

    • accidents

    • addition of new nodes

  • User and environmental demands also contribute to dynamics:

    • Nodes move

    • Objects move

  • Data-centric and application-centric

    • Location aware

    • Time aware


Overall Design of Sensor Networks

  • One possible solution?

    • Internet technology coupled with ad-hoc routing mechanism

      • Each node has one IP address

      • Each node can run applications and services

      • Nodes establish an ad-hoc network amongst themselves when deployed

      • Application instances running on each node can communicate with each other


Why Different and Difficult?

  • A sensor node is not an identity (address)

    • Content based and data centric

      • Where are the nodes whose temperature will exceed 10 degrees for the next 10 minutes?

      • Tell me the location of the object (with interest specification) every 100 ms for 2 minutes.

  • Multiple sensors collaborate to achieve one goal.

  • Intermediate nodes can perform data aggregation and caching in addition to routing.

    • where, when, how?


Challenges

  • Energy-limited nodes

  • Computation

    • Aggregate data

    • Suppress redundant routing information

  • Communication

    • Bandwidth-limited

    • Energy-intensive

  • Scalability: ad-hoc deployment in large scale

  • Robustness: unexpected sensor node failures

  • Dynamic changes: no a-priori knowledge, mobility

Goal: Minimize energy dissipation


Aggregation

  • Many studies address not only the routing problem but also how to represent and combine data more efficiently

  • Data is processed while it is being forwarded toward the sink

  • This reduces the number of transmissions

  • Definition:

    • Gathering & routing information in a multihop network

    • Processing data at intermediate nodes to

      • Reduce resource consumption

      • Increase network lifetime


Aggregation

  • Two approaches:

    • In-network with size reduction

      • More ability to reduce traffic

      • Less accuracy

      • Difficulty in reconstructing the original data

    • In-network without size reduction

      • Merging some smaller packets into one

  • Requires:

    • Networking protocol: routing

    • Effective aggregation functions

    • Data representation


Aggregation

  • A different routing paradigm is required

  • Data-centric routing: nodes route packets based on packet content

  • Taking into account:

    • The most suitable aggregation points

    • Data type

    • Priority of information

  • Timing strategies:

    • Periodic simple aggregation

    • Periodic per-hop aggregation

    • Periodic per-hop adjusted aggregation

      • depends on the node’s position in the gathering tree


Aggregation

  • Aggregation functions:

    • Lossy & lossless

    • Duplicate sensitive vs. duplicate insensitive

      • e.g., AVG (duplicate sensitive) vs. MAX (duplicate insensitive)

  • Data representation:

    • limited storage capabilities

    • vary according to the application requirements

    • distributed source coding: A method to deal with data representation and compression
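
The duplicate-sensitive versus duplicate-insensitive distinction can be illustrated with a small sketch. The readings and the duplicated delivery below are made-up values chosen for illustration, not data from the slides.

```python
# Hypothetical sensor readings collected at the sink.
readings = [20.0, 22.0, 30.0]

# Assumption: multi-path routing delivers the last reading twice.
with_duplicate = readings + [30.0]


def avg(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


max_clean = max(readings)
max_dup = max(with_duplicate)      # unchanged by the duplicate
avg_clean = avg(readings)
avg_dup = avg(with_duplicate)      # shifted by the duplicate
```

MAX gives the same answer with or without the duplicate, while AVG changes, which is why duplicate-sensitive aggregates need extra care when packets can travel over more than one path.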


Network Protocols for In-Network Aggregation

  • How to forward packets in order to facilitate in-network aggregation

  • Several approaches:

    • Tree-based (Hierarchical): SPTs rooted at sink, or nodes grouped into clusters

    • Multi-path routing: DAG, more robust

    • Hybrid approaches


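Flooding

  • Each node rebroadcasts every packet it receives that is not destined for itself, up to a maximum hop count

  • Disadvantages:

    • Implosion

    • Data overlap

    • Resource blindness

[Figure: implosion example with nodes A, B, C, D and data (a); data overlap example with regions q, r, s, where A and B send (q,r) and (r,s) to C]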
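
A minimal sketch of flooding and the implosion problem, under stated assumptions: the four-node topology is hypothetical, links are modeled as per-neighbor deliveries, and a node does not send a copy straight back to the neighbor it came from.

```python
from collections import deque


def flood(graph, source, max_hops):
    """Flood a packet from `source`; return how many copies each node receives."""
    received = {node: 0 for node in graph}
    queue = deque([(source, None, max_hops)])   # (node, previous sender, hops left)
    while queue:
        node, sender, hops = queue.popleft()
        if hops == 0:
            continue                            # hop limit reached: stop forwarding
        for neighbor in graph[node]:
            if neighbor == sender:
                continue                        # simplification: never echo to the sender
            received[neighbor] += 1
            queue.append((neighbor, node, hops - 1))
    return received


# Hypothetical topology: A reaches D through both B and C.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
copies = flood(graph, "A", max_hops=2)
```

Node D ends up with two copies of the same packet, one via B and one via C, which is the implosion effect listed above.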


Gossiping

  • A modified version of flooding

  • Each node forwards data to randomly selected neighbors instead of broadcasting to all of them

  • Avoids implosion

  • Disadvantages:

    • Takes a long time to propagate

    • No delivery guarantee
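
As a rough sketch of gossiping, the packet can be modeled as a random walk in which each holder passes a single copy to one randomly chosen neighbor. The topology, the hop limit, and the seeded generator are assumptions made for this example.

```python
import random


def gossip(graph, source, max_hops, rng):
    """Forward one copy along a random walk; return the sequence of holders."""
    walk = [source]
    node = source
    for _ in range(max_hops):
        node = rng.choice(graph[node])   # pick one neighbor at random
        walk.append(node)
    return walk


# Hypothetical topology, same shape as a four-node ring.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
walk = gossip(graph, "A", max_hops=5, rng=random.Random(0))
```

Only one copy exists at any time, so implosion cannot occur, but nothing guarantees that the walk visits a particular node within the hop limit.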


Rumor Routing

  • An agent-based path creation algorithm

  • Agents or “ants”:

    • long-lived entities created at random by nodes

    • packets circulated to establish shortest paths to events they encounter

    • perform path optimization

  • Motivation

    • Sometimes a non-optimal route is satisfactory


Rumor Routing

  • Creating Paths:

    • Nodes having observed an event send out agents which leave routing info to the event as state in nodes

    • Agents attempt to travel in a straight line

    • If an agent crosses a path to another event, it begins to build the path to both

    • Agents also optimize paths if they find shorter ones.


Rumor Routing

  • Basis for algorithm:

    • Observation: Two lines in a bounded rectangle have a 69% chance of intersecting

    • Create a set of straight-line gradients from the event, then send the query along a random straight line from the source.

[Figure: straight-line paths from an event and a query line from a source]


Rumor Routing

[Figure: panels (a) and (b)]


Rumor Routing

  • Advantages:

    • Tunable best effort delivery

    • Tunable for a range of query/event ratios

  • Disadvantages:

    • Optimal parameters depend heavily on topology (but can be adaptively tuned)

    • Does not guarantee delivery


Sensor Protocols for Information via Negotiation (SPIN)

  • A data-centric routing approach

  • Uses negotiation & resource adaptation to address deficiencies of flooding

  • Two basic ideas:

    • Exchanging sensor data may be expensive, but exchanging data about sensor data may not be.

    • Nodes need to monitor and adapt to changes in their own energy resources


Sensor Protocols for Information via Negotiation (SPIN)

  • Data negotiation

    • Meta-data (data naming)

    • Application-level control

    • Model “ideal” data paths

  • SPIN messages

    • ADV- advertise data

    • REQ- request specific data

    • DATA- requested data

  • Resource management

[Figure: three-step exchange between nodes A and B: ADV, then REQ, then DATA]
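
The ADV / REQ / DATA exchange can be sketched as follows. The `SpinNode` class, its method names, and the meta-data name `temp@7` are hypothetical and exist only for this illustration; resource adaptation is left out.

```python
class SpinNode:
    """Toy node that negotiates with meta-data before sending data."""

    def __init__(self, name):
        self.name = name
        self.store = {}   # meta-data name -> data held by this node
        self.log = []     # messages sent by this node, in order

    def advertise(self, meta, neighbor):
        self.log.append(("ADV", meta))
        neighbor.on_adv(meta, self)

    def on_adv(self, meta, sender):
        if meta in self.store:
            return                      # already have the data: no REQ is sent
        self.log.append(("REQ", meta))
        sender.on_req(meta, self)

    def on_req(self, meta, requester):
        self.log.append(("DATA", meta))
        requester.on_data(meta, self.store[meta])

    def on_data(self, meta, data):
        self.store[meta] = data


a, b = SpinNode("A"), SpinNode("B")
a.store["temp@7"] = 21.5
a.advertise("temp@7", b)   # full ADV -> REQ -> DATA exchange
a.advertise("temp@7", b)   # B already holds the data, so only the ADV is sent
```

The second advertisement costs one small ADV message and no data transfer, which reflects the idea that exchanging data about sensor data is cheaper than exchanging the data itself.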


Sensor Protocols for Information via Negotiation (SPIN)

[Figure: sequence of ADV, REQ, and DATA exchanges involving nodes A and B and their neighbors]


Cost-Field Approach

  • Sets up minimum cost paths to a sink

  • A two-phase process:

    • Set up the cost field (metrics such as delay)

    • Data dissemination using the cost

  • At each node, cost = min cost to the sink

    • No explicit path information


Cost-Field Approach

  • Setting up the cost field:

    • Sink broadcasts: ADV + its cost as 0

    • A node N hears an ADV from M:

      • It updates its path cost: Ln = min(Ln, Lm + Cnm)

        • Ln : total cost from node N to sink

        • Lm : the cost of node M to sink

        • Cnm : the cost from node N to M

      • If Ln was updated, the new cost is broadcast (new ADV)

  • Flooding-based implementation of Dijkstra’s algorithm

  • A back-off-based approach: time to defer = γ * Cnm

    • γ is a parameter of the algorithm
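
A minimal sketch of the cost-field setup with back-off, assuming symmetric link costs and an idealized event queue in place of real timers. The function name `setup_cost_field` and the example topology are hypothetical.

```python
import heapq


def setup_cost_field(links, sink, gamma=10):
    """links: {node: {neighbor: link_cost}}. Returns (cost per node, ADVs sent)."""
    cost = {node: float("inf") for node in links}
    cost[sink] = 0
    # Each event is (time the ADV is broadcast, node, cost advertised by that node).
    events = [(0, sink, 0)]
    adv_count = 0
    while events:
        time, m, l_m = heapq.heappop(events)
        if l_m > cost[m]:
            continue   # a cheaper ADV arrived during the back-off: drop this timer
        adv_count += 1
        for n, c_nm in links[m].items():
            if l_m + c_nm < cost[n]:
                cost[n] = l_m + c_nm                    # Ln = min(Ln, Lm + Cnm)
                heapq.heappush(events, (time + gamma * c_nm, n, cost[n]))
    return cost, adv_count


# Hypothetical topology with sink S.
links = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 2, "C": 5},
    "B": {"S": 4, "A": 2, "C": 1},
    "C": {"A": 5, "B": 1},
}
cost, adv_count = setup_cost_field(links, "S")
```

In this idealized model the back-off lets each node hear its cheapest ADV before its own timer fires, so every node broadcasts only once, with its final cost.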


Cost-Field Approach

  • An example of setting up the cost field

    • γ = 10


Cost-Field Approach

  • Data dissemination:

    • A source sends a message

      • cost = Cs

      • cost-so-far = 0

    • In an intermediate node M (with cost = Cm):

      • If cost-so-far + Cm = Cs then

        • Forward the packet
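
The forwarding rule can be sketched as below. It assumes that cost-so-far grows by the link cost each time the packet crosses a link, that costs are integers so equality is exact, and that the cost field is already in place. The topology and the function name `disseminate` are hypothetical.

```python
# Hypothetical topology: symmetric link costs between neighbors.
links = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 2, "C": 5},
    "B": {"S": 4, "A": 2, "C": 1},
    "C": {"A": 5, "B": 1},
}
# Cost field assumed to be set up already: minimum cost from each node to sink S.
cost = {"S": 0, "A": 1, "B": 3, "C": 4}


def disseminate(links, cost, source):
    """Return the nodes that accept the packet, in order, ending at the sink."""
    c_s = cost[source]
    accepted = []
    pending = [(source, 0)]                    # (broadcasting node, cost-so-far)
    while pending:
        sender, so_far = pending.pop()
        for m, link_cost in links[sender].items():
            consumed = so_far + link_cost      # cost-so-far when the packet reaches m
            if m not in accepted and consumed + cost[m] == c_s:
                accepted.append(m)
                if cost[m] > 0:                # the sink does not forward further
                    pending.append((m, consumed))
    return accepted


path = disseminate(links, cost, "C")
```

Only nodes on a minimum-cost path satisfy the equality, so the packet follows C, B, A, S without any node storing explicit path information.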


Directed Diffusion

  • A reactive data-centric protocol

  • Suitable for monitoring

  • Expressed in terms of named data

  • Organized in three phases:

    • Interest dissemination

    • Gradient setup

    • Path reinforcement & forwarding


Directed Diffusion

  • Naming

    • A list of attribute-value pairs

    • Animal tracking:

      • Request: interest (task) description

        • Type = four-legged animal

        • Interval = 20 ms

        • Duration = 1 minute

        • Location = [-100, -100; 200, 400]

      • Reply: node data

        • Type = four-legged animal

        • Instance = elephant

        • Location = [125, 220]

        • Confidence = 0.85

        • Time = 02:10:35
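
A small sketch of matching a reply against an interest expressed as attribute-value pairs. The dictionary keys and the reading of the location rectangle as two corners (x_min, y_min; x_max, y_max) are assumptions made for this example.

```python
# Interest (request) from the animal-tracking example, as attribute-value pairs.
interest = {
    "type": "four-legged animal",
    "interval_ms": 20,
    "duration_s": 60,
    "rect": (-100, -100, 200, 400),   # assumed layout: x_min, y_min, x_max, y_max
}

# Node data (reply) from the same example.
data = {
    "type": "four-legged animal",
    "instance": "elephant",
    "location": (125, 220),
    "confidence": 0.85,
    "time": "02:10:35",
}


def matches(interest, data):
    """True if the data has the requested type and lies inside the rectangle."""
    x_min, y_min, x_max, y_max = interest["rect"]
    x, y = data["location"]
    return (
        data["type"] == interest["type"]
        and x_min <= x <= x_max
        and y_min <= y <= y_max
    )


inside = matches(interest, data)
outside = matches(interest, {**data, "location": (300, 0)})
```

Matching uses only the named attributes, never a node address, which is what makes the scheme data centric.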


Directed Diffusion

  • Interest

    • The sink periodically broadcasts interest messages

    • Every node maintains an interest cache

      • Each item corresponds to a distinct interest

      • No information about the sink

      • Interest aggregation: interests with identical type and completely overlapping rectangle attributes are merged into one entry

    • Each entry in the cache has several fields

      • Timestamp: last received matching interest

      • Several gradients: data rate, duration, direction
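
The interest cache can be sketched as a dictionary keyed by the interest itself, with one gradient per neighbor. The class name, field names, and millisecond units are hypothetical, and only identical rectangles are merged here (a simplification of the overlap rule).

```python
class InterestCache:
    """Toy interest cache: one entry per distinct interest, no sink identity stored."""

    def __init__(self):
        self.entries = {}   # (type, rect) -> {"timestamp": ..., "gradients": {...}}

    def on_interest(self, interest_type, rect, neighbor, interval_ms, duration_ms, now_ms):
        key = (interest_type, rect)
        entry = self.entries.setdefault(key, {"timestamp": now_ms, "gradients": {}})
        entry["timestamp"] = now_ms                     # last matching interest seen
        entry["gradients"][neighbor] = {                # gradient toward that neighbor
            "interval_ms": interval_ms,
            "expires_ms": now_ms + duration_ms,
        }
        return entry


cache = InterestCache()
rect = (-100, -100, 200, 400)
cache.on_interest("four-legged animal", rect, "n1", 20, 60_000, now_ms=0)
cache.on_interest("four-legged animal", rect, "n2", 20, 60_000, now_ms=5)
entry = cache.entries[("four-legged animal", rect)]
```

Two matching interests from different neighbors produce a single cache entry with two gradients, and the entry records only the neighbors, not the sink.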


Directed Diffusion

  • Setting Up Gradient

    • Interest = interrogation

    • Gradient = who is interested (data rate, duration, direction)

    • Neighbor’s choices:

      1. Flooding

      2. Geographic routing

      3. Cache data to direct interests

[Figure: source and sink, with interests and gradients between them]


Directed Diffusion

  • Data propagation

    • Sensor node computes the highest requested event rate among all its outgoing gradients

    • When a node receives a data message:

      • Find a matching interest entry in its cache

        • Examine the gradient list, send out data by rate

      • Cache keeps track of recently seen data items (loop prevention)

      • Data message is sent individually to the relevant neighbors (unicast)


Directed Diffusion

  • Reinforcing the best path

    • Reinforcement = increased interest

    • How a node chooses the neighbors it reinforces:

      1. At least one neighbor

      2. Choose the one from which it first received the latest event (low delay)

      3. Choose all neighbors from which new events were recently received

[Figure: source, sink, and low-rate events]


Directed Diffusion

  • The reinforced path must be periodically refreshed

  • A trade off based on network dynamics:

    • Frequency of gradient setup

    • Achieved performance

  • MAC layer issues:

    • Keeping local (control) traffic at a low level

      • Avoid collision, delay

    • Enhanced Directed Diffusion

      • Combines Directed Diffusion with a cluster-based architecture


Tiny AGgregation (TAG)

  • A tree-based data-centric approach

  • Timing: periodic per hop adjusted

  • Two main phases:

    • Distribution: disseminating queries

    • Collection: aggregating & routing readings

  • Declarative interface for data collection and aggregation – SQL style


Tiny AGgregation (TAG)

  • A sample query:

SELECT {agg(expr), attrs} FROM sensors
WHERE {selPreds}
GROUP BY {attrs}
HAVING {havingPreds}
EPOCH DURATION i

  • SELECT: an expression over one or more aggregation values

  • expr: the name of a single attribute

  • agg: aggregation function

  • attrs: the attributes by which the sensor readings are partitioned

  • WHERE, HAVING: filter out irrelevant readings

  • GROUP BY: specifies an attribute-based partitioning of readings

  • EPOCH DURATION: time interval over which each aggregate record is computed


Tiny AGgregation (TAG)

  • Distribution:

    • The sink broadcasts an organizing message

      • Message contains level & distance from the root

    • Node receiving the message:

      • If it does not yet belong to any level:

        • Its level = message.level + 1

        • Its parent = the sender node

        • Rebroadcasts the message adding its own ID & level

    • Broadcasting the query along the structure
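
The distribution phase can be sketched as a breadth-first traversal, which idealizes the order in which organizing messages are heard. The five-node neighbor map and the function name `build_tree` are hypothetical.

```python
from collections import deque


def build_tree(neighbors, sink):
    """Assign each node a level and a parent, starting from the sink (level 0)."""
    level = {sink: 0}
    parent = {sink: None}
    queue = deque([sink])
    while queue:
        sender = queue.popleft()
        for node in neighbors[sender]:
            if node not in level:                 # not yet assigned to any level
                level[node] = level[sender] + 1   # its level = message.level + 1
                parent[node] = sender             # its parent = the sender node
                queue.append(node)                # it rebroadcasts the message
    return level, parent


# Hypothetical radio neighborhood for five sensors; node 1 is the sink.
neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3], 5: [3]}
level, parent = build_tree(neighbors, sink=1)
```

Node 4 hears both node 2 and node 3, but it keeps the first sender as its parent, so the result is a tree rather than a general graph.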


Tiny AGgregation (TAG)

  • Collection:

    • Each parent waits for data

    • Then sends its aggregation up the tree

    • Each epoch is divided into slots, with the number of slots equal to the maximum depth of the tree, so nodes can sleep and wake up on schedule

    • Every epoch, a new aggregate is produced

    • Most of the time, motes are idle and in a low-power state
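
One epoch of the collection phase for `SELECT COUNT(*) FROM sensors` can be sketched as follows. The parent map describes a hypothetical five-node tree, and processing nodes from deepest to shallowest stands in for the per-level time slots.

```python
def collect_count(parent):
    """Aggregate COUNT(*) up the tree; return the count that arrives at the root."""

    def depth(node):
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    partial = {node: 1 for node in parent}        # every node counts itself
    # Deepest nodes report first, mirroring the slot schedule within an epoch.
    for node in sorted(parent, key=depth, reverse=True):
        if parent[node] is not None:
            partial[parent[node]] += partial[node]
    root = next(node for node in parent if parent[node] is None)
    return partial[root]


# Hypothetical routing tree: node 1 is the sink (root).
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 3}
total = collect_count(parent)
```

Each node sends a single partial count to its parent, so the number of transmissions per epoch equals the number of non-root nodes regardless of how many readings are aggregated.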


Tiny AGgregation (TAG)

SELECT COUNT(*) FROM sensors

[Figure: animation across five slides showing partial counts reported by sensors 1 to 5 over successive time slots; axes: Sensor # and Time]


Tiny AGgregation (TAG)

  • A grouping example


Synopsis Diffusion

  • Synopsis Diffusion : In-network aggregation

  • Synopsis Functions

    • Synopsis Generation: s=SG(r)

    • Synopsis Fusion: s=SF(s1,s2)

    • Synopsis Evaluation: r*=SE(s)

  • Efficient Topology for Synopsis Diffusion

    • Rings (R0 R1  … Ri-1  Ri )

  • Duplicate Sensitive Aggregates Mapping

    • DS Aggregates  Order- and duplicate-insensitive synopsis
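
The three synopsis functions can be sketched for an approximate COUNT using a Flajolet-Martin style bitmap, one common way to obtain an order- and duplicate-insensitive synopsis. The function names follow the slide (SG, SF, SE); the hash choice, the bitmap width, and the 0.77351 correction constant are assumptions of this sketch, and the estimate it returns is only approximate.

```python
import hashlib


def sg(node_id, bits=16):
    """Synopsis generation: a bitmap with one bit set, chosen from a hash of the id."""
    h = int.from_bytes(hashlib.sha256(str(node_id).encode()).digest()[:4], "big")
    i = 0
    while i < bits - 1 and (h >> i) & 1 == 0:
        i += 1                     # bit i is picked with probability about 2^-(i+1)
    return 1 << i


def sf(s1, s2):
    """Synopsis fusion: bitwise OR, which ignores order and duplicates."""
    return s1 | s2


def se(s):
    """Synopsis evaluation: estimate the count from the lowest unset bit."""
    i = 0
    while (s >> i) & 1:
        i += 1
    return (2 ** i) / 0.77351


synopses = [sg(node) for node in range(1, 6)]

tree_result = 0
for s in synopses:
    tree_result = sf(tree_result, s)

# Multi-path delivery: some synopses arrive a second time, in a different order.
multipath_result = tree_result
for s in reversed(synopses[:3]):
    multipath_result = sf(multipath_result, s)
```

Because fusion is a bitwise OR, receiving the same synopsis again or in a different order leaves the result unchanged, which is what allows a duplicate-sensitive aggregate such as COUNT to be computed over multiple paths.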


Synopsis Diffusion

  • Aggregate: A metric of aggregation

[Figure: class of aggregates (Median, Count Distinct, Histogram, Average, MIN, Count), with a region marked "Not TAG"]


Synopsis Diffusion

  • Phases of SD:

    • Distribution Phase

      • Aggregate query is flooded through the network

      • Network nodes form a set of rings

    • Aggregation Phase

      • Each node uses SG to convert local data to a local synopsis and then uses SF to merge two synopses into a new one. The query initiator uses SE to generate the final result.

  • Adapting the Topology

    • Ring Topology

    • Adaptive Ring Topology

    • A node moves up or down in the rings depending on the messages it overhears.


Delay Bounded Medium Access Control (DB-MAC)

  • A tree-based aggregation scheme

  • A joint design of routing & MAC protocols

  • Minimizes the latency for delay bounded apps.

  • Takes advantage of data aggregation

  • Adopts CSMA/CA scheme based on RTS/CTS/DATA/ACK handshake

  • Suitable for cases where:

    • Different sources sense an event

    • There is a delay constraint


Delay Bounded Medium Access Control (DB-MAC)

  • A message exchange example in DB-MAC


Delay Bounded Medium Access Control (DB-MAC)

  • Dynamically aggregates data while forwarding toward the sink

    • Aggregation tree is built on the fly

    • No knowledge of the network topology

  • RTS/CTS are exploited to do aggregation

  • Back-off intervals respect priorities

  • By overhearing CTSs, the relay node is chosen

  • Advantages:

    • Flexible & distributed construction of the tree

    • Suitable for dynamic topologies

    • Energy-efficient due to cross-layer design


Tributaries and Deltas

  • A hybrid approach: tree-based & multipath

  • Idea

    • In low packet loss rate regions, use tree (T)

    • In high loss rate regions, use multipath (M)

  • How to link the regions:

    • M nodes only send data to M nodes

    • M nodes subgraph includes the sink

  • Thresholds for M node percentage

