
Clustering


Presentation Transcript


  1. Clustering
  Murat Demirbas, SUNY Buffalo

  2. Challenges for scalability of wireless sensor networks
  • Energy constraint: sending a message is expensive, depleting the battery roughly 1000x faster than local computation ⇒ communication-efficient programs are needed
  • Distribution of control: centralized solutions or predefined controllers are unsuitable ⇒ ad hoc and efficient distribution of control is needed
  • Faults: message losses and corruptions; nodes fail in complex ways ⇒ locally self-healing programs are needed

  3. Clustering for scaling: why and how?
  • Why?
  • Enables efficient and scalable distribution of controllers
  • Saves energy and reduces network contention by enabling locality of communication
  • How?
  • Should be locally self-healing, to confine faults and changes within that part of the network
  • Should produce approximately equal-sized clusters, to achieve an even distribution of controllers
  • Should result in minimal overlaps, to avoid the overhead of reporting to many clusterheads

  4. Solid-disc clustering
  • Solid-disc clustering:
  • All nodes within unit distance of the clusterhead belong only to that cluster
  • All clusters have a nonoverlapping unit-radius solid disc
  • Why?
  • Reduces intra-cluster signal contention: the clusterhead is shielded on all sides by its own members, so it does not have to endure overhearing nodes from other clusters
  • Yields better spatial coverage with clusters: aggregation at the clusterhead is more meaningful since the head sits at the median of its cluster
  • Results in a guaranteed upper bound on the number of clusters

  5. Challenges for local healing of solid-disc clustering
  [Figure: a new node arriving between clusters A and B triggers a cascading wave of re-clustering]
  • Equal-radius solid-disc clustering with bounded overlaps is not achievable in a distributed and local manner

  6. Our contributions
  • Solid-disc clustering with bounded overlaps is achievable in a distributed and local manner for approximately equal radii
  • A stretch factor m ≥ 2 produces a partitioning that respects the solid disc: each clusterhead has all nodes within unit radius of itself as members, and is allowed to have nodes up to m units away from itself
  • FLOC is locally self-healing for m ≥ 2: faults and changes are contained within the respective cluster or within the immediately neighboring clusters (precisely, within m+1 units)

  7. Our contributions …
  • By taking the unit distance to be the reliable communication radius and m to be the maximum communication radius, FLOC
  • exploits the double-band nature of the wireless radio model
  • achieves communication- and energy-efficient clustering
  • FLOC achieves clustering in O(1) time regardless of the size of the network
  • The completion time T depends only on the density of nodes and is constant
  • Through analysis, simulations, and implementations, we suggest a suitable value of T for achieving fast clustering without compromising the quality of the resulting clusters

  8. Outline
  • Model
  • Justification for m ≥ 2
  • Basic FLOC program
  • Extended FLOC program
  • Simulation & implementation results
  • Concluding remarks

  9. Model
  • Undirected graph topology
  • Radio model is double-band*
  • Reliable communication within unit distance = in-band
  • Unreliable communication within 1 < d < m = out-band
  • Nodes have i-band/o-band estimation capability
  • time-of-flight of audio for ranging, or
  • RSSI-based, using signal strength as an indicator of distance
  • Fault model
  • Fail-stop and crash
  • New nodes can join the network
  • Transient corruption of state and messages
  *Zhao-Govindan (03), Woo-Tong-Culler (03)
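To make the i-band/o-band estimation capability concrete, here is a minimal RSSI-based classifier sketch in Python; the dBm threshold constants are illustrative assumptions (real deployments calibrate them empirically), not values from the talk.

```python
# Sketch of RSSI-based band estimation; the thresholds below are
# illustrative assumptions, not values from the paper.
RSSI_I_BAND = -60.0   # above this, treat the link as reliable (i-band)
RSSI_O_BAND = -85.0   # above this, the link is usable but lossy (o-band)

def classify_band(rssi_dbm: float) -> str:
    """Map a received signal strength to the double-band radio model."""
    if rssi_dbm >= RSSI_I_BAND:
        return "i-band"       # within unit distance: reliable communication
    if rssi_dbm >= RSSI_O_BAND:
        return "o-band"       # between 1 and m units: unreliable
    return "out-of-range"
```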

  10. Self-stabilization
  • A program is self-stabilizing iff, after faults stop occurring, the program eventually recovers to a state from which its specification is satisfied.
  • A self-stabilizing program is fault-locally self-stabilizing if the time and number of messages required for stabilization are bounded by functions of the perturbation size rather than the network size.
  • The perturbation size for a given state is the minimum number of nodes whose state must change to reach a consistent state of the network.

  11. Problem statement
  • A distributed, local, scalable, and self-stabilizing clustering program, FLOC, to construct network partitions such that
  • a unique node is designated as the leader of each cluster
  • all nodes in the i-band of each leader belong to that cluster
  • the maximum distance of a node from its leader is m
  • each node belongs to a cluster
  • no node belongs to multiple clusters

  12. Justification for stretch factor m ≥ 2
  [Figure: a new node arriving near an existing cluster is subsumed by it]
  • For m ≥ 2 local healing is achieved: a new node is
  • either subsumed by one of the existing clusters,
  • or allowed to form its own cluster without disturbing neighboring clusters
  [Figure: a new node forming its own cluster between existing clusters]

  13. Basic FLOC program
  • Status variable at each node j:
  • idle: j is not part of any cluster and is not a candidate
  • cand: j wants to be a clusterhead; j is a candidate
  • c_head: j is a clusterhead; j.cluster_id == j
  • i_band: j is an inner-band member of clusterhead j.cluster_id
  • o_band: j is an outer-band member of j.cluster_id
  • The effects of the 6 actions on the status variable:

  14. FLOC actions
  1. idle ∧ random wait time from [0…T] expired → become a cand and bcast a cand msg
  2. receiver of cand msg is within in-band ∧ its status is i_band → receiver sends a conflict msg to the cand
  3. candidate hears a conflict msg → candidate becomes o_band for the respective cluster
  4. candidacy period Δ expires → cand becomes c_head and bcasts a c_head msg
  5. idle ∧ c_head msg is heard → become i_band or o_band respectively
  6. receiver of c_head msg is within in-band ∧ is o_band → receiver joins the cluster as i_band
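As a rough sketch of how these six guarded actions compose at a node, the Python skeleton below maps each action to an event handler. The `net` object with its `bcast`, `send`, `set_timer`, and `in_band` operations is assumed messaging/timer infrastructure, not part of the original program; this illustrates the action structure, not the authors' implementation.

```python
import random

IDLE, CAND, C_HEAD, I_BAND, O_BAND = range(5)

class FlocNode:
    """Sketch of the six basic FLOC actions as event handlers."""

    def __init__(self, node_id, T, delta, net):
        self.id, self.T, self.delta, self.net = node_id, T, delta, net
        self.status, self.cluster_id = IDLE, None
        # Action 1 trigger: a random wait drawn from [0, T].
        net.set_timer(self, 'idle_timer', random.uniform(0, T))

    def on_idle_timer(self):                        # action 1
        if self.status == IDLE:
            self.status = CAND
            self.net.bcast(self, 'cand')
            self.net.set_timer(self, 'candidacy', self.delta)

    def on_cand_msg(self, sender):                  # action 2
        if self.status == I_BAND and self.net.in_band(self, sender):
            self.net.send(sender, 'conflict', self.cluster_id)

    def on_conflict_msg(self, cluster_id):          # action 3
        if self.status == CAND:
            self.status, self.cluster_id = O_BAND, cluster_id

    def on_candidacy_expired(self):                 # action 4
        if self.status == CAND:
            self.status, self.cluster_id = C_HEAD, self.id
            self.net.bcast(self, 'c_head')

    def on_c_head_msg(self, sender):                # actions 5 and 6
        in_band = self.net.in_band(self, sender)
        if self.status == IDLE:                     # action 5
            self.status = I_BAND if in_band else O_BAND
            self.cluster_id = sender.id
        elif self.status == O_BAND and in_band:     # action 6
            self.status, self.cluster_id = I_BAND, sender.id
```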

  15. FLOC is fast
  • Assumption: the atomicity condition of candidacy is observed by T
  • Theorem: Regardless of the network size, FLOC produces the partitioning in T + Δ time.
  • Proof:
  • An action is enabled at every node within at most T time
  • Once an action is enabled at a node, the node is assigned a clusterhead within Δ time
  • Once a node is assigned to a clusterhead, this property cannot be violated: action 6 makes a node change its clusterhead only to become an i-band member, and action 2 does not cause the clusterhead to change

  16. Selection of T
  • To achieve atomicity of elections, ensure (with high probability) that for a node j whose idle-timer expires, the idle timer of no node within 2 units of j expires within the next Δ time
  • The probability of atomicity of elections is (1 − Δ/T)^w, where w is the maximum number of nodes within 2 units of a node
  • As seen from the formula, T is independent of the network size
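A quick numeric check of this formula (the values of Δ and w below are illustrative assumptions, not figures from the talk):

```python
def atomicity_probability(T: float, delta: float, w: int) -> float:
    """(1 - delta/T)^w: probability that no idle-timer within 2 units
    of an elected node expires during its delta-long candidacy window."""
    return (1 - delta / T) ** w

# Illustrative values: a 1 s candidacy window, w = 12 nodes within
# 2 units, and T = 60 s give roughly an 82% chance of atomicity;
# growing T (not the network) is what pushes this toward 1.
print(atomicity_probability(T=60.0, delta=1.0, w=12))  # ~0.817
```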

  17. Self-stabilization of FLOC: invariant
  For all j, k:
  • I1: j.idle ∨ j.cand ≡ j.cluster_id = ⊥
  • I2: j.c_head ≡ j.cluster_id = j
  • I3: j.i_band ∧ j.cluster_id = k → k.c_head ∧ j in i-band of k
  • I4: j.o_band ∧ j.cluster_id = k → k.c_head ∧ j in o-band of k
  • I5: k.c_head ∧ j in i-band of k → j.i_band ∧ j.cluster_id = k
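The invariant is mechanically checkable over a snapshot of the network, e.g. in a simulator. The sketch below assumes `nodes` maps ids to objects with `status` and `cluster_id` fields, and that `in_band(j, k)` is a distance oracle; these names are illustrative, not from the talk (the range check for I4 would be analogous with an `out_band` oracle).

```python
def floc_invariant_holds(nodes, in_band):
    """Check I1..I5 over a global snapshot of node states."""
    for j in nodes.values():
        # I1: idle and candidate nodes belong to no cluster.
        if j.status in ('idle', 'cand') and j.cluster_id is not None:
            return False
        # I2: a clusterhead names itself as its cluster.
        if j.status == 'c_head' and j.cluster_id != j.id:
            return False
        # I3/I4: members point at a live clusterhead; i_band is in range.
        if j.status in ('i_band', 'o_band'):
            k = nodes.get(j.cluster_id)
            if k is None or k.status != 'c_head':
                return False
            if j.status == 'i_band' and not in_band(j, k):
                return False
    # I5: every node in the i-band of a head is that head's i_band member.
    for k in nodes.values():
        if k.status != 'c_head':
            continue
        for j in nodes.values():
            if j is not k and in_band(j, k) and not (
                    j.status == 'i_band' and j.cluster_id == k.id):
                return False
    return True
```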

  18. Stabilization actions
  • I1 is locally corrected
  • I2 is locally corrected
  • Clusterheads send heartbeats for detecting any violation of I3 & I4
  • For correcting I3 & I4, leases are used; on expiration a node returns to the idle state
  • A violation of I5 is detected when a node receives a c_head msg while it is an i-band member of another cluster; a demote message is sent to both clusterheads
  • Upon receiving demote, the clusterheads return to the idle state
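A minimal sketch of the heartbeat/lease/demote machinery, under the same assumed `net` infrastructure as above; the lease length and message names are illustrative choices, not specified on the slide.

```python
LEASE_PERIODS = 3  # illustrative: lease length in heartbeat periods

class Member:
    """Lease-based correction of I3/I4 and demote-based correction of I5."""

    def __init__(self, net):
        self.net = net
        self.status, self.cluster_id = 'idle', None
        self.lease = LEASE_PERIODS

    def on_heartbeat(self, head_id):
        if head_id == self.cluster_id:
            self.lease = LEASE_PERIODS     # refresh: I3/I4 still hold

    def on_lease_tick(self):               # called once per heartbeat period
        if self.status in ('i_band', 'o_band'):
            self.lease -= 1
            if self.lease <= 0:            # head silent: correct I3/I4
                self.status, self.cluster_id = 'idle', None

    def on_c_head_msg_in_band(self, sender_id):
        # I5 violation: an i_band member hears a second head in its i-band.
        if self.status == 'i_band' and sender_id != self.cluster_id:
            self.net.send(sender_id, 'demote')
            self.net.send(self.cluster_id, 'demote')

class Head:
    def __init__(self, net, node_id):
        self.net, self.status, self.cluster_id = net, 'c_head', node_id

    def on_demote(self):                   # both conflicting heads step down
        self.status, self.cluster_id = 'idle', None
```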

  19. FLOC is fault-locally stabilizing
  • I1 and I2 are locally detected and corrected
  • Correction of I3 and I4 is local to the node
  • A violation of I5 is reduced to violations of I3 and I4 for the nodes that are at most m+1 distance from j
  • Once the invariant is satisfied, re-clustering is achieved locally due to the locality of clustering

  20. FLOC is locally healing …
  • Node failures
  • inherently robust to failure of non-clusterhead members
  • clusterhead failure is dealt with via stabilization actions S3 and S4
  • effects are contained within at most m units
  • Node additions
  • either join an existing cluster, or
  • form a new cluster without disturbing the immediately neighboring clusters, or
  • if the new node is within the i-band of multiple clusterheads, S5 and S6 ensure stabilization

  21. Extensions to the basic FLOC algorithm
  • The extended FLOC algorithm ensures that the solid-disc property is satisfied even when atomicity of candidacy is violated occasionally
  • Insight: bcast is an atomic operation
  • The candidate that bcasts first locks the nodes in the vicinity for Δ time
  • Later candidates drop their candidacy and become idle again when they find that some of the nodes are locked
  • 4 additional actions implement this idea (see the sketch below)
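The slide does not spell out the four extra actions, so the fragment below only illustrates the locking insight; the message names and fields are assumptions.

```python
class ExtendedFlocNode:
    """Illustration of the vicinity-locking idea of extended FLOC."""

    def __init__(self, delta):
        self.delta = delta
        self.status = 'idle'
        self.locked_until = 0.0

    def on_cand_bcast(self, now, reply):
        if now < self.locked_until:
            reply('locked')                       # vicinity already claimed
        else:
            self.locked_until = now + self.delta  # first bcast wins

    def on_locked_msg(self):
        # A later candidate backs off: it drops its candidacy and will
        # retry with a fresh random idle timer.
        if self.status == 'cand':
            self.status = 'idle'
```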

  22. Simulation for determining T
  • Prowler, a realistic wireless sensor network simulator; MAC delay 25 ms
  • Tradeoffs in the selection of T
  • A short T leads to network contention and, hence, message losses
  • Tradeoff between faster completion time and quality of clustering
  • Scalability w.r.t. network size
  • T depends only on the node density
  • In our experiments, the degree of each node is between 4 and 12
  • a constant T is applicable for arbitrarily large-scale networks
  • Code is at www.cse.buffalo.edu/~demirbas/floc

  23. Tradeoff in selection of T

  24. Constant T regardless of network size

  25. Implementation
  • Mica2 mote platform, 5-by-5 grid
  • Confirms the simulation results

  26. Sample clustering with FLOC

  27. Related work
  • LEACH does not satisfy solid-disc clustering
  • FLOC complements LEACH
  • FLOC addresses the network contention problem at clusterheads
  • LEACH-style load balancing is readily applicable in FLOC, via a probabilistic rotation function for determining the waiting times for candidacy announcements
  • FLOC is the first protocol to achieve the solid-disc property

  28. Concluding remarks
  • FLOC is
  • Fast: clustering is achieved in constant time, T + Δ
  • Locally self-healing: changes and faults are confined within the immediate cluster

  29. Energy-efficient communication protocol for WSN Wendi Rabiner Heinzelman, Anantha Chandrakasan, and Hari Balakrishnan

  30. Model
  • The base station is fixed and located far from the sensors; communication with the base station is expensive
  • All nodes in the network are homogeneous and energy-constrained; there are no high-energy nodes
  • Transmit and receive costs are approximately equal, since the energy to power the radio dominates both

  31. LEACH protocol
  • Adaptive clustering:
  • Nodes take turns being cluster-heads.
  • After a fixed period (a round), cluster heads are changed.
  • Dynamic clusters for different rounds:
  • Some nodes become cluster heads.
  • Other nodes choose a cluster to join.

  32. Optimal percentage of clusterheads is determined empirically
  [Figure omitted; x-axis: percentage of clusterheads]

  33. LEACH algorithm
  • Each round is divided into 4 phases:
  • Advertisement phase
  • Cluster set-up phase
  • Schedule creation phase
  • Data transmission phase
  • Multiple-clusters problem:
  • Transmission in one cluster may corrupt transmission in a nearby cluster.
  • Use CDMA to solve this.
  • But CDMA is unavailable on almost all sensor platforms.

  34. LEACH phases
  Advertisement phase (“Me Head!!!”, over CSMA MAC)
  • Every node chooses a random number R and computes a threshold T(n):
  T(n) = P / (1 − P · (r mod 1/P)) if n ∈ G, and T(n) = 0 otherwise,
  where P is the desired percentage of cluster heads (e.g., 5%), r is the current round, and G is the set of nodes that have not been cluster head in the last 1/P rounds.
  • A node elects itself cluster-head if R < T(n).
  • Every cluster-head broadcasts an advertisement message with the same transmit energy.
  Cluster set-up phase (“I am with you”, over CSMA MAC)
  • Each non-cluster-head node decides which cluster to join this round based on the received signal strength of the advertisements: largest strength ⇒ closest head ⇒ minimal energy needed for communication.
  • After deciding which cluster to join, each node informs that cluster-head.
  Schedule creation phase (“Here’s your time slot”)
  • Based on the number of nodes in the cluster, the cluster-head creates a TDMA schedule telling each node when it can transmit, and broadcasts this schedule back to the nodes in the cluster.
  Data transmission phase (“Thanks for the time slot, here’s my data”, over TDMA)
  • To reduce energy consumption, non-cluster-head nodes use the minimal transmit energy, chosen based on the strength of the cluster-head advertisement, and can turn off the radio until their allocated transmission time.
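A compact sketch of the advertisement-phase election in Python; the helper names are illustrative, but the threshold follows the formula above.

```python
import random

def leach_threshold(P: float, r: int) -> float:
    """T(n) = P / (1 - P * (r mod 1/P)) for nodes still in G."""
    return P / (1 - P * (r % round(1 / P)))

def elects_itself(P: float, r: int, was_head_recently: bool) -> bool:
    """A node in G becomes cluster-head this round iff R < T(n)."""
    if was_head_recently:          # n not in G: threshold is 0
        return False
    return random.random() < leach_threshold(P, r)

# With P = 0.05 the cycle is 1/P = 20 rounds: T(n) starts at 0.05 and
# climbs to 1 by round 19, so every node that has not yet served is
# forced to become a head before the cycle restarts.
print(leach_threshold(0.05, 0))    # 0.05
print(leach_threshold(0.05, 19))   # ~1.0
```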

  35. Simulation results

  36. Simulation results …
  [Figure: comparison of Direct, MTE, and LEACH]
