Efficient Labeling Scheme for Scale-Free Networks

Efficient Labeling Scheme for Scale-Free Networks Shai Carmi1, Reuven Cohen1,2 and Shlomo Havlin1 1 Minerva Center and the Department of Physics, Bar-Ilan University, Ramat-Gan, Israel 2 Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel Background and motivation The scheme in details Performance of the scheme First we fix the number of hubs (to O(log(N))) and show the stretch for various graph sizes and values of λ (λ is the exponent in the degree distribution). We present the average stretch over all pairs, averaged over large number of graphs : Next we show the distribution of the possible values of the stretch (For N=1000, and λ =2.5). Finally we fix N (8000) and show how the stretch changes with the number of hubs, for two different exponents. The internet is composed of approximately 107 routers and end-units. Those are connected mainly through physical links. The main network protocol is IP, which uses packet switching, that is – on each separate packet, the protocol decides what is the best next hop. Routing Schemes Today, routing is made using very large tables containing a list of all the main IP addresses. Those tables are kept at each and every router. The main problem is that this method is poorly scalable, that is, with the increasing size of the internet, tables become very large since their size is proportional to the total number of nodes. To address this problem, new class of algorithms have been developed. Those algorithm do not guarantee routing through the shortest path, but rather try to route in a smart way, such that most of the paths will remain shortest. The benefit is reducing the size of the routing table considerably. The efficiency of a routing scheme is measured in terms of its stretch factor – the ratio between the length of path computed by the scheme and that of the shortest path. A more sophisticated approach uses labeling as a preprocessing part of the routing scheme. The labeling step Fix the number of hubs, Nh. Pick the Nh nodes with the highest degree, they are called – the ‘hubs’. For each node in the network : Start BFS with the current node being a root, keeping for each node its predecessor. Stop the search whenever we find one of the hubs. The label of the current node will begin with the hub found followed by the shortest path from that hub to the node. Analysis : Assuming that the average path length from each node to one of the hubs is O(log(N)), the following is concluded : The running time of the labeling step is O(N*log(N)). (Assuming sparse graph). The average label size is O(log2(N)) bits (Since the average path length is O(log(N)), and each node on the path can be represented with O(log(N)) bits). Next is a plot of the average label size as a function of N (For Nh = O(log(N) and λ=2.5) : Labeling Labeling is a preprocessing step of giving new name to each node in the network. The new name is meaningful, i.e.we can use the new name as a source of information when making routing decisions, thus reducing the amount of data that must be stored in the routing tables. For example, consider the following grid, with meaningless name for each node. The routing table will have to hold one entry for each node i.e. the table will be of size O(N). But if smarter names are used, for example the new name is the grid coordinates, then routing table is no longer needed at all, since each node can route to each other node using its coordinates only. The routing table creation step For each hub : Perform BFS with the current hub being a root, keeping for each node its predecessor. For each node reachable from that hub, store its predecessor in that node’s routing table, in the entry that belongs to the current hub. For each node : Store all his immediate neighbors in the table. Current labeling strategies [4] choose a subset of nodes, called central nodes. Each node is assigned one of the central nodes (the closest), and in addition a group of nodes that are in his neighborhood (the node’s cluster). Every node stores in his routing table the link through which to route to all central nodes and all the nodes in his cluster. The label of each node consists of his central node name and the first routing decision on the shortest path from this central node to itself. Routing to some node v is done by first routing to the central node of v (using the label, everyone knows who it is, and everyone knows how to route to him on shortest path). Then using the label the central node knows the next step, after that we are assured to be in the cluster of node v, from there using the routing tables we can route to v on shortest path. The table size at each node is proportional to the number of central nodes plus the cluster size. Therefore we need to find the balance between the sizes of those two groups. Analysis : Total run time : O(N*Nh). (Sparse graphs). Table size (number of entries) of each node : Nh + k (Where k is the degree). Conclusions We used the fact that the internet forms a scale-free graph, and exploited the unique properties of those graphs to create a new routing scheme. Our scheme is extremely simple and fast (both in the preprocessing and routing), demanding labeling with very short label sizes, and routing tables as small as we wish. Nevertheless, the scheme performs very well, finding almost always shortest paths, better than current schemes [3]. The scheme was also tested on real routers network (N~104) and was found highly effective (Stretch less than 1.06 with table size of less than 40 entries on average). The routing procedure The routing decision made by node v on routing request to node u : If v=u, we are done. Otherwise : Using the table, check if u is one of the immediate neighbors. If it is, send through the link (v,u) (which must exist). Otherwise, scan u’s label. For each entry in the label – label[i]: If v is label[i], (for some i) send through the link (label[i],label[i+1]). If v is different from all nodes in the label, extract the hub from the label, (label[1]), find that hub in the routing table, and send through the link that appears there. Scale Free networks In recent years it was discovered that many natural networks, including the internet (in both the routers and AS levels) are scale free. Scale free networks have a power law degree distribution, i.e. they contain some very highly connected nodes (hubs). It was found that in random scale-free networks: 1. Most short paths goes through one of the hubs. 2. The average distance between nodes in the network is ultra-small (O(log(N) or even O(log(log(N)))). [1,2] Routing schemes for scale-free graphs Due to the above mentioned properties of scale-free networks it is natural to propose a new labeling scheme : 1. The central nodes will be chosen to be the hubs. 2. The label for each node will contain its path to the closest hub. References [1] R. Cohen and S. Havlin, "Scale free networks are ultrasmall", Phys. Rev. Lett. 90, 058701 (2003). [2] R. Cohen, D. Dolev, S. Havlin, T. Kalisky, O. Mokryn, and Y. Shavitt, "On the tomography of networks and trees", cond-mat/0305582 (2003). [3] D. Krioukov, K. Fall, X. Yang, “Compact Routing on Internet-Like Graphs”, Proc. INFOCOM 2004, Mar. 2004 [4] M. Thorup and U. Zwick, “Compact routing schemes”, in Proc. Of the 13th SPAA. ACM, 2001. Analysis : Since the routing table is small, it can be implemented as a hash-table, therefore finding the route in constant time. The label size in practical cases is of size O(1) also, therefore we can scan it in constant time, reaching total decision procedure carried out in constant time.

Efficient Labeling Scheme for Scale-Free Networks

Efficient Labeling Scheme for Scale-Free Networks

Presentation Transcript

How Scale-Free are Biological Networks.

Range-free Localization Schemes for Large Scale Sensor Networks

Introduction to Small-World Networks and Scale-Free Networks

Scale Free Networks

Range-Free Localization Schemes for Large Scale Sensor Networks

Branch Code: A Labeling Scheme for Efficient Query Answering on Trees

Resilience Notions for Scale-Free Networks

An Efficient Scheme for Authenticating Public Keys in Sensor Networks

An Efficient and Practical Authenticated Communication Scheme for Vehicular Ad Hoc Networks

Scale Free Networks

Lecture 11: Scale Free Networks

1 Scale-free networks: mathematical properties

Range-Free Localization Schemes for Large Scale Sensor Networks

Towards Efficient Simulation of Large Scale P2P Networks

LEAP: Efficient Security Mechanisms for Large-Scale Distributed Sensor Networks

Efficient Scheduling for Sensor Networks

LAEP: Efficient Security Mechanisms for Large-Scale Distributed Sensor Networks

Scale - free networks

Scale Free Networks

Random Networks: Scale-free Networks

Hierarchical Data Dissemination Scheme for Large Scale Sensor Networks

Scale Free Networks in Biological Systems