Complex networks and decentralized search algorithms

Complex networks and decentralized search algorithms, by Jon Kleinberg. Bo Young Kim, Applied Algorithm Lab

Complex Network: What is a “Complex Network”? • The study of large-scale network structure • Importance: the limits of reductionism • Spans mathematics, computer science, social science and biological science • Computer science: Internet, WWW • Social science: social networks • Biological science: interactions in the pathways of a cell’s metabolism, neurology (e.g. neural burst modeling)

Complex Network: History 1 • Euler (1736): graph theory • New problem: How is a network created? What rules govern its topology and structure?

Complex Network: Goal • Observe properties of real-world networks • Model them (networks produced by a random mechanism) • Reproduce other properties (which may also be observed in real-world networks) • We can explain and predict!

Complex Network: History 2 • Erdős, Rényi (1959): random graph theory • The fact that different systems follow different rules is intentionally ignored • Each pair of nodes is connected at random • Giant component (phase transition): G(n,p) where p = c/n (c: const.) 1) c < 1: G(n,p) consists a.a.s. of small components, all of which have O(log n) vertices 2) c > 1: a.a.s. there is a unique large component, which consists of Θ(n) vertices
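The phase transition on this slide can be checked empirically. The following is a minimal sketch, assuming the standard reading p = c/n; the function name `largest_component_fraction` and the fixed seed are illustrative choices, not part of the original slides.

```python
import random


def largest_component_fraction(n, c, seed=0):
    """Sample G(n, p) with p = c/n; return (size of largest component) / n."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))   # union-find forest over the n nodes
    size = [1] * n            # component size, valid at each root

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    # Each of the n(n-1)/2 possible edges is present independently with prob. p.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                union(u, v)

    return max(size[find(x)] for x in range(n)) / n


if __name__ == "__main__":
    for c in (0.5, 1.0, 2.0):
        print(c, largest_component_fraction(1000, c))
```

For c well below 1 the reported fraction should be close to zero, while for c well above 1 a constant fraction of the nodes should fall into a single component.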

The small–world phenomenon: six degrees of separation • “I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names.” • <Six Degrees of Separation> by Guare (1990) • or • <Chains> by Karinthy (1929) • Yuna Kim, and you, just fill in the names.

The small–world phenomenon: Milgram’s experiment • Stanley Milgram’s experiment (1967) • Goal: measure the “distance” between two people in America • Target person (a stockbroker in Boston) • Starters chosen roughly at random (Wichita, Kansas; Omaha, Nebraska) • Starters are given personal information about the target • Each holder forwards the letter to a person they know on a first-name basis. • The median length among the completed paths was 5.5. (42/160) → We are living in a Small World!

The small–world phenomenon: other examples • Examples: social networks, the Web, biology… • Barabási (1999): the Web has 19 degrees of separation; d = 0.35 + 2.06 log_10 N (d: average distance, N: # of web pages) • A general phenomenon observed in many networks • Caution: it doesn’t mean we can find someone/something easily. (We don’t know the shortest path.)

The small–world phenomenon: Result 1. Such short chains are ubiquitous. 2. Individuals operating with purely local information are very adept at finding these chains. (using “analysis”) → Length of the shortest path ≤ 6

Basic models of small-world networks: in classic graph theory • Thm (Bollobás, de la Vega, 1982) Fix a constant k ≥ 3. If we choose a graph u.a.r. from the set of all n-node graphs in which each node has degree exactly k, then with high probability every pair of nodes will be joined by a path of length O(log n).

Basic models of small-world networks: in classic graph theory • Thm (Bollobás, Chung, 1988) Consider a graph G formed by adding a random matching to an n-node cycle (assume n is even, pair up the nodes on the cycle u.a.r. and add edges between each of these node pairs). With high probability, every pair of nodes will be joined by a path of length O(log n).
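The construction in the Bollobás–Chung theorem is simple enough to build directly. The sketch below, with the hypothetical helper names `cycle_with_matching` and `eccentricity`, constructs the cycle plus a uniformly random perfect matching and measures the largest BFS distance from one node; it illustrates the statement on one sample and is not a proof.

```python
import random
from collections import deque


def cycle_with_matching(n, seed=0):
    """Adjacency lists of an n-cycle plus a uniformly random perfect matching."""
    if n % 2:
        raise ValueError("n must be even")
    rng = random.Random(seed)
    # Cycle edges: each node v is adjacent to v-1 and v+1 (mod n).
    adj = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
    # Shuffling and pairing consecutive entries yields a uniform random matching.
    nodes = list(range(n))
    rng.shuffle(nodes)
    for i in range(0, n, 2):
        a, b = nodes[i], nodes[i + 1]
        adj[a].append(b)
        adj[b].append(a)
    return adj


def eccentricity(adj, source):
    """Largest BFS distance from source to any reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


if __name__ == "__main__":
    n = 1024
    plain_cycle = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
    print("cycle only:      ", eccentricity(plain_cycle, 0))
    print("cycle + matching:", eccentricity(cycle_with_matching(n), 0))
```

On the plain cycle the farthest node is n/2 steps away; with the matching added, the farthest node should be only a small multiple of log n steps away.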

Basic models of small-world networks: Watts-Strogatz model • Granovetter (1973): existence of clusters • Real networks: highly clustered (Erdős number, …) • No clustering in the Erdős–Rényi model → it needs to be modified! • Watts-Strogatz model (1998, Nature) • n×n grid-based model • For each node v, one extra directed edge to some other node w chosen u.a.r. • (w: the long-range contact of v, as opposed to its local contacts) • Superposition of structured and random links. • Trade-off: clustering (large world) ⇔ no clustering (small world)

Basic models of small-world networks: Watts-Strogatz model • Figure: two-dimensional grid with a single random shortcut superimposed. • Figure: two-dimensional grid with many random shortcuts superimposed (as in the Watts-Strogatz model).

Decentralized search in small-world networks • Decentralized search algorithm: an algorithm that finds efficient paths to a destination using purely local information, i.e. an algorithm searching for a short path under the following rule: at each step, the holder of the message must pass it across one of its connections. (In the grid model, the current holder doesn’t know the long-range connections of nodes that have not touched the message.) • Thm (Kleinberg, 2000) The delivery time of any decentralized algorithm in the grid-based model is Ω(n^{2/3}).
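To make the "purely local information" rule concrete, here is a minimal sketch of one particular decentralized algorithm, greedy forwarding, on the n×n grid with one uniformly random long-range contact per node. The names `grid_distance` and `greedy_route` are illustrative assumptions. The long-range contact of a node is drawn only when that node holds the message, which mirrors the rule that unvisited nodes' shortcuts are unknown; because greedy forwarding strictly decreases the grid distance, no node is visited twice, so this lazy sampling is equivalent to fixing the graph in advance.

```python
import random


def grid_distance(a, b):
    """Grid (Manhattan) distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_route(n, source, target, rng):
    """Number of forwarding steps taken by greedy routing from source to target."""
    current, steps = source, 0
    while current != target:
        i, j = current
        # Local contacts: the grid neighbors of the current holder.
        contacts = [(i + di, j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < n and 0 <= j + dj < n]
        # Long-range contact of the current holder, revealed only now (uniform).
        while True:
            shortcut = (rng.randrange(n), rng.randrange(n))
            if shortcut != current:
                break
        contacts.append(shortcut)
        # Greedy rule: forward to the known contact closest to the target.
        current = min(contacts, key=lambda w: grid_distance(w, target))
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    print(greedy_route(50, (0, 0), (49, 49), rng))
```

Greedy forwarding is only one decentralized algorithm; the theorem on the slide is a lower bound that applies to every such algorithm in this model.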

Decentralized search in small-world networks • Extended model (Kleinberg, 2000): in the Watts-Strogatz model, no decentralized algorithm finds short paths. • A parameter α ≥ 0 controls how strongly long-range links are correlated with the geometry of the underlying grid • Grid distance ρ(v,w) • The long-range contact w of v is chosen with probability proportional to ρ(v,w)^{-α} • α = 0: the Watts-Strogatz model • α small: long-range links are too random • α large: long-range links are not random enough.
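The link distribution of the extended model can be sampled directly from its definition. The following sketch assumes the grid is n×n with Manhattan distance as ρ; the function name `sample_long_range_contact` is hypothetical.

```python
import random


def sample_long_range_contact(n, v, alpha, rng):
    """Draw the long-range contact w of v with Pr[w] proportional to rho(v, w)^(-alpha)."""
    # All grid points other than v itself are candidates.
    nodes = [(i, j) for i in range(n) for j in range(n) if (i, j) != v]
    # Weight of each candidate: grid distance to v raised to the power -alpha.
    weights = [(abs(i - v[0]) + abs(j - v[1])) ** (-alpha) for (i, j) in nodes]
    # random.choices normalizes the weights internally.
    return rng.choices(nodes, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    center = (10, 10)
    for alpha in (0, 2, 4):
        draws = [sample_long_range_contact(21, center, alpha, rng) for _ in range(300)]
        mean = sum(abs(i - 10) + abs(j - 10) for i, j in draws) / len(draws)
        print(alpha, round(mean, 2))
```

With α = 0 every other node is equally likely, so contacts are typically far away; as α grows the contacts concentrate near v, which is the "not random enough" regime of the slide.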

Decentralized search in small-world networks • Thm (Kleinberg, 2000) 1. 0 ≤ α < 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(2-α)/3}) 2. α = 2: there is a decentralized algorithm with delivery time O(log^2 n) 3. α > 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(α-2)/(α-1)})
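The three cases of the theorem can be tabulated as a single exponent of n. The helper below (the name `lower_bound_exponent` is an assumption made for this sketch) returns the exponent e such that the lower bound reads Ω(n^e), with e = 0 standing for the α = 2 case, where no polynomial lower bound holds because an O(log^2 n) algorithm exists.

```python
def lower_bound_exponent(alpha):
    """Exponent e in the Omega(n^e) delivery-time bound for the grid-based model."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha < 2:
        return (2 - alpha) / 3            # case 1: links too random
    if alpha == 2:
        return 0.0                        # case 2: polylogarithmic delivery time
    return (alpha - 2) / (alpha - 1)      # case 3: links not random enough


if __name__ == "__main__":
    for alpha in (0, 1, 2, 3, 10):
        print(alpha, lower_bound_exponent(alpha))
```

At α = 0 the exponent is 2/3, matching the Ω(n^{2/3}) bound stated for the uniform grid-based model, and the exponent tends to 0 as α approaches 2 from either side.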

Decentralized search in small-world networks • Figure: a node with several random shortcuts spanning different distance scales.

Decentralized search in other models 1. Hierarchical models • The network is embedded in a hierarchy; nodes reside at the leaves of a complete b-ary tree • Natural variation: Milgram’s experiment, Web pages • Figure (example hierarchy): Arts → Music → Opera → Verdi’s Aida; Science → Biology → Genetics → Yeast genome

Decentralized search in other models 1. Hierarchical models • Def (b-ary tree): a tree with no more than b children for each node • Def (depth of a node): the distance from the node to the root of the tree • Def (complete b-ary tree): a b-ary tree in which all leaf nodes are at the same depth and all internal nodes have b children. …

Decentralized search in other models 1. Hierarchical models • Natural assumption: the density of links is lower for node pairs that are more widely separated in the underlying hierarchy. • Hierarchical model with exponent β • Complete b-ary tree with n leaves (h = log_b n) • Tree distance h(v,w) = the height of the lowest common ancestor of v and w • Define a random graph G on the set V of leaves • k edges out of each node v • The endpoint w of the i-th edge is chosen independently with probability proportional to b^{-β h(v,w)} (β ≥ 0)
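The hierarchical model can be sketched in a few lines once the leaves are numbered. The code below assumes the n = b^h leaves are labeled 0, …, n-1 from left to right, so that the parent of leaf v at the next level is v // b; this labeling and the names `tree_distance` and `sample_out_edges` are illustrative assumptions, not taken from the slides.

```python
import random


def tree_distance(v, w, b):
    """Height of the lowest common ancestor of leaves v and w in a complete b-ary tree."""
    height = 0
    while v != w:
        # Move both leaves up one level; they merge at their common ancestor.
        v //= b
        w //= b
        height += 1
    return height


def sample_out_edges(v, n, b, beta, k, rng):
    """Draw k endpoints independently with Pr[w] proportional to b^(-beta * h(v, w))."""
    others = [w for w in range(n) if w != v]
    weights = [b ** (-beta * tree_distance(v, w, b)) for w in others]
    # random.choices samples with replacement, i.e. the k draws are independent.
    return rng.choices(others, weights=weights, k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    print(tree_distance(0, 7, 2))                                   # leaves in opposite halves
    print(sample_out_edges(0, n=64, b=2, beta=1, k=6, rng=rng))
```

With β = 0 the endpoints are uniform over the other leaves; larger β concentrates the edges on leaves that share a low common ancestor with v.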

Decentralized search in other models 1. Hierarchical models • Starting node s, target node t • The algorithm must construct a path from s to t • It knows only the edges out of nodes that it explicitly visits. • Caution: G may not contain a path from s to t. • Def (delivery time f(n)): a decentralized algorithm has delivery time f(n) ↔ on a randomly generated n-node network, with s and t chosen u.a.r., the algorithm produces a path of length O(f(n)) with probability at least 1-ε(n), where ε(n) → 0 as n → ∞

Decentralized search in other models 1. Hierarchical models • Thm (Kleinberg, 2001) • In the hierarchical model with exponent β = 1 and out-degree k = c log^2 n, for a sufficiently large constant c, there exists a decentralized algorithm with polylogarithmic delivery time. • For every β ≠ 1 and every polylogarithmic function k(n), there is no decentralized algorithm (in the hierarchical model with exponent β and out-degree k(n)) that achieves polylogarithmic delivery time.

Decentralized search in other models 1. Hierarchical models • Watts, Dodds and Newman (2002) independently proposed a similar model.

Design principles and network data 1. Peer-to-peer systems and focused web crawling • Napster and music file sharing (1999) • Centralized index → decentralized algorithm • Focused web crawler ↔ standard web search engine

Design principles and network data 2. Social network data • (Adamic, Adar, 2005) e-mail network observation: g(v,w)^{-3/4} compared with g(v,w)^{-1}. • (Liben-Nowell, 2005) LiveJournal observation • Rank-based friendship • Thm (Liben-Nowell, 2005) For an arbitrary population density on a grid, the expected delivery time of the decentralized greedy algorithm in the rank-based friendship model is O(log^3 n).

Recall: flavor of research in this area • An experiment in the social sciences highlights a fundamental and non-obvious property of networks (efficient searchability in this case) • Random graph modeling and analysis • Measurement on large-scale data • Further results and questions in algorithms, graph theory and discrete probability

References • J. Kleinberg. Navigation in a Small World. Nature 406 (2000), 845. • J. Kleinberg. The Small-World Phenomenon and Decentralized Search. A short essay as part of Math Awareness Month 2004, appearing in SIAM News 37(3), April 2004. • J. Kleinberg. Complex Networks and Decentralized Search Algorithms. Proceedings of the International Congress of Mathematicians (ICM), 2006. • Albert-László Barabási. Linked: How Everything Is Connected to Everything Else and What It Means (2002). • Noga Alon, Joel H. Spencer. The Probabilistic Method, 2nd Edition (2000).

Thanks
