- By
**dot** - Follow User

- 125 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'CS728 Lecture 5 Generative Graph Models and the Web' - dot

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Importance of Generative Models

Gives insight into the graph formation process:

- Anomaly detection – abnormal behavior, evolution
- Predictions – predicting future from the past
- Simulations and evaluation of new algorithms
- Graph sampling – many real world graphs like the web are too large and complex to deal with
- Goal: generating graphs with small world property, clustering, power-laws, other naturally occurring structures

Graph Models: Waxman Models

- Used for models of clustering in Internet-like topologies and networks with long and short edges
- The vertices are distributed at random in a plane.
- An edge is added between each pair of vertices with probability p.

p(u,v) = * exp( -d / (*L) ), 0 , 1.

- L is the maximum distance between any two nodes.
- Increase in alpha increases the number of edges in the graph.
- Increase in beta increases the number of long edges relative to short edges.
- d is the Euclidean distance from u to v in Waxman-1.
- d is a random number between [0, L] in Waxman-2.

Graph Models: Configuration Model

- Random Graph from given degree sequence
- Problem: Given a degree sequence, d1,d2, d3, …., dn generate a random graph with that degree sequence
- Solution:

Place di stubs onto vertex I

Choose pairs of stubs at random

Problem: we may construct graphs with loops and multiedges

- To prevent this there must be enough “absorbing” residual degree capacity.
- Algorithm:
- Maintain list of nodes sorted by residual degrees d(v)
- Repeat until all nodes have been chosen:
- pick arbitrary vertex v
- add edges from v to d(v) vertices of highest residual degree
- update residual degrees

To randomize further, we can start with a realization and repeatedly 2-swap pairs of edges (u,v), (s,t) to (u,t), (s,v)

Works OK, But is there a more ‘natural’ generative model?

Generative Graph models: Preferential attachment

- Price’s Model [65] : Physics citations – “cummulative advantage”
- Herb Simon [50’s]: Nobel and Turing Awards, political scientist “rich get richer” (Pareto)
- Matthew effect / Matilda effect: sociology
- Barabasi and Albert 99: Preferential attachment:
- Add a new node, create d out-links
- Probability of linking a node is proportional to its current degree
- Simple explanation of power-law degree distributions

Issues with preferential attachment and Power-laws

- Barabasi model fixed constant m for out-degree
- Price’s model directed with m mean out-degree
- Probability of adding a new edge is proportional to its (in) degree k
- problem at the start degree 0
- Price’s model: prop to deg + 1
- Analysis: prob a node has degree k
- pk ~ k-3 (Barabasi model)
- pk ~ k-(2+1/m) power-law with exponent 2-3 (Price)
- Exercise: give pseudocode that generates such a graph in linear time

Variations on the PA Theme

- Clustering, Small-World and Ageing
- Copying Model
- Alpha and beta Models
- Temporal Evolution
- Densification

Graph models: Copying model

- Copying model
- [Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 99]:
- Add a node and choose the number of edges to add
- Choose a random vertex and “copy” its links (neighbors)
- Also generates power-law degree distributions
- Generates communities - clustering

Graph Models: The Alpha Model

Watts (1999)

a model: Add edges to nodes, as in random graphs, but makes links more likely when two nodes have a common friend.

For a range of a values:

- The world is small (average path length is short), and
- Groups tend to form (high clustering coefficient).

Probability of linkage as a function

of number of mutual friends

(a is 0 in upper left,

1 in diagonal,

and ∞ in bottom right curves.)

Graph Models: The Beta Model

Watts and Strogatz (1998)

“Link Rewiring”

b = 0

b = 1

b = 0.125

People know

their neighbors,

and a few distant people.

Clustered and

“small world”

People know

others at

random.

Not clustered,

but “small world”

People know

their neighbors.

Clustered, but

not a “small world”

Graph Models: The Beta Model

Watts and Strogatz (1998)

First five random links reduce the average path length of the network by half, regardless of N!

Both a and b models reproduce short-path results of random graphs, but also allow for clustering.

Small-world phenomena occur at threshold between order and chaos.

Clustering coefficient /

Normalized path length

Clustering coefficient (C) and average

path length (L) plotted against b

Other Related Work

- Hybrid models: Beta + Waxman on grid
- Huberman and Adamic, 1999: Growth dynamics of the world wide web
- Argue against Barabasi model for its age dependence
- Kumar, Raghavan, Rajagopalan, Sivakumar and Tomkins, 1999: Stochastic models for the web graph
- Watts, Dodds, Newman, 2002: Identity and search in social networks
- Medina, Lakhina, Matta, and Byers, 2001: BRITE: An Approach to Universal Topology Generation
- …

Modeling Ageing and Temporal Evolution

- N(t) … nodes at time t
- E(t) … edges at time t
- Suppose that

N(t+1) = 2 * N(t)

- Q: what is guess for

E(t+1) =? 2 * E(t)

- A: over-doubled?

Temporal Evolution of Graphs

- Densification Power Law
- networks appear denser over time
- the number of edges grows faster than the number of nodes – average degree is increasing

a … densification exponent

or

equivalently

Graph Densification

- Densification Power Law
- Densification exponent: 1 ≤ a ≤ 2:
- a=1: linear growth – constant out-degree (assumed in the literature so far)
- a=2: quadratic growth – clique
- Let’s see the real graphs!

Densification – ArXiv citation graph in Physics

- Citations among physics papers
- 1992:
- 1,293 papers,

2,717 citations

- 2003:
- 29,555 papers, 352,807 citations
- For each month M, create a graph of all citations up to month M

E(t)

1.69

N(t)

Densification – Patent Citations

- Citations among patents granted
- 1975
- 334,000 nodes
- 676,000 edges
- 1999
- 2.9 million nodes
- 16.5 million edges
- Each year is a datapoint

E(t)

1.66

N(t)

Densification – Internet Autonomous Systems

- Graph of Internet
- 1997
- 3,000 nodes
- 10,000 edges
- 2000
- 6,000 nodes
- 26,000 edges
- One graph per day

E(t)

1.18

N(t)

Evolution of the Diameter

- Prior work on Power Law graphs hints at Slowlygrowing diameter:
- diameter ~ O(log N)
- diameter ~ O(log log N)
- What is happening in real data?
- Diameter shrinks over time
- As the network grows the distances between nodes slowly decrease

Diameter – ArXiv citation graph

diameter

- Citations among physics papers
- 1992 –2003
- One graph per year

time [years]

Next Time: Densification – Possible Explanations

- Generative models to capture the Densification Power Law and Shrinking diameters
- 2 proposed models:
- Community Guided Attachment – obeys Densification
- Forest Fire model – obeys Densification, Shrinking diameter (and Power Law degree distribution)

Download Presentation

Connecting to Server..