Dynamic Models of On-line Social Networks

December, 2009

Anthony Bonato

Ryerson University

On-line Social Networks - Anthony Bonato

Complex Networks

- web graph, social networks, biological networks, internet networks, …

The web graph

nodes: web pages

edges: links

over 1 trillion nodes, with billions of nodes added each day

Social Networks

nodes: people

edges: social interaction (e.g. friendship)

On-line Social Networks (OSNs)

Facebook, Twitter, Orkut, LinkedIn, GupShup…

A new paradigm

- half of all users of internet on some OSN
- 250 million users on Facebook, 45 million on Twitter

- unprecedented, massive record of social interaction
- unprecedented access to information/news/gossip

Properties of Complex Networks

- observed properties:
- massive, power law, small world, decentralized

(Broder et al, 01)

Small World Property

- small world networks introduced by Watts & Strogatz in 1998
- low diameter/average distance (“6 degrees of separation”)
- globally sparse, locally dense (high clustering coefficient)

Paths in Twitter

Figure: paths in Twitter linking the accounts of the Dalai Lama, Arnold Schwarzenegger, Queen Rania of Jordan, Christiane Amanpour, and Ashton Kutcher

Why model complex networks?

- uncover the generative mechanisms underlying complex networks
- models are a predictive tool
- nice mathematical challenges
- models can uncover the hidden reality of networks
- in OSNs:
- community detection
- advertising
- security

Many different models

Social network analysis

On-line

- Milgram (67): average distance between two Americans is 6
- Watts and Strogatz (98): introduced small world property
- Adamic et al. (03): early study of on-line social networks
- Liben-Nowell et al. (05): small world property in LiveJournal
- Kumar et al. (06): Flickr, Yahoo!360; average distances decrease with time
- Golder et al. (06): studied 4 million users of Facebook
- Ahn et al. (07): studied Cyworld in South Korea, along with MySpace and Orkut
- Mislove et al. (07): studied Flickr, YouTube, LiveJournal, Orkut
- Java et al. (07): studied Twitter: power laws, small world

Key parameters

- power law degree distributions:
- average distance:
- clustering coefficient:

Wiener index, W(G)
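
The three parameters can be computed directly on a small graph. The sketch below assumes the standard definitions, which may differ in detail from the formulas on the original slide: the degree histogram (a power law means the number of nodes of degree k is proportional to k^(-b)), the average distance W(G) / C(n,2) for a connected graph, where the Wiener index W(G) is the sum of distances over all unordered pairs of nodes, and the clustering coefficient taken as the mean over nodes of the fraction of neighbour pairs that are adjacent. All function names are illustrative.

```python
from collections import Counter, deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of distances over unordered pairs (graph assumed connected)."""
    total = sum(sum(bfs_distances(adj, u).values()) for u in adj)
    return total // 2  # every pair was counted twice


def average_distance(adj):
    n = len(adj)
    return wiener_index(adj) / (n * (n - 1) / 2)


def clustering_coefficient(adj):
    """Mean local clustering; nodes of degree < 2 contribute 0 (an assumption)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)


def degree_histogram(adj):
    """Map each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Example: the 4-cycle 0-1-2-3-0 and a triangle.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(wiener_index(c4), average_distance(c4), clustering_coefficient(c4))
print(degree_histogram(c4), clustering_coefficient(triangle))
```

The all-pairs breadth-first search is quadratic in the number of nodes, so this is only practical for small examples; large OSN samples would need sampling or approximation.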

Power laws in OSNs

Sample data: Flickr, YouTube, LiveJournal, Orkut

- (Mislove et al,07): short average distances and high clustering coefficients

Densification Power Law (Leskovec, Kleinberg, Faloutsos, 05)

- many complex networks (including on-line social networks) obey two additional laws:

- networks are becoming more dense over time;
- i.e. average degree is increasing
- e(t) ≈ n(t)^a, where 1 < a ≤ 2 is the densification exponent

- a=1: linear growth – constant average degree, such as in web graph models
- a=2: quadratic growth – cliques

- Decreasing distances
- distances (diameter and/or average distances) decrease with time
- noted by Kumar et al. in Flickr and Yahoo!360

- Preferential attachment model (Barabási, Albert, 99), (Bollobás et al, 01)
- diameter O(log t)

- Random power law graph model (Chung, Lu, 02)
- average distance O(log log t)

Models for the laws

- (Leskovec, Kleinberg, Faloutsos, 05, 07):
- Forest Fire model
- stochastic
- densification power law, decreasing diameter, power law degree distribution

- (Leskovec, Chakrabarti, Kleinberg, Faloutsos, 05, 07):
- Kronecker Multiplication
- deterministic
- densification power law, decreasing diameter, power law degree distribution

Models of OSNs

- many models exist for general complex networks
- few models for on-line social networks
- goal: find a model which simulates many of the observed properties of OSNs
- must be simple and evolve in a natural way
- must be different than previous complex network models: densification and constant diameter!

“All models are wrong, but some are useful.”

– G.E.P. Box

Iterated Local Transitivity (ILT) model (Bonato, Hadi, Horn, Prałat, Wang, 08)

- key paradigm is transitivity: friends of friends are more likely to be friends; e.g. (Girvan and Newman, 03)
- iterative cloning of closed neighbour sets

- deterministic: amenable to analysis
- local: nodes often only have local influence
- evolves over time, but retains memory of initial graph

ILT model

- parameter: finite simple undirected graph G = G_0
- to form the graph G_{t+1}, for each vertex x from time t, add a vertex x’, the clone of x, so that xx’ is an edge, and x’ is joined to each neighbour of x
- order of G_t is 2^t n_0
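
The cloning rule above can be sketched in a few lines. The representation is an assumption made for illustration: nodes are the integers 0..n-1, the graph is a dict mapping each node to its set of neighbours, and the clone of x is labelled x + n. The function name `ilt_step` is likewise illustrative.

```python
def ilt_step(adj):
    """Return G_{t+1} from G_t under the ILT cloning rule.

    adj maps each node (integers 0..n-1) to its set of neighbours.
    The clone of x gets the label x + n (a labelling convention chosen here).
    """
    n = len(adj)
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    for x, nbrs in adj.items():
        clone = x + n
        # x' is joined to x and to every neighbour x had at time t
        new_adj[clone] = set(nbrs) | {x}
        new_adj[x].add(clone)
        for y in nbrs:
            new_adj[y].add(clone)
    return new_adj


def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Start from the 4-cycle and apply two time-steps.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
g1 = ilt_step(c4)
g2 = ilt_step(g1)
print(len(g1), edge_count(g1), len(g2), edge_count(g2))
```

Clones are joined only to nodes that existed at time t, never to other clones created in the same step, which is why the neighbour sets are read from the old graph `adj` rather than from `new_adj`.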

G_0 = C_4

Properties of ILT model

- average degree increasing to ∞ with time
- average distance bounded by constant and converging, and in many cases decreasing with time; diameter does not change
- clustering higher than in a randomly generated graph with the same average degree
- bad expansion: small gaps between 1st and 2nd eigenvalues in adjacency and normalized Laplacian matrices of G_t

Densification

- n_t = order of G_t, e_t = size of G_t

Lemma: For t > 0,

n_t = 2^t n_0, e_t = 3^t (e_0 + n_0) - 2^t n_0.

→ densification power law:

e_t ≈ n_t^a, where a = log(3)/log(2).
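
The lemma can be checked numerically. The sketch below relies on a one-step recurrence that follows from the cloning rule: every node gains one clone, so the order doubles, and every edge xy yields the three edges xy, x’y, xy’ while every node x contributes the new edge xx’, so the size becomes 3e_t + n_t. The function names are illustrative.

```python
import math


def ilt_counts(n0, e0, steps):
    """Order and size of G_0..G_steps from the one-step recurrence."""
    n, e = n0, e0
    counts = [(n, e)]
    for _ in range(steps):
        # right-hand side uses the old n: e_{t+1} = 3 e_t + n_t, n_{t+1} = 2 n_t
        n, e = 2 * n, 3 * e + n
        counts.append((n, e))
    return counts


def closed_form(n0, e0, t):
    """The lemma's formulas for n_t and e_t."""
    return 2 ** t * n0, 3 ** t * (e0 + n0) - 2 ** t * n0


# G_0 = C_4 has n_0 = 4 nodes and e_0 = 4 edges.
counts = ilt_counts(4, 4, 200)
target = math.log(3) / math.log(2)
for t in (1, 10, 50, 200):
    n_t, e_t = counts[t]
    print(t, math.log(e_t) / math.log(n_t), target)
```

The ratio log(e_t)/log(n_t) approaches log(3)/log(2) ≈ 1.585 only slowly, because the constants e_0 + n_0 and n_0 contribute terms of order 1/t.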

Average distance

Theorem 2: If t > 0, then

- average distance bounded by a constant, and converges; for many initial graphs (large cycles) it decreases
- diameter does not change from time 0
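
Both bullets can be observed on a small example. The sketch below is an empirical illustration, not a proof: it grows the ILT model from a 20-cycle (an arbitrary choice of initial graph) and measures the average distance and diameter by breadth-first search at each step. Integer node labels and the rule "clone of x is x + n" are conventions assumed here.

```python
from collections import deque


def ilt_step(adj):
    """One ILT time-step; the clone of node x is labelled x + n."""
    n = len(adj)
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    for x, nbrs in adj.items():
        clone = x + n
        new_adj[clone] = set(nbrs) | {x}
        new_adj[x].add(clone)
        for y in nbrs:
            new_adj[y].add(clone)
    return new_adj


def cycle(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def distance_stats(adj):
    """Return (average distance, diameter) of a connected graph."""
    total, diameter = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    n = len(adj)
    return total / (n * (n - 1)), diameter  # total counts ordered pairs


graph = cycle(20)
results = []
for t in range(3):
    results.append(distance_stats(graph))
    graph = ilt_step(graph)

for t, (avg, diam) in enumerate(results):
    print(t, round(avg, 4), diam)
```

Only the first few time-steps are run because the order doubles each step and the all-pairs search is quadratic in the order.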

Clustering Coefficient

Theorem 3: If t > 0, then

c(G_t) = n_t^{log_2(7/8) + o(1)}.

- higher clustering than in a random graph G(n_t, p) with the same order and average degree as G_t, which satisfies
c(G(n_t, p)) = n_t^{log_2(3/4) + o(1)}
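
The comparison with a random graph can be illustrated on a small case. The sketch below assumes the clustering coefficient is the mean over nodes of the fraction of adjacent neighbour pairs, and uses the fact that in G(n, p) two neighbours of a node are adjacent with probability p, so the expected clustering is about p, taken here as the edge density of the ILT graph. The starting graph C_4 and the function names are illustrative choices.

```python
from itertools import combinations


def ilt_step(adj):
    """One ILT time-step; the clone of node x is labelled x + n."""
    n = len(adj)
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    for x, nbrs in adj.items():
        clone = x + n
        new_adj[clone] = set(nbrs) | {x}
        new_adj[x].add(clone)
        for y in nbrs:
            new_adj[y].add(clone)
    return new_adj


def clustering_coefficient(adj):
    """Mean local clustering; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)


def edge_density(adj):
    """Edges divided by the number of node pairs: the p of a matching G(n, p)."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return edges / (n * (n - 1) / 2)


graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}  # C_4
graph = ilt_step(ilt_step(graph))                      # G_2, 16 nodes
c_ilt = clustering_coefficient(graph)
p = edge_density(graph)
print(c_ilt, p)
```

At very small t the two values are close (the comparison is asymptotic), so the gap becomes clearer as more steps are applied.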

Sketch of proof of lower bound

- each node x at time t has a binary sequence corresponding to descendants from time 0, with a clone indicated by 1
- let e(x,t) be the number of edges in N(x) at time t
- we may show that
e(x,t+1) = 3e(x,t) + 2deg_t(x)

e(x’,t+1) = e(x,t) + deg_t(x)

- if there are k many 0’s in the binary sequence of x, then
e(x,t) ≥ 3^{k-2} e(x,2) = Ω(3^k)
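
The two recurrences for e(x,t) can be confirmed by direct counting on a small graph. In the sketch below, the initial graph (a 5-cycle with one chord) is an arbitrary choice, and the labelling "clone of x is x + n" is a convention assumed for illustration.

```python
from itertools import combinations


def ilt_step(adj):
    """One ILT time-step; the clone of node x is labelled x + n."""
    n = len(adj)
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    for x, nbrs in adj.items():
        clone = x + n
        new_adj[clone] = set(nbrs) | {x}
        new_adj[x].add(clone)
        for y in nbrs:
            new_adj[y].add(clone)
    return new_adj


def edges_in_neighbourhood(adj, x):
    """e(x, t): number of edges with both ends in N(x)."""
    return sum(1 for u, v in combinations(adj[x], 2) if v in adj[u])


def recurrences_hold(adj):
    """Check both recurrences for every node x of G_t and its clone x'."""
    n = len(adj)
    nxt = ilt_step(adj)
    for x in adj:
        e, deg = edges_in_neighbourhood(adj, x), len(adj[x])
        if edges_in_neighbourhood(nxt, x) != 3 * e + 2 * deg:
            return False
        if edges_in_neighbourhood(nxt, x + n) != e + deg:
            return False
    return True


# 5-cycle 0-1-2-3-4-0 with the chord 0-2.
graph = {0: {1, 2, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {0, 3}}
checks = []
for t in range(3):
    checks.append(recurrences_hold(graph))
    graph = ilt_step(graph)
print(checks)
```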

Sketch of proof, continued

- there are many nodes with k many 0’s in their binary sequence

- hence,

Example of community structure

- Wayne Zachary’s Ph.D. thesis (1970-72): observed social ties and rivalries in a university karate club (34 nodes, 78 edges)
- during his observation, conflicts intensified and group split

Spectral results

- the spectral gap λ of G is defined by
min{λ_1, 2 - λ_{n-1}},

where 0 = λ_0 ≤ λ_1 ≤ … ≤ λ_{n-1} ≤ 2 are the eigenvalues of the normalized Laplacian of G: I - D^{-1/2} A D^{-1/2} (Chung, 97)

- for random graphs, λ tends to 1 as order grows
- in the ILT model, λ < ½
- bad expansion/small spectral gaps in the ILT model found in social networks but not in the web graph (Estrada, 06)
- in social networks, there are a higher number of intra- rather than inter-community links
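
The spectral gap as defined above can be computed for small graphs without external libraries. The sketch below builds the normalized Laplacian I - D^{-1/2} A D^{-1/2} and finds its eigenvalues with a plain cyclic Jacobi iteration, a textbook method swapped in here only to keep the example self-contained; a numerical library routine would normally be used instead. Nodes are assumed to be labelled 0..n-1 with no isolated nodes, and the two example graphs are illustrative.

```python
import math


def normalized_laplacian(adj):
    """I - D^{-1/2} A D^{-1/2} as a list of rows (no isolated nodes)."""
    n = len(adj)
    deg = [len(adj[i]) for i in range(n)]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lap[i][i] = 1.0
        for j in adj[i]:
            lap[i][j] = -1.0 / math.sqrt(deg[i] * deg[j])
    return lap


def symmetric_eigenvalues(matrix, sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-13:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0 else -1.0
                t = sign / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def spectral_gap(adj):
    """min{lambda_1, 2 - lambda_{n-1}} for the normalized Laplacian."""
    eig = symmetric_eigenvalues(normalized_laplacian(adj))
    return min(eig[1], 2.0 - eig[-1])


# A complete graph versus two triangles joined by a single edge.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(spectral_gap(k4), spectral_gap(two_triangles))
```

The second graph has two tightly knit groups with a single link between them, the kind of intra- versus inter-community imbalance the last bullet describes, and its gap is correspondingly smaller.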

Random ILT model

- randomize the ILT model
- add random edges independently to new nodes, with probability a function of t
- makes densification tunable
- densification exponent becomes
log(3 + ε) / log(2),

where ε is any fixed real number in (0,1)

- gives any exponent in (log(3)/log(2), 2)

- similar (or better) distance, clustering and spectral results as in deterministic case

Degree distribution

- generate power law graphs from ILT?
- deterministic ILT model gives a binomial-type distribution

Geometric model for social networks

- OSNs live in social space: proximity of nodes depends on common attributes (such as geography, gender, age, etc.)
- IDEA: embed OSN in m-dimensional Euclidean space

Dimension of an OSN

- dimension of OSN: minimum number of attributes needed to classify nodes
- like game of “20 Questions”: each question narrows range of possibilities
- what is a credible mathematical formula for the dimension of an OSN?

Random geometric graphs

- nodes are randomly distributed in Euclidean space according to a given distribution
- nodes are joined by an edge if and only if their distance is less than a threshold value
(Penrose, 03)
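
A random geometric graph is straightforward to generate. The sketch below assumes one particular distribution, uniform on the unit hypercube, and the ordinary Euclidean metric; the function name, the seed parameter and the example sizes are illustrative.

```python
import math
import random


def random_geometric_graph(n, m, threshold, seed=None):
    """n points uniform in [0,1]^m; join two points iff distance < threshold."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(m)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < threshold:
                adj[i].add(j)
                adj[j].add(i)
    return points, adj


points, adj = random_geometric_graph(n=50, m=2, threshold=0.2, seed=1)
degrees = sorted(len(nbrs) for nbrs in adj.values())
print(degrees)
```

With a single fixed threshold every node has the same size of neighbourhood region, so the degrees stay fairly uniform; this is the feature the spatial model for OSNs changes by letting the threshold vary with the ranking of nodes.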

Spatial model for OSNs

- we consider a spatial model of OSNs, where
- nodes are embedded in m-dimensional Euclidean space
- number of nodes is static
- threshold value variable: a function of ranking of nodes

Prestige-Based Spatial (PBS) Model (Bonato, Janssen, Prałat, 09)

- parameters: α, β in (0,1), α+β < 1; positive integer m
- nodes live in hypercube of dimension m, measure 1
- each node is ranked 1,2, …, n by some function r
- 1 is best, n is worst
- we use random initial ranking

- at each time-step, one new node v is born, one node chosen u.a.r. dies (and ranking is updated)
- each existing node u has a region of influence with volume
- add edge uv if v is in the region of influence of u
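
The time-step described above can be sketched as a small simulation. Several details are assumptions made only so the example runs, and are not taken from the slides: the hypercube is treated as a torus, a region of influence is an axis-parallel cube centred on the node, the new node receives a uniformly random rank with other ranks shifted to make room, and the volume of a region is supplied by the caller as a function `influence_volume(rank, n)`. The volume function used in the example is a placeholder, not the model's formula.

```python
import random


def torus_gap(a, b):
    """Distance between two coordinates on a circle of length 1."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def pbs_step(pos, rank, adj, m, influence_volume, rng):
    """One birth-and-death step; mutates pos, rank and adj in place."""
    n = len(pos)
    new_id = max(pos) + 1
    # Death: a node chosen uniformly at random is removed with its edges.
    dead = rng.choice(sorted(pos))
    dead_rank = rank.pop(dead)
    del pos[dead]
    for u in adj.pop(dead):
        adj[u].discard(dead)
    for u in rank:
        if rank[u] > dead_rank:
            rank[u] -= 1
    # Birth: the new node takes a random rank (assumed update rule).
    new_rank = rng.randint(1, n)
    for u in rank:
        if rank[u] >= new_rank:
            rank[u] += 1
    point = tuple(rng.random() for _ in range(m))
    adj[new_id] = set()
    for u, p in pos.items():
        side = min(1.0, influence_volume(rank[u], n)) ** (1.0 / m)
        if all(torus_gap(a, b) <= side / 2 for a, b in zip(p, point)):
            adj[u].add(new_id)
            adj[new_id].add(u)
    pos[new_id] = point
    rank[new_id] = new_rank


def simulate_pbs(n, m, steps, influence_volume, seed=None):
    """Start from n isolated nodes with a random ranking and run the steps."""
    rng = random.Random(seed)
    pos = {i: tuple(rng.random() for _ in range(m)) for i in range(n)}
    order = list(range(1, n + 1))
    rng.shuffle(order)
    rank = dict(zip(range(n), order))
    adj = {i: set() for i in range(n)}
    for _ in range(steps):
        pbs_step(pos, rank, adj, m, influence_volume, rng)
    return pos, rank, adj


# Placeholder volume: shrinks as rank gets worse (illustrative choice only).
volume = lambda r, n: r ** -0.5 * n ** -0.3
pos, rank, adj = simulate_pbs(n=30, m=2, steps=200, influence_volume=volume, seed=7)
print(sorted(len(nbrs) for nbrs in adj.values()))
```

Passing the volume as a function keeps the sketch independent of any particular choice, so different ranking-to-volume rules can be compared by swapping one argument.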

Notes on PBS model

- model uses both geometry and ranking
- dynamical system: gives rise to ergodic (therefore, convergent) Markov chain
- users join and leave OSNs

- number of nodes is static: fixed at n
- order of OSNs has ceiling

- top ranked nodes have larger regions of influence

Properties of the PBS model (Bonato, Janssen, Prałat, 09)

- with high probability, the PBS model generates graphs with the following properties:
- power law degree distribution with exponent
b = 1+1/α

- average degree d = (1+o(1)) n^{1-α-β} / 2^{1-α}
- dense graph
- tends to infinity with n

- diameter D = (1+o(1)) n^{β/((1-α)m)}
- depends on dimension m
- if m = c log n, then the diameter is a constant

Dimension of an OSN, continued

- given the order of the network n, power law exponent b, average degree d, and diameter D, we can calculate m
- gives formula for dimension of OSN:
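
One way to carry out this calculation is to invert the three relations stated for the PBS model, b = 1 + 1/α, d = n^{1-α-β} / 2^{1-α} and D = n^{β/((1-α)m)}, with the (1+o(1)) factors dropped. The sketch below is a derivation under that assumption, not necessarily the formula from the original slide; the function names and the sample parameter values are illustrative.

```python
import math


def pbs_parameters(n, alpha, beta, m):
    """Forward direction: leading-order b, d, D from the model parameters."""
    b = 1 + 1 / alpha
    d = n ** (1 - alpha - beta) / 2 ** (1 - alpha)
    D = n ** (beta / ((1 - alpha) * m))
    return b, d, D


def estimate_dimension(n, b, d, D):
    """Inverse direction: solve the three relations for m (needs b > 2, D > 1)."""
    alpha = 1 / (b - 1)                      # from b = 1 + 1/alpha
    beta = 1 - alpha - math.log(2 ** (1 - alpha) * d) / math.log(n)
    return beta * math.log(n) / ((1 - alpha) * math.log(D))


# Round trip with made-up parameter values.
n, alpha, beta, m = 10 ** 6, 0.5, 0.3, 3
b, d, D = pbs_parameters(n, alpha, beta, m)
m_est = estimate_dimension(n, b, d, D)
print(b, d, D, m_est)
```

Eliminating α and β gives the equivalent closed form m = log(n / (2 d^{(b-1)/(b-2)})) / log D. Since the dropped factors are only asymptotically negligible, estimates from real network data should be read as rough.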

Uncovering the hidden reality

- reverse engineering approach
- given network data (n, b, d, D), dimension of an OSN gives smallest number of attributes needed to identify users

- that is, given the graph structure, we can (theoretically) recover the social space

Examples

Future directions

- what is a community in an OSN?
- (Porter, Onnela, Mucha,09): a set of graph partitions obtained by some “reasonable” iterative hierarchical partitioning algorithm
- motifs
- Potts method from statistical mechanics
- betweenness centrality

- lack of a formal definition, and few theorems

Spatial ranking models

- rigorously analyze spatial model with ranking by
- age
- degree

- simulate PBS model
- fit model to data
- is theoretical estimate of the dimension of an OSN accurate?

Who is popular?

- how to find popular users?
- PageRank in OSNs
- domination number
- constant in ILT model
- in OSN data, domination number is large (end-vertices)
- which is the correct graph parameter to consider?
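
The claim that the domination number stays constant in the ILT model can be checked by brute force on small cases. A dominating set of G_t still dominates G_{t+1}, because each clone x’ is adjacent to x and to every neighbour of x. The sketch below uses an exhaustive search, which is only feasible for small graphs; the starting graph C_4 and the function names are illustrative.

```python
from itertools import combinations


def ilt_step(adj):
    """One ILT time-step; the clone of node x is labelled x + n."""
    n = len(adj)
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    for x, nbrs in adj.items():
        clone = x + n
        new_adj[clone] = set(nbrs) | {x}
        new_adj[x].add(clone)
        for y in nbrs:
            new_adj[y].add(clone)
    return new_adj


def domination_number(adj):
    """Size of a smallest set S such that every node is in S or adjacent to S."""
    nodes = list(adj)
    for k in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, k):
            covered = set(candidate)
            for x in candidate:
                covered |= adj[x]
            if len(covered) == len(nodes):
                return k
    return 0  # empty graph


graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}  # C_4
numbers = []
for t in range(3):
    numbers.append(domination_number(graph))
    graph = ilt_step(graph)
print(numbers)
```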

WOSN’2010
