Exponential Random Graph Models (ERGM)

Exponential Random Graph Models (ERGM) • Michael Beckman • PAD777 • April 9, 2010

Introduction • “The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.” • “ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original.” (Hunter et al 2008) • General characteristics of ERGM • Single observation rather than successive waves • Change statistics compare observed network to random realizations • Still computes Markov or Markov-like statistics • Can model both structural and attribute parameters • Assumptions and constraints are important to estimations • Improved SE’s even where pseudolikelihood produces acceptable estimates • Goodness of fit statistics are reliable • Significant move towards true stochastic modeling of networks

Agenda • Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p* • Snijders (2002) Markov chain Monte Carlo estimation of ERGM • Robins et al (2007) Recent developments in exponential random graph (p*) models for social networks • Hunter et al (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks • Morris et al (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects • Andrew (2009) Regional integration through contracting networks

Wasserman & Robins - Intro • Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p* • Historic development of p* distribution for Markov random graphs • Frank and Strauss 1986 • Strauss and Ikeda 1990 (estimation of distribution parameters) • Wasserman and Pattison 1996 (extend parameter assumptions) • Wasserman and Robins 2005 – Family of models from dependence graphs • Versus approximate autologistic regression (pseudo-likelihood) • Standard network notation • r = 1: single relation, dichotomous data • Random variables, assumed interdependent • Can use multivariate or valued relations • Dependence graphs allow testing for independent elements in matrix X

Wasserman & Robins - Intro • Model parameters estimated from three new arrays: converse, composition, intersection of measured relations • The complement relation has no tie coded from i to j; one can view this single variable as missing

Wasserman & Robins - Intro • Consider the observed network as a subset of all possible configurations • Dependence graphs help distinguish among possible distributions, by identifying ties that are statistically independent • Dependence graph: graph of nodes whose edges signify pairs of random variables that are assumed to be conditionally dependent

Wasserman & Robins - Intro • Three classes of dependence graphs: • Bernoulli – assumption of conditional independence for each pair of ties • Empty graph, due to complete independence • Conditional uniform distribution • Dyadic independence – assumes all dyads are statistically independent • Dependence graph has edge set for each dyad • Basis for p1 model of Holland and Leinhardt (1977, 1981) • General dependence graph – arbitrary edge set with general probability distribution – basis for p*

Wasserman & Robins - Intro • Markov graphs and p* • Any two relational ties associated if they involve same actor • Observed network considered a realization x of random array X • Dependence graph D consists of any complete subgraphs, or cliques • Hammersley-Clifford theorem characterizes Pr(X=x) in the form of an exponential family of distributions • Set of non-zero parameters depends on maximal cliques

Wasserman & Robins - Intro • Estimating parameters can overwhelm the model, so constraints are needed • Impose dependence assumptions on parameters • Homogeneity – i.e., isomorphic dyads (MAN) • Higher-order configurations typically set to zero (stars, triads etc) • Constrained social settings • Exact differentiation of log likelihood is mathematically challenging • Pseudolikelihood – measures of fit problematic • MCMC – model degeneracy may be a problem • MCMC is normally preferred; improved algorithms are available and/or being developed

Snijders – MCMC Estimation • Snijders (2002) Markov chain Monte Carlo Estimation of ERGM • Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph • Exponential family of probability functions (p*) • Where y is the adjacency matrix of a digraph and the sufficient statistic u(y) is any vector of statistics of the digraph • Pseudolikelihood is not a function of the complete sufficient statistic u(Y), so not a “suitable” estimator • Dahmstrom and Dahmstrom (1993) proposed MCMC
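The formula on this slide did not survive in the transcript. As a hedged reconstruction, the exponential family that the bullets describe is usually written as follows, with y and u(y) as defined on the slide. The symbols θ (parameter vector), ψ(θ) (normalizing constant) and 𝒴 (set of all digraphs on the fixed node set) are assumptions, not taken from the slide.

```latex
P_\theta\{Y = y\} = \exp\bigl(\theta^{\top} u(y) - \psi(\theta)\bigr),
\qquad
\psi(\theta) = \log \sum_{y' \in \mathcal{Y}} \exp\bigl(\theta^{\top} u(y')\bigr)
```

The sum in ψ(θ) runs over every possible digraph, which is why exact likelihood calculations become impractical for all but very small networks.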

Snijders – MCMC Estimation • Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph • Gibbs Sampling – all elements Yij are updated randomly, one element per draw, with all other elements left unchanged • Assumes convergence as t → ∞ • Conditional distribution toggles between Yij = 1 and Yij = 0 • Can result in “severe convergence problems” • Model may not simulate effects properly, or • May result in an ‘explosion of ties’ after significant stasis • Bi-modal distribution results, consisting of high-density and low-density states or regimes • Regime is defined as a subset of the outcome space • Other regimes are possible (besides bi-modal)
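The Gibbs updating described above can be sketched in a few lines. This is a minimal illustration, not Snijders' implementation: it assumes a model with only two statistics (number of edges and number of mutual dyads), and the function names `change_stats`, `gibbs_sample` and `density` are hypothetical.

```python
import math
import random


def change_stats(y, i, j):
    """Change in (edges, mutual dyads) when y[i][j] goes from 0 to 1."""
    return (1, y[j][i])


def gibbs_sample(n, theta, sweeps, seed=0):
    """Gibbs sampler for a digraph on n nodes, started from the empty graph.

    theta = (edge parameter, reciprocity parameter); this two-term model
    is an assumption made for illustration.
    """
    rng = random.Random(seed)
    y = [[0] * n for _ in range(n)]
    for _ in range(sweeps * n * (n - 1)):
        i, j = rng.sample(range(n), 2)          # random off-diagonal cell
        delta = change_stats(y, i, j)
        log_odds = sum(t * d for t, d in zip(theta, delta))
        p_one = 1.0 / (1.0 + math.exp(-log_odds))
        # Draw Yij from its conditional distribution, all else unchanged.
        y[i][j] = 1 if rng.random() < p_one else 0
    return y


def density(y):
    """Share of possible directed ties that are present."""
    n = len(y)
    return sum(map(sum, y)) / (n * (n - 1))
```

With both parameters at zero every tie is a fair coin flip, so the simulated density settles near 0.5; a strongly negative edge parameter gives a sparse graph. Models with star or triangle terms would need richer change statistics, and those are the models where the convergence problems listed above tend to appear.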

Snijders – MCMC Estimation • Reciprocity p* model – # of edges and reciprocity • Assumes dyadic independence • Probabilities calculated for MAN • Independence assumption precludes the ‘explosion’ effect • Twostar p* model – # of edges and out-twostars • Rows in adjacency matrix are statistically independent • If the total number of ties Y++ is fixed, number of out-twostars is a linear function of out-degree variance • Combined reciprocity and twostar p* model – density, reciprocity, out-twostar • Transforms digraph into its complement • Changes Yij to (1 – Yij) • Density must be set to 0.5 • Simulates graphs equal to, less than or greater than 0.5 density • Can result in the “explosion effect” • In effect, results are determined by initial state (high or low density)
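The claim that out-twostars are a linear function of out-degree variance once the number of ties is fixed can be checked numerically. The sketch below counts out-twostars directly and then again from the population variance of the out-degrees; the function names are hypothetical.

```python
from math import comb
from statistics import pvariance


def out_twostars(y):
    """Count out-twostars: pairs of ties sent by the same node."""
    return sum(comb(sum(row), 2) for row in y)


def out_twostars_from_variance(y):
    """Same count, written as a linear function of out-degree variance.

    With n nodes and total ties m fixed:
        twostars = (n * var + m**2 / n - m) / 2
    where var is the population variance of the out-degrees.
    """
    n = len(y)
    degrees = [sum(row) for row in y]
    m = sum(degrees)
    return (n * pvariance(degrees) + m ** 2 / n - m) / 2


example = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
```

For `example` the out-degrees are 3, 1, 2 and 0, so both functions give 4 out-twostars.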

Snijders – MCMC Estimation • Gibbs sampling algorithm • For every two outcomes, there is a positive probability to go from one outcome to the other in finite steps, but • It is possible one regime is dominant, so that sojourn time from one state to the other is practically infinite, so • Initial state determines outcome with 0.5 probability – coin toss • Three problems arise • Bi-modal distribution is undesirable for single network observation • Convergence with two regimes can be so slow that generating a random draw is practically impossible • Expected values of sufficient statistics are extremely sensitive to parameter values, causing instability of estimation • Other iteration procedures have been proposed and tested

Snijders – MCMC Estimation • Detailed balance technique • Set of all adjacency matrices Yg • Results in unique stationary distribution • Small updating steps – one element of Yij per step, as with Gibbs sampling • Cell being updated is random, rather than deterministic • Referred to as mixing, versus cycling • Metropolis-Hastings algorithm - Changes Yij to (1 – Yij), all other ties constant • Updates more frequently than Gibbs, so more efficient • Dyadic or triplet updating steps – update several elements per step • Dyad or triplets chosen randomly • “Groupwise” updating • Slower to converge
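A minimal sketch of the Metropolis-Hastings toggle step, under the same illustrative assumption of an edges-plus-reciprocity model. A random cell is proposed for flipping from Yij to (1 – Yij), and the flip is accepted with probability min(1, exp(θ·Δ)), which satisfies detailed balance because the proposal is symmetric. The function names are hypothetical.

```python
import math
import random


def mh_sample(n, theta, steps, seed=0):
    """Metropolis-Hastings sampler that toggles one tie per step.

    theta = (edge parameter, reciprocity parameter); the two-term model
    is an assumption made for illustration.
    """
    rng = random.Random(seed)
    y = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        sign = 1 if y[i][j] == 0 else -1        # adding or removing the tie
        delta = (sign, sign * y[j][i])          # change in (edges, mutuals)
        log_ratio = sum(t * d for t, d in zip(theta, delta))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            y[i][j] = 1 - y[i][j]               # accept the toggle
    return y


def tie_density(y):
    """Share of possible directed ties that are present."""
    n = len(y)
    return sum(map(sum, y)) / (n * (n - 1))
```

Unlike the Gibbs step, an accepted proposal always changes the graph, which is one way to read the slide's remark that this scheme updates more frequently.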

Snijders – MCMC Estimation • Large updating steps – update Yij from 0 to 1 or vice versa in blocks • Biggest step is converting graph to its complement (inversion) • Satisfies the detailed balance equation • May be appropriate for bimodal distributions • Inversion may reduce variance in estimation (conditioning) • Fixed density – only digraphs with given number of ties are drawn • Random undirected graphs – applied to half matrix of unique elements • ML estimation – not easily applied to exponential random graphs, due to problematic calculation for complex models • Pseudolikelihood estimates can be good, but standard errors are too low • Markov chain Monte Carlo estimates • Monte Carlo simulation of Markov graph estimates moments • Moments are used to estimate parameter effects for a neighborhood

Snijders – MCMC Estimation • MCMC: Newton-Raphson Algorithm and Robbins-Monro Algorithm similar • Robbins-Monro Algorithm – three phases • Estimate diagonal matrix using derivative of initial parameter estimate • Iteratively determines provisional estimation values, leads quickly to solution of moment equation • Large steps can lead to instability • Parameter value is kept constant, then large number of steps used to check validity of equation • Use of MC with Robbins-Monro yields, in theory, convergence probability of 1 • Snijders recommends use of inversion steps for models with triplet counts
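The three phases can be illustrated on the simplest possible case. The sketch below is a simplification under stated assumptions, not Snijders' algorithm: the model has a single edge parameter, so graphs can be simulated exactly without MCMC; phase 1 uses the sample variance of the statistic as the derivative estimate (valid for exponential families); and phase 2 uses a plain 1/step gain rather than subphases with averaging. All names are hypothetical.

```python
import math
import random
import statistics


def simulate_edge_count(theta, n_dyads, rng):
    """Draw the edge count of a Bernoulli graph with log-odds theta."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(rng.random() < p for _ in range(n_dyads))


def robbins_monro_edges(u_obs, n_dyads, theta0=0.0,
                        n1=50, n2=2000, n3=500, seed=0):
    """Solve the moment equation E_theta[u(Y)] = u_obs for one parameter."""
    rng = random.Random(seed)

    # Phase 1: estimate the derivative of E[u] at the initial value.
    draws = [simulate_edge_count(theta0, n_dyads, rng) for _ in range(n1)]
    d = statistics.variance(draws)

    # Phase 2: iterative updates with a decreasing gain.
    theta = theta0
    for step in range(1, n2 + 1):
        u_sim = simulate_edge_count(theta, n_dyads, rng)
        theta -= (1.0 / step) * (u_sim - u_obs) / d

    # Phase 3: hold theta constant and check the moment equation.
    check = [simulate_edge_count(theta, n_dyads, rng) for _ in range(n3)]
    t_ratio = (statistics.mean(check) - u_obs) / statistics.stdev(check)
    return theta, t_ratio
```

With 30 observed ties among 100 dyads the estimate should land close to log(0.3/0.7), about -0.85, and the phase 3 t-ratio should be close to zero.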

Robins et al – Recent Developments • Robins et al (2007) Recent developments in exponential random graph (p*) models for social networks • Technically, MCMC estimation does not converge due to degeneracy problem – “near degenerate” • Problem is more acute as network size grows larger • Inclusion of suitable constraints on parameters allows for estimation • Parameters then provide information on structural effects • Recall from Snijders problem of bimodal distribution/model degeneration • Gradual increase in triangle parameter does not lead to gradual increase in graph triangulation, so inclusion of star/triangle parameters does not overcome problem


Robins et al – Recent Developments • Inclusion of higher-order structures • Alternating k-stars • Alternating k-triangles • Alternating independent two-paths • Alternating k-stars, technically only structure still a Markov random graph • Assumption allows stars up to (n-1) • Recall in previous models, higher-order stars normally set to 0 • In alternating k-star, higher-order stars are allowed • Impact of higher-order stars is gradually diminished • Essentially, there is weighting of structure from simple to complex • Allows for interesting inference regarding network structure • Positive parameter indicates “hubs” in node structure • Negative parameter indicates smaller variance in degree (decentralized)

Robins et al – Recent Developments • Interpreting alternating k-star models • Positive parameter – tendency toward large number of low-degree nodes, and small number of high-degree nodes • Node degree may become saturated • Increase in “popularity” plateaus: additional ties do not “add value” • Indicative of a loose core-periphery structure • Alternation between positive and negative values helps prevent graph distribution from being forced to empty or complete graphs (à la Snijders et al 2006)
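The alternating k-star statistic can be sketched directly from a degree sequence. The version below assumes the form given in Snijders et al (2006): k-star counts S_k summed from k = 2 to n-1 with alternating signs and weights 1/λ^(k-2). It also shows the equivalent closed form in terms of degrees, which follows from the binomial theorem. The function names and the choice λ = 2 in the example are assumptions.

```python
from math import comb


def k_star_count(degrees, k):
    """Number of k-stars: each node of degree d contributes C(d, k)."""
    return sum(comb(d, k) for d in degrees)


def alternating_k_stars(degrees, lam):
    """Sum over k = 2..n-1 of (-1)^k * S_k / lam^(k-2)."""
    n = len(degrees)
    return sum((-1) ** k * k_star_count(degrees, k) / lam ** (k - 2)
               for k in range(2, n))


def alternating_k_stars_closed_form(degrees, lam):
    """Same statistic written as a geometrically weighted degree sum."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degrees)


# A hub with four leaves: one node of degree 4, four nodes of degree 1.
star_degrees = [4, 1, 1, 1, 1]
```

For the hub-and-leaves example with λ = 2 the terms are 6 - 4/2 + 1/4 = 4.25: the higher-order stars still count, but with geometrically shrinking weight.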

Robins et al – Recent Developments • Alternating k-triangles introduce conditional dependence • In short, two possible edges in a graph, Yrs and Yuv, for distinct nodes r, s, u, v, are assumed to be conditionally dependent if Ysu = Yvr = 1. • In other words, if the two possible edges in the graph were actually observed, they would create a 4-cycle. • Defines social circuit dependence • Chance of Yrs is conditionally dependent on presence of Yuv • Snijders et al (2006) combine k-triangles with Markov dependence • K-triangle is combination of individual triangles that share one edge (base) • Shared adjacency with other nodes are triangle sides • Conditionally dependent structure, IF either Markov configuration (shared node), or Social Circuit Configuration (4-cycle)

Robins et al – Recent Developments • Interpreting k-triangles • Positive parameter provides evidence of transitivity effects • Also can suggest core-periphery structure, but due to triangulation rather than popularity influence • More of a structural effect than an attribute effect • i.e., an outcome of the triangulation process • Alternating k-twopaths • Lower order structure • Combine with k-triangles • Distinguish tendency to form ties at base versus side of triangle • Side edges absent base edges indicates precondition to transitivity • Presence of base edge indicates transitive closure • Combination of parameters can indicate pressure towards closure
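A small sketch of the alternating k-triangle statistic for an undirected graph. It assumes the shared-partner form from Snijders et al (2006), in which each existing edge contributes λ(1 - (1 - 1/λ)^L), with L the number of partners shared by its two endpoints; this equals 3T_1 - T_2/λ + T_3/λ² and so on. The adjacency-set representation and function names are hypothetical.

```python
from itertools import combinations


def shared_partners(adj, i, j):
    """Number of nodes adjacent to both i and j."""
    return len(adj[i] & adj[j])


def alternating_k_triangles(adj, lam):
    """Alternating k-triangle statistic, summed over existing edges.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    total = 0.0
    for i, j in combinations(sorted(adj), 2):
        if j in adj[i]:
            total += lam * (1 - (1 - 1 / lam) ** shared_partners(adj, i, j))
    return total


# One triangle, and a 2-triangle (two triangles sharing the base 0-1).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
two_triangle = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
```

With λ = 2 the single triangle scores 3, and the 2-triangle scores 3·2 - 1/2 = 5.5: the base edge, which closes two triangles at once, adds less than two separate triangles would.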

Robins et al – Recent Developments • Other possible parameters

Robins et al – Recent Developments • Estimating parameters • MCMC is preferred method, when available • When model converges, simulation produces distribution of graphs in which observed graph is typical for all effects • Reliable standard errors • Snijders et al (2006) conditioned on edges • No density parameter • Diminishes degeneracy problem with moderate impact on other parameters • Robins et al find that, at least for smaller networks, conditioning on edges may not be needed

Robins et al – Recent Developments • Modeling with SIENA • Output of estimates, standard error, t-stat for estimate (how well model converges) • t-ratio close to zero = good convergence of model • Large ratios may indicate model has not converged, or is degenerate • For non-degenerate models, an absolute value below 0.1 indicates convergence • Other tests in SIENA • Hysteresis analysis • Simulate from estimates and compare with observed graph • Modeling with statnet • Newton-Raphson algorithm • Fewer simulation runs, then weights graphs for estimating • Incorporates advances from Metropolis-Hastings
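The convergence t-ratio mentioned above can be sketched as the gap between the mean simulated statistic and the observed statistic, divided by the standard deviation of the simulated values. This is an illustration of the idea, not SIENA's code; the function name and the toy numbers are hypothetical.

```python
from statistics import mean, stdev


def convergence_t_ratios(observed, simulated):
    """One t-ratio per statistic.

    observed:  list of observed statistic values
    simulated: list of tuples, one tuple of statistics per simulated graph
    """
    ratios = []
    for k, obs in enumerate(observed):
        draws = [stats[k] for stats in simulated]
        ratios.append((mean(draws) - obs) / stdev(draws))
    return ratios


# Toy values: (edges, triangles) for four simulated graphs.
toy_simulated = [(9, 4), (11, 5), (10, 3), (10, 4)]
```

If the observed statistics were (10, 4), both ratios would be zero, well inside the 0.1 threshold. If the observed edge count were 8, the first ratio would be about 2.45, a sign that the simulated graphs are not centred on the observed one.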

Robins et al – Recent Developments • Comparing pseudolikelihood to MCMC • UCINET datasets, SIENA modeling

Hunter et al – Package to Fit… • Hunter et al (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks • Implementing ERGM in R/statnet • Specify ERGM • Approximate/exact MLE • Goodness of fit tests • “The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.” • “ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original.”

Hunter et al – Package to Fit… • Implementing ERGM in R/statnet – variables • Endogenous – result of structure • Exogenous – attribute based (can serve as predictors) • Attributes can be treated as functions of nodal covariates • Statistics depend on attribute and relationship information • Change statistics – recall we are comparing conditional distribution toggled between Yij = 1 and Yij = 0 (or some other Markov configuration) • Particular choice g() of statistics • Particular network y • Particular pair of nodes (i,j) • Seed can be specified for reproducibility
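The three ingredients listed for change statistics (a choice of statistics g, a network y, and a pair of nodes (i, j)) combine as follows. This is a sketch in commonly used notation, assumed rather than copied from the slide: y⁺ᵢⱼ and y⁻ᵢⱼ denote y with the (i, j) tie set to 1 and 0 respectively, yᶜᵢⱼ is the rest of the network, and θ is the parameter vector.

```latex
\delta_g(y)_{ij} = g\bigl(y^{+}_{ij}\bigr) - g\bigl(y^{-}_{ij}\bigr),
\qquad
\operatorname{logit} P_\theta\bigl(Y_{ij} = 1 \mid Y^{c}_{ij} = y^{c}_{ij}\bigr)
  = \theta^{\top} \delta_g(y)_{ij}
```

Read this way, each parameter is the change in the conditional log-odds of a tie per unit change in the corresponding statistic.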

Hunter et al – Package to Fit… • Dyadic independence models • Dyadic independence term • Term in an ERGM for which change statistics can be calculated regardless of value of (i,j) or any knowledge of y • Dyadic independence ERGM • All terms in the model are dyadic independence terms • This model is purely stochastic • For undirected models, unconditional or marginal probability is allowed • Important to distinguish between dyadic and linear independence • Linear dependencies can arise with either form above • Implications for model specification • Statnet eliminates/allows for elimination of statistics as needed

Hunter et al – Package to Fit… • Dyadic dependence models • Dyads that do not share a node are conditionally independent • Analogous to nearest neighbor • Homogeneity condition may be added as a constraint • All isomorphic networks have same probability • Problems with model as previously discussed • Correctives suggested: • combine terms (endogenous and exogenous) • Specify triad-based curved exponential family terms • Geometrically weighted degree (GWD) • Geometrically weighted edgewise shared partner (GWESP) • Geometrically weighted dyadwise shared partner (GWDSP)

Hunter et al – Package to Fit… • Curved exponential family model

Hunter et al – Package to Fit… • Estimation and goodness of fit • Parameters: • Edges • Homophily term for grade • Main effect for sex • P. 23
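To make the three parameters concrete, the sketch below computes the statistic behind each one for a small undirected network: the number of edges, the number of edges joining nodes in the same grade (homophily), and the number of edge endpoints belonging to a given sex (main effect). The toy data, attribute codings and function name are all hypothetical, and the correspondence to specific statnet model terms is an assumption.

```python
def model_statistics(edges, grade, sex, sex_level):
    """Return (edge count, same-grade edges, endpoints with sex_level).

    edges: list of (i, j) pairs for an undirected network
    grade, sex: dicts mapping node -> attribute value
    """
    n_edges = len(edges)
    same_grade = sum(grade[i] == grade[j] for i, j in edges)
    sex_endpoints = sum((sex[i] == sex_level) + (sex[j] == sex_level)
                        for i, j in edges)
    return n_edges, same_grade, sex_endpoints


# Toy network: four students connected in a path.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_grade = {0: 7, 1: 7, 2: 8, 3: 8}
toy_sex = {0: "F", 1: "M", 2: "F", 3: "M"}
```

Here `model_statistics(toy_edges, toy_grade, toy_sex, "M")` gives 3 edges, 2 same-grade edges and 3 endpoints coded "M". A fitted model attaches one coefficient to each of these counts.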

Morris et al – Specification of ERGM • Morris et al (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects • Where Hunter et al focused more on theory and statistical formulas, Morris et al provide basic instruction on implementing ERGM in R/statnet • Commands for basic effects, nodal attributes, relational attributes, structural configurations, higher-order configurations, actor-specific effects, constraints • Tips to fine-tune algorithm and processing • Appendix A Table of Model Terms provides quick reference for what terms are appropriate to a particular model • i.e., directed/undirected, bipartite, dyadic independence etc.

Morris et al – Specification of ERGM • Constraints • Model must include space of all possible networks • Some networks are bipartite – communication between but never within groups of nodes • ERGM automatically implements these constraints as needed

Andrew – Regional Integration • Andrew (2009) Regional integration through contracting networks • Research question: Under what conditions do local governments choose to contract for services, or enter into regional agreements for the provision of services? • Two hypotheses are advanced: • Bonding hypothesis – in the presence of uncertainty and complexity of interjurisdictional activities, a highly dense network structure will emerge over time • Bridging hypothesis – for interjurisdictional activities involving high asset specificity, a sparse, “core-periphery” network is anticipated • Institutional collective action framework – transaction cost analysis, enforcement and monitoring, free-rider problem

Andrew – Regional Integration • Bonding – local officials attracted to interjurisdictional, voluntary cooperation agreements • Flexible, non-binding, fosters “norm of reciprocity” • Can be constrained by local politics and coordination costs • Bridging – in asset-specific dilemma, local officials likely to choose strategic partner • May produce services in-house • Induce competition to attenuate opportunism of central actor • Expected to contract with partner who already has ties with other jurisdictions

Andrew – Regional Integration • Research Design: • Contractual ties among law enforcement community in Orlando-Kissimmee • Five waves from 1986 to 2003 • 66 total actors • List of goods & services derived from International City/County Management Association surveys • Studying one metropolitan area controls for geographic variation and allows for in-depth analysis of regional integration


Andrew – Regional Integration • Parameters • Transitive triads • Geodesic distance-2 • Covariate effects • Importance of level of government, where municipality is coded 1 and higher level government is treated as benchmark • Importance of professionalism, indicated by accreditation • Both coded as dummy variables, treated as control variables • Homophily effect • Rate parameters were all positive and significant • t-ratio less than 0.3, indicating no problems with convergence (?)

Andrew – Regional Integration • P. 392
