Exponential Random Graph Models (ERGM). Michael Beckman PAD777 April 9, 2010. Introduction. “The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.”
Exponential Random Graph Models (ERGM) Michael Beckman PAD777 April 9, 2010
Introduction • “The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.” • “ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original.” (Hunter et al 2008) • General characteristics of ERGM • Single observation rather than successive waves • Change statistics compare observed network to random realizations • Still computes Markov or Markov-like statistics • Can model both structural and attribute parameters • Assumptions and constraints are important to estimations • Improved SE’s even where pseudolikelihood produces acceptable estimates • Goodness of fit statistics are reliable • Significant move towards true stochastic modeling of networks
Agenda • Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p* • Snijders (2002) Markov chain Monte Carlo estimation of ERGM • Robins et al (2007) Recent developments in exponential random graph (p*) models for social networks • Hunter et al (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks • Morris et al (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects • Andrew (2009) Regional integration through contracting networks
Wasserman & Robins - Intro • Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p* • Historic development of p* distribution for Markov random graphs • Frank and Strauss 1986 • Strauss and Ikeda 1990 (estimation of distribution parameters) • Wasserman and Pattison 1996 (extend parameter assumptions) • Wasserman and Robins 2005 – Family of models from dependence graphs • Versus approximate autologistic regression (pseudo-likelihood) • Standard network notation • r = 1: single relation, dichotomous data • Random variables, assumed interdependent • Can use multivariate or valued relations • Dependence graphs allow testing for independent elements in matrix X
Wasserman & Robins - Intro • Model parameters estimated from three new arrays: converse, composition, intersection of measured relations • The complement relation has no tie coded from i to j; one can view this single variable as missing
Wasserman & Robins - Intro • Consider the observed network as a subset of all possible configurations • Dependence graphs help distinguish among possible distributions, by identifying ties that are statistically independent • Dependence graph: graph of nodes whose edges signify pairs of random variables that are assumed to be conditionally dependent
Wasserman & Robins - Intro • Three classes of dependence graphs: • Bernoulli – assumption of conditional independence for each pair of ties • Empty graph, due to complete independence • Conditional uniform distribution • Dyadic independence – assumes all dyads are statistically independent of one another • Dependence graph has an edge only within each dyad • Basis for p1 model of Holland and Leinhardt (1977, 1981) • General dependence graph – arbitrary edge set with general probability distribution – basis for p*
Wasserman & Robins - Intro • Markov graphs and p* • Any two relational ties associated if they involve same actor • Observed network considered a realization x of random array X • Dependence graph D consists of any complete subgraphs, or cliques • Hammersley-Clifford theorem characterizes Pr(X=x) in the form of an exponential family of distributions • Set of non-zero parameters depends on maximal cliques
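For reference, the exponential-family form that the Hammersley-Clifford theorem yields is usually written as below. The symbols κ (normalizing constant), λ_A (parameter attached to a set of ties A) and N_D (node set of the dependence graph D) follow the common p* notation and are assumed here rather than taken from the slide.

$$
\Pr(X = x) \;=\; \frac{1}{\kappa}\,\exp\!\Big(\sum_{A \subseteq N_D} \lambda_A \prod_{(i,j) \in A} x_{ij}\Big),
\qquad \lambda_A = 0 \ \text{unless } A \text{ is a clique of } D .
$$

Because only cliques of D carry non-zero parameters, the dependence assumptions determine which configurations (edges, stars, triangles and so on) can appear in the model.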
Wasserman & Robins - Intro • Estimating parameters can overwhelm the model, so constraints are needed • Impose dependence assumptions on parameters • Homogeneity – i.e., isomorphic dyads (MAN) • Higher-order configurations typically set to zero (stars, triads, etc.) • Constrained social settings • Exact differentiation of log likelihood is mathematically challenging • Pseudolikelihood – measures of fit problematic • MCMC – model degeneracy may be a problem • MCMC is normally preferred; improved algorithms are available and/or being developed
Snijders – MCMC Estimation • Snijders (2002) Markov chain Monte Carlo Estimation of ERGM • Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph • Exponential family of probability functions (p*) • Where y is the adjacency matrix of a digraph and the sufficient statistic u(y) is any vector of statistics of the digraph • Pseudolikelihood is not a function of the complete sufficient statistic u(Y), so it is not a “suitable” estimator • Dahmström and Dahmström (1993) proposed MCMC
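The exponential family referred to on this slide is conventionally written as follows. The formula did not survive on the slide itself; the notation θ for the parameter vector and ψ(θ) for the normalizing term is the usual one and is assumed here.

$$
P_\theta\{Y = y\} \;=\; \exp\!\big(\theta^{\top} u(y) - \psi(\theta)\big),
\qquad
\psi(\theta) \;=\; \log \sum_{y'} \exp\!\big(\theta^{\top} u(y')\big).
$$

The sum in ψ(θ) runs over all digraphs on the node set, which is why exact likelihood calculations are impractical for anything but very small networks.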
Snijders – MCMC Estimation • Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph • Gibbs Sampling – all elements Yij are updated randomly, one element per draw, with all other elements left unchanged • Assumes convergence as t → ∞ • Conditional distribution toggles between Yij = 1 and Yij = 0 • Can result in “severe convergence problems” • Model may not simulate effects properly, or • May result in an ‘explosion of ties’ after significant stasis • Bi-modal distribution results, consisting of high-density and low-density states or regimes • Regime is defined as a subset of the outcome space • Other regimes are possible (besides bi-modal)
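The Gibbs step described above can be sketched in a few lines of Python. This is a minimal illustration, not Snijders' implementation: it assumes a directed graph with only two statistics (ties and mutual dyads), and the names `change_stats`, `gibbs_step` and `simulate` are hypothetical.

```python
import math
import random


def change_stats(y, i, j):
    """Change in (tie count, mutual-dyad count) when Y_ij goes from 0 to 1."""
    return (1.0, float(y[j][i]))


def gibbs_step(y, theta, rng):
    """Redraw one randomly chosen cell Y_ij from its full conditional."""
    n = len(y)
    i, j = rng.sample(range(n), 2)  # two distinct nodes, so no self-ties
    delta = change_stats(y, i, j)
    # Conditional log-odds of Y_ij = 1 given the rest of the graph.
    logit = sum(t * d for t, d in zip(theta, delta))
    p_one = 1.0 / (1.0 + math.exp(-logit))
    y[i][j] = 1 if rng.random() < p_one else 0
    return y


def simulate(n, theta, steps, seed=0):
    """Run the chain from the empty digraph (an assumed starting state)."""
    rng = random.Random(seed)
    y = [[0] * n for _ in range(n)]
    for _ in range(steps):
        gibbs_step(y, theta, rng)
    return y
```

For example, `simulate(6, (-1.0, 0.5), 2000, seed=1)` returns a 6×6 adjacency matrix. With only ties and reciprocity in the model the dyads are independent, so this particular sketch will not show the "explosion" effect; that requires adding star or triangle terms to `change_stats`.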
Snijders – MCMC Estimation • Reciprocity p* model – number of edges and reciprocity • Assumes dyadic independence • Probabilities calculated for MAN • Independence assumption precludes the ‘explosion’ effect • Twostar p* model – number of edges and out-twostars • Rows in adjacency matrix are statistically independent • If the total number of ties Y++ is fixed, number of out-twostars is a linear function of out-degree variance • Combined reciprocity and twostar p* model – density, reciprocity, out-twostar • Transforms digraph into its complement • Changes Yij to (1 – Yij) • Density must be set to 0.5 • Simulates graphs equal to, less than or greater than 0.5 density • Can result in the “explosion effect” • In effect, results are determined by initial state (high or low density)
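The link between out-twostars and out-degree variance mentioned above follows from a short calculation. Here n is the number of nodes, Y_{i+} the out-degree of node i, Y_{++} the total number of ties, and s² the out-degree variance computed with divisor n (the divisor is an assumption made for this sketch).

$$
\sum_{i} \binom{Y_{i+}}{2}
= \tfrac12 \sum_i Y_{i+}^2 - \tfrac12\, Y_{++},
\qquad
s^2 = \frac1n \sum_i Y_{i+}^2 - \Big(\frac{Y_{++}}{n}\Big)^{2}
$$

$$
\Longrightarrow\quad
\sum_{i} \binom{Y_{i+}}{2}
= \frac{n}{2}\, s^2 + \frac{Y_{++}^2}{2n} - \frac{Y_{++}}{2}.
$$

With n and Y_{++} held fixed, the last two terms are constants, so the out-twostar count is linear in s².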
Snijders – MCMC Estimation • Gibbs sampling algorithm • For every two outcomes, there is a positive probability to go from one outcome to the other in finite steps, but • It is possible one regime is dominant, so that sojourn time from one state to the other is practically infinite, so • Initial state determines outcome with 0.5 probability – coin toss • Three problems arise • Bi-modal distribution is undesirable for single network observation • Convergence with two regimes can be so slow that generating a random draw is practically impossible • Expected values of sufficient statistics are extremely sensitive to parameter values, causing instability of estimation • Other iteration procedures have been proposed and tested
Snijders – MCMC Estimation • Detailed balance technique • Set of all adjacency matrices Yg • Results in unique stationary distribution • Small updating steps – one element of Yij per step, as with Gibbs sampling • Cell being updated is random, rather than deterministic • Referred to as mixing, versus cycling • Metropolis-Hastings algorithm - Changes Yij to (1 – Yij), all other ties constant • Updates more frequently than Gibbs, so more efficient • Dyadic or triplet updating steps – update several elements per step • Dyad or triplets chosen randomly • “Groupwise” updating • Slower to converge
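A single-tie Metropolis-Hastings step of the kind described above might look like the following sketch. It assumes a digraph with ties and out-twostars as the statistics, recomputes the statistics by brute force for clarity rather than speed, and uses hypothetical names (`stats`, `mh_step`).

```python
import math
import random


def stats(y):
    """Sufficient statistics u(y): (number of ties, number of out-twostars)."""
    out_deg = [sum(row) for row in y]
    ties = sum(out_deg)
    twostars = sum(d * (d - 1) // 2 for d in out_deg)
    return (ties, twostars)


def mh_step(y, theta, rng):
    """Propose toggling one random cell; return True if the toggle is accepted."""
    n = len(y)
    i, j = rng.sample(range(n), 2)  # two distinct nodes
    before = stats(y)
    y[i][j] = 1 - y[i][j]  # proposal: Y_ij -> 1 - Y_ij, all other ties constant
    after = stats(y)
    # Log of the probability ratio P(proposed) / P(current); the normalizing
    # constant cancels, so only the change in statistics is needed.
    log_ratio = sum(t * (a - b) for t, a, b in zip(theta, after, before))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return True
    y[i][j] = 1 - y[i][j]  # rejected: undo the toggle
    return False
```

Because the proposal is symmetric (a toggle can always be reversed by the same toggle), accepting with probability min(1, ratio) satisfies the detailed balance condition for the target distribution.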
Snijders – MCMC Estimation • Large updating steps – update Yij from 0 to 1 or vice versa in blocks • Biggest step is converting graph to its complement (inversion) • Satisfies the detailed balance equation • May be appropriate for bimodal distributions • Inversion may reduce variance in estimation (conditioning) • Fixed density – only digraphs with given number of ties are drawn • Random undirected graphs – applied to half matrix of unique elements • ML estimation – not easily applied to exponential random graphs, due to problematic calculation for complex models • Pseudolikelihood estimates can be good, but standard errors are too low • Markov chain Monte Carlo estimates • Monte Carlo simulation of Markov graph estimates moments • Moments are used to estimate parameter effects for a neighborhood
Snijders – MCMC Estimation • MCMC: Newton-Raphson Algorithm and Robbins-Monro Algorithm are similar • Robbins-Monro Algorithm – three phases • Estimate diagonal matrix using derivative of initial parameter estimate • Iteratively determines provisional estimation values, leads quickly to solution of moment equation • Large steps can lead to instability • Parameter value is kept constant, then large number of steps used to check validity of equation • Use of MC with Robbins-Monro yields, in theory, convergence probability of 1 • Snijders recommends use of inversion steps for models with triplet counts
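In symbols, the moment equation and the Robbins-Monro update in the second phase can be sketched as below. The notation is assumed for illustration: u_0 is the observed statistic vector, Y_N is a graph simulated at the current estimate, D_0 is the diagonal scaling matrix from the first phase, and a_N is a gain sequence that decreases toward zero.

$$
\mathrm{E}_{\theta}\, u(Y) = u_0,
\qquad
\hat\theta_{N+1} = \hat\theta_N - a_N\, D_0^{-1}\big(u(Y_N) - u_0\big),
\quad Y_N \sim P_{\hat\theta_N}.
$$

Each step nudges the estimate in the direction that reduces the gap between simulated and observed statistics; a shrinking a_N is what keeps large, destabilizing steps from persisting.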
Robins et al – Recent Developments • Robins et al (2007) Recent developments in exponential random graph (p*) models for social networks • Technically, MCMC estimation does not converge due to degeneracy problem – “near degenerate” • Problem is more acute as network size grows larger • Inclusion of suitable constraints on parameters allows for estimation • Parameters then provide information on structural effects • Recall from Snijders problem of bimodal distribution/model degeneration • Gradual increase in triangle parameter does not lead to gradual increase in graph triangulation, so inclusion of star/triangle parameters does not overcome problem
Robins et al – Recent Developments • Inclusion of higher-order structures • Alternating k-stars • Alternating k-triangles • Alternating independent two-paths • Alternating k-stars, technically only structure still a Markov random graph • Assumption allows stars up to (n-1) • Recall in previous models, higher-order stars normally set to 0 • In alternating k-star, higher-order stars are allowed • Impact of higher-order stars is gradually diminished • Essentially, there is weighting of structure from simple to complex • Allows for interesting inference regarding network structure • Positive parameter indicates “hubs” in node structure • Negative parameter indicates smaller variance in degree (decentralized)
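The down-weighting of higher-order stars can be made concrete with a small calculation. The sketch below assumes the common form of the alternating k-star statistic, u = Σ_{k=2}^{n-1} (−1)^k S_k / λ^{k−2}, where S_k is the number of k-stars and λ is a fixed weighting constant; the function name and the default λ = 2 are assumptions for illustration.

```python
from math import comb


def alternating_k_star(degrees, lam=2.0):
    """Alternating k-star statistic from an undirected degree sequence."""
    n = len(degrees)
    total = 0.0
    for k in range(2, n):  # k = 2, ..., n-1
        # A node of degree d is the centre of C(d, k) distinct k-stars.
        s_k = sum(comb(d, k) for d in degrees)
        # Signs alternate and weights shrink geometrically with k.
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total
```

For a star with one hub and three leaves, `alternating_k_star([3, 1, 1, 1])` gives 3 two-stars minus half of the single three-star, i.e. 2.5. Larger λ makes the higher-order terms fade faster.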
Robins et al – Recent Developments • Interpreting alternating k-star models • Positive parameter – tendency toward large number of low degree nodes, and small number of high-degree nodes • Node degree may become saturated • Increase in “popularity” plateaus: additional ties do not “add value” • Indicative of a loose core-periphery structure • Alternation between positive and negative values helps prevent the graph distribution from being forced to empty or complete graphs (à la Snijders et al 2006)
Robins et al – Recent Developments • Alternating k-triangles introduce conditional dependence • In short, two possible edges in a graph, Yrs and Yuv, for distinct nodes r, s, u, v, are assumed to be conditionally dependent if Ysu = Yrv = 1. • In other words, if the two possible edges in the graph were actually observed, they would create a 4-cycle. • Defines social circuit dependence • Chance of Yrs is conditionally dependent on presence of Yuv • Snijders et al (2006) combine k-triangles with Markov dependence • K-triangle is combination of individual triangles that share one edge (base) • Shared adjacency with other nodes are triangle sides • Conditionally dependent structure, IF either Markov configuration (shared node), or Social Circuit Configuration (4-cycle)
Robins et al – Recent Developments • Interpreting k-triangles • Positive parameter provides evidence of transitivity effects • Also can suggest core-periphery structure, but due to triangulation rather than popularity influence • More of a structural effect than an attribute effect • i.e., outcome of the triangulation process • Alternating k-twopaths • Lower order structure • Combine with k-triangles • Distinguish tendency to form ties at base versus side of triangle • Side edges absent base edges indicates precondition to transitivity • Presence of base edge indicates transitive closure • Combination of parameters can indicate pressure towards closure
Robins et al – Recent Developments • Other possible parameters
Robins et al – Recent Developments • Estimating parameters • MCMC is preferred method, when available • When model converges, simulation produces distribution of graphs in which observed graph is typical for all effects • Reliable standard errors • Snijders et al (2006) conditioned on edges • No density parameter • Diminishes degeneracy problem with moderate impact on other parameters • Robins et al find that, at least for smaller networks, conditioning on edges may not be needed
Robins et al – Recent Developments • Modeling with SIENA • Output of estimates, standard error, t-ratio for each estimate (how well model converges) • t-ratio close to zero = good convergence of model • Large ratios may indicate model has not converged, or is degenerate • For non-degenerate models, an absolute value below 0.1 indicates convergence • Other tests in SIENA • Hysteresis analysis • Simulate from estimates and compare with observed graph • Modeling with statnet • Newton-Raphson algorithm • Fewer simulation runs, then weights graphs for estimating • Incorporates advances from Metropolis-Hastings
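The convergence check above can be illustrated with a small helper. This is only a sketch of the idea, assuming the t-ratio is the gap between the mean of a statistic over simulated graphs and its observed value, scaled by the simulated standard deviation; `convergence_t_ratio` is a hypothetical name and not a SIENA function.

```python
from statistics import mean, stdev


def convergence_t_ratio(simulated, observed):
    """(mean of simulated statistic - observed value) / simulated std. dev."""
    return (mean(simulated) - observed) / stdev(simulated)


def looks_converged(simulated, observed, threshold=0.1):
    """Apply the |t| < 0.1 rule of thumb quoted on the slide."""
    return abs(convergence_t_ratio(simulated, observed)) < threshold
```

For instance, simulated edge counts of 10, 12 and 14 against an observed count of 12 give a t-ratio of 0, while an observed count of 11 gives 0.5, which would fail the 0.1 rule of thumb.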
Robins et al – Recent Developments • Comparing pseudolikelihood to MCMC • UCINET datasets, SIENA modeling
Hunter et al – Package to Fit… • Hunter et al (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks • Implementing ERGM in R/statnet • Specify ERGM • Approximate/exact MLE • Goodness of fit tests • “The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.” • “ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original.”
Hunter et al – Package to Fit… • Implementing ERGM in R/statnet – variables • Endogenous – result of structure • Exogenous – attribute based (can serve as predictors) • Attributes can be treated as functions of nodal covariates • Statistics depend on attribute and relationship information • Change statistics – recall we are comparing conditional distribution toggled between Yij = 1 and Yij = 0 (or some other Markov configuration) • Particular choice g() of statistics • Particular network y • Particular pair of nodes (i,j) • Seed can be specified for reproducibility
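A change statistic depends on exactly the three ingredients listed above: the statistics g(), the network y, and the pair (i, j). The sketch below computes it by brute force for an undirected graph with edges and triangles as an assumed choice of g(); the function names are hypothetical and this is not how statnet computes it internally.

```python
from itertools import combinations


def g(y):
    """Statistics of an undirected graph: (edge count, triangle count)."""
    n = len(y)
    edges = sum(y[i][j] for i, j in combinations(range(n), 2))
    triangles = sum(
        y[i][j] * y[j][k] * y[i][k] for i, j, k in combinations(range(n), 3)
    )
    return (edges, triangles)


def change_statistic(y, i, j):
    """g(y with tie i-j present) minus g(y with tie i-j absent)."""
    plus = [row[:] for row in y]
    minus = [row[:] for row in y]
    plus[i][j] = plus[j][i] = 1
    minus[i][j] = minus[j][i] = 0
    return tuple(a - b for a, b in zip(g(plus), g(minus)))
```

On the path 0–1–2, `change_statistic(y, 0, 2)` is `(1, 1)`: adding the tie creates one edge and closes one triangle. Under an ERGM with parameter vector θ, the conditional log-odds of that tie are the inner product of θ with this vector.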
Hunter et al – Package to Fit… • Dyadic independence models • Dyadic independence term • Term in an ERGM whose change statistic for any pair (i,j) can be calculated without any knowledge of the rest of y • Dyadic independence ERGM • All terms in the model are dyadic independence terms • This model is purely stochastic • For undirected models, unconditional or marginal probability is allowed • Important to distinguish between dyadic and linear independence • Linear dependencies can arise with either form above • Implications for model specification • Statnet eliminates/allows for elimination of statistics as needed
Hunter et al – Package to Fit… • Dyadic dependence models • Dyads that do not share a node are conditionally independent • Analogous to nearest neighbor • Homogeneity condition may be added as a constraint • All isomorphic networks have same probability • Problems with model as previously discussed • Correctives suggested: • combine terms (endogenous and exogenous) • Specify triad-based curved exponential family terms • Geometrically weighted degree (GWD) • Geometrically weighted edgewise shared partner (GWESP) • Geometrically weighted dyadwise shared partner (GWDSP)
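To show what a geometrically weighted term looks like, here is a sketch of GWESP for an undirected graph with a fixed decay parameter τ. It assumes the usual form e^τ Σ_i [1 − (1 − e^{−τ})^i] EP_i(y), where EP_i(y) is the number of edges whose endpoints have exactly i shared partners; the function name `gwesp` is chosen for illustration and is unrelated to the statnet implementation.

```python
import math
from itertools import combinations


def gwesp(y, tau):
    """Geometrically weighted edgewise shared partner statistic (fixed decay)."""
    n = len(y)
    total = 0.0
    for i, j in combinations(range(n), 2):
        if y[i][j]:
            # Count nodes tied to both endpoints of the edge i-j.
            shared = sum(
                1 for k in range(n) if k != i and k != j and y[i][k] and y[j][k]
            )
            # Each extra shared partner adds less than the one before.
            total += 1.0 - (1.0 - math.exp(-tau)) ** shared
    return math.exp(tau) * total
```

In a single triangle every edge has one shared partner, so the statistic equals 3 for any τ; an edge with no shared partners contributes nothing. The diminishing weights are what help these terms avoid the degeneracy of the plain triangle count.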
Hunter et al – Package to Fit… • Curved exponential family model
Hunter et al – Package to Fit… • Estimation and goodness of fit • Parameters: • Edges • Homophily term for grade • Main effect for sex • P. 23
Morris et al – Specification of ERGM • Morris et al (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects • Where Hunter et al focused more on theory and statistical formulas, Morris et al provide basic instruction on implementing ERGM in R/statnet • Commands for basic effects, nodal attributes, relational attributes, structural configurations, higher-order configurations, actor specific effects, constraints • Tips to fine-tune algorithm and processing • Appendix A Table of Model Terms provides quick reference for what terms are appropriate to a particular model • i.e., directed/undirected, bipartite, dyadic independence, etc.
Morris et al – Specification of ERGM • Constraints • Model must include space of all possible networks • Some networks are bipartite – communication between but never within groups of nodes • ERGM automatically implements these constraints as needed
Andrew – Regional Integration • Andrew (2009) Regional integration through contracting networks • Research question: Under what conditions do local governments choose to contract for services, or enter into regional agreements for the provision of services? • Two hypotheses are advanced: • Bonding hypothesis – in the presence of uncertainty and complexity of interjurisdictional activities, a highly dense network structure will emerge over time • Bridging hypothesis – for interjurisdictional activities involving high asset specificity, a sparse, “core-periphery” network is anticipated • Institutional collective action framework – transaction cost analysis, enforcement and monitoring, free-rider problem
Andrew – Regional Integration • Bonding – local officials attracted to interjurisdictional, voluntary cooperation agreements • Flexible, non-binding, fosters “norm of reciprocity” • Can be constrained by local politics and coordination costs • Bridging – in asset-specific dilemma, local officials likely to choose strategic partner • May produce services in-house • Induce competition to attenuate opportunism of central actor • Expected to contract with partner who already has ties with other jurisdictions
Andrew – Regional Integration • Research Design: • Contractual ties among law enforcement community in Orlando-Kissimmee • Five waves from 1986 to 2003 • 66 total actors • List of goods & services derived from International City/County Management Association surveys • Studying one metropolitan area controls for geographic variation and allows for in-depth analysis of regional integration
Andrew – Regional Integration • Parameters • Transitive triads • Geodesic distance-2 • Covariate effects • Importance of level of government, where municipality is coded 1 and higher level government is treated as benchmark • Importance of professionalism, indicated by accreditation • Both coded as dummy variables, treated as control variables • Homophily effect • Rate parameters were all positive and significant • t-ratio less than 0.3, indicating no problems with convergence (?)