Social Network Analysis American Sociological Association San Francisco, August 2004 James Moody

Presentation Transcript
slide1
Social Network Analysis

American Sociological Association

San Francisco, August 2004

James Moody

slide2
Introduction

We live in a connected world:

“To speak of social life is to speak of the association between people – their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized.”

Peter M. Blau

Exchange and Power in Social Life, 1964

"If we ever get to the point of charting a whole city or a whole nation, we would have … a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."

J.L. Moreno, New York Times, April 13, 1933

These patterns of connection form a social space that can be seen in multiple contexts:

slide3
Introduction

Source: Linton Freeman “See you in the funny pages” Connections, 23, 2000, 32-42.

slide4
Introduction

High Schools as Networks

slide7
Introduction
  • And yet, standard social science analysis methods do not take this space into account.
    • “For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, …, the survey is a sociological meat grinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it.”
    • Allen Barton, 1968 (Quoted in Freeman 2004)
  • Moreover, the complexity of the relational world makes it impossible to identify social connectivity using only our intuitive understanding.
  • Social Network Analysis (SNA) provides a set of tools to empirically extend our theoretical intuition of the patterns that construct social structure.
slide8
Introduction

Why do Networks Matter?

Local vision

slide9
Introduction

Why do Networks Matter?

Local vision

slide10
Introduction
  • Why networks matter:
  • Intuitive: “goods” travel through contacts between actors, which can reflect a power distribution or influence attitudes and behaviors. Our understanding of social life improves if we account for this social space.
  • Less intuitive: patterns of inter-actor contact can have effects on the spread of “goods” or power dynamics that could not be seen focusing only on individual behavior.
slide11
Introduction
  • Social network analysis is:
  • a set of relational methods for systematically understanding and identifying connections among actors. SNA
    • is motivated by a structural intuition based on ties linking social actors
    • is grounded in systematic empirical data
    • draws heavily on graphic imagery
    • relies on the use of mathematical and/or computational models.
  • Social Network Analysis embodies a range of theories relating types of observable social spaces to individual and group behavior.
slide12
Introduction
  • Social Network Data
    • Basic data Elements
    • Collecting network data
    • Basic data structures
  • Measuring Networks
    • Flows of goods within networks
      • Topology
      • Time
    • Structure of Social Space
      • Small Worlds, Scale-Free, Triads
      • Cohesive Groups
      • Role Positions
  • Modeling with Networks
    • Modeling Behaviors with Networks
      • Peer attribute models
      • Network Autocorrelation Models
      • Dyad / QAP Models
    • Modeling Network Structure
      • QAP for network structure
      • Exponential Random Graph Models
  • SNA Computer Programs
slide13
Social Network Data

The unit of interest in a network is the combined set of actors and their relations.

We represent actors with points and relations with lines.

Actors are referred to variously as:

Nodes, vertices or points

Relations are referred to variously as:

Edges, Arcs, Lines, Ties

Example:

[Figure: a graph with five nodes labeled a, b, c, d, and e.]

slide14
Social Network Data

In general, a relation can be:

Binary or Valued

Directed or Undirected

[Figure: four panels showing a five-node graph (a, b, c, d, e) drawn as: Undirected, binary; Directed, binary; Undirected, Valued; Directed, Valued. The valued panels carry tie values between 1 and 4.]

slide15
Social Network Data
  • Social network data are substantively divided by the number of modes in the data.
  • 1-mode data represent edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples:
    • Communication, friendship, giving orders, sending email.
  • 1-mode data are usually singly reported (each person reports on their friends), but you can use multiple-informant data, which is more common in child development research (Cairns and Cairns).
slide16
Social Network Data

Social network data are substantively divided by the number of modes in the data.

2-mode data represent nodes from two separate classes, where all ties are across classes. Examples:

People as members of groups

People as authors on papers

Words used often by people

Events in the life history of people

The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or as groups connected to each other through common members.

There may be multiple relations of multiple types connecting your nodes.
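The duality described above can be sketched with a small affiliation matrix. This is a minimal illustration, assuming NumPy is available; the people, groups, and memberships are hypothetical and do not come from the presentation.

```python
import numpy as np

# Hypothetical 2-mode (affiliation) data: rows are people, columns are groups.
people = ["Ann", "Bob", "Cat", "Dan"]
groups = ["Chess", "Band", "Track"]
A = np.array([
    [1, 1, 0],  # Ann belongs to Chess and Band
    [1, 0, 0],  # Bob belongs to Chess
    [0, 1, 1],  # Cat belongs to Band and Track
    [0, 0, 1],  # Dan belongs to Track
])

# Person-by-person projection: cell (i, j) counts the groups i and j share.
person_by_person = A @ A.T

# Group-by-group projection: cell (g, h) counts the members g and h share.
group_by_group = A.T @ A

print(person_by_person)
print(group_by_group)
```

The diagonal of each projection holds the number of groups per person (or members per group), so it is usually ignored or zeroed before the projection is treated as a 1-mode network.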

slide17
Social Network Data

We can examine networks across multiple levels:

  • 1) Ego-network
    • - Have data on a respondent (ego) and the people they are connected to (alters). Example: 1985 GSS module
    • - May include estimates of connections among alters
  • 2) Partial network
    • - Ego networks plus some amount of tracing to reach contacts of contacts
    • - Something less than full account of connections among all pairs of actors in the relevant population
    • - Example: CDC Contact tracing data for STDs
slide18
Social Network Data

We can examine networks across multiple levels:

  • 3) Complete or “Global” data
    • - Data on all actors within a particular (relevant) boundary
    • - Never exactly complete (due to missing data), but boundaries are set
    • Example: Coauthorship data among all writers in the social sciences, friendships among all students in a classroom
    • For the most part, I will be discussing techniques surrounding global networks today, though I will briefly mention some standard uses of ego-network data.
slide19
Social Network Data

Collecting Network Data

  • Data capture any connection between the nodes. Sources include surveys, published accounts, special informants, etc.
  • In general, you can only make conclusions about relations among the set of nodes you have collected, so it is important to observe as much of the network as possible.
  • See W&F, chap 2 on different types of data collection
slide20
Social Network Data

Collecting Network Data

  • If you use surveys to collect data, some general rules of thumb:
  • a) Network data collection can be time consuming. It is better (I think) to have breadth over depth. Having detailed information on <50% of the sample will make it very difficult to draw conclusions about the general network structure.
  • b) Question format:
    • If you ask people to recall names (an open list format), fatigue will result in under-reporting
    • If you ask people to check off names from a full list, you can often get over-reporting
  • c) It is common to limit people to ~5 nominations. This will bias network stats for stars, but is sometimes the best choice to avoid fatigue.
  • d) Concrete relational indicators are best (who did you talk to?) over attitudes that are harder to define (who do you like?)
slide21
Social Network Data

Collecting Network Data

  • Existing Sources of Social Network Data
    • Check INSNA: The International Network for Social Network Analysis
    • Many secondary sources (particularly for 2-mode data)
    • National Longitudinal Study of Adolescent Health (Add Health)
slide22
Social Network Data

Basic Data Structures

Working with pictures.

There is no standard way to draw a sociogram: each of these is equivalent:

slide23
Social Network Data

Basic Data Structures

In general, graphs are cumbersome to work with analytically, though there is a great deal of good work to be done on using visualization to build network intuition.

I recommend using layouts that optimize on the feature you are most interested in, and find that either a hierarchical layout or a force-directed layout is usually best.

slide24
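Social Network Data

Basic Data Structures

From pictures to matrices

[Figure: the five-node graph (a, b, c, d, e) drawn twice, once as Undirected, binary and once as Directed, binary, each next to its 5 x 5 adjacency matrix with rows and columns labeled a to e and a 1 in each cell where a tie is present.]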

slide25
Social Network Data

Basic Data Structures

From matrices to lists

Adjacency matrix (blank cells shown as 0):

  a b c d e
a 0 1 0 0 0
b 1 0 1 0 0
c 0 1 0 1 1
d 0 0 1 0 1
e 0 0 1 1 0

Adjacency List

a b
b a c
c b d e
d c e
e c d

Arc List

a b
b a
b c
c b
c d
c e
d c
d e
e c
e d
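The three representations on this slide can be converted into one another mechanically. The following is a minimal sketch in plain Python that starts from the slide's adjacency list; the variable names are illustrative only.

```python
# Adjacency list from the slide: each actor followed by the actors it is tied to.
adjacency_list = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}

nodes = sorted(adjacency_list)
index = {node: i for i, node in enumerate(nodes)}

# Adjacency matrix: cell [i][j] is 1 when actor i sends a tie to actor j.
matrix = [[0] * len(nodes) for _ in nodes]
for sender, receivers in adjacency_list.items():
    for receiver in receivers:
        matrix[index[sender]][index[receiver]] = 1

# Arc list: one (sender, receiver) pair per tie.
arc_list = [(sender, receiver)
            for sender in nodes
            for receiver in adjacency_list[sender]]

for node, row in zip(nodes, matrix):
    print(node, *row)
print(arc_list)
```

For large, sparse networks the list forms are usually far more compact than the matrix, which stores a cell for every possible pair.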

slide26
Measuring Networks: Flow

“Goods” flow through networks:

slide27
Measuring Networks: Flow
  • In addition to the simple probability that one actor passes information on to another (pij), two factors affect flow through a network:
  • Topology
    • the shape, or form, of the network
    • - Example: one actor cannot pass information to another unless they are either directly or indirectly connected
  • Time
    • - the timing of contact matters
    • - Example: an actor cannot pass information he has not yet received
slide28
Measuring Networks: Flow

Two features of the network’s topology are known to be important: connectivity and centrality

  • Connectivity refers to how actors in one part of the network are connected to actors in another part of the network.
    • Reachability: Is it possible for actor i to reach actor j? This can only be true if there is a chain of contact from one actor to another.
    • Distance: Given they can be reached, how many steps are they from each other?
    • Number of paths: How many different paths connect each pair?
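Reachability and distance can both be read off a breadth-first search. The sketch below is a minimal illustration in plain Python; the graph reuses the five-node example from the data-structure slides and adds a hypothetical isolated actor "f" to show an unreachable case.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return {actor: number of steps from source} for every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
    "f": [],  # hypothetical isolate, not part of the original example
}

dist_from_a = bfs_distances(adj, "a")
print(dist_from_a)         # distances to every actor that "a" can reach
print("f" in dist_from_a)  # False: "f" is not reachable from "a"
```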
slide29
Measuring Networks: Flow

Without full network data, you can’t distinguish actors with limited information potential from those more deeply embedded in a setting.


slide30
Measuring Networks: Flow

Reachability

Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.

[Figure: example graphs on actors a to f.]

Paths can be directed, leading to a distinction between “strong” and “weak” components.

slide31
Measuring Networks: Flow

Reachability

If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.

Intuitively, a component is the set of people who are all connected by a chain of relations.
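A component search follows directly from this definition: start from any unvisited actor and collect everyone reachable from it. The sketch below is a minimal illustration for undirected ties; the six-actor graph is hypothetical.

```python
def components(adj):
    """Split an undirected graph into its connected components."""
    seen = set()
    result = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adj[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    stack.append(neighbor)
        seen |= component
        result.append(component)
    return result

# Hypothetical network: a chain a-b-c, a pair d-e, and an isolate f.
adj = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
    "f": [],
}

found = components(adj)
print(found)
```

For directed ties the same search, run while ignoring direction, gives the weak components; strong components need a path in both directions between every pair.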

slide32
Measuring Networks: Flow

Reachability

This example contains many components.

slide33
Measuring Networks: Flow

Distance & number of paths

Distance is measured by the (weighted) number of relations separating a pair:

Actor “a” is:

1 step from 4 actors

2 steps from 5 actors

3 steps from 4 actors

4 steps from 3 actors

5 steps from 1 actor

slide34
Measuring Networks: Flow

Distance & number of paths

Paths are the different routes one can take. Node-independent paths are particularly important.

There are 2 independent paths connecting a and b.

There are many non-independent paths.

slide35
Measuring Networks: Flow

Distance & number of paths

Probability of transfer by distance and number of paths, assuming a constant pij of 0.6

[Figure: line chart of transfer probability (vertical axis, 0 to 1.2) against path distance (horizontal axis, 2 to 6), with one curve each for 1 path, 2 paths, 5 paths, and 10 paths.]
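One way to see why distance lowers and multiple paths raise the chance of transfer is a simple independence calculation. The sketch below assumes, as a simplification not stated on the slide, that the paths are node-independent, all have the same length, and that each step succeeds independently with probability p; under those assumptions the chance that at least one path delivers is 1 - (1 - p^d)^k.

```python
def transfer_probability(p, distance, n_paths):
    """Chance that at least one of n_paths independent paths of equal
    length delivers, when every step succeeds with probability p."""
    single_path = p ** distance           # every step on one path must succeed
    return 1 - (1 - single_path) ** n_paths

# Same grid as the chart: distances 2 to 6 and 1, 2, 5, or 10 paths, p = 0.6.
for n_paths in (1, 2, 5, 10):
    row = [round(transfer_probability(0.6, d, n_paths), 3) for d in range(2, 7)]
    print(n_paths, "paths:", row)
```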

slide36
Reachability in Colorado Springs

(Sexual contact only)

  • High-risk actors over 4 years
  • 695 people represented
  • Longest path is 17 steps
  • Average distance is about 5 steps
  • Average person is within 3 steps of 75 other people
  • 137 people connected through 2 independent paths, core of 30 people connected through 4 independent paths

(Node size = log of degree)

slide37
Measuring Networks: Flow

Centrality

  • Centrality refers to (one dimension of) location, identifying where an actor resides in a network.
    • For example, we can compare actors at the edge of the network to actors at the center.
    • In general, this is a way to formalize intuitive notions about the distinction between insiders and outsiders.
slide38
Measuring Networks: Flow

Centrality

  • At the individual level, one dimension of position in the network can be captured through centrality.
  • Conceptually, centrality is fairly straightforward: we want to identify which nodes are in the ‘center’ of the network. In practice, identifying exactly what we mean by ‘center’ is somewhat complicated, but substantively we often have reason to believe that people at the center are very important.
  • Three standard centrality measures capture a wide range of “importance” in a network:
      • Degree
      • Closeness
      • Betweenness
slide39
Measuring Networks: Flow

Centrality

The most intuitive notion of centrality focuses on degree. Degree is the number of ties, and the actor with the most ties is the most important:

slide40
Measuring Networks: Flow

Centrality

If we want to measure the degree to which the graph as a whole is centralized, we look at the dispersion of centrality:

Simple: variance of the individual centrality scores.

Or, using Freeman’s general formula for centralization (which ranges from 0 to 1):
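Both dispersion measures can be sketched for degree centrality. The code below assumes the usual degree form of Freeman's centralization, the sum of differences between the largest degree and each actor's degree divided by (g-1)(g-2), which is the largest value that sum can take in a graph of g actors. The star and ring graphs are hypothetical six-actor examples.

```python
from statistics import pvariance

def degrees(adj):
    """Degree of each actor: the number of ties it has."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def freeman_degree_centralization(adj):
    """Sum of (max degree - degree) over actors, divided by (g-1)(g-2)."""
    deg = degrees(adj)
    g = len(adj)
    max_deg = max(deg.values())
    return sum(max_deg - d for d in deg.values()) / ((g - 1) * (g - 2))

# Star: actor 0 is tied to everyone else. Ring: everyone has exactly two ties.
star = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

for name, graph in (("star", star), ("ring", ring)):
    print(name,
          "Freeman:", freeman_degree_centralization(graph),
          "Variance:", pvariance(degrees(graph).values()))
```

The star reaches the maximum of 1 and the ring the minimum of 0. The variance, unlike Freeman's index, is not bounded by 1 and depends on network size.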

slide41
Measuring Networks: Flow

Centrality

Degree Centralization Scores

[Figure: four example networks, each labeled with its degree centralization scores.]

Freeman: 0.0, Variance: 0.0

Freeman: 1.0, Variance: 3.9

Freeman: .02, Variance: .17

Freeman: .07, Variance: .20

slide42
Measuring Networks: Flow

Centrality

A second measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors.

Closeness is based on the inverse of the distance of each actor to every other actor in the network.

Closeness Centrality:

Normalized Closeness Centrality
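The two closeness scores can be sketched as follows, assuming the common definitions: closeness is the inverse of the sum of an actor's distances to all other actors, and the normalized version multiplies that by g-1, where g is the number of actors. The sketch assumes a connected, undirected graph; the five-actor star is hypothetical.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(adj, node, normalized=False):
    """Inverse of the summed distances; assumes the graph is connected."""
    total = sum(distances_from(adj, node).values())
    if normalized:
        return (len(adj) - 1) / total
    return 1 / total

star = {
    "hub": ["w", "x", "y", "z"],
    "w": ["hub"], "x": ["hub"], "y": ["hub"], "z": ["hub"],
}

print(closeness(star, "hub"), closeness(star, "hub", normalized=True))
print(closeness(star, "w"), closeness(star, "w", normalized=True))
```

In a disconnected graph some distances are undefined, so the sum would need a convention (for example, computing closeness within components).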

slide43
Measuring Networks: Flow

Centrality

Closeness Centrality in the examples

C=0.0

C=1.0

C=0.36

C=0.28

slide44
Measuring Networks: Flow

Centrality

Betweenness Centrality:

Model based on communication flow: A person who lies on communication paths can control communication flow, and is thus important. Betweenness centrality counts the number of shortest paths between j and k that actor i resides on.

slide45
Measuring Networks: Flow

Centrality

Betweenness Centrality:

Where g_jk = the number of geodesics connecting j and k, and

g_jk(n_i) = the number of those geodesics that actor i is on.

Usually normalized by:
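A brute-force sketch of the calculation, assuming the standard form in which actor i's score is the sum over pairs j < k of g_jk(n_i) / g_jk, and the normalization for an undirected graph divides by (g-1)(g-2)/2, the number of pairs that exclude i. The four-actor cycle is a hypothetical example chosen because opposite actors are joined by two geodesics.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adj, source):
    """Return (distance, number of geodesics) from source to each reachable actor."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                sigma[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                sigma[neighbor] += sigma[node]
    return dist, sigma

def betweenness(adj):
    """Sum over pairs j < k of the share of j-k geodesics passing through i."""
    info = {node: shortest_path_counts(adj, node) for node in adj}
    scores = {node: 0.0 for node in adj}
    for j, k in combinations(adj, 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are not connected
        for i in adj:
            if i in (j, k):
                continue
            dist_i, sigma_i = info[i]
            # i lies on a j-k geodesic when the two legs add up to the distance.
            if i in dist_j and k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                scores[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return scores

cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
raw = betweenness(cycle)
g = len(cycle)
normalized = {node: score / ((g - 1) * (g - 2) / 2) for node, score in raw.items()}
print(raw)
print(normalized)
```

This version recomputes pair counts directly and is only practical for small graphs; dedicated network software uses faster algorithms.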

slide46
Measuring Networks: Flow

Centrality

Betweenness Centrality:

Centralization: 1.0

Centralization: 0

Centralization: .59

Centralization: .31

slide47
Measuring Networks: Flow

Centrality

Actors that appear very different when seen individually are comparable in the global network.

(Node size proportional to betweenness centrality)

slide48
Measuring Networks: Flow

Time

  • Two factors that affect network flows:
  • Topology
    • - the shape, or form, of the network
    • - simple example: one actor cannot pass information to another unless they are either directly or indirectly connected
  • Time
    • - the timing of contacts matters
    • - simple example: an actor cannot pass information he has not yet received.
slide49
Measuring Networks: Flow

Time

Timing in networks

  • A focus on contact structure has often slighted the importance of network dynamics, though a number of recent pieces are addressing this.
  • Time affects networks in two important ways:
    • 1) The structure itself evolves, in ways that will affect the topology and thus flow.
    • 2) The timing of contact constrains information flow
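The second point can be sketched as an earliest-arrival calculation over timed contacts. This is a minimal illustration under simplifying assumptions: ties are undirected, each contact is active over a closed period, and transmission is instantaneous whenever a contact is active. The contacts below are hypothetical and are not the network shown in the presentation.

```python
def earliest_arrival(contacts, source, start_time=0):
    """Earliest time at which each actor can hold something that starts at source.

    contacts is a list of (actor, actor, begin, end) periods during which
    the two actors are in contact. Actors never reached are omitted.
    """
    arrival = {source: start_time}
    changed = True
    while changed:
        changed = False
        for u, v, begin, end in contacts:
            for sender, receiver in ((u, v), (v, u)):
                if sender not in arrival:
                    continue
                # The sender can pass it on once they have it and the contact is active.
                t = max(arrival[sender], begin)
                if t <= end and t < arrival.get(receiver, float("inf")):
                    arrival[receiver] = t
                    changed = True
    return arrival

contacts = [
    ("A", "B", 1, 2),
    ("B", "C", 3, 4),
    ("C", "D", 0, 1),
]

print(earliest_arrival(contacts, "A"))  # A reaches B and C, but not D
print(earliest_arrival(contacts, "C"))  # C reaches B and D, but not A
```

Although every tie here is undirected and A, B, C, and D form one connected chain, A can reach C while C cannot reach A: the A-B contact ends before anything from C can arrive at B.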
slide50
Measuring Networks: Flow

Time

Drug Relations, Colorado Springs, Year 1

Data on drug users in Colorado Springs, over 5 years

slide51
Measuring Networks: Flow

Time

Drug Relations, Colorado Springs, Year 2

Current year in red, past relations in gray

slide52
Measuring Networks: Flow

Time

Drug Relations, Colorado Springs, Year 3

Current year in red, past relations in gray

slide53
Measuring Networks: Flow

Time

Drug Relations, Colorado Springs, Year 4

Current year in red, past relations in gray

slide54
Measuring Networks: Flow

Time

Drug Relations, Colorado Springs, Year 5

Current year in red, past relations in gray

slide55
Measuring Networks: Flow

Time

What impact does timing have on flow through the network?

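[Figure: hypothetical contact network among actors A, B, C, D, E, and F. Each line is labeled with its contact period; the periods shown are 0 - 1, 2 - 5, 3 - 5, 3 - 7, and 8 - 9.]

Numbers above lines indicate contact periods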

slide56
Measuring Networks: Flow

Time

The path graph for the hypothetical contact network

[Figure: path graph among actors A, B, C, D, E, and F.]

While clearly important, this is not often handled well by current software.

slide57
Measuring Networks: Structure & Social Space

The second broad division for measuring networks steps back to generalized features of the global network.

These factors almost always are of interest because of what they imply about how goods move through the network, but have resulted in a distinct line of methods and substantive research.

We focus on 3 such factors today:

1) Basic structure of large-scale networks

2) Cohesive Peer Groups

3) Identifying Role positions (blockmodels)

slide58
Measuring Networks: Large-Scale Models

Small World Networks

  • Based on Milgram’s (1967) famous work, the substantive point is that networks are structured such that even when most of our connections are local, any pair of people can be connected by a fairly small number of relational steps.
  • Works on 2 parameters:
    • The Clustering Coefficient (C) = average proportion of closed triangles
    • The average distance (L) separating nodes in the network
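Both parameters can be computed directly on a small graph. The sketch below uses one common reading of each: C is the average, over actors, of the share of an actor's neighbor pairs that are themselves tied (actors with fewer than two neighbors count as 0, which is a convention assumed here), and L is the mean distance over connected pairs. The four-actor graph is hypothetical.

```python
from collections import deque
from itertools import combinations

def clustering_coefficient(adj):
    """Average share of each actor's neighbor pairs that are tied to each other."""
    scores = []
    for node, neighbors in adj.items():
        pairs = list(combinations(neighbors, 2))
        if not pairs:
            scores.append(0.0)  # assumed convention for degree 0 or 1
            continue
        closed = sum(1 for u, v in pairs if v in adj[u])
        scores.append(closed / len(pairs))
    return sum(scores) / len(scores)

def average_distance(adj):
    """Mean shortest-path length over all ordered pairs that are connected."""
    total = 0
    count = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

# Hypothetical graph: a triangle a-b-c with a pendant actor d attached to c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}

C = clustering_coefficient(adj)
L = average_distance(adj)
print(round(C, 3), round(L, 3))
```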
slide59
Measuring Networks: Large-Scale Models

Small World Networks

C is large and L is small = SW (small world) graphs

  • High probability that a node’s contacts are connected to each other.
  • Small average distance between nodes
slide60
Measuring Networks: Large-Scale Models

Small World Networks

In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically

Watts demonstrates that small world properties can occur in graphs with a surprisingly small number of shortcuts

Diffusion / flow implications are unclear, but seem similar to those of a random graph in which local clusters are reduced to a single point.

slide61
Measuring Networks: Large-Scale Models

Scale-Free Networks

Across a large number of substantive settings, Barabási points out that the distribution of network involvement (degree) is highly and characteristically skewed.

slide62
Measuring Networks: Large-Scale Models

Scale Free Networks

Many large networks are characterized by a highly skewed distribution of the number of partners (degree)

slide63
Measuring Networks: Large-Scale Models

Scale Free Networks

Many large networks are characterized by a highly skewed distribution of the number of partners (degree)

slide64
Measuring Networks: Large-Scale Models

Scale Free Networks

The scale-free model focuses on the distance-reducing capacity of high-degree nodes:

slide65
Measuring Networks: Large-Scale Models

Scale Free Networks

The scale-free model focuses on the distance-reducing capacity of high-degree nodes, as ‘hubs’ create shortcuts that carry network flow.

slide66
Measuring Networks: Large-Scale Models

Scale Free Networks

Colorado Springs High-Risk

(Sexual contact only)

  • Network is approximately scale-free, with λ = -1.3
  • But connectivity does not depend on the hubs.
slide67
Measuring Networks: Large-Scale Models

Social Cohesion

White, D. R. and F. Harary. 2001. “The Cohesiveness of Blocks in Social Networks: Node Connectivity and Conditional Density.” Sociological Methodology 31:305-59.

Moody, James and Douglas R. White. 2003. “Structural Cohesion and Embeddedness: A Hierarchical Conception of Social Groups.” American Sociological Review 68:103-127.

White, Douglas R., Jason Owen-Smith, James Moody, and Walter W. Powell. 2004. “Networks, Fields, and Organizations: Scale, Topology and Cohesive Embeddings.” Computational and Mathematical Organization Theory 10:95-117.

Moody, James. 2004. “The Structure of a Social Science Collaboration Network: Disciplinary Cohesion from 1963 to 1999.” American Sociological Review 69:213-238.

slide68
Measuring Networks: Large-Scale Models

Social Cohesion

  • Formal definition of Structural Cohesion:
  • A group’s structural cohesion is equal to the minimum number of actors who, if removed from the group, would disconnect the group.
  • Equivalently (by Menger’s Theorem):
  • A group’s structural cohesion is equal to the minimum number of independent paths linking each pair of actors in the group.
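The first definition can be checked by brute force on a small group: try removing ever larger sets of actors until the remainder falls apart. The sketch below is only practical for small graphs and uses the usual convention that a complete graph of n actors has connectivity n-1; the example graphs are hypothetical.

```python
from itertools import combinations

def is_connected(adj, removed=frozenset()):
    """True if the graph stays connected after deleting the removed actors."""
    nodes = [node for node in adj if node not in removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for neighbor in adj[node]:
            if neighbor not in removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)

def node_connectivity(adj):
    """Smallest number of actors whose removal disconnects the group."""
    if not is_connected(adj):
        return 0
    n = len(adj)
    for k in range(1, n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, set(cut)):
                return k
    return n - 1  # no cut exists: the group is a complete graph

ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
complete = {i: [j for j in range(4) if j != i] for i in range(4)}

print(node_connectivity(ring), node_connectivity(chain), node_connectivity(complete))
```

By Menger's Theorem the same numbers count the node-independent paths between the least well connected pair: two around the ring, one along the chain.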
slide69
Measuring Networks: Large-Scale Models

Social Cohesion

  • Networks are structurally cohesive if they remain connected even when nodes are removed

[Figure: four example networks with node connectivity 0, 1, 2, and 3.]

slide70
Measuring Networks: Large-Scale Models

Social Cohesion

Structural cohesion gives rise automatically to a clear notion of embeddedness, since cohesive sets nest inside of each other.

[Figure: example network with numbered nodes, showing cohesive sets nested inside one another.]

slide71
Measuring Networks: Large-Scale Models

Social Cohesion

Project 90, Sex-only network (n=695)

3-Component (n=58)

slide72
Measuring Networks: Large-Scale Models

Social Cohesion

Connected

Bicomponents

IV Drug Sharing

Largest BC: 247

k> 4: 318

Max k: 12

Structural Cohesion simultaneously gives us a positional and subgroup analysis.

slide73
Measuring Networks:

Cohesive Sub Groups

A primary interest in Social Network Analysis is the identification of “significant social subgroups” – some smaller collection of nodes in the graph that can be considered, at least in some senses, as a “unit” based on the pattern, strength, or frequency of ties.

There are many ways to identify groups. They all insist on a group being in a connected component, but other than that the variation is wide.

slide74
Measuring Networks:

Cohesive Sub Groups

Graph Theoretical Models.

Start with a clique. A clique is defined as a maximal subgraph in which every member is connected to every other member. Cliques are collections of nodes where density = 1.0.

  • Properties of cliques:
  • Density: 1.0
  • Everyone connected to n-1 alters
  • Distance between every pair is 1
  • Ratio of within group ties to between group ties is infinite
  • All triads are transitive
slide75
Measuring Networks:

Cohesive Sub Groups

Graph Theoretical Models.

In practice, complete cliques are not very useful. They tend to overlap heavily and are limited in their size.

Graph theorists have thus relaxed the complete connectivity requirement (with varying degrees of success). See Moody & White (2003) for a discussion of these attempts.

slide76
Measuring Networks:

Cohesive Sub Groups

Identifying Primary groups:

1) Measures of fit

To identify a primary group, we need some measure of how clustered the network is. Usually, this is a function of the ratio of the number of ties that fall within groups to the number of ties that fall between groups.

2) Algorithmic approaches to maximizing (1)

Once we have such an index, we need a method for searching through the network to maximize the fit.

3) Generalized cluster analysis

In addition to maximizing a group function such as (1) we can use the relational distance directly, and look for clusters in the data. We next go over two different styles of cluster analysis

slide77
Measuring Networks:

Cohesive Sub Groups

Segregation Index

(Freeman, L. C. 1978. "Segregation in Social Networks." Sociological Methods and Research 6:411-30.)

Freeman asked how we could identify segregation in a social network. Theoretically, he argues, if a given attribute (group label) does not matter for social relations, then relations should be distributed randomly with respect to the attribute. Thus, the difference between the number of cross-group ties expected by chance and the number observed measures segregation.

slide78
Measuring Networks:

Cohesive Sub Groups

Consider the (hypothetical) network below. There are two attributes in this network: people with Blue eyes and Brown eyes and people who are square or not (they must be hip).

slide79
Measuring Networks:

Cohesive Sub Groups

Segregation Index

Mixing Matrix:

        Blue  Brown
Blue       6     17
Brown     17     16

Seg = -0.25

        Hip  Square
Hip      20      3
Square    3     30

Seg = 0.78
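The index can be sketched as (expected cross-group ties - observed cross-group ties) / expected cross-group ties. How the expected count is obtained is an assumption here: the sketch derives it from the row and column totals of the mixing matrix, as if ties were paired at random given those totals. That choice reproduces the two values on the slide, but Freeman's own derivation may state the expectation differently.

```python
def segregation_index(mixing):
    """(expected cross-group ties - observed) / expected, from a mixing matrix.

    The expectation is computed from the matrix margins, which is an
    assumption of this sketch rather than a quoted formula.
    """
    size = len(mixing)
    total = sum(sum(row) for row in mixing)
    row_totals = [sum(row) for row in mixing]
    col_totals = [sum(mixing[i][j] for i in range(size)) for j in range(size)]

    observed_within = sum(mixing[i][i] for i in range(size))
    expected_within = sum(r * c for r, c in zip(row_totals, col_totals)) / total

    observed_cross = total - observed_within
    expected_cross = total - expected_within
    return (expected_cross - observed_cross) / expected_cross

eye_color = [[6, 17],
             [17, 16]]
hip_square = [[20, 3],
              [3, 30]]

print(round(segregation_index(eye_color), 2))
print(round(segregation_index(hip_square), 2))
```

A negative value, as for eye color, means there are more cross-group ties than chance would produce; a value near 1 means almost all ties stay within groups.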

slide80
Measuring Networks:

Cohesive Sub Groups

  • The segregation index is one metric used to identify groups. Others include:
    • a) The ratio of in-group to out-group ties (NEGOPY, UCINET Factions)
    • b) Maximizing the probability of in-group contact (CliqueFinder)
    • c) The Segregation Matrix Index (SMI)
    • d) The dyadic factor loadings for overlapping groups (akin to a latent class model)
    • e) Minimize the within-group distance
  • Once a metric has been chosen, some algorithm is needed to search through the graph to identify clusters. These algorithms range from very sophisticated “graph-intelligent” algorithms, such as NEGOPY, to simple cluster analysis of distance matrices.
  • In most cases, you have to pre-set the number of groups to use (the exceptions are NEGOPY and CliqueFinder). Moody’s CROWDS algorithm also has automatic stopping criteria, but you have to give it starting values.
slide81
Measuring Networks:

Cohesive Sub Groups

In practice, the different algorithms will give different results.

Here, I compare the NEGOPY results to the RNM results. NEGOPY returned one large group, RNM found many smaller, denser groups.

It’s usually a good idea to explore multiple solutions and algorithms.

slide82
Measuring Networks:

Cohesive Sub Groups

Gagnon Prison Network

In practice, the different algorithms will give different results.

Here, I compare NEGOPY, FACTIONS and RNM. Groups A and B are identical, C is close. F, E and D differ.

It’s usually a good idea to explore multiple solutions and algorithms.

(all solutions constrained to 6 groups)

slide83
Measuring Networks:

Role Positions

  • Overview
      • Social life can be described (at least in part) through social roles.
      • To the extent that roles can be characterized by regular interaction patterns, we can summarize roles through common relational patterns.
      • Identifying these sets is the goal of block-model analyses.
  • Nadel: The Coherence of Role Systems
      • Background ideas for White, Boorman and Breiger. Social life as an interconnected system of roles
      • Important feature: thinking of roles as connected in a role system = social structure
  • White, Harrison C.; Boorman, Scott A., and Breiger, Ronald L. Social Structure from Multiple Networks I. American Journal of Sociology. 1976; 81:730-780.
    • The key article describing the theoretical and technical elements of block-modeling
slide84
Measuring Networks:

Role Positions

  • Elements of a Role:
    • Rights and obligations with respect to other people or classes of people
    • Roles require a ‘role complement’: another person with respect to whom the role-occupant acts
  • Examples:
  • Parent - child, Teacher - student, Lover - lover, Friend - Friend, Husband - Wife, etc.
  • Nadel (Following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.
slide85
Measuring Networks:

Role Positions

White et al: From logical role systems to empirical social structures

Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc.) between actors. Thus, we might represent a family as:

[Figure: a family drawn as a graph with nodes H, W, and three C nodes, linked by the relations “Romantic Love”, “Bickers with”, and “Provides food for”.]

(and there are, of course, many other relations inside a family!)

slide86
Measuring Networks:

Role Positions

The key idea is that we can express a role through a relation (or set of relations) and thus a social system by the inventory of roles. If roles equate to positions in an exchange system, then we need only identify particular aspects of a position. But what aspect?

Structural Equivalence

Two actors are structurally equivalent if they have the same types of ties to the same people.

slide87
Measuring Networks:

Role Positions

Structural Equivalence

A single relation

slide88
Measuring Networks:

Role Positions

Structural Equivalence

Graph reduced to positions

slide89
Measuring Networks:

Role Positions

Blockmodeling: basic steps

In any positional analysis, there are 4 basic steps:

1) Identify a definition of equivalence

2) Measure the degree to which pairs of actors are equivalent

3) Develop a representation of the equivalencies

4) Assess the adequacy of the representation

slide90
Measuring Networks:

Role Positions

1) Identify a definition of equivalence

  • Structural Equivalence:
    • Two actors are equivalent if they have the same type of ties to the same people.
slide91
Measuring Networks:

Role Positions

  • Automorphic Equivalence:
    • Actors occupy indistinguishable structural locations in the network. That is, they are in isomorphic positions in the network.
    • In general, automorphically equivalent nodes are equivalent with respect to all graph theoretic properties (e.g., degree, number of people reachable, centrality, etc.)
slide92
Measuring Networks:

Role Positions

Automorphic Equivalence:

slide93
Measuring Networks:

Role Positions

  • Regular Equivalence:
    • Regular equivalence does not require actors to have identical ties to identical actors or to be structurally indistinguishable.
    • Actors who are regularly equivalent have identical ties to and from equivalent actors.
    • If actors i and j are regularly equivalent, then for all relations and for all actors, if i → k, then there exists some actor l such that j → l and k is regularly equivalent to l.
slide94
Measuring Networks:

Role Positions

Regular Equivalence:

There may be multiple regular equivalence partitions in a network, and thus we tend to want to find the maximal regular equivalence partition, the one with the fewest positions.

slide95
Measuring Networks:

Role Positions

  • Role or Local Equivalence:
    • While most equivalence measures focus on position within the full network, some measures focus only on the patterns within the local tie neighborhood. These have been called ‘local role’ equivalence.
  • Note that:
    • Structurally equivalent actors are automorphically equivalent,
    • Automorphically equivalent actors are regularly equivalent.
    • Structurally equivalent and automorphically equivalent actors are role equivalent
    • In practice, we tend to ignore some of these distinctions, as they get blurred quickly once we have to operationalize them in real-world graphs. It turns out that few people are ever exactly equivalent, and thus we approximate the links between the types.
    • In all cases, the procedure can work over multiple relations simultaneously.
    • The process of identifying positions is called blockmodeling, and requires identifying a measure of similarity among nodes.
slide96
Measuring Networks:

Role Positions

Once you identify equivalent actors, block them in the matrix and reduce it, based on the number of ties in the cell of interest. The key values are a zero block (no ties) and a one-block (all ties present):

Image matrix (positions 1-6):

    1 2 3 4 5 6
1   0 1 1 0 0 0
2   1 0 0 1 0 0
3   1 0 1 0 1 0
4   0 1 0 1 0 1
5   0 0 1 0 0 0
6   0 0 0 1 0 0

Adjacency matrix, with each row labeled by its position (1-6):

1   . 1 1 1 0 0 0 0 0 0 0 0 0 0
2   1 . 0 0 1 1 0 0 0 0 0 0 0 0
3   1 0 . 1 0 0 1 1 1 1 0 0 0 0
3   1 0 1 . 0 0 1 1 1 1 0 0 0 0
4   0 1 0 0 . 1 0 0 0 0 1 1 1 1
4   0 1 0 0 1 . 0 0 0 0 1 1 1 1
5   0 0 1 1 0 0 . 0 0 0 0 0 0 0
5   0 0 1 1 0 0 0 . 0 0 0 0 0 0
5   0 0 1 1 0 0 0 0 . 0 0 0 0 0
5   0 0 1 1 0 0 0 0 0 . 0 0 0 0
6   0 0 0 0 1 1 0 0 0 0 . 0 0 0
6   0 0 0 0 1 1 0 0 0 0 0 . 0 0
6   0 0 0 0 1 1 0 0 0 0 0 0 . 0
6   0 0 0 0 1 1 0 0 0 0 0 0 0 .

Structural equivalence thus generates 6 positions in the network

slide97
Measuring Networks:

Role Positions

Once you partition the matrix, reduce it:

Adjacency matrix, with each row labeled by its position (1-3):

1   . 1 1 1 0 0 0 0 0 0 0 0 0 0
1   1 . 0 0 1 1 0 0 0 0 0 0 0 0
2   1 0 . 1 0 0 1 1 1 1 0 0 0 0
2   1 0 1 . 0 0 1 1 1 1 0 0 0 0
2   0 1 0 0 . 1 0 0 0 0 1 1 1 1
2   0 1 0 0 1 . 0 0 0 0 1 1 1 1
3   0 0 1 1 0 0 . 0 0 0 0 0 0 0
3   0 0 1 1 0 0 0 . 0 0 0 0 0 0
3   0 0 1 1 0 0 0 0 . 0 0 0 0 0
3   0 0 1 1 0 0 0 0 0 . 0 0 0 0
3   0 0 0 0 1 1 0 0 0 0 . 0 0 0
3   0 0 0 0 1 1 0 0 0 0 0 . 0 0
3   0 0 0 0 1 1 0 0 0 0 0 0 . 0
3   0 0 0 0 1 1 0 0 0 0 0 0 0 .

Image matrix:

    1 2 3
1   1 1 0
2   1 1 1
3   0 1 0

Regular equivalence

(here I placed a one in the image matrix if there were any ties in the ij block)
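The reduction step can be sketched in a few lines of Python with NumPy. This is an illustration only: the helper names `example_network` and `image_matrix` are made up here, the edge list is read off the adjacency matrix shown on these slides, and the rule applied is the "any tie in the block" rule stated above.

```python
import numpy as np

def example_network():
    """14-actor example network from the slides, as a symmetric 0/1 matrix."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    A = np.zeros((14, 14), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1
    return A

def image_matrix(A, positions):
    """Reduce A to an image matrix: 1 if the block holds any tie, else 0."""
    k = len(positions)
    image = np.zeros((k, k), dtype=int)
    for r, rows in enumerate(positions):
        for c, cols in enumerate(positions):
            image[r, c] = int(A[np.ix_(rows, cols)].any())
    return image

A = example_network()
# Positions as lists of 0-based actor indices.
structural = [[0], [1], [2, 3], [4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]
regular = [[0, 1], [2, 3, 4, 5], list(range(6, 14))]
image_structural = image_matrix(A, structural)
image_regular = image_matrix(A, regular)
print(image_structural)
print(image_regular)
```

A density cutoff could replace `.any()` when blocks are neither empty nor complete; which cutoff to use is a modeling choice, not something fixed by the method.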

slide98
Measuring Networks:

Role Positions

Operationally, you have to measure the similarity between actors. If two actors are structurally equivalent, then they will have identical ties to other people. Consider the example again:

C and D match on all 12 other people, and are thus structurally equivalent.

C  D  Match
1  1  1
0  0  1
.  1  .
1  .  .
0  0  1
0  0  1
1  1  1
1  1  1
1  1  1
1  1  1
0  0  1
0  0  1
0  0  1
0  0  1
Sum: 12

Adjacency matrix, with each row labeled by its position (1-6):

1   . 1 1 1 0 0 0 0 0 0 0 0 0 0
2   1 . 0 0 1 1 0 0 0 0 0 0 0 0
3   1 0 . 1 0 0 1 1 1 1 0 0 0 0
3   1 0 1 . 0 0 1 1 1 1 0 0 0 0
4   0 1 0 0 . 1 0 0 0 0 1 1 1 1
4   0 1 0 0 1 . 0 0 0 0 1 1 1 1
5   0 0 1 1 0 0 . 0 0 0 0 0 0 0
5   0 0 1 1 0 0 0 . 0 0 0 0 0 0
5   0 0 1 1 0 0 0 0 . 0 0 0 0 0
5   0 0 1 1 0 0 0 0 0 . 0 0 0 0
6   0 0 0 0 1 1 0 0 0 0 . 0 0 0
6   0 0 0 0 1 1 0 0 0 0 0 . 0 0
6   0 0 0 0 1 1 0 0 0 0 0 0 . 0
6   0 0 0 0 1 1 0 0 0 0 0 0 0 .
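The match count for C and D can be checked with a short Python sketch. The function name `match_count` is hypothetical, and actors are indexed from 0, so C and D (the third and fourth rows) are indices 2 and 3. Ties to the two actors themselves are skipped, as in the table above.

```python
import numpy as np

def example_network():
    """14-actor example network from the slides, as a symmetric 0/1 matrix."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    A = np.zeros((14, 14), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1
    return A

def match_count(A, i, j):
    """Number of third parties to whom i and j have the same tie value."""
    others = [k for k in range(A.shape[0]) if k not in (i, j)]
    return int((A[i, others] == A[j, others]).sum())

A = example_network()
matches_cd = match_count(A, 2, 3)   # C and D
matches_ab = match_count(A, 0, 1)   # first two actors, for contrast
print(matches_cd, matches_ab)
```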

slide99
Measuring Networks:

Role Positions

If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations, usually including both “directions” of asymmetric relations:

Actors (rows and columns, in order): H, W, C, C, C

Romantic Love:

0 1 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Provides food for:

0 0 1 1 1
0 0 1 1 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Bickers with:

0 0 0 0 0
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 0 1 1 0

Stacked:

Romance
0 1 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Feeds
0 0 1 1 1
0 0 1 1 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Feeds (transposed)
0 0 0 0 0
0 0 0 0 0
1 1 0 0 0
1 1 0 0 0
1 1 0 0 0
Bicker
0 0 0 0 0
0 0 0 0 0
0 0 0 1 1
0 0 1 0 0
0 0 1 1 0
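A minimal sketch of the stacking step in Python with NumPy, using the three family relations above (the symmetric version of "Bickers with" is used here). Each column of the stacked matrix is one actor's full profile across relations; the variable names are illustrative.

```python
import numpy as np

zeros = [0, 0, 0, 0, 0]
# Actor order: H, W, C, C, C
romance = np.array([[0, 1, 0, 0, 0], [1, 0, 0, 0, 0], zeros, zeros, zeros])
feeds = np.array([[0, 0, 1, 1, 1], [0, 0, 1, 1, 1], zeros, zeros, zeros])
bickers = np.array([zeros, zeros,
                    [0, 0, 0, 1, 1], [0, 0, 1, 0, 1], [0, 0, 1, 1, 0]])

# "Feeds" is asymmetric, so both directions are stacked: feeds and its transpose.
stacked = np.vstack([romance, feeds, feeds.T, bickers])

# H and W differ only in the two Romance entries that involve themselves.
h_w_differences = int((stacked[:, 0] != stacked[:, 1]).sum())
print(stacked.shape, h_w_differences)
```

Similarity measures are then computed between columns of `stacked` exactly as they would be for a single relation.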

slide100
Measuring Networks:

Role Positions

The metric used to measure structural equivalence by White, Boorman and Breiger is the correlation between each node’s set of ties. For the example, this would be:

1.00 -0.20 0.08 0.08 -0.19 -0.19 0.77 0.77 0.77 0.77 -0.26 -0.26 -0.26 -0.26

-0.20 1.00 -0.19 -0.19 0.08 0.08 -0.26 -0.26 -0.26 -0.26 0.77 0.77 0.77 0.77

0.08 -0.19 1.00 1.00 -1.00 -1.00 0.36 0.36 0.36 0.36 -0.45 -0.45 -0.45 -0.45

0.08 -0.19 1.00 1.00 -1.00 -1.00 0.36 0.36 0.36 0.36 -0.45 -0.45 -0.45 -0.45

-0.19 0.08 -1.00 -1.00 1.00 1.00 -0.45 -0.45 -0.45 -0.45 0.36 0.36 0.36 0.36

-0.19 0.08 -1.00 -1.00 1.00 1.00 -0.45 -0.45 -0.45 -0.45 0.36 0.36 0.36 0.36

0.77 -0.26 0.36 0.36 -0.45 -0.45 1.00 1.00 1.00 1.00 -0.20 -0.20 -0.20 -0.20

0.77 -0.26 0.36 0.36 -0.45 -0.45 1.00 1.00 1.00 1.00 -0.20 -0.20 -0.20 -0.20

0.77 -0.26 0.36 0.36 -0.45 -0.45 1.00 1.00 1.00 1.00 -0.20 -0.20 -0.20 -0.20

0.77 -0.26 0.36 0.36 -0.45 -0.45 1.00 1.00 1.00 1.00 -0.20 -0.20 -0.20 -0.20

-0.26 0.77 -0.45 -0.45 0.36 0.36 -0.20 -0.20 -0.20 -0.20 1.00 1.00 1.00 1.00

-0.26 0.77 -0.45 -0.45 0.36 0.36 -0.20 -0.20 -0.20 -0.20 1.00 1.00 1.00 1.00

-0.26 0.77 -0.45 -0.45 0.36 0.36 -0.20 -0.20 -0.20 -0.20 1.00 1.00 1.00 1.00

-0.26 0.77 -0.45 -0.45 0.36 0.36 -0.20 -0.20 -0.20 -0.20 1.00 1.00 1.00 1.00

Another common metric is the Euclidean distance between pairs of actors, which you then use in a standard cluster analysis.
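Both metrics can be sketched in Python with NumPy on the example network. One assumption is made here: for each pair (i, j), the entries for i and j themselves are dropped before comparing the two rows. Under that assumption the correlations agree with the matrix above to two decimals for the pairs checked (for example 1.00 for actors 3 and 4, and -0.20 for actors 1 and 2). Function names are hypothetical.

```python
import numpy as np

def example_network():
    """14-actor example network from the slides, as a symmetric 0/1 matrix."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    A = np.zeros((14, 14), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1
    return A

def equivalence_matrices(A):
    """Pairwise correlation and Euclidean distance between tie profiles."""
    n = A.shape[0]
    corr = np.eye(n)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: ignore ties to i and j themselves.
            keep = [k for k in range(n) if k not in (i, j)]
            x = A[i, keep].astype(float)
            y = A[j, keep].astype(float)
            corr[i, j] = corr[j, i] = np.corrcoef(x, y)[0, 1]
            dist[i, j] = dist[j, i] = np.linalg.norm(x - y)
    return corr, dist

corr, dist = equivalence_matrices(example_network())
print(np.round(corr[0, :7], 2))
```

The distance matrix `dist` can be passed to any standard clustering routine; structurally equivalent actors sit at distance zero.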

slide101
Measuring Networks:

Role Positions

Automorphic and Regular equivalence are more difficult to find, and require iteratively searching over possible class assignments for sets that have the same graph theoretic patterns. One usually starts with a set of nodes defined as similar on a number of network measures, then looks within these classes for automorphic equivalence classes.

A theoretically appealing method for finding structures that are very similar to regular equivalence, role equivalence, uses the triad census. Each node is involved in (n-1)(n-2)/2 triads, and occupies a particular position in each of these triads.

slide102
Measuring Networks:

Role Positions

  • Moving from a similarity/distance matrix to a blockmodel:
  • number of groups and determining blocks:
      • “An important decision in an analysis using CONCOR is how fine the partition should be; in other words, when should one stop splitting positions? Theory and the interpretability of the solution are the primary consideration in deciding how many positions to produce.” (W&F, p.378)
      • “In defining positions of actors, the ‘trick’ is to choose the point along the series that gives a useful and interpretable partition of the actors into equivalence classes.” (W&F p.383)
slide103
Measuring Networks:

Role Positions

An example:

Padgett, J. F. and Ansell, C. K. Robust action and the rise of the Medici, 1400-1434. American Journal of Sociology. 1993; 98:1259-1319.

“Political Groups” in the attribute sense do not seem to exist, so P&A turn to the pattern of network relations among families.

This is the block reduction of the full 92 family network.

slide104
Modeling with Networks: Behaviors
  • There are two general approaches to modeling behaviors with network data:
    • Using network measures as variables to predict individual outcomes
    • Network autocorrelation / peer influence models
    • Dyad / QAP models of the similarity of actors and their joint network position
slide105
Modeling with Networks: Behaviors
  • The simplest way to use network data in research is to include the network measure as a covariate in a standard model:
  • Y = a0 + b1(netvars) + b2(other vars) + e
  • “netvars” most commonly include:
      • Functions of each person’s direct contacts’ attributes
        • Such as: mean income of friends, proportion of friends who are employed, racial heterogeneity of the friends, etc.
      • Structural indicators:
        • Such as: Centrality, dummies for group / role membership, etc.
  • These models are the only option for ego-network data, where information on network alters is collected from a single respondent’s (ego’s) report.
  • They can also be used with measures extracted from partial or complete network data, but the error term is, by definition, autocorrelated: cases are not independent, but connected through social relations.
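As an illustration of a "function of direct contacts' attributes", the sketch below computes the mean of an attribute over each person's friends from an adjacency matrix, in Python with NumPy. The four-person network and income values are made up, and `mean_of_alters` is a hypothetical helper name. Isolates get NaN, since they have no friends to average over.

```python
import numpy as np

def mean_of_alters(A, x):
    """Mean of attribute x among each actor's direct contacts (NaN for isolates)."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    degree = A.sum(axis=1)
    totals = A @ x
    return np.divide(totals, degree,
                     out=np.full_like(totals, np.nan), where=degree > 0)

# Hypothetical network: actors 0, 1, 2 form a triangle; actor 3 is an isolate.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0]])
income = [10.0, 20.0, 30.0, 40.0]
friend_income = mean_of_alters(A, income)
print(friend_income)
```

The resulting vector can be added as a column to an ordinary regression data set alongside the other covariates.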
slide106
Modeling with Networks: Behaviors
  • Network Autocorrelation models (aka Peer Influence models):
      • Friedkin, N. E. 1984. "Structural Cohesion and Equivalence Explanations of Social Homogeneity." Sociological Methods and Research 12:235-61.
      • ———. 1998. A Structural Theory of Social Influence. Cambridge: Cambridge.
      • Friedkin, N. E. and E. C. Johnsen. 1990. "Social Influence and Opinions." Journal of Mathematical Sociology 15:193-205.
      • ———. 1997. "Social Positions in Influence Networks." Social Networks 19:209-22.
  • Where W is a direct function of the adjacency matrix, and a is the estimated value of peer influence.
slide107
Modeling with Networks: Behaviors

There are two general ways to test for peer influence in an observed network. The first estimates the parameters (a and b) of the peer influence model directly, the second transforms the network into a dyadic model, predicting similarity among actors.

Peer influence model:

See Doreian, Patrick. “Maximum likelihood methods for linear models: Spatial Effects and Spatial Disturbances Terms.” Sociological Methods and Research. 1982; 10:243-269.

Gould, Roger V. Multiple Networks and mobilization in the Paris Commune, 1871. American Sociological Review. 1991; 56:716-729. (applied example)

slide108
Modeling with Networks: Behaviors

The basic model says that people’s opinions are a function of the opinions of others and their characteristics.

WY = A simple vector which can be added to your model. That is, multiply Y by a W matrix, and run the regression with WY as a new variable, and the regression coefficient is an estimate of a.

This is what Doreian calls the QAD (“Quick and Dirty”) estimate of peer influence, and is equivalent (under certain assumptions) to adding the mean of ego’s friends to the model.
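The QAD recipe can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the estimator used for the results in this deck: W is taken to be the row-normalized adjacency matrix (so WY is the mean of ego's contacts), the model is fit by plain OLS, and the six-actor ring network and data values are hypothetical.

```python
import numpy as np

def row_normalize(A):
    """Turn an adjacency matrix into W, with each non-empty row summing to 1."""
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

def qad_peer_effect(A, y, X):
    """OLS of y on [1, Wy, X]; returns (intercept, a, b_1, ..., b_k)."""
    y = np.asarray(y, dtype=float)
    wy = row_normalize(A) @ y   # mean outcome among each actor's contacts
    design = np.column_stack([np.ones(len(y)), wy, X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data: six actors in a ring, one binary covariate.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
wy = row_normalize(A) @ y
coef = qad_peer_effect(A, y, x)
print(wy, coef)
```

The second element of `coef` is the QAD estimate of a. Standard errors from this regression should be treated with caution, because the cases are not independent.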

slide109
Modeling with Networks: Behaviors

The problem with the above regression is that cases are, by definition, not independent. In fact, WY is also known as the ‘network autocorrelation’ coefficient, since a ‘peer influence’ effect is an autocorrelation effect -- your value is a function of the people you are connected to. In general, OLS is not the best way to estimate this equation. That is, QAD = Quick and Dirty, and your results will not be exact.

In practice, the QAD approach (perhaps combined with a GLS estimator) results in empirical estimates that are “virtually indistinguishable” from MLE (Doreian et al, 1984)

The proper way to estimate the peer equation is to use maximum likelihood estimates, and Doreian gives the formulas for this in his paper.

The other way is to use non-parametric approaches, such as the Quadratic Assignment Procedure, to estimate the effects.

slide110
Modeling with Networks: Behaviors

An empirical Example: Peer influence in the OSU Graduate Student Network.

Each person was asked to rank their satisfaction with the program, which is the dependent variable in this analysis.

I constructed two W matrices, one from HELP the other from Best Friend. I treat relations as symmetric and valued, such that:

I also include Race (white/Non-white), Gender and Cohort Year as exogenous variables in the model.

slide111
Modeling with Networks: Behaviors

An empirical Example: Peer influence in the OSU Graduate Student Network.

Distribution of Satisfaction with the department.

slide112
Modeling with Networks: Behaviors

Parameter Estimates

Variable    Parameter Estimate   Pr > |t|   Standardized Estimate
Intercept    2.60252             0.0931      0
FEMALE      -1.07540             0.0142     -0.25455
NONWHITE    -0.22087             0.5975     -0.05491
y00          0.93176             0.0798      0.21627
y99         -0.19375             0.7052     -0.04586
y98         -0.45912             0.4637     -0.08289
y97          0.60670             0.3060      0.11919
PEER_BF      0.23936             0.0002      0.42084
PEER_H       0.50668             0.0277      0.23321

Model R2 = .41, compared to .15 without the peer effects

slide113
Modeling with Networks: Behaviors

Dyad QAP models

Another way to get at peer influence is not through the level of Y, but through the extent to which actors are similar with respect to Y.

The model is now expressed at the dyad level as:

Where Y is a matrix of similarities, A is an adjacency matrix, and Xk is a matrix of similarities on attributes

slide114
Modeling with Networks: Behaviors

Dyad QAP models

NODE   ADJMAT              SAMERCE             SAMESEX
1      0 1 1 1 0 0 0 0 0   0 1 0 0 1 0 0 0 1   0 0 1 1 0 0 1 1 0
2      1 0 1 0 0 0 1 0 0   1 0 0 0 1 0 0 0 1   0 0 0 0 1 1 0 0 1
3      1 1 0 0 1 0 1 0 0   0 0 0 1 0 1 1 1 0   1 0 0 1 0 0 1 1 0
4      1 0 0 0 1 0 0 0 0   0 0 1 0 0 1 1 1 0   1 0 1 0 0 0 1 1 0
5      0 0 1 1 0 1 0 1 0   1 1 0 0 0 0 0 0 1   0 1 0 0 0 1 0 0 1
6      0 0 0 0 1 0 0 1 1   0 0 1 1 0 0 1 1 0   0 1 0 0 1 0 0 0 1
7      0 1 1 0 0 0 0 0 0   0 0 1 1 0 1 0 1 0   1 0 1 1 0 0 0 1 0
8      0 0 0 0 1 1 0 0 1   0 0 1 1 0 1 1 0 0   1 0 1 1 0 0 1 0 0
9      0 0 0 0 0 1 0 1 0   1 1 0 0 1 0 0 0 0   0 1 0 0 1 1 0 0 0

slide115
Modeling with Networks: Behaviors

Dyad QAP models

Y

0.32

0.59

0.54

0.50

0.04

0.02

0.41

0.01

-0.17

Distance (Dij = abs(Yi - Yj))

.000 .277 .228 .181 .278 .298 .095 .307 .481

.277 .000 .049 .096 .555 .575 .182 .584 .758

.228 .049 .000 .047 .506 .526 .134 .535 .710

.181 .096 .047 .000 .459 .479 .087 .488 .663

.278 .555 .506 .459 .000 .020 .372 .029 .204

.298 .575 .526 .479 .020 .000 .392 .009 .184

.095 .182 .134 .087 .372 .392 .000 .401 .576

.307 .584 .535 .488 .029 .009 .401 .000 .175

.481 .758 .710 .663 .204 .184 .576 .175 .000
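The distance matrix can be rebuilt from the Y vector with a short Python sketch using NumPy. Because Y is printed above to two decimals, the computed distances differ from the printed matrix in the third decimal (for example 0.27 here versus .277 above). The variable names are illustrative.

```python
import numpy as np

# Y values as printed on the slide (rounded to two decimals).
y = np.array([0.32, 0.59, 0.54, 0.50, 0.04, 0.02, 0.41, 0.01, -0.17])

# D[i, j] = |Y_i - Y_j| for every pair of actors.
D = np.abs(y[:, None] - y[None, :])

# One observation per unordered dyad: the upper triangle of D.
upper = np.triu_indices(len(y), k=1)
dyad_distance = D[upper]
print(np.round(D[0], 2), len(dyad_distance))
```

With nine actors this gives 36 unordered dyads, which is the unit of analysis for the dyad-level regression.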

slide116
Modeling with Networks: Behaviors

Dyad QAP models

The REG Procedure

Model: MODEL1

Dependent Variable: SIM

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4   0.90657          0.22664       9.29      <.0001
Error             31   0.75591          0.02438
Corrected Total   35   1.66248

Root MSE          0.15615   R-Square   0.5453
Dependent Mean    0.33161   Adj R-Sq   0.4866
Coeff Var        47.08929

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1    0.51931             0.05116          10.15     <.0001
NOM          1   -0.17054             0.05963          -2.86     0.0075
SAMERCE      1    0.05387             0.05916           0.91     0.3696
SAMESEX      1   -0.06535             0.05365          -1.22     0.2324
NCOMFND      1   -0.16134             0.03862          -4.18     0.0002

slide117
Modeling with Networks: Behaviors

Dyad QAP models

Like the basic Peer influence model, cases in dyad models are not independent. However, the non-independence now comes from two sources: (1) the fact that the same person is represented in (n-1) dyads and (2) that i and j are linked through relations.

One of the best solutions to this problem is the Quadratic Assignment Procedure (QAP), a non-parametric procedure for significance testing.

QAP runs the model of interest on the real data, then randomly permutes the rows/cols of the data matrix and estimates the model again. Repeating this many times generates an empirical distribution of the coefficients at ‘chance’ levels, to which you then compare the observed coefficients. This is implemented in UCINET for regression, and in DAMN for logistic regression (J.L. Martin).

slide118
Modeling with Networks: Behaviors

Dyad QAP models

  • Procedure:
  • 1. Calculate the observed association / model
  • 2. For K iterations do:
    • a) randomly permute the rows and columns of one of the matrices (using the same permutation for both)
    • b) recalculate the association / model
    • c) store the outcome
  • 3. Compare the observed outcome to the distribution of outcomes created by the random permutations.
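The procedure can be sketched for the simplest case, a QAP correlation between two matrices, in Python with NumPy. This is not UCINET's implementation: the function names are hypothetical, the diagonal is ignored, and the "+1" in the p-value (counting the observed arrangement as one permutation) is a common convention assumed here.

```python
import numpy as np

def offdiag(M):
    """Off-diagonal cells of a square matrix, as a flat vector."""
    return M[~np.eye(M.shape[0], dtype=bool)]

def qap_correlation(Y, X, n_perm=2000, seed=0):
    """Observed correlation plus permutation proportions as large / as small."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(offdiag(Y), offdiag(X))[0, 1]
    n = Y.shape[0]
    perm_stats = np.empty(n_perm)
    for k in range(n_perm):
        p = rng.permutation(n)
        Yp = Y[np.ix_(p, p)]   # same permutation applied to rows and columns
        perm_stats[k] = np.corrcoef(offdiag(Yp), offdiag(X))[0, 1]
    p_large = (np.sum(perm_stats >= observed) + 1) / (n_perm + 1)
    p_small = (np.sum(perm_stats <= observed) + 1) / (n_perm + 1)
    return observed, p_large, p_small

# Hypothetical check: a six-node path graph compared with itself.
X = np.zeros((6, 6))
for i in range(5):
    X[i, i + 1] = X[i + 1, i] = 1
observed, p_large, p_small = qap_correlation(X.copy(), X)
print(observed, p_large, p_small)
```

Permuting rows and columns together keeps each actor's row and column attached to the same actor, so the structure of the matrix is preserved while its alignment with the other matrix is scrambled.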
slide119
Modeling with Networks: Behaviors

Dyad QAP models

Comparing multiple networks: QAP

slide121
Modeling with Networks: Behaviors

Dyad QAP models

MULTIPLE REGRESSION QAP W/ MISSING VALUES

--------------------------------------------------------------------------------

# of permutations: 2000

Diagonal valid? NO

Random seed: 533

Dependent variable: EX_SIM

Expected values: c:\moody\Classes\soc884\examples\UCINET\mrqap-predicted

Independent variables: EX_NCOM

EX_ADJ

EX_SRCE

EX_SSEX

Number of valid observations among the X variables = 72

N = 72

Number of permutations performed: 1999

MODEL FIT

R-square Adj R-Sqr Probability # of Obs

-------- --------- ----------- -----------

0.545 0.525 0.029 72

REGRESSION COEFFICIENTS

              Un-stdized    Stdized                      Proportion   Proportion
Independent   Coefficient   Coefficient   Significance   As Large     As Small
-----------   -----------   -----------   ------------   ----------   ----------
Intercept      0.519314      0.000000     0.012          0.012        0.988
EX_NCOM       -0.161337     -0.541828     0.011          0.989        0.011
EX_ADJ        -0.170539     -0.381186     0.020          0.980        0.020
EX_SRCE        0.053864      0.124551     0.236          0.236        0.764
EX_SSEX       -0.065364     -0.151144     0.180          0.820        0.180

Note that the coefficient values will be identical, but the p values differ

slide122
Modeling with Networks: Behaviors

Dyad QAP models

A substantive question raised with any kind of network autocorrelation model is whether observed associations between network structure and behaviors are due to selection or influence.

Theory is your best friend here, as there is no foolproof method to distinguish the two.

However, recent work has made great progress using individual-level fixed effect models (sometimes random effects models), where the network features vary over time. This removes any stable characteristic that might account for selection into a particular group.

slide123
Modeling with Networks: Structure

Dyad QAP models

While the most common way to use QAP models is to predict the similarity on some substantive variable, one can just as easily predict the presence/absence of a relation given attribute similarity.

This makes it possible to model the network itself, and ask questions about how particular structures form.

slide124
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

A long research tradition in statistics and random graph theory has led to parametric models of networks.

These are models of the entire graph, though as we will see they often work on the dyads in the graph to be estimated.

Substantively, the approach is to ask whether the graph in question is an element of the class of all random graphs with the given known elements. For example, all graphs with 5 nodes and 3 edges, or, put probabilistically, the probability of observing the current graph given the conditions.

slide125
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

The earliest approaches are based on simple random graph theory, but there’s been a flurry of activity in the last 10 years or so.

Key references:

- Holland and Leinhardt (1981) JASA

- Frank and Strauss (1986) JASA

- Wasserman and Faust (1994) – Chap 15 & 16

- Wasserman and Pattison (1996)

Thanks to Mark Handcock for sharing some figures/slides about these models.

slide126
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

Where:

$\theta$ is a vector of parameters (like regression coefficients)

$z$ is a vector of network statistics, conditioning the graph

$\kappa$ is a normalizing constant, to ensure the probabilities sum to 1.
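In standard p* notation (for example Wasserman and Pattison 1996), a model with these three ingredients is usually written as below, with $\theta$ for the parameter vector, $z(x)$ for the statistics of graph $x$, and $\kappa(\theta)$ for the normalizing constant. The exact typesetting used on the slide is an assumption.

```latex
P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{x'} \exp\{\theta' z(x')\}
```

The sum in $\kappa(\theta)$ runs over every graph on the same node set, which is why it is rarely computable directly.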

slide127
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

The simplest graph is a Bernoulli random graph, where each $X_{ij}$ is independent:

Where:

$\theta_{ij} = \operatorname{logit}[P(X_{ij} = 1)]$

$\kappa(\theta) = \prod_{ij} [1 + \exp(\theta_{ij})]$

Note this is one of the few cases where k(q) can be written.
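The closed form can be checked numerically on a tiny graph. The Python sketch below (function names are hypothetical) assumes a directed graph on three nodes with arbitrary dyad-specific parameters, sums exp(Σ θ_ij x_ij) over all 64 possible graphs, and compares the total with the product of [1 + exp(θ_ij)].

```python
import itertools
import numpy as np

def kappa_brute_force(theta):
    """Sum of exp(sum_ij theta_ij * x_ij) over every directed graph x."""
    g = theta.shape[0]
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    total = 0.0
    for ties in itertools.product([0, 1], repeat=len(pairs)):
        total += np.exp(sum(theta[i, j] * x for (i, j), x in zip(pairs, ties)))
    return total

def kappa_closed_form(theta):
    """Product over ordered pairs of 1 + exp(theta_ij)."""
    off_diagonal = ~np.eye(theta.shape[0], dtype=bool)
    return float(np.prod(1.0 + np.exp(theta[off_diagonal])))

theta = np.random.default_rng(1).normal(size=(3, 3))  # arbitrary parameters
brute = kappa_brute_force(theta)
closed = kappa_closed_form(theta)
print(brute, closed)
```

The brute-force sum already needs 2^(g(g-1)) terms, which is why a closed form matters and why models with dependence between ties need a different route.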

slide128
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

Typically, we add a homogeneity condition, so that all isomorphic graphs are equally likely. The homogeneous Bernoulli graph model:

Where:

$\kappa(\theta) = [1 + \exp(\theta)]^{g}$

slide129
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

If we want to condition on anything much more complicated than density, the normalizing constant ends up being a problem. We need a way to express the probability of the graph that doesn’t depend on that constant. It turns out we can do this by conditioning on a ‘complement’ graph.

First some terms:

slide130
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

After some algebra:

Note that we can now model the conditional probability of the graph, as a function of a set of difference statistics, without reference to the normalizing constant.

The model, then, simply reduces to a logit model on the dyads.
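The difference statistics that feed the logit model can be sketched in Python with NumPy. For each ordered pair (i, j), the sketch forces the tie on, forces it off, and records the change in each network statistic. Only two statistics are used here (number of ties and number of mutual dyads); the function names and the four-node graph are made up for illustration, and this is not the PREPSTAR program.

```python
import numpy as np

def stats(X):
    """Network statistics z(x): number of ties and number of mutual dyads."""
    edges = X.sum()
    mutual = np.sum(X * X.T) // 2   # each mutual dyad is counted twice
    return np.array([edges, mutual])

def change_statistics(X):
    """One row per ordered pair: observed tie, then z(x with tie) - z(x without)."""
    g = X.shape[0]
    rows = []
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            plus, minus = X.copy(), X.copy()
            plus[i, j] = 1
            minus[i, j] = 0
            rows.append((X[i, j], *(stats(plus) - stats(minus))))
    return np.array(rows)

# Hypothetical directed graph on four nodes (zero diagonal).
X = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
table = change_statistics(X)   # columns: tie, d_edges, d_mutual
print(table)
```

A logistic regression of the first column on the remaining columns, without an intercept, then gives pseudo-likelihood estimates of the parameters.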

slide131
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

Fitting p* models

I highly recommend working through the p* primer examples, which can be found at:

http://kentucky.psych.uiuc.edu/pstar/index.html

Including:

A Practical Guide To Fitting p* Social Network Models

Via Logistic Regression

The site includes the PREPSTAR program for creating the difference variables of interest.

slide132
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

We can model this network based on parameters for overall degree of Choice (), Differential Choice Within Positions (W), Mutuality(), Differential Mutuality Within Positions (W), and Transitivity (T).

The vector of model parameters to be estimated is: = { WWT }.

slide133
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

proc logistic descending;

model tie = l lw m mw tt / noint;

run;

L = Choice

LW = Within Group

M = Mutuality

MW = Mutual within Group

TT = Transitivity

Substantively, this graph is likely from the random class of graphs with similar mutuality and size

slide134
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

One practical problem is that the resulting values are often quite correlated, making estimation difficult. This is particularly difficult with “star” parameters.

       lw         m          mw         tt
lw     1.00000    0.58333    0.80178    0.15830
                  0.0007     <.0001     0.4034
m      0.58333    1.00000    0.80178   -0.02435
       0.0007                <.0001     0.8984
mw     0.80178    0.80178    1.00000   -0.11716
       <.0001     <.0001                0.5375
tt     0.15830   -0.02435   -0.11716    1.00000
       0.4034     0.8984     0.5375

slide135
Modeling with Networks: Structure

Exponential Random Graph Models (p*)

  • Parameters that are often fit include:
  • Expansiveness and attractiveness parameters: dummies for each sender/receiver in the network
  • Degree distribution
  • Mutuality
  • Group membership (and all other parameters by group)
  • Transitivity / Intransitivity
  • K-in-stars, k-out-stars
  • Cyclicity
slide136
Modeling with Networks: Structure

Comparing to Random Graphs

A conceptual merge between random graph models and QAP models is to identify a sample of graphs from the universe you are trying to model. So, instead of estimating:

generate X empirically, then compare z(x) to see how likely a measure on x would be given X. The difficulty, however, is generating X.

slide137
Modeling with Networks: Structure

Comparing to Random Graphs

The first option would be to generate all isomorphic graphs within a given constraint.

This is possible for small graphs, but the number gets large fast. For a network with 3 nodes, there are 16 possible directed graphs (counting isomorphic graphs once). For a network with 4 nodes, there are 218; for 5 nodes, 9,608; for 6 nodes, 1,540,944; and so on…
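The counts for small networks can be reproduced by brute force. The Python sketch below (the function name is hypothetical) enumerates every directed graph on n labeled nodes and counts each isomorphism class once by marking all relabelings of a graph as seen. It is only practical for very small n, which is the point of the passage above.

```python
import itertools

def count_digraph_classes(n):
    """Number of directed graphs on n nodes, counting isomorphic graphs once."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    perms = list(itertools.permutations(range(n)))
    seen = set()
    count = 0
    for bits in itertools.product([0, 1], repeat=len(pairs)):
        arcs = frozenset(p for p, b in zip(pairs, bits) if b)
        if arcs in seen:
            continue   # a relabeling of a graph already counted
        count += 1
        for perm in perms:
            seen.add(frozenset((perm[i], perm[j]) for i, j in arcs))
    return count

three_nodes = count_digraph_classes(3)
four_nodes = count_digraph_classes(4)
print(three_nodes, four_nodes)
```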

So, the best approach is to sample from the universe, but, of course, if you had the universe you wouldn’t need to sample from it. How do you sample from a population you haven’t observed?

Use a construction algorithm that generates a random graph with known constraints.
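One simple construction algorithm is degree-preserving rewiring: repeatedly pick two ties a-b and c-d and swap them to a-d and c-b, rejecting any swap that would create a self-tie or a duplicate tie. The Python sketch below is a swapped-in illustration for an undirected simple graph and preserves only the degree sequence; it is not the simulation used in the example that follows, and the function names are hypothetical.

```python
import random
from collections import Counter

def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Randomize an undirected simple graph while keeping every node's degree."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c   # try both possible orientations of the swap
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue      # would create a self-tie or a duplicate tie
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list

def degrees(edges):
    """Degree of each node in an edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# The 14-actor example network used earlier in these slides.
edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
rewired = degree_preserving_rewire(edges, n_swaps=200)
print(sorted(rewired))
```

Computing a statistic on many such rewired graphs gives a reference distribution against which the observed value can be compared.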

slide138
Modeling with Networks: Structure

Comparing to Random Graphs

Example: Bearman, Peter S., James Moody and Katherine Stovel (2004) “Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks” American Journal of Sociology 110:44-91

Romantic Relations in Jefferson High

slide139
Modeling with Networks: Structure

Comparing to Random Graphs

Simulate random networks with similar degree distribution:

slide140
Modeling with Networks: Structure

Comparing to Random Graphs

Simulated networks preserve observed degree, isolated dyad distribution, and four-cycle constraint

slide141
Modeling with Networks: Structure

Comparing to Random Graphs

Simulated networks preserve observed degree, isolated dyad distribution, and four-cycle constraint: 4 examples from the simulated set

slide142
Social Network Software
  • UCINET
    • The Standard network analysis program, runs in Windows
    • Good for computing measures of network topography for single nets
    • Input-Output of data is a special 2-file format, but is now able to read PAJEK files directly.
    • Not optimal for large networks
    • Available from:
  • Analytic Technologies
slide143
Social Network Software
  • PAJEK
    • Program for analyzing and plotting very large networks
    • Intuitive windows interface
    • Used for most of the real data plots in this presentation
    • Started mainly as a graphics program, but has expanded to a wide range of analytic capabilities
    • Can link to the R statistical package
    • Free
    • Available from:
slide144
Social Network Software
  • Cyram Netminer for Windows
    • Newest Product, not yet widely used
    • Price range depends on application
    • Limited to smaller networks O(100)

http://www.netminer.com/NetMiner/home_01.jsp

slide145
Social Network Software
  • NetDraw
    • Also very new, but by one of the best known names in network analysis software.
    • Free
    • Limited to smaller networks O(100)
slide146
Social Network Software
  • NEGOPY
    • Program designed to identify cohesive sub-groups in a network, based on the relative density of ties.
    • DOS based program, need to have data in arc-list format
    • Moving the results back into an analysis program is difficult.
    • Available from:
    • William D. Richards
    • http://www.sfu.ca/~richards/Pages/negopy.htm
  • SPAN - Sas Programs for Analyzing Networks (Moody, ongoing)
    • is a collection of IML and Macro programs that allow one to:
      • a) create network data structures from nomination data
      • b) import/export data to/from the other network programs
      • c) calculate measures of network pattern and composition
      • d) analyze network models
    • Allows one to work with multiple, large networks
    • Easy to move from creating measures to analyzing data
    • Available by sending an email to:
    • [email protected]