## Introduction to Social Network Analysis


**Introduction to Social Network Analysis**
Anne ter Wal, February 12, 2008
http://econ.geo.uu.nl/terwal/terwal.html

**Structure**
• A. Networks in cluster research
• B. What is different about network data?
• C. Basic terminology
• D. Analysis
• E. Network data: primary vs. secondary

GEOGRAPHY OF NETWORKS

**A. Networks in cluster research**
Network analysis makes the flows between agents explicit.
Agents (nodes): firms, inventors, technicians
Flows (links): knowledge, labour, capital, goods:
• Business relationships (incl. buyer-supplier relationships)
• Knowledge exchange
• Cooperation (incl. joint ventures and strategic alliances)
• Labour mobility of workers
• Spin-off relationships
• Social relations between entrepreneurs / technicians

**A. Three central questions**
Three questions in applied SNA research:
• STRUCTURE: What is the structure of the network?
• ORIGINS: How can this network structure be explained?
• EFFECTS: How does this network structure affect the performance of its agents (or of the region in which it is located)?

**B. What is different about network data?**
According to conventional statistics, person A and person U are similar: they both have the same number of friends. What such statistics leave out is how those friends are connected to one another and to the rest of the network.

**B. What is different about network data?**
• Characteristics of an actor are described in terms of its position in a wider structure of interrelated actors.
  THESE CHARACTERISTICS ARE INTERDEPENDENT
• The behavioural choices of individual actors, taken together, constitute a self-organizing system.
  THE PROPERTIES OF A NETWORK DEPEND ON INDIVIDUALS’ CHOICES

**C. Some basic terminology**
• Nodes: node (computer science), vertex (physics), actor (sociology)
• Links: link (computer science), edge (physics), tie (sociology)

**C. Network data**
[Figure: the same network shown as a matrix and as a graph]
Basic network data: presence or absence of links between a set of nodes.
NOTE: graphs visualize the presence or absence of links.
The location of the nodes and the length of the links do not have any meaning!

**C. Graphs**
Undirected graph
• Direction of the links does not matter
• Also called bonded graph
• Examples:
  - Friendship network
  - Railroad network
  - Cooperation between firms
Directed graph
• Direction of the links matters
• Examples:
  - A street map with one-way streets
  - Labour mobility between firms

**C. Graphs**
Binary graphs / simple graphs
• A link is either present (value 1) or absent (value 0)
Valued graphs
• The links that are present can carry a value, for example:
  - distance
  - size of the flow (e.g. trade)
  - intensity
  - capacity
  - frequency
  - cost
  - time

**C. Valued network of international calls in Europe**
[Figure: valued network of international calls in Europe. Source: TeleGeography]

**C. Graphs**
• Simplex versus multiplex graphs
• Multiplex graphs visualize links of various types in a single graph
• Examples:
  - cities connected by motorway and/or by railroad
  - firms having a business relationship and/or a cooperative relationship

**C. Matrices**
Matrices are square: rows and columns contain the same actors, in the same order. The diagonal of the matrix refers to self-loops and is usually left out of consideration.
Binary matrix
• Undirected links: symmetric
• Directed links: asymmetric
Valued matrix
• Undirected links: symmetric
• Differential weights: asymmetric
[Figures: example matrices with "from" and "to" axis labels]

**C. Levels of analysis**
In social network analysis three levels of analysis can be distinguished:
• Individual nodes (section D1)
• The dyad: pairs of actors (section D2)
• The network as a whole (section D3)

**D1 Nodes**
• persons, families, animals
• firms, organizations, universities, political parties, scientific communities
• continents, countries, regions, cities, neighbourhoods
• newspaper articles, patents, scientific articles, words, letters, languages, web pages
• atoms, molecules, cells, bacteria, ions
• etc.

**D1 Nodes: degree**
• The basic property of a node is its degree: the number of direct relationships.
• A node with degree 0 is an isolate.

**D1 Nodes: in-degree and out-degree**
• In directed graphs, in-degree (number of incoming links) and out-degree (number of outgoing links) can be distinguished.

**D1 Nodes: attributes**
Besides their degree, nodes can be described by their attributes. Attributes are characteristics of a node that are not related to its position in a network.
• Age, gender, religion, residence, income of people in a friendship network
• Location, sector, number of employees, revenues, age of firms in an inter-firm cooperation network
• Number of inhabitants, geographical coordinates, average income of cities connected in a high-speed train network

**D1 Centrality**
• Degree
• Betweenness
• Closeness

**D1 Degree Centrality**
• The number of nodes adjacent to a given node
[Figure: node with the highest degree centrality]

**D1 Betweenness Centrality**
• Loosely: the number of times that a node lies along the shortest path between two others
[Figure: node with the highest betweenness centrality]

**D1 Closeness Centrality**
• Sum of geodesic distances to all other nodes
• Inverse measure of centrality: the smaller the sum, the more central the node
[Figure: node with the “highest” closeness centrality]

[Figure: degree, betweenness and closeness centrality compared on one network. Data courtesy of David Krackhardt]

**D1 Centrality**
• Degree
  - how well connected; direct influence
• Closeness
  - how far from all others
  - how long information takes to arrive
• Betweenness
  - brokerage, gatekeeping, control of information

**D1 Clustering Coefficient**
“The extent to which friends of friends are friends.”
The extent to which the direct neighbours of a node are linked to each other.
[Figure: example node with CC = 1/3]

**D2 Dyads**
• A dyad is a pair of nodes in a network.
• In a simple graph:
  - A dyad can have value 0: the link is absent.
  - A dyad can have value 1: the link is present.
• In a valued graph:
  - A dyad can have value 0: the link is absent.
  - A dyad can have a value other than 0: a link is present, with value x.
• The cells of a matrix display the values of all dyads.

**D2 Dyads: geodesic distance**
A particular type of path is the geodesic path: the shortest path between a pair of nodes. The geodesic distance is the length of that path.
[Figure: matrix of geodesic distances]

**D2 Dyads: reachability**
• A graph is not necessarily one integrated network. It can consist of several components.
• A reachability matrix displays for every dyad whether there is a path between the two nodes (1) or not (0).
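
The geodesic distance and reachability matrices described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six-node network, the function names `geodesic_distances` and `reachability_matrix`, and the choice to leave the diagonal at 0 are not from the slides. It assumes a binary, undirected graph, so a breadth-first search from each node finds the shortest paths.

```python
from collections import deque

# Hypothetical undirected network with two components: {a, b, c, d} and {e, f}.
network = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
    "e": ["f"],
    "f": ["e"],
}


def geodesic_distances(adjacency):
    """Breadth-first search from every node; returns {source: {target: steps}}."""
    result = {}
    for source in adjacency:
        steps = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in steps:  # first visit is the shortest path
                    steps[neighbour] = steps[node] + 1
                    queue.append(neighbour)
        result[source] = steps
    return result


def reachability_matrix(adjacency):
    """1 if a path exists between two distinct nodes, else 0 (diagonal left at 0)."""
    nodes = sorted(adjacency)
    distances = geodesic_distances(adjacency)
    return [
        [1 if target != source and target in distances[source] else 0
         for target in nodes]
        for source in nodes
    ]


distances = geodesic_distances(network)
reach = reachability_matrix(network)

print(distances["a"]["d"])    # geodesic distance between a and d
print("e" in distances["a"])  # whether e can be reached from a
for row in reach:
    print(row)
```

In this toy network a and d are two steps apart, and no node in the first component can reach e or f, so the reachability matrix falls into two blocks, one per component. Summing a node's row in `distances` gives the closeness measure from section D1, which is only meaningful within a single component.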

**D2 Bridge**
• A tie that, if removed, would disconnect the network

**D2 Structural Holes**
• Basic idea: lack of ties among alters may benefit ego
• Benefits:
  - Autonomy
  - Control
  - Information

**D2 Brokerage Roles for node B**
[Figure: five diagrams of nodes A, B and C, one per role]
• Coordinator
• Representative
• Gatekeeper
• Consultant
• Liaison

**D3 Network: size and density**
• The size of a network can be expressed as the total number of nodes N or the total number of links L.
• The density Δ of a network is the total number of existing links divided by the total number of possible links.
• Density Δ is a number between 0 (a completely disconnected graph) and 1 (a completely connected graph).
[Figures: example networks with Δ = 0.25 and Δ = 0.39]

**D3 Network: average geodesic distance**
• Geodesic distance: length of the shortest path between two nodes
• Average geodesic distance: average over all dyads of a graph
[Figures: “core-periphery” network with average geodesic distance = 1.9; “clique structure” network with average geodesic distance = 2.4]

**D3 Network: diameter**
• Diameter D is the longest geodesic distance in the graph.
[Figures: example networks with diameter = 4 and diameter = 3]

**D3 Network: number of components**
[Figure legend: recent acquisition, older acquisitions, original company. Data drawn from Cross, Borgatti & Parker 2001.]

**D3 Cliques**
• Maximal complete subgraph
• Cliques have at least three members

**E Data collection**
Primary network data:
• Full network methods
• Snowball methods
• Ego networks
Secondary network data:
• e.g. patent data

**E Data collection: Full network methods**
• Collecting all links for all actors in your population
• A census of the whole population rather than a sample
• Required for most of the node and network properties
Popular method: roster-recall methodology
+ different kinds of links among the same set of actors
+ collecting data on characteristics of the links
- high response rate required
- time- and labour-intensive
- only static network data

**E Data collection: missing data**
For a proper social network analysis it is extremely important to have complete network data: all linkages for all nodes in your network.
[Figures: the full network of 26 nodes, and the same network with missing data after 1 node out of 26 (node l) refused to cooperate with the survey]

**E Data collection: The snowball method**
• In the snowball method:
  - you start by asking for the links of a focal actor
  - you continue by asking for the links of the actors mentioned
  - you go on until no new actors are added to your list
• You will identify a full network:
  - if there are no isolates in the network
  - if the network consists of one large component
In most cases you cannot know in advance whether these conditions are satisfied!

**E Data collection: Ego networks**
If it is infeasible to collect full network data or to use a snowball method, you can rely on ego networks. There are two possibilities:
- Ego network without alter connections: for a sample of your population, you identify each actor's direct links.
- Ego network with alter connections: in addition, you identify whether the direct neighbours (alters) of a node are connected among themselves.

**E Data collection: Ego networks**
Suppose you study a network in a population of 26 actors and decide to survey a random third of them: actors a, d, g, j, m, p, s, v and y (shown in blue in the accompanying graph).

**E Data collection: Ego networks**
You will discover only part of the network:
- only the direct alters of the nodes in your sample;
- for these alters, only the links to the sample actors.
Hence, it is not valid to:
- calculate network properties (density, average geodesic distance, diameter etc.) on a network obtained by aggregating ego networks;
- calculate the degree for nodes which were not in the sample.
Strictly speaking, ego networks are not real network data. Degree becomes a node characteristic comparable to ordinary statistical attribute data, which do not depend on the wider network structure.

**E Secondary (patent) data**
Inter-firm network
• Co-patenting
• Multiple applicant inventorship
Inventor network
• Co-inventing

**E Secondary (patent) data**
+ Possibility to do longitudinal network analysis
+ Less time-consuming
- Only cooperative links that have led to a patent are detected
- Patenting behaviour varies strongly across sectors and over time
- Patenting behaviour is strongly related to firm size
- Universities and research institutes are underrepresented in patent data
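
As a rough illustration of how a co-inventing network could be derived from patent data, the sketch below links every pair of inventors named on the same patent and then computes the size and density measures from section D3. The patent records, inventor names and function names are all hypothetical. The count of possible links, N(N-1)/2, assumes an undirected graph without self-loops.

```python
from itertools import combinations

# Hypothetical patent records: patent id -> inventors named on the patent.
patents = {
    "P1": ["Ann", "Bob", "Cem"],
    "P2": ["Bob", "Dee"],
    "P3": ["Eva"],           # single-inventor patent: Eva becomes an isolate
    "P4": ["Ann", "Bob"],    # second joint patent for Ann and Bob
}


def co_inventor_network(records):
    """Return (nodes, links); links maps an inventor pair to its joint-patent count."""
    nodes = set()
    links = {}
    for inventors in records.values():
        team = sorted(set(inventors))
        nodes.update(team)
        for pair in combinations(team, 2):  # every pair on a patent is linked
            links[pair] = links.get(pair, 0) + 1
    return nodes, links


def density(n_nodes, n_links):
    """Existing links divided by possible links in an undirected graph."""
    possible = n_nodes * (n_nodes - 1) / 2
    return n_links / possible if possible else 0.0


nodes, links = co_inventor_network(patents)
delta = density(len(nodes), len(links))
print(len(nodes), len(links), delta)
```

With these made-up records the network has N = 5 inventors and L = 4 distinct links, so the density is 4 / 10 = 0.4. The joint-patent counts stored in `links` turn the binary network into a valued one, and only collaborations that ended in a patent show up at all, which is the first limitation listed above.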