
SI 614 Community structure in networks


adeola

Presentation Transcript


  1. SI 614: Community structure in networks • Lecture 17

  2. Outline • One-mode networks and cohesive subgroups • measures of cohesion • types of subgroups • Affiliation networks • team assembly

  3. Why care about group cohesion? • opinion formation and uniformity • if each node adopts the opinion of the majority of its neighbors, it is possible to have different opinions in different cohesive subgroups
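A minimal sketch of the majority rule described on this slide, assuming synchronous updates and that a node keeps its current opinion on a tie (neither detail is stated on the slide). The example graph, two cliques of four nodes joined by one bridge edge, is hypothetical.

```python
def majority_step(adj, opinion):
    """One synchronous update: each node adopts the majority opinion of its neighbors."""
    updated = {}
    for node, neighbors in adj.items():
        ones = sum(opinion[n] for n in neighbors)
        zeros = len(neighbors) - ones
        if ones > zeros:
            updated[node] = 1
        elif zeros > ones:
            updated[node] = 0
        else:
            updated[node] = opinion[node]  # assumption: keep current opinion on a tie
    return updated


def run_majority(adj, opinion, max_steps=50):
    """Repeat the update until nothing changes (or max_steps is reached)."""
    for _ in range(max_steps):
        updated = majority_step(adj, opinion)
        if updated == opinion:
            break
        opinion = updated
    return opinion


def two_cliques(size):
    """Two cliques of `size` nodes each, joined by a single bridge edge."""
    adj = {i: set() for i in range(2 * size)}
    for start in (0, size):
        for i in range(start, start + size):
            for j in range(start, start + size):
                if i != j:
                    adj[i].add(j)
    adj[size - 1].add(size)
    adj[size].add(size - 1)
    return adj


adj = two_cliques(4)
# One dissenter in each clique at the start.
start = {0: 1, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 0}
final = run_majority(adj, start)
print(final)
```

In this run each clique becomes internally uniform while the two cliques end up with different opinions, which is the situation the slide describes.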

  4. within a cohesive subgroup – greater uniformity

  5. Other reasons to care • Discover communities of practice (more on this next time) • Measure isolation of groups • Threshold processes: • I will adopt an innovation if some number of my contacts do • I will vote for a measure if a fraction of my contacts do
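The fractional threshold process above can be sketched as follows. This is an illustration under stated assumptions: every node shares the same threshold, adoption is permanent, and the eight-node graph (two cliques joined by one edge) is hypothetical.

```python
def threshold_cascade(adj, seeds, threshold):
    """A node adopts once at least `threshold` of its neighbors have adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adj.items():
            if node in adopted or not neighbors:
                continue
            fraction = sum(1 for n in neighbors if n in adopted) / len(neighbors)
            if fraction >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical network: cliques {0,1,2,3} and {4,5,6,7}, bridged by the edge 3-4.
adj = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
    4: {3, 5, 6, 7}, 5: {4, 6, 7}, 6: {4, 5, 7}, 7: {4, 5, 6},
}
adopters = threshold_cascade(adj, seeds={0, 1}, threshold=0.5)
print(sorted(adopters))
```

With these inputs the innovation fills the first clique and stops at the bridge, because node 4 has only one adopting contact out of four.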

  6. What properties indicate cohesion? • mutuality of ties • everybody in the group knows everybody else • closeness or reachability of subgroup members • individuals are separated by at most n hops • frequency of ties among members • everybody in the group has links to at least k others in the group • relative frequency of ties among subgroup members compared to nonmembers

  7. Cliques • Every member of the group has links to every other member • Cliques can overlap (figure: a clique of size 4; overlapping cliques of size 3)
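As a rough illustration of finding cliques and their overlaps outside Pajek, here is a sketch of the basic Bron-Kerbosch enumeration of maximal cliques (a standard algorithm, not one named in the lecture). The example graph, a clique of size 4 plus two overlapping triangles, is hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),  # clique of size 4
         (3, 4), (3, 5), (4, 5),                          # triangle sharing node 3
         (4, 6), (5, 6)]                                  # triangle sharing edge 4-5
cliques = maximal_cliques(build_adjacency(edges))
for clique in sorted(cliques, key=len, reverse=True):
    print(sorted(clique))
```

The overlaps (node 3 between the first two cliques, edge 4-5 between the last two) are visible by intersecting the returned sets.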

  8. Considerations in using cliques as subgroups • Not robust • one missing link can disqualify a clique • Not interesting • everybody is connected to everybody else • no core-periphery structure • centrality measures do not distinguish members • How cliques overlap can be more interesting than the fact that they exist • Pajek • remember from class on motifs: • construct a network that is a clique of the desired size • Nets>Fragment (1 in 2)>Find

  9. a less stingy definition of cohesive subgroups: k-cores • Each node within the group is connected to at least k other nodes in the group (figure: a 4-core and a 3-core) • Pajek: Net>Partitions>Core>Input,Output,All • Assigns each vertex to the highest-order k-core it belongs to
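A minimal sketch of the k-core assignment, using the usual peeling idea: repeatedly remove a node of smallest remaining degree. It is written for clarity rather than speed and is not Pajek's implementation; the example graph is hypothetical.

```python
def core_numbers(adj):
    """Return, for each node, the largest k such that the node belongs to a k-core."""
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off a node with the smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical graph: a clique on nodes 0-3, with a chain 3-4-5 hanging off it.
adj = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
cores = core_numbers(adj)
print(cores)
```

The clique members land in the 3-core, while the two chain nodes only reach the 1-core.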

  10. subgroups based on reachability and diameter • n-cliques • maximal distance between any two nodes in the subgroup is n (figure: 2-cliques) • theoretical justification • information flow through intermediaries

  11. frequency of in-group ties • Compare the number of within-group ties with the number of ties from the group to nodes external to the group • Given the number of edges incident on nodes in the group, what is the probability that the observed fraction of them falls within the group? • The smaller the probability, the stronger the cohesion

  12. considerations with n-cliques • problem • diameter of the subgroup itself may be greater than n • n-clique may be disconnected (paths go through nodes not in the subgroup) (figure: a 2-clique with diameter 3; a path outside the 2-clique) • fix • n-club: maximal subgraph of diameter at most n (diameter 2 for a 2-club)
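The difference between the two conditions can be checked with breadth-first search: the n-clique condition measures distances in the whole graph, while the n-club condition only allows paths inside the subgroup. The sketch below tests a given group and does not check maximality; the six-node ring is a hypothetical example.

```python
from collections import deque


def bfs_distances(adj, source, allowed=None):
    """Hop distances from `source`; if `allowed` is given, paths may only use those nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist and (allowed is None or u in allowed):
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def within_n_in_graph(adj, group, n):
    """n-clique condition: every pair in the group is within n hops in the whole graph."""
    for v in group:
        dist = bfs_distances(adj, v)
        if any(dist.get(u, float("inf")) > n for u in group):
            return False
    return True


def induced_diameter(adj, group):
    """Diameter using only paths inside the group; infinity if the group is disconnected."""
    worst = 0
    for v in group:
        dist = bfs_distances(adj, v, allowed=group)
        for u in group:
            worst = max(worst, dist.get(u, float("inf")))
    return worst


ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}  # hypothetical 6-node ring
alternating = {0, 2, 4}
adjacent = {0, 1, 2}
print(within_n_in_graph(ring, alternating, 2), induced_diameter(ring, alternating))
print(within_n_in_graph(ring, adjacent, 2), induced_diameter(ring, adjacent))
```

Both groups satisfy the 2-clique distance condition, but {0, 2, 4} is disconnected once the outside nodes are removed, so only {0, 1, 2} also has induced diameter 2.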

  13. cohesion in directed and weighted networks • something we’ve already learned how to do: • find strongly connected components • keep only a subset of ties before finding connected components • reciprocal ties • edge weight above a threshold
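One way to combine the two filters above (reciprocal ties and a weight threshold) before finding connected components is sketched here. The data layout, a dict from (source, target) to a citation count, and the node names are assumptions made for illustration.

```python
def reciprocal_components(weights, min_weight):
    """Keep ties with weight >= min_weight in both directions, then find components."""
    adj = {}
    for (a, b), w in weights.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b and w >= min_weight and weights.get((b, a), 0) >= min_weight:
            adj[a].add(b)
            adj[b].add(a)

    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adj[v] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical citation counts between four blogs.
weights = {
    ("a", "b"): 7, ("b", "a"): 6,
    ("b", "c"): 9, ("c", "b"): 2,   # not reciprocated strongly enough
    ("c", "d"): 5, ("d", "c"): 5,
}
components = reciprocal_components(weights, min_weight=5)
print(sorted(sorted(c) for c in components))
```

Raising `min_weight` makes the surviving network sparser, so components split into smaller, more cohesive groups.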

  14. Example: political blogs (Aug 29th – Nov 15th, 2004) • all citations between A-list blogs in 2 months preceding the 2004 election • citations between A-list blogs with at least 5 citations in both directions • edges further limited to those exceeding 25 combined citations • only 15% of the citations bridge communities

  15. Affiliation networks • otherwise known as • membership network • e.g. board of directors • hypernetwork or hypergraph • bipartite graphs • interlocks

  16. m-slices • transform to a one-mode network • weights of edges correspond to number of affiliations in common • m-slice: maximal subnetwork containing the lines with a multiplicity equal to or greater than m (figure: the 1-slice and the 2-slice of an example network)
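The projection and slicing steps can be sketched in a few lines. The input format (a dict from each affiliation to its set of members) and the board and director names are hypothetical.

```python
from itertools import combinations


def project_to_one_mode(memberships):
    """Edge weight between two members = number of affiliations they share."""
    weights = {}
    for members in memberships.values():
        for pair in combinations(sorted(members), 2):
            weights[pair] = weights.get(pair, 0) + 1
    return weights


def m_slice(weights, m):
    """Keep only the lines whose multiplicity is at least m."""
    return {pair: w for pair, w in weights.items() if w >= m}


# Hypothetical boards and their directors.
boards = {
    "board1": {"ann", "bob", "cat"},
    "board2": {"ann", "bob"},
    "board3": {"bob", "dan"},
}
weights = project_to_one_mode(boards)
print(m_slice(weights, 1))
print(m_slice(weights, 2))
```

The 1-slice is the whole projected network; the 2-slice keeps only the pair that sits together on two boards.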

  17. Pajek: • Net>Transform>2-Mode to 1-Mode>Include Loops, Multiple Lines • Info>Network>Line Values (to view) • Net>Partitions>Valued Core>First threshold and step

  18. Scottish firms interlocking directorates legend: 2-railways 4-electricity 5-domestic products 6-banks 7-insurance companies 8-investment banks

  19. methods used directly on bipartite graphs are rare • Example: finding bicliques of users accessing documents • An algorithm by Nina Mishra, HP Labs (figure: users and documents)

  20. Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance • Roger Guimerà, Brian Uzzi, Jarrett Spiro, Luís A. Nunes Amaral • Science, 2005 (figure panels: astronomy and astrophysics, social psychology, economics)

  21. Issues in assembling teams • Why assemble a team? • different ideas • different skills • different resources • What spurs innovation? • applying proven innovations from one domain to another • Is diversity (working with new people) always good? • spurs creativity + fresh thinking • but • conflict • miscommunication • loss of the sense of security that comes from working with close collaborators

  22. Parameters in team assembly • m, number of team members • p, probability of selecting individuals who already belong to the network • q, propensity of incumbents to select past collaborators • Two phases • giant component of interconnected collaborators • isolated clusters

  23. creation of a new team • incumbents (people who have already collaborated with someone) • newcomers (people available to participate in new teams) • pick incumbent with probability p • if incumbent, pick past collaborator with probability q
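A rough simulation of the selection rule on this slide. The slide gives only the two probabilities, so several details here are assumptions: a "past collaborator" is drawn from the past collaborators of members already on the team, an incumbent slot otherwise falls back to a uniformly random incumbent, nobody joins the same team twice, and the removal of inactive individuals is left out. All function and variable names are hypothetical.

```python
import random


def assemble_team(m, p, q, incumbents, collaborators, rng, next_id):
    """Fill m slots; returns the team and the next unused newcomer id."""
    team = []
    for _ in range(m):
        member = None
        if incumbents and rng.random() < p:
            # Assumption: past collaborators of people already on this team.
            past = sorted({c for t in team for c in collaborators.get(t, ())} - set(team))
            if past and rng.random() < q:
                member = rng.choice(past)
            else:
                pool = sorted(set(incumbents) - set(team))
                if pool:
                    member = rng.choice(pool)
        if member is None:
            member = next_id  # newcomer gets a fresh id
            next_id += 1
        team.append(member)
    return team, next_id


def simulate(steps, m, p, q, seed=0):
    """Assemble `steps` teams in sequence and record who has worked with whom."""
    rng = random.Random(seed)
    incumbents, collaborators, teams = set(), {}, []
    next_id = 0
    for _ in range(steps):
        team, next_id = assemble_team(m, p, q, incumbents, collaborators, rng, next_id)
        for a in team:
            collaborators.setdefault(a, set()).update(x for x in team if x != a)
        incumbents.update(team)
        teams.append(team)
    return teams, collaborators


teams, collaborators = simulate(steps=50, m=4, p=0.6, q=0.5)
print(len(collaborators), "people took part in", len(teams), "teams")
```

Running `simulate` with different p and q and then measuring the largest connected component of `collaborators` gives a feel for the two phases (giant component versus isolated clusters).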

  24. Time evolution of a collaboration network • newcomer-newcomer collaborations • newcomer-incumbent collaborations • new incumbent-incumbent collaborations • repeat collaborations • after a time t of inactivity, individuals are removed from the network

  25. BMI data • Broadway musical industry • 2258 productions • from 1877 to 1990 • musical shows performed at least once on Broadway • team: composers, writers, choreographers, directors, producers but not actors • Team size increases from 1877-1929 • the musical as an art form is still evolving • After 1929 team composition stabilizes to include 7 people: • choreographer, composer, director, librettist, lyricist, producer

  26. Collaboration networks • 4 fields (with the top journals in each field) • social psychology (7) • economics (9) • ecology (10) • astronomy (4) • impact factor of each journal • ratio between citations and recent citable items published • A= total cites in 1992 • B= 1992 cites to articles published in 1990-91 (this is a subset of A) • C= number of articles published in 1990-91 • D= B/C = 1992 impact factor

  27. size of teams grows over time

  28. degree distributions (figure legend: data; data generated from a model with the same p and q and sequence of team sizes formed)

  29. Predictions for the size of the giant component • higher p means already published individuals are co-authoring – linking the network together and increasing the giant component • S = fraction of network occupied by the giant component
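For concreteness, S can be computed from an adjacency structure as the size of the largest connected component divided by the number of nodes. This is a small sketch; the five-node example is hypothetical.

```python
def giant_component_fraction(adj):
    """S: share of all nodes that lie in the largest connected component."""
    seen = set()
    largest = 0
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adj[v] - component)
        seen |= component
        largest = max(largest, len(component))
    return largest / len(adj) if adj else 0.0


# Hypothetical co-authorship network: one group of three and two isolated authors.
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set(), "e": set()}
S = giant_component_fraction(adj)
print(S)
```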

  30. Predictions for the size of the giant component (cont’d) • increasing q can slow the growth of the giant component – co-authoring with previous collaborators does not create new edges

  31. network statistics • what stands out? • what is similar across the networks?

  32. different network topologies (figure panels: ecology, economics, astronomy)

  33. main findings • all networks except astronomy are close to the “tipping” point where the giant component emerges • sparse and stringy networks • giant component takes up more than 50% of nodes in each network • impact factor (how good the journal is where the work was published) • p positively correlated • going with experienced members is good • q negatively correlated • new combinations are more fruitful • S for individual journals positively correlated • more isolated clusters in lower-impact journals (figure panels: ecology, economics, social psychology)
