
Social and Information Networks Theory and Practice




Presentation Transcript


  1. Social and Information Networks: Theory and Practice. Anirban Dasgupta, Isabelle Stanton

  2. Topics • The Structure of Networks • Small world networks • Generative models • The Long Tail • Community Detection • Cascades and Viral Processes • Computation on Large Graphs • Sampling and Surveying • Crowdsourcing • ….?

  3. Coursework • 1 (group) Project • 2 Reaction Papers • 2-3 Experimental Assignments • Scribing

  4. Office Hours • Isabelle – Soda 645 Time TBD • Anirban – By appointment http://cs294socialnetworks.org

  5. Available Data Sets • Yahoo! Webscope data • Will be available in a few weeks • Social Network Crawls • LiveJournal, Twitter, Orkut, Flickr, YouTube, Facebook • SNAP archive • Citation Networks • HEP-th, dblp (over time), theory… • Physical Systems • Power grid, autonomous systems… • Web graphs • Notre Dame, Berkeley/Stanford, Wikipedia…

  6. Complex Systems Around Us • We are surrounded by complex systems • Society is the interaction of 7 billion individuals • Communication systems (e.g. the Internet) are formed by linking devices • Our cells function through the interaction of proteins • Thoughts in our brain are formed by interactions of neurons • What are some common properties of these systems? How can we study them?

  7. Why study Networks? Behind each of the complex systems, there is an underlying wiring diagram: the network. We will never understand the complex system without understanding the network behind it. Nodes: elements. Links: interactions. System: graph/network.
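
The nodes/links/graph vocabulary above maps directly onto a simple data structure. The following is a minimal sketch, assuming an undirected network stored as adjacency sets; the function name `build_graph` and the three friendship links are made up for illustration.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected graph: a dict mapping each node to its set of neighbours."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


# Hypothetical "friend" links, invented for this example.
links = [("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione")]
graph = build_graph(links)

# The degree of a node is the number of links it takes part in.
degrees = {node: len(neighbours) for node, neighbours in graph.items()}
```

Adjacency sets make "is there a link between a and b?" a constant-time lookup, which is convenient for the clustering and path-length measurements that come later.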

  8. Network: Online Social Networks. Nodes: members. Links: “friend”.

  9. Network: Internet. Nodes: routers. Links: connections.

  10. Network: US power grid. Nodes: power stations. Links: power lines.

  11. Network: Economy. Nodes: companies (investment, pharma, research labs, public). Links: collaborations (financial, R&D). http://ecclectic.ss.uci.edu/~drwhite/Movie

  12. Network: Human Disease. Nodes: disease classes. Links: shared genes.

  13. Network: Yeast Proteins. Nodes: proteins. Links: chemical interactions.

  14. Network: Brain. The human brain has between 10 and 100 billion neurons. Nodes: neurons. Links: connections.

  15. Why Study Networks?

  16. Networks: Predicting the H1N1 outbreak

  17. Satellite map of US

  18. Result of US power grid outage

  19. Network: US power grid. Nodes: power stations. Links: power lines.

  20. Without studying networks, we cannot … • stop cascading outages in power grids • forecast how disease spreads in a society • design search engines like Google • understand how the interaction of genomes creates life • …

  21. What do we study in networks? • Structure and evolution • What does a network look like? • How did it come to be like that? • Process and dynamics • Networks provide skeletons for the spread of information, disease, and other dynamic processes

  22. How would we study a network? • Empirical: Study network data to find out a particular principle • Data analysis, experiments, sociology surveys, … • Analyze: Is this principle surprising? How universal is this principle? • Statistics, probability, domain knowledge,… • Hypothesize: Build models that would explain the observed principle • Algorithms, graph theory, statistics, probability, domain knowledge…

  23. Why now? • Data availability • Storage and computation are only getting cheaper • Massive amounts of data about human interaction • Universality • Networks arising from different fields of science and technology have surprisingly common properties • Shared Vocabulary • Statisticians, Cognitive Scientists, Physicists, Biologists, Computer Scientists,..

  24. The story of “six degrees of separation” or the “small world phenomenon”

  25. Before there was the Internet • There were still social networks • How can we measure anything about them? • What do social networks look like? • How connected are we?

  26. Milgram’s Experiment (1967) • Stanley Milgram wanted to know about the global friendship network • If information is spreading through friends, how soon will it reach one particular person? • The entire friendship network cannot really be obtained, so he designed an experiment to estimate this quantity

  27. Milgram’s Experiment (1967) (map labels: NE, MA) • 300 people in the Midwest were each given a letter • Target: a stockbroker in Boston • You can only forward the letter to someone you know! • Goal: reach the target

  28. Milgram’s Experiment: Results • 300 people in the Midwest were each given a letter • Target: a stockbroker in Boston • You can only forward to someone you know! • Total no. of chains: 64 • Average number of steps: 6.5 • “six degrees of separation”

  29. Six degrees of separation. For almost all random pairs among 6 billion individuals, there is a path with at most 6 steps.

  30. Experimental Problems • Selection bias • Starting points weren’t random but people who responded to an ad for ‘well-connected people’ • Highly disconnected groups aren’t sampled • Dropped chains • 232 of 296 never reached the target • 136 of 160 never reached the target • 16 of the 24 went through the same last hop

  31. Was this a fluke? • Replicated by researchers using emails, Facebook • Similar property (short paths between pairs of nodes) also seen in other networks • protein-protein network, gene network • economic networks • language networks…

  32. Six degrees of separation: is this surprising? • The average number of steps in a chain was 6 • Why should there be 6 steps? • Hint: suppose everyone has 100 friends; then what? • But your friends are friends among themselves! (figure: Hermione, Harry, Ron)

  33. Small World Networks • High ‘clustering’: friends of my friends are likely to be my friends. • Small diameter: I have ~100 friends, who each have ~100 friends, and so on. So I can reach everyone in $s$ steps, where $100^s = n$, i.e. $s = \log_{100} n$. • Alone, these two properties aren’t very surprising. Together, they are.
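
To make the "100 friends" arithmetic concrete, here is a small sketch. It assumes, unrealistically, that friend circles never overlap, which is exactly the assumption that clustering breaks. The population figure of 7 billion is the one used earlier in the slides; the function name `steps_to_reach` is made up for illustration.

```python
import math


def steps_to_reach(n, friends_per_person=100):
    """Smallest whole s with friends_per_person**s >= n, ignoring overlapping friend circles."""
    steps = 0
    reached = 1
    while reached < n:
        reached *= friends_per_person
        steps += 1
    return steps


population = 7_000_000_000

# Real-valued solution of 100**s = n, i.e. s = log(n) / log(100).
exact_steps = math.log(population) / math.log(100)

# Whole number of hops needed under the no-overlap assumption.
whole_steps = steps_to_reach(population)
```

Under this idealized assumption about five hops would suffice, so a chain length near six is only surprising once clustering is taken into account.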

  34. Six degrees of separation • People do have a moderately large (~100-1000) set of friends • But these friends typically occur in clusters • Everyone in a school, workplace, town… • In the presence of these properties, six degrees of separation is not obvious • Surprisingly, people can actually find the short paths…

  35. The Small World concept, in simple terms, describes the fact that despite their often large size, most networks have a relatively short path between any two nodes.
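
"A relatively short path between any two nodes" can be measured with breadth-first search. The sketch below assumes a connected, undirected, unweighted graph stored as a dict of neighbour sets; the function names and the six-node ring are illustrative only.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop counts from source to every node reachable from it."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of distinct, mutually reachable nodes."""
    total = 0
    pairs = 0
    for source in graph:
        for target, hops in bfs_distances(graph, source).items():
            if target != source:
                total += hops
                pairs += 1
    return total / pairs if pairs else 0.0


# A ring of six nodes: each node is linked to its two immediate neighbours.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
ring_average = average_path_length(ring)
```

Running one BFS per node costs time proportional to nodes times edges, so on large graphs one would usually sample source nodes instead of visiting all of them.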

  36. Why do we want to understand this?

  37. Why Study Small World property? • Purely scientific: • Why is there something this universal ? • Many very concrete applications: • Designing peer-to-peer systems (Napster, Gnutella), building computer networks • How to spread information with limited budget, say about an upcoming movie • How to stop spreading of viral infections?

  38. How can we explain this? • What if we could hypothesize how networks are formed? • Basic intuition: models have to contain an element of structured relations as well as random elements • For example, in social networks: • structured friendships: college classmates • different interests: people have different groups of friends • random friendships: met on a train ride • Still an ongoing area of research…

  39. The Structure of Social Networks • Small diameter • Strongly connected (many short paths) • There exist highly connected people • High clustering coefficient • There are ‘short range’ and ‘long range’ edges • Local routing algorithms are successful What other types of networks have this property?

  40. Erdős–Rényi Graphs • Classic random graph model • $G(n, p)$: for $n$ nodes, add every edge independently with probability $p$
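
A minimal sketch of sampling from this model, assuming nodes are labelled 0 to n-1 and the graph is stored as a dict of neighbour sets. The function name `erdos_renyi` and the example parameters are illustrative choices, not part of the slides.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Sample a G(n, p) graph: each of the n*(n-1)/2 possible edges appears independently with probability p."""
    rng = random.Random(seed)
    graph = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Example parameters chosen only for illustration.
sample = erdos_renyi(200, 0.05, seed=1)
mean_degree = sum(len(neighbours) for neighbours in sample.values()) / len(sample)
```

This loops over all pairs, so it takes quadratic time in n; that is fine for experiments with a few thousand nodes but not for web-scale graphs.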

  41. Erdős–Rényi Properties • Not connected (whp) unless $p > \ln n / n$ • No real clustering • Every vertex has the same expected degree • Doesn’t really have any underlying structure. Not a good model of a social network.
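
A short worked calculation behind "same expected degree" and "no real clustering", offered as a sketch. It assumes the standard $G(n, p)$ definition with independent edges and, as an example value only, an edge probability near the usual connectivity threshold $\ln n / n$.

```latex
\begin{align*}
  \mathbb{E}[\deg(v)] &= (n - 1)\,p
    && \text{each of the other } n - 1 \text{ nodes is a neighbour with probability } p \\
  \Pr[\,a \sim b \mid a, b \text{ both neighbours of } v\,] &= p
    && \text{edges are independent, so the expected clustering is } p \\
  p \approx \frac{\ln n}{n} \;\Longrightarrow\; \text{clustering} &\approx \frac{\ln n}{n} \to 0
    && \text{as } n \to \infty
\end{align*}
```

So a sparse random graph that is just barely connected has almost no triangles, unlike a friendship network.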

  42. Watts-Strogatz Model • Parameters: $n$, $k$, $p$ • Construct a ring with $n$ vertices. Connect each to its $k$ nearest neighbors. • Rewire each edge with probability $p$
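
The construction can be sketched as follows. This is one reasonable reading of the model, assuming an even number `k` of nearest neighbours (`k/2` on each side) and rewiring that keeps one endpoint fixed while avoiding self-loops and duplicate edges; the function name and these details are assumptions rather than something the slides specify.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring of n vertices, each joined to its k nearest neighbours, then each edge rewired with probability p."""
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)
    graph = {v: set() for v in range(n)}

    # Step 1: ring lattice, k/2 neighbours on each side.
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            graph[v].add(w)
            graph[w].add(v)

    # Step 2: visit every lattice edge once and rewire its far endpoint with probability p.
    for offset in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + offset) % n
            if rng.random() < p:
                candidates = [x for x in range(n) if x != v and x not in graph[v]]
                if candidates:
                    new_w = rng.choice(candidates)
                    graph[v].discard(w)
                    graph[w].discard(v)
                    graph[v].add(new_w)
                    graph[new_w].add(v)
    return graph


lattice = watts_strogatz(20, 4, 0.0)
rewired = watts_strogatz(20, 4, 0.5, seed=3)
```

Rewiring moves edges rather than adding them, so the number of edges stays at `n * k / 2` for every value of `p`.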

  43. Watts-Strogatz Properties • Has local and long-range edges • Path lengths • approach $\ln n / \ln k$ as $p \to 1$ • Clustering Coefficient • starts at ¾, decreases to about $k/n$ as $p \to 1$ • Degree distribution • same as G(n,p). The key feature of the model is that rewiring allows ‘weak ties’.
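
The "starts at ¾" figure can be checked on the unrewired ring. The sketch below assumes the usual local clustering coefficient (the fraction of a node's neighbour pairs that are themselves linked) and compares it with the closed form $3(k-2)/(4(k-1))$ for a ring lattice, which tends to ¾ as $k$ grows. Function names are illustrative.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n vertices where each vertex is joined to its k nearest neighbours (k even)."""
    graph = {v: set() for v in range(n)}
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            graph[v].add(w)
            graph[w].add(v)
    return graph


def local_clustering(graph, v):
    """Fraction of pairs of v's neighbours that are linked to each other."""
    neighbours = graph[v]
    if len(neighbours) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * linked / (len(neighbours) * (len(neighbours) - 1))


def average_clustering(graph):
    """Mean of the local clustering coefficient over all vertices."""
    return sum(local_clustering(graph, v) for v in graph) / len(graph)


def lattice_formula(k):
    """Closed form for the ring lattice; approaches 3/4 for large k."""
    return 3 * (k - 2) / (4 * (k - 1))


measured = average_clustering(ring_lattice(30, 4))
```

For `k = 4` both the measurement and the formula give one half; the value climbs toward ¾ only for larger `k`.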

  44. What can we say about when short paths can be found with local information?

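  45. Kleinberg’s Small World Networks • How does the network structure affect being able to locally find short paths? • Start with an $n \times n$ grid. • Add a long-range edge $(u, v)$ with probability proportional to $d(u, v)^{-r}$, where $d$ is the grid distance • As $r$ changes, what happens?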
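
A sketch of how a node's long-range contact could be drawn, assuming the standard form of Kleinberg's model: on an n-by-n grid, node u picks v with probability proportional to the lattice (Manhattan) distance raised to the power -r. Function names are made up for illustration.

```python
import random


def lattice_distance(a, b):
    """Manhattan distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def long_range_weights(u, n, r):
    """Unnormalised weights d(u, v)**(-r) for every grid point v other than u."""
    return {
        (i, j): lattice_distance(u, (i, j)) ** (-r)
        for i in range(n)
        for j in range(n)
        if (i, j) != u
    }


def sample_long_range_contact(u, n, r, rng):
    """Draw one long-range contact for u according to the weights above."""
    weights = long_range_weights(u, n, r)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[v] for v in nodes], k=1)[0]


rng = random.Random(0)
contact = sample_long_range_contact((5, 5), 11, 2, rng)
```

With `r = 0` every other node is equally likely; larger `r` concentrates the contacts near `u`.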

  46. Decentralized Routing • $s$ is given a message to send to $t$ • $s$ knows where $t$ is on the grid • Try to get the message to $t$ as fast as possible • $s$ can only see its own links. Without the random edges, any message can be routed in $O(n)$ time on an $n \times n$ grid.
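
Without random edges, the routing rule reduces to walking along the grid toward the target. The sketch below assumes an n-by-n grid with coordinates as (row, column) tuples; it shows that the number of steps equals the Manhattan distance, which is at most 2(n - 1). Function names are illustrative.

```python
def greedy_step(current, target):
    """Move to a grid neighbour that is one step closer to the target (uses only local links)."""
    x, y = current
    tx, ty = target
    if x != tx:
        return (x + (1 if tx > x else -1), y)
    return (x, y + (1 if ty > y else -1))


def route_on_grid(source, target):
    """Number of hops greedy routing needs on a plain grid with no long-range edges."""
    steps = 0
    current = source
    while current != target:
        current = greedy_step(current, target)
        steps += 1
    return steps


# Corner to opposite corner of a 10 x 10 grid.
corner_to_corner = route_on_grid((0, 0), (9, 9))
```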

  47. Kleinberg’s Results • All long-range links equally likely ($r = 0$) • Short paths exist (whp) • They can’t be found with local information. Thm: When $r = 0$, the expected delivery time of any decentralized algorithm is at least $\Omega(n^{2/3})$.

  48. Kleinberg’s Lower Bound

  49. Algorithmically • When $r = 2$, decentralized routing delivers messages in an expected $O(\log^2 n)$ steps • All other values of $r$ require time polynomial in $n$. Why $r = 2$?
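
A small simulation can make the role of r tangible. This sketch rests on several assumptions: the standard Kleinberg setup (n-by-n grid, one long-range contact per node drawn with probability proportional to distance to the power -r), greedy forwarding to whichever known contact is closest to the target, and contacts sampled lazily when a node first holds the message. All names are illustrative, and on small grids the asymptotic gap between values of r is not guaranteed to show.

```python
import random


def lattice_distance(a, b):
    """Manhattan distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sample_contact(u, n, r, rng):
    """Draw u's long-range contact with probability proportional to d(u, v)**(-r)."""
    nodes = [(i, j) for i in range(n) for j in range(n) if (i, j) != u]
    weights = [lattice_distance(u, v) ** (-r) for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


def grid_neighbours(u, n):
    """The up to four local contacts of u on the grid."""
    x, y = u
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in candidates if 0 <= i < n and 0 <= j < n]


def greedy_route(source, target, n, r, rng):
    """Forward the message to the known contact closest to the target until it arrives."""
    current = source
    steps = 0
    while current != target:
        # Greedy routing never revisits a node, so sampling the contact on arrival is safe.
        options = grid_neighbours(current, n) + [sample_contact(current, n, r, rng)]
        current = min(options, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps


def mean_delivery_time(n, r, trials, seed=0):
    """Average greedy delivery time over random source/target pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        source = (rng.randrange(n), rng.randrange(n))
        target = (rng.randrange(n), rng.randrange(n))
        total += greedy_route(source, target, n, r, rng)
    return total / trials


# Small illustrative run; grid size and trial count are arbitrary.
results = {r: mean_delivery_time(10, r, trials=20) for r in (0, 2, 4)}
```

Each hop moves strictly closer to the target, so the walk always terminates and never takes more steps than the plain grid distance.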

  50. Geometry of the Network (figure: a long-range link of length $x$ between $u$ and $v$). The expected length of $x$ is based on $r$.
