February 17 2012 - PowerPoint PPT Presentation

Presentation Transcript

  1. WST500 Online social networks: Trends in research and the small world phenomenon Meeyoung Cha, Assistant Professor, Graduate School of Culture Technology, KAIST meeyoungcha@kaist.edu February 17 2012

  2. Roadmap [Credit: zastaviki.com] • Trends in social networks • Basic concepts • Sentiment analysis • Six degrees of separation • Milgram’s experiment • Kevin Bacon game • Network models

  3. The new science of networks • Why is the role of networks in computer science, information science, social science, physics, economics, and biology expanding? • Rise of the Web and social media led to more data • Online communities (Facebook 500M), news and micro-blogging sites • Shared vocabulary between different fields • Network as a set of weakly interacting entities • Helps us find patterns in seemingly complex structure [Example networks pictured: Internet, citation, sexual contact, yeast protein]

  4. What we can learn from social media “I am @Starbucks” “I’m bored” “Check out this link http://...” “Fire here!” “Great shampoo!” “Listening to …” [Credit: teslasociety.com] • Social media help us gain new insight into the world we live in and answer long-standing questions in social science

  5. Can public sentiments expressed in social media predict the stock market? [Credit: wired.com]

  6. Yes, based on a Twitter study • Analysis of public tweets from Feb 28th – Dec 19th 2008 (9.8 million tweets by 2.7 million users) http://arxiv.org/abs/1010.3003 By Johan Bollen, Huina Mao, Xiao-Jun Zeng, Oct 2010 • Out of various types of emotions, “calmness” lines up very well with the Dow Jones Industrial Average • Training a machine-learning algorithm on the prior 3 days of data could predict the daily up/down movement of the index with 86.7% accuracy

  7. Data methodology • Step 1: Finding opinions • Focused on explicit mood statements, e.g. “I am feeling”, “makes me”, “I am” • Excluded tweets with URLs in order to avoid spam messages • Step 2: Finding mood dimensions • Used a psychology dictionary, POMS (Profile of Mood States), that gives scores to words across different mood states [McNair, Lorr, and Droppleman, 1979] • Authors extended POMS to cover more recently used words from Google and measured six mood states: calm, alert, sure, vital, kind, happy, e.g. “I feel nervous about doing something new”
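
The two steps above can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the regular expressions, the `TOY_LEXICON` weights, and the function names are all assumptions made up for this example, and the tiny lexicon stands in for the extended POMS dictionary, which is not reproduced here.

```python
import re

# Assumed patterns for explicit mood statements, based on the slide's examples.
MOOD_PATTERN = re.compile(r"\b(i am feeling|i feel|makes me|i am|i'm)\b", re.IGNORECASE)
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)

# The six mood dimensions named on the slide.
DIMENSIONS = ("calm", "alert", "sure", "vital", "kind", "happy")

# Toy lexicon (hypothetical weights): word -> {dimension: weight}. NOT the real POMS.
TOY_LEXICON = {
    "nervous": {"calm": -1.0},
    "relaxed": {"calm": 1.0},
    "grateful": {"happy": 1.0},
    "bored": {"vital": -1.0},
}

def is_mood_tweet(text):
    """Step 1: keep explicit mood statements and drop tweets that contain URLs."""
    return bool(MOOD_PATTERN.search(text)) and not URL_PATTERN.search(text)

def score_tweet(text):
    """Step 2: sum the lexicon weights of the tweet's words per mood dimension."""
    scores = {d: 0.0 for d in DIMENSIONS}
    for word in re.findall(r"[a-z']+", text.lower()):
        for dim, weight in TOY_LEXICON.get(word, {}).items():
            scores[dim] += weight
    return scores

def daily_mood(tweets):
    """Average each mood dimension over the tweets that pass the Step 1 filter."""
    selected = [t for t in tweets if is_mood_tweet(t)]
    totals = {d: 0.0 for d in DIMENSIONS}
    for tweet in selected:
        for dim, score in score_tweet(tweet).items():
            totals[dim] += score
    n = max(len(selected), 1)  # avoid division by zero on days with no mood tweets
    return {d: totals[d] / n for d in DIMENSIONS}

sample = ["I feel nervous about doing something new", "Fire here!"]
mood = daily_mood(sample)
```

Computing `daily_mood` once per day would give one time series per mood dimension, which is the kind of signal the study compares against the stock index.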

  8. Sanity Check • Could confirm that people are anxious the day before the US election • On Thanksgiving, “happy” score spiked

  9. Roadmap [Credit: zastaviki.com] • Trends in social networks • Basic concepts • Sentiment analysis • Six degrees of separation • Milgram’s experiment • Kevin Bacon game • Network models

  10. Small world phenomenon • A network is a small world if all nodes are connected to all other nodes through relatively short distances. 1. How short is “relatively” short? 2. Do there exist models of networks that produce short paths?

  11. Interview with Duncan Watts

  12. Milgram experiment (1969) Stanley Milgram • Asked random people from Nebraska to send a letter (via intermediaries) to a stock broker in Boston • Could only send letters to those acquainted on a first-name basis • 296 volunteers participated with the help of 453 intermediaries. Ultimately, 29% of the letters reached the target! [Travers and Milgram, Sociometry 1969]

  13. Six degrees of separation Mean = 5.2 • The (successful) chains were on average about six hops long, suggesting that everyone is connected to everyone else through roughly six links. [Travers and Milgram, Sociometry 1969]

  14. Erdős numbers - 1 Paul Erdős (1913-1996) • Number of links required to connect scholars to Erdős via co-authorship of papers • Erdős wrote 1500+ papers with 507 co-authors • Jerry Grossman’s site allows mathematicians to compute their Erdős numbers: • http://www.oakland.edu/enp/ • Connecting path lengths, among mathematicians only: • The average is 4.65 • The maximum is 13

  15. Erdős numbers - 2 [Figure: histogram of author count vs. distance to Paul Erdős] Scientists are linked to one another through the papers they write, because coauthorship represents a strong social link. • Collaboration graph • Nodes are authors and links mean coauthor relationships • Erdős number: distance from Paul Erdős • 280,000 reachable nodes; mean path length = 4.65 links
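
The collaboration graph and the distance computation described above can be sketched with a breadth-first search. The paper list and author names below are invented toy data, and the function names are just for this example; a real computation would start from a bibliographic database.

```python
from collections import deque
from itertools import combinations

def build_collaboration_graph(papers):
    """Nodes are authors; an edge links every pair of coauthors of a paper."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def distances_from(graph, source):
    """Breadth-first search: number of links from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# Toy data (hypothetical): each inner list is the author list of one paper.
papers = [["Erdos", "A"], ["A", "B"], ["B", "C", "D"], ["E"]]
graph = build_collaboration_graph(papers)
erdos_numbers = distances_from(graph, "Erdos")

# Mean path length over reachable authors only; "E" has no path to Erdos.
reachable = {a: d for a, d in erdos_numbers.items() if a != "Erdos"}
mean_erdos_number = sum(reachable.values()) / len(reachable)
```

The same search, run on a graph whose edges link actors who appeared in the same movie, gives the distance to Kevin Bacon.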

  16. Kevin Bacon Game Boxed version of the Kevin Bacon Game • Invented by Albright College students in 1994 • Goal is to connect any actor to Kevin Bacon, by linking actors who acted in the same movie • Oracle of Bacon website uses Internet Movie Database (IMDB.com) to find shortest link between any two actors: http://oracleofbacon.org/ • Total # of actors in database: ~550,000 • Average path length to Kevin: 2.79 • Actor closest to “center”: Rod Steiger (2.53) • Most actors are within 3 links of each other!

  17. Six Degrees of Kevin Bacon

  18. http://oracleofbacon.org/ Example: 전도연 (Jeon Do-yeon) to Kevin Bacon

  19. What about online social networks? • Snail letter network, co-authorship network, movie network • Small-scale examples • Direct human networks, where social link is strong => Hence the chance of having a short path could increase • Online social networks • Tens of millions of users and links • Social links not necessarily based on direct encounters => Should we expect social networks to have short paths too?

  20. Example 1: Microsoft IM • Leskovec and Horvitz (2007) • 180 million nodes and 1.3 billion edges in the messenger network • Mean path length = 6.6 links [Leskovec and Horvitz, WWW 2007]

  21. Example 2: LiveJournal • Liben-Nowell et al. (2005) • Greedy geographic routing to friend closest to destination • Mean path length = 4.12 links [Liben-Nowell et al., PNAS 2005]
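
Greedy geographic routing can be sketched as follows: each holder forwards the message to whichever friend is geographically closest to the destination, and the chain fails if no friend is closer than the current holder. This is a minimal illustration under assumed conventions (planar coordinates, Euclidean distance, a hypothetical `greedy_route` function and toy graph), not the procedure used in the cited study.

```python
import math

def greedy_route(graph, coords, source, target, max_hops=100):
    """Forward to the friend nearest the target; return the path, or None if stuck."""
    path = [source]
    current = source
    while current != target and len(path) - 1 < max_hops:
        here = math.dist(coords[current], coords[target])
        best = min(
            graph[current],
            key=lambda n: math.dist(coords[n], coords[target]),
            default=None,
        )
        # Stop when there is no friend, or no friend is strictly closer to the target.
        if best is None or math.dist(coords[best], coords[target]) >= here:
            return None
        path.append(best)
        current = best
    return path if current == target else None

# Toy example (made-up coordinates and friendships).
coords = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (5, 0), "e": (0, 3)}
graph = {"a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"], "e": ["a"]}
route = greedy_route(graph, coords, "a", "d")
```

Each step uses only local information (the holder's own friends and the target's location), which is why a short mean path length found this way is notable.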

  22. Small world networks • Based on Milgram’s (1967) famous work, the substantive point is that networks are structured such that even when most of our connections are local, any pair of people can be connected by a fairly small number of relational steps.

  23. 6-degrees: Should we be surprised? How can we understand the small world phenomena? Is there a good model? [Credit: Jure Leskovec] • Assume each person is connected to 100 other people • So • In step 1, one can reach 100 people • In step 2, one can reach 100x100 = 10,000 people • … • In step 5, one can reach 10 billion people • What’s not obvious here? • Many edges are local (i.e., friend of a friend)
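
The back-of-the-envelope count above is simple exponentiation. The sketch below assumes, as the slide does, that no two people share a friend; the function name `naive_reach` is made up for this example.

```python
def naive_reach(friends_per_person, steps):
    """People reachable at exactly `steps` hops if nobody shares any friends."""
    return friends_per_person ** steps

# Reach after each of the first five steps with 100 friends per person.
reach_by_step = {k: naive_reach(100, k) for k in range(1, 6)}
for step, people in reach_by_step.items():
    print(step, people)
```

Because many edges are local (friends of friends are often already friends), the real reach grows much more slowly than this count suggests.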

  24. Transition from regular to random? • Do there exist models of networks that have high clustering and low diameter (like real networks)? • Studied shift from structured networks (lattices) to “random” networks

  25. Watts and Strogatz Model (1998) • 6 billion nodes on a circle • Each connected to 1,000 neighbors • Start rewiring links randomly • Calculate “average path length” and “clustering” as the network starts to change • Network changes from structured to random • APL: starts at 3 million, decreases to 4 (!) • Clustering: probability that two nodes linked to a common node will be linked to each other (degree of overlap) • Clustering: starts at 0.75, decreases to 1 in 6 million • So what happens along the way? [Watts and Strogatz, Nature 1998]

  26. From regularity to randomness • “Rewire” each edge of the lattice independently with probability p, then examine the average distance L(p) and clustering coefficient C(p) [Watts and Strogatz, Nature 1998]
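
The rewiring experiment can be sketched in plain Python on a small graph. The sizes (`n=200`, `k=8`), the sampled values of p, and the function names are assumptions chosen so the example runs quickly; the rewiring rule is a simplified reading of the procedure, not a faithful reproduction of the paper's setup.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """n nodes on a circle, each linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, k, p, rng):
    """With probability p, move the far end of each lattice edge to a random node."""
    n = len(adj)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                # Avoid self-loops and duplicate edges.
                choices = [v for v in range(n) if v != i and v not in adj[i]]
                if choices:
                    new = rng.choice(choices)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def average_path_length(adj):
    """L: mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """C: average fraction of a node's neighbour pairs that are themselves linked."""
    coeffs = []
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for v in nbrs for w in nbrs if v < w and w in adj[v])
        coeffs.append(2 * links / (d * (d - 1)))
    return sum(coeffs) / len(coeffs)

rng = random.Random(0)  # fixed seed so runs are repeatable
for p in (0.0, 0.01, 0.1, 1.0):
    g = rewire(ring_lattice(200, 8), 8, p, rng)
    print(p, average_path_length(g), clustering_coefficient(g))
```

Printing L(p) and C(p) side by side for several values of p is one way to look for the regime the slides describe, where the average distance has already dropped while clustering is still high.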

  27. Small worlds around us • Caenorhabditis elegans: 959 cells; genome sequenced 1998; nervous system mapped → small world network • Power grid network of Western States: 5,000 power plants with high-voltage lines → small world network

  28. Scale-free network Colorado Springs High-Risk (Sexual contact only) Network is power-law (a=-1.3) • The scale-free model focuses on the distance-reducing capacity of high-degree nodes, as ‘hubs’ create shortcuts that carry the disease.

  29. Implications Applications to • Spread of diseases (foot-and-mouth disease, computer viruses, AIDS) • Spread of fashions • Spread of knowledge Small-world networks are: • Robust to random failures • Vulnerable to selectively targeted attacks

  30. Conclusion: small world phenomenon • Watts and Strogatz demonstrated that small world properties can occur in graphs with a surprisingly small number of shortcuts • 1. How short is “relatively” short? Empirical evidence shows six degrees of separation. People can find efficient routes with only local information. • 2. Do there exist models of networks that produce short paths? A small number of long-range “shortcuts” suffices to significantly reduce average distance.