A Study of social influence in diffusion of innovation over Facebook

A Study of social influence in diffusion of innovation over Facebook. Shaomei Wu sw475@cornell.edu Information Science Cornell University Information Science Breakfast, Dec 5, 2008. Diffusion of Innovation.



Presentation Transcript


  1. A Study of social influence in diffusion of innovation over Facebook Shaomei Wu sw475@cornell.edu Information Science Cornell University Information Science Breakfast, Dec 5, 2008

  2. Diffusion of Innovation “Diffusion is the process in which an innovation is communicated through certain channels over time among the members of a social system.” (Everett M. Rogers*) • “Innovation”: Friendship Quiz, a Facebook application • “Communicated”: invitations among Facebook friends • “Time”: September 25, 2008 to now • “Social system”: Facebook * Rogers, Everett M. (2003). Diffusion of Innovations, 5th ed. New York, NY: Free Press, pp. 5-6.

  3. Basic Diffusion Models • Threshold Model ⇔ Cascade Model: statistically equivalent* * David Kempe, Jon Kleinberg, Éva Tardos. Maximizing the Spread of Influence through a Social Network. KDD, 2003.

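  4. Cascade Model • Each recommendation succeeds with a certain probability p_uv (for a recommendation from u to v). [Figure: example network of nodes a through l; recommendation edges are labeled with their probabilities (p_ab, p_ac, p_ad, p_ae, p_af, p_ag, p_di, p_dj, p_gk, p_gl); legend: non-adopter, adopter, social link, recommendation.] Question: how to estimate p_uv?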
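The cascade model on this slide can be sketched as a short simulation. This is a minimal illustration, not the study's code: the toy network, the probabilities in `p`, and the function name `simulate_cascade` are all assumptions made for the example.

```python
import random
from collections import deque

def simulate_cascade(p, seeds, rng):
    """Run one cascade and return the final set of adopters.

    p     -- dict mapping (u, v) to the probability that u's
             recommendation to v succeeds
    seeds -- initial adopters
    rng   -- a random.Random instance (passed in for reproducibility)
    """
    out_edges = {}
    for (u, v), prob in p.items():
        out_edges.setdefault(u, []).append((v, prob))

    adopters = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v, prob in out_edges.get(u, []):
            # Each adopter gets a single chance to convert each neighbor.
            if v not in adopters and rng.random() < prob:
                adopters.add(v)
                frontier.append(v)
    return adopters

# Hypothetical toy network: a's invitation to b always succeeds,
# its invitation to c never does, and d's invitation to c is a coin flip.
p = {("a", "b"): 1.0, ("a", "c"): 0.0, ("b", "d"): 1.0, ("d", "c"): 0.5}
result = simulate_cascade(p, ["a"], random.Random(0))
print(sorted(result))
```

Repeating the simulation many times and averaging the number of adopters gives an estimate of expected spread for a given set of p_uv values.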

  5. Question: how to estimate p_uv? • Current practice • Constant [1] • Based ONLY on network structure (e.g., in/out-degree) [2] • Do individuals and the social relationships among them matter? [1] Jure Leskovec, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst. Cascading Behavior in Large Blog Graphs. SDM 2007. [2] Jure Leskovec, Lada Adamic, Bernardo Huberman. The Dynamics of Viral Marketing. ACM Conference on Electronic Commerce (EC) 2006.

  6. Theories from Empirical Diffusion Research: • Opinion leaders: they have “greater exposure to mass media than their followers”, “are more cosmopolite”, “have greater social participation”, “have higher socioeconomic status”, and “are more innovative” [Rogers, 2003, pp. 316-318]. • The importance of heterophily between participants on certain attributes (i.e., education and socioeconomic status) in determining the efficiency of diffusion, despite the fact that “more effective communication occurs when two or more individuals are homophilous” [Rogers, 2003, p. 19].

  7. This project is to… • Model the p_uv values for the cascade model • Identify the most influential factors in determining p_uv • Predict the success of contagion • Exploit Facebook data • A real-world, ongoing diffusion instance; • Rich and (most of the time) trustworthy profile information on individuals and their social connections/activities; • A precisely timestamped diffusion process with a complete log of events.

  8. Status • Launched: Sep 25, 2008. • Data currently used runs through Nov 25, 2008: • 216 adopters, • 375 individuals, • 737 edges between 266 pairs of people, • 90 successful infections, • 178 failed infections. • [Figure: network evolution in the first month after release]
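As a rough point of reference, the invitation counts on this slide give the single constant success probability that the "constant p_uv" practice would use, and the accuracy of a trivial classifier that always predicts failure. This is simple arithmetic on the slide's numbers, offered as a sketch; the variable names are illustrative, and later slides may evaluate on a different subset of records.

```python
# Counts taken from the Status slide.
successes = 90
failures = 178
total = successes + failures

# Constant estimate: the overall fraction of invitations that succeeded.
p_constant = successes / total

# Accuracy of always predicting the majority outcome (failure).
majority_baseline_accuracy = max(successes, failures) / total

print(f"invitations with a known outcome: {total}")
print(f"constant p estimate: {p_constant:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline_accuracy:.3f}")
```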

  9. Predict the success of invitation with SVM • A Binary classifier: • each invitation is either successful or failed. • Features • Individual features • Pair features (homophily/heterophily)

  10. Individual Features • Features: # of events attended/invited; # of photos tagged; # of wall posts; # of networks; # of groups participated in; # of notes; religion; political view; gender; age; cultural background; relationship status; work info; education info. • Feature groups labeled on the slide: Social Activeness, Innovativeness, Socioeconomics, Education.

  11. Pair-wise Features • Features: age difference; same gender?; same political view?; same religion?; same cultural background?; # of same networks; # of photos both tagged in; # of groups both participated in; # of events both attended; same education level?; same high school?; same college?; same workplace?; same current city? • Feature groups labeled on the slide: Biological traits, Belief, Socioeconomics, Proximity.
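A few of the pair-wise features above could be computed from two profile records along the following lines. This is only a sketch: the profile field names (`age`, `gender`, `networks`, and so on) and the choice to treat a missing value as "not the same" are assumptions, not details given in the slides.

```python
def pair_features(sender, receiver):
    """Build pair-wise features for one invitation from two profile dicts.

    All field names are hypothetical. A categorical value that is missing
    on either side never counts as a match.
    """
    def same(key):
        a, b = sender.get(key), receiver.get(key)
        return int(a is not None and a == b)

    def shared(key):
        # Number of items (networks, groups, events) both people have.
        return len(set(sender.get(key, ())) & set(receiver.get(key, ())))

    return {
        "age_difference": abs(sender["age"] - receiver["age"]),
        "same_gender": same("gender"),
        "same_political_view": same("political_view"),
        "same_religion": same("religion"),
        "same_college": same("college"),
        "same_current_city": same("current_city"),
        "common_network_count": shared("networks"),
        "common_group_count": shared("groups"),
        "common_event_count": shared("events"),
    }

# Made-up example profiles.
alice = {"age": 24, "gender": "female", "religion": None, "college": "Cornell",
         "current_city": "Ithaca", "networks": ["Cornell", "Ithaca, NY"],
         "groups": [101, 102, 103], "events": [7]}
bob = {"age": 27, "gender": "male", "religion": None, "college": "Cornell",
       "current_city": "Ithaca", "networks": ["Cornell"],
       "groups": [102, 103, 104], "events": []}
features = pair_features(alice, bob)
print(features)
```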

  12. Training Data • Each invitation is a training example for machine learning. • All numerical features are normalized across examples.
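The slide does not say which normalization was applied. One common choice, shown here purely as an assumption, is min-max scaling of each numerical feature to the range [0, 1] across all training examples; the function name `min_max_normalize` is illustrative.

```python
def min_max_normalize(rows):
    """Scale each column of equal-length numeric rows to [0, 1].

    A column whose values are all identical is mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return normalized

# Made-up rows, e.g. (wall post count, group count, network count).
raw = [[0, 10, 5], [5, 20, 5], [10, 30, 5]]
scaled = min_max_normalize(raw)
print(scaled)
```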

  13. AdaBoost (with DecisionStump) • A popular way to do feature selection. • Selected features • sender wall post count • sender group count • sender network count • receiver age • receiver group count • sender & receiver common group count • Performance (10-fold cross-validation) • Accuracy: 83.6%
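To show why boosting one-feature decision stumps doubles as feature selection, here is a minimal from-scratch sketch. The slides do not say which implementation was used, so this is not the study's code; the toy data and all function names are assumptions. Each boosting round picks the single feature and threshold with the lowest weighted error, and the features picked across rounds form the selected set.

```python
import math

def stump_predict(row, j, t, polarity):
    # Predict `polarity` when feature j exceeds threshold t, else -polarity.
    return polarity if row[j] > t else -polarity

def best_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost(X, y, rounds):
    """Train AdaBoost on labels in {-1, +1}; return a list of weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, polarity))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in model)
    return 1 if score >= 0 else -1

def selected_features(model):
    return sorted({j for _, j, _, _ in model})

# Toy data: the label depends only on feature 0 (positive when it exceeds 2).
X = [[1, 5, 0], [2, 3, 1], [3, 4, 0], [4, 1, 1], [1, 2, 1], [4, 5, 0]]
y = [-1, -1, 1, 1, -1, 1]
model = adaboost(X, y, rounds=3)
print(selected_features(model))
```

On this deliberately easy toy data only feature 0 is ever chosen; on real data several features are typically selected across rounds.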

  14. SVM performance • SVM-light (10-fold cross-validation)

  15. Weights from SVM

  16. Result • SVM-light performance • 209 records split into 5 folds: 4 for training, 1 for testing. • Performance on the testing set: • Accuracy: 71.43% (30 correct, 12 incorrect, 42 total) • Precision/recall: 55.56%/38.46% • Feature weight distribution • Top-weighted features (index, name): 8, sender_events_invited; 4, sender_friend_count; 11, sender_gender; 35, receiver_is_It's Complicated; 5, sender_wall_post_count; 9, sender_note_count; 27, sender_is_In a Relationship. • So the story can be: when a sender who has been invited to more Facebook events, has more friends, has written more Facebook notes (blog entries), is female, has fewer wall posts, and is in a relationship tries to infect a person whose relationship status is “It's Complicated”, the infection is more likely to happen than in other cases.
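The reported accuracy, precision, and recall can be checked against a confusion matrix. The slide gives only the summary figures; the counts below (5 true positives, 4 false positives, 8 false negatives, 25 true negatives) are inferred here as a confusion matrix consistent with those figures and are not stated in the source. The function name is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Inferred counts (an assumption): 30 correct = 5 + 25, 12 incorrect = 4 + 8.
metrics = classification_metrics(tp=5, fp=4, fn=8, tn=25)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under these counts the classifier flags only 9 of the 42 test invitations as successful, which is why recall is low even though accuracy is above 70%.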

  17. SVM with features selected by AdaBoost

  18. Background • Diffusion of Innovation • Questions: • How does it work in large online social networks? • What are the key factors in determining the success of infection? • Can we predict the propagation path?

  19. Hypothesis • Social influence depends on 5 dimensions of similarity: • geographical distance: current location (country/state/city), current school, current major, year of class, current workplace, current courses enrolled; • background similarity: sex, sexual preference, dating interest, relationship interest, relationship status, birthday, political view, religious view, hometown address, previous school, previous workplace; • social similarity: number of mutual networks they belong to, number of mutual friends; • interest similarity: activities, favorite books, favorite music, favorite movies, favorite TV shows, favorite quotes; • social status distance: difference in number of friends, difference in wall post counts, difference in counts of messages sent and received, difference in note counts.

  20. Project Description • Objectives • Identify the key factors for social influence; • Predict the occurrence of adoption based on the key factors. • Friendship Quiz • A Facebook application we developed; • Enables users to make quizzes and send them to their friends (take a peek!); • We track the spread of the application.

  21. Highlights • A real-world diffusion of innovation; • Rich and (most of the time) trustworthy profile information on individuals and their social connections/activities; • A precisely timestamped diffusion process with a complete log of events; • An ongoing diffusion process.

  22. Backup: Threshold Model
