You Are Who You Know: Inferring User Profiles in Online Social Networks Third ACM International Conference on WSDM 2010

Alan Mislove, Bimal Viswanath, Krishna P. Gummadi, Peter Druschel. MPI-SWS, Rice University, Northeastern University.

Presentation Transcript


  1. You Are Who You Know: Inferring User Profiles in Online Social Networks. Third ACM International Conference on WSDM 2010. Alan Mislove, Bimal Viswanath, Krishna P. Gummadi, Peter Druschel. MPI-SWS (Saarbrücken, Germany), Rice University (Houston, TX, USA), Northeastern University (Boston, MA, USA)

  2. Outline • Introduction • Approach • Results • My thoughts

  3. Introduction

  4. Facebook and personal data • Users upload information to sites like Facebook: profile information, status updates, photos, videos

  5. Inferring User Profiles in Online Social Networks • Given attributes for some users, we can infer the attributes of the remaining users. • For example, school: NCCU.

  6. Facebook and personal data (cont.) • Privacy model for data: choose what to reveal and what to keep private • When reasoning about privacy, we often don't consider implicit data: what our friends reveal about ourselves • What is implicit data? Exploiting homophily: people associate with others who share common attributes

  7. Approach

  8. Roadmap • Idea: Use communities to infer attributes • Collect fine-grained community data • Do attribute-based communities exist? • How well can we infer user attributes?

  9. Idea: Use communities • Observation: users are more likely to be friends with users with similar attributes, and groups of users with common attributes often form dense subgraphs. • Look for groupings of users, called communities, that potentially share attributes

  10. Roadmap • Idea: Use communities to infer attributes • Collect fine-grained community data • Do attribute-based communities exist? • How well can we infer user attributes?

  11. Social network data • Crawled two Facebook networks: Rice University (university) and New Orleans (regional) • Picked a known seed user, crawled all of his friends, and added new users to the list • Effectively performed a BFS of the graph
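The crawl described on this slide can be sketched as an ordinary breadth-first search. This is only an illustration, not the authors' crawler: `bfs_crawl` and the `get_friends` callback are hypothetical names, and the callback stands in for whatever fetches a user's friend list.

```python
from collections import deque


def bfs_crawl(seed, get_friends):
    """Breadth-first crawl of a friendship graph starting from one seed user.

    `get_friends` is a hypothetical callback returning the friend list of a
    user; how friend lists are actually fetched is outside this sketch.
    """
    seen = {seed}            # users discovered so far
    queue = deque([seed])    # users whose friend lists are not yet fetched
    edges = set()            # undirected friendship links
    while queue:
        user = queue.popleft()
        for friend in get_friends(user):
            edges.add(frozenset((user, friend)))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return seen, edges


# Toy friend lists standing in for the real network.
toy = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"], "z": []}
users, links = bfs_crawl("a", lambda u: toy[u])
```

Users that are not reachable from the seed (here `"z"`) are never visited, which is one reason a BFS crawl only covers the seed's connected component.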

  14. Collecting attributes • Rice: obtained authoritative information by querying the student directory: college (dormitory), major(s), year; could not collect Facebook profiles • New Orleans: collected Facebook profiles; collected all attributes, e.g., high school, groups, gender, habits

  15. Roadmap • Idea: Use communities to infer attributes • Collect fine-grained community data • Do attribute-based communities exist? • How well can we infer user attributes?

  16. Do attributes define communities? • Put users into groups based on attributes; determine if these are communities • How good is a community division? Need a metric to rate communities • Modularity rates community strength: range is [-1, 1]; 0 represents what is expected in a random graph; ≥ 0.25 represents community structure

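  17. Partition a network into k distinct communities (e.g., $C_1$ to $C_3$) • $E_{ii}$: the set of intra-community edges in $C_i$; $E_{ij}$: the set of inter-community edges connecting $C_i$ and $C_j$ ($i \neq j$); $|E|$: the number of edges in the social network. • $e_{ii} = |E_{ii}|/|E|$, $e_{ij} = |E_{ij}|/(2|E|)$, $a_i = \sum_j e_{ij}$ (for community $C_i$)

  18. Partition a social network into 3 communities: $C_1$, $C_2$, $C_3$. • $|E| = 16$, $|E_{11}| = 5$, $|E_{12}| = 1$, $|E_{13}| = 1$ • $e_{11} = |E_{11}|/|E| = 5/16$, $e_{12} = |E_{12}|/(2|E|) = 1/32$, $e_{13} = |E_{13}|/(2|E|) = 1/32$

  19. $a_1 = \sum_{1 \le j \le 3} e_{1j} = 5/16 + 1/32 + 1/32 = 3/8$. • $e_{22} = 5/16$, $e_{33} = 3/16$, $a_2 = 3/8$, $a_3 = 1/4$. Modularity $Q = \sum_{1 \le i \le 3} (e_{ii} - a_i^2) = 15/32$. • $\mathrm{Tr}\, e = \sum_i e_{ii}$; $\|e\|$ is the sum of all the elements of $e$.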
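The worked example on slides 17 to 19 can be checked with a few lines of exact arithmetic. This is a minimal sketch: the function name `modularity` and its input layout are assumptions, and the counts $|E_{22}| = 5$, $|E_{33}| = 3$, $|E_{23}| = 1$ are not stated on the slides but follow from the given $e_{22}$, $e_{33}$, $a_2$ and $|E| = 16$.

```python
from fractions import Fraction


def modularity(intra, inter):
    """Modularity Q = sum_i (e_ii - a_i^2) from raw edge counts.

    intra: {i: |E_ii|}, the number of edges inside community i
    inter: {(i, j): |E_ij|} with one entry per unordered pair i != j
    """
    total = sum(intra.values()) + sum(inter.values())  # |E|
    q = Fraction(0)
    for i, n_ii in intra.items():
        e_ii = Fraction(n_ii, total)
        # a_i = e_ii + sum over j != i of e_ij, with e_ij = |E_ij| / (2|E|)
        a_i = e_ii + sum(
            Fraction(n, 2 * total) for pair, n in inter.items() if i in pair
        )
        q += e_ii - a_i ** 2
    return q


# Edge counts of the three-community example (|E| = 16).
intra = {1: 5, 2: 5, 3: 3}
inter = {(1, 2): 1, (1, 3): 1, (2, 3): 1}
q = modularity(intra, inter)
print(q)  # 15/32
```

Since 15/32 is roughly 0.47, this partition sits well above the 0.25 threshold that slide 16 gives for community structure.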

  20. Attribute communities for Rice ugrads • Rice ugrads don't form communities based on the major attribute

  21. Roadmap • Idea: Use communities to infer attributes • Collect fine-grained community data • Do attribute-based communities exist? • How well can we infer user attributes?

  22. Using communities to infer attributes • Can we detect a single attribute community, given a few users in the community? • Propose a new algorithm to detect a specific community. Problem: how to evaluate community strength?

  23. (Figure: regions labeled A and B)

  24. Algorithm • Given seed users, find a community by adding users and stopping at some point • At each step, add the user who increases normalized conductance by the most • Stop when no user increases normalized conductance

  25. Normalized Conductance • How strong is community A? • Metric: normalized conductance C, the fraction of A's links that stay within A, relative to a random graph • Range is [-1, 1]; 0 represents no stronger than random

  26. Let $G = (V, E)$ denote a graph, let $A \subset V$ be a subset of the vertices that forms a community, and let $B = V \setminus A$. • Define $e_{AB}$ to be the number of edges between A and B, and $e_{AA}$ as the number of edges within A.

  27. $e_A = e_{AA} + e_{AB}$: the number of edges that reach vertices within A. • $e_B = e_{AB} + e_{BB}$: the number of edges that reach vertices within B. • In a random graph, we expect that $e_{AB} = e_A e_B$.
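Slides 24 to 27 can be put together in a short sketch. The slides do not spell out the final formula, so the code assumes one reading: the fraction of A's links inside A, minus the fraction expected in a random graph, with the random-graph statement read as a proportionality so that the expected fraction is $e_A / (e_A + e_B)$. Restricting candidates to neighbours of the current community is a further assumption, and the names `normalized_conductance` and `grow_community` are illustrative.

```python
def normalized_conductance(graph, community):
    """Assumed form: e_AA / (e_AA + e_AB) - e_A / (e_A + e_B).

    `graph` maps each vertex to the set of its neighbours (symmetric).
    """
    e_aa = e_ab = e_bb = 0
    for u, neighbours in graph.items():
        for v in neighbours:
            ends_in_a = (u in community) + (v in community)
            if ends_in_a == 2:
                e_aa += 1
            elif ends_in_a == 1:
                e_ab += 1
            else:
                e_bb += 1
    # Every undirected edge was visited once from each endpoint.
    e_aa, e_ab, e_bb = e_aa // 2, e_ab // 2, e_bb // 2
    e_a, e_b = e_aa + e_ab, e_ab + e_bb
    if e_a == 0:
        return 0.0  # A touches no edges; treat as no stronger than random
    return e_aa / e_a - e_a / (e_a + e_b)


def grow_community(graph, seeds):
    """Greedily add the user that raises normalized conductance the most."""
    community = set(seeds)
    best = normalized_conductance(graph, community)
    while True:
        # Assumption: only neighbours of the community are candidates.
        frontier = set().union(*(graph[u] for u in community)) - community
        choice = None
        for user in sorted(frontier, key=repr):
            score = normalized_conductance(graph, community | {user})
            if score > best:
                choice, best = user, score
        if choice is None:
            return community  # no user increases the metric: stop
        community.add(choice)


# Two triangles joined by the single edge 3-4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
found = grow_community(graph, {1})
```

On this toy graph the search starting from user 1 stops at the first triangle, because adding user 4 would pull in more outgoing links than internal ones.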

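  28. How to evaluate? • Evaluate performance using precision and recall • Ideally want a precision and recall of 1.0 • A: users the algorithm detects as having the attribute • B: users who actually have the attribute • C: users detected by the algorithm who actually have the attribute • Precision = C/A • Recall = C/B

  29. A: users the algorithm detects as having the attribute • B: users who actually have the attribute • C: users detected by the algorithm who actually have the attribute • If A = C, then Precision = C/A = 1. But this does not mean the result is truly accurate, so the recall value (Recall = C/B) must also be considered.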
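The two measures, and the caveat on slide 29, can be illustrated with a small sketch. The function name `precision_recall` and the user labels are made up for the example; returning 0.0 for an empty set is a convention chosen here, not something the slides specify.

```python
def precision_recall(detected, actual):
    """Precision = |C| / |A| and recall = |C| / |B|, where A is the detected
    set, B is the set that really has the attribute, and C is their overlap.
    """
    detected, actual = set(detected), set(actual)
    correct = detected & actual  # C
    precision = len(correct) / len(detected) if detected else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    return precision, recall


# Everything the algorithm detects is right (A = C), yet it finds only
# half of the users who actually have the attribute.
detected = {"u1", "u2"}
actual = {"u1", "u2", "u3", "u4"}
precision, recall = precision_recall(detected, actual)
print(precision, recall)  # 1.0 0.5
```

A precision of 1.0 paired with a recall of 0.5 is exactly the situation slide 29 warns about: the detections are all correct, but half of the true community is missed.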

  30. Recall and precision for matriculation year community detection for Rice ugrads

  31. Summary • Good interpretation: can reduce the burden on users, who don't have to fill in an entire profile • Bad interpretation: can figure out attributes users don't reveal; privacy is a function of what friends reveal

  32. My thoughts • The ongoing online social network privacy debate focuses mainly on explicitly provided attributes • Demonstrated that many attributes can be inferred even if the user didn't provide them
