Anonymized social networks

Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography. What is a social network? A social network occurs anywhere there is social interaction between people.



Presentation Transcript


  1. Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography • Anonymized social networks

  2. What is a social network? • A social network occurs anywhere there is social interaction between people. • Examples include Email, instant messaging, Facebook, blogging trackbacks, coauthor networks

  3. Coauthor Network

  4. Uses of mining social networks • The structure of social networks can be interesting. • How are friendships usually structured? • Are there hubs, such as Heather, who connect separate networks? • How many degrees of Kevin Bacon? • We can investigate these questions if we have the data to mine.

  5. Email • For our examples, we will use a network of emails sent between users. • How do we protect users’ privacy while still releasing the data for research? [Figure: vertices John and Mary joined by a directed edge]

  6. Anonymization Techniques • Remove any identifying information, such as names and other attributes. • Randomly rename the vertices. [Figure: the two vertices renamed R3579X and R73313]

  7. Anonymization Techniques • Convert directed edges to undirected edges. This discards the direction of each edge, which leaves an attacker less to work with and makes the graph harder to attack. [Figure: R3579X and R73313 joined by an undirected edge]
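The two anonymization steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `anonymize`, the `R` plus five digits label format, and the sample email list are made up here, and the slides do not say how a real release would generate labels.

```python
import random

def anonymize(directed_edges, seed=None):
    """Rename vertices at random and drop edge direction.

    directed_edges: list of (sender, recipient) pairs.
    Returns (undirected_edges, mapping). The mapping is the secret
    that the data owner would keep private.
    """
    rng = random.Random(seed)
    vertices = sorted({v for edge in directed_edges for v in edge})
    # Draw distinct random numbers so that no two vertices share a label.
    numbers = rng.sample(range(10_000, 100_000), len(vertices))
    mapping = {v: f"R{n}" for v, n in zip(vertices, numbers)}
    # A frozenset forgets which endpoint was the sender, so (a, b) and
    # (b, a) collapse into a single undirected edge.
    undirected = {
        frozenset((mapping[a], mapping[b]))
        for a, b in directed_edges
        if a != b
    }
    return undirected, mapping

# Hypothetical email log: John and Mary wrote to each other, Mary wrote to Zoe.
emails = [("John", "Mary"), ("Mary", "John"), ("Mary", "Zoe")]
edges, secret = anonymize(emails, seed=0)
print(len(edges))  # 2: the two John/Mary emails merge into one edge
```

Only `edges` would be published; `secret` stays with the data owner.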

  8. Compromising privacy • Let’s say you want to know whether two vertices are connected in the graph. • All the identifying information has been removed, so how do we do it?

  9. Active Attacks! • In an active attack, the adversary creates vertices in the graph before the graph is released. • The adversary creates edges between these vertices in a pattern that it can recognize later, once the graph is released.

  10. Walk-Based Attack • We create k new vertices, with k around 2 log n, where n is the total number of vertices. • We give each new vertex between d0 and d1 edges to the other vertices in the graph. • Then, we randomly create edges between the new vertices, each with independent probability 1/2.
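A rough sketch of this construction follows. Several details are assumptions, not taken from the slides: the logarithm is base 2, each new vertex gets between d0 and d1 edges to existing vertices, and those external neighbours are chosen at random (a real attacker would choose them to single out the targeted users). The graph representation (a dict of neighbour sets) and the name `plant_subgraph` are also hypothetical.

```python
import math
import random

def plant_subgraph(graph, d0=2, d1=4, seed=None):
    """Add about 2*log2(n) attacker vertices to `graph` in place.

    graph: dict mapping each vertex to the set of its neighbours.
    Returns the list of new vertex names.
    """
    rng = random.Random(seed)
    existing = list(graph)               # snapshot before adding anything
    k = math.ceil(2 * math.log2(len(existing)))
    new = [f"x{i}" for i in range(k)]
    for x in new:
        graph[x] = set()

    def connect(a, b):
        graph[a].add(b)
        graph[b].add(a)

    # Internal edges: each pair of new vertices is linked independently
    # with probability 1/2.
    for i in range(k):
        for j in range(i + 1, k):
            if rng.random() < 0.5:
                connect(new[i], new[j])

    # External edges: each new vertex gets between d0 and d1 edges to
    # vertices that were already in the graph (assumed reading of d0, d1).
    for x in new:
        for v in rng.sample(existing, rng.randint(d0, d1)):
            connect(x, v)
    return new
```

The attacker would record the internal edges and the degree of each new vertex, since those are the properties used to find the subgraph again after release.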

  11. Algorithm • Given the released graph, how do we find the subgraph that we created? • Build a search tree, pruning it based on the properties of our subgraph, such as the degrees of our new vertices.
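One way to picture the pruned search is a simple backtracking routine. This is a simplified sketch, not the paper's exact procedure: it assumes the attacker remembers the total degree of each new vertex and which pairs of new vertices were linked, and it tries every vertex of the released graph at each level of the tree. The names `find_planted`, `degrees`, and `internal` are illustrative.

```python
def find_planted(graph, degrees, internal):
    """Return every ordered tuple of vertices that matches the planted subgraph.

    graph:    dict vertex -> set of neighbours (the released graph)
    degrees:  degrees[i] is the total degree the attacker gave new vertex i
    internal: set of frozenset({i, j}) for each linked pair of new vertices
    """
    k = len(degrees)
    matches = []

    def extend(partial):
        i = len(partial)
        if i == k:
            matches.append(tuple(partial))
            return
        for v in graph:
            # Prune on degree: candidate i must have exactly the recorded degree.
            if v in partial or len(graph[v]) != degrees[i]:
                continue
            # Prune on structure: edges to earlier choices must agree
            # with the internal edges the attacker created.
            if all((v in graph[u]) == (frozenset((j, i)) in internal)
                   for j, u in enumerate(partial)):
                partial.append(v)
                extend(partial)
                partial.pop()

    extend([])
    return matches
```

If the planted subgraph is unique, `find_planted` returns a single tuple, and the neighbours of those vertices outside the tuple are the users the attacker linked to.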

  12. Are Mary and John connected? [Figure: graph with vertices John, Mike, Mary, Zoe, Tom]

  13. Are Mary and John connected? [Figure: the same graph with new vertices k1 to k5 added]

  14. Are Mary and John connected? [Figure: the same graph with new vertices k1 to k5 added]

  15. Are Mary and John connected? [Figure: the same graph with new vertices k1 to k5 added]

  16. Graph is released [Figure: the released graph with vertices renamed ZXCV, ASDF, WER, DFG, UYT, ASD, QWER, HGF, BNM, JKL]

  17. We identify our subgraph [Figure: k1 to k5 located among ZXCV, ASDF, QWER, BNM, JKL]

  18. Yes, they’re connected [Figure: John and Mary identified through k1 to k5, alongside ASDF, BNM, JKL]

  19. Analysis • The paper proves that the search tree does not grow too large, so the algorithm performs well. • It also proves that, with high probability, the subgraph is unique, so we don’t identify the wrong subgraph.

  20. Experimental attack • The authors simulate an attack on LiveJournal friendship links. They create the accounts on the website, make the connections, and then crawl the site and anonymize the data. • The network has 4.4 million nodes and 77 million edges.

  21. Results

  22. Cut-Based Attack • Needs only about sqrt(log(n)) new nodes to attack the graph. • However, although it uses fewer nodes, it is much more computationally intensive and less practical in the real world.
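To get a feel for the difference in size, the two expressions can be evaluated for a network as large as the LiveJournal graph mentioned earlier. This is back-of-the-envelope arithmetic only: it assumes base-2 logarithms and ignores constant factors, so the numbers show the scale of the gap rather than exact node counts.

```python
import math

def attack_sizes(n):
    """Return (2*log2(n), sqrt(log2(n))) for a graph with n vertices."""
    log_n = math.log2(n)  # assumption: base-2 logarithm
    return 2 * log_n, math.sqrt(log_n)

walk, cut = attack_sizes(4_400_000)
print(f"walk-based: about {walk:.0f} new nodes")  # about 44
print(f"cut-based:  about {cut:.1f} new nodes")   # about 4.7
```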

  23. Cut-based Results

  24. Passive Attack • It’s a lot like an active attack, except that you don’t create new nodes; instead, you collaborate with your friends and find yourselves in the graph. • However, because you did not specifically target certain people, you may not be able to identify other people when you find yourselves.

  25. Conclusion • We cannot rely on anonymization to ensure privacy in social networks. • Possible improvement: add noise to the data by adding or removing random edges.
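The noise idea could look something like the sketch below, which toggles randomly chosen vertex pairs before release: an existing edge is removed, a missing one is added. The function name `perturb` and the `flips` parameter are hypothetical, and the slides do not say how much noise would be enough to defeat the attacks.

```python
import random

def perturb(graph, flips, seed=None):
    """Return a noisy copy of `graph` with `flips` random pairs toggled.

    graph: dict vertex -> set of neighbours (undirected, left unmodified).
    """
    rng = random.Random(seed)
    noisy = {v: set(nbrs) for v, nbrs in graph.items()}
    vertices = list(noisy)
    for _ in range(flips):
        a, b = rng.sample(vertices, 2)   # two distinct vertices
        if b in noisy[a]:
            # The edge exists: remove it.
            noisy[a].discard(b)
            noisy[b].discard(a)
        else:
            # The edge is missing: add it.
            noisy[a].add(b)
            noisy[b].add(a)
    return noisy
```

More flips make a planted subgraph harder to recognize, but they also distort the structure researchers want to study, so any choice of `flips` is a trade-off.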
