Narayanan-Shmatikov attack (IEEE S&P 2009)


Presentation Transcript


  1. Narayanan-Shmatikov attack (IEEE S&P 2009) • A. Narayanan and V. Shmatikov, De-anonymizing Social Networks, IEEE S&P 2009. • Anonymized data: Twitter (crawled in late 2007) • A microblogging service • 224K users, 8.5M edges • Auxiliary data: Flickr (crawled in late 2007/early 2008) • A photo-sharing service • 3.3M users, 53M edges • Result: 30.8% of the users are successfully de-anonymized • [Figure: Twitter-to-Flickr user mapping, driven by the heuristics eccentricity, edge directionality, node degree, revisiting nodes, and reverse match]
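The user-mapping step of the attack keeps a candidate match only when it clearly stands out from the alternatives, which is what the eccentricity heuristic measures: the gap between the best and second-best candidate scores, in units of the standard deviation of all scores. A minimal Python sketch, assuming `scores` is a hypothetical list of similarity scores between one anonymized node and each auxiliary candidate (the scoring function itself is outside this sketch):

```python
import statistics

def eccentricity(scores):
    """Eccentricity heuristic (Narayanan-Shmatikov): how far the best
    candidate's score stands above the second best, measured in standard
    deviations of all candidate scores. A mapping is kept only if this
    value exceeds a threshold."""
    if len(scores) < 2:
        return 0.0
    best, second = sorted(scores, reverse=True)[:2]
    sigma = statistics.pstdev(scores)
    return (best - second) / sigma if sigma > 0 else 0.0

# One candidate clearly dominates -> high eccentricity (keep the mapping)
print(eccentricity([0.90, 0.20, 0.15, 0.10]))
# No clear winner -> low eccentricity (reject the mapping)
print(eccentricity([0.50, 0.48, 0.47, 0.45]))
```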

  2. Srivatsa-Hicks Attacks (ACM CCS 2012) • M. Srivatsa and M. Hicks, De-anonymizing Mobility Traces: Using Social Networks as a Side-Channel, ACM CCS 2012. • Anonymized data • Mobility traces: St Andrews, Smallblue, and Infocom 2006 • Auxiliary data • Social networks: Facebook and DBLP • Result: over 80% of the users can be successfully de-anonymized

  3. Motivation • Question 1: Why can structural data be de-anonymized? • Question 2: What are the conditions for successful data de-anonymization? • Question 3: What portion of users can be de-anonymized in a structural dataset? [1] P. Pedarsani and M. Grossglauser, On the Privacy of Anonymized Networks, KDD 2011. [2] L. Yartseva and M. Grossglauser, On the Performance of Percolation Graph Matching, COSN 2013. [3] N. Korula and S. Lattanzi, An Efficient Reconciliation Algorithm for Social Networks, VLDB 2014.

  4. Motivation • Question 1: Why can structural data be de-anonymized? • Question 2: What are the conditions for successful data de-anonymization? • Question 3: What portion of users can be de-anonymized in a structural dataset? • Our Contribution: address the above three open questions under a practical data model.

  5. Outline • Introduction and Motivation • System Model • De-anonymization Quantification • Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice • Implication 2: Secure Data Publishing • Conclusion

  6. System Model • Anonymized Data • Auxiliary Data • De-anonymization • Measurement

  7. System Model • Anonymized Data • Auxiliary Data • De-anonymization • Measurement • Quantification • Configuration Model: the conceptual underlying graph G can have an arbitrary degree sequence that follows any distribution
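Because the model only assumes a configuration-model graph, an underlying graph G for experiments can be generated from any degree sequence. A minimal sketch using networkx; the Pareto-distributed degrees and the graph size are illustrative assumptions, not values from the paper:

```python
import random
import networkx as nx

# Draw a heavy-tailed degree sequence, then build a configuration-model
# graph with exactly that sequence. The configuration model places no
# constraint on the distribution itself, which is what lets G have an
# arbitrary degree sequence.
n = 1000
degrees = [min(n - 1, int(random.paretovariate(2.5))) for _ in range(n)]
if sum(degrees) % 2:          # the degree sum must be even
    degrees[0] += 1

G = nx.configuration_model(degrees)          # multigraph with self-loops
G = nx.Graph(G)                              # collapse parallel edges
G.remove_edges_from(nx.selfloop_edges(G))    # drop self-loops
print(G.number_of_nodes(), G.number_of_edges())
```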

  8. Outline • Introduction and Motivation • System Model • De-anonymization Quantification • Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice • Implication 2: Secure Data Publishing • Conclusion

  9. De-anonymization Quantification • Perfect De-anonymization Quantification • Conditions: structural similarity condition; graph/data size condition

  10. De-anonymization Quantification • (1−ε)-Perfect De-anonymization Quantification • Conditions: graph/data size condition; structural similarity condition

  11. Outline • Introduction and Motivation • System Model • De-anonymization Quantification • Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice • Implication 2: Secure Data Publishing • Conclusion

  12. Evaluation • Datasets

  13. Evaluation • Perfect De-anonymization Condition • Evaluated: structural similarity condition; graph/data size condition

  14. Evaluation • (1−ε)-Perfect De-anonymization Condition • Evaluated: projection/sampling condition; structural similarity condition; graph/data size condition

  15. Evaluation • (1−ε)-Perfect De-anonymizability • Evaluated: structural similarity condition

  16. Evaluation • (1−ε)-Perfect De-anonymizability: how many users can be successfully de-anonymized • Evaluated: structural similarity condition

  17. Outline • Introduction and Motivation • System Model • De-anonymization Quantification • Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice • Implication 2: Secure Data Publishing • Conclusion

  18. Optimization based De-anonymization (ODA) • Our quantification implies • An optimum de-anonymization solution exists • However, it is difficult to find • Algorithm loop: (1) select candidate users from the unmapped users with top degrees; (2) map the candidate users by minimizing the edge error function

  19. Optimization based De-anonymization (ODA) • Our quantification implies • An optimum de-anonymization solution exists • However, it is difficult to find • ODA Features 1. Cold start (seed-free) 2. Can be used by other attacks for landmark (seed) identification 3. Optimization based • Space and time complexity bounds are derived in the paper (a simplified sketch of the ODA loop follows below)
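To make the loop above concrete, here is a simplified, seed-free Python sketch of the ODA idea: each round takes the unmapped top-degree users on both sides and pairs them so as to minimize an edge-error count. The candidate-set size k and this particular edge-error definition are assumptions for illustration, not the authors' implementation:

```python
import networkx as nx
from itertools import permutations

def edge_error(Ga, Gu, mapping, new_pairs):
    """Count edge disagreements a candidate pairing introduces: for each
    newly paired node, compare its adjacency in the anonymized graph Ga
    against the adjacency of its image in the auxiliary graph Gu, with
    respect to all other mapped nodes."""
    full = dict(mapping)
    full.update(dict(new_pairs))
    err = 0
    for a, u in new_pairs:
        for a2, u2 in full.items():
            if a2 != a and Ga.has_edge(a, a2) != Gu.has_edge(u, u2):
                err += 1
    return err

def oda_round(Ga, Gu, mapping, k=3):
    """One ODA-style round (simplified): pick the k unmapped top-degree
    nodes on each side and choose the pairing that minimizes edge error."""
    used = set(mapping.values())
    cand_a = sorted((v for v in Ga if v not in mapping),
                    key=Ga.degree, reverse=True)[:k]
    cand_u = sorted((v for v in Gu if v not in used),
                    key=Gu.degree, reverse=True)[:k]
    best = min((list(zip(cand_a, perm)) for perm in permutations(cand_u)),
               key=lambda pairs: edge_error(Ga, Gu, mapping, pairs))
    mapping.update(best)
    return mapping
```

Starting from an empty mapping, the first rounds effectively recover landmarks (feature 2 above), which seed-based attacks could then reuse.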

  20. ODA Evaluation • Dataset • Google+ (4.7M users, 90.8M edges): using random sampling to get anonymized graphs and auxiliary graphs • Gowalla: • Anonymized graphs: constructed from 6.4M check-ins <UserID, latitude, longitude, timestamp, location ID> generated by 0.2M users • Auxiliary graph: the Gowalla social graph of the same 0.2M users (1M edges) • Results: landmark identification

  21. ODA Evaluation • Dataset • Google+ (4.7M users, 90.8M edges): using random sampling to get anonymized graphs and auxiliary graphs • Gowalla: • Anonymized graphs: constructed from 6.4M check-ins <UserID, latitude, longitude, timestamp, location ID> generated by 0.2M users • Auxiliary graph: the Gowalla social graph of the same 0.2M users (1M edges) • Results: de-anonymization
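The slides only name the check-in construction, so the following is a hedged Python sketch of one plausible way to build such an anonymized contact graph: connect two users who check in at the same location within a time window. The one-hour window and the co-location rule are assumptions for illustration; the paper's exact construction may differ:

```python
from collections import defaultdict
from datetime import timedelta

import networkx as nx

def checkin_graph(checkins, window=timedelta(hours=1)):
    """Build a contact graph from Gowalla-style check-in tuples
    (user_id, lat, lon, timestamp, location_id): add an edge between two
    users who checked in at the same location within `window` of each
    other."""
    by_location = defaultdict(list)
    for user, _lat, _lon, ts, loc in checkins:
        by_location[loc].append((ts, user))

    G = nx.Graph()
    for visits in by_location.values():
        visits.sort()                       # chronological per location
        for i, (ti, ui) in enumerate(visits):
            for tj, uj in visits[i + 1:]:
                if tj - ti > window:        # later visits are further away
                    break
                if ui != uj:
                    G.add_edge(ui, uj)
    return G
```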

  22. Outline • Introduction and Motivation • System Model • De-anonymization Quantification • Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice • Implication 2: Secure Data Publishing • Conclusion

  23. Secure Structural Data Publishing • Structural information is important • Based on our quantification • Secure structural data publishing is difficult, at least theoretically • Open problem …

  24. Conclusion • We proposed the first quantification framework for structural data de-anonymization under a practical data model • We conducted a large-scale de-anonymizability evaluation of 26 real-world structural datasets • We designed a cold-start, optimization-based de-anonymization algorithm • Acknowledgement: we thank the anonymous reviewers very much for their valuable comments!

  25. Thank you! Shouling Ji sji@gatech.edu http://users.ece.gatech.edu/sji/
