Presented by Mihalis Hortis (746) - PowerPoint PPT Presentation

Presentation Transcript

  1. Papers presented: • Privacy-preserving network forensics, by Mikhail Afanasyev, Tadayoshi Kohno, Justin Ma, Nick Murphy, Stefan Savage, Alex C. Snoeren, and Geoffrey M. Voelker • A practical attack to de-anonymize social network users, by Gilbert Wondracek, Thorsten Holz, Engin Kirda, and Christopher Kruegel • Presented by Mihalis Hortis (746)

  2. 1st paper: Privacy-preserving network forensics • Forensics: the use of technical means to establish the presence of a person or an object at a crime scene after the fact • In the physical world: • DNA • Fingerprints • Writing samples • The issue: • There is no comparably robust forensic trail on the Internet • IP addresses are virtual and insecure • The challenge: • Balancing the need for attribution against user expectations of privacy

  3. Authors’ proposal • Network-layer capability called “Privacy-preserving forensic attribution”: • Packet-level cryptographic signature mechanism • Properly authorized parties (organizations) • Examination of logged packets from months prior • Unambiguous identification • Anybody can verify the validity of a packet • Prototype system named “Clue”

  4. Background and related work • The Internet architecture allows packets to be manipulated • IP address spoofing • IP addresses are not unique identifiers • IPs represent topological location • IP-to-host mappings change dynamically (DHCP, NAT, etc.) • Logs are kept only for a limited period • "The Internet provides criminals two of the most coveted qualities: anonymity and mobility", David Aucsmith • Little work has been done on network-level forensic attribution • Past attempts didn't focus on privacy concerns or long-term linkage

  5. Design goals • Basic requirements: • Physical names • Link to a physical object (e.g. a specific computer) • Per-packet granularity • Applied to every packet, via a low-level implementation • Unimpeachability • Evidentiary value in the courtroom • Indefinite lifetime • Ability to examine a packet long after the fact • Other necessary attributes: • Privacy • Packets are not identifiable to an unauthorized observer • Attributability • Any observer must be able to verify packet signatures

  6. Group signature (1/2) • Group members, group manager • Anyone can verify a signed message • Only the group manager can determine who sent it • Group manager: owns a secret key msk • Group members: each owns a signing key sk[i], i = 1…n, where n is the number of group members • One single public key pk

  7. Group signature (2/2) • How does it work? • The group manager creates msk, pk, and sk[i], i = 1…n • The i-th member signs message m with sk[i] and gets signature σ • The global verification key pk is used to verify that (m, σ) is valid • The group manager can identify i using msk • Application: • Group managers may be the computer manufacturers • Each group member (computer) has its own signing key
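The setup/sign/verify/open workflow above can be sketched as a toy model. Everything below is illustrative: the HMAC construction is a stand-in, since a real scheme such as BBS is asymmetric (holders of pk can verify but not forge), while in this toy anyone holding pk could forge the verifier part.

```python
import hashlib
import hmac
import os

# Toy sketch of the group-signature workflow (setup, sign, verify, open).
# The HMAC construction is a stand-in for illustration only: it mimics the
# interface, not the anonymity/unforgeability of a real scheme like BBS.

class GroupManager:
    def __init__(self, n):
        self.msk = os.urandom(32)                 # manager's secret key
        # One signing key sk[i] per member, derived from msk.
        self.sk = [hmac.new(self.msk, b"member-%d" % i, hashlib.sha256).digest()
                   for i in range(n)]
        # A single public key shared by the whole group.
        self.pk = hashlib.sha256(b"pk" + self.msk).digest()

    def open(self, message, sigma):
        """Only the manager can recover which member produced sigma."""
        _, tag = sigma
        for i, key in enumerate(self.sk):
            if hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest()):
                return i
        return None

def sign(sk_i, pk, message):
    """Member signs m with sk[i]; sigma = (verifier part, opener part)."""
    mac = hmac.new(pk, message, hashlib.sha256).digest()    # checkable with pk
    tag = hmac.new(sk_i, message, hashlib.sha256).digest()  # opaque to verifiers
    return (mac, tag)

def verify(pk, message, sigma):
    """Anyone holding pk can check that (m, sigma) is valid for the group."""
    mac, _ = sigma
    return hmac.compare_digest(mac, hmac.new(pk, message, hashlib.sha256).digest())
```

Note how the two parts of sigma separate the roles: the mac part supports public verification, while the tag part is opaque to verifiers and only the manager, who knows every sk[i], can match it to a member.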

  8. Design challenges • Replay • Possibility of DoS attacks • Packets must be bound to a point in time • Implemented with a timestamp • Revocation • Used to handle secret-key compromise • Global update of the public key • Local update of unrevoked secret keys • Middlebox modification • Middleboxes (e.g. NATs) apply changes to packets • Such packets can no longer be verified • Solutions: • Exclude the source address from verification • IP-in-IP tunneling

  9. Basics of the Clue mechanism (1/2) • Straightforward packet transformation • Signing process: • Collects the nonvolatile elements of an IP packet • Adds an 8-byte local NTP-derived timestamp to implement replay detection • Feeds the resulting data as input to the group-signature library to generate a signature • Appends the signature (and additional optimization information) to the original packet, adjusting the IP length field accordingly

  10. Basics of the Clue mechanism (2/2) • Verification process: • Validates a packet's freshness from its timestamp • Collects the nonvolatile elements of the IP packet • Strips the Clue trailer from the end of the packet • Feeds the resulting data and signature to the group-signature library • Pushes the original packet to one of two output ports, depending on whether verification succeeded
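The signing and verification steps above can be sketched as follows. This is a minimal model, not the actual Clue code: gs_sign/gs_verify stand in for the group-signature library (a keyed hash, not real BBS), packets are modeled as dicts rather than raw IP headers, and the freshness window is an assumed value.

```python
import hashlib
import struct

# Minimal sketch of the Clue sign/verify packet transformation.
FRESHNESS_WINDOW = 5.0  # seconds a timestamp stays fresh (assumed value)

def gs_sign(key, data):             # placeholder for the group-signature library
    return hashlib.sha256(key + data).digest()

def gs_verify(key, data, sig):
    return sig == gs_sign(key, data)

def nonvolatile_bytes(pkt):
    # Collect fields middleboxes don't rewrite; the source address is
    # excluded, as the paper suggests, so NAT rewriting can't break this.
    return pkt["dst"].encode() + pkt["payload"]

def clue_sign(pkt, key, now):
    ts = struct.pack("!d", now)                     # 8-byte timestamp
    sig = gs_sign(key, nonvolatile_bytes(pkt) + ts)
    signed = dict(pkt)
    signed["trailer"] = ts + sig                    # appended Clue trailer
    return signed

def clue_verify(pkt, key, now):
    ts, sig = pkt["trailer"][:8], pkt["trailer"][8:]
    (t,) = struct.unpack("!d", ts)
    if abs(now - t) > FRESHNESS_WINDOW:             # replay detection
        return False
    return gs_verify(key, nonvolatile_bytes(pkt) + ts, sig)
```

Signing the timestamp together with the nonvolatile fields is what binds the packet to a point in time, addressing the replay challenge from slide 8.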

  11. BBS short-group signature scheme (1/2): Signing procedure

  12. BBS short-group signature scheme (2/2): Verification procedure

  13. Optimizations (1/2) • Signing process: • Precomputation • The computation of the Tj and Rj values dominates signing overhead • Client CPU load is dominated by idle time • So these values are precomputed during idle CPU time • Verification process: • Windowed verification • Deriving the Rj' values creates significant overhead • The receiver can verify either a single packet or a window of packets • The window size k adapts to the sender's congestion window • Asynchronous verification • Sending the ACK before packet verification overlaps the Verify computation with the round trip

  14. Optimizations (2/2) • Incremental verification • Reduces the time to reject a non-verifiable packet • σ = (T1, T2, T3, c, sα, sβ, sx, sδ1, sδ2) • c'' = H(m, T1, T2, T3, R1, R2, R3, R4, R5) • If c'' ≠ c, reject the packet • Otherwise, derive the Rj' from pk and T1, T2, T3, c, sα, sβ, sx, sδ1, sδ2 • Immediately reject if any Rj' ≠ Rj • Finally, accept the packet as valid • Other optimizations • Random verification of packets (trades security for performance) • The authors don't examine such optimizations, due to the increased risk of non-attributability
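The cheap-check-first ordering behind incremental verification can be sketched as below. This is a structural toy, not real BBS: derive_R stands in for the costly pairing-based derivation of the R'j values (and, unlike the real scheme, omits c from that derivation so the toy stays self-consistent).

```python
import hashlib

# Sketch of incremental verification: run checks in order of increasing
# cost and bail out at the first mismatch.

def hash_challenge(m, T, R):
    h = hashlib.sha256()
    for part in (m, *T, *R):
        h.update(part)
    return h.digest()  # plays the role of c'' = H(m, T1..T3, R1..R5)

def derive_R(pk, T, s):
    # Stand-in for the costly derivation of the R'_j values.
    h = hashlib.sha256(pk)
    for part in (*T, *s):
        h.update(part)
    seed = h.digest()
    return [hashlib.sha256(seed + bytes([j])).digest() for j in range(5)]

def make_sigma(pk, m, T, s):
    """Build a consistent toy signature sigma = (T, c, s, R)."""
    R = derive_R(pk, T, s)
    return {"T": T, "c": hash_challenge(m, T, R), "s": s, "R": R}

def incremental_verify(pk, m, sigma):
    T, c, s, R = sigma["T"], sigma["c"], sigma["s"], sigma["R"]
    if hash_challenge(m, T, R) != c:        # cheap hash check: reject early
        return False
    for Rj_prime, Rj in zip(derive_R(pk, T, s), R):
        if Rj_prime != Rj:                  # reject at the first mismatch
            return False
    return True                             # accept the packet as valid
```

The early exits are the point: a flooded receiver spends one hash per bogus packet instead of the full derivation, which is what yields the reported reduction in rejection overhead.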

  15. Evaluation (1/2) • 100 iterations • Packets of 1,277 bytes • Result: total per-packet overhead of 10-30 ms • Incremental verification achieves a 70% reduction in overhead

  16. Evaluation (2/2) • For a typical RTT (round-trip time) of 80 ms, the combination of precomputation, asynchronous verification, and an adaptive window achieves a throughput within a factor of 1.2 of user-level Click performance.

  17. Conclusion • The Internet offers no robust forensic trail • How can forensics be applied to a network? • An effort to reconcile attributability with privacy concerns • Development of the Clue prototype using group signatures • Performance optimizations to meet the needs of everyday Internet use

  18. 2nd paper: A practical attack to de-anonymize social network users • Why social networks? • Social networks are critical with respect to security and privacy • Very large user base • Users provide personal information • Authors' contribution: • A novel de-anonymization attack on social network users • Examination of group membership data • Group membership information serves as a fingerprint • Demonstration of techniques for deploying such attacks in the real world • Evaluation of the attack on Xing (more than 8 million users), Facebook (more than 300 million users), and LinkedIn (more than 50 million users) • Brief examination of 5 other social networks

  19. Background: Model and definitions • Social networks • Each user u is a member of n groups • The membership vector Γ(u) has Γg(u) = 1 if u is a member of group g, and 0 otherwise • This vector can be used to de-anonymize users; it serves as a fingerprint • Browser history • Browser history βu of a user u • Each time a page p is visited, its URL φp is added to βu • The φp is removed from βu after a time interval that depends on the browser • Attacker model • The attacker can determine whether a URL φp is in βu • The attacker has a way to learn the members of a group • The more groups the attacker knows about, the more effective the attack
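The fingerprint vector Γ(u) defined above can be sketched directly. The users and groups below are made-up example data, not from the paper:

```python
# Sketch of the group-fingerprint model: Γ_g(u) = 1 iff user u is a
# member of group g.

def fingerprint(user, groups, members):
    """Membership vector (Γ_g(user) for each group g)."""
    return tuple(1 if user in members[g] else 0 for g in groups)

groups = ["g1", "g2", "g3"]
members = {"g1": {"alice", "bob"},
           "g2": {"alice"},
           "g3": {"bob", "carol"}}
# fingerprint("alice", groups, members) == (1, 1, 0), which no other user
# in this example shares, so the fingerprint identifies alice uniquely.
```
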

  20. Background: Structure of social networking sites • Overview • Each user u has a profile pu • Public groups: • Anyone can join • Some networks allow non-members to list the group members • Closed groups: • Authorization is required to join • A moderator must approve the membership request • Web applications • Based on HTTP GET parameters • All visited links are added to the user's browser history • It doesn't matter if HTTPS is used; the information (user ID, group ID, etc.) is still there!

  21. Background: History stealing • Various ways for an attacker to probe whether a page is in the browsing history: • Using CSS and image references • Using client-side scripting (e.g. JavaScript) • Other ways discussed in a previous lecture (21/3 by Silvia) • Either way, the attacker has to probe each URL individually

  22. Basic attack • The attacker finds a pattern of suitable dynamic URLs (gathering info from the social network) • The links should be easy to predict (e.g. numerical IDs) • The attacker probes each URL and checks whether it is in the victim's browsing history • Cons: • The attacker has to generate millions of links • The victim's browser must stay on the attack page long enough to probe all the links • This attack is still valuable when the set of possible candidates is small

  23. Improved attack • Based on the fact that many social network users are members of groups • 5 steps: • The attacker obtains group membership info from the social network • The attacker uses history stealing to learn a partial group fingerprint • The attack can then proceed in 2 ways: • Slow and robust approach: check the union of all group members • Fast and fragile approach: check the intersection of all group members • Finally, the basic attack is used to pinpoint the user
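The two candidate-set strategies above can be sketched as set operations over the groups whose URLs history stealing found in the victim's history ("hit groups"). The membership data below is made-up example data:

```python
# Sketch of the fast (intersection) vs. robust (union) candidate sets.

def candidates_fast(hit_groups, members):
    """Fast, fragile: the victim should be in EVERY hit group (intersection)."""
    sets = [members[g] for g in hit_groups]
    return set.intersection(*sets) if sets else set()

def candidates_robust(hit_groups, members):
    """Slow, robust: the victim is in at least one hit group (union)."""
    return set().union(*(members[g] for g in hit_groups))

members = {"g1": {"alice", "bob", "carol"},
           "g2": {"alice", "bob"},
           "g3": {"alice", "dave"}}
hits = ["g1", "g2", "g3"]
# candidates_fast(hits, members) == {"alice"}: the basic attack then only
# has to probe the URLs of this one-user candidate set.
```

The trade-off in the slides falls out of the set algebra: a single false-positive hit group makes the intersection exclude the victim, while the union always contains the victim but can be far larger.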

  24. Improved attack: Robustness • The group membership information might not be entirely accurate • The browsing history may contain incomplete info • The group membership information degrades over time • 2 cases: • False negative • Not a problem, as long as the attacker finds at least one group • False positive • No problem for the first, slow attack • For the second attack, the intersection of all group members would exclude the victim • An attacker would first deploy the fast attack and, if it fails, fall back to the slower but more robust attack

  25. Obtaining group info • Group directory: • Some social networks provide a group directory listing • The attacker can use crawling techniques to get the info he needs (group IDs) • In several cases, the group directory can be viewed by non-members of the network • Directory reconstruction: • Some networks don't publish group directories • Solutions: • Guess group IDs, if they follow a simple pattern, and verify their existence • Use the group search function of the social network • Crawl the public member profiles (a costly technique…)
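The ID-guessing form of directory reconstruction can be sketched as follows. The URL template and the existence probe are hypothetical; a real attacker would issue HTTP requests, whereas the fake probe below keeps the sketch offline-testable:

```python
# Sketch of directory reconstruction by guessing sequential numeric
# group IDs and keeping those whose pages exist.

def reconstruct_directory(template, id_range, exists):
    """Generate candidate group URLs and keep the IDs whose pages exist."""
    return [gid for gid in id_range if exists(template.format(gid=gid))]

# Fake probe for the example: pretend only even-numbered group IDs exist.
def fake_probe(url):
    return int(url.rsplit("=", 1)[1]) % 2 == 0

found = reconstruct_directory("https://social.example/groups?id={gid}",
                              range(10), fake_probe)
# found == [0, 2, 4, 6, 8]
```

This is the same mechanism behind the LinkedIn experiment on slide 30, where IDs from 0 to 3,000,000 made 3 million candidate links feasible to enumerate.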

  26. Obtaining group member info • Public groups • Standard crawling, if the network provides a full listing of all public group members • If there is a list limit (e.g. Facebook only lists 6,000 members), search for common first or last names within groups • Private groups • The attacker sends a request to join the group • Then the attacker crawls the membership info and leaves the group • Member crawling can be performed on the fly, when the attack takes place, if adequate resources are available

  27. Crawling experiments (1/4) • Overview • In-depth analysis of the Xing platform • Feasibility studies on Facebook and LinkedIn • Crawling approaches • Custom crawler • Manual registration with the social networks by the authors • The crawler is able to log in using the provided member credentials • No restrictions are applied to crawling group directories • Commercial crawling services • Accept a list of web pages and regular expressions • Very cost-effective ($0.25 per 1 million crawled URLs)

  28. Crawling experiments (2/4): Xing • Custom crawling yielded a total of 6,574 groups with 1.8 million unique users • Closed groups • Applications for membership in 1,306 groups • Accepted by 108 groups (8.2%) • Discovery of 404,331 group members • 329,052 (81%) of these users were already covered by crawling public groups • A malicious user could launch automated social-engineering attacks (fake photos, fake personal info, etc.) to become a member of more closed groups

  29. Crawling experiments (3/4): Facebook • Facebook publishes a full group directory • Using commercial crawling, 7.1 GB of HTML data was gathered, containing 39,156,580 groups • Facebook only lists the first 6,000 members of a group • The team used the member search functionality to enumerate members of larger groups by searching for common first or last names • Results • 43.2 million group members • 31,853 groups • A malicious user could use a botnet, or crawl for a longer period, to get more results

  30. Crawling experiments (4/4): LinkedIn • Easy-to-predict group IDs (numbers from 0 to 3,000,000) • 1st scenario • Generate 3 million hyperlinks and crawl them to identify which exist • LinkedIn publishes its public member directory • Public profiles contain membership status and group IDs • 2nd scenario • Commercial crawling of all the public profiles • The overall cost is estimated at about $88 for crawling all 40 million group members

  31. Evaluation (1/4): Xing • Greedy search and information gain for: • All groups • Groups with fewer than 50,000 members • Groups with fewer than 20,000 members

  32. Evaluation (2/4): Xing • Using set intersection (fast method) • For 42.06% of users, the group fingerprint was unique • For 1,000,000 users, the candidate set is narrowed to fewer than 32 users • For 90% of users, the candidate set contains fewer than 2,912 users

  33. Evaluation (3/4): Xing • Using set union (slow method) • For more than 72% of users, the candidate set is narrowed to fewer than 50,000 users

  34. Evaluation (4/4): Facebook • Using set intersection (fast method) • Of 43.2 million users, 9.9 million have a unique group fingerprint • For 25.2 million users, the candidate set is smaller than 1,000 users

  35. Real-world experiments • Implementation on Xing • Development of a website to perform the attack • Run only on volunteers from personal contacts • Steps: • The browsing history is probed for the URL of the Xing homepage • Probe for the URLs of the Xing groups • Analysis of the obtained group fingerprint • Presentation of the result to the user • Results: • Attack on 26 volunteers • For 11 of 26, there were no group links in their browsing history • For the remaining 15, a group fingerprint was found • For 11 of 15, the fast approach succeeded; the median size of the candidate set was 570 members • For 4 of 15, the slow approach was needed; the median size was 30,013 members

  36. Run-time evaluation

  37. Group fluctuation

  38. Defense techniques • Server-side defense techniques • Include a random token in the HTTP GET parameters of every hyperlink • Use HTTP POST instead of HTTP GET • However, the usability of web applications may suffer (e.g. bookmarking) • Client-side defense techniques • Turn off JavaScript, or use browser add-ons (e.g. NoScript) • Disable the browsing history (incognito mode) • Unfortunately: • The usability of browsers and applications is reduced • An effort on behalf of the user is required
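The random-token server-side defense above can be sketched as follows. The URL layout and the token derivation are illustrative assumptions, not taken from the paper:

```python
import hashlib
import hmac

# Sketch of the server-side defense: derive a per-session token and embed
# it in every hyperlink, so group URLs differ between sessions and an
# attacker cannot precompute the URLs to probe for in the browsing history.

def session_token(session_secret, group_id):
    return hmac.new(session_secret, str(group_id).encode(),
                    hashlib.sha256).hexdigest()[:16]

def group_link(session_secret, group_id):
    return "/group/%d?tok=%s" % (group_id, session_token(session_secret, group_id))

# The same group yields different, unpredictable URLs in different sessions,
# so a history probe built from one session misses the URLs of another.
```

The usability cost noted in the slide shows up directly here: a bookmarked link carries a stale token, so the server must either reject it or redirect, which is why the paper flags bookmarking as a casualty of this defense.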

  39. Conclusion • Users of social networks can be de-anonymized • Introduction of such a de-anonymization attack using group information • Group membership information is usually not regarded as data that can fuel such an attack • The attack requires low effort • The attack has the potential to affect millions of registered social network users • WATCH YOUR BACK!!!