
Sequence-Aware Privacy Preserving Data-Leak Detection

Xiaokui Shu, 11/29/2011


Presentation Transcript


  1. Sequence-Aware Privacy Preserving Data-Leak Detection. Xiaokui Shu, 11/29/2011

  2. Content • Applications of Privacy Preserving Data-Leak Detection (PDLD) • Challenges and our schema • Sequence-aware PDLD (SPDLD) • Implementation & evaluation

  3. Application :: Outsourced Security Service • Service Provider: professional solution; value-added service provider (VASP); semi-honest (honest, but curious) • Customer: zero knowledge required; better business concentration [Diagram labels: Internet, Customer’s network, Guest, Outsourced Security Service]

  4. Application :: Introspective Security Service • Sensitive Data Owner: has knowledge of all sensitive data; distributes sensitive data fingerprints to endpoints • DLD Endpoint: inside or outside the intranet; monitored to be data-leak-free [Diagram labels: Intranet, Endpoint, Sensitive Data Owner, VPN Endpoint, Normal Endpoint, Internet]

  5. Challenges • Accuracy: decrease both false positives and false negatives • Privacy: minimize the DLD executor’s knowledge of the sensitive data • Efficiency: real-time processing of the traffic on PCs as well as at network gateways • Robustness: the ability to handle modified leaked data, or variants

  6. Challenges

  7. SPDLD Schema • Robustness: extract local features to represent the sensitive data • Accuracy: take into account features of the sensitive data as well as the relationships among features • Privacy: hash/fingerprint values, samples • Efficiency: sample both sensitive data and network traffic to improve performance

  8. SPDLD :: Whole View [Diagram labels: Data Owner, DLD Executor, Sensitive Data, Network Traffic, Fingerprint Tape I, Fingerprint Tape II, Alignment]

  9. SPDLD :: Basic Alignment w/o Sampling … Alignment Result …

  10. SPDLD :: Flow Sampling Requirement • No matter where we start, we should always have the same sample for an identical segment. • ABCDEFGHIJKLMNOPQ → …FJM… • CDEFGHIJKLMNOPQRS → …FJM…

  11. SPDLD :: Punching FingerprintTape [Diagram labels: FP Flow, Fingerprint, Sliding window, Minimum fingerprint in the window, FPTape, Punched fingerprint in FingerprintTape, Quasi-gap encoded in FingerprintTape]
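
The labels on this slide (sliding window, minimum fingerprint in the window, quasi-gap) suggest a winnowing-style sampler. The sketch below is one possible reading, not the author's implementation: the names `punch` and `to_tape`, the leftmost-minimum tie-break, and the `(gap, fingerprint)` tape layout are all assumptions made for illustration, and the "number of minima" parameter is not modeled.

```python
def punch(flow, window=4):
    """Return the sorted indices of fingerprints punched from `flow`.

    Every run of `window` consecutive fingerprints contributes the index of
    its minimum value (leftmost on ties). Hypothetical helper.
    """
    punched = set()
    for start in range(len(flow) - window + 1):
        win = flow[start:start + window]
        punched.add(start + win.index(min(win)))
    return sorted(punched)


def to_tape(flow, window=4):
    """Encode the punched fingerprints as (gap, fingerprint) pairs.

    `gap` is the number of unpunched fingerprints skipped since the previous
    punch. This stands in for the quasi-gap and is an assumed encoding.
    """
    tape, prev = [], -1
    for i in punch(flow, window):
        tape.append((i - prev - 1, flow[i]))
        prev = i
    return tape


flow = [5, 3, 8, 1, 9, 7, 2, 6, 4]
shifted = [11, 10] + flow          # same segment, different starting point
print(to_tape(flow, 3))
print([shifted[i] for i in punch(shifted, 3)])
```

Because each window's choice depends only on the values inside it, every fingerprint punched from `flow` is also punched from `shifted`, which is the property the flow sampling requirement asks for.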

  12. SPDLD :: Advanced FingerprintTape • Quasi-gap encoding/decoding • Start flags bound for each FingerprintTape • Start position recorded

  13. SPDLD :: Alignment • Needleman–Wunsch algorithm • Dynamic programming • Gap penalty • Unit comparison function replaced to expand quasi-gaps • Implementation optimized for Python using a 1D array and multiple iterators
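
As a rough illustration of the bullets above, here is a minimal sketch of Needleman–Wunsch global alignment scoring that keeps a single 1D row instead of the full matrix. It is not the author's code: the function name `nw_score` is hypothetical, the default scores are taken from the parameters slide (match 12, mismatch -1, gap -4), and the replaced unit comparison function that expands quasi-gaps is not modeled; plain equality is used instead.

```python
def nw_score(a, b, match=12, mismatch=-1, gap=-4):
    """Global alignment score of sequences `a` and `b` (Needleman-Wunsch).

    Only one row of the dynamic-programming table is kept, so memory use is
    proportional to len(b) rather than len(a) * len(b).
    """
    # row[j] holds the best score for aligning a[:i] with b[:j].
    row = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        diag = row[0]              # score of cell (i-1, 0)
        row[0] = i * gap           # a[:i] aligned against nothing
        for j, y in enumerate(b, start=1):
            up = row[j]            # score of cell (i-1, j), not yet overwritten
            step = match if x == y else mismatch
            row[j] = max(diag + step,       # align x with y
                         up + gap,          # x aligned with a gap
                         row[j - 1] + gap)  # y aligned with a gap
            diag = up
    return row[-1]


print(nw_score([1, 2, 3], [1, 2, 3]))
print(nw_score([1, 2, 3], [1, 3]))
```

This version returns only the score. Recovering the alignment itself would need a traceback, which a single row does not retain.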

  14. Implementation & Evaluation • Implementation environment: Python 2.7 • Sensitive data: one paragraph from the source of the TCP/IP Wikipedia page • Leaked network traffic: whole source of the TCP/IP Wikipedia page • MediaWiki & WordPress

  15. Implementation & Evaluation • Parameters of the system: 3-byte shingles; 64-bit Rabin’s fingerprint; window size: 100; number of minima: 5 • Unit score in alignment: match 12, mismatch -1, gap -4

  16. Implementation & Evaluation :: Speed • My optimization of the Needleman–Wunsch algorithm runs 2.5 times as fast as the naive (my previous) implementation • Comparison of set intersection, basic alignment, and FingerprintTape
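
For context on the set intersection baseline named in the comparison, a minimal sketch of what such a detector might compute is given here. The function name `set_overlap` and the choice of score (fraction of distinct sensitive fingerprints seen in traffic) are assumptions, not details from the slides.

```python
def set_overlap(sensitive, traffic):
    """Fraction of distinct sensitive fingerprints that also occur in traffic."""
    wanted = set(sensitive)
    if not wanted:
        return 0.0
    return len(wanted & set(traffic)) / len(wanted)


sensitive = [1, 2, 3, 4]
print(set_overlap(sensitive, [1, 2, 3, 4]))   # same order
print(set_overlap(sensitive, [4, 3, 2, 1]))   # reversed order, same score
```

Both calls give the same score because a set discards order, which is the gap a sequence-aware method such as alignment is meant to close.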

  17. Implementation & Evaluation :: Accuracy

  18. Implementation & Evaluation :: Accuracy

  19. Thank you!

  20. Background :: Shingling & Fingerprinting [Diagram labels: shingling, hashing]

  21. Background :: Automaton-based RE Matching • Evolution of pattern matching in NIDS [Diagram labels: Boyer–Moore, Regular Expression Support, Aho–Corasick, Multi-pattern search, NFA, DFA, Automata, D2FA, CD2FA]

  22. Background :: List Alignment • Needleman-Wunsch • Dialign

  23. SPDLD :: Shingling & Fingerprinting • Sensitive data: "SENSITIVE INFO" • Shingling produces the shingles: "SENSITIV", "ENSITIVE", "NSITIVE ", "SITIVE I", "ITIVE IN", "TIVE INF", "IVE INFO" • Fingerprinting produces the fingerprints: 658955, 452785, 123587, 754812, 458763, 885621, 645853
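
The shingling and fingerprinting steps on this slide can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the 8-byte shingle size is read off the slide's example (the parameters slide lists 3-byte shingles), and BLAKE2b truncated to 64 bits is a stand-in for Rabin's fingerprint, so the values will not match the numbers shown on the slide.

```python
import hashlib


def shingles(data: bytes, k: int = 8):
    """Return every k-byte substring of `data`, in order of appearance."""
    return [data[i:i + k] for i in range(len(data) - k + 1)]


def fingerprint(shingle: bytes) -> int:
    """64-bit fingerprint of one shingle (BLAKE2b stand-in, not Rabin's scheme)."""
    digest = hashlib.blake2b(shingle, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def fingerprint_list(data: bytes, k: int = 8):
    """Ordered list of fingerprints, one per shingle."""
    return [fingerprint(s) for s in shingles(data, k)]


for s in shingles(b"SENSITIVE INFO"):
    print(repr(s.decode()), fingerprint(s))
```

Because each fingerprint depends only on its own shingle, the same text embedded anywhere in a longer stream yields the same run of fingerprints, which is what makes the later sampling and alignment steps possible.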

  24. Sequence-Aware PP-DLD [Diagram labels: set, list, flow]
