
PRIVACY CRITERIA


Presentation Transcript


  1. PRIVACY CRITERIA

  2. Roadmap • Privacy in Data Mining • Mobile privacy • (k,e)-anonymity • (c,k)-safety • Privacy skyline

  3. Privacy in data mining • Random Perturbation (quantitative data) • Given a value x, return x + r, where r is a random value drawn from a known distribution • Construct a decision-tree classifier on the perturbed data such that its accuracy is comparable to classifiers built on the original data • Randomized Response (categorical data) • Basic idea: disguise the data by probabilistically changing the value of the sensitive attribute to another value • The distribution of the original data can be reconstructed from the disguised data (see the sketch below)
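A minimal sketch of both schemes in Python; the uniform noise spread, the toy domain list, and the retention probability p are illustrative assumptions, not taken from the slides:

```python
# Sketch of the two perturbation schemes above. The uniform noise spread
# and the retention probability p are illustrative choices.
import random

def random_perturbation(x, spread=10.0):
    """Quantitative data: return x + r, with r drawn from a known distribution."""
    return x + random.uniform(-spread, spread)

def randomized_response(value, domain, p=0.7):
    """Categorical data: keep the true value with probability p,
    otherwise report a random other value from the domain."""
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])

ages = [23, 35, 47, 52]
print([round(random_perturbation(a)) for a in ages])
print(randomized_response("flu", ["flu", "cancer", "diabetes"]))
```

Since the noise distribution and p are public, their effect can be statistically inverted, which is how the original distribution is reconstructed from the disguised data.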

  4. Roadmap • Privacy in Data Mining • Mobile privacy • (k,e)-anonymity • (c,k)-safety • Privacy skyline

  5. Mobile privacy • Spatial cloaking: cloaked region • contains location q and at least k-1 other user locations • Circular region around location q • contains location q and a number of dummy locations generated by the client • Transformation-based matching • transform locations through Hilbert curves, matching on Hilbert keys • Casper: user registers with a (k, Amin) profile • k: the user is k-anonymous • Amin: minimum acceptable resolution (area) of the cloaked spatial region
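A minimal sketch of k-anonymous spatial cloaking under a grow-a-square strategy with a flat list of user coordinates (both assumptions for illustration; the published Casper system maintains a grid-pyramid index instead of scanning users):

```python
# Grow a square around the querying location q until it covers at least
# k-1 other users and meets the minimum area Amin from the user's profile.
# Step size and the user list are illustrative assumptions.

def cloak(q, others, k, a_min=0.0, step=1.0):
    """Return a square (x1, y1, x2, y2) containing q, at least k-1 other
    users, and with area >= a_min."""
    half = step
    while True:
        region = (q[0] - half, q[1] - half, q[0] + half, q[1] + half)
        inside = [p for p in others
                  if region[0] <= p[0] <= region[2]
                  and region[1] <= p[1] <= region[3]]
        if len(inside) >= k - 1 and (2 * half) ** 2 >= a_min:
            return region
        if len(inside) == len(others) and len(inside) < k - 1:
            raise ValueError("not enough users for k-anonymity")
        half += step

users = [(1, 2), (3, 1), (8, 9), (2, 4), (9, 1)]
print(cloak((2, 2), users, k=4, a_min=4.0))   # -> (0.0, 0.0, 4.0, 4.0)
```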

  6. Roadmap • Privacy in Data Mining • Mobile privacy • (k,e)-anonymity • (c,k)-safety • Privacy skyline

  7. (k,e)-anonymity • Privacy protection for numerical sensitive attributes • GOAL: group sensitive attribute values such that • each group contains no fewer than k distinct values • the range of each group is larger than a threshold e • Permutation-based technique to support aggregate queries • Construct a help table — Aggregate Query Answering on Anonymized Tables @ ICDE 2007
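A hedged sketch of the grouping step: one greedy pass over sorted values that closes a group once it holds at least k distinct values and a range of at least e. The cited paper computes an optimal partitioning; this pass only illustrates the two constraints:

```python
# Greedy (k,e) grouping over sorted sensitive values; a simplification
# of the paper's optimal partitioning algorithm.

def ke_groups(values, k, e):
    groups, current = [], []
    for v in sorted(values):
        current.append(v)
        if len(set(current)) >= k and current[-1] - current[0] >= e:
            groups.append(current)
            current = []
    if current:
        if not groups:
            raise ValueError("cannot satisfy (k,e) on this data")
        groups[-1].extend(current)   # leftovers keep k and e satisfied
    return groups

salaries = [30, 32, 35, 50, 55, 58, 90, 95, 100]
print(ke_groups(salaries, k=3, e=10))
# -> [[30, 32, 35, 50], [55, 58, 90, 95, 100]]
```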

  8. (k,e)-anonymity [Figure: Original Table vs. Table after Permutation]

  9. (k,e)-anonymity [Figure: Table after Permutation vs. Help Table]

  10. Roadmap • Privacy in Data Mining • Mobile privacy • (k,e)-anonymity • (c,k)-safety • Privacy skyline

  11. (c,k)-safety • Goal: • quantify the background knowledge k of the attacker • ensure the maximum disclosure w.r.t. k is less than a threshold c • Express background knowledge through a formal language — Worst-Case Background Knowledge for Privacy-Preserving Data Publishing @ ICDE 2007

  12. (c,k)-safety • Create buckets and randomly permute the sensitive attribute values within each bucket (a minimal sketch follows) [Figure: Original Table vs. Bucketized Table]
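The bucketize-and-permute step sketched in Python; the bucket size and toy rows are illustrative assumptions, and published data would additionally drop direct identifiers:

```python
# Split rows into fixed-size buckets and randomly shuffle the sensitive
# attribute within each bucket, so quasi-identifiers can no longer be
# linked to an exact sensitive value.
import random

def bucketize(rows, sensitive_index, bucket_size):
    out = []
    for i in range(0, len(rows), bucket_size):
        bucket = [list(r) for r in rows[i:i + bucket_size]]
        sens = [r[sensitive_index] for r in bucket]
        random.shuffle(sens)                  # permute within the bucket
        for row, s in zip(bucket, sens):
            row[sensitive_index] = s
        out.extend(bucket)
    return out

table = [("Jack", 30, "flu"), ("Ann", 28, "cancer"),
         ("Bob", 45, "flu"), ("Cary", 52, "AIDS")]
for row in bucketize(table, sensitive_index=2, bucket_size=2):
    print(row)
```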

  13. (c,k)-safety • Bound the background knowledge, i.e., the attacker knows k basic implications • Atom: t_p[S] = s, where s ∈ S and p ∈ Person • e.g. t_Jack[Disease] = flu • Basic implication: (A_1 ∧ … ∧ A_m) → (B_1 ∨ … ∨ B_n) for some m, n and atoms A_i, B_i • e.g. t_Jack[Disease] = flu → t_Charlie[Disease] = flu • L^k_basic is the language consisting of conjunctions of k basic implications

  14. (c,k)-safety • Find a bucketization B of the original table s.t. • B is (c,k)-safe, i.e. • the maximum disclosure of B w.r.t. L^k_basic is less than the threshold c (a brute-force sketch follows)
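A brute-force illustration of disclosure within one bucket: enumerate the assignments of sensitive values to persons that are consistent with the adversary's facts and measure how often the target receives the sensitive value. Representing facts as positive (person, value) atoms is a simplification of L^k_basic, which also allows negations and implications:

```python
# Pr(target has sensitive | facts), computed by enumerating all
# assignments of the bucket's sensitive values consistent with the facts.
from itertools import permutations

def disclosure(persons, values, target, sensitive, facts):
    consistent = hits = 0
    for perm in set(permutations(values)):
        assign = dict(zip(persons, perm))
        if all(assign[p] == v for p, v in facts):
            consistent += 1
            hits += assign[target] == sensitive
    if consistent == 0:
        raise ValueError("facts contradict the bucket")
    return hits / consistent

persons = ["Jack", "Bob", "Cary", "Frank"]
values = ["flu", "flu", "cancer", "AIDS"]
print(disclosure(persons, values, "Jack", "AIDS", facts=[]))        # 0.25
print(disclosure(persons, values, "Jack", "AIDS",
                 facts=[("Bob", "flu"), ("Cary", "flu")]))          # 0.5
```

Note how two pieces of background knowledge double the disclosure; the bucketization is (c,k)-safe only if this maximum stays below c for every target and every admissible set of k facts.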

  15. Roadmap • Privacy in Data Mining • Mobile privacy • (k,e)-anonymity • (c,k)-safety • Privacy skyline

  16. Privacy skyline • The original data is transformed into generalized or bucketized data • Quantify external knowledge through a skyline for each sensitive value • External knowledge about each individual • having a single sensitive value • having multiple sensitive values — Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge @ VLDB 2007

  17. Privacy skyline • Three types of knowledge (l, k, m), e.g. (2, 3, 1) • l: knowledge about the target individual t • e.g. flu ∉ Tom[S] and cancer ∉ Tom[S] (obtained from Tom's friend) • k: knowledge about individuals (u1, …, uk) other than t • e.g. flu ∈ Bob[S] and flu ∈ Cary[S] and cancer ∈ Frank[S] (obtained from another hospital) • m: knowledge about the relationship between t and other individuals (v1, …, vm) • e.g. AIDS ∈ Ann[S] → AIDS ∈ Tom[S] (because Ann is Tom's wife)

  18. Privacy skyline • Example: knowledge threshold (1, 5, 2) and confidence c = 50% for sensitive value AIDS • the adversary knows l ≤ 1 sensitive values that t does not have • the adversary knows the sensitive values of k ≤ 5 other individuals • the adversary knows m ≤ 2 members of t's same-value family • When the above hold, the adversary cannot predict individual t to have AIDS with confidence ≥ 50%

  19. Privacy skyline • If the transformed data D* is safe for (1, 5, 2), it is safe for any (l, k, m) with l ≤ 1, k ≤ 5, m ≤ 2 • i.e., the entire region dominated by (1, 5, 2)

  20. Privacy skyline • A skyline is a set of mutually incomparable points • e.g. {(1, 1, 5), (1, 3, 4), (1, 5, 2)} (see the sketch below)
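A small sketch of the dominance relation behind the skyline; the helper name `dominates` is ours:

```python
# Point p covers q if every knowledge bound of q fits within p's bounds;
# a skyline is a set of mutually incomparable points.

def dominates(p, q):
    """p = (l, k, m) covers q componentwise."""
    return all(qi <= pi for pi, qi in zip(p, q))

skyline = [(1, 1, 5), (1, 3, 4), (1, 5, 2)]

# The three points are pairwise incomparable, so this is a valid skyline:
assert not any(dominates(p, q) for p in skyline for q in skyline if p != q)

# An adversary bounded by (1, 2, 3) is covered by the point (1, 3, 4):
print(any(dominates(p, (1, 2, 3)) for p in skyline))   # True
```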

  21. Privacy skyline • Given a skyline {(l1, k1, m1, c1), …, (lr, kr, mr, cr)} • a release candidate D* is safe for sensitive value s iff, for i = 1 to r: • max { Pr(s ∈ t[S] | L_t, (li, ki, mi), D*) } < ci • i.e., the maximum probability that sensitive value s belongs to any individual t, w.r.t. the external knowledge and the release candidate, stays below the confidence threshold ci
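The safety condition reads as a loop over skyline points; here `max_disclosure` is a hypothetical callback standing in for the paper's disclosure computation:

```python
# D* is safe for a sensitive value iff the maximum disclosure under every
# (l, k, m) bound on the skyline stays below that point's confidence c.

def is_safe(skyline, max_disclosure):
    return all(max_disclosure(l, k, m) < c for (l, k, m, c) in skyline)

# Toy stand-in: disclosure grows with the amount of adversary knowledge.
toy = lambda l, k, m: 0.05 * (l + k + m)
print(is_safe([(1, 1, 5, 0.5), (1, 3, 4, 0.5), (1, 5, 2, 0.5)], toy))  # True
```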

  22. [Figure: Original Table vs. Generalized Table vs. Bucketized Table]
