Privacy Preserving Publication of Moving Object Data

Presentation Transcript


  1. Privacy Preserving Publication of Moving Object Data • Francesco Bonchi, Yahoo! Research, Avinguda Diagonal 177, Barcelona, Spain • Joey Lei, CS295 - Privacy and Data Management

  2. Outline • Intro & Background • Clustering and Perturbation Techniques • Spatio-Temporal Cloaking (Generalization) Techniques • Future Research

  3. Location Privacy • Growing prevalence of location-aware devices • Mobile phones and GPS devices • Two Analysis Groups • Online • Real-time monitoring of moving objects and motion patterns • Development of location-based services (LBS) • Example: Google Maps on the iPhone • Offline • Collection of traces left by moving objects • Offline analysis to extract behavioral knowledge • Example: public transportation

  4. Privacy Concerns • Location data allows for intrusive inferences • Reveals habits • Social customs • Religious and sexual preferences • Unauthorized advertisement • User profiling

  5. Offline Analysis • Traffic Management Application • Paths (trajectories) of vehicles with GPS are recorded • Geographic Privacy-aware Knowledge Discovery and Delivery (GeoPKDD) • Traffic data published for the city of Milan (Italy) • Car identifiers were replaced with pseudonyms • Daily Commute Example • Bob’s home and workplace are still traceable from the location data, so they act as quasi-identifiers (QIDs) • Joining the data with a telephone directory re-identifies him

  6. Definitions • Anonymity Preserving Data Publishing of Moving Objects Databases • How to transform published location data while maintaining utility • Moving Object Database (MOD) • A set of individuals, time points, and trajectories

  7. Background: Location Based Services • Ideals • Provide service without learning the user’s exact position • Location data is forgotten once service is provided • k-anonymity definition • A response to a request for location data is k-anonymous when its spatial and temporal information is indistinguishable from that of at least k - 1 other responses sent from different users
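As a minimal sketch of the k-anonymity definition on this slide, assume a request is cloaked to a rectangle and a time window, and count the distinct users observed inside it. The function name, the tuple layouts, and the sample data are hypothetical and do not come from the presentation.

```python
def is_location_k_anonymous(region, interval, observations, k):
    """Check whether at least k distinct users fall inside a cloaked request.

    region:       (x_min, y_min, x_max, y_max), the cloaked area (assumed rectangular)
    interval:     (t_start, t_end), the cloaked time window
    observations: iterable of (user_id, x, y, t) tuples (hypothetical layout)
    """
    x_min, y_min, x_max, y_max = region
    t_start, t_end = interval
    users_inside = {
        user
        for user, x, y, t in observations
        if x_min <= x <= x_max and y_min <= y <= y_max and t_start <= t <= t_end
    }
    # The requester plus k - 1 other users must be indistinguishable.
    return len(users_inside) >= k


# Made-up observations: u1 and u2 share the cloaked region, u3 is elsewhere.
observations = [("u1", 1.0, 1.0, 10), ("u2", 2.0, 1.5, 12), ("u3", 9.0, 9.0, 11)]
two_anonymous = is_location_k_anonymous((0, 0, 3, 3), (5, 15), observations, k=2)
three_anonymous = is_location_k_anonymous((0, 0, 3, 3), (5, 15), observations, k=3)
print(two_anonymous, three_anonymous)
```

Enlarging the region or the time window until the check passes is one way to read "cloaking", at the cost of a less precise service response.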

  8. LBS: Location k-Anonymity • Spatial Requirements • Ubiquity: a user visits at least k regions • Congestion: the number of users is at least k • One Way to Achieve This: Mix Zones • An area where LBS providers cannot trace a specific user’s movement • Identities are replaced with pseudonyms • Users entering these zones at the same time are mixed together

  9. LBS: Location Based Quasi-Identifier • A spatio-temporal pattern that can uniquely identify one individual • A set of spatial areas and time intervals plus a recurrence formula • AreaCondominium [7am, 8am], AreaOfficeBldg [8am, 9am], AreaOfficeBldg [4pm, 6pm], AreaCondominium [5pm, 7pm] • Recurrence: 3.Weekdays ∗ 2.Weeks

  10. LBS: Historical k-Anonymity • In the offline context • A set of requests satisfies historical k-anonymity if there exist k - 1 personal histories of locations (trajectories) belonging to k - 1 different users such that they are location-time consistent (indistinguishable)

  11. Outline • Intro & Background • Clustering and Perturbation Techniques • Spatio-Temporal Cloaking (Generalization) Techniques • Conclusions

  12. Clustering and Perturbation • C&P ignores the inherent problems with location QIDs: • Each individual can have their own QIDs, which makes it difficult to define a single QID for all individuals • Area(Home,Office,??)[??am- ??pm] • Recurrence: 7.Weekdays ∗ 52.Weeks • Solution: anonymize trajectories instead • Microaggregation / k-member anonymity

  13. Clustering and Perturbation • Trajectories are modeled not as polylines but as cylindrical volumes with radius δ (the uncertainty radius) • If another trajectory moves within the cylinder of the given trajectory, then the two trajectories are indistinguishable from each other and form a (k, δ)-anonymity set
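The cylinder test above can be sketched as a pointwise distance check. This is a simplified reading, assuming each trajectory is a dict mapping a timestamp to an (x, y) position, that all trajectories are sampled at the same timestamps, and that "within the cylinder" means a Euclidean distance of at most δ at every timestamp. The function names and sample data are hypothetical.

```python
from itertools import combinations
from math import hypot


def co_localized(tr_a, tr_b, delta):
    """True if both trajectories share timestamps and stay within delta of each other."""
    if tr_a.keys() != tr_b.keys():
        return False
    return all(
        hypot(tr_a[t][0] - tr_b[t][0], tr_a[t][1] - tr_b[t][1]) <= delta
        for t in tr_a
    )


def is_anonymity_set(trajectories, k, delta):
    """True if there are at least k trajectories and every pair is co-localized."""
    if len(trajectories) < k:
        return False
    return all(co_localized(a, b, delta) for a, b in combinations(trajectories, 2))


# Made-up trajectories: a and b run in parallel one unit apart, c is far away.
a = {0: (0.0, 0.0), 1: (1.0, 0.0)}
b = {0: (0.0, 1.0), 1: (1.0, 1.0)}
c = {0: (0.0, 5.0), 1: (1.0, 5.0)}
print(is_anonymity_set([a, b], k=2, delta=1.0))
print(is_anonymity_set([a, b, c], k=2, delta=1.0))
```

The pairwise check is quadratic in the number of trajectories, which is acceptable for a single cluster but not for a whole database.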

  14. Clustering and Perturbation • Uncertainty trajectory • Anonymity set for two trajectories

  15. Achieving (k, δ)-anonymity • Achieved by Space Translation: slightly moving some observations in space • Step One: cluster trajectories of similar sizes • NWA (Never Walk Alone) • All equivalence classes have the same time span and special timestamp requirements π (e.g., π = 60: only full hours, such as 1:00PM-2:00PM)

  16. Achieving (k, δ)-anonymity • Step Two: perturb trajectories within uncertainty radius δ (i.e. transformation into anonymity set) • Grouping and Reconstruction • Finding the nearest matching points to group • Reconstruct a generalization for utility • Multi TGA and Fast TGA Algorithms
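One possible reading of the space translation step, as a sketch rather than the NWA algorithm itself: for each timestamp, compute the centroid of the cluster and pull any point farther than δ/2 from it onto the disk of radius δ/2. After that, any two points in the cluster are at most δ apart. The sketch assumes the cluster was already formed in Step One and that all trajectories share the same timestamps; `space_translate` and the sample data are hypothetical.

```python
from math import hypot


def space_translate(cluster, delta):
    """Move each point to within delta / 2 of the per-timestamp cluster centroid.

    cluster: list of trajectories, each a dict {timestamp: (x, y)} with equal keys.
    Returns new trajectories; points already inside the disk are left unchanged.
    """
    radius = delta / 2.0
    result = [dict() for _ in cluster]
    for t in sorted(cluster[0]):
        cx = sum(tr[t][0] for tr in cluster) / len(cluster)
        cy = sum(tr[t][1] for tr in cluster) / len(cluster)
        for out, tr in zip(result, cluster):
            x, y = tr[t]
            dist = hypot(x - cx, y - cy)
            if dist > radius:
                # Shrink the offset from the centroid so the point lands on the disk edge.
                scale = radius / dist
                x, y = cx + (x - cx) * scale, cy + (y - cy) * scale
            out[t] = (x, y)
    return result


# Made-up cluster: two points 4 units apart, uncertainty delta = 2.
cluster = [{0: (0.0, 0.0)}, {0: (4.0, 0.0)}]
moved = space_translate(cluster, delta=2.0)
print(moved)
```

Moving points only as far as the disk edge keeps the distortion small, which matters for the utility of the published data.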

  17. Outline • Intro & Background • Clustering and Perturbation Techniques • Spatio-Temporal Cloaking (Generalization) Techniques • Conclusions

  18. Trajectory Generalization • Figure: anonymization of three trajectories tr1, tr2 and tr3, based on point matching and removal, and spatio-temporal generalization

  19. Trajectory Reconstruction • Reference: Aggarwal, C.C., Yu, P.S.: A condensation approach to privacy preserving data mining.

  20. Quasi-identifier Methods • QIDs are a sequence of locations with multiple sensitive values (locations) • Which values are sensitive differs from the perspective of each adversary • Yet linkage attacks from all adversaries must be considered

  21. Quasi-identifier Methods • Possible Attack • T5 and t5A match! We know that person visited b1

  22. Space Generalization • Each position is an exact point on a grid • Generalizations become rectangles of nearby points
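A minimal sketch of this generalization, assuming the nearby points have already been grouped: each exact grid point in a group is replaced by the smallest axis-aligned rectangle that covers the whole group. The function name and the sample points are hypothetical.

```python
def bounding_rectangle(points):
    """Return (x_min, y_min, x_max, y_max) covering all (x, y) grid points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Made-up group of nearby grid points to be published as one rectangle.
group = [(1, 2), (3, 1), (2, 4)]
rectangle = bounding_rectangle(group)
print(rectangle)  # (1, 1, 3, 4)
```

A larger rectangle hides more points but says less about where each object actually was, which is the usual privacy versus utility trade-off.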

  23. Attack Graph • Privacy Breach on prior example • Definitions • I-Nodes (Individuals) • O-Nodes (Moving Object IDs) CS295 – Data Privacy and Confidentiality

  24. Attack Graph • If I1 is mapped to O2, no consistent mapping remains for I2 and I3 • Both I2 and I3 would have to map to O3 • Conclusion • O1 must map to I1
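The reasoning on this slide amounts to discarding every edge that cannot be part of a one-to-one assignment of individuals to moving objects. The brute-force sketch below enumerates all assignments, so it only suits tiny graphs. The edge set is a hypothetical graph chosen to be consistent with the slide's description (the original figure is not in the transcript), using 0-based indices so that (0, 1) stands for the edge between I1 and O2.

```python
from itertools import permutations


def consistent_edges(edges, n):
    """Keep only edges (i, j) that occur in some one-to-one mapping of I-nodes to O-nodes."""
    edge_set = set(edges)
    surviving = set()
    for assignment in permutations(range(n)):
        pairs = [(i, assignment[i]) for i in range(n)]
        if all(pair in edge_set for pair in pairs):
            surviving.update(pairs)
    return surviving


# Hypothetical attack graph: I1-{O1, O2}, I2-{O2, O3}, I3-{O2, O3}.
edges = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 1), (2, 2)}
remaining = consistent_edges(edges, n=3)
print(sorted(remaining))
```

In this example the edge between I1 and O2 disappears, so the adversary learns that I1 is O1 even though I1 started with two candidate objects.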

  25. Attack Graph • Shortcomings of the basic k-anonymity definition • Standard k-anonymity states there should be at least k paths originating from I (based on grouping) • What if we group O to have at least k paths?

  26. Attack Graph • Privacy Breach • Assume I2 and O5 are a pair • Then I1 would map to both O1 and O2, which is impossible! • I5 must map to O5

  27. Final k-Anonymity Definition • Every I-node has degree k or more • The attack graph is symmetric • For edge (Ii, Oj) there is also an edge (Ij, Oi) • Figure: 2-anonymous attack graph
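The two conditions of this definition can be checked directly. The sketch below assumes the attack graph is given as a set of (i, j) index pairs, where (i, j) means an edge between individual Ii and object Oj; the function name and both sample graphs are hypothetical.

```python
def is_k_anonymous(edges, n, k):
    """Check that every I-node has degree >= k and that the graph is symmetric.

    edges: iterable of (i, j) pairs meaning an edge between I_i and O_j (0-based)
    n:     number of individuals (and of moving objects)
    """
    edge_set = set(edges)
    degree = [0] * n
    for i, _ in edge_set:
        degree[i] += 1
    # Symmetry: an edge (I_i, O_j) requires the edge (I_j, O_i).
    symmetric = all((j, i) in edge_set for i, j in edge_set)
    return symmetric and all(d >= k for d in degree)


# Two groups of size 2, each fully connected: symmetric, every degree is 2.
symmetric_graph = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)}
# Every I-node has degree 2, but (0, 1) has no matching (1, 0).
asymmetric_graph = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)}
print(is_k_anonymous(symmetric_graph, n=4, k=2))
print(is_k_anonymous(asymmetric_graph, n=3, k=2))
```

The second graph shows why the degree condition alone is not the final definition: it passes the degree test and still fails the symmetry requirement.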

  28. Future Research • Ad-Hoc anonymization techniques for intended use of data • Privacy Preserving Data Mining • Focus on the analysis methods instead of the publishing

  29. Questions?
