2D representation of the original and generalized table.

The original payroll table

Name        Age   Start-year   Salary
Alice       25    2001         7k
Bob         30    2004         1k
Christina   35    1990         2k
Daniel      40    1995         3k
Emily       45    2000         6k
William     55    1985         3k

A 3-anonymous generalization

Age        Start-year     Salary
[25, 45]   [2000, 2004]   7k
[25, 45]   [2000, 2004]   1k
[35, 55]   [1985, 1995]   2k
[35, 55]   [1985, 1995]   3k
[25, 45]   [2000, 2004]   6k
[35, 55]   [1985, 1995]   3k

Complexity and Approximation Ratio

d: dimensionality, n: the size of the dataset

Algorithm   Time Complexity            Approximation Ratio
DAG         O(3^d · d · n · log^2 n)   8d
MMG         O(d · n^(2d+1))            2d+1
NNG         O(d · n^2)                 6d
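
The 3-anonymous generalization listed above can be checked mechanically. The following is a minimal sketch, assuming Age and Start-year are the quasi-identifiers and Salary is the sensitive value; the function name `is_k_anonymous` and the row layout are illustrative choices, not part of the poster.

```python
from collections import Counter

# Generalized rows from the poster: (age range, start-year range, salary).
# Assumption: salary is the sensitive value, not a quasi-identifier.
GENERALIZED = [
    ((25, 45), (2000, 2004), "7k"),
    ((25, 45), (2000, 2004), "1k"),
    ((35, 55), (1985, 1995), "2k"),
    ((35, 55), (1985, 1995), "3k"),
    ((25, 45), (2000, 2004), "6k"),
    ((35, 55), (1985, 1995), "3k"),
]


def is_k_anonymous(rows, k, qi_columns=(0, 1)):
    """Return True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(row[c] for c in qi_columns) for row in rows)
    return all(count >= k for count in counts.values())


print(is_k_anonymous(GENERALIZED, 3))  # True
print(is_k_anonymous(GENERALIZED, 4))  # False
```

Each of the two generalized value combinations appears exactly three times, so the table is 3-anonymous but not 4-anonymous.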

The Institute for Information Assurance

On Multidimensional k-Anonymity with Local Recoding Generalization

Presented by: Yang Du
College of Computer and Information Science
Northeastern University, Boston, MA 02115
duy@ccs.neu.edu

  • Overview

  • Study an important privacy-preserving method, namely k-anonymity

  • Show that the problem is provably hard: even finding a good enough approximate answer is hard

  • Develop three algorithms with different tradeoffs between the approximation ratio and complexity

  • Introduction

  • The motivation is privacy preservation

    • Publish sensitive data so that it can be analyzed accurately without revealing private information about individuals

  • Simply removing the id column is not enough

    • Attackers can use other attributes, called quasi-identifiers, to re-identify individuals

  • Generalization is necessary

    • The quasi-identifiers are replaced by values in more general forms

  • K-anonymity is often a requirement

    • Make the quasi-identifiers of each tuple indistinguishable from those of at least k-1 other tuples

  • Approximation Algorithms

  • The Divide-and-Group (DAG) Algorithm

  • Divide the space into square cells of a suitable size

  • Find a set of non-overlapping tiles of 2 x 2 cells to cover the points, such that each tile covers at least k points

  • Assign the remaining (uncovered) points to the nearest tile

  • Problem Mapping

  • Given a table R containing d quasi-identifier attributes

  • Map each quasi-identifier attribute to one dimension

  • Map each tuple in the table to a point in d-dimensional space

  • Map the k-anonymous generalization problem to a partition problem

    • Partition a set of d-dimensional points into some groups

    • Each point belongs to one and only one group

    • Each group contains at least k points

    • Each point is generalized to the minimum bounding rectangle (MBR) of its group
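
The mapping above can be sketched in a few lines. This is a minimal illustration, assuming points are tuples of numbers and a partition is given as lists of point indices; the helper names `mbr` and `generalize` are hypothetical.

```python
def mbr(points):
    """Minimum bounding rectangle of d-dimensional points, as (low, high) pairs."""
    return [(min(coord), max(coord)) for coord in zip(*points)]


def generalize(points, groups, k):
    """Replace each point by the MBR of its group.

    `groups` is a list of lists of point indices. It must be a partition of
    range(len(points)) in which every group has at least k members.
    """
    seen = sorted(i for group in groups for i in group)
    if seen != list(range(len(points))):
        raise ValueError("groups must cover every point exactly once")
    if any(len(group) < k for group in groups):
        raise ValueError("every group needs at least k points")
    result = [None] * len(points)
    for group in groups:
        box = mbr([points[i] for i in group])
        for i in group:
            result[i] = box
    return result


# (Age, Start-year) for Alice, Bob, Christina, Daniel, Emily, William.
points = [(25, 2001), (30, 2004), (35, 1990), (40, 1995), (45, 2000), (55, 1985)]
groups = [[0, 1, 4], [2, 3, 5]]
for box in generalize(points, groups, k=3):
    print(box)
```

With the partition {Alice, Bob, Emily} and {Christina, Daniel, William}, the output reproduces the intervals of the poster's 3-anonymous table: [25, 45] and [2000, 2004] for the first group, [35, 55] and [1985, 1995] for the second.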

  • Quality Measuring

  • The smaller the MBRs are, the more accurate the analysis results are.

  • The size of each MBR is measured by its perimeter.

  • Objective

    • Find the optimal partition that minimizes the maximum size (perimeter) among all MBRs.
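
As a worked example of this objective, the sketch below scores the two groups of the poster's 3-anonymous table. It assumes the perimeter of a d-dimensional MBR is twice the sum of its side lengths (the ordinary perimeter when d = 2) and uses the raw attribute values without normalization; both are assumptions, since the poster does not spell them out.

```python
def perimeter(box):
    """Perimeter of an axis-aligned box given as (low, high) pairs.

    Assumption: 2 * (sum of side lengths), the usual perimeter for d = 2.
    """
    return 2 * sum(high - low for low, high in box)


def partition_cost(boxes):
    """Objective value: the largest perimeter among the groups' MBRs."""
    return max(perimeter(box) for box in boxes)


boxes = [
    [(25, 45), (2000, 2004)],  # Alice, Bob, Emily
    [(35, 55), (1985, 1995)],  # Christina, Daniel, William
]
print([perimeter(box) for box in boxes])  # [48, 60]
print(partition_cost(boxes))              # 60
```

The second group determines the cost here, so a better partition would have to shrink the largest MBR rather than the average one.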

  • The Min-MBR-Group (MMG) Algorithm

  • For each point p, find the smallest MBR which covers at least k points including p

  • Find a set of non-overlapping MBRs from the results of the previous step

  • Assign each point to its nearest selected MBR
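
The first MMG step can be illustrated by brute force. The sketch below is not the authors' procedure; it simply enumerates every box whose sides lie on point coordinates and keeps the one with the smallest perimeter that contains p and at least k points. The function names and the perimeter definition (twice the sum of side lengths) are assumptions made for the example.

```python
from itertools import product


def perimeter(box):
    """Assumed size measure: 2 * (sum of side lengths)."""
    return 2 * sum(high - low for low, high in box)


def covers(box, q):
    """True if point q lies inside the axis-aligned box (borders included)."""
    return all(low <= x <= high for x, (low, high) in zip(q, box))


def smallest_box_with_k(points, p, k):
    """Smallest-perimeter box containing p and at least k of the points.

    Returns a tuple of (low, high) pairs, or None if no such box exists.
    """
    per_axis = []
    for axis in range(len(p)):
        values = sorted({q[axis] for q in points})
        # Only intervals that contain p's coordinate on this axis are useful.
        per_axis.append([(low, high) for low in values for high in values
                         if low <= p[axis] <= high])
    best = None
    for box in product(*per_axis):
        if sum(covers(box, q) for q in points) < k:
            continue
        if best is None or perimeter(box) < perimeter(best):
            best = box
    return best


# (Age, Start-year) rows of the payroll table; p is Alice.
points = [(25, 2001), (30, 2004), (35, 1990), (40, 1995), (45, 2000), (55, 1985)]
box = smallest_box_with_k(points, (25, 2001), k=3)
print(perimeter(box))  # 48
```

Several boxes tie at perimeter 48 for Alice, so which one is returned depends on the enumeration order. The enumeration visits on the order of n^(2d) candidate boxes per point, which is why this kind of search is only practical for small d.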

  • The Nearest-MBR-Group (NNG) Algorithm

  • For each point p, find the MBR which covers p and its k-1 nearest neighbors

  • Find a set of non-overlapping MBRs from the results of the previous step

  • Assign each point to its nearest selected MBR
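
The three NNG steps can be sketched as follows. The poster does not say how the non-overlapping MBRs are selected or how distance is measured, so this sketch assumes a greedy pass in order of increasing perimeter and Euclidean distance; the name `nng_sketch` and the sample points are hypothetical.

```python
import math


def mbr(points):
    """Minimum bounding rectangle as a list of (low, high) pairs."""
    return [(min(coord), max(coord)) for coord in zip(*points)]


def perimeter(box):
    """Assumed size measure: 2 * (sum of side lengths)."""
    return 2 * sum(high - low for low, high in box)


def overlaps(a, b):
    """True if two axis-aligned boxes share at least one point."""
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(a, b))


def distance_to_box(p, box):
    """Euclidean distance from p to the box (0 if p is inside)."""
    return math.sqrt(sum(max(low - x, 0, x - high) ** 2
                         for x, (low, high) in zip(p, box)))


def nng_sketch(points, k):
    """Group points (len(points) >= k) into groups of at least k indices."""
    # Step 1: MBR of each point together with its k-1 nearest neighbors.
    candidates = []
    for p in points:
        nearest = sorted(points, key=lambda q: math.dist(p, q))[:k]
        candidates.append(mbr(nearest))
    # Step 2 (assumed rule): greedily keep non-overlapping MBRs, smallest first.
    chosen = []
    for box in sorted(candidates, key=perimeter):
        if not any(overlaps(box, other) for other in chosen):
            chosen.append(box)
    # Step 3: assign every point to its nearest chosen MBR.
    groups = [[] for _ in chosen]
    for i, p in enumerate(points):
        j = min(range(len(chosen)), key=lambda j: distance_to_box(p, chosen[j]))
        groups[j].append(i)
    return groups


# Hypothetical data: two well-separated clusters of three points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
print(nng_sketch(points, k=3))  # [[0, 1, 2], [3, 4, 5]]
```

Because the chosen MBRs do not overlap and each was built from k points, every point inside a chosen MBR stays with it, so each resulting group has at least k members.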

  • Hardness of the Problem

  • Finding the optimal partition is NP-hard, so no polynomial-time algorithm exists unless P = NP.

  • Finding a partition with an approximation ratio less than 5/4, i.e. one whose maximum perimeter is less than 5/4 times the maximum perimeter of the optimal partition, is also NP-hard.

  • For more information:

    • http://www.ccs.neu.edu/research/dblab

    • Prof. Donghui Zhang – donghui@ccs.neu.edu

    • Yang Du – duy@ccs.neu.edu