
### Protecting Statistical Databases Against Snoopers

Comparison of two methods

Disclosure vs. Anonymity

- Information disclosure necessary for planning and numerical measurements
- Anonymity necessary for protection of the individual and the public’s trust in systems

Medical Data

Necessary for:

- Measuring effectiveness of current treatments
- Finding sources of common medical mistakes
- Tracking contagious disease
- Government spending planning
- Health Insurance Companies

Anonymity: Not as Easy as it Looks

Complete Identification Without Uniquely Identifying Information

Outside Factors Affecting Privacy

- Snooper’s supplementary knowledge
- Public data sources
- Rarity

Comparing Two Methods of Protection

- What are the privacy guarantees?
- Can useful information be gained?

Sensitivity-based Noise-adding Algorithm

- Proposed by Dwork, McSherry, Nissim and Smith
- Adds noise to each answer based on the sensitivity of the series of queries
- Amount of privacy based on ε, a coefficient in the noise-generating formula
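As a minimal sketch of the noise-adding idea, assume the noise is drawn from a zero-mean Laplace distribution whose scale is the sensitivity of the whole series of queries divided by ε, as in the Dwork et al. paper cited in the bibliography. The function names and the `rng` parameter are illustrative and do not come from the slides.

```python
import random


def laplace_noise(scale, rng):
    """Draw one zero-mean Laplace sample with the given scale (scale > 0)."""
    # The difference of two independent exponential samples with rate
    # 1/scale follows a Laplace distribution with that scale.
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_answers(true_answers, total_sensitivity, epsilon, rng=None):
    """Add independent Laplace noise to every answer in a series of queries.

    total_sensitivity is the sensitivity of the whole series;
    a smaller epsilon means more noise and therefore more privacy.
    """
    rng = rng or random.Random()
    scale = total_sensitivity / epsilon
    return [answer + laplace_noise(scale, rng) for answer in true_answers]


if __name__ == "__main__":
    # Hypothetical series: three counts with a total sensitivity of 3.
    print(noisy_answers([120, 45, 7], total_sensitivity=3.0, epsilon=0.5))
```

Because the scale grows with the total sensitivity, every additional query in the series makes each individual answer noisier.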

Sensitivity

How much could changing one row change an answer?

MEAN

COUNT

HISTOGRAMS

The sensitivity of a series of queries is the sum of the sensitivities of the queries
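The sum rule can be sketched as follows. The per-query values here are assumptions, since the slide's own figures did not survive in the transcript: they are the usual worst-case changes when one row is replaced (1 for a count, 2 for a histogram because one bucket loses a row and another gains it, and the value range divided by the number of rows for a mean over bounded values).

```python
def count_sensitivity():
    """Replacing one row changes a count by at most 1."""
    return 1.0


def histogram_sensitivity():
    """Replacing one row lowers one bucket by 1 and raises another by 1."""
    return 2.0


def mean_sensitivity(lowest, highest, n_rows):
    """Replacing one row moves a mean of bounded values by at most range / n."""
    return (highest - lowest) / n_rows


def series_sensitivity(per_query_sensitivities):
    """The sensitivity of a series is the sum over its queries."""
    return sum(per_query_sensitivities)


if __name__ == "__main__":
    # Hypothetical series: one count, one histogram, one mean of ages 0..100
    # over 50 rows.
    series = [count_sensitivity(), histogram_sensitivity(),
              mean_sensitivity(0, 100, 50)]
    print(series_sensitivity(series))
```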

Coin-flip Algorithm

- Proposed by Mishra and Sandler
- A way for individuals to publish their own personal data
- Amount of privacy based on ε, the bias in the coin-flip

The k possible answers to a query are ordered and numbered

If an individual’s answer to the query is the ith answer, the profile is a string of k bits in which the ith bit is one and the others are zero

To sanitize, each bit is flipped with probability ½ + ε/2

All sanitized profiles resemble a random string of ones and zeros
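The profile and sanitizing steps above can be sketched as follows, taking the flip probability ½ + ε/2 exactly as stated on the slide. The function names and the zero-based `answer_index` are illustrative assumptions.

```python
import random


def make_profile(answer_index, k):
    """Build a k-bit profile with a one at answer_index and zeros elsewhere."""
    return [1 if i == answer_index else 0 for i in range(k)]


def sanitize(profile, epsilon, rng=None):
    """Flip each bit independently with probability 1/2 + epsilon/2."""
    rng = rng or random.Random()
    p_flip = 0.5 + epsilon / 2.0
    return [bit ^ 1 if rng.random() < p_flip else bit for bit in profile]


if __name__ == "__main__":
    profile = make_profile(0, 3)          # the first of three possible answers
    print(profile, sanitize(profile, epsilon=0.1))
```

With a small ε each bit is flipped almost exactly half the time, which is why a single sanitized profile looks like a random bit string.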

Implementing the Coin-flip Algorithm

Example: HIV status

- Ordered possible responses: “POSITIVE, NEGATIVE, UNKNOWN”
- The original profile of an HIV+ individual: “1, 0, 0”
- Results of coin-flips: “STAY, FLIP, STAY”
- Resulting sanitized profile: “1, 1, 0”
- What do we know about the individual from the sanitized profile?
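One way to approach the closing question: a single sanitized bit says little about the individual, but many sanitized profiles together still support frequency estimates. If each bit is flipped with probability p = ½ + ε/2, a position whose true fraction of ones is f shows an expected observed fraction of p − f·ε, so f can be estimated as (p − observed) / ε. The sketch below assumes that flip probability; the function names and the simulation helper are hypothetical.

```python
import random


def estimate_fraction(sanitized_bits, epsilon):
    """Estimate the true fraction of ones at one profile position.

    Inverts E[observed] = p_flip - f * epsilon, where p_flip = 1/2 + epsilon/2.
    """
    p_flip = 0.5 + epsilon / 2.0
    observed = sum(sanitized_bits) / len(sanitized_bits)
    return (p_flip - observed) / epsilon


def simulate(true_fraction, n, epsilon, seed=0):
    """Sanitize one bit position for n hypothetical individuals."""
    rng = random.Random(seed)
    p_flip = 0.5 + epsilon / 2.0
    n_ones = round(true_fraction * n)
    bits = [1] * n_ones + [0] * (n - n_ones)
    return [b ^ 1 if rng.random() < p_flip else b for b in bits]


if __name__ == "__main__":
    sanitized = simulate(true_fraction=0.3, n=200_000, epsilon=0.2)
    print(estimate_fraction(sanitized, epsilon=0.2))  # close to 0.3
```

The estimate divides by ε, so a smaller ε (more privacy) needs more profiles to reach the same accuracy.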

My Research

- Compare the total amount of error generated by histogram / frequency queries
- Hypothesis: The noise-adding algorithm will generate less error for few queries and the coin-flip algorithm will generate less error for many queries
- Research question: Where is the “sweet spot” where the error lines cross on a graph?

- With the smallest histograms first, the first “sweet spot” occurs at 32 queries.

- With the largest histograms first, the first “sweet spot” occurs at 189 queries.

A Second Look

- Range of sensitivity: 2 to 136
- Unordered histograms:
  - At first “sweet spot”, sensitivity = 30.
- Smallest histograms first:
  - At first “sweet spot”, sensitivity = 32.
- Largest histograms first:
  - At first “sweet spot”, sensitivity = 34.

Conclusions

- For histogram / frequency queries, “sweet spots” occur between sensitivity = 30 and sensitivity = 40, so for least error:
  - If sensitivity < 30, use NOISE-ADDING algorithm
  - If sensitivity > 40, use COIN-FLIP algorithm
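The rule of thumb above can be written as a small helper. This is only a sketch of the stated thresholds; the function name and the "either" result for the 30 to 40 band, where the slides report the error lines crossing, are assumptions.

```python
def choose_algorithm(sensitivity, low=30, high=40):
    """Pick the lower-error algorithm for histogram / frequency queries."""
    if sensitivity < low:
        return "noise-adding"
    if sensitivity > high:
        return "coin-flip"
    # Between the thresholds the measured error lines cross, so neither
    # algorithm is clearly better.
    return "either"


if __name__ == "__main__":
    for s in (2, 35, 136):
        print(s, choose_algorithm(s))
```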

Quick Bibliography

- Survey:
  - N R Adam and J C Wortmann. Security-control methods for statistical databases: a comparative study. ACM Computing Surveys, 21(4), December 1989.
- Noise-adding algorithm:
  - C Dwork, F McSherry, K Nissim, A Smith. Calibrating noise to sensitivity in private data analysis. 3rd Theory of Cryptography Conference, 2006.
- Coin-flip algorithm:
  - N Mishra, M Sandler. Symposium on Principles of Database Systems, 2006.

Professor Alf Weaver, PhD

Professor Nina Mishra, PhD

- REU program at UVa, sponsored by the National Science Foundation
