Intruder Testing: Demonstrating practical evidence of disclosure protection in 2011 UK Census

Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality, Ottawa, 28-30 October 2013

Keith Spicer, Caroline Tudor and George Cornish

Forthcoming Attractions

  • 2011 UK Census

    • SDC method: targeted record swapping

  • Sufficient uncertainty

  • Intruder testing:

    • Considerations

    • The intruders

    • Feedback

    • Validating Claims

    • Results

  • Conclusions


2011 UK Census

  • Context of user criticism in 2001

    • Small cell adjustment

    • Poor utility of some outputs

  • Needed additivity and consistency

  • Evaluation of possible SDC methods

  • Record swapping selected

    • Swap households (individuals in communal establishments)

    • Targeted to ‘risky’ records

    • Swap rate sufficiently low to maintain utility
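
The idea of targeted record swapping can be sketched in a few lines. This is a minimal illustration only, not the method ONS actually used: the `Household` fields, the matching rule (same household size) and both swap rates are assumptions made up for the example. It shows why swapping preserves additivity and consistency: two matched households simply exchange area codes, so every area keeps the same number of households of each size.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    hid: int     # hypothetical household identifier
    area: str    # geography code; this is what gets swapped
    size: int    # matching key: swap only with a same-size household
    risky: bool  # flagged as a high disclosure risk


def targeted_swap(households, base_rate=0.02, risky_rate=0.5, seed=0):
    """Swap area codes between selected households and a matched partner.

    Risky households are selected with a higher probability than others.
    Both rates are illustrative assumptions, not published values.
    Returns the set of household ids that were swapped.
    """
    rng = random.Random(seed)
    swapped = set()
    for h in households:
        if h.hid in swapped:
            continue  # each household is swapped at most once
        rate = risky_rate if h.risky else base_rate
        if rng.random() >= rate:
            continue  # not selected for swapping
        partners = [
            p for p in households
            if p.hid != h.hid
            and p.hid not in swapped
            and p.area != h.area
            and p.size == h.size
        ]
        if not partners:
            continue  # no suitable match in another area
        p = rng.choice(partners)
        h.area, p.area = p.area, h.area
        swapped.update({h.hid, p.hid})
    return swapped
```

Because only the area code moves, any table built from the swapped records still sums to the same totals; what changes is which households sit behind the small cells.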


Level of Protection: Sufficient Uncertainty

  • Statistics and Registration Service Act (SRSA) 2007: personal information must not be disclosed

  • Impossible to get zero risk

  • There will be 1s, 2s and attribute disclosures in tables

    • Some will be real

    • Some will be fake

  • Census White Paper: “no statistics will be produced that allow the identification of an individual ... with a high degree of confidence”

  • Needs to be “sufficient uncertainty”
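
As a rough illustration of what "1s, 2s and attribute disclosures" means in a published table, the sketch below scans a two-way table for them. The table layout (a dict of rows, each a dict of column counts) and the function name are assumptions for the example. After swapping, a flagged cell may be real or fake, which is where the uncertainty comes from.

```python
def flag_risky_cells(table):
    """Find small cells and apparent attribute disclosures.

    table: dict mapping row label -> dict mapping column label -> count.
    Returns (small_cells, attribute_rows):
      small_cells    - (row, column, count) for every cell equal to 1 or 2
      attribute_rows - (row, column) where the whole row falls in one column,
                       so everyone in that row appears to share the attribute
    """
    small_cells = []
    attribute_rows = []
    for row, cells in table.items():
        nonzero = {col: n for col, n in cells.items() if n > 0}
        for col, n in nonzero.items():
            if n in (1, 2):
                small_cells.append((row, col, n))
        if len(nonzero) == 1:
            attribute_rows.append((row, next(iter(nonzero))))
    return small_cells, attribute_rows
```

A scan like this only shows where an intruder might look; it cannot tell whether a flagged cell reflects a real person or a swapped or imputed record.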


ICO Code of Practice

  • ICO = Information Commissioner’s Office, which oversees interpretation of the Data Protection and Freedom of Information Acts

  • Issued a Code of Practice in light of the abortion statistics FOI case

  • Encouraged empirical evidence of disclosure risk

  • Intruder testing of reconviction data by Ministry of Justice provided a steer


Intruder Testing – Considerations

  • Recruitment of intruders

  • Security of Census database

  • Creation of pre-publication tables

    • Tables for the intruder’s own Output Area (population c. 300)

    • Tables for the intruder’s own Middle Layer Super Output Area, MSOA (population c. 7,500)

    • Maps for local areas

  • Unrestricted internet access (2nd laptop)

  • Briefing material

  • Validating claims

  • Ethical considerations


Intruder Testing – The intruders

  • 18 intruders

  • ONS staff or contractors with security clearance

  • Few with SDC experience

  • All with excellent IT skills and adept with data

  • Range of grades up to Divisional Director

  • Range of local areas in England & Wales

  • Availability for at least ½ day


Intruder Testing – Other issues

  • Intruders’ claims

    • Only general feedback given

    • No specific claim confirmed or denied

  • Checking claims

    • Potentially of people the checker knows (e.g. a self-identification made by a work colleague)

    • Consent of intruders

  • Websites

    • Paying for access

    • Retaining search details (intruder’s identity)

  • Laptops wiped after each intruder


Intruder Feedback

  • For each claim:

    • Name of person

    • Address

    • Table and cell reference

    • Type of claim: identification or attribute (and which attribute)

    • Reasoning, variables, tables, websites used

    • Level of confidence in claim


Intruder Feedback

  • Intruders took between 1.5 and 6 hours

  • 16 of 18 intruders made at least one claim

  • >50 claims made in total

  • Tables looked sensible for their areas

  • Swap rate looked low

  • Generally intruders felt utility preserved


Validating Claims

  • Cell reference and table used to obtain form id

  • Form id → Census image on the image database (very restricted access)

  • Claim counted as correct if the name and address match

  • Check of logic used by intruder
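
The lookup chain on this slide (table and cell → form id → census image → compare name and address) might be sketched as below. Everything here is hypothetical: the `Claim` fields, the two lookup dictionaries and the whitespace and case normalisation are assumptions for illustration, not a description of ONS systems. The final check of the intruder's reasoning is a manual step and is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    name: str     # person the intruder claims to have found
    address: str
    table: str    # table reference given by the intruder
    cell: str     # cell reference within that table


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def validate_claim(claim, cell_to_forms, form_to_record):
    """Return True if any form behind the claimed cell matches name and address.

    cell_to_forms:  dict mapping (table, cell) -> list of form ids
    form_to_record: dict mapping form id -> (name, address) read from the image
    """
    for form_id in cell_to_forms.get((claim.table, claim.cell), []):
        record = form_to_record.get(form_id)
        if record is None:
            continue  # no image found for this form id
        name, address = record
        if (normalise(name) == normalise(claim.name)
                and normalise(address) == normalise(claim.address)):
            return True
    return False
```

A cell of 2 would map to two form ids, which is why the sketch checks every form behind the cell rather than just one.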


Results

[Chart: Level of confidence in intruder’s claim]


Results

  • 48% of claims correct overall

  • Best success rate for claims made with 60-79% confidence (67% correct)

  • Claims about self or family: 61% correct (v 36% for claims about others)

  • Very few attribute disclosure claims (<10%)

  • Tables used most:

    • Age x sex x industry

    • Age x sex x marital status

    • Age x sex x economic activity

    • Sex x industry x economic activity

    • Age x sex x health x disability
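
Success rates by confidence band, like those quoted above, could be tallied from validated claims along these lines. The 20-point bands are an assumption suggested by the "60-79%" band on the slide, and the input is simply a list of (confidence, correct) pairs. The claim-level data are not published here, so the sketch does not reproduce the 48% or 67% figures.

```python
from collections import defaultdict


def band_label(confidence):
    """Map a confidence percentage (0-100) to a 20-point band label."""
    lower = min(confidence // 20 * 20, 80)  # 100 falls in the top band
    upper = 100 if lower == 80 else lower + 19
    return f"{lower}-{upper}%"


def success_rates(claims):
    """Tally validated claims by confidence band.

    claims: iterable of (confidence_percent, is_correct) pairs.
    Returns dict mapping band label -> (n_correct, n_claims, share_correct).
    """
    tally = defaultdict(lambda: [0, 0])
    for confidence, is_correct in claims:
        counts = tally[band_label(confidence)]
        counts[0] += int(is_correct)
        counts[1] += 1
    return {band: (c, n, c / n) for band, (c, n) in tally.items()}
```

With around 50 claims spread over five bands, each band holds only a handful of claims, so per-band rates should be read with caution.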


How could so many claims be wrong?

  • Non-response

  • Imputation (both person and item)

  • Capture error (e.g. write-in date of birth)

  • Processing (esp. coding from free text)

  • Respondent error

  • Record swapping

  • Intruder error


Conclusions for Census

  • Fewer than half of claims correct

  • Fewer than half of “high confidence” claims correct

  • How much uncertainty is “sufficient”?

  • ICO have endorsed this work and said “risk is manageable”

  • Special attention to the most used tables and their “close relatives”

  • National Statistician is content

  • Communication strategy important


Conclusions for Intruder Testing

  • Useful for assessing risk empirically

  • Considerable resource needed

  • Need a lot of support

  • Would not suggest doing this for every output

  • Need assessment of what “success” looks like

  • Use in conjunction with theoretical work


Intruder Testing: Demonstrating practical evidence of disclosure protection in 2011 UK Census

  • Any Questions?

