
Confidentiality and the SARs

Update on SAR progress, and discussion of the disclosure work done for Scotland. Sam Smith, s.smith@man.ac.uk


Presentation Transcript


  1. Confidentiality and the SARs Update on SAR progress, and discussion of the disclosure work done for Scotland. Sam Smith s.smith@man.ac.uk

  2. Update 2001 SARs • Newsletter published very recently: • More delays • Disclosure control work by CAPRI is ongoing • Current estimate is for the Individual data to be with the SARs team in June • In-house access at ONS for users with urgent need.

  3. England and Wales • For the release of 100% tables, England and Wales and Northern Ireland rounded small cell counts. • It is not possible to match between the SAR and the tables for England, Wales and NI.

  4. Scotland • Scotland did not round their 100% tables. • As a result, there are counts of 1 in the tables. • If any of these individuals are present in the SAR, it is disclosive.

  5. Background • The following work has been carried out in collaboration with the General Register Office for Scotland, by the SARs team at CCSR. • At time of writing, I have had no access to disclosive data. • There is no geography below Scotland level.

  6. Population Uniques • Population Uniques are people who have one or more characteristics which are Unique in the Population. • Sample Uniques are people who are unique on one or more characteristics in the Sample.

  7. Scale • There are 62 variables in both the SAR and 100% tables. • GROS are interested in Tri-variate tables. Only concerned with uniques. • We obtained 37,820 tables, covering all combinations of trivariate tables.
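The table count on this slide follows from choosing 3 of the 62 shared variables without regard to order. A quick check of that arithmetic, with constant names chosen here purely for illustration:

```python
from math import comb

N_VARIABLES = 62  # variables common to the SAR and the 100% tables (from the slide)
TABLE_ORDER = 3   # tri-variate tables

# Number of distinct three-variable combinations, order ignored.
n_tables = comb(N_VARIABLES, TABLE_ORDER)
print(n_tables)  # 37820
```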

  8. Requesting the tables • An example request for input to their system was provided by GROS. • We then replicated and modified it to produce one request for each table. • The tables arrived on 4 CDs, a month later.

  9. An example table • Space-Time Research 2001 ED Based OSD - Test 1, Table 1: Cars - Number of by Ever worked Indicator and Number of Rooms for Person

     Ever worked Indicator: No code required (all columns)

     Cars - Number of        Number of Rooms
                             Not applicable    01-02    03-04    05-06       7+
     None                                 -   53,323  421,443  232,335   18,719
     One                                  -   33,839  577,499  759,187  188,235
     Two                                  -    6,104  174,884  499,420  368,657
     Three                                -      772   20,029   83,915   84,619
     Four or more                         -      222    4,622   20,353   29,984
     Communal establishment          50,485        -        -        -        -

     • Cars - Number of by Ever worked Indicator and Number of Rooms • Only “No Code Required” shown for Ever Worked.

  10. A Bigger Example Table: Age, Industry, Occupation • Add table here

  11. Analysis • Custom software was written to parse each table and record, for every unique, the file, the variables and the cell (value) location. • List the Uniques. • There are 2.4 million of them.
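The custom software itself is not shown in the slides. As a minimal sketch of the idea only, the following assumes each parsed table is available as a mapping from a tuple of category values to a count, and lists every cell whose count is exactly 1. The names `Unique` and `find_uniques`, the file name and the counts are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Unique(NamedTuple):
    table_file: str              # file the table came from
    variables: Tuple[str, ...]   # the three variables defining the table
    values: Tuple[str, ...]      # the category of each variable for this cell


def find_uniques(table_file: str,
                 variables: Tuple[str, ...],
                 cells: Dict[Tuple[str, ...], int]) -> List[Unique]:
    """Return one entry per cell whose count is exactly 1."""
    return [Unique(table_file, variables, values)
            for values, count in cells.items()
            if count == 1]


# Invented counts, for illustration only.
cells = {
    ("Four or more", "No code required", "01-02"): 1,
    ("Four or more", "No code required", "03-04"): 12,
    ("Three", "No code required", "01-02"): 7,
}
uniques = find_uniques("table_0001.csv",
                       ("Cars", "Ever worked", "Rooms"),
                       cells)
for u in uniques:
    print(u.table_file, u.variables, u.values)
```

Keeping the file and variable names alongside each unique cell means the list can later be matched against the same cells computed from another source.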

  12. Implementation • Step by Step process. • Keep intermediate steps. • Keep It Simple.

  13. Target • The Scotland Specification is as compatible as possible with the England and Wales specification. • Use recodes to reduce the unique count to a level where the uniques can be dealt with on a record-by-record basis.

  14. Simple Suppression of Uniques • All records with uniques must be perturbed. • Approximately 96% of Uniques will be immediately suppressed, because only 4% of the population is in the sample. • There are also reductions because of differences in the specifications.
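A rough illustration of the 96% figure, assuming simple random sampling and treating each of the 2.4 million listed uniques independently. This is a simplification: the 2.4 million are unique table cells, and one person may account for several of them.

```python
SAMPLE_PERCENT = 4              # sampling fraction quoted on the slide
POPULATION_UNIQUES = 2_400_000  # uniques found across the tri-variate tables

# Under simple random sampling, a given unique falls in the sample
# with probability equal to the sampling fraction.
expected_in_sample = POPULATION_UNIQUES * SAMPLE_PERCENT // 100
expected_absent = POPULATION_UNIQUES - expected_in_sample

print(expected_in_sample)  # 96000
print(expected_absent)     # 2304000
```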

  15. Recodes • Variables were recoded to coarser categories. • Some recodes were also used to aid the E&W disclosure work, including Age, Hours of Work, Industry and others. • At time of writing, Occupation is the only additional recode for Scotland.

  16. Running the recodes. • The previous slide represents 6 weeks of iterative work. • Each recode had the uniques analysis run, producing a list of uniques.
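One iteration of this loop can be sketched as follows: collapse a variable's categories within a table, sum the counts, and recount the cells equal to 1. This is an assumption-laden illustration, not the code used for the work; `collapse`, `n_uniques` and `band_age` are hypothetical helpers and the counts are invented.

```python
from collections import Counter


def collapse(cells, position, recode):
    """Sum cell counts after recoding the variable at index `position`."""
    collapsed = Counter()
    for values, count in cells.items():
        new_values = list(values)
        new_values[position] = recode(values[position])
        collapsed[tuple(new_values)] += count
    return collapsed


def n_uniques(cells):
    """Number of cells whose count is exactly 1."""
    return sum(1 for count in cells.values() if count == 1)


def band_age(age, width=5):
    """Recode a single year of age into a band such as '30-34'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Invented (age, industry, hours) counts, for illustration only.
cells = {
    (30, "A", "31-37"): 1,
    (31, "A", "31-37"): 1,
    (33, "A", "31-37"): 2,
    (47, "B", "38-48"): 1,
}

before = n_uniques(cells)
banded = collapse(cells, 0, band_age)
after = n_uniques(banded)
print(before, after)  # 3 1
```

Coarser bands merge small cells into larger ones, so the unique count can only stay the same or fall for cells that are merged; cells that have nothing to merge with, like the last one here, remain unique.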

  17. Moving forwards • We now have a slightly more restrictive specification for Scotland. • Age recoded to between 2 and 5 year bands (for age 16+) (possibly also for EWNI) • Occupation in ?? categories • Industry in 15 categories (applied to EWNI) • Hours of Work banded (applied to EWNI)

  18. So far… • Everything has been done on publicly accessible data. • The above process needs to be rerun on the SAR to find Sample Uniques • This requires access to the disclosive microdata.

  19. Future Work • The 37,820 tables will be recreated for the records in the sample. • The lists of Population Uniques and Sample Uniques will be compared. • Where there is a Population Unique in the Sample, it will be flagged.
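The comparison step amounts to intersecting two lists of unique cells. A minimal sketch, assuming each unique is identified by its table's variables and the cell's category values; the identifiers and entries below are invented for illustration.

```python
# Each unique is keyed by (variables, values); all entries are invented.
population_uniques = {
    (("Age", "Industry", "Occupation"), ("16-17", "A", "X")),
    (("Age", "Industry", "Hours"), ("60-64", "C", "49+")),
}
sample_uniques = {
    (("Age", "Industry", "Occupation"), ("16-17", "A", "X")),
    (("Age", "Industry", "Hours"), ("30-34", "B", "31-37")),
}

# A sample unique is only a concern here if the same cell is also
# unique in the population.
flagged = population_uniques & sample_uniques
for variables, values in sorted(flagged):
    print(variables, values)
```

Sample uniques that are not population uniques drop out of the intersection, since other people in the population share the same combination of values.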

  20. Applying this to the Microdata • All the Population Uniques in the Sample will be perturbed by ONS. • The method of perturbation will be the same as that used for England, Wales and NI records. • This method is likely to involve PRAM (the Post-Randomisation Method). Discussion paper available from the SARs website?
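The slides do not describe how the perturbation is carried out. Assuming the method named on the slide is the post-randomisation method (usually abbreviated PRAM), the general idea is that a record's category is replaced by one drawn from a fixed table of transition probabilities. The sketch below is generic and not ONS's implementation; the `pram` function, the categories and the probabilities are all hypothetical.

```python
import random


def pram(value, transition, rng):
    """Draw a (possibly unchanged) category from the row of `transition` for `value`."""
    categories = list(transition[value])
    weights = [transition[value][c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]


# Hypothetical transition probabilities; each row sums to 1.
transition = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.2, "B": 0.8},
}

rng = random.Random(0)  # seeded only so the illustration is repeatable
original = ["A", "B", "A", "A"]
perturbed = [pram(v, transition, rng) for v in original]
print(perturbed)
```

Because the transition probabilities are known, an analyst can in principle correct aggregate estimates for the perturbation, while an individual perturbed value cannot be relied on to be true.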

  21. The 100% Tables • The 37,820 tables requested cost £2,000 - paid for by the SARs project. • They will be made available to registered SARs/Census users for use in research.

  22. And Finally…. • Slides will be available on the seminars webpage tomorrow. • Any questions?
