
Access routes to 2001 UK Census Microdata: Issues and Solutions

Jo Wathan

SARs support Unit, CCSR

University of Manchester, UK

Jo.wathan@manchester.ac.uk


UK Census context

  • Traditional 10 yearly census at present

  • Medium length form (c. 30 person questions, c. 10 household questions)

    • Ethnicity + optional religion question

    • No income question

  • Legal framework in GB is Census Act 1920

    • No statistics Act

    • Legislation only deals with confidentiality restrictions – up to 2 years' imprisonment!


1991 SARs

  • Samples of Anonymised Records (SARs) from 1991 were first to be released

  • Highly successful. c. 400 research papers used the data between 1993 & 2002. Also used in teaching.

  • SARs are a commissioned output, paid for by UK Economic and Social Research Council.

  • SARs support unit at CCSR represent client, disseminate and support the data.


Disclosure Control 1991

Released after work had been undertaken to demonstrate a low risk of disclosure:

  • Users had to register to use them

  • Some ‘broadbanding’ (grouping) of rare categories

  • Very large households (12+ residents) had individual detail suppressed

  • 2 non-overlapping files for different interest groups:

    • One for geographers

    • One for sociologists/demographers
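The two record-level controls named above, broadbanding of rare categories and suppression of detail in very large households, can be sketched roughly as follows. This is only an illustration: the field names (`household_id`, `age`, `occupation`), the `min_count` cut-off and the "Other" label are assumptions made for the example, not the rules actually applied to the 1991 SARs.

```python
from collections import Counter


def broadband_rare(values, min_count, other_label="Other"):
    """Group categories seen fewer than min_count times into one label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


def suppress_large_households(records, threshold=12,
                              detail_keys=("age", "occupation")):
    """Blank individual detail in households with `threshold` or more residents."""
    sizes = Counter(r["household_id"] for r in records)
    out = []
    for r in records:
        r = dict(r)  # copy so the input records are left untouched
        if sizes[r["household_id"]] >= threshold:
            for key in detail_keys:
                r[key] = None
        out.append(r)
    return out


# Hypothetical data: household 1 has 12 residents, household 2 has 2.
records = [{"household_id": 1, "age": 20 + i, "occupation": "x"} for i in range(12)]
records += [{"household_id": 2, "age": 40, "occupation": "y"},
            {"household_id": 2, "age": 42, "occupation": "z"}]
safe = suppress_large_households(records)
grouped = broadband_rare(["a", "a", "a", "b", "c"], min_count=2)
```

Here `grouped` keeps the common category and folds the two rare ones into a single label, while `safe` retains household 2 in full and blanks the individual fields for household 1.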


What did the 91 SARs look like?

Household SAR

  • Household hierarchy

  • 1% sample (c. 0.6M cases)

  • Regional geography

  • Individual year of age

  • 10 ethnicity categories

  • 358 categories of occupation

Individual SAR

  • Individual-level file

  • 2% sample (c. 1.2M cases)

  • Geography population threshold 120k = 278 SAR areas

  • Individual year of age

  • 10 ethnicity categories

  • 73 categories of occupation
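To make the idea of a geography population threshold concrete, the sketch below merges areas until every output area holds at least 120k people. The greedy merge of consecutive areas, the assumption that the input list is ordered so that neighbours sit next to each other, and the area names and populations are all hypothetical; the slides do not describe how the 278 SAR areas were actually constructed.

```python
def group_areas(areas, threshold=120_000):
    """Greedily merge consecutive (name, population) pairs until each
    group reaches the population threshold (illustrative only)."""
    groups, current, total = [], [], 0
    for name, pop in areas:
        current.append(name)
        total += pop
        if total >= threshold:
            groups.append((current, total))
            current, total = [], 0
    if current:
        # A trailing remainder below the threshold joins the previous group.
        if groups:
            names, pop = groups.pop()
            groups.append((names + current, pop + total))
        else:
            groups.append((current, total))
    return groups


# Hypothetical areas; only "A" meets the threshold on its own.
areas = [("A", 150_000), ("B", 70_000), ("C", 60_000), ("D", 40_000)]
groups = group_areas(areas)
```

With these figures, "A" stands alone and "B", "C" and "D" are combined into one area of 170,000.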


Request for 2001 SARs

  • New work on disclosure control showed that we had previously overestimated the risk of disclosure

    • Requested larger sample size

    • Slightly more geography

    • A 3rd SAR for small areas

  • However, a new, stricter interpretation of the degree of disclosure risk was required

  • Initial level of detail available would not provide files of sufficient use for research


Why?

  • Census Office concerns:

    • Perceived increased levels of concern amongst respondents

    • Increased data processing power

    • Increased levels of storage of personal information that might be used to match to the data

  • Major strategic review of data stewardship issues at the time that Census outputs were due for release


Principles

  • Ongoing need for user consultation

  • Recognise different users require different levels of detail (and may be able to accept different conditions) – trading detail/access against each other

  • Trading different types of detail against each other: geog against socio/demographic etc.

  • Flexible approach to combining a range of access and disclosure approaches:

    • Safe Data

    • Safe Users

    • Safe Setting

  • International role models were very helpful


Where we are now

  • Have succeeded in obtaining access to:

    • End User License (Safe Data): 2 datasets, accessible in the same way as in 1991; less detail on some variables, but with enough detail for research purposes

    • Special License (Safe Users): 1 dataset, available for distribution but with extra access conditions

    • Controlled Access Microdata (Safe Setting): much more detailed versions of 2 datasets, available in a safe setting


Safe Data: End User License Files

  • Standard online application procedure for those with an electronic signature (otherwise an equivalent paper system)

  • Not public data!

  • Available for very low risk files

  • Risk reduced by

    • Broadbanding (e.g. age, geography)

    • Perturbing data
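A rough way to see why broadbanding lowers risk is to count how many records are unique on a set of key variables before and after banding. The sketch below does this for age. The 10-year band width, the field names and the sample records are assumptions for illustration, and counting sample uniques is just one simple indicator, not the assessment method used by the Census Offices. Perturbation is not shown, since the slides do not say which method was applied.

```python
from collections import Counter


def band_age(age, width=10):
    """Return a label such as '30-39' for the band containing `age`."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def count_sample_uniques(records, keys):
    """Count records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    return sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)


# Hypothetical records, each unique on single year of age plus region.
records = [
    {"age": 31, "region": "North West"},
    {"age": 34, "region": "North West"},
    {"age": 38, "region": "North West"},
    {"age": 52, "region": "London"},
]
before = count_sample_uniques(records, ["age", "region"])
banded = [{**r, "age": band_age(r["age"])} for r in records]
after = count_sample_uniques(banded, ["age", "region"])
```

In this toy sample all four records are unique on single year of age and region, but only the London record remains unique once age is banded.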


EUL Files

Individual SAR

  • Individual-level file

  • 3% sample (c. 1.8M cases)

  • Regional geography (13 categories)

  • Ages 16-74 banded

  • 16 categories of ethnicity

  • 81 categories of occupation

Small Area Microdata

  • Individual-level file

  • 5% sample (c. 3M cases)

  • Local authority geography (< 90k)

  • 13 age bands (c. 10 years)

  • 13 categories of ethnicity

  • Only broad social class variable (economic activity 3 groups)


Safe Users: The 2001 S-L Household SAR

  • Additional complexity of a household SAR required a special license

  • No geography at all, and not available for Northern Ireland or Scotland

  • Age in 2-year bands of

  • 16 categories of ethnicity

  • 81 categories of occupation


Safe setting

  • To compensate for loss of detail in the end user and special license files

  • Same records as Individual and Household SARs but with MUCH more detail

  • Managed by the Census offices

  • Access currently at only a handful of census office sites

  • Virtual microdata laboratory environment, outputs manually checked prior to release to user

  • Access only permitted if this is the only available data source, for work in keeping with the aims of the Census Office


Controlled Access Microdata

Individual CAM

  • Individual-level file

  • 3% sample (c. 1.8M cases)

  • Local authority geography, with context at lower level

  • Individual year of age to 90+

  • 16 ethnicity categories

  • Over 200 categories of occupation

Household CAM

  • Household hierarchy

  • 1% sample (c. 0.6M cases)

  • Local authority geography, with context at lower level

  • Individual year of age to 90+

  • 16 ethnicity categories

  • Over 200 categories of occupation


Conclusion

  • Have obtained a range of research-worthy datasets by treating different user groups differently

  • Traded off:

    • Safe data

    • Safe users

    • Safe setting

  • http://www.ccsr.ac.uk/sars

