Using Research to Inform Geographic Policy

This research examines best-fitting, the methodology that the Geography Policy for National Statistics sets out for calculating statistics for geographic areas, with a focus on small areas. The results will feed into a review of the Geographic Policy set by the Office for National Statistics (ONS). The research explores best-fitting from Output Areas (OAs) to higher geographies, such as Local Authority Districts (LADs) and Middle Layer Super Output Areas (MSOAs), with the aim of producing consistent and reliable statistics.

wrasmussen

Presentation Transcript


  1. Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

  2. Introduction • The Geography Policy for National Statistics sets out the methodology for calculating areal statistics – “best-fitting”. • Methodology was adopted for 2011 Census, but for small areas, this approach is not always as accurate as previous estimates. • The results of ONS Geography research on these cases will feed into a review of GSS geographic policy.

  3. Context • ONS has a legal duty under the Statistics and Registration Service Act (2007) not to disclose “information which relates to and identifies a particular person”. • Output Areas were designed for the 2001 Census with this requirement in mind.

  4. Census Output Areas (OAs) • Designed to minimise within-area and maximise between-area socio-economic variations. • Individual OAs, and aggregations of complete OAs, are non-disclosive. • For 2011 Census, revision of OAs was kept to a minimum. Only 2.6% of 2001 OAs were changed. • Over 181,000 cover England and Wales.

  5. Census Output Areas (OAs) • South Brent, Devon (pop. 3,000)

  6. Creation of Output Areas • Formed from groups of Thiessen polygons generated from households and snapped to residential postcode unit centroids. • Target of 125 households per OA.

  7. Census Households and Unit Postcode Centroids

  8. Household/ Unit Postcode Thiessen polygon building blocks

  9. Census Output Areas from building blocks

  10. Exact estimates • Where statistics for an area are derived directly from the Census households located within them. • Exact estimates were produced in 2001 for all geographies. • Exact estimates can produce slivers when incompatible boundaries (here, parishes) are overlaid. • Small cell adjustment was applied to the dissatisfaction of users. 2001 OAs

  11. Exact estimates • Even before the 2001 Census was published, demand was expressed for release on future boundaries. • We needed a methodology that could accommodate this demand. 2011 OAs

  12. Statistical building blocks • Statistical outputs should be constructed from the smallest geographical areas for which data are available. • For Census univariate data, these building-blocks are OA; for Census multivariate data, LSOA. • Allows consistent and comparable statistics, even if the geography changes.

  13. Best-fitting to higher geographies • For ‘higher’ geographies, Census statistics are ‘best-fitted’ from the OA or LSOA population-weighted centroids (PWCs) that a target area contains. • For geographies made up of aggregations of complete OAs or LSOAs (e.g. MSOA / LAD) this equates to exact fit. • The National Statistics Postcode Lookup (NSPL) provides a pre-built best-fitting tool.
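
The rule on slide 13 can be sketched in a few lines of Python. This is a minimal illustration, not ONS code: the OA records, the coordinates, the square target areas and the function names `point_in_polygon` and `best_fit` are all hypothetical, and a simple ray-casting test stands in for whatever GIS tooling is used in practice.

```python
from collections import defaultdict

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def best_fit(oa_records, target_areas):
    """Add each OA's whole count to the one target area that contains its PWC."""
    totals = defaultdict(int)
    for oa in oa_records:
        px, py = oa["pwc"]
        for name, polygon in target_areas.items():
            if point_in_polygon(px, py, polygon):
                totals[name] += oa["population"]
                break  # an OA is allocated to at most one target area
    return dict(totals)

# Hypothetical data: three OAs and two square target areas.
oa_records = [
    {"code": "OA1", "pwc": (1.0, 1.0), "population": 300},
    {"code": "OA2", "pwc": (3.0, 1.5), "population": 250},
    {"code": "OA3", "pwc": (6.0, 2.0), "population": 310},
]
target_areas = {
    "Area A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Area B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}

fitted = best_fit(oa_records, target_areas)
print(fitted)  # {'Area A': 550, 'Area B': 310}
```

Because whole OAs are allocated and never split, every published figure is an aggregation of complete OAs. A target area that contains no PWC simply receives no entry in the result.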

  14. Output Area PWCs from Census Households (median position)
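
One way to read "median position" is as a coordinate-wise median of the household locations. The sketch below assumes that reading; the slide does not give the exact procedure, and the household coordinates and the function name `median_centroid` are hypothetical.

```python
from statistics import median

def median_centroid(household_points):
    """Coordinate-wise median of household locations (an assumed reading of 'median position')."""
    xs = [x for x, _ in household_points]
    ys = [y for _, y in household_points]
    return (median(xs), median(ys))

# Hypothetical OA: four clustered households and one remote farmhouse.
households = [(0.0, 0.0), (1.0, 0.5), (1.2, 0.4), (1.1, 0.6), (9.0, 9.0)]

pwc = median_centroid(households)
print(pwc)  # (1.1, 0.5)
```

A median keeps the centroid inside the main cluster of households. A mean would be pulled toward the remote outlier.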

  15. OAs and OAPWCs

  16. OA and higher geography e.g. LSOA

  17. Exception - National Parks • The Geography Policy for National Statistics allows for exceptions to be requested from the Statistical Policy Committee. • The irregular boundaries and very small populations of National Parks make them unsuitable for best-fitting, as very few OAPWCs fall within them. • Statistics for National Parks are therefore exact-fitted as an exception to the policy.

  18. Exception - National Parks • Usually, whatever small populations are located within them are associated with OAPWCs that fall outside the park. • Statistics for National Parks are therefore always under-counted if best-fitted.

  19. Best-fitting to higher geographies • Otherwise, best-fitting using the OAPWC contained within the target geography works very well. • Statistics are non-disclosive, on a consistent, reliable and relatively simple basis. • Research by Ralphs (2011) determined that best fitting is a dependable methodology and it was adopted as fundamental to the policy. • Suitable also for non-nesting geographies. Usually…

  20. OA, OAPWC and non-nesting geography

  21. Non-nesting Geography unit without OAPWC

  22. Statistics for areas without OAPWCs? • Because we cannot release sub-OA statistics, no data may be published for these areas. • But there is demand.

  23. Statistics for areas without OAPWCs? One option: Best-fit, to the target area, the statistics for the OA pop-weighted centroid (OAPWC) which is nearest to the target area’s geometric centroid and which is within the same LAD.

  24. Statistics for areas without OAPWCs? One option: Best-fit, to the target area, the statistics for the OA pop-weighted centroid (OAPWC) which is nearest to the target area’s geometric centroid and which is within the same LAD.
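
The option on slides 23 and 24 can be sketched as follows. This is an illustration under stated assumptions, not the ONS procedure: the target polygon, the LAD codes, the OAPWC records and the function names `polygon_centroid` and `nearest_oapwc` are hypothetical, and plain Euclidean distance on projected coordinates is assumed.

```python
import math

def polygon_centroid(polygon):
    """Geometric centroid of a simple polygon via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (3 * area2), cy / (3 * area2))

def nearest_oapwc(target_polygon, target_lad, oapwcs):
    """Pick the OAPWC closest to the target's centroid, restricted to the same LAD."""
    centroid = polygon_centroid(target_polygon)
    candidates = [oa for oa in oapwcs if oa["lad"] == target_lad]
    if not candidates:
        return None
    return min(candidates, key=lambda oa: math.dist(centroid, oa["pwc"]))

# Hypothetical small target area that contains no OAPWC of its own.
target_polygon = [(0, 0), (2, 0), (2, 2), (0, 2)]
oapwcs = [
    {"code": "OA1", "lad": "LAD1", "pwc": (4.0, 1.0)},
    {"code": "OA2", "lad": "LAD2", "pwc": (1.0, 2.5)},  # nearer, but in another LAD
    {"code": "OA3", "lad": "LAD1", "pwc": (1.0, 6.0)},
]

chosen = nearest_oapwc(target_polygon, "LAD1", oapwcs)
print(chosen["code"])  # OA1
```

The LAD filter is applied before the distance comparison, so a closer OAPWC across the LAD boundary (OA2 here) is never selected.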

  25. Best-fitting using adjacent OAPWC • This does produce non-disclosive statistics for the target area. • But: • Statistics are indicative rather than precise; • Statistics for the nearby OAPWC serve for two or more areas; • Statistics cannot be aggregated up to compile statistics for higher areas.

  26. The ‘areas with a small population’ problem • There are a number of administrative and statistical geographies that can in principle be smaller than an Output Area: • Parishes; • Wards; • Built-up areas (BUA). • The problem is also clustered around particular areas such as the Isles of Scilly and the City of London. • Civil parishes are a particularly good geography for the research due to the large number that are smaller than an Output Area.

  27. There is a real demand for parish data • There is a very real demand for parish data, demonstrated by the fact that parishes are one of the most popular geographies on the Neighbourhood Statistics site. • Many parishes in England are very recent creations driven by the government’s localism agenda. • For the 2011 Census, some of the previous parish-level outputs were removed, creating a greater requirement for accuracy in the parish outputs that were produced. • Yet we cannot currently provide statistics for parishes that do not contain an OAPWC.

  28. Parish area varies enormously • Chester Castle (0.04 km²) • Stanhope, County Durham (255 km²) • Over 6,000 times larger.

  29. Parish population varies enormously • Some old parishes are completely uninhabited; more recent creations are home to up to 75,000 people. • By contrast, OA population falls within strict thresholds (100 – 625).

  30. Coverage The majority of the population of England live in “Unparished Areas”. All of Wales is covered by “communities”, the local equivalent to parishes.

  31. Outline methodology • Fictitious data as Census 2011 data proxy linked to postcodes as household proxy. • One copy aggregated to OA pop-weighted centroids as “model” (best fit) data. • A second copy aggregated to parishes as “true” (exact fit) data. • Stats for “model” and “true” data compared.
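
The comparison outlined on slide 31 can be sketched as below. Everything here is hypothetical: the postcode records, the fictitious counts, the `supplying_oa` lookup (which OA's population-weighted centroid supplies each parish) and the variable names are illustrative stand-ins, not the project's actual data or code.

```python
from collections import Counter

# Fictitious postcode-level data: (postcode, OA code, parish, count).
postcodes = [
    ("PC1", "OA1", "Parish X", 40),
    ("PC2", "OA1", "Parish X", 35),
    ("PC3", "OA1", "Parish Y", 10),
    ("PC4", "OA2", "Parish Z", 50),
    ("PC5", "OA2", "Parish Z", 45),
]

# Hypothetical lookup: the OA whose centroid supplies each parish.
# Parish Y contains no centroid of its own and borrows OA1's statistics.
supplying_oa = {"Parish X": "OA1", "Parish Y": "OA1", "Parish Z": "OA2"}

model_by_oa = Counter()     # "model" (best fit) copy, aggregated to OA
true_by_parish = Counter()  # "true" (exact fit) copy, aggregated to parish
for _, oa, parish, count in postcodes:
    model_by_oa[oa] += count
    true_by_parish[parish] += count

# Compare the "model" value each parish would receive with its "true" value.
comparison = {}
for parish, true_value in true_by_parish.items():
    model_value = model_by_oa[supplying_oa[parish]]
    comparison[parish] = {
        "true": true_value,
        "model": model_value,
        "error": model_value - true_value,
    }

for parish, row in comparison.items():
    print(parish, row)
```

In this toy data the parish that borrows an external centroid (Parish Y) receives 85 against a true value of 10, while a parish that coincides with a whole OA (Parish Z) is reproduced exactly.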

  32. The datapoint ratio and model power • The “datapoint ratio” is the ratio of the number of postcodes in the OA that would be supplying “model” data to the number of postcodes in the parish that would be receiving those data.

  33. The datapoint ratio and model power • Where the “datapoint ratio” was close to 1:1, correlations between the “model” statistics for the external OAPWC and the equivalent “true” statistics for the receiving parish were strong. • Elsewhere, correlations were generally weak and raw error and percentage errors were high. • For parishes without an OAPWC, the datapoint ratio was always less than 1:1 and the mean datapoint ratio was 0.3 : 1.

  34. Conclusions - parishes • At least eleven possible types of spatial relationship between parishes and OAs. Most drive the datapoint ratio away from 1:1. • No structural link between parishes and the output area hierarchy of geographies. • Estimating statistics for a parish on the basis of adjacent OAPWCs unlikely to succeed. • Statistics for the 1,142 parishes without OAPWCs are on average likely to be over-estimated.

  35. Conclusions – more general • Using an external OAPWC to estimate statistics for an area that does not contain an OAPWC is only likely to produce statistics close to the ‘truth’ where the following conditions apply: • The target geography is similar in number of postcodes/ households to the supplying geography and • The target geography is structurally similar to the OA/LSOA/MSOA geography. • The pilot project was successful in classifying the types of geography and scenarios that make best-fitting via adjacent OAPWCs problematic. • The procedure appears to be especially weak for count/sum variables.

  36. What’s next? • We are about to embark upon the project proper, with dedicated high-performance hardware and household-level Census data that will provide a finer degree of granularity and realism to the results. • This will allow us to develop measures to mitigate the problems identified, and this will assist in our aim to publish robust and non-disclosive estimates for the small geographies that users are most at home with. • The research will feed into a review of the Geography Policy for National Statistics planned for 2014-15

  37. Links • Geography Policy for National Statistics: http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/geography-policy-for-national-statistics.pdf • Best-fit Policy: http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/index.html • Coady: An overview of best-fitting: Building 2011 Census estimates from Output Areas: http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/an-overview-of-best-fitting.pdf • Ralphs: Exploring the performance of best fitting to provide ONS data for non standard geographical areas: http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/exploring-the-performance-of-best-fitting-to-produce-ons-data-for-non--standard-geographical-areas.pdf • Contact ONS Geography at: ons.geography@ons.gsi.gov.uk • Access Open Geography products at: https://geoportal.statistics.gov.uk/geoportal • Access as Linked Data at: http://statistics.data.gov.uk • Follow ONS Geography at @ONSgeography
