
Towards an Enhanced UK Spatial Interaction Data Service



Presentation Transcript


    1. Towards an Enhanced UK Spatial Interaction Data Service. Adam Dennett, Oliver Duke-Williams and John Stillwell, School of Geography, University of Leeds. Presentation for the British Society for Population Studies, University of St Andrews, 11-13 September 2007.

    2. Outline of presentation. Introduction: relevant background on interaction data, CIDER and WICID. Audit of interaction data sources: a brief overview of the variety of interaction data sources available in the UK. What were the recommendations of the audit? How do we propose to take things forward to create an enhanced UK spatial interaction data service? The new INTERACTION system: overview of the issues and challenges involved. The new data: an overview of the individual characteristics of each of the proposed new datasets.

    3. Introduction – CIDER. CIDER: the Centre for Interaction Data Estimation and Research. Now based principally at the University of Leeds, though the software runs at Manchester. Data Support Unit: part of the ESRC-funded UK Census Programme.

    4. Explain: interaction data are to do with FLOWS of migrants and commuters.

    5. Introduction - CIDER

    6. Introduction – CIDER Data Sets and Geographies. 2001 Census: Special Workplace Statistics (SWS) (Levels 1, 2 & 3). 2001 Census: Special Travel Statistics (STS) (Scotland Levels 1, 2 & 3 and Level 2 Scottish postal sectors). 2001 Census: Special Migration Statistics (SMS) (Levels 1, 2 & 3). Also comparable datasets from 1991 and 1981. As well as the standard District, Ward and OA geographies, different aggregations of these basic units and various bespoke geographies are available for different data years. Other geographies include 100 Zones (FHSAs), 1991 counties and countries, LLSOAs, MLSOAs and foreign origins.

    7. Introduction - WICID. Users can select information either by geography first or by data first. Output is available in a variety of formats, including CSV and HTML.

    8. Introduction – CIDER’s Ongoing Objectives. CIDER’s objectives of relevance to this presentation: to gather/estimate further UK census-based data sets and include them in the system; to expand the WICID system to incorporate a range of UK interaction data sets from outside the census; to undertake research based on the current and future interaction data sets held within the software system.

    9. Interaction Datasets in the UK: An Audit. Purpose of the audit: before adding new datasets to WICID, we needed to know what was out there! To identify and evaluate sources of interaction data in the UK that might complement the current census datasets held in WICID. To make recommendations relating to the inclusion of the most useful datasets in a new, expanded version of WICID called INTERACTION. Whilst detailed and comprehensive, UK census datasets have the obvious limitation of being decennial; other datasets, whilst perhaps lacking the data coverage of the census, are collected more regularly. Datasets collated on a more frequent basis provide an opportunity for more complete temporal coverage. Whilst other datasets exist for studying internal migration, researchers' access to them is limited. WICID allows for a flexible, query-building approach which gives easy access to the information people want through selection of origins, destinations and a range of disaggregating variables such as age and sex. Inclusion of additional datasets in WICID will add real value to the service.

    10. Interaction Datasets in the UK: An Audit

    11. Census data sources of interaction data

    12. Major administrative sources of interaction data

    13. Important surveys containing interaction data

    14. Recommendations coming out of the Audit… Additional data should be included in the new system from the following four sources. 2001 Census: the large and more complex matrices of migration and commuting flows commissioned from ONS that have national coverage at district and sub-district spatial scales. NHSCR: annual flows, from 1975 to 1998, of NHSCR patient re-registration movements between 100 FHSA-based zones, disaggregated by age and sex; and annual flows, from 1998/99 onwards, of NHS patient movements between HAs, disaggregated by age and sex. Generally, online query and extraction systems are already in place for census data, so CIDER does not wish to replicate these existing census services. Some of the large commissioned tables, however, have such extensive data and spatial coverage that it will be useful to add them to WICID. CIDER already holds NHSCR data for 1975-1998 for a set of 100 zones based on FHSA geography. As a relatively reliable source of year-on-year migration data, these data would be very useful to include in WICID. ONS has expressed its willingness to release post-1998 data for CIDER to use; however, because of the new HA geography, work will be needed to create a continuous time series from 1975. Student migrations to HE institutions are important in terms of both their magnitude and their impact.

    15. Recommendations coming out of the Audit… HESA: annual flows, from 2001 onwards, of student movements between MLSOA of parental domicile and HEI, disaggregated by various characteristics. NHS IC: annual flows, from 2001 onwards, of hospital patients from LLSOA or MLSOA of residence to hospital, disaggregated by various attributes.

    16. Implications for CIDER. CIDER is currently in negotiation with the custodians of these targeted data sets to see if incorporation of the data into an extended version of WICID is possible. All current indications are positive, but because the availability and cost of particular data sets differ, it is likely that some data will be acquired and incorporated before others. Securing additional funding via the Census Development Programme should allow for the purchase of data and the trial of a new, improved INTERACTION data system which incorporates these new data sources.

    17. Towards an Enhanced Spatial Interaction Data Service… Overview of the issues and challenges involved with adding new non-census datasets to the new INTERACTION system. The new data: A more detailed look at the individual characteristics of each of the new proposed datasets.

    18. WICID – The current system

    19. WICID - Inbuilt flexibility. The system was originally designed to handle a variety of primary (migration) data. Metadata is key, as it describes the primary data held in the database. The system relies on this metadata to recognise the range of primary data stored. The system has very few ‘hardcoded’ assumptions about the data: everything is looked up whenever a data page is produced for the user’s browser. Data need only have a single origin and destination identifier, with a set of fields (generally a set of counts disaggregating the flow).
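
The metadata-driven design described on this slide can be illustrated with a minimal sketch. Everything named here (`FlowRecord`, `FIELD_METADATA`, `describe_flow`, the field codes and the zone codes) is hypothetical and is not taken from WICID itself; the point is only that labels are resolved from metadata when output is produced, rather than being hardcoded.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRecord:
    """One origin-destination pair plus counts that disaggregate the flow."""
    origin: str
    destination: str
    counts: dict = field(default_factory=dict)  # field code -> count


# Hypothetical metadata table: field code -> human-readable description.
FIELD_METADATA = {
    "m_16_24": "Males aged 16-24",
    "f_16_24": "Females aged 16-24",
}


def describe_flow(record, metadata):
    """Return labelled counts, resolving every label from the metadata."""
    return {metadata[code]: count for code, count in record.counts.items()}


flow = FlowRecord("zone_A", "zone_B", {"m_16_24": 12, "f_16_24": 15})
labelled = describe_flow(flow, FIELD_METADATA)
```

Under this assumption, adding a new dataset means adding rows and metadata entries, not changing the display code.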

    20. WICID – The metadata

    21. WICID – sample of table in SQL database

    22. WICID – finalising the metadata

    23. WICID – The finished product.

    24. From WICID to INTERACTION. The flexible nature of the current WICID system should allow for the addition of non-census datasets, as long as the data are prepared in the required pair-wise origin, destination, variable format. Main challenges: re-designing the interface to handle time-series data (current data are discrete and cross-sectional); some of the datasets (HES, for example) present issues related to geographies (currently, the HES destination is a specific point rather than an area); metadata redesign to clearly identify different datasets and characteristics for users; incorporation of ‘on-the-fly’ disclosure control routines for datasets like HESA.
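
As an illustration of the pair-wise origin, destination, variable format, the sketch below flattens a small origin-destination matrix into one row per pair. The function name `matrix_to_pairs`, the variable name `all_migrants` and the zone codes are assumptions made for this example, not part of WICID.

```python
def matrix_to_pairs(zones, matrix, variable):
    """Flatten a square origin-destination matrix into pair-wise rows."""
    rows = []
    for i, origin in enumerate(zones):
        for j, destination in enumerate(zones):
            rows.append((origin, destination, variable, matrix[i][j]))
    return rows


zones = ["A", "B"]
# Rows are origins, columns are destinations.
migrants = [[0, 7],
            [3, 0]]
pairs = matrix_to_pairs(zones, migrants, "all_migrants")
```

Each resulting row carries its own origin and destination identifiers, so rows from different sources can sit in the same table.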

    25. INTERACTION – Example issues

    26. INTERACTION – Example issues. Output complexities will need to be solved, with extra dimensions to the data output, e.g. Current: origin/destination by age by sex. Could be: origin/destination by age by sex by year.
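
The extra output dimension can be sketched as a tabulation whose grouping keys are chosen at query time. The record layout and the `tabulate` function below are hypothetical, and the counts are invented purely for illustration.

```python
from collections import defaultdict


def tabulate(records, dims):
    """Sum counts over every combination of the requested dimensions."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[dim] for dim in dims)
        totals[key] += record["count"]
    return dict(totals)


# Invented example records, one per origin/destination/age/sex/year cell.
records = [
    {"origin": "A", "destination": "B", "age": "16-24", "sex": "F", "year": 2001, "count": 5},
    {"origin": "A", "destination": "B", "age": "16-24", "sex": "F", "year": 2002, "count": 8},
    {"origin": "A", "destination": "B", "age": "16-24", "sex": "M", "year": 2001, "count": 4},
]

# Current style of output: origin/destination by age by sex.
current = tabulate(records, ("origin", "destination", "age", "sex"))

# Possible time-series output: the same table with year as a further dimension.
with_year = tabulate(records, ("origin", "destination", "age", "sex", "year"))
```

The same records serve both outputs; only the list of dimensions changes.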

    27. INTERACTION – Example issues. Currently, census data supplied to us have already been subjected to statistical disclosure control methods, such that small counts are suppressed before the data are put onto the system; this can affect the accuracy of query results. Because some new datasets will be supplied in primary unit form, we have the opportunity to apply statistical disclosure control only where it is necessary, thus increasing data accuracy for the end user. Different techniques will need to be trialled and evaluated before data are made widely available.
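
A small sketch shows why applying disclosure control after aggregation can preserve more information than applying it to each small count beforehand. The threshold of 3 and the rule of replacing small counts with 0 are assumptions chosen for illustration; they are not the method used by ONS or CIDER, and the counts are invented.

```python
from collections import defaultdict

THRESHOLD = 3  # assumed for illustration only


def suppress(count, threshold=THRESHOLD):
    """Replace a small non-zero count with 0; leave other counts unchanged."""
    return 0 if 0 < count < threshold else count


def aggregate(rows):
    """Sum counts for each (origin, destination) pair."""
    totals = defaultdict(int)
    for origin, destination, count in rows:
        totals[(origin, destination)] += count
    return dict(totals)


# Invented small flows that fall within the same pairs of larger areas.
rows = [("D1", "D2", 2), ("D1", "D2", 2), ("D1", "D3", 1)]

# Suppress first, then aggregate (resembles data adjusted before supply).
pre = aggregate([(o, d, suppress(c)) for o, d, c in rows])

# Aggregate first, then suppress only the cells that are still small.
post = {pair: suppress(total) for pair, total in aggregate(rows).items()}
```

Here the D1 to D2 flow is lost entirely when suppression comes first, but survives as 4 when suppression is applied to the query result.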

    28. The New Data. Three new non-census data sets would be included in INTERACTION: National Health Service Central Register (NHSCR) data from 1975 to present; Hospital Episode Statistics (HES) data from 2001 to present; Higher Education Statistics Agency (HESA) student data from 2001 to present.

    29. NHSCR Data. NHSCR data will be available as a time series for a consistent set of 100 Zones based on the FHSA geography from 1975 to 1998. Post-1998 data will be available for Health Authority areas in England and Wales and equivalent areas in Scotland and Northern Ireland. Variables will be restricted to broad age and sex categories.

    31. HES data. We would be aiming to include HES data from 2001 until the present. Data contain information on all in-patient episodes relating to hospitals in England. Origins are as detailed as Ward or SOA. Destinations are available down to Postcode Unit level. The ‘journey to hospital’ data can be disaggregated by a huge variety of variables, including:

    32. HES data. Age (at end and start of hospital episode); sex; ethnicity; duration of episode; type of episode (related to treatment given); diagnosis category (International Classification of Diseases and related health problems [ICD-10] classification), which contains information on every known illness/disease/injury; separate classifications for maternity and mental health episodes; type of operation (if applicable).

    34. HES data – research opportunities. Hospital Episode Statistics provide a unique opportunity to study hospital catchment areas in relation to specific treatments and enable measurements of ‘market penetration’, something becoming more relevant under the new NHS Patient Choice directive, which allows patients more choice over where they are treated. Spatial interaction modelling will enable analyses of the frictional effect of distance on the ‘commute’ to hospital, and the testing of ‘what if’ scenarios in relation to the opening and closing of hospitals. Optimum locations for new hospitals or treatment centres in relation to demand could be explored through location-allocation modelling.
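
The kind of spatial interaction modelling mentioned here can be sketched with a standard origin-constrained model using an exponential distance-decay term. This is a generic textbook form, not a model specified in the presentation; the function name `predict_flows`, the parameter `beta`, and all areas, hospitals, distances and patient numbers are invented for illustration.

```python
import math


def predict_flows(patients, attractiveness, distance, beta):
    """Origin-constrained model: share each origin's patients among hospitals."""
    flows = {}
    for origin, total in patients.items():
        # Weight each hospital by attractiveness, discounted by distance.
        weights = {
            hospital: attractiveness[hospital]
            * math.exp(-beta * distance[origin][hospital])
            for hospital in attractiveness
        }
        # Scale so that flows from this origin sum to its patient total.
        scale = total / sum(weights.values())
        for hospital, weight in weights.items():
            flows[(origin, hospital)] = scale * weight
    return flows


patients = {"area_1": 100.0}
attractiveness = {"hospital_X": 1.0, "hospital_Y": 1.0}
distance = {"area_1": {"hospital_X": 5.0, "hospital_Y": 10.0}}

base = predict_flows(patients, attractiveness, distance, beta=0.1)

# 'What if' scenario: hospital_Y closes, so it is dropped from the choice set.
scenario = predict_flows(patients, {"hospital_X": 1.0}, distance, beta=0.1)
```

A larger `beta` represents a stronger frictional effect of distance; in practice it would be calibrated against observed flows rather than set by hand as it is here.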

    35. HESA data. We would be aiming to include HESA data from 2001 until the present. Data contain information on the student's home address and the destination higher or further education institution. Origins could be as detailed as MLSOA, with destinations only as accurate as the location of the HE institution attended; there is no way to ascertain exactly where the student is living. Student migrations can be disaggregated by:

    36. HESA data. Age group (5 years); disability (disabled/not known to be disabled/not known); ethnicity (white/non-white/unknown); domicile (middle layer Super Output Area); postcode of HEI headquarters; level of study (postgraduate, first degree, other undergraduate); subject area; term-time accommodation; major source of tuition fees; mode of study (full-time/part-time); gender.

    37. HESA data – research opportunities. Students are the section of the population most actively involved in internal migration in Britain. Increasing numbers of students are entering higher education, and large student populations are becoming features of many of Britain’s major urban centres. Students have significant social, cultural, economic and environmental impacts on the areas in which they live, with issues such as ‘studentification’ becoming active topics of political debate. Time-series and cross-sectional analysis of student migration data in Britain should allow for greater understanding and prediction of student in-migration impacts.

    38. Conclusions: An extensive audit of interaction data in the UK led to CIDER identifying a number of key sources that could be incorporated into an updated version of the WICID system. New data sources would complement existing census-based interaction datasets and would move CIDER towards providing a more complete interaction data service. A number of technical challenges will need to be overcome as we move from WICID to INTERACTION. Easy access to new interaction data sources will provide unique opportunities for substantive research to be carried out in relation to internal migration in the UK.

    39. Thank you. Adam Dennett, Centre for Interaction Data Estimation and Research, School of Geography, University of Leeds. a.r.dennett@leeds.ac.uk http://www.geog.leeds.ac.uk/people/a.dennett/ For the full audit: http://www.geog.leeds.ac.uk/wpapers/index.html
