
Paycheck Income Data

Paycheck Income Data. John Rae, Partner - Data and Product Development; Simon Power, Principal Consultant. HNDA Training for Practitioners, 6th May 2014. Agenda: What is Paycheck. Introduction. Sources. Using the geographic hierarchy. Multi-method models. Method. Latest data ideas.



  1. Paycheck Income Data • John Rae, Partner - Data and Product Development • Simon Power, Principal Consultant • HNDA Training for Practitioners, 6th May 2014

  2. Agenda • What is Paycheck • Introduction • Sources • Using the geographic hierarchy • Multi-method models • Method • Latest data ideas • Keeping up to date • Innovations • Licence use • Opportunities • Limitations

  3. What is Paycheck?

  4. Paycheck provides estimates of household income • UK wide estimates • Mean • Median • Mode • Distribution by income band • Full geographic detail • Modelled down to full postcodes • Potential to aggregate to any geographic area • Option to sub-divide incomes by lifestage
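One way to picture "aggregate to any geographic area" is a household-weighted roll-up of postcode-level estimates. The sketch below is illustrative only: the field names (`households`, `mean_income`, `band_shares`) and the band labels are assumptions, not the Paycheck file layout. It combines means and band shares; medians and modes cannot be averaged this way and would need to be re-derived from the combined band distribution.

```python
def aggregate(postcodes):
    """Roll postcode-level estimates up to one area-level estimate.

    Each postcode is a dict with hypothetical keys:
    households (count), mean_income, band_shares (band label -> share).
    """
    total = sum(p["households"] for p in postcodes)
    # Household-weighted mean of the postcode means.
    mean = sum(p["households"] * p["mean_income"] for p in postcodes) / total
    # Household-weighted average of each income-band share.
    bands = {}
    for p in postcodes:
        weight = p["households"] / total
        for band, share in p["band_shares"].items():
            bands[band] = bands.get(band, 0.0) + share * weight
    return {"households": total, "mean_income": mean, "band_shares": bands}


area = aggregate([
    {"households": 10, "mean_income": 20000.0,
     "band_shares": {"under 20k": 0.5, "20k plus": 0.5}},
    {"households": 30, "mean_income": 40000.0,
     "band_shares": {"under 20k": 0.25, "20k plus": 0.75}},
])
print(area["mean_income"], area["band_shares"])
```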

  5. Data and modelling concepts

  6. Modelling objective • To fit an appropriate statistical distribution to data on household incomes • To predict this distribution for all relevant geographic areas by means of the mean and standard deviation
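If an area's income distribution is summarised by a mean and standard deviation on a normal scale, the share of households between two points follows from the normal cumulative distribution. A minimal sketch, assuming the band edges have already been expressed on the same (transformed) scale as the mean and standard deviation; `band_share` is a hypothetical helper name:

```python
from statistics import NormalDist


def band_share(mean, sd, lower, upper):
    """Share of households between two points of a normal distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


# Share within one standard deviation either side of the mean.
share = band_share(0.0, 1.0, -1.0, 1.0)
print(round(share, 4))
```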

  7. Geographical hierarchy of the modelling

  8. Outline of data inputs • Survey data: structured to be representative; the sample will rarely include data points in a given local area • Large lifestyle database: unrepresentative collection method; the large sample size will include data for most local areas • Surveys are best nationally; lifestyle data is better locally.

  9. Step 1 – Bring the data up to date • Survey data: Living Costs and Food survey; sample of 6,500 across the UK; available survey period is typically two years ago; we bring the incomes up to the current year using Average Earnings change figures published by ONS • Lifestyle data: Data Locator Group; covers 1.2 million individuals in Scotland (UK 15 million); after cleaning we get data for 856,000 households in Scotland (UK 4.3 million); we apply weighting to match the survey data at national level; we inflate to the present using the Average Earnings time series
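Step 1 can be sketched as two small operations: uprating a past income by the ratio of an earnings index, and weighting the lifestyle sample so that it reproduces the survey's national profile. The index values, band labels and the choice to weight by income band are assumptions made for illustration; the slide does not say which variables the weighting uses.

```python
def uprate(income, index_then, index_now):
    """Scale a past income to current-year terms using an earnings index ratio."""
    return income * index_now / index_then


def band_weights(survey_shares, lifestyle_counts):
    """Weight per band so the weighted lifestyle sample matches survey shares.

    survey_shares: band -> national share from the survey (sums to 1).
    lifestyle_counts: band -> raw number of lifestyle records in that band.
    """
    total = sum(lifestyle_counts.values())
    return {
        band: survey_shares[band] * total / count
        for band, count in lifestyle_counts.items()
        if count > 0
    }


# Hypothetical figures: an index of 100.0 in the survey year, 105.0 now.
current_income = uprate(20000.0, 100.0, 105.0)
weights = band_weights({"low": 0.5, "high": 0.5}, {"low": 300, "high": 100})
print(current_income, weights)
```

Here the over-represented "low" band is weighted down and the under-represented "high" band is weighted up, so each contributes half of the weighted total.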

  10. Step 2 - Establish the current UK earning profile • Take the household incomes measured by the survey • Inflate these to current year figures • Represent the distribution as bands of £5,000 • Model incomes above £100,000 as an exponentially decaying distribution • Transform the percentile points of the distribution to fit a standard normal distribution • All subsequent modelling is conducted on normally distributed variables, and a reverse transformation converts the model results back to real income values
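The percentile-to-normal transformation and its reverse can be sketched as a lookup between sorted incomes and standard-normal quantiles. This is a minimal illustration on raw values, not the banded procedure described on the slide: the £5,000 banding and the exponential tail above £100,000 are left out, and the plotting position `(i + 0.5) / n` and linear interpolation are assumptions.

```python
from bisect import bisect_left
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def fit_transform(incomes):
    """Return sorted incomes and the standard-normal score of each percentile point."""
    ordered = sorted(incomes)
    n = len(ordered)
    # (i + 0.5) / n keeps every percentile strictly inside (0, 1).
    scores = [STD_NORMAL.inv_cdf((i + 0.5) / n) for i in range(n)]
    return ordered, scores


def _interpolate(x, xs, ys):
    """Piecewise-linear lookup of x in xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)  # guarantees xs[j - 1] < x <= xs[j]
    x0, x1, y0, y1 = xs[j - 1], xs[j], ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def to_normal(income, ordered, scores):
    """Forward transform: income to standard-normal score."""
    return _interpolate(income, ordered, scores)


def to_income(score, ordered, scores):
    """Reverse transform: standard-normal score back to income."""
    return _interpolate(score, scores, ordered)


ordered, scores = fit_transform([12000, 18000, 25000, 40000, 90000])
z = to_normal(30000, ordered, scores)
print(z, to_income(z, ordered, scores))
```

The round trip returns the starting income (up to floating-point error) for any value inside the observed range, which is the property the modelling relies on when it "undoes the transformation".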

  11. Step 3 - Bayesian modelling approach • Demographic modelling: take a sample of lifestyle data (representative of national socio-demographics); build linear regression models to estimate (transformed) income from the demographics; apply to local areas based on local socio-demographics • 'Direct' calculation: calculate incomes directly from the lifestyle data; create (local) correction factors for the initial model estimates in light of the actual scores • Repeat for the next (smaller) geographic level • Undo the transformation
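One common way to "correct a model estimate in light of the actual scores" is a normal-normal Bayesian update, where the regression estimate acts as the prior and the local lifestyle scores act as the data. The slide does not state the exact formula used, so the sketch below is an assumed stand-in: `model_var` and `score_var` are hypothetical variance parameters, and all values are on the transformed (normal) scale.

```python
def posterior_mean(model_estimate, model_var, local_scores, score_var):
    """Blend a model estimate with the mean of local scores.

    The weight on the local mean grows with the number of local records,
    so sparse areas stay close to the model and dense areas follow the data.
    """
    n = len(local_scores)
    if n == 0:
        return model_estimate  # no local evidence: keep the model estimate
    local_mean = sum(local_scores) / n
    prior_precision = 1.0 / model_var
    data_precision = n / score_var
    return (prior_precision * model_estimate + data_precision * local_mean) / (
        prior_precision + data_precision
    )


# Larger area first; its result becomes the prior for a smaller area inside it.
district = posterior_mean(0.0, 1.0, [1.0, 1.0, 1.0, 1.0], 1.0)
postcode = posterior_mean(district, 1.0, [], 1.0)
print(district, postcode)
```

Feeding each level's result in as the prior for the next smaller level mirrors the "repeat for the next (smaller) geographic level" step; the final value would then be passed through the reverse transformation to obtain an income.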

  12. Points of Discussion • Why LCF as opposed to other surveys? • Why a UK model? • How is it kept up to date?

  13. Data innovations

  14. Places change… and we can impute data about them

  15. What is Acorn and why is it relevant to the income model?

  16. Which is Ethel? Which is Kayleigh?

  17. The purpose of geodemographics • To analyse data and facilitate educated guesses. • Which channels fit which people? • Where might it be more likely to find people with unhealthy lifestyles? • Which people are using which of my services and in what manner?

  18. It looks like this (Acorn Category, Group, Type)
      Category 1: Affluent Achievers
      • Groups: A. Lavish Lifestyles; B. Executive Wealth; C. Mature Money
      • Types: 1 Exclusive enclaves; 2 Metropolitan money; 3 Large house luxury; 4 Asset rich families; 5 Wealthy countryside commuters; 6 Financially comfortable families; 7 Affluent professionals; 8 Prosperous suburban families; 9 Well-off edge of towners; 10 Better-off villagers; 11 Settled suburbia, older people; 12 Retired and empty nesters; 13 Upmarket downsizers
      Category 2: Rising Prosperity
      • Groups: D. City Sophisticates; E. Career Climbers
      • Types: 14 Townhouse cosmopolitans; 15 Younger professionals in smaller flats; 16 Metropolitan professionals; 17 Socialising young renters; 18 Career driven young families; 19 First time buyers in small, modern homes; 20 Mixed metropolitan areas
      Category 3: Comfortable Communities
      • Groups: F. Countryside Communities; G. Successful Suburbs; H. Steady Neighbourhoods; I. Comfortable Seniors; J. Starting Out
      • Types: 21 Farms and cottages; 22 Larger families in rural areas; 23 Owner occupiers in small towns and villages; 24 Comfortably-off families in modern housing; 25 Larger family homes, multi-ethnic areas; 26 Semi-professional families, owner occupied neighbourhoods; 27 Suburban semis, conventional attitudes; 28 Owner occupied terraces, average income; 29 Established suburbs, older families; 30 Older people, neat and tidy neighbourhoods; 31 Elderly singles in purpose-built accommodation; 32 Educated families in terraces, young children; 33 Smaller houses and starter homes
      Category 4: Financially Stretched
      • Groups: K. Student Life; L. Modest Means; M. Striving Families; N. Poorer Pensioners
      • Types: 34 Student flats and halls of residence; 35 Term-time terraces; 36 Educated young people in flats and tenements; 37 Low cost flats in suburban areas; 38 Semi-skilled workers in traditional neighbourhoods; 39 Fading owner occupied terraces; 40 High occupancy terraces, many Asian families; 41 Labouring semi-rural estates; 42 Struggling young families in post-war terraces; 43 Families in right-to-buy estates; 44 Post-war estates, limited means; 45 Pensioners in social housing, semis and terraces; 46 Elderly people in social rented flats; 47 Low income older people in smaller semis; 48 Pensioners and singles in social rented flats
      Category 5: Urban Adversity
      • Groups: O. Young Hardship; P. Struggling Estates; Q. Difficult Circumstances
      • Types: 49 Young families in low cost private flats; 50 Struggling younger people in mixed tenure; 51 Young people in small, low cost terraces; 52 Poorer families, many children, terraced housing; 53 Low income terraces; 54 Multi-ethnic, purpose-built estates; 55 Deprived and ethnically diverse in flats; 56 Low income large families in social rented semis; 57 Social rented flats, families and single parents; 58 Singles and young families, some receiving benefits; 59 Deprived areas and high-rise flats

  19. All this is relevant because… 'The new Acorn has revolutionised geodemographics' • Peter Sleight, Chair, The Association of Census Distributors • 'Tracking a decade of changing Britain', Market Research Society seminar, November 2013

  20. Data is derived by combining multiple sources, e.g. local level housing type and tenure • Registers of Scotland • Land Registry • National Register of Social Housing • FoI requests to LADs • Public register of HMOs • Zoopla property portals • CACI lifestyle databases • Housing for the elderly • CACI High rise dwellings database

  21. Adding to the census: variables no longer on the census • Identify likely locations of high rise buildings • WALK THE STREETS • Create a database of addresses: social high rise (10+ storey); social mid-rise (5-9 storey)

  22. Data is derived by combining multiple sources, e.g. local level family structure, occupation and affluence • CACI names and addresses • Credit application age data • Elderly-only accommodation • Emma's Diary children database • DWP claimant data • CACI lifestyle database • UCL ethnicity imputation • Company directors • Shareholders • Students

  23. Adding to the census • Replace the census - housing for specific categories of people • Improve the census

  24. This approach realises a lot of address level data… 21.5m 10m 22m • households where we have detailed age data • households where we have housing / tenure data • households where we have more detailed socio-demographics 3m 600,000 age-limited addresses people in HMOs

  25. And produces a remarkable outcome… • Prior to the release of the census: we built the segmentation without census inputs; we linked to research surveys to form an insight test-bed; we optimised across over 2,500 topics • Following the release of the census: we added in the census and checked for change; we found including the census made no difference to the structure of the segmentation • The approach appears to achieve the equivalent (for these geodemographic models) of having a census every year

  26. As an illustration… South Ayrshire's first affordable housing in 30 years, the Somerset Road Development, includes: • West of Scotland Housing Association's development of 32 flats as part of a bigger development of 76 units • Dawn Homes' development of 44 homes for outright sale • Segmentation types: • 49 Young families in low cost private flats • 50 Struggling younger people in mixed tenure

  27. Licensing – limitations and opportunities

  28. Limitations and opportunities • End User Licence • Contractual restrictions on the use of the data • Council use • Third parties

  29. Limitations and opportunities Opportunities Other CACI Datasets

  30. Limitations and opportunities Opportunities Other CACI Datasets

  31. IrishACORN WorkforceACORN Retail, Leisure & Financial Catchments Retail, Leisure & Financial Outlets Retail Spend Estimates Online Spend Estimates Postcode Worker Spend Estimates Individual Tourist Spend Estimates Consumers. Locations. Communities. Current Demographics 2011 Census Out of Work Benefits Job Seekers Allowance Hospitals/GPs/Schools/Libraries Rail Passengers Public Transport Access Levels (PTAL) British Crime Survey FRS: GFKNoP’s Financial Research Survey Understanding Society TGI

  32. Summary • Seeking the best between national surveys and very local data • Paycheck • Data techniques experts consider to be revolutionary • Up to date • Use within the HNDA • Options for other uses • Usability

  33. Questions

  34. CACI Contact Details Simon Power T. 07977 522792 E. spower@caci.co.uk
