Deriving Educational Attainment

Frank Linder (co-author Dominique van Roon), Statistics Netherlands



Deriving Educational Attainment by combining data from Administrative Sources and Sample Surveys: Recent developments towards the 2011 Census. Frank Linder (co-author Dominique van Roon), Statistics Netherlands. Conference Statistics Investment to the future 2, Prague, 14-15 September 2009.



Deriving Educational Attainment by combining data from Administrative Sources and Sample Surveys: Recent developments towards the 2011 Census

Frank Linder

(co-author Dominique van Roon)

Statistics Netherlands

Conference Statistics Investment to the future 2

Prague, 14-15 September 2009


Contents

  • Importance of data on education

  • Data sources on education level: traditional and new alternative

  • Innovation in micro-integration: new way of combining administrative sources and sample surveys

  • Social Statistical Database (SSD), Virtual Census

  • Educational attainment, new method explained: micro-integration steps, weighting strategy

  • Accuracy

  • Conclusions

Conference Statistics Investment to the future 2, Prague 14-15 september 2009


Importance of data on education

  • Education key social indicator for government policy and socio-economic research

  • EU Lisbon Strategy 2000: “Education and training policies are central to the creation and transmission of knowledge and are determining factor in each society’s potential for innovation… Positive impact of education on employment, health, social inclusion and active citizenship has already been extensively shown”

  • Educational Attainment standard variable in Census Programme

  • Great demand for data on education level by researchers, background variable for their analyses

  • Educational Attainment central position in Social Statistical Database (SSD)

Data Sources education level, traditionally

  • Labour Force Survey (LFS), exclusive domain

    - complete education career until date of interview

    - educational attainment

  • Reliability for small subpopulations is problematic

  • LFS solution in practice:

    - unified sample over a period of consecutive years => more observations, lower standard errors

    - assumption: stability of the variable over the period

  • LFS still used as source for education

Data Sources education level, new alternative

  • New administrative education registers in the last decade

    - new opportunities for determining educational attainment

    - full coverage of the register's target population => more observations => more reliable estimates (in particular for small populations!)

  • However, the alternative still depends on the LFS!

    - registers give no coverage of, e.g., people educated before the administrations started (mostly older citizens), private education, studies abroad

Innovation in micro-integration

  • combining statistical information from administrative sources and sample surveys for the sake of one variable

    EXAMPLE

    Integration, conventional way (different variables): Jobs register (employee population) + LFS (education level) => coherent information on education level of employees

    Integration, new way (one variable): Education register 1, ..., Education register k + LFS (each supplying education level) => education level of target population

Social Statistical Database (SSD), Virtual Census

  • Integration Framework Social Statistics

    - Micro-linkage and micro-integration of data on demographic and socio-economic issues

    - Data sources: administrative registers (primarily) and sample surveys (if no information in registers)

    - Coherence, consistency, comprehensiveness, completeness, detail; 1 figure - 1 phenomenon

    - Important basis for the production of social statistics

  • Kind of information

    - Demography, labour, social security, income, health care, security, housing, etc.

    - Education level (previously LFS, now new method)

  • Key source for the Virtual Census 2001 and 2011

Educational Attainment, new method, step 1: Construction of the Education Archive

  • Collecting sources of education data

  • Storage in Education Archive

  • Sources

    - Sample surveys: Labour Force Survey’96..

    - Registers, primary education: ENR2010..

    - Registers, secondary education: ENR’02.., ERR’99.., CREHE prelim

    - Registers, higher education: CREHE’83..

    - Other registers: CWI’90.., SFR’95.., RSF’01..

  • Education Archive: cumulative storage by annual addition

Education Archive

Extract from Education Archive

Educational Attainment, new method, step 2: Construction of the Educational Attainment File

  • Construct the Educational Attainment File (EAF), containing the highest attained education level of individuals at the reference date

  • Selection from Education Archive (micro-integration)

    - Records representing education careers until the reference date (micro-integration: derive and impute missing start and end dates)

  • Adjust to target population (micro-integration)

    - E.g. eliminate foreign students in the Education Archive; supplement primary education by imputation from the Population Register (PR 0-14)

  • Quality assessment of sources (micro-integration)

    - Which sources are to be used, which neglected (e.g. CWI)?

  • Assessing validity of education levels at reference date

    - Decision rules: deterministic and stochastic (based on probabilities)

  • Determine highest valid level during education career

    - Example, registers: SCED 43 43 53 60 60

    - Example, LFS-sample: SCED 20 – 33 – 43 – 53 – 60
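
The last two steps can be sketched roughly as follows. This is a minimal illustration, not the procedure used at Statistics Netherlands: it assumes the level codes are ordinal (a higher code means a higher level) and that the decision rules have already flagged each record as valid or not. The `Record` type, its field names and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class Record:
    person_id: int
    level: int       # education level code; assumed ordinal (higher = higher level)
    end_date: date   # date the level was attained (derived or imputed if missing)
    valid: bool      # outcome of the decision rules for the reference date


def highest_valid_level(records: Iterable[Record], reference_date: date) -> Optional[int]:
    """Return the highest valid level attained on or before the reference date."""
    levels = [r.level for r in records if r.valid and r.end_date <= reference_date]
    return max(levels) if levels else None


# Career with the level sequence of the register example (43, 43, 53, 60, 60);
# the dates are made up for illustration.
career = [
    Record(1, 43, date(1996, 7, 1), True),
    Record(1, 43, date(1997, 7, 1), True),
    Record(1, 53, date(2000, 7, 1), True),
    Record(1, 60, date(2003, 7, 1), True),
    Record(1, 60, date(2004, 7, 1), True),
]

level_2005 = highest_valid_level(career, date(2005, 9, 30))
level_2001 = highest_valid_level(career, date(2001, 1, 1))
```

Taking the maximum only makes sense if the coding is ordinal; with a non-ordinal classification a ranking table would be needed first.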

Validity education level at reference date

  • Deterministic decision rules

    - Example: the last record of a person in the Education Archive is secondary education with certificate in 2003. Is the education level still valid at reference date 2006? Yes, because the person is not found in the register of Higher Education.

    - Example: someone completes a doctoral thesis in 2005. This remains the highest attained education level in the period after, because we distinguish no higher level (trivial!)

  • Stochastic decision rules

    - Probability that the education level is still valid at the reference date (application of Survival Analysis, Life Tables method)

    - Example: last observed date D. Is the level valid x years after D? Upper bound U: within the period [D; D+U] the level is still valid (95%)
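
One possible reading of the stochastic rule, as a sketch only: take U as the largest number of years for which the survival probability is still at least 95%, and accept a level as valid if the reference date falls within [D; D+U]. The function names are hypothetical and the survival probabilities below are made up for illustration; they are not estimates from the LFS.

```python
from datetime import date
from typing import Sequence


def upper_bound_years(survival: Sequence[float], threshold: float = 0.95) -> int:
    """Largest whole number of years t with S(t) >= threshold.

    survival[t] is S(t) for t = 0, 1, 2, ...; assumed non-increasing.
    """
    u = 0
    for t, s in enumerate(survival):
        if s < threshold:
            break
        u = t
    return u


def still_valid(last_observed: date, reference: date,
                survival: Sequence[float], threshold: float = 0.95) -> bool:
    """True if the reference date lies within [D; D+U]."""
    years_elapsed = (reference - last_observed).days / 365.25
    return years_elapsed <= upper_bound_years(survival, threshold)


# Hypothetical S(0), S(1), ..., S(5) for one education level.
s_example = [1.00, 0.99, 0.97, 0.95, 0.91, 0.85]

u = upper_bound_years(s_example)
recent = still_valid(date(2003, 6, 30), date(2006, 1, 1), s_example)
old = still_valid(date(2001, 6, 30), date(2006, 1, 1), s_example)
```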

Weighting Strategy (1), structure EAF

  • EAF September 2005: mixture of register and sample (LFS) records

    - Coverage: 6.5 mln records (5.8 mln register and 0.675 mln sample)

    - NL population: 16.3 mln people

    - Sample inflation to bridge gap of 9.8 mln people

Figure: Educational Attainment File, September 2005 (register records, plus sample records to be inflated)
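
The arithmetic behind these figures can be checked in a few lines. The average inflation weight at the end is not on the slide; it follows only under the assumption, labeled in the code, that every register record counts for exactly one person.

```python
# Figures from the slide, in persons/records.
population = 16_300_000
register_records = 5_800_000
sample_records = 675_000

covered = register_records + sample_records   # 6,475,000, about 6.5 mln
gap = population - covered                    # 9,825,000, about 9.8 mln

# Assumption: each register record has weight 1, so the sample records
# together must represent everyone outside the registers.
to_represent = population - register_records          # 10,500,000
average_weight = to_represent / sample_records        # roughly 15.6 persons per sample record
```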

Weighting Strategy (2), representativeness

  • Younger people are better represented in the EAF (registers!)

  • Older people are underrepresented in the EAF

  • Make the EAF representative of the NL population: calibration of LFS-weights
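
As a rough illustration of what calibration does, the sketch below uses the simplest one-variable form (post-stratification by age group): within each group the sample weights are rescaled so that register records plus weighted sample records add up to the known population total. This is a simplification, not the weighting model actually used; the function name and the toy numbers are made up.

```python
from typing import Dict, List


def calibrate_sample_weights(population: Dict[str, float],
                             register_count: Dict[str, float],
                             sample_weights: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Rescale sample weights per group so register + weighted sample = population.

    population[g]      known population total in group g
    register_count[g]  number of register records in group g (weight 1 each)
    sample_weights[g]  initial weights of the sample records in group g
    """
    calibrated = {}
    for g, weights in sample_weights.items():
        target = population[g] - register_count[g]   # persons the sample must represent
        factor = target / sum(weights)
        calibrated[g] = [w * factor for w in weights]
    return calibrated


# Toy numbers: the young are mostly covered by registers, the old are not.
population = {"young": 1000, "old": 1000}
register_count = {"young": 800, "old": 100}
sample_weights = {"young": [10.0, 10.0], "old": [10.0] * 6}

new_weights = calibrate_sample_weights(population, register_count, sample_weights)
```

In the toy data the weights of older sample members are scaled up more than those of the young, which is the mechanism that corrects the underrepresentation.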

Weighting Strategy (3), weighting model

  • Weighting model, variables

    - demographic (e.g. sex, age, country of origin, ...)

    - socio-economic (e.g. socio-economic category, income, ...)

    - education CWI-register (proxy)

  • Weighting model too abundant?

    - pro: consistency with as many population margins as possible

    - con: fluctuation of final weights, disrupting effect on accuracy, problematic in cells with few observations

    - revision of the weighting strategy is being considered

Accuracy and dissemination

  • Are education levels accurate enough for dissemination?

  • A measuring instrument is needed to determine accuracy: a variance estimator

  • Standard sampling-theory literature mainly covers sample-only data, much less combined register and sample data

    - approximation formulas: only to be used for larger n; problematic for smaller subpopulations, in which we are particularly interested

    - solution also applicable to smaller subpopulations: bootstrap resampling method for accuracy measurement, developed by methodologists of Statistics Netherlands
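
The slide does not specify the bootstrap method developed at Statistics Netherlands. Purely as an illustration of the idea, a naive bootstrap for a cell total that mixes register counts and weighted sample records could look like the sketch below. The function name, arguments and toy data are hypothetical, and treating the register part as fixed (no sampling variance) is an assumption.

```python
import random
import statistics
from typing import Sequence


def bootstrap_cv(register_count: float,
                 sample_weights: Sequence[float],
                 in_cell: Sequence[bool],
                 n_boot: int = 1000,
                 seed: int = 0) -> float:
    """Naive bootstrap CV of (register count + weighted sample total) for one cell."""
    rng = random.Random(seed)
    n = len(sample_weights)
    totals = []
    for _ in range(n_boot):
        # Resample the sample records with replacement; the register part stays fixed.
        idx = [rng.randrange(n) for _ in range(n)]
        estimate = register_count + sum(sample_weights[i] for i in idx if in_cell[i])
        totals.append(estimate)
    return statistics.stdev(totals) / statistics.mean(totals)


# Toy sample: 200 records with weight 15, of which 20 fall in the cell of interest.
weights = [15.0] * 200
cell_flags = [i % 10 == 0 for i in range(200)]

cv_sample_only = bootstrap_cv(0, weights, cell_flags)
cv_with_register = bootstrap_cv(1000, weights, cell_flags)
```

With the same seed the two calls draw identical resamples, so the only difference is the fixed register count, which raises the mean without adding variance and therefore lowers the CV.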

Accuracy, small subpopulations

  • Young highly educated persons of Turkish origin, 2005

Conclusions

  • Results of the new method for deriving educational attainment are promising

    - Similar to the results of traditional LFS-estimation at high aggregation level

    - Outperforms the LFS for small populations, so more dissemination possibilities for small populations

  • Serious opportunity for innovation in the Census of 2011: produce educational attainment according to the new method

Remarks, Questions, Discussion

For more details read our paper!

SSD-system, organizational units

  • SSD-system: SSD-core (pivot) and SSD-satellites

  • Not a full presentation of the SSD-satellites!

  • Manageability SSD-system: split in smaller units

  • Core: demographic and socio-economic information relevant in almost any field

  • Educational attainment core-position: crucial in many social processes

  • Satellite: specific topics

  • Core and satellites consistent: 1-figure 1-phenomenon

Population Census in the Netherlands

  • Until 1971 conventional: field enumeration

  • 2001: Virtual Census

    - Key source: Social Statistical Database (SSD)

    - greater part from registers

    - educational attainment from LFS (2000/2001)

  • Virtual Census results convincing => build on these experiences for Census 2011

  • 2011: Virtual Census

    - Task force in charge of working-out

    - Key source: much more comprehensive SSD

    - Educational attainment: new method recommended (detailed table in Census table programme)

Survival Analysis, Life Tables Method

The survival function S(t) = P[T ≥ t] gives the probability that the educational attainment has not changed within t years. The distribution was determined empirically on the basis of the LFS for a number of years.
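
How such an empirical S(t) might be computed is sketched below with the standard actuarial life-table estimator. This is a generic textbook version, not necessarily the exact procedure applied to the LFS; the function name, the coding of the input and the toy data are assumptions.

```python
from typing import List, Sequence


def life_table_survival(durations: Sequence[int],
                        events: Sequence[bool],
                        n_intervals: int) -> List[float]:
    """Actuarial (life table) estimate of S(t) = P[T >= t] for t = 0..n_intervals.

    durations[i]  year interval t in which person i leaves observation
    events[i]     True if the education level changed in that interval,
                  False if observation simply stopped (censored)
    """
    survival = [1.0]
    at_risk = len(durations)
    for t in range(n_intervals):
        changed = sum(1 for d, e in zip(durations, events) if d == t and e)
        censored = sum(1 for d, e in zip(durations, events) if d == t and not e)
        # Censored cases are counted as at risk for half the interval.
        effective = at_risk - censored / 2
        q = changed / effective if effective > 0 else 0.0
        survival.append(survival[-1] * (1.0 - q))
        at_risk -= changed + censored
    return survival


# Toy data for 10 persons (illustrative only).
toy_durations = [0, 1, 1, 2, 3, 3, 3, 3, 3, 3]
toy_events = [True, True, False, True, False, False, False, False, False, False]

s = life_table_survival(toy_durations, toy_events, n_intervals=3)
```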

CV-formulae, sample

  • $N$: population size

  • $n$: sample size

  • $N(g)$: total population in cell $g$

  • $n(g)$: number of sample observations in cell $g$

  • $p = N(g)/N$; $q = 1 - p$; $f = n/N$

  • With $n$ large enough, the variance of $\hat{N}(g)$ can be approximated as: $\operatorname{var}(\hat{N}(g)) = N^2 p q (1 - f) / n$

  • Assuming that the average sample fraction $f$ is very small (e.g. the LFS sample fraction is about 1 percent) and that $p$ is very small (i.e. a relatively small subpopulation), the coefficient of variation (CV) can be approximated as: $[q(1 - f)/(np)]^{1/2} \approx [1/n(g)]^{1/2}$

  • So a CV $\le 20\%$ implies $n(g) \ge 25$ (threshold)
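
The formulas above are easy to check numerically. The sketch below computes the CV before and after the approximation and the resulting threshold; the function names and the example population figures are chosen for illustration only.

```python
import math


def cv_exact(N: float, n: float, N_g: float) -> float:
    """CV of the estimated cell total: sqrt(q(1-f)/(np))."""
    p = N_g / N
    q = 1.0 - p
    f = n / N
    return math.sqrt(q * (1.0 - f) / (n * p))


def cv_approx(n_g: float) -> float:
    """Approximation for small f and small p: sqrt(1/n(g))."""
    return math.sqrt(1.0 / n_g)


def min_sample_observations(max_cv: float) -> int:
    """Smallest n(g) for which the approximate CV does not exceed max_cv."""
    # Rounding guards against floating-point noise just above an integer.
    return math.ceil(round((1.0 / max_cv) ** 2, 9))


# Illustrative case: 1 percent sample of 16.3 mln people, cell of 2,500 persons,
# so about 25 sample observations are expected in the cell.
exact = cv_exact(16_300_000, 163_000, 2_500)
approx = cv_approx(25)
threshold = min_sample_observations(0.20)
```

The approximation slightly overstates the CV, because it drops the factors $q$ and $1 - f$, which are both just below 1.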

CV-formulae, mixture sample/register

  • $N_1(g)$: number of register observations in cell $g$

  • $N_2(g)$: weighted number of sample observations in cell $g$

  • $N(g) = N_1(g) + N_2(g)$

  • $n_2(g)$: (unweighted) number of sample observations in cell $g$

  • For a coefficient of variation of not more than $A$ it is required that $n_2(g) \ge (1/A)^2 \cdot [N_2(g)/N(g)]^2$

  • With a higher $N_1(g)$ the threshold decreases
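
A small numerical sketch of this threshold, with a hypothetical function name and made-up cell counts, shows how register coverage lowers the number of sample observations required:

```python
import math


def min_sample_observations_mixed(max_cv: float,
                                  register_count: float,
                                  weighted_sample_count: float) -> int:
    """Smallest n2(g) satisfying n2(g) >= (1/A)^2 * (N2(g)/N(g))^2."""
    total = register_count + weighted_sample_count      # N(g) = N1(g) + N2(g)
    share = weighted_sample_count / total               # N2(g) / N(g)
    # Rounding guards against floating-point noise just above an integer.
    return math.ceil(round((1.0 / max_cv) ** 2 * share ** 2, 9))


# Without register records the sample-only threshold of 25 is recovered.
sample_only = min_sample_observations_mixed(0.20, 0, 1000)

# If three quarters of the cell comes from registers, far fewer sample
# observations are needed for the same CV of 20%.
mostly_register = min_sample_observations_mixed(0.20, 3000, 1000)
```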

Silva & Skinner (1997)

  • Simulation study

  • Adding auxiliary variables in a regression model causes variance of regression estimator to drop initially, but by adding still more variables the variance will tend to increase from a certain point on.
