ESRC Research Methods Festival 2014: Using Secondary Analysis to Research Individual Behaviour. On the job training and accounting for endogeneity using BHPS longitudinal data. Genevieve Knight and Michael White, July 8 2014.
The impact of economic conditions on the outcomes of job mobility and the mediating role of training: Sectoral differences in the private returns to in-job training in the 1990s UK recession
To inform the large-scale UK public sector job cuts announced within an economic recession, we looked for evidence on what happened last time. UK public sector contraction and forced movement of public service employees to the private sector has happened before, under the John Major government of the 1990s, so we can use the experience then to gauge the effects of public-to-private job moves now.
Use the methods and design of analysis
Two criteria to infer a causal relationship:
Covariation (correlation) of causal (X, explanatory) and outcome (Y, dependent) variables, AND time order: cause comes before effect
Individual A – receives training
She then earns £280 per week (WE OBSERVE THIS FACT)
If she had received no training, she would earn £220 per week (this has to be estimated!)
Impact on individual A = £280 - £220 = £60
Clearly, we do not observe the situation where training is not received for those who actually do receive training.
1. Matching - Impact evaluation & the counterfactual
Matching impact analysis involves carefully trying to estimate the counterfactual for those who receive training.
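The worked example for individual A can be written in potential-outcomes notation. The symbols below (Y(1), Y(0), D) are conventional notation assumed here for illustration; they do not appear on the original slides.

```latex
% Y_A(1): weekly earnings of individual A with training (observed)
% Y_A(0): weekly earnings of individual A without training (counterfactual, must be estimated)
\tau_A = Y_A(1) - Y_A(0) = 280 - 220 = 60

% Matching targets the average effect for those who receive training (D = 1):
\mathrm{ATT} = E\left[\, Y(1) - Y(0) \mid D = 1 \,\right]
```

Only Y(1) is observed for the trained, so E[Y(0) | D = 1] is the quantity that matching tries to estimate from comparable non-trained people.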
Fixed effects (FE) regression (AKA ‘within regression’, since estimates reflect within-person variation around her/his mean values rather than comparisons between people) accounts for the influence of unobserved personal factors (such as ability or personality) that are constant over time.
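A minimal sketch of the within transformation, assuming a single regressor and a toy panel. The function name, the variable names and the numbers are hypothetical, chosen only to show how subtracting person means removes a time-constant factor such as ability.

```python
import numpy as np

def within_estimator(person_id, x, y):
    """Slope from regressing person-demeaned y on person-demeaned x (one regressor)."""
    person_id = np.asarray(person_id)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Map each observation to an integer index for its person.
    _, idx = np.unique(person_id, return_inverse=True)
    counts = np.bincount(idx)
    # Subtract each person's own mean: this is the 'within' transformation.
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))

# Hypothetical toy panel: 3 people, 2 waves each.
person = [1, 1, 2, 2, 3, 3]
trained = [0, 1, 0, 1, 0, 0]
# Time-constant 'ability' differs by person and is never given to the estimator.
ability = {1: 200.0, 2: 300.0, 3: 250.0}
wage = [ability[p] + 60.0 * t for p, t in zip(person, trained)]

effect = within_estimator(person, trained, wage)
```

Because ability is constant within each person, it drops out after demeaning, and the slope reflects only within-person changes in training and wages. The same function works on an unbalanced panel, since each person's mean is taken over however many waves they contribute.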
Outcome = Op
Outcome = Oc
Positive overall effects for some, but with sectoral differences (no effect in the market sector, 7.5% in the public sector), and phasing (timing), with public sector training providing a more persistent gain in protecting earnings when employees change sector or change employment.
1) change in sector, i.e. the mobility between public and market sectors;
We use a combination of matching and FE: matching for large samples (all, market, public) and FE for mobility groups, as we can pool across waves.
Observational data… (sigh!)
Violation of ignorable treatment assignment
[there are unobserved variables related to both treatment assignment (who gets training) and the outcome (wages)]
The only true solution – get better data!
In practice, we have to implement solutions to try to fix up these issues…
That’s not all folks…
3. Other ‘methods/data’ problems…
Attrition – survey ‘drop-outs’ can unbalance the information, leading to ‘selection bias’, and should be accounted for (a form of unit missing data).
Missing (Y or covariate values) – the methods of accounting for missing data will affect the results (another bias). X = simple mean imputation: missing dummy indicators in the propensity; missing dummies in the FE. Y? See above.
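A minimal sketch of simple mean imputation with a missing-value dummy for one covariate. The function name and the example variable `tenure` are hypothetical; the slides do not say which covariates were imputed.

```python
import numpy as np

def mean_impute_with_flag(x):
    """Fill missing values with the observed mean and return a 0/1 missing indicator."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Replace NaN with the mean of the observed (non-missing) values.
    filled = np.where(missing, np.nanmean(x), x)
    return filled, missing.astype(int)

# Hypothetical covariate with one missing value.
tenure = [2.0, float("nan"), 4.0, 6.0]
filled, flag = mean_impute_with_flag(tenure)
```

Both `filled` and `flag` would then enter the propensity model (or the FE regression), so the dummy absorbs any systematic difference between cases with and without the observed value.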
Choose the variance estimates for matching – there is still debate on the best way to account for the uncertainty from the propensity estimate not being the true propensity; it is just an estimate. (It leads to a more conservative, i.e. wider, confidence interval on the impact than necessary!) More bias and variance reduction trade-off decisions… we use Abadie & Imbens.
You have to choose a matching method… more bias and variance reduction trade-off decisions. We specify two-match nearest neighbour matching with sample replacement.
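A minimal sketch of two-match nearest neighbour matching with replacement, assuming the propensity score has already been estimated. The function name and the toy numbers are hypothetical, and no variance estimate is attempted here.

```python
import numpy as np

def nn_match_att(score, treated, outcome, k=2):
    """ATT from k-nearest-neighbour matching on the propensity score, with replacement."""
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    s_t, y_t = score[treated], outcome[treated]
    s_c, y_c = score[~treated], outcome[~treated]
    # Distance in propensity score between every treated and every comparison case.
    dist = np.abs(s_t[:, None] - s_c[None, :])
    # Indices of the k closest comparison cases for each treated case.
    # With replacement: the same comparison case can serve several treated cases.
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    counterfactual = y_c[nearest].mean(axis=1)
    return float(np.mean(y_t - counterfactual))

# Hypothetical data: 2 trained people, 5 untrained comparison cases.
score = [0.6, 0.7, 0.1, 0.58, 0.62, 0.72, 0.9]
treated = [1, 1, 0, 0, 0, 0, 0]
outcome = [280.0, 300.0, 150.0, 220.0, 230.0, 240.0, 400.0]

att = nn_match_att(score, treated, outcome, k=2)
```

In this toy data the comparison case with score 0.62 is matched to both trained people, which is what matching with replacement permits. Using two matches rather than one averages over more comparison cases (lower variance) at the cost of slightly less close matches (more bias).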
You have to choose how much covariate balance is enough…
For FE - repeated observations on the same individuals: use robust variance estimator
For FE - ‘unbalanced panel’ approach (Wansbeek and Kapteyn 1989), since restriction to the balanced panel would lose too much data.
Inclusion of numerous controls for time-varying variables helps to strengthen the causal interpretation of results as well as substituting for sample weights.
We compensate for the absence of weighting by including a wide range of control variables that have been used by the survey originators in structuring the survey (see Taylor et al. 2011).
3. Other problems to solve in (panel) data… Analytical problems (biases)
We suspect FE does a better job of allowing for individual unobserved ability/financial motivation that no amount of X will get rid of.