
Weighting and Estimation

Weighting and Estimation. Presented by Loredana Di Consiglio, Istituto Nazionale di Statistica (ISTAT). Outline: weighting and estimation in the Handbook; weighting, use of auxiliary variables and calibration estimators; small area estimation; preliminary estimation.


Presentation Transcript


  1. Weighting and Estimation

  2. Presented by • Loredana Di Consiglio • Istituto Nazionale di Statistica, ISTAT

  3. Outline • Weighting and estimation in the Handbook • Weighting, use of auxiliary variables and calibration estimators • Small area estimation • Preliminary estimation • Choice of estimation method

  4. Weighting • Principle of weighting: each sample unit represents a number of population units. • Basic weights: the design weights (the inverse of the inclusion probabilities) • Horvitz-Thompson estimator • Non-linear estimation: plug-in (or substitution) principle
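A minimal sketch of the Horvitz-Thompson estimator and of the plug-in principle, assuming a sample with known inclusion probabilities. The function names and the toy data are illustrative and do not come from the presentation. Each observed value is multiplied by its design weight and the products are summed; a non-linear quantity such as a mean is then obtained by substituting estimated totals into its definition.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the design-weighted sum of sample values."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0.0 < pi <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        total += y / pi  # design weight d = 1 / pi
    return total


def plug_in_mean(values, inclusion_probs):
    """Plug-in estimate of a mean: estimated total of y over estimated population size."""
    y_total = horvitz_thompson_total(values, inclusion_probs)
    n_hat = horvitz_thompson_total([1.0] * len(values), inclusion_probs)
    return y_total / n_hat


# Hypothetical sample: three units, each selected with probability 0.1
sample_y = [12.0, 7.0, 20.0]
sample_pi = [0.1, 0.1, 0.1]
ht_total = horvitz_thompson_total(sample_y, sample_pi)
ht_mean = plug_in_mean(sample_y, sample_pi)
```

With equal inclusion probabilities the plug-in mean reduces to the ordinary sample mean, which is a quick sanity check on the weights.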

  5. Weighting • The principle of weighting is also applied to account for unit non-response. • Design weights can be adjusted for non-response in order to reduce the possible bias of the resulting estimates. • For example, the sample can be partitioned into sub-groups of units where the response rates are assumed to be constant, and where it can be assumed that non-respondents behave similarly to respondents. • The underlying assumption: non-response depends on auxiliary variables defining a partition of the population, but conditionally on these variables it is independent of the target variable.
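The sub-group adjustment described on this slide can be sketched as follows, assuming each sampled unit carries a design weight, a response-cell label and a response flag. The function name, the cell labels and the data are hypothetical. Within each cell, respondent weights are inflated by the ratio of the weighted sample size to the weighted number of respondents.

```python
from collections import defaultdict


def nonresponse_adjusted_weights(design_weights, cells, responded):
    """Return adjusted weights for respondents and None for non-respondents."""
    sampled = defaultdict(float)      # sum of design weights, all sampled units
    respondents = defaultdict(float)  # sum of design weights, respondents only
    for d, c, r in zip(design_weights, cells, responded):
        sampled[c] += d
        if r:
            respondents[c] += d

    for c in sampled:
        if respondents[c] == 0:
            raise ValueError(f"cell {c!r} has no respondents; merge it with another cell")

    adjusted = []
    for d, c, r in zip(design_weights, cells, responded):
        # inverse of the weighted response rate in the unit's cell
        adjusted.append(d * sampled[c] / respondents[c] if r else None)
    return adjusted


# Hypothetical sample: cell A has a 50% response rate, cell B has full response
d = [10.0, 10.0, 10.0, 10.0, 5.0, 5.0, 5.0]
cell = ["A", "A", "A", "A", "B", "B", "B"]
resp = [True, True, False, False, True, True, True]
adj = nonresponse_adjusted_weights(d, cell, resp)
```

The adjusted respondent weights sum to the same total as the original design weights, so the respondents stand in for the whole sample within each cell.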

  6. Use of Auxiliary information • When auxiliary variables are available: reduce bias, reduce variance (however, sometimes external bounds) • Ratio estimator, auxiliary information: the total of one numerical variable • If applied to the X variable itself, one gets a perfect estimate, that is, the known total is reproduced exactly
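A minimal sketch of the ratio estimator, assuming the population total of one auxiliary variable x is known. The function name and the numbers are illustrative only. The weighted sample total of y is rescaled by the ratio of the known x total to its weighted sample estimate, so applying the estimator to x itself returns the known total.

```python
def ratio_estimate(y, x, weights, known_x_total):
    """Ratio estimator of the total of y, using the known total of x."""
    y_ht = sum(w * yk for w, yk in zip(weights, y))  # weighted estimate of total y
    x_ht = sum(w * xk for w, xk in zip(weights, x))  # weighted estimate of total x
    if x_ht == 0:
        raise ValueError("estimated total of x is zero; the ratio is undefined")
    return known_x_total * y_ht / x_ht


# Hypothetical data: four sampled units with design weight 10 each
y = [3.0, 5.0, 8.0, 9.0]
x = [2.0, 4.0, 6.0, 8.0]
w = [10.0] * 4
X_TOTAL = 220.0  # assumed known population total of x

y_ratio = ratio_estimate(y, x, w, X_TOTAL)
x_ratio = ratio_estimate(x, x, w, X_TOTAL)  # reproduces X_TOTAL
```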

  7. Use of Auxiliary information • Post-stratification: the auxiliary information is the total of a vector of post-stratum indicators (the post-stratum population counts) • The estimator is
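For reference, a standard textbook form of the post-stratified estimator is given here. The notation is an assumption, not taken from the slide: $N_h$ is the known population count of post-stratum $h$, $s_h$ the part of the sample falling in it, and $d_k$ the design weight of unit $k$.

```latex
\hat{Y}_{ps} = \sum_{h=1}^{H} \frac{N_h}{\hat{N}_h} \sum_{k \in s_h} d_k\, y_k,
\qquad
\hat{N}_h = \sum_{k \in s_h} d_k
```

Within each post-stratum the design weights are rescaled so that they sum to the known count $N_h$.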

  8. Use of Auxiliary information • Raking Ratio • Auxiliary information: known totals of different auxiliary variables (not cross-classified) • The raking-ratio method consists of performing post-stratification on each variable in turn and iterating until the weights stabilise
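The iteration can be sketched as below, assuming categorical auxiliary variables with known marginal totals. The function name, the stopping rule and the data are illustrative assumptions, not the presentation's implementation. Each pass post-stratifies the current weights on one variable at a time, and passes are repeated until the adjustment factors are all close to 1.

```python
def rake(weights, categories, margins, max_iter=100, tol=1e-10):
    """Raking ratio: adjust weights to match known marginal totals.

    categories: one list per variable, giving each unit's category.
    margins:    one dict per variable, mapping category -> known total.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for cats, target in zip(categories, margins):
            # totals currently estimated for each category of this variable
            current = {}
            for wk, c in zip(w, cats):
                current[c] = current.get(c, 0.0) + wk
            # post-stratify on this variable
            for k, c in enumerate(cats):
                factor = target[c] / current[c]
                max_change = max(max_change, abs(factor - 1.0))
                w[k] *= factor
        if max_change < tol:
            break
    return w


# Hypothetical sample of four units with design weight 10 each
sex = ["m", "m", "f", "f"]
age = ["young", "old", "young", "old"]
known = [{"m": 24.0, "f": 16.0}, {"young": 22.0, "old": 18.0}]
raked = rake([10.0] * 4, [sex, age], known)

male_total = sum(wk for wk, s in zip(raked, sex) if s == "m")
young_total = sum(wk for wk, a in zip(raked, age) if a == "young")
```

Only the margins are needed, not the cross-classified counts, which is the situation the slide describes.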

  9. Use of Auxiliary information • GREG (generalised regression estimator) • GREG is «assisted» by a linear relationship between X and Y.
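A common way to write the GREG estimator of a total is shown below. The symbols are assumptions rather than the slide's own notation: $\mathbf{X}$ is the vector of known auxiliary totals, $\hat{\mathbf{X}}_{HT}$ and $\hat{Y}_{HT}$ are Horvitz-Thompson estimates, and the regression variance factors are taken as constant.

```latex
\hat{Y}_{GREG} = \hat{Y}_{HT} + \left(\mathbf{X} - \hat{\mathbf{X}}_{HT}\right)^{\top} \hat{\mathbf{B}},
\qquad
\hat{\mathbf{B}} = \Big(\sum_{k \in s} d_k\, \mathbf{x}_k \mathbf{x}_k^{\top}\Big)^{-1} \sum_{k \in s} d_k\, \mathbf{x}_k\, y_k
```

The Horvitz-Thompson estimate is corrected by the estimation error on the auxiliary totals, scaled by the fitted regression coefficients.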

  10. Calibration • The estimate of the total of Y is obtained by means of a procedure which • corrects bias due to non-response • takes into account the knowledge of auxiliary variables, requiring that their estimated totals equal the known totals

  11. Calibration • The weights w_k are calculated as w_k = d_k g_k, where: • d_k is the initial weight, equal to the inverse of the inclusion probability π_k • g_k is the final correction factor, which makes the sampling estimates equal to their known totals; it is calculated by means of the following equations

  12. Calibration • Final weights are chosen to minimise the total distance from the initial weights, subject to the constraints on the auxiliary variables • where G is an appropriate distance function • Optionally subject to bounds on w/d
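In a commonly used notation (an assumption here, since the slide's formula is not available), the calibration problem can be written as follows, with $\mathbf{X}$ the vector of known auxiliary totals and $L$, $U$ optional lower and upper bounds on the weight ratios.

```latex
\min_{\{w_k\}} \; \sum_{k \in s} d_k\, G\!\left(\frac{w_k}{d_k}\right)
\quad \text{subject to} \quad
\sum_{k \in s} w_k\, \mathbf{x}_k = \mathbf{X},
\qquad
L \le \frac{w_k}{d_k} \le U
```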

  13. Calibration • Distance function G: • Linear • Raking ratio: (w/d) log(w/d) – w/d + 1 • Truncated linear

  14. Calibration • The calibration estimator equals GREG when the linear (Euclidean) distance function is chosen
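The equivalence can be checked numerically with a small sketch, assuming two auxiliary variables (an intercept and one numeric x) so that the linear system is 2x2. All names and data are illustrative. Under the linear distance the calibrated weights have the closed form w_k = d_k (1 + x_k' λ), and the resulting estimate coincides with the GREG estimate.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("singular system")
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def weighted_cross_products(d, x):
    """T = sum_k d_k x_k x_k' for two-dimensional x_k."""
    return [[sum(dk * xk[i] * xk[j] for dk, xk in zip(d, x)) for j in range(2)]
            for i in range(2)]


def linear_calibration_weights(d, x, known_totals):
    """Calibrated weights under the linear distance: w_k = d_k (1 + x_k' lam)."""
    t = weighted_cross_products(d, x)
    ht = [sum(dk * xk[i] for dk, xk in zip(d, x)) for i in range(2)]
    lam = solve_2x2(t, [known_totals[i] - ht[i] for i in range(2)])
    return [dk * (1.0 + lam[0] * xk[0] + lam[1] * xk[1]) for dk, xk in zip(d, x)]


# Hypothetical sample; x_k = (1, size), known totals = (N, total size)
d = [10.0] * 4
x = [(1.0, 2.0), (1.0, 4.0), (1.0, 6.0), (1.0, 8.0)]
y = [3.0, 5.0, 8.0, 9.0]
totals = [42.0, 220.0]

w = linear_calibration_weights(d, x, totals)
calibrated = sum(wk * yk for wk, yk in zip(w, y))

# GREG computed directly: HT estimate plus regression correction
b_hat = solve_2x2(weighted_cross_products(d, x),
                  [sum(dk * xk[i] * yk for dk, xk, yk in zip(d, x, y)) for i in range(2)])
y_ht = sum(dk * yk for dk, yk in zip(d, y))
x_ht = [sum(dk * xk[i] for dk, xk in zip(d, x)) for i in range(2)]
greg = y_ht + sum((totals[i] - x_ht[i]) * b_hat[i] for i in range(2))
```

In this toy example the calibrated weights come out as 9, 10, 11 and 12, which reproduce both known totals.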

  15. Calibration • All calibration estimators are asymptotically equal to GREG • They are approximately unbiased and consistent • Their sampling variance converges to the GREG variance

  16. Calibration • Software • CLAN (Statistics Sweden) • BASCULA (The Netherlands) • GES (StatCan) • ReGenesees (ISTAT), an R package • A second R package, called ReGenesees.GUI, implements the presentation layer of the system: less experienced R users can take advantage of the user-friendly graphical interface. • Downloadable from Joinup: https://joinup.ec.europa.eu/software/regenesees/release/all

  17. Weighting, use of auxiliary variables and calibration • Planned modules in the HB • Main theme module • Calibration estimators • Already available: • GREG http://www.cros-portal.eu/content/generalised-regression-estimator

  18. Small area estimation • Most national surveys are planned to produce accurate estimates at the national level. • Analyses at a finer partition may not have the desired precision because the sample size is small or even zero. • A small area is a domain where the sample size is not sufficient to meet a prefixed level of precision.

  19. Small area estimation • Indirect estimators – make use of what has been observed in other domains (or at other times) • Traditional estimators: • Synthetic estimators • Composite estimators • Model based estimators • Area level models • Unit level models • With this class of estimators, extra information is gained in the estimation process by making use of observations outside the domain of interest, through implicit (synthetic estimators) or explicit (model based estimators) use of models.

  20. Small area estimation • Use information at local level with a common beta • Modified direct

  21. Small area estimation • Synthetic estimators: in the simple case it is assumed that small areas have the same mean as larger domains (at least within classes). • Synthetic estimators can be based on different models (relationships between the variable of interest and auxiliary variables): linear model; linear mixed model at unit level; linear mixed model at area level.
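A minimal sketch of the simplest synthetic estimator and of a composite estimator, assuming the small area shares the mean of a larger domain. The function names, the composite weight phi and the numbers are illustrative assumptions. The synthetic total multiplies the known area population size by the mean estimated in the larger domain; the composite estimator balances it against the direct estimate.

```python
def synthetic_total(area_pop_size, broad_domain_mean):
    """Synthetic estimate: the small area is assumed to share the broad-domain mean."""
    return area_pop_size * broad_domain_mean


def composite_total(direct, synthetic, phi):
    """Composite estimate: weighted average of direct and synthetic estimates."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    return phi * direct + (1.0 - phi) * synthetic


# Hypothetical inputs
broad_mean = 6.25     # mean estimated on the larger domain
area_size = 8         # known population size of the small area
direct_total = 60.0   # unstable direct estimate based on few sample units

syn = synthetic_total(area_size, broad_mean)
comp = composite_total(direct_total, syn, phi=0.25)
```

A small phi leans on the synthetic component, which is stable but biased if the area differs from the larger domain; a large phi leans on the direct estimate.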

  22. Small area estimation • Model based estimators • Based on area level model: • Based on unit level model:
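For orientation, the two model types are often written as below. These are the standard textbook forms (a Fay-Herriot type area level model and a nested error unit level model), and the notation is an assumption rather than the slide's own: $u_d$ is the random area effect, $\psi_d$ the sampling variance of the direct estimate $\hat{\theta}_d$.

```latex
% Area level model for the direct estimate of area d
\hat{\theta}_d = \mathbf{x}_d^{\top}\boldsymbol{\beta} + u_d + e_d,
\qquad u_d \sim (0, \sigma_u^2), \quad e_d \sim (0, \psi_d)

% Resulting (E)BLUP: weighted average of direct and regression-synthetic parts
\tilde{\theta}_d = \gamma_d\, \hat{\theta}_d + (1 - \gamma_d)\, \mathbf{x}_d^{\top}\hat{\boldsymbol{\beta}},
\qquad \gamma_d = \frac{\sigma_u^2}{\sigma_u^2 + \psi_d}

% Unit level model for unit k in area d
y_{dk} = \mathbf{x}_{dk}^{\top}\boldsymbol{\beta} + u_d + e_{dk}
```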

  23. Small area estimation • References in the HB: • http://www.cros-portal.eu/content/small-area-estimation • http://www.cros-portal.eu/content/eblup-area-level-sae • http://www.cros-portal.eu/content/eblup-unit-level-sae • http://www.cros-portal.eu/content/small-area-estimation-methods-time-series-data

  24. Small area estimation • Guidelines can be found at: http://www.cros-portal.eu/sites/default/files//WP6-Report.pdf • Quality assessment: http://www.cros-portal.eu/content/final-report-quality-assessment-sae-wp3 • In practice: • http://www.cros-portal.eu/content/final-report-software-tools-sae-sae-wp4 • R codes from ESSnet SAE project: http://www.cros-portal.eu/content/r-codes-documentations-sae-wp4

  25. Preliminary estimation • The treatment of unit non-response may be applied. • In this case, late response is treated as non-response; however, to avoid biased estimates, the self-selection mechanism of quick respondents should not be considered random.

  26. Preliminary estimation • Rao et al. (1989) proposed composite estimators that may represent an improvement on the standard estimator. • The basic composite estimator is obtained as a weighted average of the preliminary estimate at time t and the final estimate at time t-1 adjusted for the difference between the preliminary estimates at times t and t-1. • The weights are chosen on the basis of variances and covariances
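One reading of the verbal description above can be sketched as follows. This is an illustration under assumptions, not the exact estimator of Rao et al. (1989): the function name, the single weight phi and the numbers are hypothetical, and in practice the weight would be derived from the variances and covariances of the components.

```python
def composite_preliminary(prelim_t, prelim_prev, final_prev, phi):
    """Weighted average of the preliminary estimate at time t and the final
    estimate at time t-1 shifted by the change in preliminary estimates."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    adjusted_final = final_prev + (prelim_t - prelim_prev)
    return phi * prelim_t + (1.0 - phi) * adjusted_final


# Hypothetical series: last period's preliminary estimate was revised upwards by 2
estimate = composite_preliminary(prelim_t=105.0, prelim_prev=100.0,
                                 final_prev=102.0, phi=0.5)
```

The second component carries the previous revision forward, so the composite estimate sits between the raw preliminary figure and the revision-corrected one.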

  27. Preliminary estimation • In order to reduce the revision error of the preliminary estimates, model based estimators can be considered. Rao, Srinath and Quenneville (1989) adopt a time series approach to preliminary estimation. • The model involves the preliminary estimate at time t, the final estimate, and the measurement error in the preliminary estimate at time t

  28. Preliminary estimation • Furthermore, suppose: • The resulting estimator is: • Or, when auxiliary variables are available: • Or, taking seasonality into account:

  29. Preliminary estimation • Design based • http://www.cros-portal.eu/content/preliminary-estimates-design-based-methods • Model based • http://www.cros-portal.eu/content/preliminary-estimates-model-based-methods • Sub-sampling • http://www.cros-portal.eu/content/subsampling-preliminary-estimation

  30. Choice of estimation methods • Quality indicators: • Accuracy: degree of closeness of estimates to the true values. • Bias • Precision • Timeliness: the length of time between the event or phenomenon the statistics describe and their availability. – Revision errors • Coherence and comparability: coherence with other statistics • Ref.: ESS Handbook for Quality Reports, Methodologies and Working Papers, 2009

  31. Choice of estimation methods • Close relationship with sampling design – (e.g. weights) – Choice of sampling strategy • Non-probabilistic sample design? E.g. cut-off sampling: model based estimators • The model simply assumes that the units cut off behave similarly to those in the sampled portion. • Model assumptions should be analysed as far as possible.

  32. Thank you for your attention
