On coordination of stratified Pareto πps and simple random samples. Annika Lindblom, Alex Teterukovsky, Statistics Sweden
The paper focuses on: • Presentation of the sampling designs in the SAMU: stratified SRS and Pareto πps • Sample co-ordination, in particular the implementation of the Pareto πps design • Overlap between πps and SRS samples: - theoretical findings for SRS - empirical findings based on surveys in practice
The SAMU • A system for co-ordination of frame populations and samples from the Business Register at Statistics Sweden since 1972 • Three main objectives: - obtain comparable statistics - ensure high precision in estimates of change over time - spread the response burden
Inclusion probabilities. The frame population is divided into $H$ disjoint strata $U_h$, $h = 1, \dots, H$, where $U_h$ contains $N_h$ units. A sample of fixed size $n_h$ is to be drawn from each $U_h$. Inclusion probabilities are: Stratified SRS: $\pi_k = n_h / N_h$, the same for all units within a stratum. (Pareto) πps: $\lambda_k = n_h x_k / \sum_{j \in U_h} x_j$, unique for each unit, where $x_k$ is the size measure for unit $k$.
Permanent random numbers • To each unit $k$ in the Business Register a permanent random number $u_k$, uniformly distributed over the interval (0,1), is attached • For SRS we choose the starting point and the direction, and sample the necessary number of units: Different blocks
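The SRS selection described above can be sketched as follows. This is a minimal illustration, not the SAMU implementation: the function name `prn_srs`, the dictionary layout, and the wrap-around at the ends of the interval (0,1) are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def prn_srs(prn, n, start=0.0, direction="right"):
    """Pick n units by walking from `start` along the PRN line.

    prn: dict mapping unit id -> permanent random number in (0,1).
    Assumes n <= len(prn); the walk wraps around at 0 and 1 (an assumption).
    """
    ordered = sorted(prn, key=prn.get)          # unit ids sorted by PRN
    values = [prn[k] for k in ordered]
    size = len(ordered)
    if direction == "right":
        first = bisect_left(values, start)      # first unit with PRN >= start
        idx = [(first + i) % size for i in range(n)]
    elif direction == "left":
        first = bisect_right(values, start) - 1  # last unit with PRN <= start
        idx = [(first - i) % size for i in range(n)]
    else:
        raise ValueError("direction must be 'right' or 'left'")
    return [ordered[i] for i in idx]


# Hypothetical stratum of five units
prn = {"a": 0.12, "b": 0.47, "c": 0.81, "d": 0.33, "e": 0.95}
print(prn_srs(prn, 2, start=0.4, direction="right"))
print(prn_srs(prn, 2, start=0.4, direction="left"))
```

Because the random numbers are permanent, two surveys that use nearby starting points and the same direction tend to pick the same units, while surveys placed in different blocks of the interval tend to pick different ones.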
Pareto πps sampling procedure • Compute the desired inclusion probabilities $\lambda_k = n_h x_k / \sum_{j \in U_h} x_j$ within each stratum • If $\lambda_k > 1$, the unit is sampled with probability 1 • For the other units calculate the ranking variable: $q_k = \dfrac{u_k (1 - \lambda_k)}{\lambda_k (1 - u_k)}$ • The sample consists of the units with the $n_h$ smallest $q$-values within stratum $h$
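The procedure can be sketched for a single stratum as below. The function name `pareto_pips_sample` and the data layout are hypothetical. The sketch also assumes that, once certainty units are set aside, the desired inclusion probabilities are recomputed for the remaining units with the reduced sample size; this is a common convention, but the slide does not spell it out.

```python
def pareto_pips_sample(sizes, prn, n):
    """Pareto pi-ps sample of size n from one stratum.

    sizes: dict unit id -> size measure x_k (> 0)
    prn:   dict unit id -> permanent random number u_k in (0,1)
    Assumes n <= number of units in the stratum.
    """
    remaining = dict(sizes)
    certain = []
    while True:
        m = n - len(certain)                      # units still to be drawn
        total = sum(remaining.values())
        lam = {k: m * x / total for k, x in remaining.items()}
        take_all = [k for k, v in lam.items() if v > 1.0]
        if not take_all:
            break
        for k in take_all:                        # sampled with probability 1
            certain.append(k)
            del remaining[k]
    # Ranking variable q_k = u_k (1 - lambda_k) / (lambda_k (1 - u_k))
    q = {k: prn[k] * (1.0 - lam[k]) / (lam[k] * (1.0 - prn[k]))
         for k in remaining}
    ranked = sorted(remaining, key=q.get)         # smallest q first
    return certain + ranked[: n - len(certain)]


# Hypothetical stratum: one dominant unit and four small ones
sizes = {"a": 100, "b": 5, "c": 4, "d": 3, "e": 2}
prn = {"a": 0.9, "b": 0.5, "c": 0.2, "d": 0.7, "e": 0.1}
print(pareto_pips_sample(sizes, prn, 3))
```

In this example unit `a` is taken with certainty, and the two remaining places go to the units with the smallest ranking values among the rest.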
Pareto and starting points. A random number transformation is necessary. The objective of the transformation is to select the $n_h$ units with the smallest $q$-values within stratum $h$ independently of which starting point $S$ is chosen. Transform $u_k$ into $z_k$ as follows. Sampling direction right: Sampling direction left:
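The transformation formulas are not reproduced in the text of this slide. One plausible form, given here purely as an assumption, is to replace each random number by its circular distance from the starting point in the sampling direction, so that the starting point plays the role of zero. The names `transform_prn` and `ranking_value` are hypothetical.

```python
def transform_prn(u, start, direction="right"):
    """Map u in (0,1) to z in [0,1): distance from `start` along the
    sampling direction, wrapping around the unit interval.

    Assumed form, not taken from the slides:
      right: z = (u - start) mod 1
      left:  z = (start - u) mod 1
    """
    if direction == "right":
        return (u - start) % 1.0
    if direction == "left":
        return (start - u) % 1.0
    raise ValueError("direction must be 'right' or 'left'")


def ranking_value(u, lam, start, direction="right"):
    """Pareto ranking variable computed from the transformed number z
    instead of u; lam is the desired inclusion probability, 0 < lam <= 1."""
    z = transform_prn(u, start, direction)
    return z * (1.0 - lam) / (lam * (1.0 - z))


# A unit just to the right of S gets a small z when sampling to the right
print(transform_prn(0.41, 0.40, "right"))
# The same unit is far away when sampling to the left
print(transform_prn(0.41, 0.40, "left"))
```

Under this assumption the transformed numbers are again uniform on the unit interval, so the ranking step itself is unchanged and only the input random numbers differ between starting points.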
Coordination and overlap • Theoretical. SRS/SRS. • Empirical. SRS/SRS. • Empirical. Pareto/Pareto. • Empirical. SRS/Pareto. Same surveys. • Empirical. SRS/Pareto. Different surveys. • Empirical. SRS/SRS and Pareto/Pareto over time.
Theoretical overlap. SRS/SRS. Coordinate 2 equal SRS samples ($h$ is the stratum): sample sizes $n_h$, frame population $N_h$, completely enumerated units $m_h$. What is the expected overlap for all a’s and b’s? GEOMETRY!
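The overlap question can be explored numerically before deriving it geometrically. The sketch below reads a and b as the starting points of the two samples, which is an assumption since the slide text does not define them. It also assumes both samples are drawn to the right and ignores completely enumerated units. The function names are hypothetical.

```python
def srs_from_start(prn, n, start):
    """Right-direction PRN sample: the n units met first when moving
    right from `start`, wrapping around the unit interval."""
    ranked = sorted(prn, key=lambda k: (prn[k] - start) % 1.0)
    return set(ranked[:n])


def overlap(prn, n, a, b):
    """Number of units common to the samples started at a and at b."""
    return len(srs_from_start(prn, n, a) & srs_from_start(prn, n, b))


# Evenly spaced PRNs make the geometry easy to see (illustrative data only)
N, n = 1000, 100
prn = {k: (k + 0.5) / N for k in range(N)}

for b in (0.0, 0.05, 0.1, 0.5):
    print(b, overlap(prn, n, a=0.0, b=b))
```

With these evenly spaced numbers each sample covers an interval of length about n/N, and the overlap shrinks linearly as the distance between the two starting points grows, reaching zero once the intervals no longer intersect.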