
# Types of Surveys

##### Presentation Transcript

1. Types of Surveys Cross-sectional • surveys a specific population at a given point in time • will have one or more of the following design components • stratification • clustering with multistage sampling • unequal probabilities of selection Longitudinal • surveys a specific population repeatedly over a period of time • panel • rotating samples

2. Cross Sectional Surveys Sampling Design Terminology

3. Methods of Sample Selection Basic methods • simple random sampling • systematic sampling • unequal probability sampling • stratified random sampling • cluster sampling • two-stage sampling

4. Simple Random Sampling Why? • basic building block of sampling • sample from a homogeneous group of units How? • physically make draws at random of the units under study • computer selection methods: R, Stata
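As a rough illustration of computer selection, here is a minimal Python sketch of simple random sampling without replacement. The function name `simple_random_sample`, the population of 100 numbered units, and the seed are assumptions made for the example, not part of the presentation.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded generator for reproducibility
    return rng.sample(list(units), n)  # sampling without replacement

# Hypothetical population: units labelled 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=1)
print(sample)
```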

5. Systematic Sampling Why? • easy • can be very efficient depending on the structure of the population How? • get a random start in the population • sample every kth unit for some chosen number k
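The two "How?" steps can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the population, and the choice k = 10 are assumptions, and the sketch takes the random start among the first k units of the list.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start: an index in 0, 1, ..., k-1
    return units[start::k]     # every kth unit from the start onward

# Hypothetical ordered population of 100 units, sampling interval k = 10
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=2)
print(sample)
```

When the population size is a multiple of k, as here, every possible start gives a sample of the same size.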

6. Additional Note Simplifying assumption: • in terms of estimation a systematic sample is often treated as a simple random sample Key assumption: • the order of the units is unrelated to the measurements taken on them

7. Unequal Probability Sampling Why? • may want to give greater or lesser weight to certain population units • two-stage sampling with probability proportional to size at the first stage and equal sample sizes at the second stage provides a self-weighting design (all units have the same chance of inclusion in the sample) How? • with replacement • without replacement
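The self-weighting property can be checked with a small calculation. The sketch below assumes that the first-stage inclusion probability of a cluster is exactly n × (cluster size) / (total size) and that a fixed number m of units is then drawn by simple random sampling within each selected cluster. The cluster sizes and the values n = 2, m = 10 are hypothetical.

```python
from fractions import Fraction

def overall_inclusion_probability(cluster_size, total_size, n_clusters, m_per_cluster):
    """Chance that one unit ends up in the sample under PPS + equal second-stage sizes."""
    # First stage: cluster selected with probability proportional to its size
    first_stage = Fraction(n_clusters * cluster_size, total_size)
    # Second stage: m units chosen by SRS among the units in the cluster
    second_stage = Fraction(m_per_cluster, cluster_size)
    return first_stage * second_stage   # the cluster size cancels out

# Hypothetical cluster sizes (chosen so that every first-stage probability is <= 1)
cluster_sizes = [50, 120, 200, 230]
total = sum(cluster_sizes)
probs = [overall_inclusion_probability(s, total, n_clusters=2, m_per_cluster=10)
         for s in cluster_sizes]
print(probs)
```

Because the cluster size appears in the numerator at the first stage and in the denominator at the second, every unit has the same overall probability n × m / (total size), whatever cluster it belongs to.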

8. With or Without Replacement? • in practice sampling is usually done without replacement • the formula for the variance based on without replacement sampling is difficult to use • the formula for with replacement sampling at the first stage is often used as an approximation Assumption: the population size is large and the sample size is small – sampling fraction is less than 10%
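For reference, one standard form of the with-replacement approximation is the Hansen-Hurwitz estimator of a population total. The slide does not state a formula, so the notation here is an assumption: $n$ first-stage draws, $p_i$ the draw probability of the $i$th selected unit, and $y_i$ its (estimated) total.

$$
\hat{T} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{p_i},
\qquad
\widehat{\operatorname{Var}}(\hat{T}) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{p_i}-\hat{T}\right)^{2}
$$

The variance estimator needs only the first-stage quantities $y_i/p_i$, which is what makes it easier to use than the without-replacement formula.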

9. Stratified Random Sampling Why? • for administrative convenience • to improve efficiency • estimates may be required for each stratum How? • independent simple random samples are chosen within each stratum
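A minimal sketch of the "How?" step, drawing an independent simple random sample in each stratum. The stratum names, sizes, and per-stratum sample sizes are hypothetical, as is the function name `stratified_sample`.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw an independent simple random sample within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical frame: 100 urban units and 40 rural units
strata = {"urban": list(range(100)), "rural": list(range(100, 140))}
sizes = {"urban": 10, "rural": 4}   # here roughly proportional allocation
sample = stratified_sample(strata, sizes, seed=3)
print(sample)
```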

10. Example: Survey of Youth in Custody • first U.S. survey of youths confined to long-term, state-operated institutions • complemented existing Children in Custody censuses. • companion survey to the Surveys of State Prisons • the data contain information on criminal histories, family situations, drug and alcohol use, and peer group activities • survey carried out in 1989 using stratified systematic sampling

11. SYC Design strata • type (a) groups of smaller institutions • type (b) individual larger institutions sampling units • strata type (a) • first stage – institution by probability proportional to size of the institution • second stage – individual youths in custody • strata type (b) • individual youths in custody • individuals chosen by systematic random sampling

12. Cluster Sampling Why? • convenience and cost • the frame or list of population units may be defined only for the clusters and not the units How? • take a simple random sample of clusters and measure all units in the cluster
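A sketch of one-stage cluster sampling as described: a simple random sample of clusters, with every unit in a selected cluster measured. The cluster labels and members are made up for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters by SRS and keep all units in each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # the frame lists clusters only
    return {c: list(clusters[c]) for c in chosen}       # measure every unit in the cluster

# Hypothetical clusters (for example, households and their members)
clusters = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1"],
    "C": ["c1", "c2"],
    "D": ["d1", "d2", "d3", "d4"],
}
sample = cluster_sample(clusters, n_clusters=2, seed=4)
print(sample)
```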

13. Two-Stage Sampling Why? • cost and convenience • lack of a complete frame How? • take either a simple random sample or an unequal probability sample of primary units and then, within each selected primary unit, take a simple random sample of secondary units
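A sketch of two-stage sampling with simple random sampling at both stages (the unequal probability variant at the first stage is not shown). The function name, the primary units, and the sample sizes are hypothetical.

```python
import random

def two_stage_sample(primary_units, n_primary, m_secondary, seed=None):
    """SRS of primary units, then SRS of secondary units within each selected primary."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(primary_units), n_primary)        # first stage
    return {
        p: rng.sample(primary_units[p],
                      min(m_secondary, len(primary_units[p])))  # second stage
        for p in chosen
    }

# Hypothetical primary units, each listing its secondary units
primary_units = {f"area{i}": [f"area{i}-unit{j}" for j in range(20)] for i in range(6)}
sample = two_stage_sample(primary_units, n_primary=3, m_secondary=5, seed=5)
print(sample)
```

Unlike one-stage cluster sampling, only a subsample of each selected primary unit is measured, so a list of secondary units is needed only for the selected primaries.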

14. Synthesis to a Complex Design Stratified two-stage cluster sampling Strata • geographical areas First stage units • smaller areas within the larger areas Second stage units • households Clusters • all individuals in the household

15. Why a Complex Design? • better coverage of the entire region of interest (stratification) • efficient for interviewing: less travel, less costly Problem: estimation and analysis are more complex

16. Ontario Health Survey • carried out in 1990 • health status of the population was measured • data were collected relating to the risk factors associated with major causes of morbidity and mortality in Ontario • survey of 61,239 persons was carried out in a stratified two-stage cluster sample by Statistics Canada

17. OHS Sample Selection • strata: public health units – divided into rural and urban strata • first stage: enumeration areas defined by the 1986 Census of Canada and selected by pps • second stage: dwellings selected by SRS • cluster: all persons in the dwelling

18. Longitudinal Surveys Sampling Design

19. Schematic Representation

20. Schematic Representation

21. British Household Panel Survey Objectives of the survey • to further understanding of social and economic change at the individual and household level in Britain • to identify, model and forecast such changes, their causes and consequences in relation to a range of socio-economic variables.

22. BHPS: Target Population and Frame Target population • private households in Great Britain Survey frame • small users Postcode Address File (PAF)

23. BHPS: Panel Sample • designed as an annual survey of each adult (16+) member of a nationally representative sample • 5,000 households approximately • 10,000 individual interviews approximately. • the same individuals are re-interviewed in successive waves • if individuals split off from original households, all adult members of their new households are also interviewed. • children are interviewed once they reach the age of 16 • 13 waves of the survey from 1991 to 2004

24. BHPS: Sampling Design Uses implicit stratification embedded in two-stage sampling • postcode sector ordered by region • within a region postcode sector ordered by socio-economic group as determined from census data and then divided into four or five strata Sample selection • systematic sampling of postcode sectors from ordered list • systematic sampling of delivery points (≈ addresses or households)
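The idea of implicit stratification can be sketched as "sort the frame, then sample systematically". The code below is an illustration only, not the BHPS procedure itself: the regions, socio-economic scores, and sampling interval are invented, and only one stage is shown.

```python
import random

def implicit_stratified_sample(units, sort_key, k, seed=None):
    """Order the frame by the stratifying variables, then take a systematic sample."""
    rng = random.Random(seed)
    ordered = sorted(units, key=sort_key)   # ordering spreads the sample across groups
    start = rng.randrange(k)                # random start among the first k units
    return ordered[start::k]

# Hypothetical frame: 3 regions x 4 postcode sectors, each with a socio-economic score
sectors = [
    {"region": region, "score": score}
    for region in ("North", "South", "West")
    for score in (10, 20, 30, 40)
]
sample = implicit_stratified_sample(
    sectors, sort_key=lambda s: (s["region"], s["score"]), k=4, seed=6
)
print(sample)
```

Because the list is ordered by region before the systematic draw, each region in this toy frame contributes exactly one sector, without regions ever being declared as explicit strata.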

25. BHPS: Schema for Sampling

26. Survey Weights

27. Survey Weights: Definitions initial weight • equal to the inverse of the inclusion probability of the unit final weight • initial weight adjusted for nonresponse, poststratification and/or benchmarking • interpreted as the number of units in the population that the sample unit represents
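A small numerical sketch of the two definitions. It assumes a single stratum sampled by simple random sampling and a simple ratio adjustment for nonresponse. Poststratification and benchmarking are not shown, and all figures are hypothetical.

```python
def initial_weight(inclusion_probability):
    """Initial (design) weight: inverse of the unit's inclusion probability."""
    return 1.0 / inclusion_probability

def nonresponse_adjusted_weight(initial, n_sampled, n_responded):
    """Spread the weight of nonrespondents over the respondents (ratio adjustment)."""
    return initial * n_sampled / n_responded

# Hypothetical stratum: 1000 units, 50 sampled by SRS, 40 respond
pi = 50 / 1000                                   # inclusion probability
w_initial = initial_weight(pi)                   # each sampled unit represents about 20 units
w_final = nonresponse_adjusted_weight(w_initial, n_sampled=50, n_responded=40)
estimated_population = w_final * 40              # final weights summed over respondents
print(w_initial, w_final, estimated_population)
```

Summing the final weights over the respondents recovers the stratum size, which matches the interpretation of a weight as the number of population units a sample unit represents.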

28. Interpretation • the survey weight for a particular sample unit is the number of units in the population that the unit represents

29. Effect of the Weights • Example: age distribution, Survey of Youth in Custody

30. Unweighted Histogram

31. Weighted Histogram

32. Weighted versus Unweighted

33. Observations • the histograms are broadly similar in shape, but the differences between them are significant • the design probably utilized approximate proportional allocation • the distribution of ages in the unweighted case tends to be shifted to the right when compared to the weighted case • older ages are over-represented in the dataset

34. Survey Data Analysis Issues and Simple Examples from Graphical Methods

35. Basic Problem in Survey Data Analysis

36. Issues iid (independent and identically distributed) assumption • the assumption does not hold in complex surveys because of correlations induced by the sampling design or because of the population structure • blindly applying standard programs to the analysis can lead to incorrect results

37. Example: Rank Correlation Coefficient Pay equity survey dispute: Canada Post and PSAC • two job evaluations on the same set of people (and same set of information) carried out in 1987 and 1993 • rank correlation between the two sets of job values obtained through the evaluations was 0.539 • assumption to obtain a valid estimate of correlation: pairs of observations are iid

38. Scatterplot of Evaluations • Rank correlation is 0.539

39. A Stratified Design with Distinct Differences Between Strata • the pay level increases with each pay category (four in number) • the job value also generally increases with each pay category • therefore the observations are not iid

40. Scatterplot by Pay Category

41. Correlations within Level Correlations within each pay level • Level 2: –0.293 • Level 3: –0.010 • Level 4: 0.317 • Level 5: 0.496 Only Level 4 is significantly different from 0
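The contrast between a pooled rank correlation and the within-level correlations can be reproduced with toy data. The sketch below uses invented values for two levels (not the Canada Post data) and a hand-written Spearman coefficient, computed as the Pearson correlation of average ranks.

```python
def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical job values from two evaluations, in two pay levels
low_x, low_y = [1, 2, 3], [3, 2, 1]            # perfectly reversed within the level
high_x, high_y = [11, 12, 13], [13, 12, 11]    # perfectly reversed within the level

within_low = spearman(low_x, low_y)
within_high = spearman(high_x, high_y)
pooled = spearman(low_x + high_x, low_y + high_y)
print(within_low, within_high, pooled)
```

In this toy example both within-level correlations are negative, yet the pooled coefficient is clearly positive, because the difference between levels dominates the ranking once the strata are combined.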

42. Graphical Displays first rule of data analysis • always try to plot the data to get some initial insights into the analysis common tools • histograms • bar graphs • scatterplots

43. Histograms unweighted • height of the bar in the ith class is proportional to the number in the class weighted • height of the bar in the ith class is proportional to the sum of the weights in the class
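The two definitions differ only in what is accumulated per class, as the following sketch shows. The ages, weights, and class boundaries are hypothetical and the function name `histogram_heights` is an assumption for the example.

```python
def histogram_heights(values, weights, edges):
    """Return per-class counts (unweighted) and per-class sums of weights (weighted)."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    weight_sums = [0.0] * n_bins
    for v, w in zip(values, weights):
        for b in range(n_bins):
            if edges[b] <= v < edges[b + 1]:   # classes are [lower, upper)
                counts[b] += 1                 # unweighted: number in the class
                weight_sums[b] += w            # weighted: sum of weights in the class
                break
    return counts, weight_sums

# Hypothetical sample in which older respondents carry smaller weights
ages = [15, 16, 16, 17, 18, 19, 19, 20]
weights = [30, 30, 30, 30, 10, 10, 10, 10]
edges = [15, 17, 19, 21]
counts, weight_sums = histogram_heights(ages, weights, edges)
print(counts)        # heights of the unweighted histogram
print(weight_sums)   # heights of the weighted histogram
```

With these made-up numbers the unweighted heights are 3, 2, 3, while the weighted heights are 90, 40, 30: the oldest class looks as large as the youngest in the raw sample but much smaller once the weights are applied.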

44. Body Mass Index measured by • weight in kilograms divided by square of height in meters • 7.0 < BMI < 45.0 • BMI < 20: health problems such as eating disorders • BMI > 27: health problems such as hypertension and coronary heart disease

45. BMI: Women

46. BMI: Men

47. BMI: Comparisons

48. Bar Graphs Same principle as histograms unweighted • size of the ith bar is proportional to the number in the class weighted • size of the ith bar is proportional to the sum of the weights in the class

49. Ontario Health Survey