Continuous Surveys: Statistical Challenges and Opportunities

Carl Schmertmann

Center for Demography & Population Health

Florida State University

schmertmann@fsu.edu

Outline
  • CHALLENGES (long)
    • Increased Temporal Complexity
    • Increased Sampling Error
    • New Weighting Problems
  • OPPORTUNITIES (brief, but important)
Sample Size Comparison
  • US CENSUS LONG FORM: 17% / decade
  • ACS ROLLING SURVEY: 24% / decade
    • 2 per 1000 Households / month
    • 24 per 1000 Households / year
    • 240 per 1000 Households / decade
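
The arithmetic behind this comparison can be checked with a short sketch. It assumes, as the figures above imply, that the monthly samples do not overlap, so the monthly rate simply accumulates. The variable names are illustrative and not part of the original slides.

```python
# Cumulative ACS sampling rate, assuming non-overlapping monthly samples.
monthly_per_1000 = 2                      # households sampled per 1000 each month (from the slide)
yearly_per_1000 = monthly_per_1000 * 12   # 12 monthly samples per year
decade_per_1000 = yearly_per_1000 * 10    # 10 years per decade

acs_decade_share = decade_per_1000 / 1000  # fraction of households sampled per decade
long_form_decade_share = 0.17              # census long form (from the slide)

print(f"ACS: {yearly_per_1000} per 1000 per year, {decade_per_1000} per 1000 per decade")
print(f"ACS share per decade: {acs_decade_share:.0%}")
print(f"Long form share per decade: {long_form_decade_share:.0%}")
```

The decade totals are similar, but the ACS sample is spread over 120 months rather than collected on one day.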
1. Temporal Complexity

What is the Population?
  • 1-Day Census
    • Population membership is binary: {0,1}
    • Each individual is IN or OUT
  • Continuous Survey
    • Population membership is fuzzy: a continuum from 0 to 1
    • Individuals can be MORE IN (more person-days of residence) or MORE OUT (fewer)

[Chart: Residents (in 000s)]

Census Population = 12 000 (83% Type A)


An ACS ‘Data Sandwich’ includes samples from all months


ACS samples from 184 000 person-months

Avg Population: 15 333 (65% Type A)
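
The gap between the census-day count and the person-month average can be reproduced with a minimal sketch. The totals come from the slides. The level of 10 000 year-round Type A residents is an assumption inferred from "83% of 12 000", and the variable names are illustrative.

```python
# Census-day count versus person-month average, using the slide's totals.
census_pop = 12_000        # residents on Census Day (from the slide)
person_months = 184_000    # resident person-months over the year (from the slide)
months = 12

# Assumption: Type A residents are present all year at a constant level,
# inferred from 83% of the census-day population.
type_a_residents = 10_000
type_a_person_months = type_a_residents * months

average_pop = person_months / months
census_share_a = type_a_residents / census_pop
average_share_a = type_a_person_months / person_months

print(f"Average population: {average_pop:,.0f}")
print(f"Type A share on Census Day: {census_share_a:.0%}")
print(f"Type A share of person-months: {average_share_a:.0%}")
```

Under that assumption the remaining 64 000 person-months belong to seasonal (Type B) residents, most of whom are absent on Census Day.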

Characteristics change over the Sampling Period
  • Persons
    • Age
    • Marital Status
    • Employment
    • Education
  • Housing Units
    • Vacancy
    • Number of Occupants
    • $ Value

Rolling ‘Population’

The population formed by sandwiching monthly samples is the average frame of a film, not a snapshot.

Individuals and housing units with changing characteristics are sampled and caught ‘in motion’.

Reference Period Problems

Many ‘long-form’ questions refer to retrospective periods:

  • Income in last 12 months
  • Place of residence 1 year ago
  • Child born in last 12 months?
  • Etc.

Time Reference Example
  • ‘2004’ data from 12 monthly samples taken in Jan04…Dec04
  • Question on fertility in the 12 months prior to the survey, so there are 12 overlapping periods in ‘2004’ data
    • ‘Jan04’ question covers Jan03-Jan04
    • ‘Feb04’ question covers Feb03-Feb04
    • etc.
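
The overlap among the 12 reference periods can be counted with a small sketch. It assumes that a response given in month m covers the 12 calendar months before m. The helper `months_between` is hypothetical and introduced only for this illustration.

```python
from collections import Counter

def months_between(start, count):
    """Return `count` consecutive (year, month) pairs beginning at `start`."""
    year, month = start
    out = []
    for _ in range(count):
        out.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out

# Assumption: a survey answered in month m covers the 12 calendar months before m.
coverage = Counter()
for survey_month in range(1, 13):            # Jan04 ... Dec04
    first_covered = (2003, survey_month)     # same calendar month, one year earlier
    for ym in months_between(first_covered, 12):
        coverage[ym] += 1

# Number of monthly samples whose reference period includes each calendar month.
counts = [coverage[ym] for ym in sorted(coverage)]
print(counts)
```

Only one calendar month is covered by all 12 samples, while the months at either end of the 23-month span are covered by a single sample each.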

Reference periods in ‘2004’ data, by survey month (x = month covered by the question, ● = survey month; columns run from Jan 03 to Dec 04):

Jan 2004   x x x x x x x x x x x x ● . . . . . . . . . . .
Feb 2004   . x x x x x x x x x x x x ● . . . . . . . . . .
Mar 2004   . . x x x x x x x x x x x x ● . . . . . . . . .
Apr 2004   . . . x x x x x x x x x x x x ● . . . . . . . .
May 2004   . . . . x x x x x x x x x x x x ● . . . . . . .
Jun 2004   . . . . . x x x x x x x x x x x x ● . . . . . .
Jul 2004   . . . . . . x x x x x x x x x x x x ● . . . . .
Aug 2004   . . . . . . . x x x x x x x x x x x x ● . . . .
Sep 2004   . . . . . . . . x x x x x x x x x x x x ● . . .
Oct 2004   . . . . . . . . . x x x x x x x x x x x x ● . .
Nov 2004   . . . . . . . . . . x x x x x x x x x x x x ● .
Dec 2004   . . . . . . . . . . . x x x x x x x x x x x x ●

Number of monthly samples covering each month, Jan 03 through Nov 04:

1 2 3 4 5 6 7 8 9 10 11 12 11 10 9 8 7 6 5 4 3 2 1


Reference Periods for ‘Last 12 Month’ Questions in 1-year ACS Datasets

Temporal Issues Summarized

‘Data Sandwiches’ contain:

  • New meaning of ‘population’
  • Units that change over sampling period (moving targets)
  • Multiple reference periods for retrospective questions

2. Sampling Error

Small Samples

More overall data from continuous sampling, but 1-, 3-, or 5-Year Sandwiches have smaller samples than the single, decennial long form survey, which means more sampling error in published data.
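
A rough sense of the penalty comes from the rule that standard error scales with one over the square root of the sample size. The sketch below applies that rule to the sampling fractions quoted earlier (17% per decade for the long form, 24 per 1000 households per year for the ACS). It assumes simple random sampling and ignores design effects, nonresponse, and finite-population corrections, so the ratios are indicative only.

```python
import math

LONG_FORM_FRACTION = 0.17        # share of households per decade (from the slides)
ACS_FRACTION_PER_YEAR = 0.024    # 24 per 1000 households per year (from the slides)

def se_ratio(years):
    """Approximate ratio of ACS standard error to long-form standard error.

    Assumes simple random sampling, so standard error scales with 1/sqrt(n).
    """
    acs_fraction = ACS_FRACTION_PER_YEAR * years
    return math.sqrt(LONG_FORM_FRACTION / acs_fraction)

for years in (1, 3, 5):
    print(f"{years}-year sandwich: about {se_ratio(years):.2f} times the long-form standard error")
```

Even the 5-year sandwich stays above the long-form error under these assumptions, because 12% of households is still fewer than 17%.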

Small Samples

The problem is especially acute for

  • small areas
  • narrow age groups
  • rare subpopulations

e.g., How many unmarried teen births per year in Sevier County, Tennessee?

ACS 2006-2008 says 0 ± 161
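
To see what "0 ± 161" implies, the margin can be converted to a standard error and an interval. The sketch assumes the published margin is a 90% margin of error, the usual ACS convention, so the divisor is 1.645. Clipping the lower bound at zero is an illustrative choice, since a count cannot be negative.

```python
Z_90 = 1.645     # assumption: the published margin is a 90% margin of error

estimate = 0     # unmarried teen births per year (from the slide)
margin = 161     # published margin of error (from the slide)

standard_error = margin / Z_90
lower = max(0, estimate - margin)   # a count cannot be negative
upper = estimate + margin

print(f"Standard error: about {standard_error:.0f}")
print(f"Interval: {lower} to {upper}")
```

The interval is wider than any plausible planning tolerance for a county-level count of this kind.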


C24020. SEX BY OCCUPATION – Key West, Florida

Data Set: 2006-2008 American Community Survey 3-Year Estimates (http://tinyurl.com/acs-alap)

…etc

Temporal Instability

Teenage Birth Rate in a County

Unfortunate Result

Aggregating over 1+ years of surveys produces datasets that are often

  • Unfamiliar and difficult to understand
  • Still too noisy to be useful for planners and researchers

3. Weighting for Non-Response

Weighting

Weighting from Respondents → Total Population requires Population Control Totals:

(Place x Age x Sex x Race x Ethnicity x …)
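
The basic mechanics of weighting to control totals can be sketched as follows. This is a minimal post-stratification example, not the ACS weighting procedure. The respondents, the cells, and the control totals are all hypothetical, and only two of the dimensions listed above are used.

```python
from collections import Counter

# Hypothetical respondents, each labelled with its (place, sex) cell.
respondents = [
    ("County X", "F"), ("County X", "F"), ("County X", "F"),
    ("County X", "M"), ("County X", "M"),
]

# Hypothetical population control totals for the same cells.
control_totals = {("County X", "F"): 600, ("County X", "M"): 400}

# Each respondent in a cell gets weight = control total / respondents in that cell.
respondent_counts = Counter(respondents)
weights = {cell: control_totals[cell] / n for cell, n in respondent_counts.items()}

# By construction, the weighted respondents add up to the control totals.
weighted_total = sum(weights[cell] for cell in respondents)

print(weights)
print(f"Weighted total: {weighted_total:.0f}")
```

The weighted sample can only be as accurate as the control totals it is forced to match.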

Decennial Long Form Sample
  • Control Totals
    • Measured from a simultaneous enumeration of the population (Sample & Census on same day)
    • Only 1 set needed
    • Sample and Population defined identically (residence on Census Day)

Continuous Survey
  • Control Totals
    • Must be estimated (no simultaneous census)
    • Many sets needed (2006, 2007, 2006-8, 2007-9, 2008-12, …)
    • Sample and Population defined differently

ACS Control Totals (Persons)
  • ACS responses are weighted to match official intercensal estimates by
    • Year (1 July midpoint snapshot)
    • County (sometimes city)
    • Age
    • Race
    • Sex
    • Hispanic Origin (yes/no)

ACS Control Totals (Persons)

Potential Errors

  • Estimates are Wrong:
    • Unanticipated internal migration
    • Unanticipated international migration
    • etc
  • Population Definitions don’t match
    • Seasonal fluctuations
    • Different race/ethnic categories


Census Pop = 12 000 (83% Type A)

Average Pop = 15 333 (65% Type A)

If every year looks like this… Intercensal Estimate = 12 000 (83% Type A)

Weighting Error Example

ACS weighting to estimates produces:

  • Population too small (Census < Avg Pop)
  • Population too “A” (seasonal Bs missed)
  • Overestimates of variables positively correlated with A (e.g., % with college education)
  • Underestimates of variables negatively correlated with A (e.g., % single-parent families)
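
The direction of these biases follows from a simple mixture calculation. The sketch below uses the two Type A shares from the slides (83% in the control totals, 65% in the person-month population). The group-specific rates are hypothetical values chosen only to illustrate the sign of the error.

```python
def population_rate(share_a, rate_a, rate_b):
    """Overall rate for a population that is `share_a` Type A and the rest Type B."""
    return share_a * rate_a + (1 - share_a) * rate_b

CONTROL_SHARE_A = 0.83   # Type A share implied by the intercensal estimate (from the slides)
TRUE_SHARE_A = 0.65      # Type A share of person-months (from the slides)

# Hypothetical group rates: college education is more common among A,
# single-parent families are more common among B.
college = {"a": 0.40, "b": 0.20}
single_parent = {"a": 0.10, "b": 0.25}

for label, rates in (("college education", college), ("single-parent families", single_parent)):
    weighted = population_rate(CONTROL_SHARE_A, rates["a"], rates["b"])
    actual = population_rate(TRUE_SHARE_A, rates["a"], rates["b"])
    print(f"{label}: weighted {weighted:.1%} vs person-month {actual:.1%}")
```

Whatever rates are chosen, a variable that is more common among A is pushed up and one that is more common among B is pushed down.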

4. Opportunities

ACS table cells = millions of “seemingly unrelated” maximum likelihood estimates

Statistical models that exploit likely cell relationships (over times, ages, sexes, places, variables …) could, in principle

  • Retain frequency & recency
  • Reduce variance of estimates
  • Recover familiar measures
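
As one very simple illustration of borrowing strength across related cells, the sketch below pools several noisy estimates with inverse-variance weights. The slides do not specify a model, so this is not the author's method. It assumes the cells estimate the same underlying quantity with independent errors, and the numbers are hypothetical.

```python
def combine(estimates, variances):
    """Inverse-variance weighted average of independent estimates, and its variance."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    pooled = sum(p * e for p, e in zip(precisions, estimates)) / total_precision
    return pooled, 1.0 / total_precision

# Hypothetical estimates of the same rate from three related cells
# (for example, three adjacent periods), each with sampling variance 100.
estimates = [30.0, 50.0, 40.0]
variances = [100.0, 100.0, 100.0]

pooled, pooled_variance = combine(estimates, variances)
print(f"Pooled estimate: {pooled:.1f}, variance: {pooled_variance:.1f}")
```

The pooled variance is lower than that of any single cell, at the cost of assuming the cells really are alike. A practical model would have to balance that assumption against the loss of detail.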

Conclusion

CONTINUOUS SURVEYS like ACS create

  • Big Problems for producers and users
    • Unfamiliar, temporally complex data
    • Potentially high sampling error
    • Technical problems with weighting
  • Big Opportunities, IF we can develop appropriate statistical models and practices


Thanks!

¡Gracias!

Obrigado!
