Probability Survey Design: An Overview - PowerPoint PPT Presentation

albert
probability survey design an overview l.
Skip this Video
Loading SlideShow in 5 Seconds..
Probability Survey Design: An Overview PowerPoint Presentation
Download Presentation
Probability Survey Design: An Overview

play fullscreen
1 / 48
Download Presentation
Probability Survey Design: An Overview
765 Views
Download Presentation

Probability Survey Design: An Overview

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Probability Survey Design:An Overview Anthony (Tony) R. Olsen USEPA NHEERL Western Ecology Division Email: olsen.tony@epa.gov Voice: 541 754-4790 1

  2. Why Monitoring Programs Fail • Objectives for monitoring are not clearly, precisely stated and understood • Monitoring measurement protocols, survey design, and statistical analysis become scientifically out-of-date • Monitoring results are not directly tied to management decision making • Results are not timely nor communicated to key audiences 2

  3. Types of Statistical Designs • Experimental designs • Random allocation of treatments • Observational studies • Factor space designs • Gradient studies • Available sites • Survey designs • Census • Probability survey • Response designs needed in all 3

  4. Survey DesignResponse Design • Survey design is process of selecting sites at which a response will be determined • Probability model for inference is based on the randomized selection process • Has a spatial component and may have a time component • Response design is process of obtaining a response at a site: • A single index period during a year • Multiple periods during year: monthly, quarterly 4

  5. The Response Design:Index Period • Time period within year selected for measurement (ecologically based) • Measurements may be taken more than once during index period with response design giving protocol for obtaining single value for indicator • Indicator variability within index period contributes to non-survey sampling error 5

  6. Ecological Resource Typesfrom Survey Design Perspective • Finite population of discrete entities • 0-dimensional • All small lakes in the 48 conterminous states • All 8-digit USGS CU in the 48 conterminous states • Continuous areal population • 2-dimensional • All forest land • All coastal estuarine resources • Continuous linear network population • 1-dimensional embedded in 2-dimensions • All perennial wadable streams 6

  7. Basic Spatial Survey Designs • Simple Random Sample • Systematic Sample • Regular grid • Regular spacing on linear resource • Spatially Balanced Sample • Combination of simple random and systematic • Guarantees all possible samples are distributed across the resource (target population) • Generalized Random Tessellation Stratified (GRTS) design 7

  8. Why aren’t Basic Designs Sufficient? • Monitoring objectives may include requirements that basic designs can’t address efficiently • Estimates for particular subpopulations requires greater sampling effort • Administrative restrictions and operational costs • Ecological resource occurrence in study region makes basic designs inefficient • Resource is known to be restricted to particular habitats 8

  9. Stratification: Reasons to Use • Administrative or operational convenience • Regions or states need to be operationally independent • Particular portions of the target population require different survey designs • Design for extensive wetlands (Everglades) may be different from praire pothole wetlands • Increase precision by constructing strata that are homogeneous 9

  10. More complex Survey Designs • Spatial strata random sample • Don’t have a list frame • Alternative way to spatially balance sample • Unequal probability sample • Alternative to stratification • Requires auxiliary information • Cluster sample • Can decrease field operation • Multiple stage sample • Way to decrease cost of sample frame construction • Adaptive Sampling 10

  11. Survey Design Key Components • Objectives stated precisely and quantitatively • Target population explicitly, precisely defined • Sampling frame constructed that represents the target population • Decision on which survey design meets needs • Selection of sites using survey design • Statistical analysis match survey design 11

  12. Study Objectives • Study objectives determine the monitoring design • Usual to have multiple objectives • Objectives compete for samples • Precise statements are required • Objectives must be prioritized • Target population and subpopulations are determined by objectives 12

  13. What is a Target Population? • Target population denotes the ecological resource about which information is wanted. • Requires a clear, precise definition • Must be understandable to users • Field crews must be able to determine if a particular site is included • More difficult to define than most expect. • Includes definition of what the elements are that make up the target population 13

  14. Subpopulations and Domains • Subsets of the target population that are of particular interest • Examples for aquatic ecosystems • Ecoregions, biogeographic regions • All lentic resources in region with area < 100 ha • All lotic resources with with Strahler order < 4 • Tidal creeks versus open water estuarine areas • All lotic resources with < 20% riparian canopy cover • All 5-th field HUCs with >10 NWI wetland polygons • All 6-th field HUCs with >25% Federal land ownership 14

  15. Subpopulations: Impact on Design • Objectives identify critical subpopulations with expected sample sizes: Domains • Survey design addresses domain sample size requirements • Explicitly using stratification, unequal weighting • Implicitly when other requirements provide sufficient sample sizes • Other subpopulations can not be defined prior to sample selection 15

  16. Generalized Random Tessellation Stratified Designs • Spatially balances sample across the resource (improved precision) • Overview of process for areal resource • Randomly place hierarchical grid over area • Randomly select point within each grid cell • Select grid such that expected number of sample points in cell < 1 & expect most random points in cells to be in resource • Hierarchically randomize points to place them on line, assigning each point unit length • Select a systematic sample from the points using a random start 16

  17. GRTS Survey Design Options • Multiple density categories to allocate samples: unequal probability • Nested subsamples for measuring additional indicators or duplicate samples • Panels for monitoring over time • Oversample selection to address non-target and inaccessible sites • Special study areas with study-wide base • Explicit stratification • Incorporate multiple stage sampling 17

  18. Panels and Oversample • Panel: a collection of sites that will have the same revisit schedule over time • Basic design is single panel • 5-year rotating panel: panel 1 visited in year 1, 6, 11,etc; panel 2 visited in year 2, 7, 12, etc; … • More complex possible to balance priority of status estimation versus trend estimation • Oversample: design adjustment for • expected non-target sites • landowner access denial sites • physically inaccessible sites 18

  19. Example Designs • Everglades marshes and canals • Streams and rivers in 12 western states • Headwater watersheds in coastal plains of Mid-Atlantic • Prairie pothole wetlands in North Dakota and South Dakota • 6-th field hydrologic units in Pacific Northwest • FIA and FHM monitoring of forests • Amphibians in Olympic National Park and Southeast Oregon • Riverine wetlands associated with the Great Lakes • All Lakes >1 ha for fish tissue contaminants 19

  20. 20

  21. 22

  22. 24

  23. 26

  24. Monitoring Design Information • WWW.EPA.GOV/WED click on EMAP Monitoring Design and Analysis • Overview of survey design • Bibliography • Design and analysis information • EMAP Design Team • Works with States, Tribal Nations, EPA Regions, Other Federal Agencies • Members from ORD ecology divisions, NERL, Office of Water • Contact: Web page above 27

  25. Multiple Density Nested Random Tessellation Stratified Survey Design • Enables design-based estimators and variance estimators • Precise control over inclusion probabilities • Element & region variable probability assignment • Joint inclusion probability can be determined • Controls sample and subsample spatial balance • Nested subsamples easily selected • Unified theory for 0-, 1-, and 2-dimensional resources such as lakes, streams, and coastal waters 28

  26. Spatially-Structured “List” • Finite resource: 0-dimensional • Assign each unit to grid cell using GIS • Randomly order units within a single cell • Apply hierarchical randomization to cells • Linear resource: 1-dimensional, can be network • Clip linear resource to cell boundaries using GIS • Divide into segments • Randomly order units within a single cell • Apply hierarchical randomization to cells • Extensive resource: 2-dimensional • Select one point at random within each cell • Apply hierarchical randomization to cells 29

  27. Expansion of hexagon hierarchy to three levels 200 500 600 100 300 400 000 140 150 120 130 100 110 160 121 126 122 120 125 123 124 • Grid Cell Shape • hexagon • Square • Diamond • Grid Cell Address • General Balanced Ternary • Peano Key • Morton Key Hierarchical Randomization based on Hexagons

  28. Systematic Sample with Random Start from Ordered “List” Randomized List: 2 4 0 3 6 1 5 1 4 2 0 6 5 3 1 5 0 4 2 6 3 4 2 6 0 5 1 3 4 2 5 3 Assign weights: 2 4 0 3 6 1 5 1 4 2 0 6 5 3 1 5 0 4 2 6 3 4 2 6 0 5 1 3 4 2 5 3 Systematic Sample Random Start 2 4 0 3 6 1 5 1 4 2 0 6 5 3 1 5 0 4 2 6 3 4 2 6 0 5 1 3 4 2 5 3 Sample: 42 46 24 20 23 50 53 36 31 Translate location on line to lake identifier, lat/long location on stream, lat/long location of point

  29. Stratification and Unequal Probability Selection • Stratification: reasons • Improve precision of results • Operational/administrative efficiency • Different subpopulations require different survey designs • Unequal weighting • Allocate sample to subpopulations • Improve precision of results • Based on auxiliary information 33

  30. Status, Change, Trend • Status • How many stream km in Region III meet their designated use? • How many stream km have degraded riparian zones? • Change/Trends • Has the status of the streams in Region III changed between two time periods? • What is the trend over the last 10 years in the percent of stream km in Region III that meet their designated use? • What is the trend in nitrate concentration on the Santiam River at its confluence with the Willamette River. 34

  31. Highland Stream Conditions Biological Quality % of Stream Length Ranking of Stressors

  32. Monitoring Design and Analysis: Key Steps • Specify objectives and scope • Select sites to sample • Gain site access • Measurement protocols • Determine indicator based on measurements • Inference from the sample to entire aquatic resource • Communication of results 36

  33. Target Population: Lakes • All lakes (and reservoirs) within the conterminous U.S. excluding the Laurentian Great Lakes and the Great Salt Lake with permanent fish population. • A lake is defined as a permanent body of water of at least one hectare in surface area with a minimum of 1,000 sq m of open (unvegetated) water, and a maximum depth of one meter or more. 37

  34. Aquatic/riparian Resource • Define geographic region of interest - e.g. a state or province. • Aquatic/riparian components • Stream channel: habitat and water column • Stream near-channel: riparian • Stream upland area: terrestrial influence • Raises question of what constitutes the elements of the target population 38

  35. Alternatives for Defining Elements of Stream Target Population • All watersheds defined by a point anywhere on stream network (Point) • All watersheds defined by dividing the landscape into hydrologic units at a specified scale (HUC) • All watersheds defined by stream segments of network (Segment) 39

  36. Implications of choice • How many (what proportion of) stream km support aquatic life use? • How many (what proportion of) watersheds in region have greater than 50% of stream length supporting aquatic life use? • How many (what proportion of) stream segments in region support aquatic life use? 40

  37. State-wide Monitoring:When Multiple Years Required • Rotating basins • Each year monitor subset of state • Census • Probability Survey • Complete all subsets in 5-years • State-wide • Each year sample over entire state • Complete all sites to be sampled in 5-years • Census: partition all sites into 5 subsets • Probability survey over time 41

  38. Sampling Frame: Streams • GIS coverage that includes all streams in the target population • River Reach File Version 3 (RF3); NHD • Quality of RF3 as sampling frame • Excludes some channels that appear on 1:24,000 USGS maps and not on maps • Includes some channels/features that are not streams • Impacts survey design • Limited information available in RF3 to help define design for domains • Other GIS coverages can add attributes required 42

  39. Sampling Frame: Lakes • GIS coverage of lakes and reservoirs • RF3; NHD; state lists/coverages • Lakes: two alternatives for elements • Each lake is element: lake viewed as a point • All points in all lakes are elements: area view • Quality of RF3 as sampling frame • Excludes some lakes and reservoirs • Includes features that are not a lake or reservoir • Target population may be more restrictive than all of RF3 43

  40. Sampling Frame: Coastal Waters • GIS coverage of coastal waters in study • Estuary open water • Tidal streams • Near-shore waters • Elements are all point locations within target population 44

  41. RF3 Sample Frame: Lakes

  42. Sample Selected: Lakes

  43. Delaware Reporting Traditional 305(b) Report Chemical Evidence Aggregation of Existing Data New Report Chemical Evidence Probability Survey New Report Biological Evidence Probability Survey 47

  44. Survey DesignImproved Estimates of Population SizeOregon Coastal Coho Salmon • Historic long term monitoring of spawning suggests minimal problem • Historic survey biased • Salmon populations continue to decline • Survey results more accurately reflect populations • State program modified based on probability design Estimated No. Fish per mile