
A. Townsend Peterson Natural History Museum The University of Kansas

The Critical Role of Primary Occurrence Data in Biodiversity Modeling: Keeping Data and Products Alive


Presentation Transcript


  1. The Critical Role of Primary Occurrence Data in Biodiversity Modeling: Keeping Data and Products Alive. A. Townsend Peterson, Natural History Museum, The University of Kansas

  2. Keeping Biodiversity Information Alive I • Biodiversity information exists in two forms: • primary data - specimens and observations, studies of the organism, etc. • secondary information - summaries, range maps, field guides, county records

  3. Secondary Information Primary data

  4. Keeping Biodiversity Information Alive II • Basing analyses and prioritizations on secondary products is convenient, but ...

  5. Keeping Biodiversity Information Alive II • Basing analyses and prioritizations on secondary products is convenient, but ... • Severs the vital connection between product and data - product begins to degrade

  6. Data and Product Degradation: The statement is made ... “Species X is present at Site Y.” Almost immediately, new information becomes available, taxonomies are redone, landscapes and patterns of land use change, and data become less meaningful. Hence, data quality and meaning, as well as quality and meaning of products based on the data, begin to degrade.

  7. Keeping Biodiversity Information Alive II • Basing analyses and prioritizations on secondary products is convenient, but ... • Severs the vital connection between product and data - product begins to degrade • Does not allow product to grow, improve, and evolve with more and better data

  8. Product Evolution and Improvement • Most biodiversity data sets grow by 1-5% yearly... data (and products based on the data) get better through time • More data sets are computerized and should be available each year • Improvement in technology and data availability for land-use evaluation permits better contextual understanding

  9. Keeping Biodiversity Information Alive II • Basing analyses and prioritizations on secondary products is convenient, but ... • Severs the vital connection between product and data - product begins to degrade • Does not allow product to grow, improve, and evolve with more and better data • Loses opportunity to take advantage of new, quantitative approaches for distributional modeling and synthetic analysis

  10. Keeping Biodiversity Information Alive III • Using primary information, especially point occurrence data, solves many problems: • Applicable to all regions and all taxa • Products don’t get old; rather, they update and evolve with improved information - product gets better • Products not limited to original planned purpose for which they were developed

  11. Keeping Biodiversity Information Alive IV • Primary point occurrence data offer several distinct advantages... • Species’ occurrences are referenced to particular points in space, permitting characterization of ecological needs • Species’ occurrences are referenced to particular points in time, permitting evaluation of temporal changes • Point occurrence data lend themselves well to statistical and other quantitative approaches • Products can be developed at any scale, from meters to continents • Products are verifiable, making quality assessment possible • No subjective, intermediate interpretive steps involved

  12. Mexican Bird Collections: British Museum, Paris, Field Museum, KU Natural History Museum

  13. “World Museum” Mexican Birds

  14. Example North American Data: The U.S. Breeding Bird Survey (soon to be added to the Species Analyst!)

  15. New Facility for Sharing Biodiversity Data • The Species Analyst • North American Biodiversity Information Network • A distributed database for biodiversity information, linking institutions and serving data to all potential users • http://chipotle.nhm.ukans.edu/nabin/

  16. The Challenge of Inferring Geographic Distributions 1. Existing sampling is incomplete 2. Gaps in known distribution can represent real absence or simply nondetection 3. Without more data, distinguishing between these two possibilities is not possible 4. Inference beyond the limits of the actual data becomes necessary

  17. Two Types of Error in Distributional Predictions • Actual geographic distribution

  18. Two Types of Error in Distributional Predictions • Predicted geographic distribution

  19. Two Types of Error in Distributional Predictions • Actual geographic distribution • Predicted geographic distribution

  20. Two Types of Error in Distributional Predictions • Actual geographic distribution • Predicted geographic distribution • Overprediction, or Commission • Underprediction, or Omission

  21. Two Types of Error in Distributional Predictions Objective: To Minimize Both Forms of Error

  22. Two Types of Error in Distributional Predictions Objective: To Minimize Both Forms of Error
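The two error types on these slides can be made concrete by treating the actual and predicted distributions as sets of map grid cells. The sketch below is illustrative only: the cell identifiers and the function name `error_rates` are assumptions, and normalizing omission by the actual range and commission by the predicted area is one common convention, not something stated in the slides.

```python
def error_rates(actual, predicted):
    """Return (omission, commission) for two sets of grid-cell ids.

    omission   = share of actually occupied cells the model missed
    commission = share of predicted cells that are not actually occupied
    """
    actual, predicted = set(actual), set(predicted)
    missed = actual - predicted          # underprediction
    extra = predicted - actual           # overprediction
    omission = len(missed) / len(actual) if actual else 0.0
    commission = len(extra) / len(predicted) if predicted else 0.0
    return omission, commission


# Hypothetical toy map: cells are labelled by integers.
actual_cells = {1, 2, 3, 4, 5}
predicted_cells = {3, 4, 5, 6, 7, 8}
omission, commission = error_rates(actual_cells, predicted_cells)
print(f"omission={omission:.2f}, commission={commission:.2f}")
```

Keeping the two rates separate matters because a model can drive either one to zero trivially (predict everywhere, or predict nowhere); the stated objective is to keep both low at once.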

  23. Predictive Methodologies I • Georeferenced occurrence points • Electronic maps ... geographic coverages ... summarizing environmental dimensions important to species, such as temperature, rainfall, topography, soils, geology, etc. • Use nonrandom associations between points and coverages to build a model of a species’ ecological niche • Project model back onto geography to predict distribution
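The first two ingredients on this slide meet when each georeferenced point is matched to the value of a coverage at that location. A minimal sketch of that lookup follows, assuming the coverage is a regular latitude/longitude grid stored as a list of rows with the origin at the upper-left corner; the function name, grid layout, and all values are hypothetical.

```python
def sample_coverage(grid, north, west, cell_size, lat, lon):
    """Return the grid value under (lat, lon), or None if off the grid.

    grid       : list of rows; row 0 is the northernmost
    north/west : coordinates of the grid's upper-left corner (degrees)
    cell_size  : cell width and height (degrees)
    """
    row = int((north - lat) // cell_size)
    col = int((lon - west) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 3x3 rainfall coverage (mm/year) with 1-degree cells.
rainfall = [
    [400, 600, 800],
    [500, 900, 1200],
    [700, 1100, 1500],
]
points = [(19.5, -104.5), (18.2, -103.1), (17.9, -102.4)]  # (lat, lon)
values = [sample_coverage(rainfall, 20.0, -105.0, 1.0, lat, lon)
          for lat, lon in points]
print(values)
```

Repeating the lookup for every coverage gives one row of environmental values per occurrence point, which is the table a niche-modeling algorithm works from.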

  24. Predictive Methodologies II • Subset points into a training dataset (for building models) and a test dataset (for assessing their effectiveness) • Apply an algorithm to training data: BIOCLIM, logistic regression, discriminant function analysis, distance measures, etc. • Assess effectiveness of model, asking whether observed omission and commission are significantly less than random
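The subsetting step can be sketched as a seeded random split. The 50/50 default, the seed, and the function name are assumptions made for illustration; the slides do not say how points were divided.

```python
import random


def split_points(points, test_fraction=0.5, seed=0):
    """Shuffle a copy of the points and return (training, test)."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical occurrence points as (lat, lon) pairs.
occurrences = [(19.5, -104.5), (18.2, -103.1), (17.9, -102.4),
               (20.1, -103.8), (16.8, -96.7), (17.1, -97.2)]
training, test = split_points(occurrences, test_fraction=0.5)
print(len(training), "training points;", len(test), "test points")
```

The model is built from `training` only; `test` is held back so that omission can be measured on points the algorithm never saw.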

  25. BIOCLIM Example • Occurrence points overlain on geographic coverage, such as rainfall

  26. BIOCLIM Example • Occurrence points overlain on geographic coverage, such as rainfall • Frequency histogram of occurrence points in rainfall classes

  27. BIOCLIM Example • Occurrence points overlain on geographic coverage, such as rainfall • Frequency histogram of occurrence points in rainfall classes • Distribution trimmed to eliminate marginal habitat records

  28. BIOCLIM Example • Occurrence points overlain on geographic coverage, such as rainfall • Frequency histogram of occurrence points in rainfall classes • Distribution trimmed to eliminate marginal habitat records • Project trimmed distribution
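The four steps in this example can be sketched for a single variable. This is a simplified, one-dimensional illustration under stated assumptions, not the BIOCLIM program itself: the 5% trim per tail, the function names, and all rainfall values are hypothetical.

```python
def trimmed_envelope(values, trim=0.05):
    """Return (low, high) after dropping the `trim` share of records
    from each tail of the sorted values."""
    ordered = sorted(values)
    k = round(len(ordered) * trim)          # records dropped per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return kept[0], kept[-1]


def project(grid, low, high):
    """Mark each cell 1 (predicted present) if its value lies in [low, high]."""
    return [[1 if low <= v <= high else 0 for v in row] for row in grid]


# Hypothetical rainfall values (mm/year) sampled at 20 occurrence points.
rain_at_points = [310, 620, 640, 650, 660, 680, 700, 710, 720, 730,
                  740, 750, 760, 780, 800, 820, 840, 860, 880, 1900]
low, high = trimmed_envelope(rain_at_points, trim=0.05)

# Hypothetical rainfall coverage to project the envelope onto.
rainfall_grid = [
    [300, 650, 900],
    [700, 850, 1500],
    [500, 880, 2000],
]
prediction = project(rainfall_grid, low, high)
for row in prediction:
    print(row)
```

Trimming removes the two extreme records (310 and 1900 in this toy data), so a single outlying specimen does not stretch the predicted range across the whole map.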

  29. Genetic Algorithm for Rule-set Prediction (GARP) • Developed by David Stockwell, San Diego Supercomputer Center • Takes advantage of multiple algorithms (BIOCLIM, logistic regression, etc.) • Different decision rules may apply to different sectors of species’ distributions • Uses a genetic algorithm, an artificial intelligence application, for choosing rules • Implemented on WWW, and open for public use (http://biodi.sdsc.edu)

  30. Example of Heterogeneous Rule Sets in GARP
     TRANSLATE - model into natural language
     No-rule number, Type, Prior-prior, Post-accuracy, Sig-Significance, Cov-coverage, Use-usage
     No Type Prior Post Sig Cov Use
     1 m 0.50 0.92 26.66 0.41 0.376; IF Veg=[ 0, 9]r AND Elev=[ 0,2233]r AND Precip=[ 0, 9]r AND Temp=[ 0, 2]r AND Latitude=[25.1,15.2]r AND Longitud=[-107.6,-93.0]r AND Coastal=[0.0,131.0]r THEN Taxon=PRESENT
     8 r 0.48 1.00 21.32 0.17 0.166; IF + Veg*0.02 r + Precip*0.29 r - Temp*0.26 r - Latitude*0.45 r + Longitud*0.27 r - Coastal*0.43 r THEN Taxon=BACKGROUND
     0 r 0.51 0.98 27.10 0.32 0.127; IF - Veg*0.25 r + Elev*0.12 r + Precip*0.32 r + Temp*0.45 r - Latitude*0.43 r + Longitud*0.18 r - Coastal*0.50 r THEN Taxon=BACKGROUND
     6 r 0.49 0.81 23.16 0.51 0.071; IF + Veg*0.38 r - Elev*0.49 r + Precip*0.02 r - Temp*0.36 r + Latitude*0.16 r - Longitud*0.21 r - Coastal*0.30 r THEN Taxon=PRESENT
     14 d 0.51 0.85 19.33 0.32 0.066; IF Elev=[99,1245]r AND Precip=[ 3, 5]r AND Temp=[ 0, 3]r THEN Taxon=PRESENT
     2 r 0.48 0.92 26.52 0.36 0.060; IF + Veg*0.02 r + Elev*0.49 r + Precip*0.17 r + Temp*0.17 r - Latitude*0.36 r - Longitud*0.02 r - Coastal*0.50 r THEN Taxon=BACKGROUND
     3 r 0.48 0.89 25.83 0.39 0.040; IF - Veg*0.43 r + Elev*0.11 r - Precip*0.02 r + Temp*0.34 r - Latitude*0.02 r + Longitud*0.19 r - Coastal*0.50 r THEN Taxon=BACKGROUND
     13 d 0.49 1.00 19.36 0.15 0.038; IF Elev=[336,2233]r AND Precip=[ 7, 7]r AND Temp=[ 0, 5]r AND Latitude=[30.0,18.7]r THEN Taxon=BACKGROUND
     4 r 0.50 0.91 24.57 0.36 0.029; IF - Veg*0.32 r + Elev*0.07 r + Precip*0.47 r - Temp*0.39 r - Latitude*0.39 r + Longitud*0.50 r - Coastal*0.50 r THEN Taxon=BACKGROUND
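To show how one of the range-type rules in this listing might be applied to a map cell, here is a sketch built around rule 14. It assumes the bracketed values are inclusive lower and upper bounds on coded variables, which is an interpretation of the output format rather than something the slide states; the data structure and function name are hypothetical. The weighted-sum rules (type `r`) are not sketched because the slide gives no decision threshold for them.

```python
# Rule 14 from the listing, with bracketed values read as inclusive bounds
# (an assumption about the GARP output format).
RULE_14 = {
    "conditions": {"Elev": (99, 1245), "Precip": (3, 5), "Temp": (0, 3)},
    "outcome": "PRESENT",
}


def apply_range_rule(rule, cell):
    """Return the rule's outcome if every condition holds for the cell,
    otherwise None (the rule does not apply to this cell)."""
    for variable, (low, high) in rule["conditions"].items():
        if not low <= cell[variable] <= high:
            return None
    return rule["outcome"]


# Hypothetical map cells, using the same coded variables as the rule.
inside = {"Elev": 800, "Precip": 4, "Temp": 2}
outside = {"Elev": 1800, "Precip": 4, "Temp": 2}
print(apply_range_rule(RULE_14, inside))
print(apply_range_rule(RULE_14, outside))
```

A rule that returns `None` simply leaves the cell to the other rules in the set, which is how different rules can govern different sectors of a species’ distribution.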

  31. Comparison of GARP and BIOCLIM • Example: Aratinga canicularis • Note extreme overprediction (commission) in BIOCLIM map • Map panels: GARP, BIOCLIM

  32. How Well Can We Predict? • Numerous preliminary efforts showed promise ... good balance between omission and commission • 25 species chosen randomly from the Mexican avifauna for testing • Ability to predict into areas not included in the training data set taken as a measure of model quality • Each species tested in Oaxaca and Jalisco (with Lisa Ball and Kevin Cohoon)

  33. How Well Can We Predict? Atlapetes pileatus: all known occurrence points in Mexico

  34. Removing Points from Test State: State of Jalisco

  35. Distributional Model for Points

  36. Focus on Prediction for Test State

  37. Overlay Test Points

  38. How Well Can We Predict? Answer: Reasonably well ... about 85-90% of the species
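The state-holdout test walked through on these slides can be sketched as follows: withhold every record from the test state, build a model from the rest, then count how many withheld points fall inside the predicted area. The record layout, cell identifiers, and function names are hypothetical, and `predicted_cells` is a stand-in for whatever model is built from the training records.

```python
def holdout_by_state(records, test_state):
    """Split occurrence records into (training, test) by state name."""
    training = [r for r in records if r["state"] != test_state]
    test = [r for r in records if r["state"] == test_state]
    return training, test


def fraction_predicted(test_records, predicted_cells):
    """Share of held-out records that fall in cells predicted present."""
    if not test_records:
        return 0.0
    hits = sum(1 for r in test_records if r["cell"] in predicted_cells)
    return hits / len(test_records)


# Hypothetical records: each has a state and the id of its grid cell.
records = [
    {"state": "Jalisco", "cell": 11},
    {"state": "Jalisco", "cell": 12},
    {"state": "Jalisco", "cell": 40},
    {"state": "Oaxaca", "cell": 71},
    {"state": "Michoacan", "cell": 25},
]
training, test = holdout_by_state(records, "Jalisco")

# Stand-in for the cells predicted present by a model built from `training`.
predicted_cells = {11, 12, 13, 25, 71}
print(f"{fraction_predicted(test, predicted_cells):.2f} of test points predicted")
```

Holding out a whole state is a stricter test than a random split, because the model must predict into an area from which it has seen no points at all.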

  39. How Well Can We Predict? Choose training and test states

  40. How Well Can We Predict? Plot training points - Toxostoma rufum

  41. How Well Can We Predict? Develop GARP model

  42. How Well Can We Predict? Overlay test data points

  43. How Well Can We Predict? • 715 of 741 test points correctly predicted • 34% of total area predicted present • 252 points expected to be predicted at random • Statistical significance P < 10^-225 • All 34 species tested significant, no probability exceeding 10^-3
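The arithmetic on this slide can be checked with a short calculation. If 34% of the area is predicted present, a random model would be expected to capture about 0.34 × 741 ≈ 252 of the test points. The slide does not say which statistical test was used; the sketch below assumes a one-tailed binomial test, computed in log space because the probability is far too small to handle comfortably as an ordinary float.

```python
import math


def log10_binomial_tail(k, n, p):
    """Return log10 of P(X >= k) for X ~ Binomial(n, p), using log-sum-exp."""
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log(1 - p)
        for i in range(k, n + 1)
    ]
    largest = max(log_terms)
    log_sum = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return log_sum / math.log(10)


n_test, n_correct, area_fraction = 741, 715, 0.34   # figures from the slide

expected_at_random = n_test * area_fraction
log10_p = log10_binomial_tail(n_correct, n_test, area_fraction)

print(f"expected at random: {expected_at_random:.0f} points")
print(f"log10(P) under the binomial assumption: {log10_p:.1f}")
```

Under this assumed test the probability comes out even smaller than the bound quoted on the slide, so the two are consistent; the exact exponent depends on the test chosen.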

  44. Advantages Over Range Maps • Smaller predictions, much greater precision • Reduces errors of commission, which are especially critical • Comparisons in Mexico indicate 10^3 to 10^22 times more statistically significant than range maps • Models improve over time, rather than degrading continually from time of publication • Not limited to areas with published range maps • Produces ecological model (wait a minute...)

  45. Predicting Geographic Distributions Makes Possible ... • Understanding rare and endangered species’ distributions • Designing reintroduction programs • Understanding the effects of global climate change and other types of change • Projecting species invasions • Designing biodiversity conservation plans

  46. Building Maps of Species Diversity: Reserve Locations in Southwestern Mexican Dry Forest • Primary concentration of endemic species (12) • Secondary concentration (4 species) • with Daniel A. Kluza

  47. Predicting Species Invasions - Asian Longhorn Beetle

  48. Invasive Species and Endangered Species Barred Owls invading the range of Spotted Owls

  49. Predicting the Effects of Global Climate Change • Ortalis poliocephala • Before (green) vs. After (red)

  50. Primary Point Occurrence Data • Offer many advantages over secondary information • Applicable across spatial scales, verifiable • Open doors to new, synthetic analyses • Abundantly available for most taxa in most regions • Integrated with new data-sharing efforts • Primary point occurrence data form the most appropriate basis for gap analysis applications
