Ecoinformatics Workshop Summary

Ecoinformatics Workshop Summary. SEEK, LTER Network Main Office, University of New Mexico, Albuquerque, NM. Topics covered: grid networks (EcoGrid); workflow systems (Kepler / Ptolemy II); metadata compilers (Morpho); databases (MySQL, Metacat, DBDesigner); QA/QC (SAS, S-Plus, Access, Excel).



Presentation Transcript


  1. Ecoinformatics Workshop Summary. SEEK, LTER Network Main Office, University of New Mexico, Albuquerque, NM

  2. Topics covered
     • Grid networks – EcoGrid
     • Workflow systems – Kepler / Ptolemy II
     • Metadata compilers – Morpho
     • Databases – MySQL, Metacat, DBDesigner
     • QA/QC – SAS, S-Plus, Access, Excel
     • Interactive & dynamic web sites – Dreamweaver

  3. SEEK Overview

  4. Grid Networks

  5. SEEK EcoGrid
     • Goal: standardize interfaces (using web and grid services)
       • We have standardized data via EML
     • Integrate diverse data networks from ecology, biodiversity, and environmental sciences
     • Grid-standardized interfaces
       • Uniform interface to Metacat, SRB, DiGIR, Xanthoria, etc.
       • Anyone can implement these interfaces
       • Hides complexity of underlying systems
     • Metadata-mediated data access
       • Supports multiple metadata standards
       • EML, Darwin Core as foci
     • Computational services
       • Pre-defined analytical services
       • On-the-fly analytical services

  6. EcoGrid Node

  7. EcoGrid client interactions
     • Modes of interaction
       • Client-server
       • Fully distributed
       • Peer-to-peer
     • EcoGrid Registry
       • Node discovery
       • Service discovery
     • Aggregation services
       • Centralized access
       • Reliability
       • Data preservation
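The idea of a uniform interface plus a registry for node discovery can be sketched in a few lines. This is only an illustration of the pattern, not the EcoGrid API: the class names `EcoGridNode`, `InMemoryNode`, and `Registry`, the `query` method, and the sample titles are all hypothetical.

```python
from abc import ABC, abstractmethod


class EcoGridNode(ABC):
    """Hypothetical uniform interface that every data system would implement."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def query(self, keyword):
        """Return the titles of data sets matching the keyword."""


class InMemoryNode(EcoGridNode):
    """Stand-in for a real back end; it hides how the titles are stored."""

    def __init__(self, name, titles):
        super().__init__(name)
        self._titles = list(titles)

    def query(self, keyword):
        return [t for t in self._titles if keyword.lower() in t.lower()]


class Registry:
    """Hypothetical registry: node discovery plus one aggregated query."""

    def __init__(self):
        self._nodes = {}

    def register(self, node):
        self._nodes[node.name] = node

    def discover(self):
        return sorted(self._nodes)

    def query_all(self, keyword):
        # The client sees one call; each node answers in its own way.
        return {name: node.query(keyword) for name, node in self._nodes.items()}


registry = Registry()
registry.register(InMemoryNode("node-a", ["Plant cover survey", "Rodent trapping"]))
registry.register(InMemoryNode("node-b", ["Plant phenology", "Soil moisture"]))
results = registry.query_all("plant")
```

Because clients depend only on the abstract interface, a new data system can join by implementing `query` and registering itself.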

  8. Kepler: scientific workflows. EML provides semi-automated data binding. Scientific workflows represent knowledge about the process; Kepler captures this knowledge.

  9. Kepler: ecological modeling

  10. Lotka-Volterra Predator Prey Model

  11. Running the model
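For readers without Kepler at hand, the Lotka-Volterra predator-prey model can be sketched in plain Python. The slides do not give the equations or parameters, so this uses the standard textbook form, dx/dt = αx − βxy and dy/dt = δxy − γy, with a fourth-order Runge-Kutta integrator. The parameter values, starting populations, and step size are illustrative assumptions, not the workshop's settings.

```python
import math


def lotka_volterra(state, alpha, beta, delta, gamma):
    """Right-hand side of the textbook predator-prey equations."""
    prey, predators = state
    d_prey = alpha * prey - beta * prey * predators
    d_predators = delta * prey * predators - gamma * predators
    return (d_prey, d_predators)


def rk4_step(state, dt, **params):
    """Advance the state by one step of size dt (classical Runge-Kutta)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = lotka_volterra(state, **params)
    k2 = lotka_volterra(shifted(state, k1, dt / 2), **params)
    k3 = lotka_volterra(shifted(state, k2, dt / 2), **params)
    k4 = lotka_volterra(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(prey0, predators0, steps, dt, **params):
    """Return the list of (prey, predators) pairs, including the start."""
    trajectory = [(prey0, predators0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, **params))
    return trajectory


def invariant(state, alpha, beta, delta, gamma):
    """Quantity that the exact solution keeps constant; a sanity check."""
    prey, predators = state
    return (delta * prey - gamma * math.log(prey)
            + beta * predators - alpha * math.log(predators))


# Illustrative values only (assumed, not taken from the slides).
PARAMS = dict(alpha=1.0, beta=0.1, delta=0.075, gamma=1.5)
trajectory = simulate(10.0, 5.0, steps=2000, dt=0.01, **PARAMS)
```

Comparing `invariant` at the start and end of the run is a quick way to confirm that the step size is small enough for the populations to cycle rather than drift.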

  12. Metadata: what are they, and why should they be created?

  13. Metadata Example: In front of you are two tuna cans. How do you decide which one to buy?

  14. Metadata Example: Metadata help you decide which one to get!

  15. Ecological Metadata Language
      • Adopted by LTER Information Management
      • Metadata specification developed by the ecology discipline for the ecology discipline
      • Based on prior work of the Ecological Society of America and others (Michener et al., 1997)
      • Seven years in development – 14 versions
      • EML 2.0.1
      • Implemented as an XML Schema
      • Supports four separate modules
        • Dataset
        • Citation
        • Software
        • Protocol
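Because EML is implemented as XML, any XML library can read it. The sketch below parses a heavily simplified, EML-style fragment with Python's standard library. The fragment is a hypothetical illustration that has not been validated against the EML 2.0.1 schema, and the package ID, title, and creator name are placeholders.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EML-style document (not schema-validated).
EML_SNIPPET = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Example vegetation quadrat survey (placeholder)</title>
    <creator>
      <individualName><surName>Doe</surName></individualName>
    </creator>
  </dataset>
</eml:eml>"""

root = ET.fromstring(EML_SNIPPET)

# The dataset module holds the descriptive metadata for the data package.
dataset = root.find("dataset")
package_id = root.get("packageId")
title = dataset.findtext("title")
creator_surname = dataset.findtext("creator/individualName/surName")

print(f"{package_id}: {title} (creator: {creator_surname})")
```

Tools such as Morpho generate and edit documents of this kind, so ecologists do not normally write the XML by hand.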

  16. Associated Metadata
      • Data set
      • Data table
      • XML files

  17. Morpho
      • Provides a way for ecologists to share data by defining a common structure to document their data
      • Uses an XML format to create the common structure

  18. Morpho - tree editor

  19. Morpho – entering metadata: Again, choose from the earlier entries or another data package, or enter new information.

  20. Morpho – metadata: Once data are uploaded to Morpho, you can edit the data or the metadata. This is the window that appears when you press Finish in the Morpho wizard.

  21. Databases
      • Small scale & on local computer – Access
      • Bigger & on server – MySQL

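  22. Example - why use a database?
      • Coordinate field data collection and data entry forms

      | DATE | SITE | WEB | PLOT | QD | SPECIES | OBS | COVER | HEIGHT | COUNT | PHEN | COMMENTS |
      |---|---|---|---|---|---|---|---|---|---|---|---|
      | 2/3/1999 | FPC | 1 | E | 1 | ERPU8 | 1 | 0.5 | 4 | 13 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | ERPU8 | 2 | 0.1 | 2 | 16 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 1 | 0.01 | 4 | 2 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 2 | 0.1 | 5 | 1 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 3 | 0.5 | 12 | 1 | V | NA |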

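  23. Database example

      | DATE | SITE | WEB | PLOT | QD | SPECIES | OBS | COVER | HEIGHT | COUNT | PHEN | COMMENTS |
      |---|---|---|---|---|---|---|---|---|---|---|---|
      | 2/3/1999 | FPC | 1 | E | 1 | ERPU8 | 1 | 0.5 | 4 | 13 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | ERPU8 | 2 | 0.1 | 2 | 16 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 1 | 0.01 | 4 | 2 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 2 | 0.1 | 5 | 1 | V | NA |
      | 2/3/1999 | FPC | 1 | E | 1 | GUSA2 | 3 | 0.5 | 12 | 1 | V | NA |

      • Divide into 4 tables:
        • Location table
        • Species table
        • Visit table
        • Observation table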
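One way to picture the four-table split is the following sketch, which uses SQLite from Python's standard library in place of Access or MySQL. The slides name the tables but not their columns, so the assignment of columns to tables, the key choices, and the renamed `cnt` column (for COUNT) are assumptions made for illustration.

```python
import sqlite3

# The five field-sheet rows shown on the slide.
ROWS = [
    ("2/3/1999", "FPC", 1, "E", 1, "ERPU8", 1, 0.5, 4, 13, "V", "NA"),
    ("2/3/1999", "FPC", 1, "E", 1, "ERPU8", 2, 0.1, 2, 16, "V", "NA"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA2", 1, 0.01, 4, 2, "V", "NA"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA2", 2, 0.1, 5, 1, "V", "NA"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA2", 3, 0.5, 12, 1, "V", "NA"),
]

# Assumed schema: which column lives in which table is a guess.
SCHEMA = """
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    site TEXT, web INTEGER, plot TEXT, qd INTEGER,
    UNIQUE (site, web, plot, qd));
CREATE TABLE species (species_code TEXT PRIMARY KEY);
CREATE TABLE visit (
    visit_id INTEGER PRIMARY KEY,
    visit_date TEXT,
    location_id INTEGER REFERENCES location (location_id),
    UNIQUE (visit_date, location_id));
CREATE TABLE observation (
    visit_id INTEGER REFERENCES visit (visit_id),
    species_code TEXT REFERENCES species (species_code),
    obs INTEGER, cover REAL, height REAL, cnt INTEGER,
    phen TEXT, comments TEXT,
    PRIMARY KEY (visit_id, species_code, obs));
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

for date, site, web, plot, qd, species, obs, cover, height, cnt, phen, note in ROWS:
    # Repeated values are stored once; later rows reuse the existing key.
    conn.execute("INSERT OR IGNORE INTO location (site, web, plot, qd) "
                 "VALUES (?, ?, ?, ?)", (site, web, plot, qd))
    location_id = conn.execute(
        "SELECT location_id FROM location "
        "WHERE site = ? AND web = ? AND plot = ? AND qd = ?",
        (site, web, plot, qd)).fetchone()[0]
    conn.execute("INSERT OR IGNORE INTO species VALUES (?)", (species,))
    conn.execute("INSERT OR IGNORE INTO visit (visit_date, location_id) "
                 "VALUES (?, ?)", (date, location_id))
    visit_id = conn.execute(
        "SELECT visit_id FROM visit WHERE visit_date = ? AND location_id = ?",
        (date, location_id)).fetchone()[0]
    conn.execute("INSERT INTO observation VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                 (visit_id, species, obs, cover, height, cnt, phen, note))

# A join rebuilds the original flat sheet whenever it is needed.
flat = conn.execute("""
    SELECT v.visit_date, l.site, l.web, l.plot, l.qd, o.species_code,
           o.obs, o.cover, o.height, o.cnt, o.phen, o.comments
    FROM observation AS o
    JOIN visit AS v ON v.visit_id = o.visit_id
    JOIN location AS l ON l.location_id = v.location_id
    ORDER BY o.species_code, o.obs""").fetchall()
```

The date and location that were repeated on every line of the field sheet are now stored once, which is the main payoff of splitting the table.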

  24. Database example: Location, Visit, Observation, and Species tables

  25. Database example

  26. Database example

  27. QA/QC: QC
      • Designing data sheets
      • Data entry using
        • Validation rules
        • Filters
        • Lookup tables
      • Validate entered data
        • Double entry
        • Prior data
        • Filters
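Validation rules, lookup tables, and double entry can each be expressed as a small check. In the sketch below the species codes come from the example data sheet, but the specific limits (cover between 0 and 100, count of at least 1) and the function names are hypothetical rules chosen for illustration.

```python
# Lookup table: only codes listed here are accepted at data entry.
SPECIES_LOOKUP = {"ERPU8", "GUSA2"}


def validate_record(record):
    """Return a list of problems found in one data-sheet row."""
    problems = []
    if record["SPECIES"] not in SPECIES_LOOKUP:
        problems.append(f"unknown species code {record['SPECIES']!r}")
    # Hypothetical range rules; real limits depend on the protocol.
    if not 0 < record["COVER"] <= 100:
        problems.append(f"cover out of range: {record['COVER']}")
    if record["COUNT"] < 1:
        problems.append(f"count must be at least 1: {record['COUNT']}")
    return problems


def compare_double_entry(first_pass, second_pass):
    """List (row number, field, first value, second value) for each mismatch."""
    mismatches = []
    for row_number, (a, b) in enumerate(zip(first_pass, second_pass), start=1):
        for field in a:
            if a[field] != b.get(field):
                mismatches.append((row_number, field, a[field], b.get(field)))
    return mismatches


good = {"SPECIES": "ERPU8", "COVER": 0.5, "COUNT": 13}
typo = {"SPECIES": "ERPU8", "COVER": 0.5, "COUNT": 31}  # digits transposed
bad = {"SPECIES": "XXXX", "COVER": -1, "COUNT": 2}

print(validate_record(bad))
print(compare_double_entry([good], [typo]))
```

Note that the transposed count passes every validation rule on its own; only the double-entry comparison catches it.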

  28. QA/QC: QA
      • Graphics
        • Box plots
        • Scatterplots
        • Normal probability plots
      • Formal statistical methods
        • Grubbs' test (Edwards 2000)
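As a rough sketch of Grubbs' test, the statistic is the largest absolute deviation from the sample mean divided by the sample standard deviation. The code below computes it for the HEIGHT column of the example data sheet. The critical value depends on a Student's t quantile, which the standard library does not provide, so `t_crit` is left as an input to be looked up elsewhere. The function names are illustrative.

```python
import math
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    index, value = max(enumerate(values), key=lambda p: abs(p[1] - mean))
    return abs(value - mean) / sd, index


def grubbs_critical_value(n, t_crit):
    """Two-sided critical value for G.

    t_crit is the upper critical value of Student's t with n - 2 degrees
    of freedom at significance alpha / (2 n); it must be supplied from a
    table or a statistics package.
    """
    return (n - 1) / math.sqrt(n) * math.sqrt(t_crit**2 / (n - 2 + t_crit**2))


heights = [4, 2, 4, 5, 12]  # HEIGHT column from the example data sheet
g, suspect_index = grubbs_statistic(heights)
print(f"G = {g:.3f} for value {heights[suspect_index]}")
```

The value is flagged when G exceeds the critical value. Five observations are far too few for a meaningful test, so this only shows the mechanics.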

  29. QA/QC: The goal of QA is NOT to eliminate outliers! Rather, we wish to detect unusual and extreme values.

  30. [Figure: reference lines at µ + 3σ, µ, and µ − 3σ]
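A minimal sketch of the µ ± 3σ screen follows. In line with the goal of detecting rather than eliminating, it only reports the values outside the limits and leaves the data untouched. The sample values are hypothetical, and the sample mean and standard deviation are used as estimates of µ and σ.

```python
import statistics


def flag_three_sigma(values):
    """Return (index, value) pairs lying outside mean ± 3 standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    lower, upper = mean - 3 * sd, mean + 3 * sd
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical plant heights; the last value looks like a data-entry slip.
heights = [4, 5, 4, 6, 5, 4, 5, 6, 5, 4, 5, 6, 4, 5, 5, 6, 4, 5, 5, 50]
flagged = flag_three_sigma(heights)
print(flagged)
```

With the sample standard deviation, no value can lie more than (n − 1)/√n standard deviations from the mean, so this screen cannot flag anything in samples of fewer than 11 values. Flagged values should go back to the field sheets for checking rather than being deleted.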

  31. What did I learn?
      • Know your subject. Have a plan.
      • A little planning in advance will save a lot of headaches (and time, money, and missed opportunities) later.
      • Unorganized data can quickly wall you off from the increasingly collaborative and computerized research world.
