Challenges in Integrating Diverse Data
Challenges in Integrating Diverse Data for Ecological Synthesis Special Roles & Responsibilities for Information Managers. Judy Cushing The Evergreen State College Olympia WA [email protected] www.evergreen.edu/bdei NSF EIA-0310659, EIA-0131952

Presentation Transcript


Challenges in Integrating Diverse Data for Ecological Synthesis: Special Roles & Responsibilities for Information Managers

Judy Cushing

The Evergreen State College

Olympia WA

[email protected]

www.evergreen.edu/bdei NSF EIA-0310659, EIA-0131952

http://canopy.evergreen.edu/canopydb NSF DBI-0417311, DBI-0319309, …

www2.evergreen.edu/quantecology


Challenges in Integrating Diverse Data: Lessons Learned from the Grasslands Data Integration (GDI) Project*

Information Managers

Jincheng Gao (KNZ), Nicole Kaplan (SGS),

Ken Ramsey (JRN) , Mark Servilla (NET),

Kristin Vanderbilt (SEV)

Computer Scientists and Data Analysts

Judy Cushing, Carri LeRoy,

Juli Mallett, Lee Zeman

Ecologists

Christine Laney (JRN), Alan Knapp (SGS),

Daniel Milchunas (SGS), Esteban Muldavin (SEV)

Integrate Above-Ground Net Primary Productivity (ANPP) data with its drivers (contextual data) for cross-site comparisons (Ecological Synthesis), past and future

(come visit our poster!)


What’s in the GDI Database?

  • Recorded or calculated annual aboveground NPP values from 5 LTERs: Jornada, Sevilleta, SGS, Konza, Kruger

  • 4,126,700 grams, over 20 years in 1697 plots


What did we Find?

  • Ecology

    • Environmental drivers of ANPP

    • ANPP-based grassland community composition.

  • Preliminary definition & provision of contextual data –

  • Ecotrends ++….

  • Information Management: species table fixes, ideas for better experimental design documentation, scripting for data integration…. CHANGE LOGS WERE ESSENTIAL; USDA PLANTS DB

  • CS – case study on Data Integration; need for TOOLS:

  • PASTA-LIKE SERVICE &

  • TAXONOMIC CONCEPT SERVICE
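The Information Management findings above (species table fixes, scripting for data integration, and the importance of change logs) could be sketched roughly as follows. This is a minimal illustration, not the GDI project's actual scripts: the function name `harmonize_species_codes`, the record fields, the `CODE_MAP` table, and every species code shown are hypothetical placeholders that have not been checked against the USDA PLANTS database.

```python
def harmonize_species_codes(records, code_map):
    """Map site-specific species codes to a shared code, logging every change.

    Returns (cleaned_records, change_log) and leaves the input untouched.
    """
    cleaned = []
    change_log = []
    for row_number, record in enumerate(records):
        old_code = record["species_code"]
        new_code = code_map.get((record["site"], old_code))
        if new_code is None:
            # No mapping known: keep the row as-is but flag it for review.
            change_log.append({"row": row_number, "site": record["site"],
                               "action": "unmapped", "old": old_code, "new": None})
            cleaned.append(dict(record))
        elif new_code != old_code:
            # Recode the row and record what was changed.
            change_log.append({"row": row_number, "site": record["site"],
                               "action": "recoded", "old": old_code, "new": new_code})
            cleaned.append({**record, "species_code": new_code})
        else:
            # Code already matches the shared code: nothing to log.
            cleaned.append(dict(record))
    return cleaned, change_log


# Hypothetical lookup table: (site, local code) -> shared code.
CODE_MAP = {
    ("SEV", "BOGR"): "BOGR2",
    ("SGS", "BOUGRA"): "BOGR2",
    ("KNZ", "BOGR2"): "BOGR2",
}

# Hypothetical records, invented for illustration only.
records = [
    {"site": "SEV", "year": 2001, "species_code": "BOGR", "anpp_g_m2": 41.5},
    {"site": "SGS", "year": 2001, "species_code": "BOUGRA", "anpp_g_m2": 58.0},
    {"site": "KNZ", "year": 2001, "species_code": "BOGR2", "anpp_g_m2": 12.3},
    {"site": "JRN", "year": 2001, "species_code": "UNK1", "anpp_g_m2": 3.1},
]

cleaned, change_log = harmonize_species_codes(records, CODE_MAP)
for entry in change_log:
    print(entry)
```

Keeping the log separate from the cleaned data means each recode can be reviewed by the site's information manager, and unmapped codes surface as explicit entries instead of silently passing through.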


ANPP vs. Precip

No climate data yet


r = 0.608

r = 0.631

r = 0.329

r = 0.196
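The r values above are correlation coefficients read off the ANPP-versus-precipitation plots. Assuming they are Pearson correlations (the slides do not say), a coefficient of this kind could be computed as in the sketch below. The function name `pearson_r` and the sample numbers are hypothetical and are not GDI data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical annual values for one site, invented for illustration only.
precip_mm = [210.0, 340.0, 280.0, 410.0, 190.0]
anpp_g_m2 = [95.0, 160.0, 120.0, 210.0, 80.0]

r = pearson_r(precip_mm, anpp_g_m2)
print(f"r = {r:.3f}")
```

Computing r separately for each site, as the four values on the slide suggest, keeps a strong relationship at one site from masking a weak one at another.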


CART Model: Classification and Regression Tree Model, R² = 0.642!!

Variables included in model: LTER, year, PDSI, NH4, NO3, absTmax, absTmin, Tmax, Tmin, Tmean, Precip
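For readers unfamiliar with CART, the core step of a regression tree is choosing the single variable and threshold that best separates high from low values of the response. The sketch below shows only that one split step and the R² it yields, under the assumption of a squared-error criterion; it is not the model behind the R² reported above. The helper names and the four data rows are hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target, features):
    """Find the (feature, threshold, cost) that minimizes total squared error."""
    best = None
    for feature in features:
        distinct = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between neighboring values.
        for low, high in zip(distinct, distinct[1:]):
            threshold = (low + high) / 2
            left = [row[target] for row in rows if row[feature] <= threshold]
            right = [row[target] for row in rows if row[feature] > threshold]
            cost = sum_squared_error(left) + sum_squared_error(right)
            if best is None or cost < best[2]:
                best = (feature, threshold, cost)
    return best


# Hypothetical site-year rows, invented for illustration only.
rows = [
    {"Precip": 200, "Tmean": 15, "ANPP": 80},
    {"Precip": 250, "Tmean": 12, "ANPP": 90},
    {"Precip": 600, "Tmean": 14, "ANPP": 300},
    {"Precip": 650, "Tmean": 13, "ANPP": 320},
]

feature, threshold, cost = best_split(rows, "ANPP", ["Precip", "Tmean"])
total = sum_squared_error([row["ANPP"] for row in rows])
r_squared = 1 - cost / total  # share of variance explained by this one split
print(feature, threshold, round(r_squared, 3))
```

A full CART model repeats this split recursively within each branch and then prunes the tree; a statistics package would normally be used for that.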


Lesson 1: What you (IMs) do is important

  • ANPP – a critical ecological measure (indicator?)

  • You (Kristin, Ken, Nicole) made GDI happen….

  • It’s a collaborative & interdisciplinary project –

  • and not a technology problem….

    • IMs

    • Computer Scientists

    • Ecologists

    • Statistician (Data Analyst)

  • You know the issues, physically possess the data

  • for important ecological & scientific DB problems

  • e.g., global climate change, resource management


Lesson 2: The GDI DB should be dynamic, not static. A static data warehouse is an oxymoron, as is “Museum of Innovation”

  • More years, future years

  • Current data – further refined

  • More sites, different ecosystems


Lesson 3: Volume Matters…. More sites, more years, more trouble….

  • More species codes

  • Differences in experimental design

  • Cross-site comparison highlights data anomalies

  • High volumes make a qualitative difference

  • A good data structure* matters even more….

* Ask me why GIS has not been a priority to illustrate my field datasets….


Lesson 4: Information Managers Critical, Computer Science in Crisis….

There won’t be enough CS graduates …

to do all the jobs …

even today….


NSF’S ICER (CPATH) INITIATIVE: INTEGRATIVE COMPUTING EDUCATION & RESEARCH

  • CS content changed (changing!) radically….

  • No uniform agreement on the core…

  • Graduates lack a systems approach….

  • Dwindling pipeline….

  • US industry [& science] competitiveness threatened….


NSF’S ICER (CPATH) INITIATIVE. NSF asked: Why is CS in crisis? What can be done?

Northwest Region: http://www.evergreen.edu/icer

Improve the quality of computing education ….

Attract more people ….

Improve retention….

Strengthen interdisciplinary connections….

Improve CS educational research ….

Google asked: What can industry do?

I ask: What should the LTER IMs do?


Lesson 4 (cont): Computer Science in Crisis….

My charge on this panel:

IMs typically come from “the sciences” (essential)

Yet their tasks are programming & managing software projects.

What skills or tools are essential for IMs?

…As an educator, which are effectively learned on-the-job,

and which require formal training?

Tools are learned on the job,

Skills through practice.

(but should be demonstrable before hiring)

Concepts require (some) formal training….

(there is a handful of critical concepts?)


Lesson 4 (cont): What CS is needed to do the GDI?

  • Concepts

    • Formal Languages & Parsing

    • Data Structures

  • Abilities

    • See patterns (and non-patterns)

    • Learn new technology fast; see when the tools won’t do it

    • Build new technology, services….

  • Skills (tools)

    • Scripting Languages, Database tools and SQL

But, CS is not enough… needed an interdisciplinary team….

historical perspective, ecology vision, statistical expertise

Future tools –

PASTA-like & TAXONOMIC SERVICES,

Contextual data provision (ClimDB, EcoTrends)


Questions?

Judy Cushing

[email protected]

www.evergreen.edu/bdei

http://canopy.evergreen.edu/canopydb

www2.evergreen.edu/quantecology

