Predicting food web connectivity: Phylogenetic scope, evidence thresholds, and intelligent agents
Cynthia Sims Parr
Ecological Society of America, Memphis, TN, August 8, 2006
ELVIS: Ecosystem Localization, Visualization, and Information System
Species list constructor: Oreochromis niloticus (Nile tilapia), Bacteria, Microprotozoa, Amphithoe longimana, Caprella penantis, Cymadusa compta, Lembos rectangularis, Batea catharinensis, Ostracoda, Melanitta, Tadorna tadorna
Food web constructor: ? . . .
ELVIS’s Food Web Constructor predicts basic network structure
Prelude to systems models
[Diagram: food web nodes G and S joined by a link, with the G taxon and S taxon placed on the evolutionary tree; other labels: A, step.]
Evolutionary Distance Weighting
• Set distance thresholds
• Find relatives of target nodes X, Y with known link status, e.g. relative A is close to X and relative B is close to Y, where the link value between A and B is known
• For each found link, compute weight based on distance
• Compute certainty index for a predicted link by combining weighted link values, with a discount for negative evidence
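The weighting steps above can be sketched in Python. This is only an illustration under stated assumptions, not the Food Web Constructor's actual formula: the 1/(1 + distance) weight, the default `negative_discount` of 0.5, the default threshold, and all names (`Evidence`, `weight`, `certainty_index`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One known link between relative A (of X) and relative B (of Y)."""
    dist_a: int      # steps between X and its relative A
    dist_b: int      # steps between Y and its relative B
    link_value: int  # +1 = link observed, -1 = link known to be absent


def weight(dist_a: int, dist_b: int) -> float:
    """Hypothetical weight: evidence from closer relatives counts more."""
    return 1.0 / (1.0 + dist_a + dist_b)


def certainty_index(evidence, max_dist=4, negative_discount=0.5):
    """Combine weighted link values; negative evidence is discounted."""
    total = 0.0
    norm = 0.0
    for e in evidence:
        if e.dist_a > max_dist or e.dist_b > max_dist:
            continue  # outside the distance threshold
        w = weight(e.dist_a, e.dist_b)
        if e.link_value < 0:
            w *= negative_discount
        total += w * e.link_value
        norm += w
    # Result lies in [-1, 1]; 0.0 means no usable evidence.
    return total / norm if norm else 0.0
```

With this normalization, one positive and one negative observation at the same distance still yield a positive index, because the negative one carries half the weight.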
Food web database
4600 distinct taxa
Food web data: Cohen 1989; Dunne et al. 2006; Vazquez 2006; Jonsson et al. 2005
Evolutionary tree: Parr et al. 2004, + plants from ITIS, + hierarchy of non-taxonomic nodes
Testing the algorithm
• Take each web out of the database
• Attempt to predict its links
• Compare prediction with actual data
Accuracy (percentage of all predictions that are correct): 89%
Precision (percentage of predicted links that are correct): 55%
Recall (percentage of actual links that are predicted): 47%
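The leave-one-web-out test and the three scores can be sketched as follows. The data layout (a dict of webs, each with `taxa` and `links`), the `predict` callback, and the choice to count every ordered pair of taxa as one prediction are assumptions made for illustration.

```python
def evaluate(predicted, actual, n_pairs):
    """Score predicted links against actual links for one held-out web.

    n_pairs is the number of candidate (predator, prey) pairs, so pairs
    predicted absent and truly absent count as correct predictions.
    """
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)   # predicted and real
    fp = len(predicted - actual)   # predicted but not real
    fn = len(actual - predicted)   # real but missed
    tn = n_pairs - tp - fp - fn    # correctly predicted absent
    return {
        "accuracy": (tp + tn) / n_pairs,
        "precision": tp / (tp + fp) if predicted else 0.0,
        "recall": tp / (tp + fn) if actual else 0.0,
    }


def leave_one_out(webs, predict):
    """Hold out each web in turn and predict its links from the rest."""
    for name, web in webs.items():
        training = {k: v for k, v in webs.items() if k != name}
        predicted = predict(training, web["taxa"])
        # Assumption: every ordered pair of taxa is a candidate link.
        yield name, evaluate(predicted, web["links"], len(web["taxa"]) ** 2)
```

Accuracy can be high even when precision and recall are modest, because most candidate pairs in a web are not links and are easy to predict as absent.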
Choosing parameters
• 30-web subsample
• Representative of habitats, years, # nodes, percent identified to species
• Iterate over parameter settings
• Tradeoff between precision (percentage of predicted links that are correct) and recall (percentage of actual links that are predicted)
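Iterating over parameter settings and inspecting the precision/recall tradeoff might look like the sketch below. The slides do not say how the final setting was picked; listing the non-dominated settings (a Pareto front) is a swapped-in technique used here only to make the tradeoff visible. The `score` callback, the parameter ranges, and the function names are hypothetical.

```python
from itertools import product


def sweep(score, steps_up=range(0, 5), steps_down=range(0, 5)):
    """Evaluate every (steps_up, steps_down) setting.

    `score` is a caller-supplied function that runs the predictor on the
    subsample and returns (precision, recall) for one setting.
    """
    rows = []
    for up, down in product(steps_up, steps_down):
        precision, recall = score(up, down)
        rows.append({"up": up, "down": down,
                     "precision": precision, "recall": recall})
    return rows


def pareto_front(rows):
    """Keep settings that no other setting beats on both measures."""
    front = []
    for r in rows:
        dominated = any(
            o["precision"] >= r["precision"]
            and o["recall"] >= r["recall"]
            and (o["precision"] > r["precision"] or o["recall"] > r["recall"])
            for o in rows
        )
        if not dominated:
            front.append(r)
    return front
```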
Evolutionary distance threshold: 2 steps up and 4 steps down
[Plots: recall and precision against steps up and steps down.]
Evolutionary direction penalty: not very sensitive
[Diagram labels: ancestor, descendant, siblings.]
Is evolutionary distance weighting better than strict database search?
[Chart: database search vs. evolutionary distance weighting, in %. Paired t-tests, df = 251; *** p < 0.001.]
Database search is more precise, but evolutionary distance weighting has better recall.
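For reference, a paired t-test compares the two methods web by web: each web contributes one difference, so df = 251 corresponds to 252 paired observations. The sketch below computes only the t statistic and degrees of freedom using the standard library; the function name is hypothetical, and looking up the p-value is left to a statistics package or table.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1
```

Here `a` and `b` would be, for example, the per-web recall of the two methods, listed in the same web order.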
Older webs contribute
Recall (percentage of actual links that are predicted): 47%; 48% with no EcoWEB data
Precision (percentage of predicted links that are correct): 55%; 39% with no EcoWEB data
• Large webs have fewer unknown “taxa”
• Recent webs are bigger
• Large webs have better taxonomic resolution
• …but large webs are harder to predict
How can we do better at predicting links?
Trait space distance weighting: Euclidean distance in natural history N-space
Parameterize functions from the literature that might predict links using characteristics of taxa, for example size or stoichiometry:
LinkStatusAB = ƒ(α, sizeA, sizeB), ƒ(β, stoichA, stoichB) …
…need more data
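Euclidean distance in a natural history N-space can be sketched as below. The trait names in the usage note are hypothetical, and the sketch assumes traits have already been put on comparable scales (for example, log-transformed or standardized), which the slides do not specify.

```python
import math


def trait_distance(traits_a, traits_b):
    """Euclidean distance over the traits both taxa have values for."""
    shared = sorted(set(traits_a) & set(traits_b))
    if not shared:
        raise ValueError("no shared traits to compare")
    return math.dist([traits_a[k] for k in shared],
                     [traits_b[k] for k in shared])
```

For example, `trait_distance({"log_mass": 0.0, "trophic": 3.0}, {"log_mass": 3.0, "trophic": 7.0})` compares two taxa on two made-up traits. Restricting the comparison to shared traits is one way to cope with the sparse data the slide mentions.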
Animal Diversity Web
http://www.animaldiversity.org
geographic range, habitats, physical description, reproduction, lifespan, behavior and trophic info, conservation status
Triples
“Esox lucius” hasMaxMass “1.4 kg”
“Esox lucius” isSubclassOf “Esox”
“Esox” eats “Actinopterygii”
ETHAN: Evolutionary Trees and Natural History ontology
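The three triples above can be held in a plain list and queried by pattern, as in this sketch. It is not how ETHAN or a real triple store is implemented; the helper names are hypothetical, and letting a species inherit "eats" statements from its parent taxon is an assumption about the kind of reasoning one might apply.

```python
TRIPLES = [
    ("Esox lucius", "hasMaxMass", "1.4 kg"),
    ("Esox lucius", "isSubclassOf", "Esox"),
    ("Esox", "eats", "Actinopterygii"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def ancestors(triples, taxon):
    """Follow isSubclassOf edges upward from taxon."""
    found = []
    frontier = [taxon]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match(triples, s=current, p="isSubclassOf"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


def diet(triples, taxon):
    """Prey stated for the taxon itself or (assumed) inherited from ancestors."""
    lineage = [taxon] + ancestors(triples, taxon)
    return [o for t in lineage for _, _, o in match(triples, s=t, p="eats")]
```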
UMBC Triple Shop: Query
What are body masses of fishes that eat fishes?
Enter a SPARQL query:
    SELECT DISTINCT ?predator ?prey ?preymaxmass ?predatormaxmass
    WHERE {
      ?link rdf:type spec:ConfirmedFoodWebLink .
      ?link spec:predator ?predator .
      ?link spec:prey ?prey .
      # Both predator and prey are ray-finned fishes (Actinopterygii)
      ?predator rdfs:subClassOf ethan:Actinopterygii .
      ?prey rdfs:subClassOf ethan:Actinopterygii .
      # Masses are optional, so links without mass data are still returned
      OPTIONAL { ?predator kw:mass_kg_high ?predatormaxmass } .
      OPTIONAL { ?prey kw:mass_kg_high ?preymaxmass }
    }
. . . leaving out the FROM clause
UMBC Triple Shop: Create a dataset
Find semantic web docs that can answer the query. http://swoogle.umbc.edu
Esox_lucius.owl, webs_publisher.php?published_study=11, Actinopterygii.owl
UMBC Triple Shop: Get results
Apply query to dataset with semantic reasoning. http://sparql.cs.umbc.edu/tripleshop2/
Summary
• Food Web Constructor uses an evolutionary approach and large databases
• We chose parameters using a subsample
• Explored results over the entire database
• Evolutionary distance weighting recalls links better than database search
• Older webs are useful
• Large webs are harder to predict
• Some phyla are easier than others to predict
• For future algorithms, we can gather and integrate data via ontologies and intelligent agents
http://spire.umbc.edu
UMBC: Tim Finin, Joel Sachs, Andriy Parafiynyk, Li Ding, Rong Pan, Lushan Han
UMCP: David Wang
RMBL: Neo Martinez, Rich Williams, Jennifer Dunne
UC Davis: Jim Quinn, Allan Hollander
UMMZ Animal Diversity Web: Phil Myers, Roger Espinosa
UMCP: Bill Fagan, Bongshin Lee, Ben Bederson
Others
ETHAN workflow
[Diagram labels: Keywords (HTML); Keywords (OWL); XSLT template; Filters; ETHAN taxon acct (OWL); ADW taxon acct (HTML); ADW database (MySQL); Acct data (tabular text); Animal name tree; Taxon path (OWL); ITIS plants, etc.; Phylum-sized ET chunk (OWL); Evolutionary Tree side of ontology (OWL); SPIRE taxon database (MySQL).]
Semantic Prototypes In Ecoinformatics
[Diagram labels: UMBC; Info. Retrieval Agents; Food Web Constructor; Evidence Provider; U Maryland; UC Davis; Semantic Web Tools; Species List constructor; NASA Goddard; Rocky Mtn Bio Lab; Invasive Species Forecasting System; Remote Sensing Data; Food Webs; Ecological Interaction Ontologies.]
Food Web Constructor example: Nile tilapia in St. Marks
http://spire.umbc.edu/fwc
Question: What are potential predators and prey of Oreochromis niloticus in the St. Marks estuary in Florida?
Procedure: Submit species list for St. Marks, with Oreochromis niloticus added.
Implications: parameterized functions
LinkPredictedCD = ƒ(α, sizeC, sizeD) + ƒ(β, stoichC, stoichD)
• Requires good data for target species
• Can incrementally add natural history functions to get a better estimate, try different functions from the literature, or use genetic algorithms
• Parameterizing functions: multivariate statistics, machine learning, fuzzy inference
• Could use evolutionary info if you localize parameter estimates to clades or taxonomic subsets
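The additive form of LinkPredictedCD can be illustrated as below. The slides leave the functions ƒ unspecified, so both function shapes here are invented placeholders: a size term that peaks when predator C is α times the size of prey D, and a stoichiometry term that penalizes mismatch in proportion to β. All names and the dictionary keys are hypothetical.

```python
import math


def size_term(alpha, size_c, size_d):
    """Placeholder size function: peaks at 1.0 when size_c / size_d == alpha."""
    log_ratio = math.log10(size_c / size_d)
    return math.exp(-(log_ratio - math.log10(alpha)) ** 2)


def stoich_term(beta, stoich_c, stoich_d):
    """Placeholder stoichiometry function: mismatch penalty scaled by beta."""
    return -beta * abs(stoich_c - stoich_d)


def link_predicted(alpha, beta, c, d):
    """Sum of the natural history terms, mirroring the additive form above."""
    return (size_term(alpha, c["size"], d["size"])
            + stoich_term(beta, c["stoich"], d["stoich"]))
```

Because the terms simply add, further natural history functions can be appended one at a time, and α and β could be fitted separately per clade to bring in evolutionary information.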
Distance weighting options
• Evolutionary
• Uses phylogeny or classification or a combination of these; assumes related organisms are like each other
• Distance could be branch length or # steps
• Does not need natural history data
[Diagram: X and Y, labeled "3 changes" and "2 steps".]
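Counting distance in steps can be sketched with a child-to-parent map. The tiny tree in the usage note is a simplified, hypothetical classification (intermediate ranks are skipped), the function names are made up, and the default limits of 2 steps up and 4 steps down are borrowed from the threshold slide. The sketch assumes the map really is a tree, with no cycles.

```python
def path_to_root(parent, taxon):
    """List the taxon followed by its ancestors, nearest first."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def steps_between(parent, x, y):
    """Return (steps up from x, steps down to y) via the nearest common ancestor."""
    up_path = path_to_root(parent, x)
    depth_from_y = {t: i for i, t in enumerate(path_to_root(parent, y))}
    for up, taxon in enumerate(up_path):
        if taxon in depth_from_y:
            return up, depth_from_y[taxon]
    return None  # x and y share no ancestor in this tree


def within_threshold(parent, x, y, max_up=2, max_down=4):
    """True if y can be reached from x within the step limits."""
    steps = steps_between(parent, x, y)
    return steps is not None and steps[0] <= max_up and steps[1] <= max_down
```

For example, with `parent = {"Esox lucius": "Esox", "Esox": "Actinopterygii", "Oreochromis niloticus": "Oreochromis", "Oreochromis": "Actinopterygii"}`, going from Esox lucius to Oreochromis niloticus takes two steps up and two steps down.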
Ontologies
Richer way to design databases: instances of concepts that have well-defined meanings and formal relationships.
[Diagram: TaxonA, HigherTaxon, TaxonB, Reproductive Characteristic, Breeding Season, Breeding Duration, Sexual Maturity, and Age of Sexual Maturity, connected by is-a and has-a relations.]
“Higher Taxon” lives in “Australia”
“Taxon A” lives in “Australia”
“Taxon A” hasAgeOfSexualMaturity “1 year”
“Taxon A” hasBreedingDuration “5 months”
“Taxon B” lives in “Australia”