Biological nomenclature in the postgenomic era: Biological and computational issues. George Garrity and Catherine Lyons, Bergey’s Manual Trust and Explicatrix, LLC
Imagine… • A clinical microbiologist’s predicament • The microbial ecologist’s dilemma • The case of Francisella novicida • The history of the Alteromonadaceae • Genus described in 1972 • 15 emendations, 20 species • 19 moved to four genera • 5 synonyms, two subspecies • 64 names, five genera, three families, two classes • The common thread in all these stories…
Stan Falkow’s Underwear “Given a choice, most taxonomists would rather wear each other’s underwear than use each other’s names” Why is this so?
My objective • Share some insights on problems in three areas • Nomenclature and taxonomy • Publishing taxonomic information • A generalized taxonomic model • Finite state machine • Simple grammar • Global issues • Data equivalence • Data provenance • Data curation
Problems in nomenclature • Systematic biologists • Marking territory • Personal achievement • Other biologists • End-users • Unfamiliar with literature • Unique aspects • Unaware of Codes of Nomenclature • Legalistic framework • Formation and assignment of names • Circumscription and emendation of taxa • Priority and citation • Synonymy and homonymy • Correction of orthographic errors • Adjudication of nomenclatural disputes • But • Do not govern classification or identification
Problems in nomenclature (cont.) • Biological names • Primary entry point into STM literature • Prominent role in laws/regulations • Commerce, public safety, public health • Primary entry point into scientific databases • Poor identifiers • Fixed in time and scope • May not be revised • Synonymies generally not addressed • Persist, but • obsolesce in relation to taxon • An archival record of a taxonomic definition for a single point in time
The name/taxon disjunction • Impact • Accumulation of dubious names in literature/databases • Affects assertions of: • Identity, commonality of pathways, common ancestry, homology, paralogy, xenology • Legal consequences
Problems in print publishing • Key requirement • Proposals and emendations must appear in print • Code specific • Prokaryotic Code • Effective, legitimate, and valid • Registration • Taxonomies are retrospective • Can only cite earlier publications • Cannot cite future emendations • Increasingly based on molecular sequence data • Deposit of sequence data in public databases • Not conveniently referenced in print
Problems with electronic publishing • No formal publishing mechanisms • Does not fulfill fundamental requirement of the Code(s) • Lack bibliographic information • Not citable • Not persistent • Subject to uncontrolled change • May disappear • Link rot • 404 Link not found
A brief glimpse at where we’re headed • The Bergamot/N4L model • Separates names from taxa • Taxa nameless • Uniquely, persistently identified • Supports multiple, overlapping taxonomies • Accumulation of new data vs. new methodologies • Rank agnostic • Unique from all other approaches • An identifier resolution service, not an information space in which to practice taxonomy. • Names provide an entry point into the literature • Reliably • Persistently • A lightweight information layer
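The separation of names from taxa described above can be sketched in a few lines. This is only an illustrative sketch, not the Bergamot/N4L implementation: the `Resolver` class, its method names, the `taxon:1` identifier format, and the placeholder names "Aus bus" and "Cus bus" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class NameRecord:
    """A published name: an archival record that is never edited."""
    text: str
    citation: str  # placeholder string, not a real reference


@dataclass
class Resolver:
    """Hypothetical identifier resolution service (assumed design)."""
    names: Dict[str, NameRecord] = field(default_factory=dict)
    # name text -> opaque identifiers of every taxon the name has pointed to
    name_to_taxa: Dict[str, Set[str]] = field(default_factory=dict)
    # taxon identifier -> name currently in use for that taxon
    current_name: Dict[str, str] = field(default_factory=dict)

    def register(self, name: NameRecord, taxon_id: str, current: bool = True) -> None:
        """Attach a name to a nameless, persistently identified taxon."""
        self.names[name.text] = name
        self.name_to_taxa.setdefault(name.text, set()).add(taxon_id)
        if current:
            self.current_name[taxon_id] = name.text

    def resolve(self, text: str) -> Set[str]:
        """Return the taxon identifiers a name leads to (empty if unknown)."""
        return set(self.name_to_taxa.get(text, set()))

    def synonyms(self, text: str) -> Set[str]:
        """Other names that share at least one taxon with the given name."""
        taxa = self.resolve(text)
        return {n for n, ids in self.name_to_taxa.items() if n != text and ids & taxa}


# Placeholder names and identifiers, for illustration only.
resolver = Resolver()
resolver.register(NameRecord("Aus bus", "placeholder citation 1"), "taxon:1", current=False)
resolver.register(NameRecord("Cus bus", "placeholder citation 2"), "taxon:1")
```

Because the taxon identifier never changes, an older name still resolves to the same taxon after the organism is renamed, which is the sense in which names remain a reliable entry point into the literature.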
A simple grammar
species -> current.name.pointer, exemplar.deposit.pointer+, sequence.deposit.pointer+
taxon -> current.name.pointer, nomos.defined.data, (taxon+|species+)
nomos.defined.data -> (sequence|phenotypic.feature|text)+
name -> (citation, bibliographic.record, name.status)
exemplar -> exemplar.id, source
sequence -> gene, sequence.deposit
source -> exemplar|exemplar.deposit|text
exemplar.deposit -> brc.id.pointer, deposit.id.pointer, source
sequence.deposit -> brc.id.pointer, deposit.id.pointer, source
phenotypic.feature -> feature.name, feature.value, deposit.id.pointer
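One way to read part of the grammar above is as a set of record types in which `+` means "one or more". The sketch below covers only the `species`, `taxon`, `nomos.defined.data`, `name`, `sequence`, and `phenotypic.feature` productions. The Python class and field names are assumptions derived from the production names, pointers are modeled as plain strings, and `sequence` holds a pointer to its deposit rather than a nested record, which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Name:
    # name -> (citation, bibliographic.record, name.status)
    citation: str
    bibliographic_record: str
    name_status: str


@dataclass
class PhenotypicFeature:
    # phenotypic.feature -> feature.name, feature.value, deposit.id.pointer
    feature_name: str
    feature_value: str
    deposit_id_pointer: str


@dataclass
class Sequence:
    # sequence -> gene, sequence.deposit (held here as a pointer string)
    gene: str
    sequence_deposit_pointer: str


# nomos.defined.data -> (sequence|phenotypic.feature|text)+
NomosDatum = Union[Sequence, PhenotypicFeature, str]


@dataclass
class Species:
    # species -> current.name.pointer, exemplar.deposit.pointer+, sequence.deposit.pointer+
    current_name_pointer: str
    exemplar_deposit_pointers: List[str]
    sequence_deposit_pointers: List[str]

    def __post_init__(self) -> None:
        # "+" in the grammar: at least one of each pointer is required.
        if not self.exemplar_deposit_pointers or not self.sequence_deposit_pointers:
            raise ValueError("species needs at least one exemplar and one sequence deposit")


@dataclass
class Taxon:
    # taxon -> current.name.pointer, nomos.defined.data, (taxon+|species+)
    current_name_pointer: str
    nomos_defined_data: List[NomosDatum]
    members: List[Union["Taxon", Species]]

    def __post_init__(self) -> None:
        if not self.nomos_defined_data:
            raise ValueError("taxon needs at least one defining datum")
        if not self.members:
            raise ValueError("taxon needs at least one member")
        # (taxon+|species+): members are all taxa or all species, never mixed.
        if len({type(m) for m in self.members}) != 1:
            raise ValueError("members must be all taxa or all species")


# Hypothetical identifiers, for illustration only.
species = Species("name:1", ["exemplar-deposit:1"], ["sequence-deposit:1"])
genus = Taxon("name:2", ["free-text circumscription"], [species])
```

Note that the taxon carries no rank field: nesting alone expresses the hierarchy, consistent with the rank-agnostic design stated earlier.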
[Diagram: the core model. Labels: Name+, Taxon, Species+, Exemplar+, Sequence+]
[Diagram: the core model with its external sources. Labels: GenBank, DDBJ, EMBL, others; Collections, BRC; Literature; Governing bodies; Name+, Taxon, Species+, Exemplar+, Sequence+]
[Diagram: the core model in its full context. Labels: Priority, Validity, Synonymy, Exemplar req.; Literature; Governing bodies; Name+, Taxon, Species+, Exemplar+, Sequence+; direct, indirect; GenBank, DDBJ, EMBL, others; Collections, BRC; phenotypic, genotypic, “omics”; Proposal; Practitioner+; STM, Legal, General; Databases, Public, Private]
[Diagram: six example cases assembled from Name+, “Name”+, Species+, Exemplar+, Exemplar*, and Sequence+ components. Case labels: A properly formed species; Candidatus or exemplar lost; Environmental sequence; Old type strain, not yet sequenced; Old type, exemplar based on drawing or description; Misidentified taxon]
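The six cases named on this slide can be read as different combinations of present and missing components. The sketch below is an assumption-heavy illustration: the diagram itself is not recoverable, so the mapping from components to cases is inferred from the slide labels, and the `Record` fields and their string values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str           # "published", "doubtful" (the quoted "Name"), or "none"
    exemplar: str       # "deposited", "described" (drawing or description only), or "none"
    has_sequence: bool  # at least one sequence deposit exists


def classify(record: Record) -> str:
    """Map a record to one of the slide's case labels (assumed mapping)."""
    if record.name == "doubtful":
        return "misidentified taxon"
    if record.name == "none":
        if record.has_sequence and record.exemplar == "none":
            return "environmental sequence"
        return "unclassified"
    # From here on the record carries a published name.
    if record.exemplar == "deposited":
        if record.has_sequence:
            return "properly formed species"
        return "old type strain, not yet sequenced"
    if record.exemplar == "described":
        return "old type, exemplar based on drawing or description"
    if record.has_sequence:
        return "Candidatus or exemplar lost"
    return "unclassified"


# Example record: a published name with a deposited exemplar but no sequence.
example = classify(Record(name="published", exemplar="deposited", has_sequence=False))
```

A check of this kind would let a resolution service flag records that fall short of the properly formed case instead of silently treating every name as equally well supported.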
[Diagram: the core model and its external sources, with N4L/Bergamot added. Labels: GenBank, DDBJ, EMBL, others; Collections, BRC; Literature; Governing bodies; N4L/Bergamot; Name+, Taxon, Species+, Exemplar+, Sequence+]
A bit of background information • Bergey’s Manual Trust • Principal information source • Bergey’s Manual of Determinative Bacteriology • Bergey’s Manual of Systematic Bacteriology • Taxonomic Outline of the Procaryotes • Expertise in content packaging/delivery • SGML/XML publishing • The Systematics • XML compliant SGML instance • The Outline • An experiment in SGML/XML publishing • Derivative projects • Bergamot/N4L • The Determinative