

2nd International Barcode of Life Conference 18 September 2007

The BARCODE Data Standard: Enabling Molecular Diagnostics for Biodiversity

Robert Hanner, PhD

Database Working Group Chair, CBOL

Global Campaign Coordinator, FISH-BOL

Associate Director, Canadian Barcode of Life Network

Biodiversity Institute of Ontario, University of Guelph, Canada

The Infrastructure of Taxonomy
  • Collections and databases of specimens
  • Codes of Taxonomic Nomenclature
  • Compilations of taxonomic names
  • Data repositories (characters, gene sequences, images, trees)
  • Monographs
  • Floristic and faunistic surveys/inventories
  • Revisions
  • The (undigitized) Taxonomic Literature
Roles of INSD: an archival database/repository for nucleotide sequences

[Diagram labels: Output of Project A; Output of Project B; Output of Project C; Common access interface; Users]

Assignment of a unique identifier (an accession number) to a sequence

Standardization of data structure including data items and values


New tools for taxonomy

DNA Barcoding

The ability to compare genotype information across a huge range of organisms is a powerful tool

Validation demonstrates that a procedure is robust, reliable and reproducible.

PCR amplification and DNA sequencing:

  • Are robust methods that produce successful results a high percentage of the time.
  • Are reliable methods that produce accurate results.
  • Are reproducible methods producing similar results each time a sample is tested.

Manual Assembly

Subjective interpretation?


“Only [27%] of papers had a legitimate specimens examined section, with museum numbers for each voucher, and names of the museums where the specimens used in the study could be examined”

Couplets Consisting of: “Species Name - DNA Sequence”

Basis of a “look-up table” enabling molecular diagnostic applications

However, both elements are assertions

Underlying specimens and associated raw sequence data are not typically available for secondary inspection
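
To make the "look-up table" idea concrete, here is a minimal Python sketch of identifying an unknown from name/sequence couplets. Everything in it is an illustrative assumption rather than the method of any particular database: the function names, the toy sequences, the placeholder species names and the 98% identity threshold are all hypothetical, and the sequences are assumed to be pre-aligned and of equal length.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def identify(query, reference, min_identity=98.0):
    """Return (species_name, identity) for the best-matching couplet.

    The name is None when no reference sequence reaches min_identity
    (a hypothetical threshold chosen for this sketch).
    """
    best_name, best_score = None, 0.0
    for name, sequence in reference.items():
        score = percent_identity(query, sequence)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_identity:
        return None, best_score
    return best_name, best_score


# Toy couplets: asserted species name -> asserted DNA sequence (placeholders).
reference_table = {
    "Species A": "ACGTACGTAC",
    "Species B": "ACGTTTGTAC",
}

result = identify("ACGTACGTAC", reference_table)
```

The look-up returns whatever name was asserted for the closest sequence. Nothing in the table itself lets a user check the specimen or the raw sequence data behind either element of the couplet.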

Problem Areas

TRANSPARENCY AND TRACEABILITY

  • Genetic Data Quality
  • Specimen Data Quality
  • Taxonomy
  • Information Access
First International Barcode of Life Conference: Feb 5-8, 2005

Rationale for Defining “BARCODE” keyword in GenBank
  • Provides the community with reference records with verifiable and retrievable data:
    • Associated with retrievable voucher specimens (liberally defined: tissue, DNA, etc.)
    • Linked to on-line metadata
    • Meet an agreed upon standard of taxonomic identification
    • Provide an assured level of data completeness
    • On an agreed upon gene region
    • Recommended for use in identifying unknowns

The Barcode Data Standard

  • Establishing a new data standard for “BARCODE” keyword records in DDBJ/EMBL/GenBank:
  • Minimum 500 bp, <1% ambiguous base calls
  • Double stranded sequence
  • Trace files and associated quality scores
  • Primers used to generate sequence
  • Linkages to:
    • A morphological voucher specimen
    • Structured reference to collections
    • Geospatial reference information
    • Valid species name
    • Who performed the identification
    • Literature citations
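
The two numeric criteria in the list above (minimum length and ambiguous base calls) lend themselves to an automated check. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that any character other than A, C, G or T counts as an ambiguous base call. The remaining requirements (trace files, primers, voucher linkages and so on) are not covered.

```python
UNAMBIGUOUS_BASES = set("ACGT")


def meets_length_and_ambiguity(sequence, min_length=500, max_ambiguous_fraction=0.01):
    """Check a sequence against the length and ambiguity criteria only.

    Assumption for this sketch: every character other than A, C, G, T
    (for example N or other IUPAC codes) is an ambiguous base call.
    """
    bases = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    if len(bases) < min_length:
        return False
    ambiguous = sum(1 for base in bases if base not in UNAMBIGUOUS_BASES)
    # "<1%" is read as a strict inequality.
    return ambiguous / len(bases) < max_ambiguous_fraction


clean = "ACGT" * 125               # 500 bases, no ambiguous calls
too_short = "ACGT" * 124           # 496 bases
too_ambiguous = "N" * 5 + "A" * 495  # 500 bases, exactly 1% ambiguous
```

With these inputs, `meets_length_and_ambiguity(clean)` passes, while `too_short` fails on length and `too_ambiguous` fails because 1% is not strictly below the limit.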
Features, Qualifiers and Values

The Feature table is updated based on discussions at the International Collaborators meeting of INSDC


NCBI Barcode Submission Tool in Beta Test Phase

Since 2005, better software, more sequences, better links to museum vouchers…

Triplet structure for specimen identifiers

/specimen_voucher=“<institution-code>|<collection-code>|<specimen-id>”

<institution-code> - abbreviation of the archiving institution

<collection-code> - collection within the institution (possibly null) (*)

<specimen-id> - specimen identifier within the collection

The above approach is used in Darwin Core/GBIF and parallels the Life Science Identifier (LSID), an Object Management Group (OMG) standard.

(*) Collections include:
  • museums
  • herbaria
  • culture collections
  • stock centers
  • germplasm repositories (seed banks)
  • frozen tissue banks
  • zoos/aquaria/botanical gardens
  • DNA banks
  • personal collections
  • e-voucher archives
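
The triplet structure for specimen identifiers can be sketched as a small parser. This is a hedged illustration, not an official implementation: the class and function names are hypothetical, the example codes are made up, and the `|` separator simply follows the notation shown on the slide (it is a parameter in case real records use a different delimiter).

```python
from typing import NamedTuple, Optional


class SpecimenVoucher(NamedTuple):
    institution_code: str
    collection_code: Optional[str]  # the slide notes this part may be null
    specimen_id: str


def parse_specimen_voucher(value, separator="|"):
    """Split '<institution-code>|<collection-code>|<specimen-id>' into its parts."""
    parts = [part.strip() for part in value.split(separator)]
    if len(parts) != 3:
        raise ValueError("expected three fields, got %d" % len(parts))
    institution, collection, specimen = parts
    if not institution or not specimen:
        raise ValueError("institution code and specimen id are required")
    # An empty middle field is treated as a null collection code.
    return SpecimenVoucher(institution, collection or None, specimen)


# Made-up example values, for illustration only.
full = parse_specimen_voucher("XYZ|Fish|12345")
no_collection = parse_specimen_voucher("XYZ||12345")
```

Keeping the three parts as separate named fields, rather than one free-text string, is what allows a sequence record to be matched back to a specific specimen in a specific collection.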

Summary
  • INSDC is an archival genetic database in the public domain
  • BOLD is a public/private workbench for assembling BARCODE compliant projects & supports the organization of barcode campaigns
  • BOLD and GenBank continue to develop routines for synchronization and interoperability
  • As of this meeting, the BARCODE Data Standard is ready for full implementation!

Acknowledgments:
  • All Participants of the CBOL Database Working Group
  • Scott Federhen, NCBI
  • Donald Hobern, GBIF
  • Scott Miller, Smithsonian Institution
  • David Schindel, CBOL
  • Sujeevan Ratnasingham, Biodiversity Institute of Ontario