
The BARCODE Data Standard

David E. Schindel, Executive Secretary, National Museum of Natural History, Smithsonian Institution. SchindelD@si.edu; http://www.barcoding.si.edu; 202/633-0812; fax 202/633-2938.

Presentation Transcript


  1. The BARCODE Data Standard. David E. Schindel, Executive Secretary, National Museum of Natural History, Smithsonian Institution. SchindelD@si.edu; http://www.barcoding.si.edu; 202/633-0812; fax 202/633-2938

  2. BARCODE Data Standard is: • A set of required elements for a reserved keyword (‘BARCODE’) in GenBank • A set of sequence quality requirements • Required or recommended formats for data interoperability with voucher specimens in biorepositories, georeferenced data, and taxonomic literature

  3. The Mitochondrial Genome: An Internal ID System for All Animals. [Figure: a typical animal cell, a mitochondrion, and the circular mtDNA map with L-strand and H-strand. Labelled regions: small ribosomal RNA, D-loop, cytochrome b, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunit 6, ATPase subunit 8.]

  4. Non-COI regions for other taxa • Land plants: chloroplast matK and rbcL approved Nov 09; 70-75% resolving ability, higher in angiosperms; non-coding plastid and nuclear regions being explored • Fungi: CBOL Working Group met this week in Amsterdam; agreed to recommend ITS; 72% effective • Protists: CBOL Working Group July meeting, Berlin

  5. BARCODE Record Flow Chart. [Figure: flow chart of BARCODE records between the user and GenBank. Key: mirroring, update channel, private records.]

  6. BARCODE Records in GenBank

  7. Submission of BARCODE Records to EBI and DDBJ

  8. Required Elements for BARCODE • Taxonomic identification to species • Voucher specimen ID in standard format • Name of barcode region • Length, quality, 2 trace files • Forward/reverse primer sequences, names • Country/Ocean/Sea of origin

  9. Highly Recommended Elements • Latitude/longitude • Name of Collector • Collection date • Name of identifier
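
The required and highly recommended elements on slides 8 and 9 can be read as a checklist. The sketch below is one hedged way to apply that checklist to a record held as a plain dictionary. The field names (`species_name`, `voucher_id`, and so on) and the function `check_record` are hypothetical, chosen for illustration; the slides list the elements but do not define a schema, and they give no numeric thresholds for length or quality, so none are checked here.

```python
# Hypothetical field names: the slides name the elements, not a schema.
REQUIRED = (
    "species_name",     # taxonomic identification to species
    "voucher_id",       # voucher specimen ID in standard format
    "barcode_region",   # name of barcode region, e.g. COI
    "sequence",         # the barcode sequence itself
    "trace_files",      # the slide asks for 2 trace files
    "forward_primer",
    "reverse_primer",
    "country",          # country, ocean, or sea of origin
)
RECOMMENDED = ("lat_lon", "collector", "collection_date", "identified_by")


def check_record(record):
    """Return (missing_required, missing_recommended) for a dict record."""

    def absent(key):
        value = record.get(key)
        return value is None or value == "" or value == []

    missing_required = [key for key in REQUIRED if absent(key)]

    # Flag a record that has some trace files but fewer than two.
    traces = record.get("trace_files") or []
    if 0 < len(traces) < 2:
        missing_required.append("trace_files")

    missing_recommended = [key for key in RECOMMENDED if absent(key)]
    return missing_required, missing_recommended
```

A record that fills every field and carries two trace files returns two empty lists; anything else returns the names of the elements still to be supplied.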

  10. BARCODE Records in INSDC. [Diagram of the data linked to a BARCODE record. Labels: specimen metadata; voucher specimen; species name; georeference; habitat; character sets; images; behavior; other genes; indices (Catalogue of Life, GBIF/ECAT); nomenclators (Zoo Record, IPNI, NameBank); publication links (new species); barcode sequence; trace files; primers; literature citation; record in BOLD; databases (provisional sp.).]

  11. Compliance with Standard (1) • 1.37 million records in BOLD • 514,390 BARCODE records in INSDC • 395,774 have ordinal name plus Barcode Index Number for taxonomic ID • Rapid data release versus time for annotation • Exposure to data theft, risk of misidentification • Added value of Linnean name • Incidence of misidentifications in GenBank • Danger of circular reasoning

  12. Taxonomic Identification • The genus and species combination that can be found in: a taxonomic index such as Catalog of Life, Zoological Record or IPNI; a taxonomic treatment of a previously published species name; or a published description of the species • Or a provisional label for a potential new species

  13. Rod Page’s ‘Dark Taxa’ (R. Page, iPhylo blog, 12 April 2011)

  14. Taxonomic Content in iBOL Data • iBOL ‘Phase 1’: Org name: Order + BIN; Tentative name: blank • GenBank ‘Phase 0’: Tentative name is in BOLD, unreleased • GenBank ‘Phase 1’: Org name = Order + BIN, plus Tentative name • iBOL ‘Phase 2’: Org name: Order + BIN; Tentative name: blank • GenBank ‘Phase 2’: Org name = sp. name

  15. Unique identifier for the voucher specimen, in standardized format based on Darwin Core: • Institutional acronym:Collection code:Specimen number • Institutional acronym:Specimen number • personal:Collection code:Specimen number (GTI/CBOL/iBOL Workshop, 7 November 2009)
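
The three colon-separated forms on slide 15 lend themselves to a small parser. The following is a minimal sketch, assuming that colons never appear inside a field and that a two-part identifier always means institution plus specimen number; the names `Voucher` and `parse_voucher` are hypothetical and not part of Darwin Core or any GenBank tool.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    institution: str           # institutional acronym, or "personal"
    collection: Optional[str]  # collection code; None in the two-part form
    specimen: str              # specimen (catalog) number


def parse_voucher(text: str) -> Voucher:
    """Split a voucher ID into its Darwin Core style parts.

    Assumes ':' is used only as the field separator.
    """
    parts = [part.strip() for part in text.split(":")]
    if any(not part for part in parts):
        raise ValueError(f"empty field in voucher ID: {text!r}")
    if len(parts) == 3:
        return Voucher(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return Voucher(parts[0], None, parts[1])
    raise ValueError(f"expected 2 or 3 colon-separated fields: {text!r}")
```

For example, `parse_voucher("NHM:LEP:123456")` yields institution `NHM`, collection `LEP`, and specimen `123456`, while a free-text voucher with no colon raises `ValueError`, which is one way to separate formatted vouchers from unformatted ones.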

  16. Compliance with Standard (2) • 514,390 BARCODE records in INSDC • Traces, primers, length, country, and presence of voucher ID checked by GenBank • 99.9% have entry for /specimen_voucher • 13,151 have formatted voucher from 38 institutions: 20 confirmed in biorepositories, 11 unconfirmed, 7 unlisted
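
To put the slide 16 counts in proportion, the short calculation below derives the share of BARCODE records with a formatted voucher and checks that the institution breakdown adds up to the stated total. It uses only the figures on the slide; the variable names are illustrative.

```python
# Figures as given on slide 16.
barcode_records = 514_390
formatted_vouchers = 13_151
institutions = {"confirmed": 20, "unconfirmed": 11, "unlisted": 7}

# Share of BARCODE records whose voucher ID follows the structured format.
formatted_share = formatted_vouchers / barcode_records

# The three institution categories should sum to the 38 on the slide.
institution_total = sum(institutions.values())

print(f"Formatted vouchers: {formatted_share:.2%} of BARCODE records")
print(f"Institutions accounted for: {institution_total}")
```

The contrast is the point of the slide: nearly every record has some /specimen_voucher entry, but only about one in forty has one in the structured format.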

  17. Darwin Core Triplet: Structured Link to Vouchers. Format: Institutional Acronym : Collection Code : Catalog ID • NHM : LEP : 123456 • personal : DHJanzen : SRNP12345

  18. CBOL/GBIF/NCBI Registry of Biorepositories www.biorepositories.org
