Having it all

ivana

Presentation Transcript


  1. Having it all
     • Complete: every occurrence is found
     • Precise: every occurrence is accurate
     • Comprehensive: all types of features
     • Richly described: biological, functional, and cross-species data
     • What is needed to make this really happen?
     • Can’t be swept under the rug.

  2. Managing annotation changes
     • More information:
       • Assembly changes, transcriptome data, comparative data, cis-regulatory element data, mass spec data, repeat studies
     • Will lead to:
       • Additions, merges, splits, splerges, deletions, new feature types
     • Compounded by:
       • Conflict resolution and innate complexity
       • Alternate transcripts, dicistronic genes, overlaps and intersections
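The change types listed on slide 2 can be illustrated with a minimal sketch. It assumes that old and new gene models sit on the same assembly and that "same locus" simply means overlapping coordinates; a real pipeline would also need strand, exon structure, and coordinate mapping across assembly versions. The function names, the `("old", id)` / `("new", id)` keys, and the example coordinates are hypothetical and do not come from the presentation. A "splerge" is treated here as a split and a merge tangled together.

```python
def overlaps(a, b):
    """True when two closed (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_changes(old, new):
    """Label changes between two annotation versions.

    old, new: dicts mapping gene ID -> (start, end) on the same assembly.
    Returns a dict keyed by ("old", id) or ("new", id); unchanged
    one-to-one genes get no entry.
    """
    old_hits = {o: [n for n, span in new.items() if overlaps(old[o], span)]
                for o in old}
    new_hits = {n: [o for o, span in old.items() if overlaps(new[n], span)]
                for n in new}
    events = {}
    for o, hits in old_hits.items():
        if not hits:
            events[("old", o)] = "deletion"
        elif len(hits) > 1:
            # One old gene now covered by several new ones: a split, unless
            # one of those new genes also absorbs another old gene.
            tangled = any(len(new_hits[n]) > 1 for n in hits)
            events[("old", o)] = "splerge" if tangled else "split"
    for n, hits in new_hits.items():
        if not hits:
            events[("new", n)] = "addition"
        elif len(hits) > 1:
            # One new gene covering several old ones: a merge, unless one of
            # those old genes was also split across other new genes.
            tangled = any(len(old_hits[o]) > 1 for o in hits)
            events[("new", n)] = "splerge" if tangled else "merge"
    return events


# Hypothetical coordinates, for illustration only.
OLD = {"g1": (100, 500), "g2": (600, 900), "g3": (1000, 1200),
       "g4": (2000, 2400), "g5": (2500, 2900)}
NEW = {"n1": (100, 250), "n2": (300, 500), "n3": (600, 1200),
       "n4": (1500, 1700), "n5": (2500, 2900)}
CHANGES = classify_changes(OLD, NEW)
```

In this toy data `g1` is split, `g2` and `g3` are merged into `n3`, `g4` is deleted, `n4` is added, and `g5` is unchanged. Conflict resolution, which the slide names as a compounding factor, is deliberately left out.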

  3. The Essentials
     • Full-length cDNA sequences and other ‘hard’ biological evidence
     • High-quality assemblies
     • Manual editors for curation
     • Combiners
     • Annotation standards and verification
     • Tracking and versioning
     • Open source software components and standards are critical to long-term success

  4. Integration
     • Agree on Standards
       • GFF3 file format
       • Sequence Ontology
       • Gene Ontology
       • Phenotype…
     • Agree on Process
       • Exchange (DAS2)
       • Convergence
       • Versioning
       • Feedback

  5. GFF3
     • Need a common exchange format
     • To share and distribute data
     • Easy to transfer between databases
     • Then can be visualized and seen by everyone!
     • And can be edited and commented on, e.g. in Apollo
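To show why GFF3 is easy to move between databases, here is a minimal parsing sketch. It relies only on the published layout of the format: nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), with attributes written as `tag=value` pairs separated by semicolons and multiple values separated by commas. The function name and the example record are hypothetical, and the sketch skips parts of the format such as directives and embedded FASTA.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict.

    Returns None for blank lines and for comment or directive lines
    (those starting with '#').
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if not pair:
                continue
            tag, _, value = pair.partition("=")
            # Values are percent-encoded; a tag may carry several values.
            attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Hypothetical record, for illustration only.
EXAMPLE_LINE = ("chr1\texample\texon\t1300\t1500\t.\t+\t.\t"
                "ID=exon00001;Parent=mRNA00001,mRNA00002")
EXAMPLE_RECORD = parse_gff3_line(EXAMPLE_LINE)
```

The `Parent` attribute is what links an exon to one or more transcripts, which is the relationship a viewer or editor needs in order to draw and edit gene models.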

  6. SO enables rigorous description and querying of the data
     • How often are exons unique to a transcript?
     • How often does an exon appear in all of the transcripts?
     • For exons that occur in all the transcripts, how often are they coding exons?
     • For exons that occur in only one of the transcripts, how often are they noncoding?
     • Do unique exons contain the stop codon more often than exons in all the transcripts?
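The first two questions on slide 6 reduce to counting how many transcripts of a gene contain each exon. The sketch below assumes that exons are identified by their coordinates and that the transcripts of one gene have already been grouped; the function name and the example gene are hypothetical. It does not use the Sequence Ontology itself, it only shows the kind of query that consistent feature typing makes possible.

```python
from collections import Counter


def exon_sharing(transcripts):
    """Split the exons of one gene by how widely they are shared.

    transcripts: dict mapping transcript ID -> set of (start, end) exons.
    Returns (unique, shared): exons found in exactly one transcript and
    exons found in every transcript. For a single-transcript gene every
    exon lands in both sets.
    """
    counts = Counter(exon
                     for exons in transcripts.values()
                     for exon in set(exons))
    total = len(transcripts)
    unique = {exon for exon, n in counts.items() if n == 1}
    shared = {exon for exon, n in counts.items() if n == total}
    return unique, shared


# Hypothetical gene with three alternative transcripts.
GENE = {
    "tx1": {(100, 200), (300, 400), (500, 600)},
    "tx2": {(100, 200), (300, 400), (700, 800)},
    "tx3": {(100, 200), (450, 480), (500, 600)},
}
UNIQUE_EXONS, SHARED_EXONS = exon_sharing(GENE)
```

Answering the coding, noncoding, and stop codon questions would additionally need CDS features for each transcript, which this sketch leaves out.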

  7. Annotation integration
     • DAS2
     • IGB (Gregg Helt @ Affy)
     • Apollo (Suzanna Lewis @ Berkeley)
     • Allow exchange of annotations between researchers at different sites.

  8. Design and Plan for Integration
     • It is essential to define standards tightly at the outset
     • It is essential to have ongoing assessment and evaluation of the accumulated, pooled data
