CACAO Literature-based Functional Annotation as an Intercollegiate Competition

ASM/JGI Functional Genomics Workshop 2011. Jim Hu, Texas A&M, PortEco/EcoliWiki.

Presentation Transcript


  1. CACAO Literature-based Functional Annotation as an Intercollegiate Competition • ASM/JGI Functional Genomics Workshop 2011 • Jim Hu, Texas A&M, PortEco/EcoliWiki ASM/JGI Functional Genomics 2011 slide 1

  2. Objectives • Our goals for this unit are to train you to be able to: • make functional annotations to the Gene Ontology (GO) based on the literature • teach your students to do GO annotation with your supervision • participate in the international Community Assessment of Community Annotation with Ontologies (CACAO) competitions • After this unit, participants should • be able to distinguish different levels of annotation • be able to explain how Gene Ontology (GO) represents gene function, and why it is valuable • be able to make annotations to GO, finding terms with online term browsers (GONUTS, AmiGO) • be able to describe how CACAO couples GO annotation with undergraduate education • be able to incorporate GO into any papers that come from future functional genomics experiments

  3. Additional materials http://gowiki.tamu.edu/wiki/index.php/ASM-JGI_2011 Includes: • This presentation • An extended unit covering similar material in more detail

  4. What is annotation? L. Stein (2001) Nature Reviews Genetics 2:493

  5. Classic MODel [Diagram labels: Literature, Datasets, Curators (rate limiting), Database]

  6. Classic MODel is Expensive

  7. Community Annotation with Students • Goal: Recruit more community participation • Problem: Incentives to participate are weak • Approach: Couple GO* annotation to teaching • Do this as a competition • Teams • Students get points for annotations • Students can take points for correcting each other • Different scales: within a course, within a campus, between schools *GO = Gene Ontology

  8. What are Ontologies and why use them? • What? • Controlled vocabulary • Relationships • Why? • Standardization • Facilitate comparison across systems • Facilitate computer-based reasoning systems • Good for data mining!
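As a rough illustration of why a controlled vocabulary with explicit relationships supports computer-based reasoning, the sketch below walks parent links in a tiny hand-made graph. The edges are a simplified toy, not the real GO structure, and the names `TOY_ONTOLOGY`, `ancestors`, and `is_kind_of` are hypothetical.

```python
from collections import deque

# Toy edges: child term -> list of (relationship, parent term).
# ASSUMPTION: simplified for illustration; this is NOT the real GO graph.
TOY_ONTOLOGY = {
    "glucose-6-phosphate isomerase activity": [("is_a", "isomerase activity")],
    "isomerase activity": [("is_a", "catalytic activity")],
    "catalytic activity": [("is_a", "molecular_function")],
    "molecular_function": [],
}


def ancestors(term, ontology):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relationship, parent in ontology.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def is_kind_of(term, candidate_ancestor, ontology):
    """True if `candidate_ancestor` is a direct or indirect parent of `term`."""
    return candidate_ancestor in ancestors(term, ontology)


if __name__ == "__main__":
    specific = "glucose-6-phosphate isomerase activity"
    print(sorted(ancestors(specific, TOY_ONTOLOGY)))
    print(is_kind_of(specific, "catalytic activity", TOY_ONTOLOGY))
```

Because the relationships are explicit, a gene product annotated to the specific term can be retrieved by a query for any broader term, which is the sense in which ontologies are good for data mining.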

  9. GO = Gene Ontology • 3 ontologies for gene products • Biological Process • Molecular Function • Cellular Component • Used to make annotations • aka Gene associations • Term + qualifiers + evidence code + reference etc. [Figure from GO consortium presentations; legend: is_a, part_of]
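The "term + qualifiers + evidence code + reference" structure of an annotation can be sketched as a small record type. This is a minimal sketch, not the GO consortium's or GONUTS's actual data model: the class name, field names, and the placeholder identifiers in the usage example are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# One-letter codes for the three ontologies named on the slide.
ASPECTS = {
    "P": "Biological Process",
    "F": "Molecular Function",
    "C": "Cellular Component",
}


@dataclass
class GoAnnotation:
    """Hypothetical record for one gene association (illustrative only)."""

    gene_product: str   # e.g. a UniProt accession
    go_term: str        # GO term identifier or name
    aspect: str         # "P", "F", or "C"
    evidence_code: str  # e.g. "IDA" or "IMP"
    reference: str      # e.g. a PubMed identifier
    qualifiers: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect!r}")

    def summary(self):
        """One-line, human-readable description of the annotation."""
        quals = " ".join(self.qualifiers)
        prefix = f"{quals} " if quals else ""
        return (
            f"{self.gene_product}: {prefix}{self.go_term} "
            f"[{ASPECTS[self.aspect]}] {self.evidence_code} {self.reference}"
        )


if __name__ == "__main__":
    # Placeholder values, not a real annotation.
    example = GoAnnotation(
        gene_product="UNIPROT_ACCESSION",
        go_term="glucose-6-phosphate isomerase activity",
        aspect="F",
        evidence_code="IDA",
        reference="PMID:PLACEHOLDER",
    )
    print(example.summary())
```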

  10. Cellular Component • where a gene product acts

  11. Molecular Function • activities or “jobs” of a gene product • Example: glucose-6-phosphate isomerase activity (figure from GO consortium presentations)

  12. Biological Process • a commonly recognized series of events • Example: cell division (figure from Nature Reviews Microbiology 6, 28-40 (January 2008))

  13. Set up GONUTS accounts • Go to http://gowiki.tamu.edu • Log in as Demo (a user that cannot edit, but that can create accounts) • Click "log in" in the upper right corner • Username: Demo • Password: • Create an account for yourself • Click Login/Create Account on the left sidebar • Click Create Account • Enter your information • Log in as yourself

  14. GONUTS • wiki for Gene Ontology (unofficial) • Kinds of pages: • GO terms (Categories) • Gene products (proteins) from UniProt • where the annotations go • Publications from PubMed • Misc. other

  15. GONUTS GO term browsing demo • Enter "anhydrase" in the search box • Click Search • Restrict the search to Category pages • All GO terms are categories • Go to the term page and review the sections • Edit the usage notes

  16. Annotation on GONUTS • First we need a gene page to hold the annotations • Users can create gene pages for anything in UniProt. • New gene pages are populated with information, including previous GO annotations.

  17. Key elements of a GO annotation • Submitted to GO consortium • Viewable on GONUTS

  18. Community Assessment of Community Annotation with Ontologies • Teams of students curate • Faculty supervision • Support from our team • Intramural or Intercollegiate competition • Distributed annotation jamborees • Assessment via surveys and wiki data-mining • CACAO: coupling annotation to teaching credit

  19. CACAO is competitive • Teams get points for complete annotations • GO term (right level of specificity) • reference • evidence code • identify where in the paper the evidence comes from • Teams can take away points from competitors by challenging annotations • finding a problem • suggesting a better alternative
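The scoring idea above (points for complete annotations, points lost to successful challenges) might be sketched as follows. The slides do not give point values or the exact rules, so `POINTS_PER_ANNOTATION`, `POINTS_PER_CHALLENGE`, the dictionary keys, and the rule that an upheld challenge moves credit to the challenger are all assumptions for illustration, not the actual CACAO scoreboard logic.

```python
# The four elements of a complete annotation listed on the slide.
REQUIRED_FIELDS = ("go_term", "reference", "evidence_code", "evidence_location")

# ASSUMPTION: point values are invented; the slides do not state them.
POINTS_PER_ANNOTATION = 10
POINTS_PER_CHALLENGE = 5


def is_complete(annotation):
    """An annotation is complete when every required field is non-empty."""
    return all(annotation.get(name) for name in REQUIRED_FIELDS)


def score_teams(annotations):
    """Return {team: points} under the assumed rules described above.

    Each annotation is a dict with a "team" key, the required fields, and
    optionally "upheld_challenge_by" naming the team whose challenge was
    upheld by the judges.
    """
    scores = {}
    for annotation in annotations:
        team = annotation["team"]
        scores.setdefault(team, 0)
        challenger = annotation.get("upheld_challenge_by")
        if challenger:
            # The annotating team earns nothing; the challenger is rewarded.
            scores[challenger] = scores.get(challenger, 0) + POINTS_PER_CHALLENGE
        elif is_complete(annotation):
            scores[team] += POINTS_PER_ANNOTATION
    return scores


if __name__ == "__main__":
    complete = {
        "go_term": "cell division",
        "reference": "PMID:PLACEHOLDER",
        "evidence_code": "IMP",
        "evidence_location": "Figure 2",
    }
    demo = [
        {"team": "A", **complete},
        {"team": "A", **complete, "upheld_challenge_by": "B"},
        {"team": "B", **complete, "reference": ""},
    ]
    print(score_teams(demo))
```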

  20. Tracking the players • An extension tag added to a user page identifies all the annotations made by that user • Exercise: Edit your user page to add <myAnnotations/>

  21. Tracking the teams • Team members are assigned to a wiki category • An extension identifies all the annotations made by team members

  22. Tracking the teams • Team members are assigned to a wiki category • An extension identifies all the annotations made by team members

  23. Submitting challenges • Clicking submit challenge brings up a challenge form

  24. Responding to challenges • Pending challenges are shown in the Team pages

  25. Responding to challenges • A form records responses to challenges

  26. Overall scoreboard • A scoreboard page gathers information about all teams and challenges

  27. Overall scoreboard • A scoreboard page gathers information about all teams and challenges

  28. Judgement • Mentors with curator experience judge the challenges/rebuttals

  29. History • We just completed our 3rd semester of CACAO • EcoliWiki GO annotation by our staff would be ~250-400/semester

  30. Evolution of CACAO • What's changed since the first cycle? • Multiple rounds • More time for challenges • Development of our online scoreboard system • Changes in allowed evidence types • no IPI, EXP • More documentation • Rule tweaks as we learn how students will game the system • Outreach: In the Spring 2011 round we had instructors who are new to GO

  31. Spring 2011 • Training • Annotation • 4 rounds + "World series" • 1 week annotation • 1 week challenges + rebuttals • Judgement and posting of scores • Some did a subset of the rounds (18-25 students in rounds 1-4) • All did the World Series (105 students) • Assessment • Every annotation was reviewed by us • Parallel training of a grad student in judging

  32. Spring 2011 • PortEco/EcoliWiki provided • web infrastructure via GONUTS • training and guest lectures via Skype • handouts • PowerPoints • instructor manuals • coaching students via email, Skype, Google chat • online surveys for assessment • We restricted the kinds of annotations that would be scored • We did not restrict what genes/organisms to use • GONUTS only allows what's in UniProt • We promoted certain areas • Provided reviews • Predictions from computational methods

  33. Results • All rounds

  34. By organism

  35. How we view the results • Overall, we think CACAO works • Lots of annotations • The students love it • Quality remains a challenge, but • quality seems correlated with experience • QC is relatively fast

  36. Plans and Challenges • Adjust the system to promote better annotations and challenges • Improve the scoreboard/tracking system • More flexible • Improve UI • Data mining needs improvement • More documentation • Analyze common errors • Improve assessment • We want to do serious assessment, which means human subjects and IRBs. We need a collaborator for this. • Outreach • We want more participants, but can we handle them?

  37. People • TAMU: Brenley McIntosh, Adrienne Zweifel, Mahitha Rajendran, Daniel Renfro, Debby Siegele • UCL: Ruth Lovering, Varsha Khodiyar • Miami (Ohio): Iddo Friedberg • Univ. of N. Texas: Lee Hughes • Michigan State: Rob Britton • Penn State: Sarah Ades

  38. CACAO Supplemental slides (not shown in the meeting)