
Software and Databases for managing and selecting molecular markers General introduction

General introduction. Pathway approach for candidate gene identification and introduction to metabolic pathway databases. Identification of polymorphisms in database sequences. Databases (general and crop specific).

enye

Presentation Transcript


  1. Software and Databases for Managing and Selecting Molecular Markers
     General introduction.
     Pathway approach for candidate gene identification and introduction to metabolic pathway databases.
     Identification of polymorphisms in database sequences.

  2. Databases (General and Crop Specific)
     Germplasm
       GRIN: http://www.ars-grin.gov/npgs/
       TGRC: http://tgrc.ucdavis.edu/
     Sequence
       NCBI: http://www.ncbi.nlm.nih.gov/
       SGN: http://solgenomics.net/
     Metabolic
       PlantCyc: http://www.plantcyc.org:1555/PLANT/server.html?

  3. New format to NCBI

  4. Access current and past scientific literature.

  5. Increased emphasis on phenotypic data

  6. Germplasm databases

  7. Crop specific germplasm resources

  8. Example: QTL for color uniformity in elite crosses (Audrey Darrigues, Eileen Kabelka)

  9. Carotenoid Biosynthesis: Candidate pathway for genes that affect color and color uniformity. Disclaimer: this is not the only candidate pathway…

  10. Databases that link pathways to genes http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp

  11. Databases that link pathways to genes
      http://metacyc.org/
      http://www.plantcyc.org/
      http://sgn.cornell.edu/tools/solcyc/
      http://www.arabidopsis.org/biocyc/index.jsp
      http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp
      External plant metabolic databases:
        CapCyc (Pepper) (C. annuum)
        CoffeaCyc (Coffee) (C. canephora)
        SolCyc (Tomato) (S. lycopersicum)
        NicotianaCyc (Tobacco) (N. tabacum)
        PetuniaCyc (Petunia) (P. hybrida)
        PotatoCyc (Potato) (S. tuberosum)
        SolaCyc (Eggplant) (S. melongena)

  12. http://www.plantcyc.org:1555/

  13. Note: missing step (lycopene isomerase, tangerine)

  14. Check boxes (Note: MetaCyc has many more choices, but no plants)

  15. Capsicum annuum sequence retrieved. Scroll down the page.

  16. http://www.ncbi.nlm.nih.gov/

  17. Select database

  18. Query  CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
             |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
      Sbjct  CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT
      Zeaxanthin epoxidase. Probable location on Chromosome 2.
      Alignment of Z83835 and EF581828 reveals 5 SNPs over ~2000 bp.
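The mismatch in the 60-bp window on slide 18 can be located programmatically. The following is a minimal Python sketch, assuming the two strings are already aligned without gaps, exactly as they appear on the slide. The function name `find_snps` is hypothetical, and this is a simple column-by-column comparison, not the BLAST procedure used in the presentation.

```python
def find_snps(query: str, subject: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, query base, subject base) for each mismatch."""
    if len(query) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, q, s)
        for pos, (q, s) in enumerate(zip(query, subject), start=1)
        if q != s
    ]

# The two aligned segments shown on the slide.
query = "CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT"
sbjct = "CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT"

snps = find_snps(query, sbjct)
print(snps)  # [(35, 'C', 'T')]
```

Within this window there is a single C > T difference at position 35; the other SNPs mentioned on the slide lie elsewhere in the ~2000 bp alignment. A candidate found this way is still only an electronic SNP and would need validation before use as a marker.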

  19. 51 annotated loci

  20. Candidates identified in other databases are here. Information missing from other databases is here…

  21. Comments on the databases:
      Information is not always complete or up to date.
      Display is not always optimal, and several steps may be needed to go from pathway > gene > potential marker.
      Sequence data has error associated with it. eSNPs are not the same as validated markers.
      Germplasm data may also have error (e.g. PI 128216).
      There is a wealth of information organized and available.

  22. The previous example detailed how we might identify sequence-based markers for trait selection.
      Query  CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
             |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
      Sbjct  CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT
      Improving the efficiency of selection in terms of 1) relative efficiency of selection, 2) time, 3) gain under selection, and 4) cost will benefit from markers for both forward and background selection.
      The remainder of the presentation will focus on:
        Where to apply markers in a program
        Forward and background selection
        Marker resources
        Alternative population structures and size

  23. Comparison of direct selection with indirect selection (MAS).
      Relative efficiency of selection: r(gen) x (Hi / Hd)
      Line performance over locations > MAS > Single plant
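The relative-efficiency expression on slide 23 can be evaluated numerically. This is a minimal Python sketch, assuming r(gen) is the genetic correlation between the marker and the trait, and that Hi and Hd are the heritability terms for the indirect (marker) and direct (phenotypic) selection criteria. The slide does not define them further. The function name and all input values are hypothetical, chosen only to illustrate the ranking on the slide.

```python
def relative_efficiency(r_gen: float, h_indirect: float, h_direct: float) -> float:
    """Relative efficiency of indirect vs. direct selection: r(gen) * (Hi / Hd)."""
    if h_direct <= 0:
        raise ValueError("h_direct must be positive")
    return r_gen * (h_indirect / h_direct)

# Hypothetical values, not taken from the presentation.
r_gen, h_marker = 0.8, 1.0
scenarios = [("single plant", 0.4), ("line means over locations", 0.9)]

for label, h_direct in scenarios:
    eff = relative_efficiency(r_gen, h_marker, h_direct)
    verdict = "MAS favoured" if eff > 1 else "direct selection favoured"
    print(f"{label}: {eff:.2f} ({verdict})")
```

A value above 1 means indirect selection is expected to outperform direct selection. With these illustrative inputs, MAS beats single-plant selection but not selection on line performance over locations, which matches the ordering given on the slide.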

  24. Accelerating Backcross Selection
      Expected proportion of Recurrent Parent (RP) genome in BC progeny (RP:donor, %):
        F1   50:50
        BC1  75:25
        BC2  87.5:12.5
        BC3  93.75:6.25
        BC4  96.875:3.125
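The expected proportions on slide 24 follow from halving the remaining donor genome with each backcross. Below is a minimal Python sketch, assuming no selection on the background (the purely random expectation). The function name `expected_rp_fraction` is hypothetical.

```python
def expected_rp_fraction(backcross_generation: int) -> float:
    """Expected recurrent-parent genome fraction; generation 0 is the F1."""
    if backcross_generation < 0:
        raise ValueError("generation must be 0 (F1) or greater")
    # The donor share starts at 1/2 in the F1 and halves with every backcross.
    return 1.0 - 0.5 ** (backcross_generation + 1)

for n in range(5):
    label = "F1" if n == 0 else f"BC{n}"
    rp = 100 * expected_rp_fraction(n)
    print(f"{label}: {rp:g}:{100 - rp:g}")
```

These are averages across the whole genome. Individual progeny vary around them, and it is this variation that background selection with markers exploits to recover the recurrent parent genome in fewer generations.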

  25. References: Frisch, M., M. Bohn, and A.E. Melchinger. 1999. Comparison of Selection Strategies for Marker-Assisted Backcrossing of a Gene. Crop Science 39: 1295-1301.

  26. Progeny needed for background selection during MAS. Q10 indicates a 90% probability of success. From Frisch et al., 1999.

  27. Marker Data Points required (Modified from Frisch et al., 1999; based on assumption of 12 chromosomes; initial selection with 4 markers/chromosome)

  28. For effective background selection we need:
      Markers for our target locus (C > T SNP for Zep)
      Markers on the target chromosome (Chrom. 2)
      Markers unlinked to the target chromosome (~2 per chromosome arm)

  29. http://www.tomatomap.net http://sgn.cornell.edu/

  30. Ovate

  31. HBa0104A12

  32. 44 polymorphic markers; 55 polymorphic markers

  33. Where can we expect to be? Analysis by Buell et al. (unpublished). Data are based on an estimated ~42% of the sequence; we can therefore expect as many as 300 markers for a cross like E6203 x H1706.

  34. When is the time to move from reliance on public databases to in-house pipelines?
      [Diagram components: BioPerl, NCBI, BLAST, DOS, Perl, CygWin (Unix emulator), Cyc, UNIX, in-house database]
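One early step of an in-house pipeline is retrieving sequences by script rather than through the web pages. The slides name Perl and BioPerl for this; the sketch below uses Python instead, and only builds request URLs for the two accessions compared on slide 18. It assumes the NCBI E-utilities `efetch` endpoint, which is not mentioned in the presentation. The helper name `efetch_fasta_url` is hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint (an assumption; the slides only cite the NCBI home page).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_fasta_url(accession: str) -> str:
    """Build a URL that requests one nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # GenBank accession
        "rettype": "fasta",   # record format
        "retmode": "text",    # plain-text response
    }
    return f"{EFETCH}?{urlencode(params)}"

urls = [efetch_fasta_url(acc) for acc in ("Z83835", "EF581828")]
for url in urls:
    print(url)
```

The downloaded records could then be aligned locally with BLAST, which is the point at which a local database and scripted comparison start to replace manual searching.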

  35. Complete genome sequences are available for soybean, corn, potato, tomato, and cucumber, and more are coming.
