
Illumina sequencing of 27 cultivated and wild alfalfa transcriptomes: gene and single nucleotide polymorphism (SNP) discovery

Xuehui Li1, Ananta Acharya1, Andrew D. Farmer2, John A. Crow2, Arvind K. Bharti2, Yanling Wei1, Yuanhong Han1, Jiqing Gou1, Gregory D. May2, Maria J. Monteros1, E. Charles Brummer1


Introduction

Alfalfa, a perennial, outcrossing species, is a widely planted forage legume. Currently, improvement of cultivated alfalfa relies mainly on recurrent phenotypic selection. Marker-assisted breeding holds promise for accelerating alfalfa improvement, but it is constrained by the lack of a large number of markers. Because it is inexpensive and efficient in time and labor, next-generation sequencing enables high-throughput discovery of SNPs, even for species with large, complex genomes. Our objective was to increase the number of SNPs available for alfalfa research and molecular breeding.

Materials and Methods

We sequenced 27 alfalfa genotypes (23 cultivated tetraploids and four wild diploids). Total RNA was isolated from young and old stems and pooled for each genotype for Illumina sequencing. Each transcriptome was sequenced on a single lane of the Illumina Genome Analyzer IIx to produce about 17-32 million 72-bp reads. Quality-filtered reads were used for de novo assembly to generate contigs. To assess the representation and quality of our alfalfa assembly, BLASTx was performed against the annotated non-redundant GenBank protein database. SNPs were detected by realigning reads to the assembled contigs and applying three criteria: (1) average quality of the bases calling the SNP >20; (2) number of uniquely aligned reads calling the SNP >=20; and (3) p value of the contingency test <0.01.
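
The three SNP-calling criteria above can be written as a simple filter. The following is only a minimal sketch, not the pipeline used in this study: the `SnpCandidate` record, its field names, and the example values are all hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpCandidate:
    """Hypothetical record for one putative SNP (field names are assumptions)."""
    contig: str
    position: int
    mean_base_quality: float  # average quality of the bases calling the SNP
    unique_reads: int         # uniquely aligned reads calling the SNP
    p_value: float            # p value of the contingency test


def passes_filters(c, min_quality=20.0, min_reads=20, max_p=0.01):
    """Apply the three thresholds stated in Materials and Methods."""
    return (
        c.mean_base_quality > min_quality   # criterion (1): quality > 20
        and c.unique_reads >= min_reads     # criterion (2): reads >= 20
        and c.p_value < max_p               # criterion (3): p < 0.01
    )


# Made-up candidates, chosen so that each threshold rejects one of them.
candidates = [
    SnpCandidate("contig_1", 101, 31.5, 42, 0.0004),  # passes all three
    SnpCandidate("contig_1", 250, 19.8, 60, 0.0010),  # fails the quality threshold
    SnpCandidate("contig_2", 77, 28.0, 19, 0.0020),   # fails the read-count threshold
    SnpCandidate("contig_3", 512, 25.0, 20, 0.0100),  # fails: p is not below 0.01
]
kept = [c for c in candidates if passes_filters(c)]
```

Note that the quality and p-value criteria are strict inequalities, while the read-count criterion includes the boundary value of 20.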


References

Li et al., 2011. Prevalence of segregation distortion in diploid alfalfa and its implications for genetics and breeding applications. Theor. Appl. Genet. 123:667-679.

Robins et al., 2007. Genetic mapping of biomass production in tetraploid alfalfa. Crop Sci. 47:1-10.


This project is funded by the USDA National Institute of Food and Agriculture.

Results and Conclusion

  • Sequencing of the 27 genotypes yielded a total of 740 million reads (Table 1). Their assembly generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, giving an average read depth of 56-fold for each genotype.

  • Overall, 21,954 (87.2%) of 25,183 contigs matched to 14,878 unique protein accessions.

  • Realignment of the reads to the contigs enabled the detection of 873,384 putative SNPs and 25,183 InDels. In total, 7,812 (31%) of the 25,183 contigs aligned to the M. truncatula pseudomolecules version 3.5.1; these contigs carried 298,771 SNPs and 9,205 InDels, which were widely distributed along the eight chromosomes (Figure 1).

  • High Resolution Melting (HRM) analysis of 192 putative SNPs validated about 85% of them and confirmed the allele dosage inferred from sequencing (Figure 2a and 2b).

  • Principal component analysis (PCA) with 173,947 SNPs indicated that subspecies falcata is clearly separated from diploid caerulea and tetraploid sativa (cultivated tetraploid alfalfa) (Figure 3).

  • Selected SNPs have been mapped to tetraploid and diploid alfalfa linkage maps previously constructed with RFLP and SSR markers (Li et al., 2011; Robins et al., 2007) (Figure 4).

  • An alfalfa Illumina Infinium array with ~10,000 SNPs is being developed, which will enable high-throughput genotyping and facilitate genome-wide association studies and genomic selection in alfalfa.

  • Our results demonstrate that next-generation transcriptome sequencing is an efficient way to discover high-quality SNPs in alfalfa. These ESTs and SNP markers could contribute effectively to future alfalfa research and breeding applications.
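
The summary figures reported above can be cross-checked with a few lines of arithmetic. This is just a sanity-check sketch using the numbers stated in the bullets; the variable names are illustrative, and because the total contig length is reported rounded to 26.8 Mbp, the recomputed mean length is only approximate.

```python
# Values as reported in the text.
total_reads = 740e6        # reads across all genotypes
genotypes = 27
contigs = 25_183
total_length_bp = 26.8e6   # reported rounded, so derived values are approximate
matched_contigs = 21_954   # contigs with a BLASTx match
aligned_contigs = 7_812    # contigs aligned to M. truncatula pseudomolecules

# Mean reads per genotype; the text gives a range of about 17-32 million per lane.
reads_per_genotype = total_reads / genotypes

# Mean contig length; the text reports 1,065 bp.
mean_contig_length = total_length_bp / contigs

# Percentages; the text reports 87.2% and 31%.
pct_matched = 100 * matched_contigs / contigs
pct_aligned = 100 * aligned_contigs / contigs

print(f"reads per genotype: {reads_per_genotype / 1e6:.1f} million")
print(f"mean contig length: {mean_contig_length:.0f} bp")
print(f"contigs with protein match: {pct_matched:.1f}%")
print(f"contigs aligned to M. truncatula: {pct_aligned:.0f}%")
```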

Table 1. 27 genotypes used in this study and sequence statistics

Figure 1. SNP distribution along the eight chromosomes of M. truncatula. The X-axis is the genome location for each chromosome. The number of SNPs per 1,000 bp was calculated for each 0.5 million base pair interval and plotted on the Y-axis.
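
The density calculation described in the Figure 1 caption (SNPs per 1,000 bp in 0.5 million base pair intervals) could be sketched as follows. The function name, the 1-based position convention, and the handling of a shorter final window are assumptions made for illustration, not details taken from the poster.

```python
from collections import Counter

WINDOW_BP = 500_000  # 0.5 million base pair interval, as in the caption


def snp_density_per_kb(positions, chrom_length, window_bp=WINDOW_BP):
    """Return (window_start, SNPs per 1,000 bp) for each window on one chromosome.

    `positions` are assumed to be 1-based SNP coordinates. A final partial
    window is normalized by its actual width (an assumption of this sketch).
    """
    counts = Counter((pos - 1) // window_bp for pos in positions)
    n_windows = -(-chrom_length // window_bp)  # ceiling division
    densities = []
    for w in range(n_windows):
        start = w * window_bp
        width = min(window_bp, chrom_length - start)
        densities.append((start, counts.get(w, 0) / width * 1000))
    return densities


# Made-up example: five SNPs on a hypothetical 1.25 Mbp chromosome.
example = snp_density_per_kb(
    [10, 250_000, 499_999, 500_001, 1_200_000], chrom_length=1_250_000
)
for start, density in example:
    print(f"window starting at {start}: {density:.4f} SNPs per 1,000 bp")
```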

Figure 2. Examples of high resolution melting analysis of SNPs

(a) Validation of three SNP phenotypes; (b) Validation of potential allele dose in heterozygotes.
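
One simple way to think about inferring allele dose from sequencing is to compare the fraction of reads carrying the alternate allele with the fractions expected for each dosage class (0, 1/4, 1/2, 3/4, and 1 in a tetraploid). The sketch below illustrates that idea only. It is not the method used in this study, the function name is hypothetical, and it ignores allele-specific expression and sampling error, both of which matter for transcriptome data.

```python
def nearest_dosage(ref_reads, alt_reads, ploidy=4):
    """Return the alternate-allele dosage (0..ploidy) whose expected read
    fraction is closest to the observed one. Illustrative assumption only."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    observed = alt_reads / total
    # Expected fraction for dosage d is d / ploidy.
    return min(range(ploidy + 1), key=lambda d: abs(observed - d / ploidy))


# Made-up read counts (reference reads, alternate reads).
simplex = nearest_dosage(30, 10)            # observed fraction 0.25
duplex = nearest_dosage(20, 20)             # observed fraction 0.50
triplex = nearest_dosage(9, 31)             # observed fraction 0.775
diploid_het = nearest_dosage(12, 11, ploidy=2)
```

In a tetraploid heterozygote, the simplex, duplex, and triplex classes are the ones that differ in allele dose.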



Figure 3. PCA of the 27 genotypes. Blue solid circles represent tetraploid sativa; red solid circles represent tetraploid falcata; blue triangles represent diploid caerulea; red triangles represent diploid falcata.
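
As a rough illustration of the kind of analysis summarized in Figure 3, PCA on a genotype matrix can be computed by centering each SNP column and taking a singular value decomposition. The sketch below uses NumPy and a tiny made-up matrix of alternate-allele fractions; the data, the encoding, and the variable names are assumptions and do not reproduce the analysis in this study.

```python
import numpy as np

# Rows are individuals, columns are SNPs; values are alternate-allele
# fractions (hypothetical data: the first three rows form one group,
# the last three another).
genotypes = np.array([
    [0.00, 0.00, 1.00, 1.00, 0.5],
    [0.25, 0.00, 1.00, 1.00, 0.5],
    [0.00, 0.25, 0.75, 1.00, 0.5],
    [1.00, 1.00, 0.00, 0.00, 0.5],
    [1.00, 0.75, 0.00, 0.25, 0.5],
    [0.75, 1.00, 0.00, 0.00, 0.5],
])

# Center each SNP so the components describe variation around the mean.
centered = genotypes - genotypes.mean(axis=0)

# SVD of the centered matrix: the rows of vt are the principal axes.
u, s, vt = np.linalg.svd(centered, full_matrices=False)

scores = u * s                      # coordinates of individuals on each PC
explained = s**2 / np.sum(s**2)     # fraction of variance per component

print("PC1 scores:", np.round(scores[:, 0], 3))
print("variance explained by PC1:", round(float(explained[0]), 3))
```

The sign of a principal component is arbitrary, so only the relative positions of individuals along an axis are meaningful.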

Figure 4. Physical map of M. truncatula (Build 3.0) and genetic linkage maps for one diploid (CC78) and one tetraploid mapping population (ABE408×Wis6) based on RFLP, SSR, and SNP markers.

The physical locations indicated on the maps are all on a scale of 5 × 10^5 base pairs. Markers in red on the linkage maps are SNPs.