Illumina sequencing of 27 cultivated and wild alfalfa transcriptomes: gene and single nucleotide polymorphism (SNP) discovery.
Xuehui Li1, Ananta Acharya1, Andrew D. Farmer2, John A. Crow2, Arvind K. Bharti2, Yanling Wei1, Yuanhong Han1, Jiqing Gou1, Gregory D. May2, Maria J. Monteros1, E. Charles Brummer1
Alfalfa, a perennial, outcrossing species, is a widely planted forage legume. Currently, improvement of cultivated alfalfa relies mainly on recurrent phenotypic selection. Marker-assisted breeding holds promise for accelerating alfalfa improvement, but is constrained by the lack of a large number of markers. Because of its low cost and its time and labor efficiency, next-generation sequencing enables high-throughput discovery of SNPs, even for species with large, complex genomes. In this experiment, our objective was to increase the number of SNPs available for alfalfa research and molecular breeding.
Materials and Methods
We sequenced 27 alfalfa genotypes (23 cultivated tetraploids and four wild diploids). Total RNA was isolated from young and old stems and pooled for each genotype for Illumina sequencing. Each transcriptome was sequenced on a single lane of the Illumina Genome Analyzer IIx, producing about 17-32 million 72-bp reads. Quality-filtered reads were assembled de novo into contigs. To assess the representation and quality of our alfalfa assembly, BLASTx was performed against the annotated non-redundant GenBank protein database. SNPs were detected by realigning reads to the assembled contigs and applying three criteria: (1) average quality of the bases calling the SNP >20; (2) number of uniquely aligned reads calling the SNP >=20; and (3) p value of the contingency test <0.01.
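The three SNP-calling criteria above can be sketched as a simple filter. This is only an illustration, not the pipeline used in the study: the `CandidateSnp` record, its field names, and the example values are hypothetical, and the contingency-test p value is taken as a precomputed input because the exact test is not specified here.

```python
from dataclasses import dataclass

# Thresholds taken from the three criteria stated in the text.
MIN_MEAN_BASE_QUALITY = 20  # criterion 1: average base quality must be > 20
MIN_UNIQUE_READS = 20       # criterion 2: uniquely aligned reads must be >= 20
MAX_P_VALUE = 0.01          # criterion 3: contingency-test p value must be < 0.01


@dataclass
class CandidateSnp:
    """Hypothetical record for one candidate SNP (field names are assumptions)."""
    contig: str
    position: int
    mean_base_quality: float  # average quality of the bases calling the SNP
    unique_read_count: int    # uniquely aligned reads calling the SNP
    p_value: float            # precomputed contingency-test p value


def passes_filters(snp: CandidateSnp) -> bool:
    """Return True only if the candidate meets all three criteria."""
    return (
        snp.mean_base_quality > MIN_MEAN_BASE_QUALITY
        and snp.unique_read_count >= MIN_UNIQUE_READS
        and snp.p_value < MAX_P_VALUE
    )


# Made-up candidates, purely to exercise the filter.
candidates = [
    CandidateSnp("contig_1", 120, 31.5, 42, 0.0004),  # passes all three
    CandidateSnp("contig_1", 877, 20.0, 55, 0.0010),  # fails: quality not > 20
    CandidateSnp("contig_2", 45, 28.2, 19, 0.0020),   # fails: fewer than 20 reads
    CandidateSnp("contig_3", 310, 35.0, 20, 0.0100),  # fails: p value not < 0.01
]

kept = [snp for snp in candidates if passes_filters(snp)]
```

Note the boundary behavior: the quality and p-value thresholds are strict, while the read-count threshold is inclusive, matching the ">20", ">=20", and "<0.01" wording.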
Li et al., 2011. Prevalence of segregation distortion in diploid alfalfa and its implications for genetics and breeding applications. Theor. Appl. Genet. 123:667-679.
Robins et al., 2007. Genetic mapping of biomass production in tetraploid alfalfa. Crop Sci. 47:1-10.
This project is funded by the USDA National Institute of Food and Agriculture.
Results and Conclusion
Table 1. 27 genotypes used in this study and sequence statistics
Figure 1. SNP distribution along the eight chromosomes of M. truncatula. The X-axis is the genome location on each chromosome. The number of SNPs per 1,000 bp was calculated for each 0.5 million base pair interval and plotted on the Y-axis.
Figure 2. Examples of high-resolution melting analysis of SNPs
(a) Validation of three SNP phenotypes; (b) Validation of potential allele dose in heterozygotes.
Figure 3. Principal component analysis (PCA) of the 27 genotypes. Blue solid circles represent tetraploid sativa; red solid circles represent tetraploid falcata; blue triangles represent diploid caerulea; red triangles represent diploid falcata.
Figure 4. Physical map of M. truncatula (Build 3.0) and genetic linkage maps for one diploid (CC78) and one tetraploid mapping population (ABE408×Wis6) based on RFLP, SSR, and SNP markers.
The physical locations indicated on the maps are all on a scale of 5 × 10^5 base pairs. Markers in red on the linkage maps are SNPs.
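The interval-based SNP density plotted in Figure 1 can be illustrated with a short sketch. It assumes 1-based SNP positions on a single chromosome and assumes that "SNPs per 1,000 bp" means the SNP count in an interval divided by the interval length in kilobases; the study may have normalized differently (for example, by aligned transcript length). The function name and example positions are hypothetical.

```python
from collections import Counter

INTERVAL_BP = 500_000  # 0.5 million base pair interval, as in Figure 1


def snp_density_per_kb(positions, interval_bp=INTERVAL_BP):
    """Map each interval's start offset (0-based) to its SNPs per 1,000 bp.

    positions: iterable of 1-based SNP coordinates on one chromosome.
    Intervals containing no SNPs are omitted from the result.
    """
    # Assign every SNP to the interval that contains it.
    counts = Counter((pos - 1) // interval_bp for pos in positions)
    kb_per_interval = interval_bp / 1000
    return {
        index * interval_bp: count / kb_per_interval
        for index, count in sorted(counts.items())
    }


# Made-up positions: three SNPs fall in the first interval,
# one in the second, and one in the third.
example_positions = [1, 250_000, 500_000, 500_001, 1_200_000]
density = snp_density_per_kb(example_positions)
```

Plotting the dictionary keys on the X-axis and the values on the Y-axis for each chromosome would give a profile of the kind described in the caption.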