This lab meeting, held on 4/16/03, reviewed the construction and evaluation of a rice gene data set. The set was built from TIGR data (more than 80,000 genes) and 1050 full-length mRNA sequences from NCBI, and gene boundaries were extended where a match was found. The resulting data set contained 1378 genes with an average of 5.34 exons per gene. Because of a suspected fgene contamination problem, predictions from fgene, gm, genscan, and twinscan were compared on several subsets (NCBI only, not NCBI, manually corrected, and not fgene), with sensitivity and specificity reported at the gene, exon, and nucleotide levels.
Rice Data Set
Lab meeting 4-16-03
Procedures
1. TIGR data: >80,000 genes; get mRNA
2. 1050 full-length mRNA sequences from NCBI; make library
3. BLAST -> conseq -> percentage identity -> cutoff of 88% -> ~1038 sequences
4. Combine with NCBI DNA dataset: check gene boundaries, check single-exon genes
5. Extend the boundary if a match is found
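Step 3 can be sketched as a percent-identity filter over BLAST hits. This is only an illustration under stated assumptions: the slides do not say which output format was used or how the "conseq" step computed identity, so the sketch assumes standard tab-separated BLAST output (query ID in column 1, percent identity in column 3) and an inclusive 88% cutoff. The function name and the example hit lines are hypothetical.

```python
def queries_passing_identity(tabular_lines, cutoff=88.0):
    """Return the query IDs that have at least one hit at or above the cutoff.

    Assumes tab-separated BLAST output where column 1 is the query ID and
    column 3 is the percent identity (an assumption, not from the slides).
    """
    passing = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, percent_identity = fields[0], float(fields[2])
        if percent_identity >= cutoff:
            passing.add(query_id)
    return passing


# Made-up hits, for illustration only.
example_hits = [
    "mrna_A\tclone_1\t99.20\t1500\t12\t0\t1\t1500\t200\t1699\t0.0\t2800",
    "mrna_B\tclone_2\t85.10\t900\t134\t3\t1\t900\t50\t949\t1e-150\t600",
    "mrna_C\tclone_3\t88.00\t1200\t144\t0\t1\t1200\t10\t1209\t0.0\t1500",
]
kept = queries_passing_identity(example_hits)
print(sorted(kept))
```

With these made-up hits, `mrna_B` falls below the cutoff and is dropped, while the other two queries are kept.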
Check dataset
• 1378 genes
• 5.34 exons/gene
• 219 bp/exon
• 322 bp/intron
• 41 incomplete
• 256 single-exon genes
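Summary figures of this kind (exons per gene, mean exon and intron length, single-exon count) can be derived from exon coordinates. The sketch below assumes each gene is a list of 1-based, inclusive `(start, end)` exon coordinates on one strand; the function name and the toy genes are hypothetical and are not the rice data.

```python
def dataset_stats(genes):
    """Summarize a list of genes, each a list of (start, end) exon coordinates.

    Coordinates are assumed to be 1-based and inclusive.
    """
    exon_lengths = []
    intron_lengths = []
    single_exon = 0
    for exons in genes:
        exons = sorted(exons)
        if len(exons) == 1:
            single_exon += 1
        exon_lengths.extend(end - start + 1 for start, end in exons)
        # An intron spans the gap between two consecutive exons.
        intron_lengths.extend(
            nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])
        )
    return {
        "genes": len(genes),
        "exons_per_gene": len(exon_lengths) / len(genes),
        "bp_per_exon": sum(exon_lengths) / len(exon_lengths),
        "bp_per_intron": (
            sum(intron_lengths) / len(intron_lengths) if intron_lengths else 0.0
        ),
        "single_exon_genes": single_exon,
    }


# Toy input: one two-exon gene and one single-exon gene.
toy_genes = [[(1, 100), (201, 300)], [(1, 50)]]
stats = dataset_stats(toy_genes)
print(stats)
```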
Problem: fgene contamination

Predictions:              fgene     gm        genscan
Gene Sensitivity          78.10%    30.75%    19.36%
Gene Specificity          71.75%    18.44%    15.18%
Exon Sensitivity          93.74%    65.51%    35.97%
Exon Specificity          90.67%    62.98%    55.20%
Nucleotide Sensitivity    97.52%    84.74%    55.26%
Nucleotide Specificity    95.35%    88.76%    84.51%
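The slides do not define the metrics. A minimal sketch, assuming the convention common in gene-prediction evaluation: sensitivity is TP / (TP + FN), specificity is TP / (TP + FP), and an exon counts as correct only when both boundaries match exactly. The helper names and the toy coordinates are hypothetical.

```python
def sn_sp(true_items, pred_items):
    """Return (sensitivity, specificity) for two collections of items.

    Sensitivity = TP / number of true items.
    Specificity = TP / number of predicted items (gene-finding convention).
    """
    true_items, pred_items = set(true_items), set(pred_items)
    tp = len(true_items & pred_items)
    sn = tp / len(true_items) if true_items else 0.0
    sp = tp / len(pred_items) if pred_items else 0.0
    return sn, sp


def coding_positions(exons):
    """Expand 1-based inclusive (start, end) exons into a set of positions."""
    return {pos for start, end in exons for pos in range(start, end + 1)}


# Toy annotation and prediction; the second predicted exon starts 10 bp late.
true_exons = [(1, 100), (201, 300)]
pred_exons = [(1, 100), (211, 300)]

exon_sn, exon_sp = sn_sp(true_exons, pred_exons)
nuc_sn, nuc_sp = sn_sp(coding_positions(true_exons), coding_positions(pred_exons))
print(exon_sn, exon_sp, nuc_sn, nuc_sp)
```

In this toy case the shifted boundary costs a whole exon at the exon level but only 10 of 200 bases at the nucleotide level, which is one reason nucleotide-level scores tend to sit above exon-level and gene-level scores.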
Only NCBI Part

Predictions:              fgene1.list    gm1.list    genscan1.list
Gene Sensitivity          45.95%         31.76%      21.62%
Gene Specificity          42.90%         21.96%      17.58%
Exon Sensitivity          80.79%         65.44%      36.52%
Exon Specificity          76.58%         62.84%      53.75%
Nucleotide Sensitivity    92.65%         84.04%      55.80%
Nucleotide Specificity    91.87%         89.74%      85.78%

296 genes, 4.98 exons/gene, 43 single-exon genes, 355 bp/intron, 253 bp/exon
Not NCBI

Predictions:              fgene2.list    gm2.list    genscan2.list
Gene Sensitivity          86.89%         30.47%      18.74%
Gene Specificity          79.48%         17.64%      14.55%
Exon Sensitivity          96.98%         65.52%      35.83%
Exon Specificity          94.29%         63.02%      55.58%
Nucleotide Sensitivity    98.99%         84.95%      55.10%
Nucleotide Specificity    96.38%         88.47%      84.13%

1083 genes, 5.44 exons/gene, 213 single-exon genes, 210 bp/exon, 313 bp/intron
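As a quick consistency check, the fgene gene sensitivities of the "Only NCBI Part" and "Not NCBI" subsets can be pooled by gene count and compared with the 78.10% reported for the full set. This assumes the two subsets partition the full set, which the slides do not state; their gene counts sum to 1379, one more than the 1378 listed for the whole data set. The function name is hypothetical.

```python
def pooled_rate(parts):
    """Gene-count-weighted mean of per-subset rates.

    parts: list of (rate_in_percent, number_of_genes) pairs.
    """
    total = sum(n for _, n in parts)
    return sum(rate * n for rate, n in parts) / total


# fgene gene sensitivity and gene count for each subset, taken from the slides.
pooled = pooled_rate([(45.95, 296), (86.89, 1083)])
print(f"{pooled:.2f}%")
```

The pooled value comes out at about 78.10%, in line with the full-set figure. Under the stated assumption, that suggests fgene's overall gene-level score is driven mostly by the larger "Not NCBI" subset.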
NCBI+from_Manual_clone

Predictions:              fgene3.list    gm3.list    genscan3.list
Gene Sensitivity          49.72%         34.17%      24.17%
Gene Specificity          46.25%         22.24%      19.59%
Exon Sensitivity          82.66%         67.50%      38.68%
Exon Specificity          78.38%         63.60%      56.24%
Nucleotide Sensitivity    93.43%         85.60%      58.14%
Nucleotide Specificity    92.09%         89.51%      85.62%

360 genes
NCBI_Manual_all_correct

Predictions:              fgene4.list    gm4.list    genscan4.list
Gene Sensitivity          54.75%         40.75%      31.75%
Gene Specificity          51.29%         27.49%      26.24%
Exon Sensitivity          83.88%         69.78%      42.98%
Exon Specificity          79.82%         66.01%      60.56%
Nucleotide Sensitivity    93.90%         86.62%      61.10%
Nucleotide Specificity    92.64%         90.29%      87.07%

400 genes
NCBI_Manual_AllC_notfgene

Predictions:              fgene5.list    gm5.list    genscan5.list
Gene Sensitivity          49.75%         35.34%      23.95%
Gene Specificity          41.31%         21.23%      18.31%
Exon Sensitivity          85.19%         65.39%      36.47%
Exon Specificity          78.88%         62.06%      53.19%
Nucleotide Sensitivity    94.49%         85.41%      57.26%
Nucleotide Specificity    89.90%         86.62%      82.25%

597 genes
Manual_not_fgene+allC

Predictions:              fgene     gm        genscan   twinscan
Gene Sensitivity          10.14%    11.05%    7.96%     15.15%
Gene Specificity          9.35%     5.48%     5.52%     10.26%
Exon Sensitivity          58.44%    35.94%    18.15%    49.70%
Exon Specificity          59.99%    32.48%    24.69%    49.54%
Nucleotide Sensitivity    80.09%    67.39%    46.18%    85.29%
Nucleotide Specificity    82.31%    75.22%    67.79%    77.35%
99% cutoff clones
• Predictions: fgene99.list
• Gene Sensitivity 92.25%
• Gene Specificity 86.23%
• Exon Sensitivity 97.69%
• Exon Specificity 94.17%
• Nucleotide Sensitivity 99.49%
• Nucleotide Specificity 95.50%
Min’s Method on Rice cDNA set

Predictions:    g++.      twinscan  min_001   min_02    min_min   min_wei
Gene SN         40.83%    47.34%    46.75%    44.38%    46.15%    47.93%
Gene Sp         33.17%    39.90%    39.50%    37.13%    38.61%    40.40%
Exon SN         70.88%    79.36%    78.87%    76.04%    76.82%    77.53%
Exon Sp         67.72%    74.52%    75.46%    73.05%    74.97%    76.02%
Nuc SN          88.76%    92.11%    91.26%    90.83%    90.77%    91.72%
Nuc Sp          89.06%    90.16%    90.17%    89.69%    90.17%    90.43%