“Gene Finding in Novel Genomes” by Ian Korf
Presented by: Christine Lee
SoCAL BSI 2004

Outline
  Background and Motivation
  Existing gene finder programs
  Snap as ab initio high performance gene finder
  Novel genome gene prediction
  The Data
  Genome compositional differences
Data set characteristics. At = Arabidopsis thaliana, Ce = Caenorhabditis elegans, Dm = Drosophila melanogaster, Os = Oryza sativa.
Performance of foreign and bootstrapped parameters. Boldface values are determined by 5-fold cross-validation
within the same species. At = Arabidopsis thaliana, Ce = Caenorhabditis elegans, Dm = Drosophila melanogaster,
Os = Oryza sativa. Sensitivity (NSN) and specificity (NSP) are reported at the nucleotide level. The bootstrapped values
(bottom part of the table) come from parameters estimated from gene predictions alone, with no
actual data. In these experiments only inter-species gene parameters were used; dashes
mark cells that would contain intra-species predictions.
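
To make the nucleotide-level measures concrete, here is a minimal sketch of how sensitivity and specificity could be computed from an annotated and a predicted coding mask. It assumes the convention common in gene-finding evaluations, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the slides do not spell out the formulas, so this is an assumption. The function name and the 0/1 mask strings are hypothetical and not part of the original work.

```python
def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the nucleotide level.

    annotated, predicted: equal-length strings of '0'/'1', where '1'
    marks a coding nucleotide (a hypothetical encoding for illustration).
    Assumed convention: SN = TP / (TP + FN), SP = TP / (TP + FP).
    """
    if len(annotated) != len(predicted):
        raise ValueError("masks must have the same length")

    pairs = list(zip(annotated, predicted))
    tp = sum(1 for a, p in pairs if a == "1" and p == "1")  # coding, predicted coding
    fn = sum(1 for a, p in pairs if a == "1" and p == "0")  # coding, missed
    fp = sum(1 for a, p in pairs if a == "0" and p == "1")  # non-coding, predicted coding

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy example: a 4-nt annotated exon and a prediction shifted by one base.
annotated = "0011110000"
predicted = "0001111000"
sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"NSN = {sn:.2f}, NSP = {sp:.2f}")
```

In the toy example three of the four annotated coding nucleotides are recovered and three of the four predicted coding nucleotides are correct, so both values are 0.75.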