Genome Informatics Research Lab

"Evaluation of gene structure prediction programs."
M. Burset and R. Guigó.
Genomics, 34:353-357 (1996).

Summary

A number of computer programs for the prediction of gene structure in genomic DNA sequences are analyzed. The programs are tested on a large set of vertebrate sequences. Below you will find links to the sequence test sets, the programs analyzed, and the results obtained.

Sequence Test Set

Here you will find the file containing the genomic DNA sequences used in the analysis (FASTA format), the file of corresponding coding regions for each sequence (table format), and the file containing the encoded amino acid sequences (FASTA format).
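
As an illustration of how the FASTA files might be read, here is a minimal sketch in Python. It assumes standard FASTA conventions (a ">" header line followed by one or more sequence lines); the function name read_fasta and the example records are hypothetical and are not taken from the test set. The layout of the coding-region table is not described on this page, so no parser for it is attempted.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {identifier: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as identifier.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = ""
        elif name is not None:
            # Sequence lines are concatenated and normalized to upper case.
            records[name] += line.upper()
    return records


# Hypothetical example records, not real entries from the test set.
example = """>SEQ1 hypothetical example record
atggcc
TTAA
>SEQ2
GGGCCC
""".splitlines()

sequences = read_fasta(example)
```

With a real file, the same function could be called as read_fasta(open(path)), where path points to the downloaded FASTA file.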


Also available are the DNA sequences above with 1% random frameshift errors introduced, and the corresponding coding regions with boundaries adjusted to take the frameshift mutations into account.
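
The page does not say how the frameshift errors were generated. The sketch below shows one possible interpretation, stated as an assumption: at each position, with probability 0.01, either the base is deleted or a random base is inserted before it. It also keeps a map from original to mutated coordinates so that coding-region boundaries can be adjusted. The names add_frameshifts and remap_interval, and the 0-based inclusive coordinates, are hypothetical choices made for this example.

```python
import random
from typing import List, Optional, Tuple


def add_frameshifts(seq: str, rate: float = 0.01,
                    seed: int = 0) -> Tuple[str, List[int]]:
    """Introduce random single-base insertions and deletions.

    Returns the mutated sequence and, for each original position, its
    0-based index in the mutated sequence (-1 if the base was deleted).
    """
    rng = random.Random(seed)
    out: List[str] = []
    new_index: List[int] = []
    for base in seq:
        if rng.random() < rate:
            if rng.random() < 0.5:
                # Deletion: the original base is dropped.
                new_index.append(-1)
                continue
            # Insertion: a random base is added before the original one.
            out.append(rng.choice("ACGT"))
        new_index.append(len(out))
        out.append(base)
    return "".join(out), new_index


def remap_interval(start: int, end: int,
                   new_index: List[int]) -> Optional[Tuple[int, int]]:
    """Map a 0-based inclusive interval onto the mutated sequence."""
    kept = [new_index[i] for i in range(start, end + 1) if new_index[i] != -1]
    return (kept[0], kept[-1]) if kept else None


original = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG" * 10
mutated, index_map = add_frameshifts(original, rate=0.01, seed=1)
adjusted = remap_interval(0, 38, index_map)
```

Whatever indels are drawn, every base that survives keeps its identity at its remapped position, which gives a simple check that the coordinate map is consistent.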

Tables of Results

For each of the programs analyzed you will find the predicted CDS file, containing a prediction for each sequence in the test set above, and a table with accuracy statistics for each sequence analyzed. All of these files are plain text.

Program      ORIGINAL            MUTATED
             CDS   Accuracy      CDS   Accuracy
GeneID
GeneID+
SORFIND
GParser2
GParser3
GRAIL II
GAP
Genlang
FgeneH
XPound
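
To give a sense of what per-sequence accuracy statistics can look like, the sketch below computes nucleotide-level sensitivity, specificity, and correlation coefficient from true and predicted coding intervals. It assumes the definitions commonly used in gene-prediction evaluation (sensitivity = TP/(TP+FN), specificity = TP/(TP+FP)); the exact columns of the accuracy tables are not described on this page. The function names, the 1-based inclusive coordinates, and the example intervals are hypothetical.

```python
import math
from typing import Dict, Iterable, Optional, Set, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive coordinates


def coding_positions(exons: Iterable[Interval]) -> Set[int]:
    """Expand a list of coding intervals into a set of positions."""
    positions: Set[int] = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(true_exons: Iterable[Interval],
                        pred_exons: Iterable[Interval],
                        length: int) -> Dict[str, Optional[float]]:
    """Nucleotide-level sensitivity, specificity and correlation coefficient."""
    actual = coding_positions(true_exons)
    predicted = coding_positions(pred_exons)
    tp = len(actual & predicted)   # coding, predicted coding
    fp = len(predicted - actual)   # non-coding, predicted coding
    fn = len(actual - predicted)   # coding, predicted non-coding
    tn = length - tp - fp - fn     # non-coding, predicted non-coding
    sn = tp / (tp + fn) if tp + fn else None
    sp = tp / (tp + fp) if tp + fp else None
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fn * fp) / denom if denom else None
    return {"Sn": sn, "Sp": sp, "CC": cc}


# Hypothetical 100-bp sequence with two true exons and two predicted exons.
stats = nucleotide_accuracy(true_exons=[(11, 30), (61, 80)],
                            pred_exons=[(11, 30), (71, 90)],
                            length=100)
```

In this example 30 of the 40 coding nucleotides are predicted as coding and 30 of the 40 predicted nucleotides are truly coding, so both Sn and Sp are 0.75. Measures that are undefined, for instance when nothing is predicted, are returned as None rather than zero.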

 