SGP predictions combine geneid predictions with a TBLASTX comparison of the human genome against the mouse genome.

SGP was run on the unmasked FASTA sequences of the following human genome version:

    golden_path_20011222

It also used homology evidence (the SRs, or similarity regions) from TBLASTX, in which a masked version of the human genome assembly given above was compared against the following mouse assembly version:

    mmFeb2002 (MGSCv3)

Predictions were obtained per chromosome and are output in the following formats:

    chr1.sgp         (geneid format)
    chr1.gff
    chr1.gtf
    chr1.cds         (nucleotide sequences of the predicted coding sequences)
    chr1.prot        (amino acid sequences of the predicted proteins)
    chr1_tbxsr.gff   (GFF output of geneid combined with TBLASTX)

NOTE: The coordinates of RefSeq genes were taken into account so that SGP (geneid) would not make any predictions in those regions.
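
As an illustration of how the per-chromosome GFF files might be read, here is a minimal Python sketch. It assumes only the general GFF layout (tab-separated columns: seqname, source, feature, start, end, score, strand, frame, and an optional group field). The sample lines, the source name "SGP", the feature names, and the group labels are hypothetical and are not taken from the actual prediction files; the helper names parse_gff, count_features, and gene_spans are likewise made up for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple, Tuple


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    group: str


def parse_gff(lines: Iterable[str]) -> List[GffRecord]:
    """Parse tab-separated GFF lines, skipping comments and blank lines."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete GFF record
        seqname, source, feature, start, end, score, strand, frame = fields[:8]
        # The ninth (group) column is optional in GFF.
        group = fields[8] if len(fields) > 8 else ""
        records.append(GffRecord(seqname, source, feature, int(start),
                                 int(end), score, strand, frame, group))
    return records


def count_features(records: Iterable[GffRecord]) -> Counter:
    """Count how many records carry each feature name."""
    return Counter(r.feature for r in records)


def gene_spans(records: Iterable[GffRecord]) -> Dict[str, Tuple[int, int]]:
    """Return the (lowest start, highest end) per group label."""
    spans: Dict[str, Tuple[int, int]] = {}
    for r in records:
        lo, hi = spans.get(r.group, (r.start, r.end))
        spans[r.group] = (min(lo, r.start), max(hi, r.end))
    return spans


# Hypothetical sample lines, for illustration only.
SAMPLE_LINES = [
    "# hypothetical sample, not real SGP output",
    "chr1\tSGP\tFirst\t1000\t1200\t3.5\t+\t0\tgene_1",
    "chr1\tSGP\tInternal\t1500\t1650\t5.1\t+\t2\tgene_1",
    "chr1\tSGP\tTerminal\t2000\t2300\t4.2\t+\t1\tgene_1",
    "chr1\tSGP\tSingle\t5000\t5600\t7.0\t-\t0\tgene_2",
]
```

To apply the sketch to a real file, pass an open file handle instead of the sample list, for example parse_gff(open("chr1.gff")). The actual feature and group values in the files may differ from the ones assumed here.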