sgp2
sgp2 is a program that predicts genes by comparing anonymous genomic sequences from two different species. It combines tblastx, a sequence similarity search program, with geneid, an "ab initio" gene prediction program. Exon scores are computed as log-likelihood ratios. Each score is a function of the splice sites defining the exon, the coding bias in the composition of the exon sequence as measured by a Markov model of order five, and the optimal alignment at the amino acid level between the target exon sequence and its homologous counterpart in the reference set.
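The scoring idea described above can be illustrated with a small Python sketch. This is not the sgp2 or geneid implementation: the function names (`train_markov`, `coding_llr`, `exon_score`), the add-one smoothing, the toy training sequences, and the weighted sum used to combine the terms are all assumptions made for illustration. It shows how a Markov model of a given order yields a log-likelihood ratio for coding bias, and how such a term might be added to splice-site and similarity terms.

```python
import math
from collections import defaultdict


def train_markov(seqs, order, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences.

    Hypothetical helper: uses simple additive smoothing so that unseen
    contexts still receive a non-zero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(alphabet)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def coding_llr(seq, coding, background, order):
    """Sum of log(P_coding / P_background) over every position of seq."""
    score = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        score += math.log(coding(context, base) / background(context, base))
    return score


def exon_score(acceptor_llr, donor_llr, coding_term, similarity_term, weight=1.0):
    """Combine the terms additively (assumed form, with an assumed weight)."""
    return acceptor_llr + donor_llr + coding_term + weight * similarity_term


if __name__ == "__main__":
    ORDER = 2  # sgp2 is described as using order five; 2 keeps the toy data small
    coding = train_markov(["ATGATGATGATG"], ORDER)       # toy "coding" training set
    background = train_markov(["ACGTACGTACGT"], ORDER)   # toy "non-coding" training set
    term = coding_llr("ATGATG", coding, background, ORDER)
    print(exon_score(1.0, 2.0, term, 4.0, weight=0.5))
```

A positive coding term means the sequence looks more like the coding training data than the background. In a real predictor the models would be trained on annotated genes, and the similarity term would come from the tblastx alignments.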
Paste your FASTA sequence here >HSCKBG GATCAGTTTTTTTTTTTAATCGCACTTATGCTTATTGTTTATTAGCGTTTCCTCCCATCT TTGCCTGAAGTCTCCGGGGACTGCCTTTGGGGGTCGGGTAAACTTGTCCCCTGCGAAGAG GGCCCAGGGTTGGGGTCTGGAAACTCCGAGGCTGCACTTGCCAGCGGCCTCTTAAGGCCA CAGCGTCCCCGTGGTTTCTGGCTCGCAGCCCCCCGAGACCCAGGACTTGTCCAAGGTCAG GGCACCGCGGGTGCCCCCGGGCTGGGCCGCAGCAGACTGCGCTTCCCGCGCGCCTTCGCT TTGCACCAGGATCGCCCAGGAAATGCCTGCGGGCACCTTGAGGAAGGTCGGCGGCTCCGG GCCAGCTCGCACTGGCCGGGGTGGGGCGGGGGCCGTACCTGCTGCGGAAGCCCCGAAAGC TTTCGCCCGGCCCCTCGCCGCCGCCGCGGGGGCTGGCTGGACTAGGCGGGCAGGCTCGAG GATGCGGATGAACCCAAGCGTCCTCGAGTGCCCGGAGGCTCTCCGCCTCAGTTTCCCGCC CAGAGGCAAGGGCGTGCGAGGGGATCCAGATATCCAAGGACCTGAGGTTTCGGCCTCGAG GTCTTGGGCGGGGGACTGGGCAGGCTGCGCGGGGTCCCAGCGAGGGGACAGCTCGGGTGG GCGGCCAGGGTGTTGGGGGCTGCGGGCGGCGGACAAAGCGGCGGCACCACCCCGCGGCGC GGGCCAATGGAATGAATGGGCTATAAATAGCCGCCAATGGGCGGCCCGCGTTGTGCCCCT TAAGAGCCGCGGGAGCGCGGAGCGGCCGCTGTTCGCCTGCGTCGCTCCGGGAGCTGCCGA CGGACGGAGCGCCCCCGCCCCCGCCCGGCCGCCCGGTGAGTGGGCCCGGGGGCCGGGGGC GTCCGCGCCCGGGCTAGGGGCGCTGCGAGCAAAGGGGGCGCGTCGCCTGGAGCGCGCGCC GGACCGGCCGGGGGTCCCCGGCGATGATGGCGCTCCCCGCGCGCGCTGCGGACCCCGCTG ACCTTGGCCGCGTCCCGGGGGGCGCCGGGGGGCCCGGCGGCGGGGGCCTGAGTGGTACGC GGGAGCCCGGGAACCCCGGCGTGCCGGTCCCCTCTGACCCCGCGTCTCCCCGCAGCCCGC CGCCGCCATGCCCTTCTCCAACAGCCACAACGCACTGAAGCTGCGCTTCCCGGCCGAGGA CGAGTTCCCCGACCTGAGCGCCCACAACAACCACATGGCCAAGGTGCTGACCCCCGAGCT GTACGCGGAGCTGCGCGCCAAGAGCACGCCGAGCGGCTTCACGCTGGACGACGTCATCCA GACAGGCGTGGACAACCCGGGTACGCGACCCCTCGGGGCCGGGGTCCCGGCCCCCCCTCC CCCCGCGCAGCCGCAGGGTCCTCAGCAGCGCGCTCGGGCCCGGCAGTGACGTCACTGTCC CCGTCCCGCGCCCCCTCCCCCAGGCCACCCGTACATCATGACCGTGGGCTGCGTGGCGGG CGACGAGGAGTCCTACGAAGTGTTCAAGGATCTCTTCGACCCCATCATCGAGGACCGGCA CGGCGGCTACAAGCCCAGCGATGAGCACAAGACCGACCTCAACCCCGACAACCTGCAGGT GCGGGGCTGCGGGCGGGCCGGGCGGGCGGGGCCGGGGTCTTCGGGCGCTCACTCCCGTCT CGCCTCCCAGGGCGGCGACGACCTGGACCCCAACTACGTGCTGAGCTCGCGGGTGCGCAC GGGCCGCAGCATCCGTGGCTTCTGCCTCCCCCCGCACTGCAGCCGCGGGGAGCGCCGAGC CATCGAGAAGCTCGCGGTGGAAGGTAGGGGCCGGGCGGGCCGAGGGGCGGCGGCGGCCGC GTCCCCCTCCCGGCGCGGTCCCCGCCCGCTTTTGTTTACGTCGCCCGGGAGCGGCAGCCG 
CCGTCGCGCTCTTATCTGCGCGCGCCCGGGTTCAGTTTCCCGGACCCACCGAGGGACGGA GGCCCAGCCCCCGCGCCCACAGCGGCCTGGGGCCCAGGGAGGGCGGGTCCTGGCGCGGGG TCACCGCCTGGGACCGTCGCCCGGGCCGTGAGGACTGGACGCCCGCAGATCCGGGCGGGT GGGGCCCTCTGACGTCCCCCGAGGTGGGGCACGGGGGCGGGCGGGTCCGCGCTGCGGGCT GGAGGGGCGGGCGCGGGAGCCCAGCGTCCTGAGCGCACCCCTCGCAGCCCTGTCCAGCCT GGACGGCGACCTGGCGGGCCGATACTACGCGCTCAAGAGCATGACGGAGGCGGAGCAGCA GCAGCTCATCGACGACCACTTCCTCTTCGACAAGCCCGTGTCGCCCCTGCTGCTGGCCTC GGGCATGGCCCGCGACTGGCCCGACGCCCGCGGTATCTGGTGCGTGTCCCTCTGCGCCCT CTCGCGGCGTCCTCCCTCCCCGCTACCTCCGCTTTCCCTCTCGCCCCCCTCGCGGGGGTG GGGCCCCTCGCGGCGAGGAGGAGGAGGAGGAGGAGGGAGGGGCCGGCCGCGCTCCGGGTC TGGGTTCCGTGCCGCGCCTCCTCCTGCGCCGGTGACCTTGGCCGAGCAGGTGCGTTAAGG GACTGGGCCCCGGCCCGTGGGGGCTCAGGACTCAGCAACACCTCCCCACCCCGAGACGTG AGGTGGGGGCGGGGCTCTCTGGCGCCTCTCCCCGACGGCCCTGGGAGCTGGAGCTCTTTG TTTTCTTTTCTCACTCCTCCGCCGCTGGGATTCTACCAGGGGCTGGTGACGCCAAAGCTT CTCCAGGGGCAGGGCTCCTACCCCCACTGTGGGGGGCGGGTCGGGCTGTCCTGGCGGTCC CTGGCCCCGCCCCACCTCGGGCCACAGCGCATGATGGCAGCTGGGGTTCTCCTGCTGTGA GGCGTCCCGGTTCCCCCGCCCGCCCCGTGTTGGCGGGTGGAGTCTTGGCAGCAGCCTCCA CTCCTGGGCATGGCAGGGAGCAGCACCTCAGGGACTTGGGAAGTTCCTTTGGTCTGGGGG CGGCCTGGGGCTTTTTTCTGGGTATGCCCTGAGACCAGCCCTCCCGCAGGCACAATGACA ATAAGACCTTCCTGGTGTGGGTCAACGAGGAGGACCACCTGCGGGTCATCTCCATGCAGA AGGGGGGCAACATGAAGGAGGTGTTCACCCGCTTCTGCACCGGCCTCACCCAGGTGCCAG GGACGGGGCAGGCCCAGACCCCAGGGCCCCAGCAGGGATGTGGGTGCCCCAGCATCAGTC CCCCCGGGGGATTTCCGGCACTGGGGAGTCTCAGGGCCTGTAGGGGTTTCAGGCAGGCCT TCTCCCTCATACCCTCTTCTCCGTCTGCAGATTGAAACTCTCTTCAAGTCTAAGGACTAT GAGTTCATGTGGAACCCTCACCTGGGCTACATCCTCACCTGCCCATCCAACCTGGGCACC GGGCTGCGGGCAGGTGTGCATATCAAGCTGCCCAACCTGGGCAAGCATGAGAAGTTCTCG GAGGTGCTTAAGCGGCTGCGACTTCAGAAGCGAGGCACAGGTGAGCAGGGCAGGTGCTGC GGCTTCCCGTGGCCTTTGGGCAGCCCTGTTTCCTCCGCCCTGACTTGCTGTCTCCCCAGG CGGTGTGGACACGGCTGCGGTGGGCGGGGTCTTCGACGTCTCCAACGCTGACCGCCTGGG CTTCTCAGAGGTGGAGCTGGTGCAGATGGTGGTGGACGGAGTGAAGCTGCTCATCGAGAT GGAACAGCGGCTGGAGCAGGGCCAGGCCATCGACGACCTCATGCCTGCCCAGAAATGAAG CCCGGCCCACACCCGACACCAGCCCTGCTGCTTCCTAACTTATTGCCTGGGCAGTGCCCA 
CCATGCACCCCTGATGTTCGCCGTCTGGCGAGCCCTTAGCCTTGCTGTAGAGACTTCCGT CACCCTTGGTAGAGTTTATTTTTTTGATGGCTAAGATACTGCTGATGCTGAAATAAACTA GGGTTTTGGCCTGCCTGCGTCTGAGTGGTGCCTCTCCTTTCCCAGGGGGGAGGGGGAAGG GCAGCAGCCAGGCCCCAGGAGTCTTGAGTCCTGGGCCTGCTGTGGGCCTCGCCTTCTGTG AGATGGGACAAGAGCCAGGAGGTGGCCACTCTGTTCTGCCTGCCCTACCTAGTCCATGGG CCCCTTCCCTCGTGTCTATCGGGCTGTGCAGGCAGGAACATGGGAGAGAGCGAGGGAGGA
... or search a FASTA file to process
Paste your FASTA sequence here >MUSCRKNB CTGCAGCTGGACGTGGTGGCCCATGCCTTTAATCCCAGCACTTCGGAGGCAGAGGCAGGC GAATTTCTGAGTTCGAGGCCAGCCTGGTCTACAGAGTGAGTTCCAGGACAGCCAGGGCTA TACAGAGAAACCCTGTCTTAAACAAACAAACAAACCAAAAAAGAAAAAAAGAAAGGGTGT GCCGCCACGCAGGACCTCACCTCTCTATTTGTATCAACTGCACCATCTACCACACTTTTC TACCCACGACCATTCCCTAATGAGCATCAGATCGAATCCAAACTTTCTCCCTCTACACGT TTCCTGATCTGTTTGTAGAGAAGGGACACCGTCTTTCTGTAGGCTGCATTCCATGAAGTG GGAATGCTAGGTCATAGAGCAATTTAAAAATTTTTTAACAAATACTGCCAAATTACTTTC CAGAAAAAGATGAAGCTGCTGGCAGCTGCAGCAGGCAAGGTGCGAACGCCTATCTTCACA CATTCCAATTCACACCTGGCACTGGCTGTGGTCAAATTCCAGGTTTCCCCTGTGTGTGTG TGTGGGGGGGGGGGTGGCACGGGGAGAGAGACTGTTTCAATGTGTTGGATCTGATTAATG AGCTTAATTTTGACTGGTGCTTCCTCCACCCCCAGTTTGCCTGCCGGTGTTGGGGAGTTC TTTGTGGGTGCGGTGGTAGAAACCGGTCCCCCGAGAAAAGGGACAGGAGATGAGAATTTG GAGATTCAAAAACAGTCTCTAGGAACTAAATGTCTTGAATCTTTCAGGGTCTGGGTCCAA GGGTCTTGGCGGTGTTCCTTAGAGCCCGCCCAAGGTCAGAACACCCTGGGTGCTTCCGGG CAGGACCTCAGCCAGACTGCGCCTCTGGTGTTGCACCAAGAACACCCAGGAGATGCCCGC AGGCACCTTGAGGAAGGTCGGCGGCTCTGTGGCTCTTGAGGATATGGCCTGGATGCGGTG TTTGGATGCTGTACCCAATGTTAAAAACCTGGACTCTCACCATGACCCTCTCTACAACGC AGTCGGCCTGAGGAGACAGGGAAGGCAGGAGCCTCAAGTGCCATCTGGCTTCCCAGACTC AGTTTCCTGCCCCCAGGCAGTGTGTGTGAAGCAGGTCTAGGATATCTATGGCTCTGGGGT TTCGTCTTAGGGATCCTGTGCGAGCACCACCGGGGTGGCCAAACTGCGCAGCGGGGTCGA GACTTGGGGACCGCTAGGGTGGGCCGCTGGGGGTGTCAAGGGCTGCTGCCTCGGACAAAG CGGCGGCACCACCCCAAAGCGCGGACCAATGGAATGAATGGGCTATAAATAGCCGCCAAT GGGAGGCCGCGACGCGCCCCTTAAGAGCTCAGGGAGCAGCGAGCAGCCGTCGTTCTTCTG CGCTGCGCCAGGAGCTGCAAGCACAGGCATCCATCGCCCCTGCTTCGTCCGGCATCCCGG TGAGCGGGTCCAAGGGCCAGGGGTATAGTCCTGCGGGGCTCTACGCTGTGGGGTGCGGGA CCGGGGCATCACGCCGAGCCTCGCTGCCGGGAGCGCTCATCGGACCCGCCCGGGAACTTG GGATGCGCTGGACTCTGACGATGCAGACCTCGCTGACCTTGGTGGCGTCCAGGGAAAGTC TCGGGGGTCCCAAGTGGTTTGCAGGTCCCTGGCGGCCGGTGTTTAAATCTCCTCTGACCC CGCTCTTCCCTGCAGCCCGCTGCCGCCGCCATGCCCTTCTCCAACAGCCATAATACGCAG AAGCTGCGCTTCCCGGCCGAGGATGAGTTCCCTGATCTGAGCAGCCACAACAACCATATG GCCAAGGTGCTGACCCCGGAGCTGTACGCCGAGCTCCGTGCCAAGTGCACGCCGAGCGGC TTTACTTTGGACGACGCCATTCAGACTGGCGTAGACAATCCGGGTATGCACACCCCTGTA 
GCGTCAGGCTTCCGCCTCCCCAAGAAGCCCCCCGGGCAAGGATCCCACTGCTCTTCCCTG AACCTTCGGTGGGCTGGGGTCCCCTGTTCCCCTCTCCGCGCTTAGCCTTAAGAGCCTCGG CTTGCTCCTGCCTGACGGTGACGTCACTGTCGCCGCGCCCCTCCTCCAGGCCACCCGTAC ATCATGACTGTGGGTGCAGTGGCGGGCGACGAGGAGAGTTACGACGTATTCAAGGACCTC TTCGACCCCATTATTGAGGAGCGGCACGGCGGCTACCAGCCCAGTGATGAGCACAAGACC GACCTCAACCCAGACAACCTGCAGGTGCGGGGAATCAGGGTCCGGGCGTGCTGGGGAGAG GGGTCTCGGCGCTCACTCTGGCCACCACCTTGTATTCCCAGGGTGGCGATGACCTGGACC CCAACTACGTGCTGAGCTCGCGAGTGCGCACAGGCCGCAGCATCCGCGGCTTCTGTCTCC CCCCGCACTGCAGCCGCGGGGAGCGCCGCGCCATCGAGAAGCTGGCAGTAGAAGGTAGGG GTCGGGTAACAGCCGCCAGAGCTGCTGCGCTCTTCTCTGCGCGCGCGTTCCCACTGGGGA CTGAGGTCGAAGACCCACCTTCGCCCGCGGGGACACACTGCTGCAGGCGGGGCGGCGGGC TTGCTCGCCCGCCCACCCTGTCCTTGAACTCTGCACTCCGCAGCTCTGTCCAGCCTAGAT GGCGACCTGTCTGGCAGGTACTACGCGCTCAAGAGCATGACTGAGGCGGAGCAGCAGCAG CTCATTGACGACCACTTCCTCTTCGATAAGCCTGTGTCGCCTCTGCTGCTGGCCTCCGGC ATGGCCCGCGACTGGCCGGATGCTCGTGGCATATGGTACGAGCCCTCTCCCCCCCCAGTC CCCAGAAGGTGGGGCCTGCCCTGAATTCCTAGTTTGTGCAGTGCCTCCCTCCGCCCAGGT GACCTTGGTTCTGCCGATGACTGTGGTCCTTGCGCTGCGGGAGGCCGCAGTCTCCAGGGA CTCAAGGGTGGTGACCAGTCTCTTTGGCGTCTGTTCTCCGCCCTCCTCCTGGGAGCCGGC GCTTCTTGTTTTCTCTCCACCTTCTCACCCCCTTATTCCGCCGGGATCGTGCCAGGTGCC AATGACGCAAAAGCCTCCGCACCGTCCGGGCAGGGCTCCTACCCCTGCAGACTGAGCGGG CGGAGGCGTGCTTCCTCTAGTGGGATGCTCTGGAGGCTCCAGACCCTTGCGGGCCACACA GCACATGACTGGTGATTGAGATGTATATGAGCTGCCTTCCTCCACTGTTGCTGGGATTGG CGTCTTGGTGACAGCTCACATTCCAGCACTTTGTTGAGGAAGCATTTTGTTTGAGGTCAC CCACCTATAGTGTCTTTCTTGGGTGTGATCTAAGGCCACATTCTGGCAGGGACTCCAGCC TGACTTTACCCCTCTTCTAGGCACAATGACAATAAGACTTTCCTGGTGTGGATTAACGAG GAGGACCACCTGCGAGTCATCTCCATGCAGAAGGGGGGCAACATGAAGGAAGTGTTCACC CGATTCTGCACCGGCCTCACTCAGGTCAGGCCTGGGCATTCCCCAGGGCTTCCTCACTAG GGGCGTTGGTGCTGGGGAGGGATCAGGGGCAGATCAGAGCCCACATCTCCCAGGGTCCGT AGGGCTTCCGGCCAGGCTTCCTTCTCTTAACTCTCCCTTCTCCACCTTCAGATCGAAACT CTCTTCAAGTCCAAGAACTATGAGTTCATGTGGAATCCTCACCTGGGCTACATCCTCACA TGCCCATCCAACCTGGGCACCGGACTGCGGGCAGGTGTACACATCAAGCTGCCCCACCTG GGGAAGCACGAGAAGTTCTCGGAGGTGCTCAAGCGGCTGCGGCTTCAGAAGCGAGGCACA 
GGTGAGGGGCAGAAGGCACAGGTGAGGGGCAGAAGGTGCAGGTGTTGTCCTCAGCCCCGC TGACCTGCCCATGTCCCACCGCAGGTGGCGTGGACACCGCTGCTGTCGGTGGGGTGTTTG ATGTCTCCAACGCTGACCGCCTGGGCTTCTCGGAGGTGGAACTGGTGCAGATGGTGGTGG ACGGAGTGAAACTACTCATTGAGATGGAGCAACGGCTCGAGCAGGGTCAGGCAATCGATG ACCTCATGCCGGCCCAGAAGTGAAGCCTGGCCCTCGCCACCATCAGGCTGCCGCTTCCTA ACTTATTACCCGGGCAGTGCCCGCCATGCATCCTTGATGTTTGCCGCCTGGCGCTGAGCC CTTAGCCTCGCTGTAGAGACTTCTGTCGCCCTGGGTAGAGTTTATTTTTTTGATGGCTAA GCTGTTGCTGACACTGAAAATAAACTAGGGTTTGGCCTGCCCTATGTCCGAGTGTTGCTT CTCCTTTCCTAGAGACAGTGTGTGTGTGTGTGTGTGTATGTGTGTGCGCGAGCTGGCCTT CTGTGTCATCTCACCTAGCAGATGAAACATGAGCCATAGAAGATACAGGGCAGAGAGGGA GGGAGGCTCTGAGTCCAGCCTTGAGCATCTAAGGACATCTGTGCTTGCAGGGTGGAGCCT TAGTGTTTCCTTAGTCCCAGC
Output format: geneid; GFF; geneid including CDS sequence; geneid extended (only genes); GFF extended (only genes)
Do you want the tblastx output?
To see a dummy example with the form ready to submit, click here.
This example contains the following features:
The sgp2 server is the web interface to sgp2, a program that predicts genes, exons, splice sites and other signals along a DNA sequence. Visit the sgp2 homepage for more information about this program.
Version 1 of sgp2 was mostly written by Genís Parra, Josep F. Abril and Roderic Guigo. The sgp2 server was written by Enrique Blanco and Genís Parra, with the always useful help of Moises Burset and Josep F. Abril.