4. Promoter prediction I: methods
EMBL - UPF course 2001
   
 

Pattern driven

  1. Collecting a set of real transcription factor (TF) binding sites and building a characteristic representation, or profile, from them.
  2. Searching the input sequences for potential binding sites using that profile.
  3. Assembling the binding sites found, following rules about how such sites are arranged, to rebuild the promoter region.
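
Steps 1 and 2 above can be sketched with a position weight matrix as the profile. This is only an illustrative sketch: the sites in `SITES`, the uniform background of 0.25, the pseudocount, the threshold and the function names `build_pwm` and `scan` are all assumptions made for the example, not data or methods taken from the course.

```python
import math

# Hypothetical aligned binding sites (toy data, not real TF sites).
SITES = ["TATAAA", "TATAAT", "TATATA", "TACAAA"]
BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Step 1: build a log-odds profile, one {base: score} dict per position."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Step 2: return (position, score) for windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


pwm = build_pwm(SITES)
hits = scan("GGCTATAAAGGC", pwm, threshold=3.0)  # assumed, arbitrary threshold
```

Step 3 would then take the positions in `hits` for several profiles and keep only those combinations that satisfy the chosen arrangement rules (for instance, a maximum distance between two sites).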


Sequence driven

Making pairwise comparisons (alignments) between the input sequences to find common patterns, which correspond to well-conserved functional binding sites, without using any additional information.
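
As a rough illustration of the pairwise idea, the sketch below compares every pair of input sequences and reports the longest exact substring they share. This is a deliberately simplified stand-in for a real local alignment (no mismatches or gaps are allowed); the sequences and the function name `longest_common_substring` are hypothetical.

```python
from itertools import combinations


def longest_common_substring(a, b):
    """Longest exact substring shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical input sequences (toy data).
SEQUENCES = {
    "seq1": "GGCTATAAAGGC",
    "seq2": "ACGTTATAAACC",
    "seq3": "TTTATAAAGCGA",
}

# One comparison per pair of sequences; no prior profile is used.
common = {
    (name1, name2): longest_common_substring(s1, s2)
    for (name1, s1), (name2, s2) in combinations(SEQUENCES.items(), 2)
}
```

Patterns that recur across many pairs (here, variants of a TATA-like word) are the candidates for conserved binding sites.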


   
 

Some recent approaches:

  • (Statistical) Discriminant analysis, regression analysis
  • Consensus sequences (regular expressions)
  • Position weight matrices
  • Neural networks
  • Clustering of putative binding sites
  • Oligonucleotide counts (word frequency), Markov models
  • Hidden Markov models
  • Pairwise alignment, multiple alignment
  • Iterative methods: Gibbs sampling
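
To make one of the listed approaches concrete, here is a minimal sketch of the consensus-sequence approach: a consensus written with IUPAC ambiguity codes is translated into a regular expression and searched in a sequence. The consensus `TATAWA`, the test sequence and the function names are assumptions chosen for illustration only.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[symbol] for symbol in consensus.upper())


def find_sites(sequence, consensus):
    """Return (position, matched word) for every match, overlaps included."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical TATA-like consensus and toy sequence.
sites = find_sites("GGCTATAAAGGCTATATAT", "TATAWA")
```

A consensus is easy to read and fast to search, but it treats every allowed base as equally good, which is the limitation that position weight matrices address.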


Enrique Blanco, Sergi Castellano and Genis Parra © 2001