In this exercise, we run a Gibbs sampler to discover unknown but conserved
patterns in a set of input sequences. The discovered pattern, represented
as a weight matrix, is then used to search the same sequences for further
occurrences.
ADVICE: It helps to keep two or more browser windows open, with this
text in one of them and the exercise running in another.
Input sequences:
6 genes of Drosophila melanogaster.
WWW tools:
Regulatory Sequence Analysis Tools (RSA) by Jacques van Helden
from SCMBB - Service de Conformation des Macromolécules Biologiques et de Bioinformatique (Université Libre de Bruxelles).
Step 1: Run a Gibbs sampler on a set of sequences.
- Read the promoter regions in FASTA format.
- Connect to RSA tools server.
- In the menu (left frame), click Pattern discovery: gibbs (matrices).
- Copy and paste the promoter regions into the Sequence box.
- Click the button Go to submit the query.
- Inspect the output: best motif, weight matrix and information content.
- Press the button pattern matching (patser) to
use this matrix to find new occurrences in the same set of sequences.
- Press the button Go.
- Press the button Feature map to enter
a new menu about plotting the results.
- Press the button Go to obtain
a graphical output of the reported matches.
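
To get a feeling for what the sampler does in Step 1, the following Python
sketch implements a basic site sampler and computes the information content
of the resulting count matrix. It is only an illustration and not the
program behind the RSA tools form: it assumes exactly one site per sequence,
a uniform background, and sequences made only of uppercase A, C, G and T.
The function names (gibbs_sampler, count_matrix, information_content) and
the toy sequences are invented for this example.

```python
import math
import random

ALPHABET = "ACGT"


def count_matrix(seqs, starts, width, skip=None):
    """Column-wise base counts over the current sites, optionally leaving one sequence out."""
    counts = [dict.fromkeys(ALPHABET, 0) for _ in range(width)]
    for i, (seq, start) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for col in range(width):
            counts[col][seq[start + col]] += 1
    return counts


def gibbs_sampler(seqs, width, iterations=500, pseudocount=1.0, seed=0):
    """Site sampler: one site per sequence, uniform background (both are assumptions)."""
    rng = random.Random(seed)
    # Start from one random site in every sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build the matrix from all sequences except the one being resampled.
            counts = count_matrix(seqs, starts, width, skip=i)
            total = len(seqs) - 1 + 4 * pseudocount
            weights = []
            for start in range(len(seq) - width + 1):
                w = 1.0
                for col in range(width):
                    freq = (counts[col][seq[start + col]] + pseudocount) / total
                    w *= freq / 0.25  # likelihood ratio: motif model vs background
                weights.append(w)
            # Draw the new site with probability proportional to its weight.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts, count_matrix(seqs, starts, width)


def information_content(counts):
    """Total information content in bits, relative to a uniform background."""
    bits = 0.0
    for column in counts:
        n = sum(column.values())
        for c in column.values():
            if c:
                p = c / n
                bits += p * math.log2(p / 0.25)
    return bits


if __name__ == "__main__":
    toy = ["TTGACGTCAATTCA", "CCATGACGTCAGGT", "GATGACGTCATCAC", "AGGTTGACGTCATT"]
    starts, counts = gibbs_sampler(toy, width=8)
    print(starts, round(information_content(counts), 2))
```

Because the sampler is stochastic, different seeds can converge to different
sites; running it several times and keeping the matrix with the highest
information content is a common precaution.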
Step 2: Modify some parameters of the sampler.
- Modify the size of the pattern to generate: increase or decrease the value in
the box Matrix length and repeat the process.
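
One possible way to compare the patterns obtained with different matrix
lengths is to reduce each matrix to a consensus string and slide the shorter
consensus along the longer one. The Python sketch below does this under
simplifying assumptions: each matrix is given as a list of per-column count
dictionaries, only ungapped placements fully inside the longer pattern are
tried, and the helper names (consensus, best_overlap) are hypothetical.

```python
def consensus(counts):
    """Most frequent base per column; ties are resolved alphabetically."""
    return "".join(max(sorted(col), key=col.get) for col in counts)


def best_overlap(short, long):
    """Slide `short` along `long`; return (offset, matches) of the best ungapped placement."""
    if len(short) > len(long):
        raise ValueError("the first pattern must not be longer than the second")
    best = (0, -1)
    for offset in range(len(long) - len(short) + 1):
        # Count identical positions for this placement.
        matches = sum(a == b for a, b in zip(short, long[offset:]))
        if matches > best[1]:
            best = (offset, matches)
    return best
```

A number of matches close to the length of the shorter pattern suggests
that both runs found the same core; a low number suggests unrelated patterns.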
Questions:
- Can you find the "generator" patterns that were used to build the
matrix in the plot of the newly found occurrences?
- When you change the size of the pattern, do you get subsets of the
same core pattern, or completely different patterns?
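
As an offline cross-check for the first question, a count matrix can be
turned into a log-odds weight matrix and slid along a sequence to list the
matching positions. The Python sketch below is a simplified stand-in for the
matching step and does not reproduce the exact scoring used by patser: it
assumes a uniform background, a fixed pseudocount, forward strand only, and
sequences made only of uppercase A, C, G and T. The names log_odds_matrix
and scan, and the example counts, are invented for illustration.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-column base counts into log2 odds against a uniform background."""
    matrix = []
    for col in counts:
        total = sum(col.values()) + pseudocount * len(col)
        matrix.append(
            {b: math.log2((c + pseudocount) / total / background) for b, c in col.items()}
        )
    return matrix


def scan(seq, matrix, threshold=0.0):
    """Return (start, score) for every window scoring at least `threshold`."""
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        score = sum(matrix[col][seq[start + col]] for col in range(width))
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    # Counts built from three hypothetical sites: GACG, GACG, GACT.
    counts = [
        {"A": 0, "C": 0, "G": 3, "T": 0},
        {"A": 3, "C": 0, "G": 0, "T": 0},
        {"A": 0, "C": 3, "G": 0, "T": 0},
        {"A": 0, "C": 0, "G": 2, "T": 1},
    ]
    print(scan("TTGACGTT", log_odds_matrix(counts)))
```

If the sites that generated the matrix are real occurrences, they should
appear among the highest-scoring hits, together with any additional matches.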
Results: