Weight matrix:
Given a collection of known binding sites, a weight matrix is computed by measuring
the frequency of each nucleotide at each position of the site (the weight). The
score of any putative site is then the sum of the matrix values that correspond to
its sequence (a log-likelihood score).
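The text does not give an explicit formula, so the following is one common way to write the weight and the score, offered as a sketch. The symbols are assumptions introduced here: n_{b,i} is the count of nucleotide b at position i, N is the number of known sites, L is the site length, q_b is an assumed background frequency of b (for example 0.25), and \alpha is an assumed pseudocount that keeps the logarithm finite when a count is zero.

```latex
W_{b,i} = \log_2 \frac{(n_{b,i} + \alpha) / (N + 4\alpha)}{q_b},
\qquad
S(x) = \sum_{i=1}^{L} W_{x_i,\, i}
```

Here x = x_1 ... x_L is the putative site, and S(x) adds one matrix entry per position, which is where the independence between positions enters.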
Disadvantages:
A cut-off (threshold) score is required to filter putative sites. Different positions
are assumed to make independent contributions to DNA-protein binding.
Example (aligned binding sites):
TACGAT
TATAAT
TATAAT
GATACT
TATGAT
TATGTT
Weight matrix:
Base | 1 | 2 | 3 | 4 | 5 | 6 |
A    | 0 | 6 | 0 | 3 | 4 | 0 |
C    | 0 | 0 | 1 | 0 | 1 | 0 |
G    | 1 | 0 | 0 | 3 | 0 | 0 |
T    | 5 | 0 | 5 | 0 | 1 | 6 |
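
The count matrix above can be reproduced from the six example sites, and the scoring and cut-off steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the pseudocount of 1, the uniform background of 0.25, the threshold of 5.0, and the test sequence are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Count how often each base occurs at each position of the aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    counts = {b: [0] * length for b in BASES}
    for site in sites:
        for i, base in enumerate(site):
            counts[base][i] += 1
    return counts

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log2 odds against a uniform background (assumed)."""
    length = len(counts["A"])
    n_sites = sum(counts[b][0] for b in BASES)
    total = n_sites + 4 * pseudocount
    return {
        b: [math.log2((counts[b][i] + pseudocount) / total / background)
            for i in range(length)]
        for b in BASES
    }

def score(site, weights):
    """Sum one matrix value per position (positions treated as independent)."""
    return sum(weights[base][i] for i, base in enumerate(site))

def scan(sequence, weights, threshold):
    """Slide a window over the sequence and keep windows scoring >= threshold."""
    length = len(weights["A"])
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        s = score(window, weights)
        if s >= threshold:
            hits.append((start, window, s))
    return hits

sites = ["TACGAT", "TATAAT", "TATAAT", "GATACT", "TATGAT", "TATGTT"]
counts = count_matrix(sites)
weights = log_odds_matrix(counts)

for base in BASES:
    print(base, counts[base])

# Hypothetical sequence and cut-off, chosen only for illustration.
hits = scan("CCTATAATCC", weights, threshold=5.0)
print(hits)
```

The choice of threshold decides which windows are reported, which is the first disadvantage noted above: a lower cut-off returns more candidate sites, a higher one fewer.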