We observe the following empirical frequencies for 2-mers in
H. influenzae, given in the table below. The first nucleotide s(i)
is given by the row and the second nucleotide s(i+1) by the
column; hence the frequency of AC is 0.0505. Convert this
frequency matrix into a transition matrix for the Markov model of
di-nucleotide sequences discussed in class. Note that each entry of
the transition matrix is the conditional probability P(s(i+1) | s(i)).
|   | A | C | G | T |
|---|---|---|---|---|
| A | 0.1202 | 0.0505 | 0.0483 | 0.0912 |
| C | 0.0665 | 0.0372 | 0.0396 | 0.0484 |
| G | 0.0514 | 0.0522 | 0.0363 | 0.0499 |
| T | 0.0721 | 0.0518 | 0.0656 | 0.1189 |
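
The conversion can be sketched in a few lines of Python. This is a minimal sketch rather than the course's reference solution: it assumes the table is read with the first nucleotide as the row, so that AC = 0.0505 as stated, and it treats each entry as the joint probability P(s(i), s(i+1)). Dividing each entry by its row total then gives P(s(i+1) | s(i)). The names `freq`, `to_transition_matrix`, and `transition` are illustrative and do not come from the problem statement.

```python
import numpy as np

NUCLEOTIDES = ["A", "C", "G", "T"]

# Joint 2-mer frequencies P(s(i), s(i+1)).
# Assumed orientation: rows = first nucleotide s(i), columns = second s(i+1),
# chosen so that the AC entry is 0.0505 as stated in the text.
freq = np.array([
    [0.1202, 0.0505, 0.0483, 0.0912],  # A followed by A, C, G, T
    [0.0665, 0.0372, 0.0396, 0.0484],  # C followed by A, C, G, T
    [0.0514, 0.0522, 0.0363, 0.0499],  # G followed by A, C, G, T
    [0.0721, 0.0518, 0.0656, 0.1189],  # T followed by A, C, G, T
])


def to_transition_matrix(joint):
    """Divide every row by its sum so that each row is P(s(i+1) | s(i))."""
    row_totals = joint.sum(axis=1, keepdims=True)  # estimate of P(s(i))
    return joint / row_totals


transition = to_transition_matrix(freq)

# Print the transition matrix with row labels.
for nucleotide, row in zip(NUCLEOTIDES, transition):
    print(nucleotide, np.round(row, 4))
```

Each row of `transition` sums to 1, which is a quick check that the result is a valid transition matrix. Normalizing by row totals rather than by separately measured single-nucleotide frequencies keeps the rows exactly normalized even though the published frequencies are rounded.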