Question

Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented as strings with characters in the

  

0 0
Add a comment Improve this question Transcribed image text
Answer #1

The solution is done with python code and using score_matrix for dynamic programming.

1. recurrence relation is :

score_matrix[i,j] = max (score_matrix[i – 1, j – 1] + GET_SCORE(X[i – 1], Y[j – 1]),
score_matrix[i, j – 1] -0.2
score_matrix[i – 1, j] -0.2);

here score_matrix[i,j] is the ith rown and jth column of the matrix, the first parameter of max calculate the score in case of match or mismatch of elements, the second parameter calculate the score in case element in 1st string is replaced by _ , and the last parameter calculate the score in case element in 2nd string is replaced by _ .

2. first three rows of the table are:

[[ 0.0, -1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -7.0,  -8.0, -9.00, -10.0, -11.0,-12.0, -13.0, -14.0,-15.0, -16.0,-17.0,-1.8.0 ,-19.0, -20.0]
[-1.0, -10.1, 0.0, -0.2, -0.4, -0.6, -0.8, -1.0, -1.2, -1.4,-1.6, -1.8,  -2.0, -2.2, -2.4, -2.6,  -2.8, -3.0, -3.2, -3.4, -3.6]
[-2.0, -0.3, -0.2, -0.1, 0.8,   0.6, 0.4, 0.2, 0, -0.2, -0.4, -0.6,  -0.8, -1.0, -1.2, -1.4,  -1.6, -1.8, -2.0, -2.2,  -2.4]]

overall similarity code: 8.3

3. alignments:

-
C

T-
TC

-T-
TTC

--T-
GTTC

C--T-
CGTTC

-C--T-
ACGTTC

--C--T-
TACGTTC

---C--T-
ATACGTTC

----C--T-
CATACGTTC

-----C--T-
CCATACGTTC

------C--T-
CCCATACGTTC

-------C--T-
CCCCATACGTTC

--------C--T-
ACCCCATACGTTC

---------C--T-
TACCCCATACGTTC

----------C--T-
ATACCCCATACGTTC

-----------C--T-
AATACCCCATACGTTC

------------C--T-
TAATACCCCATACGTTC

-------------C--T-
CTAATACCCCATACGTTC

G-------------C--T-
GCTAATACCCCATACGTTC

-G-------------C--T-
AGCTAATACCCCATACGTTC

T-G-------------C--T-
-AGCTAATACCCCATACGTTC

GT-G-------------C--T-
--AGCTAATACCCCATACGTTC

CGT-G-------------C--T-
---AGCTAATACCCCATACGTTC

GCGT-G-------------C--T-
----AGCTAATACCCCATACGTTC

GGCGT-G-------------C--T-
-----AGCTAATACCCCATACGTTC

CGGCGT-G-------------C--T-
------AGCTAATACCCCATACGTTC

GCGGCGT-G-------------C--T-
-------AGCTAATACCCCATACGTTC

GGCGGCGT-G-------------C--T-
--------AGCTAATACCCCATACGTTC

TGGCGGCGT-G-------------C--T-
---------AGCTAATACCCCATACGTTC

CTGGCGGCGT-G-------------C--T-
----------AGCTAATACCCCATACGTTC

GCTGGCGGCGT-G-------------C--T-
-----------AGCTAATACCCCATACGTTC

CGCTGGCGGCGT-G-------------C--T-
------------AGCTAATACCCCATACGTTC

ACGCTGGCGGCGT-G-------------C--T-
-------------AGCTAATACCCCATACGTTC

AACGCTGGCGGCGT-G-------------C--T-
--------------AGCTAATACCCCATACGTTC

GAACGCTGGCGGCGT-G-------------C--T-
---------------AGCTAATACCCCATACGTTC

TGAACGCTGGCGGCGT-G-------------C--T-
----------------AGCTAATACCCCATACGTTC

GTGAACGCTGGCGGCGT-G-------------C--T-
-----------------AGCTAATACCCCATACGTTC

best alignment: GTGAACGCTGGCGGCGT-G-------------C--T-

  -----------------AGCTAATACCCCATACGTTC

4. code for this:

import numpy as np d={'A':{'A':1,'G':-0.1,'C':-0.1,'T':-0.15},'G':{'A':-0.1,'G':1,'C':-0.15,'T':-0.1},'C':{'A':-0.1,'G':-0.15,'C':1,'T':-0.1},'T':{'A':-0.15,'G':-0.1,'C':-0.1,'T':1},} def GET_SCORE(n1, n2, penalty=-1, reward=1): return d[n1][n2] def global_alignment(X, Y, penalty=-1, reward=1): # initialize score matrix score_matrix = np.ndarray((len(X) + 1, len(Y) + 1)) for i in range(len(X) + 1): score_matrix[i, 0] = penalty * i for j in range(len(Y) + 1): score_matrix[0, j] =penalty * j # define each cell in the matrix by as the max score possible in that stage for i in range(1, len(X) + 1): for j in range(1, len(Y) + 1): match = score_matrix[i - 1, j - 1] + GET_SCORE(X[i - 1], Y[j - 1], penalty, reward) delete = score_matrix[i - 1, j] - 0.2 insert = score_matrix[i, j - 1] - 0.2 score_matrix[i, j] = max([match, delete, insert]) print(score_matrix) print() print(np.amax(score_matrix)) i = len(X) j = len(Y) align_X = "" align_Y = "" while i > 0 or j > 0: current_score = score_matrix[i, j] left_score = score_matrix[i - 1, j] if i > 0 and j > 0 and X[i - 1] == Y[j - 1]: align_X = X[i - 1] + align_X align_Y = Y[j - 1] + align_Y i = i - 1 j = j - 1 elif i > 0 and current_score == left_score + penalty: align_X = X[i - 1] + align_X align_Y = "-" + align_Y i = i - 1 else: align_X = "-" + align_X align_Y = Y[j - 1] + align_Y j = j - 1 print(align_X) print(align_Y) print() return align_X, align_Y X='GTGAACGCTGGCGGCGTGCT' y='AGCTAATACCCCATACGTTC' print(global_alignment(X,y))
Add a comment
Know the answer?
Add Answer to:
   Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT