Question

Consider two homologous DNA sequences, GATTC and CCATG. Use the Needleman-Wunsch algorithm to find the optimal global alignment between these two sequences. Use a linear gap penalty of -4 and the substitution matrix provided below. The dynamic programming matrix is already outlined below, you just need to fill it according to the algorithm. Be sure to write out your final alignment!

substitution matrix A C GT A 10 -5 0 -5 C -5 10 -5 0 G 0 -5 10 -5 T -5 0 5 10 gaps = -4G A TTC

0 0
Add a comment Improve this question Transcribed image text
Answer #1

here we use two matrix one is score matrix and other is trace back matrix

with the help of score matrix we fill trace back matrix and with the trace back matrix we find best optimal global alignment

given sequence is GATTC and CCATG

we make both matrix for the given sequence

score matrix

G A T T C
   0
C
C
A
T
G

since gap =-4

fill first row and first column by keep adding gap each time with start at 0.

so now score matrix is

G A T T C
   0 -4 -8 -12 -16 -20
C   -4
C -8
A   -12
T -16
G   -20

for filling other box of matrix use following Dynamic Programming formula

D(i,j)=max\left\{\begin{matrix} D(i-1,j-1)+S(X_i,Y_j) & & \\ D(i-1,j)+gap& & \\ D(i,j-1)+gap& & \end{matrix}\right.

where S(Xi,Yj) is the substitution score for residue i,j

we fill matrix accordingly

G A T T C
   0 -4 -8 -12 -16 -20
C   -4 -5   
C -8
A   -12
T -16
G   -20

look at how come highlighed entry

D(1,1) = max{ D(0,0)+S(C,G), D(0,1)+gap, D(1,0)+gap }

look at matrix D(0,0) =0 , D(0,1) = -4 , D(1,0) = -4 and from substitution matrix which is given in question look entry for S(C,G) = -5

now D(1,1) = max{ 0-5 , -4-4 ,-4-4} =max{-5, -8, -8} = -5

since this entry come from diagonal so fill tace back matrix with diagonal

G A T T C
C dia      
C   
A
T
G

trace back entry tell us corresponding score matrix come from diagonal or up or left so it is useful for find optimize global alignment

similarly fill score matrix enty and corresponding entry

score matrix

G A T T C
   0 -4 -8 -12 -16 -20
C   -4 -5    -9 -8 -12 -6
C -8 -9 -10 -9 -8 -2
A   -12 -8 1 -3 -8 -6
T -16 -12 -3 11 7 3
G   -20 -6 -7 7   6 2

trace back matrix

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up    dia dia/left

now look one more highlighted entry

look at how come highlighed entry

D(5,5) = max{ D(4,4)+S(C,G), D(5,4)+gap, D(4,5)+gap }

look at matrix D(4,4) =7 , D(4,5) = 3 , D(5,4) = 6 and from substitution matrix which is given in question look entry for S(G,C) = -5

now D(5,5) = max{ 7-5 , 6-4 ,3-4} =max{2, 2, -1} = 2

since this entry come from diagonal and left so fill tace back matrix with diagonal

similarly we fill whole matrix

now trace traceback matrix from right bottom index and trace it path

since right bottom entry is dia/left so we move both path let's suppose we move diagonal up means 4th row and 4th column which again has diagonal/left entry suppose we move diagonal up means 3rd row and 3rd column which has left so we move left then we are 3rd row and 2nd column which has entry as diagonal now we have 2nd row and first column which has entry diagonal and up suppose we move up so our tracing is complete...

look at trace back highlighed entry

trace back matrix

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up    dia dia/left

so sequence is CAATG(row) and GATTC (column) entry which is global alignment .

similarly explore all entryu from trace back matrix...

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up    dia dia/left

so sequence is CCAATG(row) and GGATTC (column) entry which is global alignment .

and

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up    dia dia/left      left
G     dia       up    up    dia dia/left

so sequence is CATTG(row) and GATTC (column) entry which is global alignment .

and

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up    dia dia/left

so sequence is CCATTG(row) and GGATTC (column) entry which is global alignment .

and

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up dia dia/left

so sequence is CATGG(row) and GATTC (column) entry which is global alignment

and

G A T T C
C dia dia/left dia dia/left dia
C dia/up dia    dia     dia     dia
A    dia     dia left    left up
T up   up     dia dia/left      left
G     dia       up    up    dia dia/left

so sequence is CCATGG(row) and GGATTC (column) entry which is global alignment .

Add a comment
Know the answer?
Add Answer to:
Consider two homologous DNA sequences, GATTC and CCATG. Use the Needleman-Wunsch algorithm to find the optimal...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Let S and T be two sequences of length n and m, respectively. When calculating the...

    Let S and T be two sequences of length n and m, respectively. When calculating the dynamic programming table to find the optimal global alignments between the two sequences S and T, we can keep pointers to find the optimal alignments by following these pointers from cell (n, m) to cell (0, 0). Each of the paths represents a different optimal alignment for the two sequences. a) Give an algorithm in O(nm) which calculates the number of different alignments between...

  • 5. Biophysics 5. Based only on polarity of the amino acids (i.e., two non-identical amino acids...

    5. Biophysics 5. Based only on polarity of the amino acids (i.e., two non-identical amino acids are considered similar if they are both hydrophobic, or polar, or charged), 5. Based only on polarity of the amino acids (i.e., two non-identical A) how would you manually perform the global alignment of the two polypeptide sequences (gaps are allowed): VLLVAKKR ITSVVPKR and ILLVKKKLTTVVLPKK? B) Complete the scoring matrix for the alignment of the above sequences if a match score is 1, mismatch...

  •    Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented...

       Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented as strings with characters in the set {A, G, C,T}. The similarity measure of r and y is defined as the maximum score of any alignment of r and y, where the score for an alignment is computed by adding substitution score and deletion and insertion scores, as explained below. (Some operations have negative scores.) The score for changing a character T, into a...

  • Use the dynamic programming technique to find an optimal parenthesization of a matrix-chain product whose sequence...

    Use the dynamic programming technique to find an optimal parenthesization of a matrix-chain product whose sequence of dimensions is <5, 8, 4, 10, 7, 50, 6>. Matrix Dimension A1 5*8 A2 8*4 A3 4*10 A4 10*7 A5 7*50 A6 50*6 You may do this either by implementing the MATRIX-CHAIN-ORDER algorithm in the text or by simulating the algorithm by hand. In either case, show the dynamic programming tables at the end of the computation. Using Floyd’s algorithm (See Dynamic Programming...

  • 1. Homologous recombination can happen between non-identical DNA sequences. T/F? 2. Homologous recombination can happen in_______...

    1. Homologous recombination can happen between non-identical DNA sequences. T/F? 2. Homologous recombination can happen in_______ a) meiosis b) mitosis c) both 3. Homologous recombination in meiosis has the main purpose of_____ a) DNA repair b) Creating new chromosomes   c) Sealing double-stranded breaks 4. Strand invasion usually happens without enzymatic assistance. T/F? 5. When replication fork runs into a nick, it results in a_______ a) single-stranded break b) double-stranded break 6. The invading end is usually a _______ a) 3'...

  • please use c program Population of DNA. In previous weeks, we worked with DNA sequences. Oftentimes,...

    please use c program Population of DNA. In previous weeks, we worked with DNA sequences. Oftentimes, geneticists nav to deal with not just one DNA sequence, but a whole set, or population, of samples. We will use a character matrix to store the DNA sequences. Please write the following functions. (a) void setupRand DNA Pop (int n, int m char pop[] [COLS_MAX]) This function creates a population of n random DNA sequences, each of length m. Hint. Remember to terminate...

  • jnment Score: Resources Give Up? Hint Check Answer estion of 10 > Consider the two sequence...

    jnment Score: Resources Give Up? Hint Check Answer estion of 10 > Consider the two sequence alignments. Alignment 1. A-SNLFDIRLIG GSNDFYEVKIMD Alignment 2 ASNLFDIRLI-G GSNDFYEVKIMD Calculate the alignment scores of each sequence alignment using identity-based scoring and the Blosum 62 substitution matrix. The identity-based scores should incorporate a gap penalty. In this case, the Blum-62 substitution matrit does not impose a gap penalty Blosum-62 Substitution Matrix Ala Arg Asn Asp Cys Gin Glu Gly His Ile Leu Lys Met Phe...

  • Use BLAST to find DNA sequences in databases Perform a BLAST search as follows: Do an...

    Use BLAST to find DNA sequences in databases Perform a BLAST search as follows: Do an Internet search for “ncbi blast”. Click on the link for the result: BLAST: Basic Local Alignment Search Tool. Under the heading “Basic BLAST,” click on “nucleotide blast”. pMCT118_F   5’- GAAACTGGCCTCCAAACACTGCCCGCCG -3’ (forward primer) pMCT118_R   5’- GTCTTGTTGGAGATGCACGTGCCCCTTGC -3’ (reverse primer) Enter the pMCT118 primer (query) into the search window. (see Moodle metacourse page for the file – just copy and paste the sequence into the...

  • Align the same two sequences in part one with the new scoring scheme: This question relates...

    Align the same two sequences in part one with the new scoring scheme: This question relates to Bioinformatics --- Genome Sequence Analysis. Below doesn't match the question above but should give you an idea what it should look like. Answer should be in this format: We would like to align two DNA sequences: (v) C GATACT, and (w) GATIC GT based on the following scoring scheme as discussed in class: i) s(i, j) = 1 if Vi = w; (matches);...

  • *SOLVE QS 13 ONLY 11. (5 pts) We would like to align two DNA sequences: (v)GATTCGT, and (w) GAATTAGTT based on the following scoring scheme as discussed in class: s(i i-1 if v w (matches) ii) s(i, j)...

    *SOLVE QS 13 ONLY 11. (5 pts) We would like to align two DNA sequences: (v)GATTCGT, and (w) GAATTAGTT based on the following scoring scheme as discussed in class: s(i i-1 if v w (matches) ii) s(i, j) = 0 if vis wh (mismatches); ii) d 0 What would be the maximum alignment score? Explain how you get the result. (indels: insertions or deletions). 12. (5 pts) Align the same two sequences in the previous problem with the new scoring...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT