Let S and T be two sequences of lengths n and m, respectively. When filling in the dynamic programming table used to find the optimal global alignments of S and T, we can store pointers in each cell that record which neighboring cells produced its optimal score. Following these pointers from cell (n, m) back to cell (0, 0) traces out a path, and each such path represents a different optimal alignment of the two sequences.
a) Give an algorithm running in O(nm) time that calculates the number of different alignments between the two sequences. (Hint: Use dynamic programming.)
b) Build the dynamic programming table of your algorithm for the sequences X = AABAACAAA and Y = ABACAA.
Answer:
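For part (a), one possible approach is to fill a second table of path counts alongside the usual score table: the count in a cell is the sum of the counts of those predecessor cells that achieve the cell's optimal score, which is exactly the number of pointer paths from that cell back to (0, 0). The sketch below assumes the question means optimal alignments, and it assumes a simple scoring scheme (match = +1, mismatch = -1, gap = -1) because the question does not specify one. The function name `count_optimal_alignments` and its parameters are hypothetical, chosen only for illustration.

```python
def count_optimal_alignments(s, t, match=1, mismatch=-1, gap=-1):
    """Return (best_score, number_of_optimal_alignments, count_table).

    The scoring values are assumptions; the question does not give them.
    Runs in O(n*m) time and space for len(s) = n and len(t) = m.
    """
    n, m = len(s), len(t)
    # score[i][j]: best global alignment score of s[:i] and t[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # count[i][j]: number of optimal alignments of s[:i] and t[:j]
    count = [[0] * (m + 1) for _ in range(n + 1)]
    count[0][0] = 1

    # First column and first row: only gaps are possible, so one path each.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        count[i][0] = 1
    for j in range(1, m + 1):
        score[0][j] = j * gap
        count[0][j] = 1

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + sub, count[i - 1][j - 1]),  # diagonal
                (score[i - 1][j] + gap, count[i - 1][j]),          # gap in t
                (score[i][j - 1] + gap, count[i][j - 1]),          # gap in s
            ]
            best = max(value for value, _ in candidates)
            score[i][j] = best
            # Add up the counts of every predecessor that ties for the best.
            count[i][j] = sum(c for value, c in candidates if value == best)

    return score[n][m], count[n][m], count
```

For part (b), calling `count_optimal_alignments("AABAACAAA", "ABACAA")` and printing the rows of the returned count table gives the dynamic programming table under the assumed scoring scheme. A different scoring scheme can change both the optimal score and the counts.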
Alphabet:
An alphabet, denoted by Σ, is a finite, unordered set of symbols; e.g.,
DNA: Σ_D = {A, C, G, T}
RNA: Σ_R = {A, C, G, U}
Amino acids: Σ_AA = {A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y}
Sequences or Strings:
A sequence or string, s, is a finite succession of symbols from Σ. We say that s ∈ Σ*, where Σ* is the set of all sequences over the alphabet Σ, including the empty sequence, ∅. For example, Σ_R* = {∅, A, C, G, U, AA, AC, AG, AU, CA, CC, CG, CU, ...}. Given a sequence s of length m, we use s[1]s[2]...s[m] to denote the symbols in s. Sometimes we will write s_1 s_2 ... s_m for convenience.
Subsequences:
A subsequence of s is any sequence obtained by removing zero or more symbols from s. The sequences CATA and CTG are subsequences of CATTAG; AATTCG is not. A proper subsequence is a subsequence obtained by removing one or more symbols from s.
Substrings:
A substring of s is a subsequence of s consisting of consecutive symbols in s. Given a sequence s of length m, the substring that begins with s[i] and ends with s[j] is denoted s[i..j], 1 ≤ i ≤ j ≤ m. The sequence CAT is a substring of CATTAG; CATA is not. A prefix of s is a substring of the form s[1..j], 1 ≤ j ≤ m. A suffix of s is a substring of the form s[i..m], 1 ≤ i ≤ m.
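The difference between subsequences and substrings can be checked with a short sketch. The helper names `is_subsequence` and `is_substring` are hypothetical and serve only to illustrate the definitions above on the CATTAG examples.

```python
def is_subsequence(u, s):
    """True if u can be obtained from s by removing zero or more symbols."""
    remaining = iter(s)
    # Each symbol of u must be found in s after the previous match;
    # `ch in remaining` advances the iterator past the symbol it finds.
    return all(ch in remaining for ch in u)


def is_substring(u, s):
    """True if u occurs in s as a run of consecutive symbols."""
    # Note: Python also reports the empty string as contained in any string,
    # a case the definition s[i..j] with i <= j does not cover.
    return u in s


print(is_subsequence("CATA", "CATTAG"))    # True
print(is_subsequence("AATTCG", "CATTAG"))  # False
print(is_substring("CAT", "CATTAG"))       # True
print(is_substring("CATA", "CATTAG"))      # False
```

CATA is a subsequence of CATTAG but not a substring, because its symbols appear in order but not consecutively.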