Sequencing by hybridization. One experimental procedure for identifying a new DNA sequence repeatedly probes it to determine which k-mers (substrings of length k) it contains. Based on these, the full sequence must then be reconstructed.
Let’s now formulate this as a combinatorial problem. For any string x (the DNA sequence), let ⌈(x) denote the multiset of all of its k-mers. In particular, ⌈(x) contains exactly |x|- k + 1 elements.
The reconstruction problem is now easy to state: given a multiset of k-length strings, find a string x such that ⌈(x) is exactly this multiset.
(a) Show that the reconstruction problem reduces to RUDRATA PATH
. (Hint: Construct a directed graph with one node for each k-mer, and with an edge from a to b if the last k – 1 characters of a match the first k – 1 characters of b.)
(b) But in fact, there is much better news. Show that the same problem also reduces to EULER PATH
. (Hint: This time, use one directed edge for each k-mer.)
We need at least 10 more requests to produce the solution.
0 / 10 have requested this problem solution
The more requests, the faster the answer.