Question

The unit of length in this question is Kmer. In search size of 16mer, with two...

The unit of length in this question is Kmer.

In search size of 16mer, with two spaced 4mer seeds in each genomic location. For example. each 16 nucleotide window in the genome is represented.

As described above, how many entries in a spaced seed index?

Explanation would be greatly appreciated.

Assignment website said 2 is not the answer.

0 0
Add a comment Improve this question Transcribed image text
Answer #1

K-mers are a fairly simple concept that turn out to be tremendously powerful.

A “k-mer” is a word of DNA that is k long:

ATTG - a 4-mer
ATGGAC - a 6-mer

Typically we extract k-mers from genomic assemblies or read data sets by running a k-length window across all of the reads and sequences – e.g. given a sequence of length 16, you could extract 11 k-mers of length six from it like so:

AGGATGAGACAGATAG

becomes the following set of 6-mers:

AGGATG
 GGATGA
  GATGAG
   ATGAGA
    TGAGAC
     GAGACA
      AGACAG
       GACAGA
        ACAGAT
         CAGATA
          AGATAG

k-mers are most useful when they’re long, because then they’re specific. That is, if you have a 31-mer taken from a human genome, it’s pretty unlikely that another genome has that exact 31-mer in it. (You can calculate the probability if you assume genomes are random: there are 431possible 31-mers, and 431 = 4,611,686,018,427,387,904. So, you know, a lot.)

The important concept here is that long k-mers are species specific. We’ll go into a bit more detail later.

K-mers and assembly graphs

We’ve already run into k-mers before, as it turns out - when we were doing genome assembly. One of the three major ways that genome assembly works is by taking reads, breaking them into k-mers, and then “walking” from one k-mer to the next to bridge between reads. To see how this works, let’s take the 16-base sequence above, and add another overlapping sequence:

AGGATGAGACAGATAG
    TGAGACAGATAGGATTGC

One way to assemble these together is to break them down into k-mers –

becomes the following set of 6-mers:

AGGATG
 GGATGA
  GATGAG
   ATGAGA
    TGAGAC
     GAGACA
      AGACAG
       GACAGA
        ACAGAT
         CAGATA
          AGATAG -> off the end of the first sequence
           GATAGG <- beginning of the second sequence
            ATAGGA
             TAGGAT
              AGGATT
               GGATTG
                GATTGC

and if you walk from one 6-mer to the next based on 5-mer overlap, you get the assembled sequence:

AGGATGAGACAGATAGGATTGC

Graphs of many k-mers together are called De Bruijn graphs, and assemblers like MEGAHIT and SOAPdenovo are De Bruijn graph assemblers - they use k-mers underneath.

Add a comment
Know the answer?
Add Answer to:
The unit of length in this question is Kmer. In search size of 16mer, with two...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Make a program using Java that asks the user to input an integer "size". That integer...

    Make a program using Java that asks the user to input an integer "size". That integer makes and prints out an evenly spaced, size by size 2D array (ex: 7 should make an index of 0-6 for col and rows). The array must be filled with random positive integers less than 100. Then, using recursion, find a "peak" and print out its number and location. (A peak is basically a number that is bigger than all of its "neighbors" (above,...

  • UNIT 33 | Embryonic Development and Heredity 671 s. What is the specific function of the...

    UNIT 33 | Embryonic Development and Heredity 671 s. What is the specific function of the two umbilical arteries? 2. Observe a fresh or preserved animal fetus and placenta, if available. Identify the following structures and then write a brief description of each a. Placenta b. Amnion (amniotic sac) c. Umbilical cord 蚰鷊 3. Observe a model of a pregnant human torso, if available Figre33-7 Human male karyotype with 23 pairs The placenta is located in which region of the...

  • For this assignment, your job is to create a simple game called Opoly. The objectives of...

    For this assignment, your job is to create a simple game called Opoly. The objectives of this assignment are to: Break down a problem into smaller, easier problems. Write Java methods that call upon other methods to accomplish tasks. Use a seed value to generate random a sequence of random numbers. Learn the coding practice of writing methods that perform a specific task. Opoly works this way: The board is a circular track of variable length (the user determines the...

  • and w Two-dimensional gel electrophoresis separates proteins based on a. shape; charge Ob.size; concentration c. concentration;...

    and w Two-dimensional gel electrophoresis separates proteins based on a. shape; charge Ob.size; concentration c. concentration; shape O d. size, charge O e. size; shape Refer to the table. Several strains of a bacterium are sequenced to investigate the pan and core genomes. In the table, + denotes presence of the gene and denotes its absence. Gene Gene Gene Gene Gene Strain ! Strain 2 + Strain 3 + Strain 4 + + + + Strain 5 + + What...

  • ***JAVA: Please make "Thing" movies. Objective In this assignment, you are asked to implement a bag...

    ***JAVA: Please make "Thing" movies. Objective In this assignment, you are asked to implement a bag collection using a linked list and, in order to focus on the linked list implementation details, we will implement the collection to store only one type of object of your choice (i.e., not generic). You can use the object you created for program #2 IMPORTANT: You may not use the LinkedList class from the java library. Instead, you must implement your own linked list...

  • 2. A dominant allele H reduces the number of body bristles that Drosophila flies have, giving...

    2. A dominant allele H reduces the number of body bristles that Drosophila flies have, giving rise to a “hairless” phenotype. In the homozygous condition, H is lethal. An independently assorting dominant allele S has no effect on bristle number except in the presence of H, in which case a single dose of S suppresses the hairless phenotype, thus restoring the "hairy" phenotype. However, S also is lethal in the homozygous (S/S) condition. What ratio of hairy to hairless flies...

  • Assignment Details The Unit 6 Assignment requires you to consider how effective teams are built. Some...

    Assignment Details The Unit 6 Assignment requires you to consider how effective teams are built. Some considerations in this assignment include the traits of an effective team leader as well as the strategies one would use to recruit team members that would work effectively together. Using material from Chapter 12 of your text as well as the article in the supplemental reading (Rao, 2016), you will write an informative essay sharing best practices for effective team-building. Outcomes evaluated through this...

  • can u tell me if these answers are correct please!??!!! Choose the best answer for the...

    can u tell me if these answers are correct please!??!!! Choose the best answer for the following questions. Place your answer on the line. If your answer is not on the line.it does not count 1 Mender's discovery that characteristics are inherited due to the transmission of hereditary factors resulted from his (1) dissections to determine how fertilization occurs in pea plants (2analysis of the offspring produced from many pea plant crosses (3) careful microscopic examinations of genes and chromosomes...

  • code in java please:) Part I Various public transporation can be described as follows: A PublicTransportation...

    code in java please:) Part I Various public transporation can be described as follows: A PublicTransportation class with the following: ticket price (double type) and mumber of stops (int type). A CityBus is a PublicTransportation that in addition has the following: an route number (long type), an begin operation year (int type), a line name (String type), and driver(smame (String type) -A Tram is a CityBus that in addition has the following: maximum speed (int type), which indicates the maximum...

  • in java no mathcall please ennas are out of date.Please sign was wkollieemail.cpeedu so we can...

    in java no mathcall please ennas are out of date.Please sign was wkollieemail.cpeedu so we can verily your subscription Sign In ASSIGNMENT Unit 9 Methods, Arrays, and the Java Standard Class Library Reimplementing the Arrays class (60 points) The Arrays class, which is also part of the Java standard class library, provides various class methods that perform common tasks on arrays, such as searching and sorting. Write a public class called MyArrays, which provides some of the functionality which is...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT