The file genomic_dna.txt contains a section of genomic DNA, and the file exons.txt contains a list...

Question

The file genomic_dna.txt contains a section of genomic DNA, and the file exons.txt contains a list of
start/stop positions of exons. Each exon is on a separate line and the start and stop positions are
separated by a comma. The start and stop positions follow Python conventions; they start from zero
and are inclusive at the start and exclusive at the end. Write a program that will extract the exon
segments, concatenate them, and write them to a new file.
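To make the coordinate convention concrete, here is a small illustration. The sequence and positions are invented for this example and are not taken from the exercise files.

```python
# Toy sequence and exon coordinates, invented purely for illustration.
sequence = "ACGTACGTAC"
start, stop = 2, 6

# Python slices are zero-based and half-open: position 2 is included,
# position 6 is not, so the slice holds stop - start = 4 characters.
exon = sequence[start:stop]
print(exon)  # GTAC
```

Because the end position is exclusive, the length of each exon is simply the stop value minus the start value.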
This is a tricky exercise with several parts. Before starting, think about how to break the
problem down into simpler tasks.
Your program will have to:
- read the exon file line by line
- split each exon line into two numbers
- turn those numbers into integers
- extract the matching part of the genomic DNA sequence
- concatenate all the exon sequences together
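One possible way to divide the steps above into small pieces is sketched below. The function names `parse_exon_line` and `extract_exons` are hypothetical, and the toy data is made up so the result can be checked by hand. File reading and writing are left out on purpose.

```python
def parse_exon_line(line):
    """Turn a line such as '5,58' into a (start, stop) pair of integers."""
    start_text, stop_text = line.strip().split(",")
    return int(start_text), int(stop_text)


def extract_exons(dna, exon_lines):
    """Slice each exon out of dna and join the pieces in the order given."""
    pieces = []
    for line in exon_lines:
        if not line.strip():
            continue  # ignore blank lines
        start, stop = parse_exon_line(line)
        pieces.append(dna[start:stop])
    return "".join(pieces)


# Made-up data for illustration only (not the contents of the exercise files).
toy_dna = "AAACCCGGGTTT"
toy_exons = ["0,3", "6,9"]
print(extract_exons(toy_dna, toy_exons))  # AAAGGG
```

Keeping the parsing and slicing in separate functions means each step can be tried on its own before any files are involved.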

FILES:

exons.txt

genomic_dna.txt

TCGATCGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGACTGATCGATCGATCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGATCGATCATATGTCAGTCGATGCATCGTAGCATCGTATAGTAGCTACGTAGCTACGATCGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTAGCTAGTACGATCGCGTAGCTAGCATGCTACGTAGATCGATCGATGCATGCTAGCTAGCTAGCTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCGATGCTACGTAGATCGATCGCTAGTAGATCGATCGCTAGCTAGCTGACTAGTACGCTGCTAGTAGTCAGCTAGATCGATGCTAGTCA


Answer 1

# dna.py

# Open genomic_dna.txt and read its contents into dna_sequence
# (strip() drops a trailing newline so it cannot end up in a slice)
with open("genomic_dna.txt", "r") as fin:
    dna_sequence = fin.read().strip()

# Open exons.txt and split its contents into a list of lines called exons
with open("exons.txt", "r") as fin:
    exons = fin.read().splitlines()

# String to store the final sequence
final_sequence = ""

# Iterate through each range in the exons list (every value has the form 'number,number')
for exon in exons:
    # Split on the comma and store the two values in starting_position and ending_position
    starting_position, ending_position = exon.split(',')
    # At this point both values are still strings that represent numbers,
    # so convert them to integers using int()
    starting_position = int(starting_position)
    ending_position = int(ending_position)
    # Extract the matching part of dna_sequence and append it to the result
    final_sequence = final_sequence + dna_sequence[starting_position:ending_position]

# Open multiple.txt in write mode and write the final sequence
with open("multiple.txt", "w") as fout:
    fout.write(final_sequence)
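A quick way to check a program like this is to run the same steps against tiny inputs whose answer is obvious. The sketch below wraps the logic in a hypothetical helper, `splice_exons`, and writes invented test files into a temporary directory. It also skips blank lines, which is an addition not present in the program above.

```python
import tempfile
from pathlib import Path


def splice_exons(dna_path, exons_path, out_path):
    """Hypothetical helper: read both files, join the exon slices, write the result."""
    dna_sequence = Path(dna_path).read_text().strip()
    final_sequence = ""
    for exon in Path(exons_path).read_text().splitlines():
        if not exon.strip():
            continue  # skip blank lines (an assumption about the input format)
        start, stop = exon.split(",")
        final_sequence += dna_sequence[int(start):int(stop)]
    Path(out_path).write_text(final_sequence)
    return final_sequence


# Invented inputs, small enough to verify by hand.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "genomic_dna.txt").write_text("AAACCCGGGTTT\n")
    (tmp / "exons.txt").write_text("0,3\n6,9\n")
    result = splice_exons(tmp / "genomic_dna.txt", tmp / "exons.txt", tmp / "multiple.txt")
    written = (tmp / "multiple.txt").read_text()

print(result)  # AAAGGG
```

The exons `0,3` and `6,9` pick out `AAA` and `GGG`, so the output file should contain `AAAGGG`.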

Sample Input/Output

Suppose our genomic_dna.txt is as follows:

TCGATCGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGACTGATCGATCGATCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGATCGATCATATGTCAGTCGATGCATCGTAGCATCGTATAGTAGCTACGTAGCTACGATCGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTAGCTAGTACGATCGCGTAGCTAGCATGCTACGTAGATCGATCGATGCATGCTAGCTAGCTAGCTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCGATGCTACGTAGATCGATCGCTAGTAGATCGATCGCTAGCTAGCTGACTAGTACGCTGCTAGTAGTCAGCTAGATCGATGCTAGTCA

And exons.txt as follows:

then running the Python program produces a new file, multiple.txt, whose content is as follows:

ACTTACGTACTA
