This sequence encodes a protein that you wish to study further by cloning the gene. The protein of interest has a molecular weight of at least 10 kDa. The sequence was sequenced by a primer walking approach, but unfortunately the correct order of the sequence traces was lost (really bad lab notebook skills).
Trace 1
aacgagttaaggagccagcgtaccttcgcaccgccatacatgaattttcttggctttttctatgtggatggcaatagtctagagtcggacctgcaggcatgcaag
Trace 2
cagcctgttagtaggtcttactgagtcgggcgccgaattcgagctcggtacccggggatccatgagtgggcgccagttattcgtactattgggaggtcc
Trace 3
ccgggtattttgtattcaatattgaataaggaatttttcatgcagagaaaaggatgtttacggctcgagcgcactcgcacatataactgtcggcagaaacgagttaaggagc
Trace 4
tactattgggaggtccaaatggttttacgggagttgcactgggcgaatgctggagctattacgattcgtttaacatttccatatcgaattctcagacgactccgaatccgggtatttt
Trace 5
cctgcaggcatgcaagcttggcataggtcagttcatccagggtgatgggtgtatcgtttcaatgacccgattcggaacg
Answer the following--
Determine the correct order of the sequence traces and the entire sequence.
Locate an open reading frame that could encode the protein of interest.
Determine the amino acid sequence, the molecular weight, and the isoelectric point of this protein.
Design a set of PCR primers that will amplify at a minimum this entire coding region. These primers must pass all the requirements for good primers.
Develop a cloning strategy that would enable you to clone at a minimum the coding region from this sequence into the vector pUC19, so that the coding region is in a clockwise orientation. There are several ways to accomplish this cloning objective; the only requirement is that it is theoretically possible and that you can show that the coding region is in a clockwise orientation.
1. The traces can be aligned by pasting the entire set into a Word document and searching for the first string of nucleotides of each trace within the other traces. A trace whose start matches the end of another trace follows it, so the beginning and end of every trace become known and the traces can be merged into one sequence. The resulting order is Trace 2 > Trace 4 > Trace 3 > Trace 1 > Trace 5. The overlaps at the four junctions are:
Trace 2 / Trace 4: tactattgggaggtcc
Trace 4 / Trace 3: ccgggtatttt
Trace 3 / Trace 1: aacgagttaaggagc
Trace 1 / Trace 5: cctgcaggcatgcaag
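The manual search described above can also be sketched in a few lines of Python. This is only an illustration: the function names `overlap` and `assemble` and the minimum overlap of 10 nt are assumptions, and the greedy chaining is a simple stand-in for a real assembler.

```python
# Traces copied from the question; the helper names below are hypothetical.
TRACES = {
    "Trace 1": "aacgagttaaggagccagcgtaccttcgcaccgccatacatgaattttcttggctttttctatgtggatggcaatagtctagagtcggacctgcaggcatgcaag",
    "Trace 2": "cagcctgttagtaggtcttactgagtcgggcgccgaattcgagctcggtacccggggatccatgagtgggcgccagttattcgtactattgggaggtcc",
    "Trace 3": "ccgggtattttgtattcaatattgaataaggaatttttcatgcagagaaaaggatgtttacggctcgagcgcactcgcacatataactgtcggcagaaacgagttaaggagc",
    "Trace 4": "tactattgggaggtccaaatggttttacgggagttgcactgggcgaatgctggagctattacgattcgtttaacatttccatatcgaattctcagacgactccgaatccgggtatttt",
    "Trace 5": "cctgcaggcatgcaagcttggcataggtcagttcatccagggtgatgggtgtatcgtttcaatgacccgattcggaacg",
}


def overlap(left, right, min_len=10):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def assemble(traces, min_len=10):
    """Chain traces by their best end-to-start overlap (assumes one linear contig)."""
    successor = {}
    for a in traces:
        best = max((b for b in traces if b != a),
                   key=lambda b: overlap(traces[a], traces[b], min_len))
        k = overlap(traces[a], traces[best], min_len)
        if k:  # traces with no overlapping follower end the chain
            successor[a] = (best, k)
    # The first trace is the only one that never follows another trace.
    followers = {name for name, _ in successor.values()}
    start = next(name for name in traces if name not in followers)
    order, contig = [start], traces[start]
    while order[-1] in successor:
        nxt, k = successor[order[-1]]
        order.append(nxt)
        contig += traces[nxt][k:]  # append only the non-overlapping part
    return order, contig


order, contig = assemble(TRACES)
print(" > ".join(order))
print(len(contig), "nt")
```

If the traces are copied correctly, the printed order should agree with the order found by hand.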
2. The ORF in the merged sequence can be identified with the ORF finder tool on the NCBI website. The results list the possible ORFs in each reading frame, together with the corresponding amino acid sequence and the number of amino acids. Since the protein is stated to be at least 10 kDa and the average molecular weight of an amino acid is about 110 Da, we need an ORF of at least 10000 / 110 ≈ 91 amino acids. The first ORF displayed is 100 amino acids long; it starts at nucleotide 62 with the start codon ATG and ends at nucleotide 364 with the stop codon TAG.
The corresponding amino acid sequence is:
>lcl|ORF1
MSGRQLFVLLGGPNGFTGVALGECWSYYDSFNISISNSQTTPNPGILYSI
LNKEFFMQRKGCLRLERTRTYNCRQKRVKEPAYLRTAIHEFSWLFLCGWQ
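As a cross-check on the ORF finder result, the search can be sketched as below. This is a minimal illustration, not the NCBI tool's algorithm: it assumes the standard genetic code, scans only the three forward reading frames, ignores ORFs without a stop codon, and the names `find_orfs` and `CODON_TABLE` are hypothetical.

```python
import math
from itertools import product

# Merged sequence, split where the non-overlapping part of each trace begins.
SEQ = (
    "cagcctgttagtaggtcttactgagtcgggcgccgaattcgagctcggtacccggggatccatgagtgggcgccagttattcgtactattgggaggtcc"
    "aaatggttttacgggagttgcactgggcgaatgctggagctattacgattcgtttaacatttccatatcgaattctcagacgactccgaatccgggtatttt"
    "gtattcaatattgaataaggaatttttcatgcagagaaaaggatgtttacggctcgagcgcactcgcacatataactgtcggcagaaacgagttaaggagc"
    "cagcgtaccttcgcaccgccatacatgaattttcttggctttttctatgtggatggcaatagtctagagtcggacctgcaggcatgcaag"
    "cttggcataggtcagttcatccagggtgatgggtgtatcgtttcaatgacccgattcggaacg"
)

# Standard genetic code, codons enumerated in t, c, a, g order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("tcag", repeat=3), AMINO)}


def find_orfs(seq, min_aa):
    """Forward-strand ATG..stop ORFs as (start, end, protein).

    Positions are 1-based and `end` includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start, protein = None, []
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE[seq[i:i + 3]]
            if start is None:
                if aa == "M":  # first ATG after the previous stop
                    start, protein = i, ["M"]
            elif aa == "*":
                if len(protein) >= min_aa:
                    orfs.append((start + 1, i + 3, "".join(protein)))
                start = None
            else:
                protein.append(aa)
    return orfs


min_aa = math.ceil(10_000 / 110)  # at least 10 kDa at about 110 Da per residue
for start, end, protein in find_orfs(SEQ, min_aa):
    print(start, end, len(protein), protein)
```

With the minimum length derived from the 10 kDa requirement, the scan should report the same 62 to 364 ORF and the same 100-residue protein as above.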
The molecular weight and isoelectric point of the above amino acid sequence can be determined with the online tool ProtParam on the ExPASy website.
ProtParam analysis of the above sequence gives:
Molecular weight: 11609.37 Da
Theoretical pI: 9.46
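The molecular weight can be checked independently by summing average residue masses and adding one water molecule for the free termini. The sketch below assumes commonly tabulated average residue masses; the names `RESIDUE_MASS` and `molecular_weight` are hypothetical, and the pI is not computed here because it depends on the pK set a tool uses.

```python
PROTEIN = (
    "MSGRQLFVLLGGPNGFTGVALGECWSYYDSFNISISNSQTTPNPGILYSI"
    "LNKEFFMQRKGCLRLERTRTYNCRQKRVKEPAYLRTAIHEFSWLFLCGWQ"
)

# Average residue masses in Da (amino acid minus one water); assumed values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water for the free N- and C-termini


def molecular_weight(protein):
    """Average molecular weight in Da of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


mw = molecular_weight(PROTEIN)
print(f"{len(PROTEIN)} residues, {mw:.2f} Da")
print(f"rough estimate at 110 Da per residue: {len(PROTEIN) * 110} Da")
```

The result should land within rounding of the ProtParam value, and it confirms that the protein exceeds the 10 kDa minimum.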
In order to design primers for the sequence, we would choose the region near the start codon to design the forward primer and the region near the stop codon to design the reverse primer. The primers should flank the entire ORF region.
>entire_sequence
cagcctgttagtaggtcttactgagtcgggcgccgaattcgagctcggtacccggggatccatgagtgggcgccagttattcgtactattgggaggtccaaatggttttacgggagttgcactgggcgaatgctggagctattacgattcgtttaacatttccatatcgaattctcagacgactccgaatccgggtattttgtattcaatattgaataaggaatttttcatgcagagaaaaggatgtttacggctcgagcgcactcgcacatataactgtcggcagaaacgagttaaggagccagcgtaccttcgcaccgccatacatgaattttcttggctttttctatgtggatggcaatagtctagagtcggacctgcaggcatgcaagcttggcataggtcagttcatccagggtgatgggtgtatcgtttcaatgacccgattcggaacg
The two primers should have roughly the same Tm. Annealing is better if the 3' end of each primer is a C or G nucleotide, and primers should have a GC content of about 40 to 60%. The melting temperature (Tm) can be estimated with the Wallace rule, where A, T, G and C are the numbers of each base in the primer:
Tm = 2 * (A + T) + 4 * (G + C)
Ideally, the Tm should be between 60 °C and 70 °C.
Forward primer: 5'- ATGAGTGGGCGCCAGTTATTCG -3'
Tm = 2 * (10) + 4 * (12) = 20 + 48 = 68 °C
Reverse primer: 5'- CTAGACTATTGCCATCCACATAG -3'
Tm = 2 * (13) + 4 * (10) = 26 + 40 = 66 °C
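The primer checks above are easy to automate. The following sketch applies the same 2/4 rule and the GC content and 3' end criteria to both primers; the function names are hypothetical, and the rule itself is only a rough estimate for short oligonucleotides.

```python
FWD = "ATGAGTGGGCGCCAGTTATTCG"
REV = "CTAGACTATTGCCATCCACATAG"


def wallace_tm(primer):
    """Tm estimate in degrees C: 2 per A/T plus 4 per G/C."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def gc_percent(primer):
    """GC content as a percentage of primer length."""
    p = primer.upper()
    return 100 * (p.count("G") + p.count("C")) / len(p)


def has_gc_clamp(primer):
    """True if the 3' terminal base is G or C."""
    return primer.upper()[-1] in "GC"


for name, primer in (("forward", FWD), ("reverse", REV)):
    print(f"{name}: {len(primer)} nt, Tm {wallace_tm(primer)} C, "
          f"GC {gc_percent(primer):.1f}%, 3' G/C: {has_gc_clamp(primer)}")
```

Both primers give the Tm values worked out by hand (68 and 66), fall inside the 40 to 60% GC window, and end in G at the 3' end.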
Cloning:
Cloning of the above sequence starts with identifying suitable restriction enzymes. The restriction enzymes must not cut within the ORF, and to integrate the DNA insert into the vector in the correct 5' > 3' orientation one should use directional cloning, in which two different restriction enzymes are used instead of one.
The first step in planning the cloning is to identify the restriction sites in the MCS of the plasmid and to analyse the DNA sequence above with the web tool Webcutter.
The sites in the MCS can be read from the pUC19 vector map, which is available on the NEB website. The MCS contains several restriction sites, but we consider only the commonly available enzymes. The common restriction sites in the MCS in 5' > 3' orientation are: HindIII > SphI > PstI > SalI > XbaI > BamHI > XmaI > SmaI > KpnI > SacI > EcoRI.
The Webcutter analysis below lists, for each enzyme with a site in the pUC19 MCS, whether it cuts the above sequence only once or not at all, together with the corresponding nucleotide position:
HindIII - 396
SphI - 394
PstI - 388
SalI - does not cut the sequence
XbaI - 371
BamHI - 62
XmaI - 57
SmaI - 59
KpnI - 57
SacI - 51
EcoRI - cuts twice, once within the ORF
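The number of sites per enzyme can be confirmed with a plain string search, sketched below. The recognition sequences are the standard ones for these enzymes; the helper name `site_starts` is hypothetical. The script prints the 1-based start of each recognition sequence, which will not necessarily match the coordinates Webcutter reports, so only the site counts and the location relative to the ORF should be compared.

```python
SEQ = (
    "cagcctgttagtaggtcttactgagtcgggcgccgaattcgagctcggtacccggggatccatgagtgggcgccagttattcgtactattgggaggtcc"
    "aaatggttttacgggagttgcactgggcgaatgctggagctattacgattcgtttaacatttccatatcgaattctcagacgactccgaatccgggtatttt"
    "gtattcaatattgaataaggaatttttcatgcagagaaaaggatgtttacggctcgagcgcactcgcacatataactgtcggcagaaacgagttaaggagc"
    "cagcgtaccttcgcaccgccatacatgaattttcttggctttttctatgtggatggcaatagtctagagtcggacctgcaggcatgcaag"
    "cttggcataggtcagttcatccagggtgatgggtgtatcgtttcaatgacccgattcggaacg"
)

# Recognition sequences (all palindromic, so one strand is enough).
SITES = {
    "HindIII": "aagctt", "SphI": "gcatgc", "PstI": "ctgcag", "SalI": "gtcgac",
    "XbaI": "tctaga", "BamHI": "ggatcc", "XmaI": "cccggg", "SmaI": "cccggg",
    "KpnI": "ggtacc", "SacI": "gagctc", "EcoRI": "gaattc",
}
ORF_START, ORF_END = 62, 364  # ORF coordinates from step 2


def site_starts(seq, site):
    """1-based start positions of every (possibly overlapping) occurrence."""
    starts, i = [], seq.find(site)
    while i != -1:
        starts.append(i + 1)
        i = seq.find(site, i + 1)
    return starts


for enzyme, site in SITES.items():
    starts = site_starts(SEQ, site)
    in_orf = sum(ORF_START <= s <= ORF_END for s in starts)
    print(f"{enzyme:8s} sites: {len(starts)}  starts: {starts}  inside ORF: {in_orf}")
```

The counts should agree with the list above: no SalI site, two EcoRI sites with one inside the ORF, and a single site for every other enzyme.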
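HindIII cannot be used here: its site in the sequence lies downstream of the ORF, while HindIII is the first site in the MCS, so the insert would integrate in the reverse orientation. A suitable pair of enzymes is SalI and XbaI. SalI does not cut the sequence, and XbaI cuts it only once, downstream of the stop codon. A SalI site can be introduced in the forward primer and an XbaI site in the reverse primer; the amplified product is then digested with both enzymes and ligated into pUC19 cut with the same two enzymes. Because SalI precedes XbaI in the MCS, the insert can ligate in only one orientation, with the start codon on the SalI side.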
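Adding the sites to the primers amounts to prepending each recognition sequence to the 5' end of the gene-specific primer. The sketch below illustrates this; the 4-nt spacer `GCGC` in front of each site is a hypothetical choice to give the enzyme some flanking DNA, and the function name `add_site` is also an assumption.

```python
SAL_I = "GTCGAC"  # SalI recognition sequence
XBA_I = "TCTAGA"  # XbaI recognition sequence

FWD_CORE = "ATGAGTGGGCGCCAGTTATTCG"   # gene-specific forward primer
REV_CORE = "CTAGACTATTGCCATCCACATAG"  # gene-specific reverse primer
SPACER = "GCGC"  # hypothetical 5' spacer, not taken from the answer above


def add_site(core, site, spacer=SPACER):
    """Return spacer + restriction site + gene-specific primer, 5' to 3'."""
    return spacer + site + core


fwd_tailed = add_site(FWD_CORE, SAL_I)
rev_tailed = add_site(REV_CORE, XBA_I)
print("Forward: 5'-", fwd_tailed, "-3'")
print("Reverse: 5'-", rev_tailed, "-3'")
```

Only the gene-specific 3' portion anneals in the first cycles, so the Tm values calculated earlier still describe the initial annealing step.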