Genetics - Question 4 [15 marks]
Sequence alignment is a critical tool for the analysis of genome
data.
I. A high-scoring segment pair (HSP) is a pair of equal-length segments, one from each sequence, that align without gaps and whose alignment score cannot be improved by extending or shortening the segments. Two protein sequences are aligned when their residues are arranged in columns so that identical or similar amino acids occupy the same position. The alignment score measures how well the two sequences match: the higher the score, the better the alignment. In a gapped alignment, gaps are inserted between residues so that similar characters line up in successive columns.
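The idea of a highest-scoring ungapped segment can be sketched in a few lines of Python. This is only an illustration: the function name, the +1/-1 match/mismatch scores and the example peptides are assumptions made for the sketch (BLAST itself scores protein pairs with a substitution matrix and searches many diagonals, not just one).

```python
def best_ungapped_segment(seq_a, seq_b, match=1, mismatch=-1):
    """Return (score, start, end) of the highest-scoring ungapped segment.

    The two sequences are compared position by position (no gaps).
    The end index is exclusive. Scores are toy values, not PAM/BLOSUM.
    """
    best_score, best_start, best_end = 0, 0, 0
    running, run_start = 0, 0
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        running += match if a == b else mismatch
        if running <= 0:
            # A non-positive prefix can never help; start a new segment.
            running, run_start = 0, i + 1
        elif running > best_score:
            best_score, best_start, best_end = running, run_start, i + 1
    return best_score, best_start, best_end


score, start, end = best_ungapped_segment("ACDEFGHIK", "TCDEFWHIR")
print(score, "ACDEFGHIK"[start:end], "TCDEFWHIR"[start:end])
```

With these inputs the best segment pairs CDEFGHI with CDEFWHI for a score of 5: the single mismatch (G/W) is tolerated because extending past it raises the total.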
II. All scoring matrices rest on the same general concept: the matrix holds a value for every possible substitution, and the alignment score is the sum of the matrix entries for each pair of aligned amino acids. Gaps receive a special gap score; the simplest scheme just adds a fixed penalty for each gap. The optimal alignment is the one with the highest alignment score. Point accepted mutation (PAM, also expanded as percent accepted mutation) matrices are among the most commonly used scoring matrices. PAM250 corresponds to 250 accepted mutations per 100 amino acids; similarly, PAM10 corresponds to 10 accepted mutations per 100 amino acids. A positive matrix entry means the substitution occurs more often than expected by chance in related sequences, whereas a negative entry means it occurs less often than expected by chance. Likewise, a positive overall score suggests that the sequences are related, and a negative score suggests that they are not.
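As a minimal sketch of how such a score is computed, the snippet below sums substitution scores column by column and adds a fixed penalty for each gap. The scores in TOY_SCORES and the value of GAP_PENALTY are made-up illustrative numbers, not entries from any PAM or BLOSUM matrix, and the function names are hypothetical.

```python
# Toy substitution scores (illustrative only, NOT real PAM/BLOSUM values).
TOY_SCORES = {
    ("A", "A"): 2, ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5,
    ("L", "I"): 2, ("A", "K"): -1, ("A", "L"): -1, ("A", "I"): -1,
    ("K", "L"): -2, ("K", "I"): -2,
}
GAP_PENALTY = -4  # assumed fixed cost per gap character


def pair_score(a, b):
    """Look up a symmetric substitution score for residues a and b."""
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]


def alignment_score(row_a, row_b):
    """Sum the column scores of two already-aligned rows ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            total += GAP_PENALTY
        else:
            total += pair_score(a, b)
    return total


print(alignment_score("ALK-I", "AIKLI"))  # 2 + 2 + 5 - 4 + 4 = 9
```

A fixed per-gap penalty is the simplest choice; many aligners instead use separate gap-open and gap-extend costs.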
III. The PAM scoring matrices are based on a mutational model of evolution, a Markov process, whereas the BLOSUM matrices are based on multiple alignments of blocks (ungapped conserved regions). PAM was derived from closely related sequences with at least 85% similarity and extrapolated to larger distances, whereas BLOSUM is an excellent choice for comparing distant sequences. PAM is specifically designed to track evolutionary origins, while BLOSUM is used for finding the conserved domains of proteins. A lower PAM number and a higher BLOSUM number both correspond to less divergent sequences, and vice versa. For example, PAM1 and BLOSUM80 suit less divergent sequences than PAM250 and BLOSUM45.
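The Markov-process idea behind PAM can be illustrated with a toy model: the transition matrix for n PAM units is the one-unit matrix multiplied by itself n times. The sketch below uses an invented two-letter alphabet with a 1% change probability per step; the numbers and helper names are assumptions for illustration, not the real 20 x 20 PAM1 matrix.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Toy "PAM1": each of two residue types changes with probability 0.01 per step.
toy_pam1 = [[0.99, 0.01],
            [0.01, 0.99]]
toy_pam250 = mat_power(toy_pam1, 250)

print(toy_pam1[0][0], round(toy_pam250[0][0], 4))
```

Even after 250 steps the probability of seeing the original residue stays above the 0.5 chance level in this toy model, because repeated and reverse changes at the same position leave some similarity behind. This is why a high PAM number does not mean that every residue has changed.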
IV. There are different types of BLAST: BLASTN, BLASTP, BLASTX, TBLASTN and TBLASTX. Each has unique features. If BLASTN does not give results, we can use BLASTX, in which a DNA query sequence translated in all six reading frames is aligned against a protein sequence database, allowing for gaps. Alternatively, TBLASTX compares a DNA query sequence translated in six reading frames against a DNA database translated in six reading frames; TBLASTX does not allow for gaps.
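The six-frame translation that BLASTX and TBLASTX rely on can be sketched as follows: three frames on the forward strand and three on the reverse complement. This is a simplified illustration using the standard genetic code; the function names and the example sequence are assumptions, and the sketch handles only the letters A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

For the query ATGGCCTAA, frame +1 reads ATG GCC TAA and gives MA*, while frame -1 reads the reverse complement TTAGGCCAT and gives LGH. Each of these translated frames is then compared against the protein (BLASTX) or translated nucleotide (TBLASTX) database.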