Find the human BRCA1 gene transcript mRNA in the NCBI RefSeq database (hint: accession NM_007299).
(a) What is the GI number of the protein?
(b) What is the length of the mRNA sequence?
(c) Write down the mRNA sequence as it is shown on the database.
(d) Also find and write the sequence in FASTA format.
(e) How many of each of the four nucleotides A, C, T and G are there in the sequence?
(f) How many times do the DNA words CC, CG and GC occur in the sequence?
(g) What are the last 50 nucleotides of the sequence?
(h) Provide the titles and PubMed IDs of the first three reference journal articles displayed. (The PubMed ID is the number associated with the article, displayed after the title and journal name in the annotations.)
(i) What does it say in the COMMENT field of its NCBI annotations record?
(j) Plot a pie chart of the frequencies of the four nucleotides A, C, T and G in the sequence.
(a) The GI number is 237681124.
(b) Length = 3,800 bp.
(c) mRNA sequence (written here with U in place of T):
CUUAGCGGUAGCCCCUUGGUUUCCGUGGCAACGGAAAAGCGCGGGAAUUACAGAUAAAUUAAAACUGCGACUG
CGCGGCGUGAGCUCGCUGAGACUUCCUGGACGGGGGACAGGCUGUGGGGUUUCUCAGAUAACUGGGCCCCU
GCGCUCAGGAGGCCUUCACCCUCUGCUCUGGUUCAUUGGAACAGAAAGAAAUGGAUUUAUCUGCUCUUCGCG
UUGAAGAAGUACAAAAUGUCAUUAAUGCUAUGCAGAAAAUCUUAGAGUGUCCCAUCUGUCUGGAGUUGAUCAA
GGAACCUGUCUCCACAAAGUGUGACCACAUAUUUUGCAAAUUUUGCAUGCUGAAACUUCUCAACCAGAAGAAA
GGGCCUUCACAGUGUCCUUUAUGUAAGAAUGAUAUAACCAAAAGGAGCCUACAAGAAAGUACGAGAUUUAGU
CAACUUGUUGAAGAGCUAUUGAAAAUCAUUUGUGCUUUUCAGCUUGACACAGGUUUGGAGUAUGCAAACAG
CUAUAAUUUUGCAAAAAAGGAAAAUAACUCUCCUGAACAUCUAAAAGAUGAAGUUUCUAUCAUCCAAAGUAU
GGGCUACAGAAACCGUGCCAAAAGACUUCUACAGAGUGAACCCGAAAAUCCUUCCUUGCAGGAAACCAGUC
UCAGUGUCCAACUCUCUAACCUUGGAACUGUGAGAACUCUGAGGACAAAGCAGCGGAUACAACCUCAAAAG
ACGUCUGUCUACAUUGAAUUGGGAUCUGAUUCUUCUGAAGAUACCGUUAAUAAGGCAACUUAUUGCAGUG
UGGGAGAUCAAGAAUUGUUACAAAUCACCCCUCAAGGAACCAGGGAUGAAAUCAGUUUGGAUUCUGCAAA
AAAGGCUGCUUGUGAAUUUUCUGAGACGGAUGUAACAAAUACUGAACAUCAUCAACCCAGUAAUAAUGAU
UUGAACACCACUGAGAAGCGUGCAGCUGAGAGGCAUCCAGAAAAGUAUCAGGGUGAAGCAGCAUCUGGG
UGUGAGAGUGAAACAAGCGUCUCUGAAGACUGCUCAGGGCUAUCCUCUCAGAGUGACAUUUUAACCACU
CAGCAGAGGGAUACCAUGCAACAUAACCUGAUAAAGCUCCAGCAGGAAAUGGCUGAACUAGAAGCUGUGU
UAGAACAGCAUGGGAGCCAGCCUUCUAACAGCUACCCUUCCAUCAUAAGUGACUCUUCUGCCCUUGAGG
ACCUGCGAAAUCCAGAACAAAGCACAUCAGAAAAAGUAUUAACUUCACAGAAAAGUAGUGAAUACCCUAUA
AGCCAGAAUCCAGAAGGCCUUUCUGCUGACAAGUUUGAGGUGUCUGCAGAUAGUUCUACCAGUAAAAAU
AAAGAACCAGGAGUGGAAAGGUCAUCCCCUUCUAAAUGCCCAUCAUUAGAUGAUAGGUGGUACAUGCACA
GUUGCUCUGGGAGUCUUCAGAAUAGAAACUACCCAUCUCAAGAGGAGCUCAUUAAGGUUGUUGAUGUGG
AGGAGCAACAGCUGGAAGAGUCUGGGCCACACGAUUUGACGGAAACAUCUUACUUGCCAAGGCAAGAUC
UAGAGGGAACCCCUUACCUGGAAUCUGGAAUCAGCCUCUUCUCUGAUGACCCUGAAUCUGAUCCUUCUG
AAGACAGAGCCCCAGAGUCAGCUCGUGUUGGCAACAUACCAUCUUCAACCUCUGCAUUGAAAGUUCCCCA
AUUGAAAGUUGCAGAAUCUGCCCAGAGUCCAGCUGCUGCUCAUACUACUGAUACUGCUGGGUAUAAUGC
AAUGGAAGAAAGUGUGAGCAGGGAGAAGCCAGAAUUGACAGCUUCAACAGAAAGGGUCAACAAAAGAAUG
UCCAUGGUGGUGUCUGGCCUGACCCCAGAAGAAUUUAUGCUCGUGUACAAGUUUGCCAGAAAACACCAC
AUCACUUUAACUAAUCUAAUUACUGAAGAGACUACUCAUGUUGUUAUGAAAACAGAUGCUGAGUUUGUGU
GUGAACGGACACUGAAAUAUUUUCUAGGAAUUGCGGGAGGAAAAUGGGUAGUUAGCUAUUUCUGGGUGA
CCCAGUCUAUUAAAGAAAGAAAAAUGCUGAAUGAGCAUGAUUUUGAAGUCAGAGGAGAUGUGGUCAAUGG
AAGAAACCACCAAGGUCCAAAGCGAGCAAGAGAAUCCCAGGACAGAAAGAUCUUCAGGGGGCUAGAAAUC
UGUUGCUAUGGGCCCUUCACCAACAUGCCCACAGGGUGUCCACCCAAUUGUGGUUGUGCAGCCAGAUGC
CUGGACAGAGGACAAUGGCUUCCAUGCAAUUGGGCAGAUGUGUGAGGCACCUGUGGUGACCCGAGAGU
GGGUGUUGGACAGUGUAGCACUCUACCAGUGCCAGGAGCUGGACACCUACCUGAUACCCCAGAUCCCCC
ACAGCCACUACUGACUGCAGCCAGCCACAGGUACAGAGCCACAGGACCCCAAGAAUGAGCUUACAAAGUG
GCCUUUCCAGGCCCUGGGAGCUCCUCUCACUCUUCAGUCCUUCUACUGUCCUGGCUACUAAAUAUUUUA
UGUACAUCAGCCUGAAAAGGACUUCUGGCUAUGCAAGGGUCCCUUAAAGAUUUUCUGCUUGAAGUCUCC
CUUGGAAAUCUGCCAUGAGCACAAAAUUAUGGUAAUUUUUCACCUGAGAAGAUUUUAAAACCAUUUAAAC
GCCACCAAUUGAGCAAGAUGCUGAUUCAUUAUUUAUCAGCCCUAUUCUUUCUAUUCAGGCUGUUGUUGG
CUUAGGGCUGGAAGCACAGAGUGGCUUGGCCUCAAGAGAAUAGCUGGUUUCCCUAAGUUUACUUCUCUA
AAACCCUGUGUUCACAAAGGCAGAGAGUCAGACCCUUCAAUGGAAGGAGAGUGCUUGGGAUCGAUUAUG
UGACUUAAAGUCAGAAUAGUCCUUGGGCAGUUCUCAAAUGUUGGAGUGGAACAUUGGGGAGGAAAUUCU
GAGGCAGGUAUUAGAAAUGAAAAGGAAACUUGAAACCUGGGCAUGGUGGCUCACGCCUGUAAUCCCAGC
ACUUUGGGAGGCCAAGGUGGGCAGAUCACUGGAGGUCAGGAGUUCGAAACCAGCCUGGCCAACAUGGU
GAAACCCCAUCUCUACUAAAAAUACAGAAAUUAGCCGGUCAUGGUGGUGGACACCUGUAAUCCCAGCUAC
UCAGGUGGCUAAGGCAGGAGAAUCACUUCAGCCCGGGAGGUGGAGGUUGCAGUGAGCCAAGAUCAUA
CCACGGCACUCCAGCCUGGGUGACAGUGAGACUGUGGCUCAAAAAAAAAAAAAAAAAAAGGAAAAUGA
AACUAGAAGAGAUUUCUAAAAGUCUGAGAUAUAUUUGCUAGAUUUCUAAAGAAUGUGUUCUAAAACAG
CAGAAGAUUUUCAAGAACCGGUUUCCAAAGACAGUCUUCUAAUUCCUCAUUAGUAAUAAGUAAAAUGU
UUAUUGUUGUAGCUCUGGUAUAUAAUCCAUUCCUCUUAAAAUAUAAGACCUCUGGCAUGAAUAUUUCA
UAUCUAUAAAAUGACAGAUCCCACCAGGAAGGAAGCUGUUGCUUUCUUUGAGGUGAUUUUUUUCCUU
UGCUCCCUGUUGCUGAAACCAUACAGCUUCAUAAAUAAUUUUGCUUGCUGAAGGAAGAAAAAGUGUU
UUUCAUAAACCCAUUAUCCAGGACUGUUUAUAGCUGUUGGAAGGACUAGGUCUUCCCUAGCCCCCCC
AGUGUGCAAGGGCAGUGAAGACUUGAUUGUACAAAAUACGUUUUGUAAAUGUUGUGCUGUUAACACU
GCAAAUAAACUUGGUAGCAAACACUUCCAAAAAAAAAAAAAAAAAA
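The mRNA listing above uses U where the FASTA listing of the same record uses T. A minimal sketch of how the two transcriptions could be checked against each other is given here; the helper names `normalize` and `first_mismatch` are hypothetical, and the two short strings are only the first 30 nucleotides of each listing, not the full record.

```python
def normalize(text: str, rna_to_dna: bool = False) -> str:
    """Strip whitespace and upper-case; optionally rewrite U as T."""
    seq = "".join(text.split()).upper()
    return seq.replace("U", "T") if rna_to_dna else seq


def first_mismatch(a: str, b: str):
    """Return the 0-based index of the first difference, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Same prefix: they differ only if one string is longer than the other.
    return None if len(a) == len(b) else min(len(a), len(b))


# First 30 nucleotides of each listing (line breaks and spaces are ignored).
rna_listing = "CUUAGCGGUAGCCCCUUGGU\nUUCCGUGGCA"
dna_listing = "CTTAGCGGTA GCCCCTTGGT TTCCGTGGCA"

rna_as_dna = normalize(rna_listing, rna_to_dna=True)
dna = normalize(dna_listing)
print(first_mismatch(rna_as_dna, dna))  # None means the two listings agree
```

Pasting the complete listings into the two strings would run the same check over the whole transcript.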
(d) Sequence in FASTA format (header line omitted):
CTTAGCGGTAGCCCCTTGGTTTCCGTGGCAACGGAAAAGCGCGGGAATTACAGATAAATTAAAACTGCGA CTGCGCGGCGTGAGCTCGCTGAGACTTCCTGGACGGGGGACAGGCTGTGGGGTTTCTCAGATAACTGGGC CCCTGCGCTCAGGAGGCCTTCACCCTCTGCTCTGGTTCATTGGAACAGAAAGAAATGGATTTATCTGCTC TTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTCCCATCTGTCTGGA GTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTGCATGCTGAAACTTCTC AACCAGAAGAAAGGGCCTTCACAGTGTCCTTTATGTAAGAATGATATAACCAAAAGGAGCCTACAAGAAA GTACGAGATTTAGTCAACTTGTTGAAGAGCTATTGAAAATCATTTGTGCTTTTCAGCTTGACACAGGTTT GGAGTATGCAAACAGCTATAATTTTGCAAAAAAGGAAAATAACTCTCCTGAACATCTAAAAGATGAAGTT TCTATCATCCAAAGTATGGGCTACAGAAACCGTGCCAAAAGACTTCTACAGAGTGAACCCGAAAATCCTT CCTTGCAGGAAACCAGTCTCAGTGTCCAACTCTCTAACCTTGGAACTGTGAGAACTCTGAGGACAAAGCA GCGGATACAACCTCAAAAGACGTCTGTCTACATTGAATTGGGATCTGATTCTTCTGAAGATACCGTTAAT AAGGCAACTTATTGCAGTGTGGGAGATCAAGAATTGTTACAAATCACCCCTCAAGGAACCAGGGATGAAA TCAGTTTGGATTCTGCAAAAAAGGCTGCTTGTGAATTTTCTGAGACGGATGTAACAAATACTGAACATCA TCAACCCAGTAATAATGATTTGAACACCACTGAGAAGCGTGCAGCTGAGAGGCATCCAGAAAAGTATCAG GGTGAAGCAGCATCTGGGTGTGAGAGTGAAACAAGCGTCTCTGAAGACTGCTCAGGGCTATCCTCTCAGA GTGACATTTTAACCACTCAGCAGAGGGATACCATGCAACATAACCTGATAAAGCTCCAGCAGGAAATGGC TGAACTAGAAGCTGTGTTAGAACAGCATGGGAGCCAGCCTTCTAACAGCTACCCTTCCATCATAAGTGAC TCTTCTGCCCTTGAGGACCTGCGAAATCCAGAACAAAGCACATCAGAAAAAGTATTAACTTCACAGAAAA GTAGTGAATACCCTATAAGCCAGAATCCAGAAGGCCTTTCTGCTGACAAGTTTGAGGTGTCTGCAGATAG TTCTACCAGTAAAAATAAAGAACCAGGAGTGGAAAGGTCATCCCCTTCTAAATGCCCATCATTAGATGAT AGGTGGTACATGCACAGTTGCTCTGGGAGTCTTCAGAATAGAAACTACCCATCTCAAGAGGAGCTCATTA AGGTTGTTGATGTGGAGGAGCAACAGCTGGAAGAGTCTGGGCCACACGATTTGACGGAAACATCTTACTT GCCAAGGCAAGATCTAGAGGGAACCCCTTACCTGGAATCTGGAATCAGCCTCTTCTCTGATGACCCTGAA TCTGATCCTTCTGAAGACAGAGCCCCAGAGTCAGCTCGTGTTGGCAACATACCATCTTCAACCTCTGCAT TGAAAGTTCCCCAATTGAAAGTTGCAGAATCTGCCCAGAGTCCAGCTGCTGCTCATACTACTGATACTGC TGGGTATAATGCAATGGAAGAAAGTGTGAGCAGGGAGAAGCCAGAATTGACAGCTTCAACAGAAAGGGTC AACAAAAGAATGTCCATGGTGGTGTCTGGCCTGACCCCAGAAGAATTTATGCTCGTGTACAAGTTTGCCA GAAAACACCACATCACTTTAACTAATCTAATTACTGAAGAGACTACTCATGTTGTTATGAAAACAGATGC 
TGAGTTTGTGTGTGAACGGACACTGAAATATTTTCTAGGAATTGCGGGAGGAAAATGGGTAGTTAGCTAT TTCTGGGTGACCCAGTCTATTAAAGAAAGAAAAATGCTGAATGAGCATGATTTTGAAGTCAGAGGAGATG TGGTCAATGGAAGAAACCACCAAGGTCCAAAGCGAGCAAGAGAATCCCAGGACAGAAAGATCTTCAGGGG GCTAGAAATCTGTTGCTATGGGCCCTTCACCAACATGCCCACAGGGTGTCCACCCAATTGTGGTTGTGCA GCCAGATGCCTGGACAGAGGACAATGGCTTCCATGCAATTGGGCAGATGTGTGAGGCACCTGTGGTGACC CGAGAGTGGGTGTTGGACAGTGTAGCACTCTACCAGTGCCAGGAGCTGGACACCTACCTGATACCCCAGA TCCCCCACAGCCACTACTGACTGCAGCCAGCCACAGGTACAGAGCCACAGGACCCCAAGAATGAGCTTAC AAAGTGGCCTTTCCAGGCCCTGGGAGCTCCTCTCACTCTTCAGTCCTTCTACTGTCCTGGCTACTAAATA TTTTATGTACATCAGCCTGAAAAGGACTTCTGGCTATGCAAGGGTCCCTTAAAGATTTTCTGCTTGAAGT CTCCCTTGGAAATCTGCCATGAGCACAAAATTATGGTAATTTTTCACCTGAGAAGATTTTAAAACCATTT AAACGCCACCAATTGAGCAAGATGCTGATTCATTATTTATCAGCCCTATTCTTTCTATTCAGGCTGTTGT TGGCTTAGGGCTGGAAGCACAGAGTGGCTTGGCCTCAAGAGAATAGCTGGTTTCCCTAAGTTTACTTCTC TAAAACCCTGTGTTCACAAAGGCAGAGAGTCAGACCCTTCAATGGAAGGAGAGTGCTTGGGATCGATTAT GTGACTTAAAGTCAGAATAGTCCTTGGGCAGTTCTCAAATGTTGGAGTGGAACATTGGGGAGGAAATTCT GAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGCATGGTGGCTCACGCCTGTAATCCCAGCA CTTTGGGAGGCCAAGGTGGGCAGATCACTGGAGGTCAGGAGTTCGAAACCAGCCTGGCCAACATGGTGAA ACCCCATCTCTACTAAAAATACAGAAATTAGCCGGTCATGGTGGTGGACACCTGTAATCCCAGCTACTCA GGTGGCTAAGGCAGGAGAATCACTTCAGCCCGGGAGGTGGAGGTTGCAGTGAGCCAAGATCATACCACGG CACTCCAGCCTGGGTGACAGTGAGACTGTGGCTCAAAAAAAAAAAAAAAAAAAGGAAAATGAAACTAGAA GAGATTTCTAAAAGTCTGAGATATATTTGCTAGATTTCTAAAGAATGTGTTCTAAAACAGCAGAAGATTT TCAAGAACCGGTTTCCAAAGACAGTCTTCTAATTCCTCATTAGTAATAAGTAAAATGTTTATTGTTGTAG CTCTGGTATATAATCCATTCCTCTTAAAATATAAGACCTCTGGCATGAATATTTCATATCTATAAAATGA CAGATCCCACCAGGAAGGAAGCTGTTGCTTTCTTTGAGGTGATTTTTTTCCTTTGCTCCCTGTTGCTGAA ACCATACAGCTTCATAAATAATTTTGCTTGCTGAAGGAAGAAAAAGTGTTTTTCATAAACCCATTATCCA GGACTGTTTATAGCTGTTGGAAGGACTAGGTCTTCCCTAGCCCCCCCAGTGTGCAAGGGCAGTGAAGACT TGATTGTACAAAATACGTTTTGTAAATGTTGTGCTGTTAACACTGCAAATAAACTTGGTAGCAAACACTT CCAAAAAAAAAAAAAAAAAA
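Parts (b), (e), (f), (g) and (j) can all be computed directly from the sequence text. The following is a rough sketch under a few assumptions: the function names are hypothetical, the `toy` record is an invented stand-in rather than NM_007299, dinucleotide words are counted with overlaps, and the plotting helper assumes matplotlib is installed.

```python
from collections import Counter


def clean_sequence(text: str) -> str:
    """Drop FASTA header lines and whitespace, upper-case, and map U to T."""
    lines = [ln for ln in text.splitlines() if not ln.startswith(">")]
    return "".join("".join(lines).split()).upper().replace("U", "T")


def base_counts(seq: str) -> dict:
    """Counts of A, C, G and T, in that order."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}


def base_fractions(seq: str) -> dict:
    """Relative frequency of each base (the values plotted in the pie chart)."""
    counts = base_counts(seq)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def word_count(seq: str, word: str) -> int:
    """Count occurrences of `word`, including overlapping ones."""
    last_start = len(seq) - len(word)
    return sum(1 for i in range(last_start + 1) if seq.startswith(word, i))


def tail(seq: str, n: int = 50) -> str:
    """Last n nucleotides of the sequence."""
    return seq[-n:]


def plot_base_pie(seq: str, path: str = "base_frequencies.png") -> None:
    """Save a pie chart of base frequencies (assumes matplotlib is available)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    counts = base_counts(seq)
    fig, ax = plt.subplots()
    ax.pie(list(counts.values()), labels=list(counts.keys()), autopct="%1.1f%%")
    ax.set_title("Nucleotide frequencies")
    fig.savefig(path)
    plt.close(fig)


# Toy input for illustration only; paste the real FASTA text here instead.
toy = ">toy record (hypothetical, not NM_007299)\nCCGCG AUCC\nGGCC"
seq = clean_sequence(toy)

print(len(seq))                                   # part (b): sequence length
print(base_counts(seq))                           # part (e)
print({w: word_count(seq, w) for w in ("CC", "CG", "GC")})  # part (f)
print(tail(seq, 5))                               # part (g), with n=50 on real data
# plot_base_pie(seq)                              # part (j)
```

Overlapping counts are used because Python's built-in `str.count` skips overlapping matches (it reports one CC in "CCC", not two); whether the assignment intends overlapping counts is an assumption worth confirming.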