Published by Gladys Boyd. Modified over 9 years ago.
1
Introducing Database Mining to Molecular Genetics Students (Juniors & Seniors)
Karl Wilson
2
Objectives:
Introduce students to online protein and nucleotide databases (via GenBank at the NCBI website).
Specific operations:
– Use of BLAST to find similar sequences (protein & nucleotide)
– Downloading and saving sequences
– Comparison of sequences and alignment with ClustalW
– Interpretation of phylogenetic data
3
The “test” protein sequence:
AAA92063. cysteinyl endopep...[gi:1223922]
LOCUS       AAA92063  362 aa  linear  PLN  22-AUG-2002
DEFINITION  cysteinyl endopeptidase [Vigna radiata].
ACCESSION   AAA92063
VERSION     AAA92063.1  GI:1223922
DBSOURCE    locus VRU49445 accession U49445.1
KEYWORDS    .
SOURCE      Vigna radiata
  ORGANISM  Vigna radiata
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids; eurosids I; Fabales; Fabaceae; Papilionoideae; Phaseoleae; Vigna.
REFERENCE   1  (residues 1 to 362)
  AUTHORS   Lee,K., Tan-Wilson,A.L. and Wilson,K.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-FEB-1996) K. Lee, Department of Biological Sciences, State University of New York at Binghamton, P.O. Box 6000, Binghamton, NY 13902-6000, USA
4
Student given VRU49445 sequence (only) via e-mail or Blackboard
→ Find sequence via Entrez, download in FASTA format
→ VRU49445 sequence
→ Submit to Protein-Protein BLAST (BLASTP)
→ BLASTP results – related sequences
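The Entrez lookup and FASTA download can also be scripted. The sketch below is one possible way to do it, not part of the original exercise: it builds a request URL for NCBI's public E-utilities `efetch` service and parses FASTA text with the standard library only. The helper names and the example header line are assumptions made for illustration, and the example record holds only the first 50 residues shown on the alignment slide, not the full 362 aa sequence.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="protein"):
    """Build the efetch URL that returns one record as plain FASTA text."""
    query = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(query)


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical header; residues are the first 50 of AAA92063 from the alignment slide.
example = """>AAA92063.1 cysteinyl endopeptidase [Vigna radiata] (first 50 residues only)
MAMKKLLWVVLSLSLVLGVANSFDFH
EKDLASEESLWDLYERWRSHHTVS
"""

url = efetch_fasta_url("AAA92063")
records = parse_fasta(example)
print(url)
print(records[0][0], len(records[0][1]))
```

Passing `db="nucleotide"` with the accession U49445 would request the corresponding nucleotide record instead.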
5
Sequences producing significant alignments:                          Score (bits)  E Value
gi|1223922|gb|AAA92063.1| cysteinyl endopeptidase [Vigna ra...          705         0.0
gi|118158|sp|P12412|CYSP_VIGMU Vignain precursor (Bean endo...          686         0.0
gi|445927|prf||1910332A Cys endopeptidase                               684         0.0
gi|7435774|pir||S22502 cysteine proteinase (EC 3.4.22.-) -...          677         0.0
gi|544129|sp|P25803|CYSP_PHAVU Vignain precursor (Bean endo...          674         0.0
gi|1345573|emb|CAA40073.1| endopeptidase (EP-C1) [Phaseolus...          673         0.0
gi|31559530|dbj|BAC77523.1| cysteine proteinase [Glycine ma...          657         0.0
gi|31559526|dbj|BAC77521.1| cysteine proteinase [Glycine ma...          653         0.0
gi|7435817|pir||T08122 cysteine endopeptidase (EC 3.4.22.-)...          580         e-164
gi|600111|emb|CAA84378.1| cysteine proteinase [Vicia sativa]            540         e-152
gi|3688528|emb|CAA06243.1| pre-pro-TPE4A protein [Pisum sat...          539         e-152
gi|18423124|ref|NP_568722.1| cysteine proteinase [Arabidops...          521         e-147
gi|30141021|dbj|BAC75924.1| cysteine protease-2 [Helianthus...          516         e-145
gi|1076552|pir||S49166 cysteine proteinase (EC 3.4.22.-) pr...          510         e-143
gi|7435811|pir||T06708 cysteine proteinase (EC 3.4.22.-) T2...          490         e-137
gi|1169186|sp|P43156|CYSP_HEMSP Thiol protease SEN102 precu...          490         e-137
gi|25289998|pir||JC7787 carrot seed cysteine proteinase (EC...          485         e-136
gi|18408616|ref|NP_566901.1| cysteine proteinase, putative...           483         e-135
gi|1173630|gb|AAB37233.1| cysteine proteinase                           470         e-131
gi|4731374|gb|AAD28477.1|AF133839_1 papain-like cysteine pr...          462         e-129
gi|22331686|ref|NP_680113.1| cysteine proteinase, putative...           462         e-129
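Students who want to sort or filter the hit list can read the summary lines programmatically. The following is a minimal sketch that assumes the one-line layout shown on this slide (identifier, description, bit score, E value); the function names and the 1e-150 cutoff are hypothetical choices for illustration. BLAST abbreviates small E values as `e-164`, which is read here as 1e-164.

```python
import re

# One hit per line: identifier, free-text description, bit score, E value.
HIT_LINE = re.compile(
    r"^(?P<id>gi\|\S+)\s+(?P<desc>.*?)\s+(?P<score>\d+)\s+(?P<evalue>\S+)$"
)


def parse_evalue(text):
    """Convert an E value such as '0.0' or 'e-164' to a float."""
    if text.startswith("e"):
        text = "1" + text  # 'e-164' is shorthand for 1e-164
    return float(text)


def parse_hits(lines):
    """Return one dict per recognisable hit line; other lines are skipped."""
    hits = []
    for line in lines:
        match = HIT_LINE.match(line.strip())
        if match:
            hits.append({
                "id": match.group("id"),
                "description": match.group("desc"),
                "score": int(match.group("score")),
                "evalue": parse_evalue(match.group("evalue")),
            })
    return hits


# Four lines copied from the BLASTP summary.
summary = """\
gi|1223922|gb|AAA92063.1| cysteinyl endopeptidase [Vigna ra... 705 0.0
gi|7435774|pir||S22502 cysteine proteinase (EC 3.4.22.-) -... 677 0.0
gi|600111|emb|CAA84378.1| cysteine proteinase [Vicia sativa] 540 e-152
gi|1173630|gb|AAB37233.1| cysteine proteinase 470 e-131
"""

hits = parse_hits(summary.splitlines())
# Hypothetical cutoff: keep only hits with E value at or below 1e-150.
strong = [hit for hit in hits if hit["evalue"] <= 1e-150]
for hit in strong:
    print(hit["id"], hit["score"], hit["evalue"])
```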
6
BLASTP results – related sequences
→ Copy most similar cDNA sequences (in FASTA format)
→ cDNA sequences from P. vulgaris, V. mungo, G. max, V. sativa, etc.
→ Submit sequences to CLUSTALW at Biology Workbench website.
7
gi_118158_sp_P12412_CYSP_VIG MAMKKLLWVVLSLSLVLGVANSFDFHEKDLESEESLWDLYERWRSHHTVS
gi_1223922_gb_AAA92063.1__cy MAMKKLLWVVLSLSLVLGVANSFDFHEKDLASEESLWDLYERWRSHHTVS
gi_31559526_dbj_BAC77521.1__ MAMKKLLWVVLSLSLVLGSANSFDFHDKDLASEESFWDLYERWRSHHTVS
gi_31559530_dbj_BAC77523.1__ MAMKKFLWVVLSLSLVLGVANSFDFHDKDLESEESLWDLYERWRSHHTVS
gi_600111_emb_CAA84378.1__cy MEMKKLLFISLSLALIFTVANTFDFNEHDLESEKSLWNLYERWRSHHTVT

gi_118158_sp_P12412_CYSP_VIG RSLGEKHKRFNVFKANVMHVHNTNKMDKPYKLKLNKFADMTNHEFRSTYA
gi_1223922_gb_AAA92063.1__cy RSLTEKHKRFNVFKENVMHVHNTNKMDKPYKLKLNKFADMTNHEFRSTYA
gi_31559526_dbj_BAC77521.1__ RSLGDKHKRFNVFKANVMHVHNTNKMDKPYKLKLNKFADMTNHEFRSTYA
gi_31559530_dbj_BAC77523.1__ RSLGDKHKRFNVFKANMMHVHNTNKMDKPYKLKLNKFADMTNHEFRSTYA
gi_600111_emb_CAA84378.1__cy RNLDEKHNRFNVFKANVMHVHNTNKLDKPYKLKLNKFGDMTNYEFRRIYA

gi_118158_sp_P12412_CYSP_VIG GSKVNHHKMFRGSQHGSGTFMYEKVGSVPASVDWRKKGAVTDVKDQGQCG
gi_1223922_gb_AAA92063.1__cy GSKVNHHKMFRGTQHGNGTFMYEKVGSVPASVDWRKKGAVTDVKDQGQCG
gi_31559526_dbj_BAC77521.1__ GSKVNHHRMFQGTPRGNGTFMYEKVGSVPPSVDWRKNGAVTGVKDQGQCG
gi_31559530_dbj_BAC77523.1__ GSKVNHHRMFRDMPRGNGTFMYEKVGSVPASVDWRKKGAVTDVKDQGHCG
gi_600111_emb_CAA84378.1__cy DSKISHHRMFRGMSHENGTFMYENAVDVPSSIDWRNKGAVTGVKDQGQCG

Alignment of the Cysteine Proteases from Vigna, Phaseolus, Glycine, and Vicia.
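To help students read the alignment, it can be useful to mark which columns are identical in every sequence, as ClustalW does with `*` under its output. The sketch below is a simplified illustration using the first 50-column block of the alignment: it marks fully identical columns only and does not reproduce ClustalW's `:` and `.` marks for conservative substitutions. The function name and the shortened accession labels are assumptions.

```python
def conservation_marks(rows):
    """Return one mark per column: '*' if all rows share the same residue
    (and it is not a gap), otherwise a space."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# First block of the alignment; labels shortened to accession numbers.
block = {
    "P12412":   "MAMKKLLWVVLSLSLVLGVANSFDFHEKDLESEESLWDLYERWRSHHTVS",
    "AAA92063": "MAMKKLLWVVLSLSLVLGVANSFDFHEKDLASEESLWDLYERWRSHHTVS",
    "BAC77521": "MAMKKLLWVVLSLSLVLGSANSFDFHDKDLASEESFWDLYERWRSHHTVS",
    "BAC77523": "MAMKKFLWVVLSLSLVLGVANSFDFHDKDLESEESLWDLYERWRSHHTVS",
    "CAA84378": "MEMKKLLFISLSLALIFTVANTFDFNEHDLESEKSLWNLYERWRSHHTVT",
}

marks = conservation_marks(list(block.values()))
for name, row in block.items():
    print(f"{name:<9}{row}")
print(f"{'':<9}{marks}")
print(f"{marks.count('*')} of {len(marks)} columns are identical in all sequences")
```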
8
Unrooted Phylogenetic Tree
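Distance-based tree methods such as neighbor joining start from a table of pairwise distances between the aligned sequences. The sketch below computes a simple p-distance (the proportion of differing residues) for the first 50-column block of the protein alignment. It is an illustration of the idea only and is not the procedure that produced the tree on this slide; the function names and shortened labels are assumptions.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned):
    """Return a symmetric dict-of-dicts of pairwise p-distances."""
    names = list(aligned)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(aligned[a], aligned[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


# First 50 aligned columns of each protein; labels shortened to accession numbers.
aligned = {
    "P12412":   "MAMKKLLWVVLSLSLVLGVANSFDFHEKDLESEESLWDLYERWRSHHTVS",
    "AAA92063": "MAMKKLLWVVLSLSLVLGVANSFDFHEKDLASEESLWDLYERWRSHHTVS",
    "BAC77521": "MAMKKLLWVVLSLSLVLGSANSFDFHDKDLASEESFWDLYERWRSHHTVS",
    "BAC77523": "MAMKKFLWVVLSLSLVLGVANSFDFHDKDLESEESLWDLYERWRSHHTVS",
    "CAA84378": "MEMKKLLFISLSLALIFTVANTFDFNEHDLESEKSLWNLYERWRSHHTVT",
}

matrix = distance_matrix(aligned)
names = list(aligned)
print(f"{'':<10}" + "".join(f"{n:>10}" for n in names))
for a in names:
    print(f"{a:<10}" + "".join(f"{matrix[a][b]:>10.2f}" for b in names))
```

A full exercise would use the complete alignment rather than one block, since 50 columns give only a coarse estimate.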
10
Possible Additions:
Add more sequences (e.g. of non-legumes) and see how the tree changes.
Repeat all of the above, this time with the nucleotide (cDNA) sequences of the same proteins. Compare results with those from protein sequences.
11
Compare the nucleotide sequences of the cDNA and gene pairs where available – exons/introns?

ACGTGTGACGAATCAAAGGTGCATGTTAGGCCAAACATATTTTCCAATGA
ACGTGTGACGAATCAAAGGTG-----------------------------
ACCTGTGATGCATCAAAGGTGCATGTTCGGCCAAACTTTTTTTTTTTT--
ACCTGTGATGCATCAAAGGTG-----------------------------

AACCACTATAATTAATAGATAACTTGAGAAACT--AAAGTGCCAAAAATC
--------------------------------------------------
-TTTAATGAAACCAATA--TAACTTGAGAAATCTAAAATTGCCAAAAATC
--------------------------------------------------

TTTCATGTGGTAGGTGAATGACCTAGCTGTGTCAATTGATGGTCATGAAA
----------------AATGACCTAGCTGTGTCAATTGATGGTCATGAAA
TTGCATGTGGTAGGTGAATGACCTAGCTGTGTCAATTGATGGCCATGAGA
----------------AATGACCTAGCTGTGTCAATTGATGGCCATGAGA
                ************************** ***** *
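In a gene-versus-cDNA alignment, a long run of gaps in the cDNA row marks sequence present only in the gene, which is a candidate intron. The sketch below locates such runs for the first pair in the alignment above. It assumes the first row of each block is the genomic sequence and the second its cDNA (the slide does not label the rows); the function name and the `min_length` threshold are hypothetical.

```python
def gap_runs(row, min_length=10):
    """Return (start, end) index pairs (0-based, end exclusive) for each run
    of '-' in an aligned row that is at least min_length columns long."""
    runs = []
    start = None
    for i, char in enumerate(row):
        if char == "-":
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    if start is not None and len(row) - start >= min_length:
        runs.append((start, len(row)))
    return runs


# Row 1 of each block, joined: assumed genomic sequence.
gene = (
    "ACGTGTGACGAATCAAAGGTGCATGTTAGGCCAAACATATTTTCCAATGA"
    "AACCACTATAATTAATAGATAACTTGAGAAACT--AAAGTGCCAAAAATC"
    "TTTCATGTGGTAGGTGAATGACCTAGCTGTGTCAATTGATGGTCATGAAA"
)
# Row 2 of each block, joined: assumed cDNA (29 + 50 + 16 = 95 gap columns).
cdna = (
    "ACGTGTGACGAATCAAAGGTG"
    + "-" * 95
    + "AATGACCTAGCTGTGTCAATTGATGGTCATGAAA"
)
assert len(gene) == len(cdna)

introns = gap_runs(cdna)
for start, end in introns:
    # Drop alignment gaps inside the gene row to get the actual gene bases.
    bases = gene[start:end].replace("-", "")
    print(f"columns {start + 1}-{end}: {len(bases)} nt present only in the gene")
```

Because gap placement at the ends of a run can shift by a few bases without changing the alignment score, the reported boundaries are approximate and should be checked against the expected splice-site sequences.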
12
Examine targeting of cysteine protease – e.g. with TargetP or PSORT.
PSORT: http://psort.ims.u-tokyo.ac.jp/
With AAA92063 (Vigna radiata cysteine protease):
endoplasmic reticulum (lumen) --- Certainty= 0.910(Affirmative)
outside --- Certainty= 0.719(Affirmative)
lysosome (lumen) --- Certainty= 0.190(Affirmative)
endoplasmic reticulum (membrane) --- Certainty= 0.100(Affirmative)
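When several proteins are run through PSORT, the predictions are easier to compare if the report lines are turned into ranked (location, certainty) pairs. This is a small sketch that assumes the line layout shown on this slide; the function name is hypothetical and the parser is not an official PSORT tool.

```python
import re

# Layout assumed from the slide: "<location> --- Certainty= 0.910(Affirmative)"
PSORT_LINE = re.compile(
    r"^(?P<site>.+?)\s+---\s+Certainty=\s*(?P<score>\d*\.\d+)\s*\((?P<call>[^)]*)\)$"
)


def parse_psort(text):
    """Return (location, certainty) pairs sorted from most to least certain."""
    results = []
    for line in text.splitlines():
        match = PSORT_LINE.match(line.strip())
        if match:
            results.append((match.group("site"), float(match.group("score"))))
    return sorted(results, key=lambda item: item[1], reverse=True)


# PSORT result lines for AAA92063, copied from the slide.
report = """\
endoplasmic reticulum (lumen) --- Certainty= 0.910(Affirmative)
outside --- Certainty= 0.719(Affirmative)
lysosome (lumen) --- Certainty= 0.190(Affirmative)
endoplasmic reticulum (membrane) --- Certainty= 0.100(Affirmative)
"""

predictions = parse_psort(report)
best_site, best_score = predictions[0]
print(f"Top prediction: {best_site} ({best_score:.3f})")
```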