Gene analysis


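Nucleotide BLAST: compares a nucleotide sequence against the sequences registered in the database and finds similar sequences.
Protein BLAST: compares an amino acid sequence against the sequences registered in the database and finds domains and similar sequences.
Blastx: translates a nucleotide sequence into an amino acid sequence and finds similar amino acid sequences.
Dnassist version 2.2 program: used for comparing nucleotide or amino acid sequences, finding ORFs, checking restriction enzyme sites, physical properties, etc.
ClustalW multiple alignment 1.8 program: compares nucleotide or amino acid sequences.
MEGA: used for constructing phylogenetic trees.
SignalP program: finds signal sequences.
LipoP 1.0 server: finds lipoprotein sites (Gram-negative bacteria only).
Motif scan prediction program: finds motifs.
TMHMM server v.2.0: finds transmembrane (membrane-bound protein) sites.
NetNGlyc 1.0 server: used to find N-glycosylation sites (eukaryotic cells only).
NetOGlyc 3.1 server: used to find O-glycosylation sites (eukaryotic cells only).
Oligo Analyzer 3.1: used to set Tm, GC%, etc. when designing primers.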
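The primer properties mentioned for Oligo Analyzer (Tm, GC%) can be illustrated with a minimal Python sketch. It assumes the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough estimate for short oligos and is not claimed to be the method Oligo Analyzer itself uses. The function names `gc_percent` and `wallace_tm` are hypothetical, and the example primer is simply the first 20 nt of the AG17 agarase ORF.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = "".join(primer.split()).upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C); reasonable only for short oligos.
    """
    seq = "".join(primer.split()).upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "ATGAAGATTAAATTTTTATC"  # first 20 nt of the AG17 agarase ORF
    print(f"GC% = {gc_percent(primer):.1f}, Tm ~ {wallace_tm(primer)} C")
```

A primer this AT-rich gives a low estimate, which is the kind of case where the Tm and GC% settings would need adjusting during design.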

ATGAAGATTAAATTTTTATCTGCAGCAATCGCTGCAAGCTTAGCATTGCCATTAAGTGCTGCTACCTTAGTCACCTCTTTTGAGGAAGCCGACTA CAGCAGCTCTGAAAACAATACTGAATTCTTGGAAGTGTCTGGAGATGCCACTTCTGAAGTTTCAACTGAACAAGCTACCGATGGTAATCAATCCA TTAAAGCGTCTTTTGACGCGGCTTTCAAACCAATGGTTGTTTGGAACTGGGGAAGTTGGAACTGGGGCGCTGAAGATGTGATGTCAGTAGATGT TGTTAACCCTAACGACACTGACGTCACCTTTGCTATTAAGCTAATTGATAGTGATATTCTTCCTGATTGGGTAGATGAGTCTCAAACCTCATTGGA CTACTTTACGGTTTCAGCTAATACCACGCAGACCTTTAGCTTTAACTTAAATGGCGGCAACGAGTTCCAAACTCATGGCGAAAACTTTAGTAAAG ATAAAGTTATCGGTGTGCAGTTCATGCTCTCTGAAAACGATCCTCAAGTGTTGTACTTTGACAACATTATGGTTGATGGCGAAACAGTCACTCCG CCACCAAGTGATGGTGCAGTGAATACACAAACCGCGCCTGTAGCCACCTTAGCGCAAATCGAAGACTTTGAAACCATTCCAGATTACTTACGAC CTGATGGTGGGGTAAACGTTTCAACTACTACTGAGATTGTGACTAAAGGCGCTGCAGCAATGGCTGCCGAGTTTACTGCAGGTTGGAACGGTTT AGTGTTTGCAGGTACTTGGAATTGGGCTGAACTAGGTGAACACACCGCAGTTGCCGTTGACGTTTCAAATACTAGCGATAGCAATATCTGGTTG TACTCACGTATCGAAGATGTAAATAGCCAAGGCGAAACAGCGACTCGCGGCGTATTGGTTAAAGCTGGCGAATCGAAAACCATCTACACCAGCT TAAATGACAACCCTTCATTGCTTACTCAAGATGAGCGCGTGTCAGCTTTAGGTTTACGTGATATTCCAGCTGACCCAATGAGCGCTCAAAATGGC TGGGGTGATTTTGTTGCTTTAGACAAATCTCAAATTACCGCTATTCGTTACTTCATTGGCGAATTAGCCAGCGGTGAGACTAGCCAAACACTTGT GTTTGATAACATGCGTGTGATTAAAGACCTTAACCACGAATCAGCCTATGCAGAAATGACTGATGCTATGGGGCAAAACAACTTAGTCACTTATG CAGGTAAAGTTGCCAGCAAAGAAGAGTTAGCTAAGTTAAGTGATCCGGAAATGGCTGTTTTGGGTGAGTTAACCAATCGCAATATGTACGGTGG TAACCCAGATTCGTCGCCAACTACAGACTGTGTGCTCGCTACGCCTGCCTCGTTTAACGCTTGTAAAGACGCTGATGGTAACTGGCAATTGGTA GACCCTGCTGGTAATGCGTTCTTCTCAACCGGTGTTGATAACATTCGTTTGCAAGATACTTACACCATGACCGGCGTGTCGAGTGACGCCGAAT CTGAGTCTGCACTTCGCCAGTCAATGTTTACAGAAATTCCAAGTGATTATGTAAATGAAAACTATGGCCCTGTGCATAGTGGACCTGTTTCTCAA GGCCAAGCTGTAAGTTTTTACGCTAATAACTTAATTACCCGCCACGCTAGCGAAGACGTATGGCGAGACATTACTGTTAAGCGCATGAAAGACT GGGGCTTTAACACCTTAGGTAACTGGACCGATCCAGCGTTGTATGCAAACGGTGACGTTCCTTACGTGGCAAATGGTTGGTCAACCTCTGGTGC CGATCGTCTTCCCGTTAAACAAATTGGCAGCGGCTACTGGGGACCACTTCCTGATCCGTGGGATGCTAACTTTGCTACCAATGCCGCCACAATG 
GCTGCAGAGATCAAAGCTCAGGTTGAAGGCAACGAAGAGTACTTAGTGGGTATTTTTGTTGATAACGAAATGAGCTGGGGCAATGTCACTGATG TTGAAGGCTCTCGTTATGCGCAAACGCTAGCAGTGTTCAATACCGACGGCACTGATGCAACAACTAGCCCTGCTAAAAATAGCTTTATTTGGTTC TTAGAAAACCAGCGTTATACCGGTGGCATTGCTGACCTAAACGCAGCCTGGGGAACCGATTATGCGTCTTGGGATGCGATGCGCCCAGCGCAA GAGTTAGCTTATGTGGCTGGCATGGAAGCTGATATGCAGTTCCTTGCTTGGCAGTTTGCGTTCCAATACTTCAACACCGTAAACACGGCATTAAA AGCTGAGTTACCAAACCACTTGTACTTGGGCTCTCGCTTTGCAGATTGGGGACGTACTCCTGATGTAGTAAGTGCTGCTGCGGCTGTTGTTGAT GTGATGAGTTACAACATCTACAAAGACAGTATTGCAGCTGCCGATTGGGATGCTGATGCCTTAAATCAAATTGAAGCCATTGATAAGCCAGTAAT TATTGGTGAGTTCCACTTCGGTGCGCTTGATAGCGGTTCGTTTGCAGAAGGTGTAGTAAATGCCACTTCGCAACAAGATCGTGCAGACAAAATG GTTAGCTTCTACGAATCAGTAAATGCCCATAAAAACTTTGTAGGTGCGCATTGGTTCCAATACATCGATTCACCATTAACGGGTCGTGCATGGGA TGGCGAGAACTACAACGTTGGTTTTGTTAGCAATACTGACACGCCATATACATTGATGACAGATGCTGCGCGTGAGTTTAACTGTGGTATGTACG GCACTGACTGCTCTAGCTTAAGCAATGCTACTGAAGCTGCTTCGAGAGCCGGTGAGTTGTATACCGGTACCAATATTGGTGTTAGCCACTCTGG CCCAGAAGCGCCAGATCCAGGTGAGCCAGTTGATCCTCCAATTGATCCGCCAACACCACCAACAGGTGGCGTAACTGGCGGTGGCGGTAGCG CAGGTTGGTTATCGCTACTAGGTTTGGCCGGCGTATTTTTACTAAGACGTCGTAAAGTGTAA AG17 agarase - ORF

ATGGTTGTTTGGAACTGGGGAAGTTGGAACTGGGGCGCTGAAGATGTGATGTCAGTAGATGTTGTTAACCCTAACGACACTGACGTCAC CTTTGCTATTAAGCTAATTGATAGTGATATTCTTCCTGATTGGGTAGATGAGTCTCAAACCTCATTGGACTACTTTACGGTTTCAGCTAATA CCACGCAGACCTTTAGCTTTAACTTAAATGGCGGCAACGAGTTCCAAACTCATGGCGAAAACTTTAGTAAAGATAAAGTTATCGGTGTGC AGTTCATGCTCTCTGAAAACGATCCTCAAGTGTTGTACTTTGACAACATTATGGTTGATGGCGAAACAGTCACTCCGCCACCAAGTGATG GTGCAGTGAATACACAAACCGCGCCTGTAGCCACCTTAGCGCAAATCGAAGACTTTGAAACCATTCCAGATTACTTACGACCTGATGGT GGGGTAAACGTTTCAACTACTACTGAGATTGTGACTAAAGGCGCTGCAGCAATGGCTGCCGAGTTTACTGCAGGTTGGAACGGTTTAGT GTTTGCAGGTACTTGGAATTGGGCTGAACTAGGTGAACACACCGCAGTTGCCGTTGACGTTTCAAATACTAGCGATAGCAATATCTGGTT GTACTCACGTATCGAAGATGTAAATAGCCAAGGCGAAACAGCGACTCGCGGCGTATTGGTTAAAGCTGGCGAATCGAAAACCATCTACA CCAGCTTAAATGACAACCCTTCATTGCTTACTCAAGATGAGCGCGTGTCAGCTTTAGGTTTACGTGATATTCCAGCTGACCCAATGAGCG CTCAAAATGGCTGGGGTGATTTTGTTGCTTTAGACAAATCTCAAATTACCGCTATTCGTTACTTCATTGGCGAATTAGCCAGCGGTGAGA CTAGCCAAACACTTGTGTTTGATAACATGCGTGTGATTAAAGACCTTAACCACGAATCAGCCTATGCAGAAATGACTGATGCTATGGGGC AAAACAACTTAGTCACTTATGCAGGTAAAGTTGCCAGCAAAGAAGAGTTAGCTAAGTTAAGTGATCCGGAAATGGCTGTTTTGGGTGAGT TAACCAATCGCAATATGTACGGTGGTAACCCAGATTCGTCGCCAACTACAGACTGTGTGCTCGCTACGCCTGCCTCGTTTAACGCTTGT AAAGACGCTGATGGTAACTGGCAATTGGTAGACCCTGCTGGTAATGCGTTCTTCTCAACCGGTGTTGATAACATTCGTTTGCAAGATACT TACACCATGACCGGCGTGTCGAGTGACGCCGAATCTGAGTCTGCACTTCGCCAGTCAATGTTTACAGAAATTCCAAGTGATTATGTAAAT GAAAACTATGGCCCTGTGCATAGTGGACCTGTTTCTCAAGGCCAAGCTGTAAGTTTTTACGCTAATAACTTAATTACCCGCCACGCTAGC GAAGACGTATGGCGAGACATTACTGTTAAGCGCATGAAAGACTGGGGCTTTAACACCTTAGGTAACTGGACCGATCCAGCGTTGTATGC AAACGGTGACGTTCCTTACGTGGCAAATGGTTGGTCAACCTCTGGTGCCGATCGTCTTCCCGTTAAACAAATTGGCAGCGGCTACTGGG GACCACTTCCTGATCCGTGGGATGCTAACTTTGCTACCAATGCCGCCACAATGGCTGCAGAGATCAAAGCTCAGGTTGAAGGCAACGAA GAGTACTTAGTGGGTATTTTTGTTGATAACGAAATGAGCTGGGGCAATGTCACTGATGTTGAAGGCTCTCGTTATGCGCAAACGCTAGCA GTGTTCAATACCGACGGCACTGATGCAACAACTAGCCCTGCTAAAAATAGCTTTATTTGGTTCTTAGAAAACCAGCGTTATACCGGTGGC ATTGCTGACCTAAACGCAGCCTGGGGAACCGATTATGCGTCTTGGGATGCGATGCGCCCAGCGCAAGAGTTAGCTTATGTGGCTGGCA 
TGGAAGCTGATATGCAGTTCCTTGCTTGGCAGTTTGCGTTCCAATACTTCAACACCGTAAACACGGCATTAAAAGCTGAGTTACCAAACC ACTTGTACTTGGGCTCTCGCTTTGCAGATTGGGGACGTACTCCTGATGTAGTAAGTGCTGCTGCGGCTGTTGTTGATGTGATGAGTTAC AACATCTACAAAGACAGTATTGCAGCTGCCGATTGGGATGCTGATGCCTTAAATCAAATTGAAGCCATTGATAAGCCAGTAATTATTGGT GAGTTCCACTTCGGTGCGCTTGATAGCGGTTCGTTTGCAGAAGGTGTAGTAAATGCCACTTCGCAACAAGATCGTGCAGACAAAATGGT TAGCTTCTACGAATCAGTAAATGCCCATAAAAACTTTGTAGGTGCGCATTGGTTCCAATACATCGATTCACCATTAACGGGTCGTGCATG GGATGGCGAGAACTACAACGTTGGTTTTGTTAGCAATACTGACACGCCATATACATTGATGACAGATGCTGCGCGTGAGTTTAACTGTG GTATGTACGGCACTGACTGCTCTAGCTTAAGCAATGCTACTGAAGCTGCTTCGAGAGCCGGTGAGTTGTATACCGGTACCAATATTGGT GTTAGCCACTCTGGCCCAGAAGCGCCAGATCCAGGTGAGCCAGTTGATCCTCCAATTGATCCGCCAACACCACCAACAGGTGGCGTAA CTGGCGGTGGCGGTAGCGCAGGTTGGTTATCGCTACTAGGTTTGGCCGGCGTATTTTTACTAAGACGTCGTAAAGTG
AG17 agarase - ORF (search)
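An ORF search of the kind applied to the sequence above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular program: it scans only the three forward reading frames, treats ATG as the only start codon, and reports each stretch from the first ATG to the next in-frame stop. The names `find_orfs` and `longest_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean(dna: str) -> str:
    """Remove whitespace (slide line wraps) and upper-case the sequence."""
    return "".join(dna.split()).upper()


def find_orfs(dna: str, min_codons: int = 1):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon (inclusive).
    """
    seq = clean(dna)
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def longest_orf(dna: str) -> str:
    """Return the longest forward-strand ORF, or '' if none is found."""
    seq = clean(dna)
    orfs = find_orfs(seq)
    if not orfs:
        return ""
    start, end = max(orfs, key=lambda se: se[1] - se[0])
    return seq[start:end]


if __name__ == "__main__":
    print(longest_orf("CCATG AAATTT TAAGG"))
```

A sequence that lacks an in-frame stop codon yields no ORF under this sketch, so a fragment without its terminal stop would need the stop codon appended or the rule relaxed.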

MKIKFLSAAIAASLALPLSAATLVTSFEEADYSSSENNTEFLEVSGDAT SEVSTEQATDGNQSIKASFDAAFKPMVVWNWGSWNWGAEDVMSV DVVNPNDTDVTFAIKLIDSDILPDWVDESQTSLDYFTVSANTTQTFSF NLNGGNEFQTHGENFSKDKVIGVQFMLSENDPQVLYFDNIMVDGET VTPPPSDGAVNTQTAPVATLAQIEDFETIPDYLRPDGGVNVSTTTEIVT KGAAAMAAEFTAGWNGLVFAGTWNWAELGEHTAVAVDVSNTSDSNI WLYSRIEDVNSQGETATRGVLVKAGESKTIYTSLNDNPSLLTQDERV SALGLRDIPADPMSAQNGWGDFVALDKSQITAIRYFIGELASGETSQT LVFDNMRVIKDLNHESAYAEMTDAMGQNNLVTYAGKVASKEELAKLS DPEMAVLGELTNRNMYGGNPDSSPTTDCVLATPASFNACKDADGN WQLVDPAGNAFFSTGVDNIRLQDTYTMTGVSSDAESESALRQSMFT EIPSDYVNENYGPVHSGPVSQGQAVSFYANNLITRHASEDVWRDITV KRMKDWGFNTLGNWTDPALYANGDVPYVANGWSTSGADRLPVKQI GSGYWGPLPDPWDANFATNAATMAAEIKAQVEGNEEYLVGIFVDNE MSWGNVTDVEGSRYAQTLAVFNTDGTDATTSPAKNSFIWFLENQRY TGGIADLNAAWGTDYASWDAMRPAQELAYVAGMEADMQFLAWQFA FQYFNTVNTALKAELPNHLYLGSRFADWGRTPDVVSAAAAVVDVMS YNIYKDSIAAADWDADALNQIEAIDKPVIIGEFHFGALDSGSFAEGVVN ATSQQDRADKMVSFYESVNAHKNFVGAHWFQYIDSPLTGRAWDGE NYNVGFVSNTDTPYTLMTDAAREFNCGMYGTDCSSLSNATEAASRA GELYTGTNIGVSHSGPEAPDPGEPVDPPIDPPTPPTGGVTGGGGSA GWLSLLGLAGVFLLRRRKV Ag17 agarase protein

AG17 agarase - SignalP prediction using neural networks (NN) and hidden Markov models (HMM) trained on Gram-negative bacteria

The output format is essentially GFF. The default (long) output format looks like this:

# ANIA_NEIGO SpII score= margin= cleavage=18-19 Pos+2=G
# Cut-off=-3
ANIA_NEIGO LipoP1.0:Best SpII
ANIA_NEIGO LipoP1.0:Margin SpII
ANIA_NEIGO LipoP1.0:Class SpI
ANIA_NEIGO LipoP1.0:Class CYT
ANIA_NEIGO LipoP1.0:Signal CleavII # FALAA|CGGEQ Pos+2=G
ANIA_NEIGO LipoP1.0:Signal CleavI # GGEQA|AQAPA
ANIA_NEIGO LipoP1.0:Signal CleavI # LAACG|GEQAA
ANIA_NEIGO LipoP1.0:Signal CleavI # EQAAQ|APAET
ANIA_NEIGO LipoP1.0:Signal CleavI # GEQAA|QAPAE
ANIA_NEIGO LipoP1.0:Signal CleavI # QAAQA|PAETP

Ag17 - lipoprotein site
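The summary line of this output (sequence name, predicted class, then key=value fields) can be read programmatically. The following Python sketch is only an illustration based on the example line shown here. The function name `parse_lipop_summary` is hypothetical, and fields whose values are missing are returned as `None` rather than guessed.

```python
def parse_lipop_summary(line: str) -> dict:
    """Parse a LipoP-style summary comment line into a dictionary.

    Assumed layout: '# <name> <class> key=value key=value ...'.
    """
    tokens = line.lstrip("#").split()
    record = {"name": tokens[0], "class": tokens[1]}
    for token in tokens[2:]:
        key, _, value = token.partition("=")
        record[key] = value or None  # empty value -> None
    # Convert 'cleavage=18-19' into a pair of residue positions.
    if record.get("cleavage"):
        left, right = record["cleavage"].split("-")
        record["cleavage"] = (int(left), int(right))
    return record


if __name__ == "__main__":
    summary = "# ANIA_NEIGO SpII score= margin= cleavage=18-19 Pos+2=G"
    print(parse_lipop_summary(summary))
```

For the example line, the predicted class is SpII (lipoprotein signal peptide) with cleavage between residues 18 and 19.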

CGTTAGAACGCGTAATACGACTCACTATAGGGAGACACAAGTTATGGTTGGGGTGATTGGTGTTATAATAATGGAAAC CGACGTTATATGCGTATGGGTGTAAACTGGATTAGTCCAAAACATTTTGAGTATTATATTGATGGTGAGTTAGTTAGAG TGATGTATTATAATGCAATTGCCACTAATTACAACGGAACTTGGCAATACACATATTTTAATTCTATGAATTGGAATGTA AATGGATATAATCTTCCTACTAACAACGGATCTGGATATACAGATGTAACTACTTATGCTACATCTAATGCATACGATTT TGAAAAATTAAAGGAAGCATCTAATGCATCTAACGGTTTTAATGTAATTGATCCGGCTTGGTTCCAAGGGGGAGATGA TAGTGATACAGATGGAAATGGAGTAACACAAGAGGCTAGAGGATTCACTAAAGAATTAGATATTATTATTAATATGGAA TCACAAACGTGGTTAACAGCTTCTACACCATCACAAAGTGATTTAGAAAACCCAGCGAAAAATCAAATGAAAGTAGATT GGGTACGTGTTTACAAACCTGTATCATCTAATCCAGGTTCAGATGTAGCGGTACAAAGCGTCTCTTTATCACCTGCTA ATTTAACTATGTCAGAAGGAGAGACGAGTAACTTAACAGGTAGAGTGCTGCCTTCGAATGCTACAATCCAAACAATTG CTTTTACTTCTAATAATACAAATGTAGTTTCTGTTAATCAATCAGGCTTACTAACTGCAAACGGAATTGGTACAGCAATA ATTACAGCTACATCTACAGACGGTGGTTATACTGCAACTTCTAATATTACTGTAGAGGCTGAAGATGTTGGAGGTCCA ATAAGCTCTTTAGAAATTGAAGCTGATGATTTCTCATCAACAGGCGGTACATTTAATGACGGTGTTGTTCCTTTTGGTG CAAATAAATCATCAATTGGTGTTAATTATATTAATGCTGGTGATTACATGGAATATGTAGTCGCAATTGCTGAAATGGG AGACTACTCTCTTACTTATCAAATATCTACTCCAAGTGATAATGCTAAAATTGCATGTTATGTTGATGGTAATTTAGTAG CAGATGATAATGTTCAAAACAATGGACAGTGGGATGCATACCAAGCATTAACTGCTTCTAATAATTTATCGCTAACAAC TGGTAATCATACAATTAAAATTGAAGCTTCAGGCAGCAATGATTGGCAATGGAATCTTGATAAAATGAATTTAGAAAAA TTAGGTTCAGGAACGAATCCTGAAGAACCAACGCCTCCTTTAGCCGAAGATTTTGTAATTCAAGCGGAAGACTATAAT GAAACAAGTGGTAGTTTTAATGACGGTTTTGTCCCTTTTGGTGTTAACGCATCTGCAAATGGAATTAATTATGTTAATG CAGAAGATTGGGCGGATTATGAAGTTTATCTTCCAGAGGCAGGTACATTTAACGTAACCTACACAATTGCAACGCCAA GCGATAATGCACAAATTGAAATTGTAGTCTCCCTATAGTGAGTCGTATTACGCGTTCTAGCGACAATATGTACAATCAC TAGGAATTCGCGGCCGCCTG AG31 agarase full sequence ORF

ag-31-LipoP

GGTATTTTCATAAGCTTGAGTTTGAATATGGATACAAATAATAGAAGGTACACACAAAAG AGATTGTTTCATCTAGGGCCTGTTTATCTTTCGATGATTAAATTCACAAAAGTCACTCGC ACTAGTTAAAGAAGCATATCTACATTAATTTGCATGGAGATTTTATATGAATATATTAAAA CTACTATCCTGTTCTACTTGCGCAATACTCTGCACAGCAACACATGCTGCAGATTGGGA CGCATATAGTATTCCGGCTTCTGCTGGATCAGGTAAAACATGGCAATTACAAACTGTTT CCGACCAATTTAACTACCAAGCCGGTACTTCAAATAAACCGGCAGCATTTACCAATCGT TGGAATGCTTCGTATATTAATGCTTGGCTTGGGCCTGGTGATACTGAATTCAGTTCAGG TCATTCCTACACTACTGGTGGTGCGTTAGGCCTTCAGGCAACTGAAAAAGCAGGAACA AATAAAGTGCTTGCAGGAATTGTTTCTTCAAAAGCAACTTTTACATACCCACTTTATCTT GAGGCAATGGTAAAACCGAGTAATAACACTATGGCTAATGGTGTATGGATGTTGAGCT CTGATTCAACTCAGGAAATTGATGCGATGGAGGCATACGGCAGTGATCGTGTAGGGCA AGAGTGGTTTGACCAACGTATGCACGTAAGTCACCATGTTTTTATACGTGAGCCATTTC AAGATTACCAACCAAAAGATGCAGGCTCTTGGGTATACAATAACGGCGAAACATACCG AAATAAATTTCGTCGCTACGGTGTTCATTGGAAGGACGCGTGGAACCTAGATTACTATA TTGATGGTGTATTAGTTCGCAGCGTTTCAGGTCCTAATATAATTGATCCTGAAGGCTAT ACCGGTGGCACAGGGCTAAGTAAACCAATGCACATCCTTTTAGATATGGAACATCAAC CTTGGCGTGATGTAAAACCAAATTCAGCCGAGCTAGCTGATTCAAACAAAAGTATATTT TGGATTGACTGGATACGTGTCTACAAAGCAAACTAAGTCATTCTAAAATATTTGTAATAT TAGGTTTTATTGCTTCTCGTTATACGACACGGAGCAATAAACTTTAAGGTCCCCAAAACT ACTTAATGCGGCTATTACAGCCGCATTAAGTATAATTAACCTGAACTCTGGATAGTAAAT CTATCTCGAGCAGCTATTGACGCGTGAATTCTCTCCCTATAGTGAGTCGTA AG52 agarase full sequence ORF

GGGGATTGTAGAGTTCTTTCCCTTAGAAGATTAAAGATGGGGCGGCGACCAGCCCGTTGCTACCGCTACTGTAAA AACAAGCCGTATCCCAAGTCACGGTTTTGCCGTGGTGTCCCAGATCCAAAGATCCGAATCTTTGATTTGGGCAGAA AGAAGGCTCGTGTTGATGAGTTCCCTCTCTGCATCCACTTGATCTCCGATGAGTATGAACAGCTGTCATCGGAAGC TCTGGAGGCTGGACGTATCTGCGCCAACAAGTACCTGGTGAAGGTCTGCGGCAAGGATTCCTTCCACTTACGTGT GCGCTTACATCCCTTCCACGTCATCAGGATCAACAAGATGTTGTCCTGTGCCGGTGCTGATCGACTTCAGACCGGT ATGCGTGGTGCTTGGGGTAAACCTCAAGGGACTGTGGCCCGTGTAAACATTGGACAGCCGATCATGTCGGTCCGT TCCAGAGAGCAGAACGAGTCTGCTGTGATTGAGGCCCTCCGGAGAGCCAAGTTCAAGTACCCTGGCAGACAGAA GATTGTGCTCTCCAAGAAGTGGGGATTCACCAAGTGGCCAAGAGACTGCTATGAGGATATGTGCGCTGATGGACG CCTGATTCCTGACGGTGTCAGTGTTCAGTACAGACCCAACCGAGGCCCTCTCAACAAGTGGAGAAAGGACCAGAC CAACCTCAGGTCCtagACAGACTGTTGGGACGAACATGGTCATGTGAATAAAGAGCTGGAGAAATAAAAAAAGAAAA AAAAAAAAAAAAAAAAAAAAAAAAA
Abalone QM protein - full sequence ORF
GDCRVLSLRRLKMGRRPARCYRYCKNKPYPKSRFCRGVPDPKIRIFDLGR KKARVDEFPLCIHLISDEYEQLSSEALEAGRICANKYLVKVCGKDSFHLRVR LHPFHVIRINKMLSCAGADRLQTGMRGAWGKPQGTVARVNIGQPIMSVRS REQNESAVIEALRRAKFKYPGRQKIVLSKKWGFTKWPRDCYEDMCADGR LIPDGVSVQYRPNRGPLNKWRKDQTNLRS

ATGGACATTCCCATTACAAAACACTGGGATGGTGGTTTTCGGTCGGACTTCTGTGAACCTATTACT CAAACCATGCATTCCTGGAAAGCGCATGTCATCTTTGACCATCACGTCGATACGCTAGACATCTGG GTTGCTGATGTTCAGCAAACTCTGAATGGAGGTAAAGAGTTTGTCCTGGTCAACAAGGCATCTTAT GGCGAGCAGAAGGCTGGCGACAAACTGTGTGTGAAACTAATTGGACGTGTCAACGGTGACATCGT TCCAAAAGGCCGGTTCTACATTGAAGGCATGGACGGCCCGGTATCAGCTACCGAGAAACCCATCA GACATACATACAAACCGGGCACGCCGACAACGTCTAGCCATGTCACAGGCAAAGTTCTTTATGAAT ATTATGGATTCGATCCGAGTGATTACAAAAAGGGCATAACTGTGCTACAACATGGAGGCTTCGATG AAGACTCTGGCTCCGTTGTCCTTGACCCTGCCGGTACCGGGGAACATGTCCTCAAAGTGTTCTAC GAGAAGGGACACTATATCAAAGTTCGGGGCCACCGTGGGATTCAGTTCTACTGGACCCCTATCCA TCCCCAAACGACGCTGACGTTGAGCTACGACATCTACTTCGACCCCAACTTTGACTGGGTTAAGG GAGGCAAGCTTCCGGGTCTGTGGGGGAGGGTCCCAAACCCTGTTCTGGGGGACGTCACAACGAG GAATGTTTCTCAACACGCTTCATGTGGAGGACTGGTGGGGGTGGGGAACTGTACGCCTACATCCC CTCTGGACAACGCGCCGACTTCTGCACTAAGAACATTTGCAACTTCGACTACGGTAACTCTTtag
Abalone alginate lyase - ORF