Using MATLAB to identify genes in novel genomes based on homology


1 Using MATLAB to identify genes in novel genomes based on homology
Christine DeGennaro, Postdoc, Springer Lab

2 Major points
You can use MATLAB to automate repetitive tasks
You can integrate existing software into your MATLAB scripts
With some patience and the help of Google / MATLAB documentation, this is very achievable with a basic level of MATLAB skill

3 Project background
Motivation: Want to understand thermostability and folding characteristics of proteins from cryophiles
Goals: Clone and express proteins from several cryophilic organisms for in vitro study

4 Project background
Temperatures shown: 37°C, 30°C, 25°C, 17°C, 13°C
Organisms shown: S. cerevisiae, C. saitoi, C. socialis, C. victoriae, C. vishniacii, G. martinii, L. antarcticum

5 How would you do this by hand?
Antarctic yeast contigs

6 How would you do this by hand?
S. cerevisiae HIS3 BLAST

7 How would you do this by hand?
S. pombe HIS3 BLAST again

8 How would you do this by hand?

9 How would you do this by hand?

10 How would you do this by hand?
Identify possible start and stop codons

11 How would you do this by hand?
Identify possible start and stop codons
Identify possible splice sites

12 How would you do this by hand?
Identify possible start and stop codons
Identify possible splice sites
Identify the most likely gene features/boundaries

13 How would you do this by hand?
Identify possible start and stop codons
Identify possible splice sites
Identify the most likely gene features/boundaries
Design primers to amplify the region for cloning
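
The first manual step, listing candidate start and stop codons, can be sketched in base MATLAB. This is only an illustration and not the speaker's script: the sequence `dna` is a made-up toy example, only the forward strand is scanned, and introns are ignored.

```matlab
% Toy sequence (hypothetical, for illustration only)
dna = 'CCATGAAATTTTAGGGATGCCCTGA';

% Positions of every ATG (candidate start codons), forward strand only
starts = strfind(dna, 'ATG');

% Positions of every stop codon (TAA, TAG, TGA), in sequence order
stops = sort([strfind(dna, 'TAA'), strfind(dna, 'TAG'), strfind(dna, 'TGA')]);

% Pair each start with the first stop downstream that is in the same frame
orfs = zeros(0, 2);
for s = starts
    inFrame = stops(stops > s & mod(stops - s, 3) == 0);
    if ~isempty(inFrame)
        % Store [first base of ATG, last base of stop codon]
        orfs(end+1, :) = [s, inFrame(1) + 2]; %#ok<AGROW>
    end
end
disp(orfs)
```

On a real contig the same scan would also need to cover the reverse complement, and splice sites would have to be considered before trusting any of the candidate boundaries.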

14 How can MATLAB make this easier?
Inputs: ortholog sequences, assembled contigs
MATLAB ANALYSIS
1.) Identify region with BLAST
2.) Gene feature predictions
3.) Amplification primer optimization
YFG1

15 Running BLAST with MATLAB
    % RUN BLAST
    blastlocal('InputQuery','FASTA/HIS3/Scer_YOR202W.fasta',...
        'database','C:/Users/cmd16/Genomes/C_socialis.fa',...
        'BlastPath','C:/Program Files/blast /bin/blastall.exe',...
        'program','tblastn',...
        'Format',8);

16 Running BLAST with MATLAB
    % RUN BLAST
    blastlocal('InputQuery','FASTA/HIS3/Scer_YOR202W.fasta',...
        'database','C:/Users/cmd16/Genomes/C_socialis.fa',...
        'BlastPath','C:/Program Files/blast /bin/blastall.exe',...
        'program','tblastn',...
        'Format',8);
Callouts on the slide:
InputQuery: FASTA/HIS3/Scer_YOR202W.fasta
BlastPath: C:/Program Files/blast /bin/blastall.exe
database: C:/Users/cmd16/Genomes/C_socialis.fa
program: tBLASTn
Format: Output format 8
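
Output format 8 is blastall's tabular format: one hit per line, twelve tab-separated columns (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). A minimal sketch of reading one such line in base MATLAB follows. The hit line and the contig name are invented for illustration, and how the speaker's script actually consumed the BLAST results is not shown on the slides.

```matlab
% One hypothetical hit line in BLAST tabular format (all values are made up)
tab = sprintf('\t');
hitLine = strjoin({'Scer_YOR202W', 'contig_12', '61.50', '200', '77', '0', ...
                   '5', '204', '1530', '2129', '2e-70', '251'}, tab);

% Split on tabs and convert the numeric columns
fields = strsplit(hitLine, tab);
hit.query       = fields{1};
hit.subject     = fields{2};
hit.pctIdentity = str2double(fields{3});
hit.alignLength = str2double(fields{4});
hit.qStart      = str2double(fields{7});
hit.qEnd        = str2double(fields{8});
hit.sStart      = str2double(fields{9});
hit.sEnd        = str2double(fields{10});
hit.evalue      = str2double(fields{11});
hit.bitScore    = str2double(fields{12});

% For tblastn, subject coordinates are nucleotide positions on the contig;
% a start greater than the end indicates a hit on the reverse strand
hit.onMinusStrand = hit.sStart > hit.sEnd;
disp(hit)
```

The subject start and end give the contig region worth extracting for the later gene-feature and primer steps.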

17 Running BLAST with MATLAB
    blastlocal('InputQuery','FASTA/HIS3/Scer_YOR202W.fasta',...
        'database','C:/Users/cmd16/Genomes/C_socialis.fa',...
        'BlastPath','C:/Program Files/blast /bin/blastall.exe',...
        'program','tblastn',...
        'Format',8);
Callouts on the slide:
InputQuery: FASTA/HIS3/Scer_YOR202W.fasta (or another gene, e.g. FASTA/ADE2/Scer_YOR128C.fasta)
BlastPath: C:/Program Files/blast /bin/blastall.exe
database: C:/Users/cmd16/Genomes/C_socialis.fa
program: tBLASTn
Format: Output format 8
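
Because only the query file changes between genes, the call lends itself to a loop. The sketch below builds the argument list for each query using the file names and settings from the slides. The loop structure and the variable names (`queries`, `argSets`) are assumptions for illustration, and the actual `blastlocal` call is left as a comment because it needs the Bioinformatics Toolbox, a local BLAST install and the speaker's files.

```matlab
% Query files named on the slides; extend this list to cover more genes
queries = {'FASTA/HIS3/Scer_YOR202W.fasta', 'FASTA/ADE2/Scer_YOR128C.fasta'};

% Settings copied from the slide (paths are specific to the speaker's machine)
database  = 'C:/Users/cmd16/Genomes/C_socialis.fa';
blastPath = 'C:/Program Files/blast /bin/blastall.exe';

% Build one name/value argument list per query
argSets = cell(size(queries));
for k = 1:numel(queries)
    argSets{k} = {'InputQuery', queries{k}, ...
                  'database',   database, ...
                  'BlastPath',  blastPath, ...
                  'program',    'tblastn', ...
                  'Format',     8};
    fprintf('Query %d: %s\n', k, queries{k});
    % With the toolbox and BLAST available, the search itself would be:
    % blastlocal(argSets{k}{:});
end
```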

18 Gene prediction output

19 Cryptococcus neoformans HIS3

20 MATLAB and Primer3

21 MATLAB and Primer3
PRIMER3 INPUT FILE
    SEQUENCE_ID=Cryptococcus_socialis_HIS3
    SEQUENCE_TEMPLATE=CACCCTGATAGGGGAATCCT...
    SEQUENCE_INCLUDED_REGION=528,848
    PRIMER_TASK=pick_cloning_primers
    PRIMER_PICK_ANYWAY=0
    PRIMER_PICK_LEFT_PRIMER=1
    PRIMER_PICK_INTERNAL_OLIGO=0
    PRIMER_PICK_RIGHT_PRIMER=1
    PRIMER_OPT_SIZE=18
    PRIMER_MIN_SIZE=15
    PRIMER_MAX_SIZE=21
    PRIMER_NUM_RETURN=1
    =
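
A Primer3 input record is a list of TAG=VALUE lines closed by a line containing only `=`, so it is easy to generate from MATLAB. The sketch below writes such a record to a temporary file. The template sequence, the sequence ID and the included region are placeholders (the slide's template is truncated and is not reproduced here), and the commented-out `system` call assumes `primer3_core` is installed and on the path.

```matlab
% Placeholder template and region (hypothetical values, not the slide's data)
template = repmat('ACGT', 1, 50);   % 200 nt dummy sequence
region   = [41, 120];               % included region as start,length

% Tag/value pairs; the PRIMER_* settings are the ones shown on the slide
tags = {
    'SEQUENCE_ID',                'example_target'
    'SEQUENCE_TEMPLATE',          template
    'SEQUENCE_INCLUDED_REGION',   sprintf('%d,%d', region(1), region(2))
    'PRIMER_TASK',                'pick_cloning_primers'
    'PRIMER_PICK_ANYWAY',         '0'
    'PRIMER_PICK_LEFT_PRIMER',    '1'
    'PRIMER_PICK_INTERNAL_OLIGO', '0'
    'PRIMER_PICK_RIGHT_PRIMER',   '1'
    'PRIMER_OPT_SIZE',            '18'
    'PRIMER_MIN_SIZE',            '15'
    'PRIMER_MAX_SIZE',            '21'
    'PRIMER_NUM_RETURN',          '1'
    };

% Assemble TAG=VALUE lines and terminate the record with a lone "="
nl = sprintf('\n');
recLines = cell(1, size(tags, 1));
for k = 1:size(tags, 1)
    recLines{k} = sprintf('%s=%s', tags{k, 1}, tags{k, 2});
end
record = [strjoin(recLines, nl) nl '=' nl];

% Write the record to a temporary input file
inputFile = [tempname '.txt'];
fid = fopen(inputFile, 'w');
fprintf(fid, '%s', record);
fclose(fid);

% With Primer3 installed, the file could then be passed to it, for example:
% [status, out] = system(['primer3_core < "' inputFile '"']);
```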

22 MATLAB analysis outputs
a. MATLAB objects: BLAST data, summary of analysis, list of primers
b. MATLAB figure: showing all BLAST hits
c. FASTA file: containing sequence of contig/region
d. GenBank file: contains sequence + annotation
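
As an illustration of output (c), a FASTA file can be written with a few lines of base MATLAB. The header and sequence below are placeholders and the 60-character line width is just a common convention; the Bioinformatics Toolbox also provides `fastawrite`, which may well be what the original script used.

```matlab
% Placeholder record (hypothetical header and sequence)
header = 'example_contig_region';
seq    = repmat('ACGT', 1, 40);   % 160 nt dummy sequence
width  = 60;                      % characters per sequence line

% Write the header line, then the sequence wrapped at the chosen width
fastaFile = [tempname '.fasta'];
fid = fopen(fastaFile, 'w');
fprintf(fid, '>%s\n', header);
for i = 1:width:numel(seq)
    fprintf(fid, '%s\n', seq(i:min(i + width - 1, numel(seq))));
end
fclose(fid);

% Show the result
disp(fileread(fastaFile))
```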

23 MATLAB outputs: GenBank file

24 Major points
You can use MATLAB to automate repetitive tasks
You can integrate existing software into your MATLAB scripts
With some patience and the help of Google / MATLAB documentation, this is very achievable with a basic level of MATLAB skill

