Protein Evolution
Jean Yeh, SoCalBSI
Mike Thompson, UCLA
Summer 2005

How do proteins evolve?
• Point mutations – exchange of one nucleotide for another
    - Silent – same amino acid
    - Missense – different amino acid
    - Nonsense – premature stop codon
• Insertions and deletions (indels) – addition or removal of one or more nucleotides
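The three kinds of point mutation can be told apart by translating the codon before and after the change. The following is a minimal Python sketch using the standard genetic code; the function name `classify_point_mutation` is hypothetical and is not part of the original project, which was written in Perl.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(codon, position, new_base):
    """Classify a single-base substitution in a codon (hypothetical helper).

    position is the 0-based index (0, 1, or 2) of the base being replaced.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "silent"      # same amino acid
    if after == "*":
        return "nonsense"    # amino acid replaced by a stop
    return "missense"        # different amino acid


print(classify_point_mutation("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
print(classify_point_mutation("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
print(classify_point_mutation("TAC", 2, "A"))  # TAC (Tyr) -> TAA (stop)
```

For simplicity, this sketch labels a stop codon mutating into an amino acid codon as missense.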

Frameshift Mutations
[Figure: http://www.cancerquest.org/images/frameshift.gif]

Frameshift Mutations (cont.)
• An insertion or deletion of a number of nucleotides that is not divisible by three
• Leads to a shift in reading frame
• Generally renders the original protein nonfunctional, perhaps through a stop codon (nonsense mutation)
• But what if it led to a functional protein?
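To make the effect of a shifted reading frame concrete, here is a small Python sketch that translates a made-up DNA sequence before and after a single-nucleotide deletion. The sequence and the helper `translate` are illustrative assumptions, not data or code from the project.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset frame, stopping at the first stop codon."""
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "ATGGCCAAGTTGACCTGA"          # made-up example: ATG GCC AAG TTG ACC TGA
mutant = original[:4] + original[5:]     # delete one nucleotide (index 4)

print(translate(original))  # MAKLT
print(translate(mutant))    # MAS (every codon after the deletion is re-read)
```

In this toy case the deletion changes the third residue and exposes a stop codon two codons later, which is the typical outcome described above.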

Frameshift Errors
Pellegrini, M. and Yeates, T.O. (1999) Searching for frameshift evolutionary relationships between protein sequence families. Proteins, 37, 278–283.

Goal
• To see if frameshift mutations can account for the evolution of some proteins
• Analysis will be based on amino acid scoring matrices created by Drs. Pellegrini and Yeates in a previously published paper ("Searching for frameshift evolutionary relationships between protein sequence families", Proteins, 37, 278–283, 1999; http://www.doe-mbi.ucla.edu/~yeates/frameshift/)

Methods
• Using a database of closely related genomes, pull out genes matching the following pattern:

    Genome 1:  Gene A   Gene X   Gene B
    Genome 2:  Gene A   Gene Y   Gene B

• If the genes on either side of X and Y are conserved, one probably arose from the other
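The pattern search can be sketched as follows. This is only an illustration in Python under stated assumptions: each genome is an ordered list of gene identifiers, `orthologs` maps conserved Genome 1 genes to their Genome 2 counterparts, and the function name `find_flanked_pairs` is hypothetical. The project's actual Perl programs and database layout are not shown in the slides.

```python
def find_flanked_pairs(genome1, genome2, orthologs):
    """Return (x, y) pairs where x and y sit between the same conserved neighbors.

    genome1, genome2: gene identifiers in chromosomal order (assumed input format).
    orthologs: dict mapping a genome1 gene to its conserved genome2 counterpart.
    """
    position2 = {gene: i for i, gene in enumerate(genome2)}
    conserved_in_2 = set(orthologs.values())
    pairs = []
    for i in range(1, len(genome1) - 1):
        left, x, right = genome1[i - 1], genome1[i], genome1[i + 1]
        if x in orthologs:
            continue  # x itself is conserved, so it is not a candidate
        left2, right2 = orthologs.get(left), orthologs.get(right)
        if left2 not in position2 or right2 not in position2:
            continue  # one of the flanking genes is not conserved
        a, b = position2[left2], position2[right2]
        if abs(a - b) != 2:
            continue  # flanking genes must enclose exactly one gene in genome2
        y = genome2[(a + b) // 2]
        if y not in conserved_in_2:
            pairs.append((x, y))
    return pairs


genome1 = ["A1", "X", "B1", "C1"]
genome2 = ["A2", "Y", "B2", "C2"]
orthologs = {"A1": "A2", "B1": "B2", "C1": "C2"}
print(find_flanked_pairs(genome1, genome2, orthologs))  # [('X', 'Y')]
```

Using `abs(a - b)` lets the sketch accept a neighborhood that is inverted in the second genome; whether the original analysis allowed that is not stated.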

Methods (cont.)
• Compile list of 'X and Y' genes
• Run comparisons on the underlying amino acid sequences, based on amino acid tables that take frameshift mutations into account
• See if relationships in fact exist between the seemingly unrelated genes

Database
• Peter Bowers had two databases (prokaryotic and fungal) culled from various internet sources
• Started with the prokaryotic database because it was more complete
• Dr. Yeates felt the sequences had diverged too much
• Switched to the fungal database – less complete, but with more closely related genomes

Coding
• Wrote programs in Perl to update the fungal database
    - Nucleotide stop and start positions
    - Contig numbering
• Started with complete genomes and pulled lists of bidirectional best hits
    - Too few to be of use

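Bidirectional Best Hit
• Genome 1 against Genome 2: Gene 1 hits Gene 5, Gene 10, and Gene 13; Gene 1 -> Gene 13 gives best e-score
• Genome 2 against Genome 1: Gene 13 hits Gene 1 and Gene 4; Gene 13 -> Gene 1 gives best e-score
• Gene 1 and Gene 13 are each other's best hit, so they form a bidirectional best hit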
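A bidirectional best hit can be computed from two hit lists, one per search direction. The Python sketch below assumes each hit is a `(query, subject, e_value)` tuple and that a lower E-value is better; the function names and the numeric values are made up for illustration.

```python
def best_hits(hits):
    """Map each query gene to its subject with the lowest (best) E-value."""
    best = {}
    for query, subject, e_value in hits:
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(hits_1_to_2, hits_2_to_1):
    """Return sorted (gene1, gene2) pairs that are each other's best hit."""
    forward = best_hits(hits_1_to_2)
    reverse = best_hits(hits_2_to_1)
    return sorted(
        (gene1, gene2)
        for gene1, gene2 in forward.items()
        if reverse.get(gene2) == gene1
    )


# Hypothetical E-values mirroring the slide's example.
hits_1_to_2 = [
    ("gene1", "gene5", 1e-5),
    ("gene1", "gene10", 1e-20),
    ("gene1", "gene13", 1e-80),
    ("gene2", "gene5", 1e-30),
]
hits_2_to_1 = [
    ("gene13", "gene1", 1e-75),
    ("gene13", "gene4", 1e-10),
    ("gene5", "gene1", 1e-6),
]
print(bidirectional_best_hits(hits_1_to_2, hits_2_to_1))  # [('gene1', 'gene13')]
```

Here gene2's best hit is gene5, but gene5's best hit is gene1, so that pair is not reciprocal and is dropped.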

Coding (cont.)
• Compiled lists of all alignments between two genomes, then took any bidirectional hits
• Filtered for those alignments that match the desired pattern
• Have sequences for eight pairs of genomes (ranging from 4 to 82 sequences per pair)

Analysis
• Ran local alignments on the obtained sequences, using scoring matrices from the website
• Used different gap penalties
• Also tried test sequences that had been shifted by one or two frames
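For readers unfamiliar with local alignment, the following Python sketch computes a Smith-Waterman score with a linear gap penalty. It is not the alignment program used in the project, and `toy_score` is a stand-in assumption: the real analysis used the frameshift scoring matrices from the website, which are not reproduced here.

```python
def local_alignment_score(seq_a, seq_b, score, gap_penalty):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    score: function (residue_a, residue_b) -> substitution score.
    gap_penalty: positive cost charged per gapped position.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for a in seq_a:
        current = [0]
        for j, b in enumerate(seq_b, start=1):
            cell = max(
                0,                               # start a new local alignment
                previous[j - 1] + score(a, b),   # align a with b
                previous[j] - gap_penalty,       # gap in seq_b
                current[j - 1] - gap_penalty,    # gap in seq_a
            )
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


def toy_score(a, b):
    """Placeholder substitution score: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1


print(local_alignment_score("XXMAKXX", "YYMAKYY", toy_score, gap_penalty=2))  # 6
print(local_alignment_score("MAKLT", "MALT", toy_score, gap_penalty=1))       # 7
print(local_alignment_score("MAKLT", "MALT", toy_score, gap_penalty=5))       # 4
```

The last two calls show why trying different gap penalties matters: a cheap gap lets the alignment bridge the missing residue, while an expensive one restricts it to an ungapped segment.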

Future Work
• So far the results have been inconclusive
• Would probably need to do a full statistical estimation of alignment scores according to the extreme value distribution
• Could also work with the underlying nucleotide sequences
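As a pointer to what such an estimation involves: under the Karlin-Altschul extreme value model, the chance of seeing a local alignment score of at least S between random sequences of lengths m and n is approximately 1 - exp(-K·m·n·e^(-λS)). The Python sketch below evaluates that formula. The parameter values are arbitrary placeholders; for custom matrices such as the frameshift ones, λ and K would have to be estimated, for example from alignments of shuffled sequences.

```python
import math


def extreme_value_p_value(score, lam, k, m, n):
    """P(best local score >= score) under the Karlin-Altschul approximation.

    lam, k: matrix-dependent parameters (must be estimated for a given matrix).
    m, n: lengths of the two sequences being compared.
    """
    expected_hits = k * m * n * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-expected_hits)


# Placeholder parameters chosen only to exercise the formula.
p = extreme_value_p_value(score=50, lam=0.3, k=0.1, m=100, n=100)
print(f"{p:.2e}")
```

A higher score always yields a smaller p-value under this model, so the estimation turns a raw alignment score into a statement about how surprising it is.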

Acknowledgements
• Peter Bowers
• Mike Thompson
• Todd Yeates
• Nam Tonthat
• SoCalBSI
