Protein Evolution Jean Yeh, SoCalBSI Mike Thompson, UCLA Summer 2005.

How do proteins evolve?
- Point mutations: exchange of one nucleotide for another
  - Silent: same amino acid
  - Missense: different amino acid
  - Nonsense: stop codon
- Insertions and deletions (indels): addition or removal of one or more nucleotides
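
The three kinds of point mutation can be told apart by translating the codon before and after the change. The following is a minimal sketch, assuming the standard genetic code and a sense (non-stop) starting codon; the function name `classify_point_mutation` is made up for illustration and is not from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_point_mutation(codon, position, new_base):
    """Classify a single-base substitution in one sense codon (hypothetical helper)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "silent"      # same amino acid
    if after == "*":
        return "nonsense"    # codon became a stop
    return "missense"        # different amino acid


for pos, base in [(2, "G"), (1, "T"), (0, "T")]:
    print("GAA", pos, base, classify_point_mutation("GAA", pos, base))
```

Here GAA (Glu) becomes GAG (still Glu), GTA (Val), and TAA (stop) in turn.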

Frameshift Mutations

Frameshift Mutations (cont.)
- An insertion or deletion of a number of nucleotides that is not divisible by three
- Leads to a shift in the reading frame
- Generally renders the original protein nonfunctional, perhaps through a stop codon (nonsense mutation)
- But what if it led to a functional protein?
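
A small example may make the reading-frame shift concrete. This is only a sketch, assuming the standard genetic code; the sequence is invented for illustration, and `translate` is a hypothetical helper that stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


original = "ATGGCTGAACGTTAA"                # made-up sequence: ATG GCT GAA CGT TAA
shifted = original[:3] + "C" + original[3:]  # insert one base after the start codon

print(translate(original))  # reading frame intact
print(translate(shifted))   # every codon after the insertion is read differently
```

In this example the single inserted base changes the second codon and exposes a TGA stop codon immediately afterwards, so the shifted sequence yields a much shorter product.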

Frameshift Errors
Pellegrini, M. and Yeates, T.O. (1999) Searching for frameshift evolutionary relationships between protein sequence families. Proteins, 37, 278–283

Goal
- To see if frameshift mutations can account for the evolution of some proteins
- Analysis will be based on amino acid scoring matrices created by Drs. Pellegrini and Yeates in a previously published paper (“Searching for frameshift evolutionary relationships between protein sequence families”. Proteins, 37, 278– ; mbi.ucla.edu/~yeates/frameshift/)

Methods
- Using a database of closely related genomes, pull out genes matching the following pattern:
  [Diagram: Genome 1 and Genome 2, with Gene A and Gene B flanking Gene X and Gene Y]
- If genes on either side of X and Y were conserved, one probably arose from the other
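
The pattern search could be sketched roughly as follows. This is not the authors' Perl code: it assumes each genome is available as a list of gene IDs in chromosomal order and that conserved genes are given as a mapping from Genome 1 IDs to Genome 2 IDs. The function name, data layout, and gene IDs are all hypothetical.

```python
def find_flanked_pairs(order1, order2, orthologs):
    """Return (X, Y) pairs whose immediate neighbours are conserved in both genomes.

    order1, order2: gene IDs in chromosomal order (assumed input format).
    orthologs: dict mapping a Genome 1 gene to its conserved Genome 2 counterpart.
    """
    pos2 = {gene: i for i, gene in enumerate(order2)}
    conserved2 = set(orthologs.values())
    pairs = []
    for i in range(1, len(order1) - 1):
        a, x, b = order1[i - 1], order1[i], order1[i + 1]
        a2, b2 = orthologs.get(a), orthologs.get(b)
        # Both flanking genes must be conserved; X itself must have no counterpart.
        if a2 is None or b2 is None or x in orthologs:
            continue
        ia, ib = pos2.get(a2), pos2.get(b2)
        # The conserved flanks must enclose exactly one gene in Genome 2.
        if ia is None or ib is None or abs(ia - ib) != 2:
            continue
        y = order2[(ia + ib) // 2]
        if y not in conserved2:
            pairs.append((x, y))
    return pairs


genome1 = ["A1", "X", "B1", "C1"]
genome2 = ["A2", "Y", "B2", "C2"]
conserved = {"A1": "A2", "B1": "B2", "C1": "C2"}
print(find_flanked_pairs(genome1, genome2, conserved))
```

Requiring the flanks to be immediate neighbours is a simplifying choice for the sketch; a real search might tolerate intervening genes or contig boundaries.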

Methods (cont.)
- Compile list of ‘X and Y’ genes
- Run comparisons on underlying amino acid sequences, based on amino acid tables that take into account frameshift mutations
- See if relationships in fact exist between the seemingly unrelated genes

Database
- Peter Bowers had two databases (prokaryotic and fungal) culled from various internet sources
- Started with the prokaryotic database because it was more complete
- Dr. Yeates felt the sequences had diverged too much
- Switched to the fungal database: less complete, but with more closely related genomes

Coding
- Wrote programs in Perl to update the fungal database
  - Nucleotide stop and start positions
  - Contig numbering
- Started with complete genomes and pulled lists of bidirectional best hits
  - Too few to be of use

Bidirectional Best Hit
[Diagram: Gene 1 (Genome 1) compared against Gene 5, Gene 10, and Gene 13 (Genome 2); Gene 13 (Genome 2) compared against Gene 1 and Gene 4 (Genome 1)]
- Gene 1 -> Gene 13 gives best e-score
- Gene 13 -> Gene 1 gives best e-score
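
The bidirectional best hit test can be written down in a few lines. The sketch below assumes each search direction has been reduced to (query, subject, e-value) tuples; that input format, the function names, and the e-values are illustrative assumptions, not the authors' data or code.

```python
def best_hits(hits):
    """Map each query gene to the subject with the lowest e-value."""
    best = {}
    for query, subject, evalue in hits:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(hits_1_to_2, hits_2_to_1):
    """Keep pairs where each gene is the other's best hit."""
    forward = best_hits(hits_1_to_2)
    reverse = best_hits(hits_2_to_1)
    return sorted((g1, g2) for g1, g2 in forward.items() if reverse.get(g2) == g1)


# Invented e-values mirroring the genes named on the slide.
genome1_vs_genome2 = [("gene1", "gene5", 1e-5),
                      ("gene1", "gene10", 1e-10),
                      ("gene1", "gene13", 1e-50)]
genome2_vs_genome1 = [("gene13", "gene1", 1e-48),
                      ("gene13", "gene4", 1e-3)]
print(bidirectional_best_hits(genome1_vs_genome2, genome2_vs_genome1))
```

With these numbers Gene 1 and Gene 13 pick each other, so they form the only bidirectional best hit.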

Coding (cont.)
- Compiled lists of all alignments between two genomes, then took any bidirectional hits
- Filtered for those alignments that match the desired pattern
- Have sequences for eight pairs of genomes (ranging from 4 to 82 sequences per pair)

Analysis
- Ran local alignments on the obtained sequences, using scoring matrices from the website
- Used different gap penalties
- Also tried test sequences that have been shifted by one or two frames
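
For readers unfamiliar with local alignment, here is a minimal Smith-Waterman scoring sketch with a linear gap penalty. The slides do not say which alignment program was used, and the frameshift scoring matrices are not reproduced here, so `toy_score` is a stand-in; both function names and the default gap penalty are assumptions.

```python
def local_alignment_score(a, b, score, gap=-4):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def toy_score(x, y):
    """Placeholder scoring; a real run would look up a substitution matrix."""
    return 2 if x == y else -1


for gap in (-4, -1):
    print(gap, local_alignment_score("MKPGF", "MKPWGF", toy_score, gap=gap))
```

Varying `gap` shows why the choice of penalty matters: with a cheap gap the two matching segments are joined into one alignment, while a costly gap leaves only the longer segment.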

Future Work
- So far the results have been inconclusive
- Would probably need to do a full statistical estimation of alignment scores according to the extreme value distribution
- Could also work with underlying nucleotide sequences
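
One way such an estimate might look: fit an extreme value (Gumbel) distribution to alignment scores from unrelated or shuffled sequences, then ask how unlikely an observed score is under that fit. The sketch below uses a simple method-of-moments fit; the background scores are invented, and this is an illustration of the idea rather than the procedure the authors had in mind.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Method-of-moments estimate of Gumbel location (mu) and scale (beta)."""
    beta = stdev(scores) * math.sqrt(6) / math.pi
    mu = mean(scores) - EULER_GAMMA * beta
    return mu, beta


def gumbel_pvalue(score, mu, beta):
    """Probability that a background score is at least `score` under the fit."""
    return 1.0 - math.exp(-math.exp(-(score - mu) / beta))


# Invented background scores, e.g. from alignments of shuffled sequences.
background = [10, 12, 11, 13, 9, 14, 10, 12]
mu, beta = fit_gumbel(background)
print(gumbel_pvalue(25, mu, beta))
```

A maximum-likelihood fit on a much larger background sample would be more reliable than moments on eight values; the small sample here only keeps the example short.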

Acknowledgements
- Peter Bowers
- Mike Thompson
- Todd Yeates
- Nam Tonthat
- SoCalBSI