Sequence analysis: what is a sequence? Linear arrangement of chemical subunits Contains information: 3-D arrangement determined by the sequence; 3-D defines.

1 Sequence analysis: what is a sequence?
Linear arrangement of chemical subunits
Contains information: 3-D arrangement determined by the sequence; 3-D defines function

2 Gene
Confers some trait, i.e., a unit of information
How is this information used?
–passed on to next generation
–put into a form useful for doing cellular work
Specific sequence of DNA (a molecule)
In vivo, found in one specific place

3 Nucleic acid sequences store information
Linear arrangement of chemical subunits; chemistry confers direction
3-D arrangement determined by sequence; 3-D arrangement defines function
DNA: subunits = nucleotides A, T, G, C

4 Sources of sequence information
Chemical reactions on polymers (sequential degradation into monomers)
Translation (more later)

5 Sequencing gel -- primary source of sequence data
Automated sequencing uses the Sanger method (also called the dideoxy or enzymatic method)
Relies on the enzyme DNA polymerase and chemically modified nucleotides = dideoxynucleotides (supplied as ddNTPs, incorporated as ddNMP)
When a ddNMP is added to a growing DNA chain, the chain stops; fix the dideoxynucleotide concentration so that, across the population of chains, some chain stops at every occurrence of A, T, G, or C
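
The chain-termination idea on this slide can be sketched in a few lines. This is an idealized toy model, not a description of any real sequencing software: it assumes every position yields exactly one terminated fragment, and the function names `sanger_lanes` and `read_gel` are made up for illustration.

```python
def sanger_lanes(new_strand):
    """Idealized dideoxy reactions: one lane per base, holding fragment lengths."""
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # A chain that terminates at this position has length `position`
        # and appears in the lane of the dideoxy base that stopped it.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read bands from shortest to longest fragment to recover the sequence."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_lanes("AGTCAGTC")
print(lanes)            # fragment lengths per lane
print(read_gel(lanes))  # sequence read off the idealized gel
```

Reading the bands in order of increasing fragment length is what turns four lanes of terminated chains back into a linear sequence.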

6 Diagram of sequencing gel

7 Traces of sequencing gel

8 In-class exercise I: nucleic acid polymer
1) draw chemical bonds of sequence AGTCAGTC
2) predict complementary sequence
3) sketch 3-D structure
4) view sequence of actual gene in GCG
5) view 3-D structure in file
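
Step 2 of the exercise can be checked with a short script. This is a minimal sketch assuming standard Watson-Crick pairing (A with T, G with C); the helper names are illustrative only. Because the two strands run in opposite directions, the complementary strand written 5' to 3' is the reverse complement.

```python
# Watson-Crick pairing: A<->T, G<->C
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand written in its own 5'-to-3' direction."""
    return complement(seq)[::-1]


print(complement("AGTCAGTC"))          # TCAGTCAG
print(reverse_complement("AGTCAGTC"))  # GACTGACT
```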

9 Protein sequences store information
Directional sequence of subunits = amino acids, 20 of them abbreviated as letters
Function depends on structure depends on sequence
Proteins (enzymes) do the work of life; work defined by sequence

10 A protein sequence. So what?
  1 AASXDXSLVE VHXXVFIVPP XILQAVVSIA
 31 TTRXDDXDSA AASIPMVPGW VLKQVXGSQA
 61 GSFLAIVMGG GDLEVILIXL AGYQESSIXA
 91 SRSLAASMXT TAIPSDLWGN XAXSNAAFSS
121 XEFSSXAGSV PLGFTFXEAG AKEXVIKGQI
151 TXQAXAFSLA XLXKLISAMX NAXFPAGDXX
181 XXVADIXDSH GILXXVNYTD AXIKMGIIFG
211 SGVNAAYWCD STXIADAADA GXXGGAGXMX
241 VCCXQDSFRK AFPSLPQIXY XXTLNXXSPX
271 AXKTFEKNSX AKNXGQSLRD VLMXYKXXGQ
301 XHXXXAXDFX AANVENSSYP AKIQKLPHFD
331 LRXXXDLFXG DQGIAXKTXM KXVVRRXLFL
361 IAAYAFRLVV CXIXAICQKK GYSSGHIAAX
391 GSXRDYSGFS XNSATXNXNI YGWPQSAXXS
421 KPIXITPAID GEGAAXXVIX SIASSQXXXA
451 XXSAXXA

11 Flow of molecular information

12

13 In-class exercise II: translation Given the RNA sequence UUUUGUAGACUUCAUCGACCC predict the amino acid sequence coded for.
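
A prediction for this exercise can be checked programmatically. The sketch below assumes the standard genetic code, a reading frame that starts at the first base, and no special handling of start or stop codons; `translate` is an illustrative helper, not part of any package used in the course.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate RNA in frame from its first base; '*' marks a stop codon."""
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


print(translate("UUUUGUAGACUUCAUCGACCC"))  # FCRLHRP
```

The 21 bases split into seven codons (UUU UGU AGA CUU CAU CGA CCC), giving Phe-Cys-Arg-Leu-His-Arg-Pro in this frame.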

14 In-class exercise III: protein chemistry Draw chemical bonds of protein sequence KRETWA
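
Before drawing the bonds, it may help to expand the one-letter codes. This is a small sketch using the standard one-letter and three-letter amino acid abbreviations; `spell_out` is an illustrative name. By convention the sequence is written from the N-terminus to the C-terminus, and a linear chain of n residues has n - 1 peptide bonds.

```python
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def spell_out(peptide):
    """Expand one-letter codes to three-letter codes, N- to C-terminus."""
    return "-".join(THREE_LETTER[residue] for residue in peptide)


peptide = "KRETWA"
print(spell_out(peptide))  # Lys-Arg-Glu-Thr-Trp-Ala
print(len(peptide) - 1)    # number of peptide bonds in the linear chain
```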

15 Information Theory Primer – Tom Schneider http://www-lmmb.ncifcrf.gov/~toms/paper/primer/latex/index.html

16 Evolution -- general principles
Individuals in a population of any species vary in many heritable traits
Any population of a species has the potential to produce far more offspring than the environment can support; this leads to competition.
Individuals with traits favorable to winning the competition will reproduce more, leading to higher representation of such traits in the population. (Natural selection)

17 Genetics and evolution
Evolution happens in populations, not in individuals
Variability seen in populations is a result of genetics, especially sexual recombination
Variability of populations is nonlinear

18 Molecular evolution
DNA changes lead to protein changes
Protein changes can lead to new functions
Molecular changes are linear: accumulation of mutations over time
Mixing of different forms of molecules = sexual recombination; but sexual recombination does not affect the molecules

19 Tracing Hemophilia in the Royal Houses of Europe
[Pedigree figure. Labels: Victoria, Queen of England; Prince Albert; King Edward 7; Duke Leopold; Princess Alice; Princess Beatrice; grandson Alexis, Tsarevich of Russia; Alfonso, Crown Prince of Spain; present British Royals (unaffected)]

20 Molecular evolution: what changes actually happen?
Substitutions
Deletions, insertions
Rearrangements (inversions, transpositions)
Repeats

21 Substitutions
ACCTGAACTTTACCT
ACCTGAAATTTACCT
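
The difference between the two sequences on this slide can be located with a position-by-position comparison. A minimal sketch, assuming the sequences have equal length and are already aligned; `substitutions` is an illustrative helper.

```python
def substitutions(a, b):
    """Return (1-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(substitutions("ACCTGAACTTTACCT", "ACCTGAAATTTACCT"))  # [(8, 'C', 'A')]
```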

22 Insertions/deletions
ACCTGAACTTACCT
ACCTGAAACCT
ACCTGAA---ACCT
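
The gapped line on this slide is the shorter sequence aligned against the longer one. The sketch below, with the illustrative helper `gap_runs`, finds the gap run and the bases it stands for; it assumes the alignment is given and does not compute one.

```python
import re


def gap_runs(aligned):
    """Return (start, end) of each run of '-' as 1-based inclusive positions."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"-+", aligned)]


reference = "ACCTGAACTTACCT"
aligned = "ACCTGAA---ACCT"

for start, end in gap_runs(aligned):
    # Bases present in the reference but missing from the shorter sequence
    print(start, end, reference[start - 1:end])  # 8 10 CTT

# Removing the gap characters recovers the ungapped shorter sequence
print(aligned.replace("-", ""))  # ACCTGAAACCT
```

Whether the event was a deletion in one lineage or an insertion in the other cannot be decided from the pair alone, which is why the two are often grouped as indels.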

23 Rearrangements
INVERSION:
ACCTGAACTTACCT
ACCTGAATTCACCT

24 Repeats
ACCTGAACTTACCT
ACCTGAACTTCTTACCT
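
The inversion and repeat examples (slides 23 and 24) can be reproduced as simple string operations on the segment CTT. This is a sketch with illustrative helper names and 0-based, end-exclusive coordinates. The inversion here reverses the letters, as drawn on the slide; in double-stranded DNA an inverted segment read on the same strand would appear as the reverse complement.

```python
def invert(seq, start, end):
    """Reverse the letters of seq[start:end] in place (as drawn on the slide)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


def tandem_repeat(seq, start, end):
    """Insert a second copy of seq[start:end] directly after the first."""
    return seq[:end] + seq[start:end] + seq[end:]


original = "ACCTGAACTTACCT"
print(invert(original, 7, 10))         # ACCTGAATTCACCT
print(tandem_repeat(original, 7, 10))  # ACCTGAACTTCTTACCT
```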

25 Similarity I
Quantifiable attribute: e.g., % identity or alignment score
Evolutionarily related regions will be similar in some measurable way (structure or sequence); similar regions are not necessarily evolutionarily related
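
As a concrete instance of a quantifiable attribute, percent identity can be computed from an alignment. The sketch below makes one assumption that should be stated loudly: identity is counted over all alignment columns, with gap columns counting as mismatches. Other conventions (for example, dividing by the shorter sequence length) give different numbers. `percent_identity` is an illustrative helper.

```python
def percent_identity(a, b):
    """Percent of alignment columns holding the same residue in both rows.

    Assumes a and b are rows of one alignment (equal length, '-' for gaps);
    gap columns are counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Substitution example: 14 of 15 columns match
print(round(percent_identity("ACCTGAACTTTACCT", "ACCTGAAATTTACCT"), 1))
# Deletion example: 11 of 14 columns match
print(round(percent_identity("ACCTGAACTTACCT", "ACCTGAA---ACCT"), 1))
```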

26 Similarity II High degrees of sequence similarity (30% identity) indicate evolutionary relationship; intermediate degrees of sequence similarity (15% identity) don’t necessarily; evolutionarily related molecules may show low degrees of sequence similarity (but high structural similarity)

27 Homology
Homology is the conclusion from similarity data that structures and/or sequences share a common evolutionary pathway
Divergence from a common ancestor via substitutions, deletions, insertions, etc.
Conserved regions indicate sequences/structures important to function

28 Modular nature of proteins
Many proteins are modular: some regions have one evolutionary pathway, others have another; the different regions interact to form a new function
Example: NOS

29 NOS modular structure
[Figure labels: PDZ domain; Oxygenase domain; CaM site; FMN binding domain; 45 amino acid insert; FAD binding domain; NADPH binding domain]

30 Caveats to homology
Very closely related species might not have had time to diverge; in that case high similarity doesn’t indicate importance to function
Evolutionary relationships evident in sequence give history, but are not always relevant to current function
Convergent evolution: similar form but not same pathway
Convergent evolution of active sites is common – cytochrome P450, chlorooxygenase, NOS active site
Convergent evolution of a protein-sized sequence is astronomically unlikely. We’ll get a taste of this when we do BLAST and Karlin-Altschul statistics
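
A back-of-the-envelope calculation shows why "astronomically unlikely" is a fair description. The sketch below uses a deliberately crude model that is an assumption of this example, not the Karlin-Altschul statistics mentioned on the slide: each position is drawn independently and uniformly from 20 amino acids, so two unrelated sequences of length n agree at every position with probability (1/20)^n.

```python
import math


def log10_match_probability(length, alphabet_size=20):
    """log10 of the chance that two random sequences agree at every position.

    Crude model: positions are independent and residues are equally likely.
    """
    return -length * math.log10(alphabet_size)


for n in (10, 100, 300):
    print(n, round(log10_match_probability(n), 1))
```

For a 100-residue protein the exponent is about -130, far beyond anything that chance alone could explain, whereas a short active-site motif of a few residues is within reach of independent origins.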

