Signals in Sequences: The number of sequences available for analysis is rapidly approaching infinity. We need new ways to look at all this information.


1 Signals in Sequences The number of sequences available for analysis is rapidly approaching infinity. We need new ways to look at all this information.

2 Rule 1 First rule of sequence analysis: If a residue is conserved, it is important.

3 Rule 2 Second rule of sequence analysis: If a residue is very conserved, it is very important.

4 GPCR Project GPCR is THE drug target. Lots of data available. You have ~630 GPCRs. Little structure data. 2000 sequences known. ‘Easy’ to align.

5 The GPCR (rhodopsin)

6 1 conserved aa / helix!

7 Laerte about modelling: “Use the sequence, Luke”

8 Conserved, CMA, variable
QWERTYASDFGRGH
QWERTYASDTHRPM
QWERTNMKDFGRKC
QWERTNMKDTHRVW
Black = conserved; White = variable; Green = correlated mutations (CMA)
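The three classes on this slide can be illustrated with a toy classifier. This is only a sketch under stated assumptions: the function names are hypothetical, and "correlated" is simplified here to mean that a non-conserved column shares its exact grouping of sequences with at least one other column. Real correlated mutation analysis uses statistical scores over many sequences, not exact pattern matching.

```python
from collections import defaultdict

# The four toy sequences from the slide.
alignment = [
    "QWERTYASDFGRGH",
    "QWERTYASDTHRPM",
    "QWERTNMKDFGRKC",
    "QWERTNMKDTHRVW",
]

def column_pattern(column):
    """Relabel residues by order of first appearance, e.g. 'YYNN' -> (0, 0, 1, 1)."""
    labels = {}
    return tuple(labels.setdefault(res, len(labels)) for res in column)

def classify_columns(seqs):
    """Label each column 'conserved', 'correlated' or 'variable' (toy rule, an assumption)."""
    columns = ["".join(col) for col in zip(*seqs)]
    patterns = [column_pattern(col) for col in columns]

    # Collect the columns that share the same grouping of sequences.
    by_pattern = defaultdict(list)
    for idx, pat in enumerate(patterns):
        by_pattern[pat].append(idx)

    classes = []
    for col, pat in zip(columns, patterns):
        n_types = len(set(col))
        if n_types == 1:
            classes.append("conserved")
        elif n_types < len(col) and len(by_pattern[pat]) > 1:
            # Not every sequence differs, and another column mutates in step.
            classes.append("correlated")
        else:
            classes.append("variable")
    return classes

for pos, label in enumerate(classify_columns(alignment), start=1):
    print(pos, label)
```

With only four sequences, any shared pattern may be coincidental, which is one reason real analyses need large, well-chosen alignments.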

9 CMA and tree
1 ASASDFDFGHKM
2 ASASDFDFRRRL
3 ASLPDFLPGHSI
4 ASLPDFLPRRRV

10 CMA versus tree
1 ASASDFDFGHKMGHS
2 ASASDFDFRRRLRHS
3 ASLPDFLPGHSIGHS
4 ASLPDFLPRRRVRIT
5 ASASDFDFRRRLRIT
6 ASLPDFLPGHSIGIT
Red: 1,2,5 vs 3,4,6
Black: 1,3,6 vs 2,4,5
Yellow: 1,2,3 vs 4,5,6
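One possible reading of this slide is that the three groups of correlated positions divide the six sequences in ways that no single tree can reproduce. The sketch below, with hypothetical function names, extracts every two-way split from the alignment columns and applies the standard split-compatibility test: two splits fit on the same tree only if at least one of the four intersections between their sides is empty.

```python
from itertools import combinations

# The six toy sequences from the slide, keyed by their slide numbers.
alignment = {
    1: "ASASDFDFGHKMGHS",
    2: "ASASDFDFRRRLRHS",
    3: "ASLPDFLPGHSIGHS",
    4: "ASLPDFLPRRRVRIT",
    5: "ASASDFDFRRRLRIT",
    6: "ASLPDFLPGHSIGIT",
}

def two_way_splits(seqs):
    """Map each two-way split of the sequence ids to the 1-based columns that support it."""
    ids = sorted(seqs)
    splits = {}
    for pos in range(len(seqs[ids[0]])):
        groups = {}
        for sid in ids:
            groups.setdefault(seqs[sid][pos], set()).add(sid)
        if len(groups) == 2:  # columns with exactly two residue types
            side_a, side_b = sorted(groups.values(), key=min)
            key = (frozenset(side_a), frozenset(side_b))
            splits.setdefault(key, []).append(pos + 1)
    return splits

def compatible(split_x, split_y):
    """True if both splits can sit on one tree (some pair of sides does not overlap)."""
    return any(not (a & b) for a in split_x for b in split_y)

splits = two_way_splits(alignment)
for (side_a, side_b), cols in splits.items():
    print(sorted(side_a), "vs", sorted(side_b), "columns", cols)

for x, y in combinations(splits, 2):
    print("compatible" if compatible(x, y) else "incompatible")
```

For this toy alignment the code recovers the Red, Black and Yellow splits listed on the slide, and every pair of them fails the compatibility test.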

11 CMA on GPCR

12 CMA on GPCR

13 Florence Horn

14 Class B Ligands

15 Class B – ligand docking

16 G protein-coupling?

17 Sequence Signals Three classes of residues 1) Conserved 2) CMA 3) Variable

18 Conservation Artefacts Conservation can result from: not enough sequences; too conserved sequences; over-alignment.

19 Variability Artefacts Variability can result from: wrong sequence choice; variable loops; alignment errors.

20 CMA Artefacts CMA can result from: wrong sequence choice; poor sequence homogeneity; over-fitting.

21 Recalcitrant residues

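22 Sequence Entropy
$$E = -\sum_{i=1}^{20} p_i \ln(p_i)$$
where $p_i$ is the fraction of sequences that have residue type $i$ at the alignment position.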
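As a minimal sketch, the entropy of a single alignment column can be computed as the Shannon entropy over the residue types observed in it. The function name is hypothetical, and skipping gap characters ("-") is an assumption; the slide does not say how gaps are treated.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (natural log) of the residue types in one alignment column."""
    residues = [r for r in column if r != "-"]  # assumption: gaps are ignored
    total = len(residues)
    if total == 0:
        return 0.0
    counts = Counter(residues)
    # Residue types that never occur contribute nothing, so only observed types are summed.
    return sum(-(n / total) * math.log(n / total) for n in counts.values())

print(column_entropy("AAAA"))                  # fully conserved column
print(column_entropy("YYNN"))                  # two equally frequent residue types
print(column_entropy("ACDEFGHIKLMNPQRSTVWY"))  # all 20 types, equally frequent
```

A fully conserved column gives zero, and the maximum, reached when all 20 residue types are equally frequent, is ln(20).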

23 Sequence Variability Sequence variability is the number of residue types that are present at a position in more than 0.5% of all sequences.
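The definition above can be sketched as a simple count per alignment column. The function and parameter names are hypothetical, and the example column is made-up illustrative data, not taken from the GPCR alignment.

```python
from collections import Counter

def column_variability(column, min_fraction=0.005):
    """Count the residue types seen in more than min_fraction of the sequences."""
    total = len(column)
    counts = Counter(column)
    return sum(1 for n in counts.values() if n / total > min_fraction)

# Illustrative column from 1000 sequences: A (60%), G (39%), S (0.6%), T (0.4%).
column = "A" * 600 + "G" * 390 + "S" * 6 + "T" * 4
print(column_variability(column))  # T falls below the 0.5% cut-off and is not counted
```

The cut-off keeps a single sequencing or alignment error in a large alignment from inflating the variability of an otherwise conserved position.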

24 Entropy - Variability Entropy = Information Variability = Chaos

25 Entropy - Variability Variability is the result of evolution. Entropy is the protein’s brake on evolutionary speed.

26 GPCR Entropy - Variability 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

27 GPCR Location 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

28 Ras Entropy - Variability

29 Ras Location 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

30 Protease Entropy - Variability

31 Protease Location 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

32 Globin Entropy - Variability GPCR

33 Globin Location 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

34 GPCR Again….

35 GPCR Location (Again) 11 Red 12 Orange 22 Yellow 23 Green 33 Blue

36 GPCR signaling 11 Purple 12 Red 22 ‘Yellow’ 23 Green 33 Blue

37 Summary Given infinitely many sequences: every residue’s role is known; signaling paths are detectable. So, sequences contain many signals.

38 Thanks to: Laerte Oliveira (Sao Paulo), Wilma Kuipers (Weesp), Florence Horn (San Francisco), Bob Bywater (Copenhagen), Nora vd Wenden (The Hague), Mike Singer (New Haven), Ad IJzerman (Leiden), Margot Beukers (Leiden), Amos Bairoch (Geneva), Fabien Campagne (New York)

