What can (many) sequences tell us?


1 What can (many) sequences tell us?

2 Nuclear receptor function

3 Nuclear receptor family
NR1C1-PPAR NR1C2-PPAS NR1C3-PPAT NR1D1-EAR1 NR1D2-BD73 NR1I3-MB67 NR1I4-CAR1-MOUSE- NR1H2-NER NR1H3-LXR NR1H4-FAR NR4A2-NOT NR4A3-NOR1 NR4A1-NGFI NR2F1-COTF NR2F2-ARP1 NR2F6-EAR2 NR2E3-PNR NR2B1-RRXA NR2B2-RRXB NR2A2-HN4G NR3C1-GCR NR3C4-ANDR NR3C3-PRGR NR3A1-ESTR NR3A2-ERBT NR3B1-ERR1 NR3B2-ERR2 NR5A1-SF1 NR5A2-FTF NR1I1-VDR NR1B3-RRG1 NR2E1-TLX NR2C1-TR2-11 NR2C2-TR4 NR6A1-GCNF NR2B3-RRXG NR2A1-HNF4 NR2A5-HN4 d? NR0B1-DAX1 NR0B2-SHP NR3C2-MCR NR1F3-RORG NR1F2-RORB NR1F1-ROR1 NR1A2-THB1 NR1A1-THA1 NR1I2-PXR NR1B2-RRB2 NR1B1-RRA1

4 Nuclear receptor structure
Domains: A-B (AF-1), C (DNA), D, E (LBD), F
C: DNA binding domain, highly conserved, > 90% similarity
E: Ligand binding domain, conserved protein fold, > 20% sequence similarity

5 The questions
As Organon is paying the bills, question one is, of course ☺, how do ligands relate to activity?
NRs can bind co-activators and co-repressors both with and without a ligand present, so what is an agonist, an antagonist, or an inverse agonist?
What is the role of each amino acid in the NR LBD?
Which data handling is needed to answer these questions?

6 3D structure LBD (hER)

7 Available NR data
56 structures in the PDB (>200 now)
>500 sequences (scattered) (>1500 now)
>1000 mutations (very scattered)
>10000 ligand-binding studies (secret)
Disease patterns, expression, >1000 SNPs, genetic localization, etc.
This data must be integrated, sorted, combined, validated, understood, and used to answer our questions.

8 Step 1 The first important step is a common numbering scheme.
Whoever solves that problem once and for all should get three Nobel prizes.
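One simple way to think about a common numbering scheme is to use the columns of a multiple sequence alignment as the shared numbers and to translate each receptor's own residue numbers to and from those columns. The sketch below assumes exactly that; the function name, the gap character and the toy sequences are hypothetical and are not taken from the slides.

```python
def column_to_residue(aligned_seq, first_residue_number=1, gap="-"):
    """Map 1-based alignment columns to a sequence's own residue numbers.

    Assumption: the alignment column index serves as the common number.
    Gap positions get no entry because the sequence has no residue there.
    """
    mapping = {}
    residue_number = first_residue_number
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol != gap:
            mapping[column] = residue_number
            residue_number += 1
    return mapping


# Hypothetical toy alignment of two receptors with different native numbering.
toy_alignment = {
    "receptor_a": ("MK-TLV", 1),    # native numbering starts at 1
    "receptor_b": ("MKQT-V", 100),  # native numbering starts at 100
}

common_numbering = {
    name: column_to_residue(seq, first)
    for name, (seq, first) in toy_alignment.items()
}

# Residues that share common number 4 in both receptors:
for name, mapping in common_numbering.items():
    print(name, "column 4 -> native residue", mapping.get(4))
```

With such a table, mutation or ligand-contact data reported in native numbering can be compared across receptors by looking up the shared column.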

9 Large data volumes
Large data volumes allow us to develop new data analysis techniques. Entropy-variability analysis is a novel technique for examining very large multiple sequence alignments. It requires ‘better’ alignments than are routinely obtained with ‘standard’ multiple sequence alignment programs.

10 Part of the big alignment

11 Vriend’s first rule of sequence analysis
If it is conserved, it is important

12 Vriend’s second rule of sequence analysis
If it is very conserved, it is very important

13 What is CMA?
QWERTYASDFGRGH
QWERTYASDTHRPM
QWERTNMKDFGRKC
QWERTNMKDTHRVW
Red = conserved, Green = variable, Blue = correlated
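The three colour classes can be illustrated on the four toy sequences from the slide. The sketch below is a deliberately simple criterion, not the actual CMA method: a column is called conserved if it holds one residue type, correlated if it varies and splits the sequences into the same groups as at least one other column, and variable otherwise. The function names and the rule that ignores columns in which every sequence differs are assumptions made for this illustration.

```python
from collections import Counter


def pattern(column):
    """Relabel residues by order of first appearance, e.g. 'YYNN' -> (0, 0, 1, 1)."""
    seen = {}
    return tuple(seen.setdefault(res, len(seen)) for res in column)


def classify_columns(sequences):
    """Label each alignment column as conserved, correlated or variable."""
    columns = ["".join(col) for col in zip(*sequences)]
    patterns = [pattern(col) for col in columns]
    pattern_counts = Counter(patterns)
    labels = []
    for col, pat in zip(columns, patterns):
        n_types = len(set(col))
        if n_types == 1:
            labels.append("conserved")
        elif n_types < len(col) and pattern_counts[pat] > 1:
            # Varies, and another column groups the sequences identically.
            # Columns where every sequence differs carry no grouping signal.
            labels.append("correlated")
        else:
            labels.append("variable")
    return labels


sequences = [
    "QWERTYASDFGRGH",
    "QWERTYASDTHRPM",
    "QWERTNMKDFGRKC",
    "QWERTNMKDTHRVW",
]

for position, label in enumerate(classify_columns(sequences), start=1):
    print(position, label)
```

On this input, positions 1 to 5, 9 and 12 come out as conserved, positions 6 to 8 and 10 to 11 as correlated, and positions 13 and 14 as variable.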

14 Wilma Kuipers Thesis

15 Correlation analysis
Correlate sequences with ligand binding affinities.
Alignments showed 100% correlation between affinity for pindolol and the absence/presence of Asn386.
Obviously, Asn386 plays an important role in ligand binding.
[Table: receptors 1 to 24 ..., with rows for affinity (+/-) and the residue at position 386 (N, T, A, V, L, Y); 1 = 5HT-1a, 2 = 5HT-1b, 3 = 5HT-1d, ....]
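The kind of check described here can be sketched as a test for whether the presence of one residue type at an aligned position coincides exactly with binding. The function name is hypothetical, and the example values are made-up placeholders, not the receptor data from the slide.

```python
def perfectly_correlated(residues, binds, residue="N"):
    """Return True if having `residue` at the position coincides exactly with binding.

    residues: one-letter residue at the aligned position, one per receptor
    binds:    True/False per receptor, in the same order
    """
    if len(residues) != len(binds):
        raise ValueError("residues and binds must have the same length")
    return all((res == residue) == bound for res, bound in zip(residues, binds))


# Made-up placeholder data for five hypothetical receptors (NOT the slide's values).
toy_residues = ["N", "N", "T", "A", "V"]
toy_binds = [True, True, False, False, False]

print(perfectly_correlated(toy_residues, toy_binds))
```

A single receptor that binds without the residue, or carries the residue without binding, makes the function return False.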

16 Wilma Kuipers Thesis

17 Wilma Kuipers Thesis

18 Entropy
Sequence entropy E_i at position i is calculated from the frequencies p_{i,a} of the twenty amino acid types a at position i:
E_i = -\sum_{a=1}^{20} p_{i,a} \ln p_{i,a}
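As a minimal sketch, the entropy of one alignment column can be computed as the standard Shannon entropy with the natural logarithm. The function name is hypothetical, and the sketch assumes the column contains only amino acid letters (no gap characters).

```python
import math
from collections import Counter


def sequence_entropy(column):
    """Shannon entropy (natural log) of the residue types in one alignment column.

    Assumption: `column` holds one amino acid letter per sequence, without gaps.
    Residue types that do not occur contribute nothing to the sum.
    """
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


print(sequence_entropy("AAAA"))  # fully conserved column
print(sequence_entropy("AACC"))  # two equally frequent residue types
```

A fully conserved column gives zero, and twenty equally frequent residue types give the maximum, ln 20.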

19 Variability Sequence variability Vi is the number of amino acid types observed at position i in more than 0.5% of all sequences.
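The definition above translates almost directly into code. In this sketch the function name and the `min_fraction` parameter are hypothetical; the 0.5% threshold is the one stated on the slide, and the column is assumed to contain no gap characters.

```python
from collections import Counter


def sequence_variability(column, min_fraction=0.005):
    """Count residue types seen in more than `min_fraction` of the sequences.

    Assumption: `column` holds one amino acid letter per sequence, without gaps.
    """
    counts = Counter(column)
    total = len(column)
    return sum(1 for n in counts.values() if n / total > min_fraction)


# 1000 sequences: A in 99%, C in 0.9%, D in 0.1% (below the 0.5% threshold).
column = "A" * 990 + "C" * 9 + "D" * 1
print(sequence_variability(column))
```

The threshold keeps rare residue types, which may stem from sequencing or alignment errors, from inflating the count.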

20 Ras Entropy-Variability
11 Red 12 Orange 22 Yellow 23 Green 33 Blue

21 Protease Entropy-Variability
11 Red 12 Orange 22 Yellow 23 Green 33 Blue

22 Globin Entropy-Variability
11 Red 12 Orange 22 Yellow 23 Green 33 Blue

23 GPCR Entropy-Variability; signalling path
11 G protein 12 Support 22 Signaling 23 Ligand in 33 Ligand out

24 NR LBD Entropy-Variability
11 main function 12 first shell around main function 22 core residues (signal transduction) 23 modulator 33 mainly surface

25 Mutation data
1095 entries, 41 receptors, 12 species, 3D numbers, 7 sources

26 Mutation data

27 Mutation data

28 Ligand binding data
Ligand-binding positions extracted from PDB files (nomenclature)
Categorized from ‘very frequent’ to ‘not so frequent’ binder
Type of ligand (agonist/antagonist=inverse agonist…)

29 Ligand-binding residues
LIG 1 more than 50 of 56 LIG of 56 LIG of 56 LIG out of 56 H-bonds (~35,15,15,15)

30 Example: role of Asp 351 agonist antagonist

31 Ligand, cofactor and dimerization data combined with entropy-variability analysis

32 Conclusions:
Data is difficult, but we need it (sic); life would be so nice if we could do without it.
PDB files are the worst.
Nomenclature is not homogeneous. Ontologies….
Much data has been carefully hidden in the literature, where it can only be found with great difficulty.
Residue numbering is difficult but very necessary.
Variability-entropy analysis is powerful, but requires very 'good' alignments.

33 A short break for a word from our sponsors
Laerte Oliveira, Adje, Margot, Florence Horn
Our industrial sponsor:
Wilma Kuipers (Weesp), Bob Bywater (Copenhagen), Nora vd Wenden (The Hague), Mike Singer (New Haven), Ad IJzerman (Leiden), Margot Beukers (Leiden), Fabien Campagne (New York), Øyvind Edvardsen (Tromsø), Simon Folkertsma (Frisia), Henk-Jan Joosten (Wageningen), Joost van Durma (Brussels), David Lutje Hulsik (Utrecht), Tim Hulsen (Goffert), Manu Bettler (Lyon)
David, Tim, Elmar Krieger, Fabien, Manu, Simon Folkertsma

