Computational systems biology: from the generation of testable hypotheses to uncovering organizing principles in living systems


1 Computational systems biology: from the generation of testable hypotheses to uncovering organizing principles in living systems
M. Madan Babu, PhD
NCBI, NLM, National Institutes of Health

2 Explosion of information about living systems
- Sequence: 45,000,000 sequences from 160,000 organisms (EBI, NCBI)
- Structure: 33,000 structures from 300 organisms (PDB, MSD)
- Expression: 5,000 different conditions, 20 organisms (ArrayExpress, SMD, GEO)
- Interaction: 100,000 interactions, 5 organisms (BIND, DIP, publications)
Major challenge: integration of information
- Generate experimentally testable hypotheses
- Uncover organizing principles of specific processes at the systems level

3 Regulation in Biological Systems
- Introduction to transcriptional regulatory networks
- Integration of data to generate testable hypotheses: discovery of sequence-specific transcription factors in the malarial parasite (sequence, structure, expression and interaction data provide convincing support)
- Integration of data to uncover general organizing principles: integration of gene expression data reveals dynamics in transcriptional networks

4 [Figure: life cycle, Mosquito, Human (Liver, RBC), Mosquito; source www.cdc.gov]
- 5300 genes with over 700 metabolic enzymes
- Extensive complement of chromosomal regulatory proteins
- Extensive complement of signaling proteins (GTPases, kinases)
- Large number of genes and a complex life cycle: genes need to be regulated
- Previous comparative genomic analysis of eukaryotes suggested a lack of detectable transcription factors in Plasmodium

5 Possible explanations for the paradoxical observation
- Alternative regulatory mechanisms: chromatin-level regulation, post-translational modification, RNA-based regulation
- Undetected transcription factors: distantly related or unrelated to known DNA-binding domains
- Search: proteome of Plasmodium + profiles & HMMs of known DBDs (bZIP, Homeo, MADS, AT-hook, Forkhead, ARID)
- Hit: PF14_0633, with an AT-hook, SEG, and an uncharacterized globular domain of ~60 aa

6 Characterization of the globular domain – sequence analysis I
- Globular domain searched against the non-redundant database: lineage-specific expansion in Apicomplexa (Plasmodium falciparum, Plasmodium vivax, Cryptosporidium parvum, Cryptosporidium hominis, Theileria annulata)
- Profiles + HMM of this region searched against the non-redundant database: Floral homeotic protein Q (Triticum); 49L, an endonuclease (X. oryzae phage Xp10)
- Globular region maps to the AP2 DNA-binding domain

7 Characterization of the globular domain – sequence analysis II
- AP2 DNA-binding domain from D. psychrophila DP2593 searched against the non-redundant database: MAL6P1.287 (Plasmodium falciparum), Cgd6_1140/Chro.60146 (Cryptosporidium)
- AP2 DNA-binding domain maps to the globular region
- Multiple sequence alignment of all globular domains + JPRED/PHD: the sequence of secondary structure elements (S1 S2 S3 H1) is similar to that of the AP2 DNA-binding domain
- Homologs of the conserved globular domain constitute a novel family of the AP2 DNA-binding domain

8 Characterization of the globular domain – structural analysis I
- A. thaliana ethylene response factor (ATERF1, 1gcc, NMR structure): binds GC-rich sequences
- Predicted SS of ApiAP2: S1 S2 S3 H1; SS of ATERF1: S1 S2 S3 H1
- 12 residues show a strong pattern of conservation; they are involved in key stabilizing hydrophobic interactions that determine the path of the backbone in the three strands and the helix of the AP2 domain
- The core fold of the ApiAP2 domain will be similar to that of the plant AP2 DNA-binding domain

9 Characterization of the globular domain – structural analysis II
[Figure: protein-DNA contacts; residues Y186, R147, K156, T175, R170, W154, R152, R150, E160, W172, W162; bases G5, C6, C7, G17, G18, G20, G21; strands S2, S3]
- R152 --- G5 (oxo group) corresponds to D/N --- A (amino group)
- R150 --- G20 (oxo group) corresponds to S/T --- A (amino group)
- Changes in base-contacting residues suggest binding to AT-rich sequences
- Charged residues in the insert may contact multiple phosphate groups to provide affinity
- The ApiAP2 domain binds DNA in a sequence-specific manner

10 Characterization of the globular domain – expression analysis I
[Figure: complex life cycle, Mosquito, Human (Liver, RBC), Mosquito; source www.cdc.gov]
- Intra-erythrocyte developmental cycle: RBC infection & merozoite burst
- DeRisi Lab: mRNA expression profiling using microarrays (sorbitol synchronization)

11 Characterization of the globular domain – expression analysis II
[Figure: expression profiles over time points 0 to 46; panels: co-expressed genes, average expression profile of all genes, 22 transcription factors; stages: Ring, Trophozoite, Early Schizont, Schizont]
- Striking expression patterns in specific developmental stages suggest that they could mediate transcriptional regulation of stage-specific genes

12 Characterization of the globular domain – interaction analysis I
- Protein interaction network of P. falciparum: 1267 proteins, 2846 physical interactions (LaCount et al., Nature, 2005)
- Modified Y2H: Gal4 DBD + protein + auxotrophic gene
- RNA isolated from mixed stages of the intra-erythrocyte developmental cycle
- Guilt by association: the function of interacting neighbors provides clues about the function of the protein

13 Characterization of the globular domain – interaction analysis II
- Protein interaction network of P. falciparum; network of ApiAP2 proteins: 97 interactions, 93 proteins
- ApiAP2 proteins (13): MAL8P1.153 (ES), PFD0985w (S), PF10_0075 (T), PF07_0126 (R)
- Figure labels: chromatin proteins (8), Gcn5, nucleosome assembly, HMG protein, glycolytic enzymes, antigenic proteins, 50% hypothetical, have a PPint domain
- Guilt by association supports a role for ApiAP2 proteins in the regulation of gene expression

14 Conclusion - I
- Sequence, Structure, Expression, Interaction
- Integration of different types of experimental data allowed us to discover potential transcription factors in the Plasmodium genome
- Integration of data can generate experimentally testable hypotheses
Balaji S, Madan Babu M, Lakshminarayan Iyer, Aravind L. Nucleic Acids Research (2005)

15 Regulation in Biological Systems
- Introduction to biological networks & transcriptional regulatory networks
- Integration of data to generate testable hypotheses: discovery of sequence-specific transcription factors in the malarial parasite (sequence, structure, expression and interaction data provide convincing support)
- Integration of data to uncover general organizing principles: integration of gene expression data reveals dynamics in transcriptional networks

16 Networks in Biology
Network              Nodes                               Links                        Interaction
Protein interaction  Proteins                            Physical interaction         Protein-Protein
Metabolic            Metabolites                         Enzymatic conversion         Protein-Metabolite
Transcriptional      Transcription factor, target genes  Transcriptional interaction  Protein-DNA

17 Structure of the transcriptional regulatory network
- Basic unit (components): transcriptional interaction between a transcription factor and a target gene
- Motifs (local level): patterns of interconnections (Uri Alon & Rick Young)
- Scale-free network (global level): all transcriptional interactions in a cell (Albert & Barabasi)
Madan Babu M, Luscombe N, Aravind L, Gerstein M & Teichmann SA. Current Opinion in Structural Biology (2004)

18 Properties of transcriptional networks
- Local level: transcriptional networks are made up of motifs, which perform information-processing tasks
- Global level: transcriptional networks are scale-free, which confers robustness to the system

19 Transcriptional networks are made up of motifs
Network motif: “Patterns of interconnections that recur at different parts and with specific information processing task”
Single input motif (example: ArgR; ArgD, ArgE, ArgF)
- Co-ordinates expression
- Enforces order in expression
- Quicker response
Multiple input motif (example: TrpR, TyrR; AroM, AroL)
- Integrates different signals
- Quicker response
Feed-forward motif (example: Crp, AraC, AraBAD)
- Responds to persistent signal
- Filters noise
Shen-Orr et al., Nature Genetics (2002) & Lee et al., Science (2002)
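
As a rough illustration of how a feed-forward motif can be located in a regulatory network, the sketch below scans a list of (regulator, target) pairs for triples in which X regulates Y, Y regulates Z, and X also regulates Z directly. The function name `find_feed_forward` and the edge list are assumptions made for this example; the edges simply encode the Crp/AraC/AraBAD and ArgR examples named on the slide, and this is not the detection procedure used in the cited papers.

```python
from itertools import permutations

def find_feed_forward(edges):
    """Return sorted (x, y, z) triples where x->y, y->z and x->z are all edges."""
    edge_set = set(edges)
    nodes = {n for e in edge_set for n in e}
    hits = []
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            hits.append((x, y, z))
    return sorted(hits)

# Illustrative edges only (assumed): the feed-forward example from the slide
# plus the single input example (ArgR regulating ArgD, ArgE, ArgF).
edges = [
    ("Crp", "AraC"), ("AraC", "AraBAD"), ("Crp", "AraBAD"),
    ("ArgR", "ArgD"), ("ArgR", "ArgE"), ("ArgR", "ArgF"),
]
ffl = find_feed_forward(edges)
print(ffl)
```

Checking every ordered triple is cubic in the number of nodes, which is fine for a toy network but would need a neighbour-based search for a genome-scale one.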

20 Transcriptional networks are scale-free
- Scale-free structure: $N(k) \propto 1/k^{\gamma}$, i.e. the presence of few nodes with many links and many nodes with few links
- Scale-free structure provides robustness to the system
Albert & Barabasi, Rev Mod Phys (2002)
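
The quantity N(k) on this slide is the number of nodes that have exactly k links. A minimal sketch of how it could be tabulated from an edge list is shown here; the function name `degree_distribution` and the toy network are assumptions for illustration, and the code only counts degrees, it does not fit a power law.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree k to N(k), the number of nodes with exactly k links."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(Counter(degree.values()))

# Toy network (assumed): one hub linked to five nodes, plus one extra link.
edges = [("hub", f"g{i}") for i in range(5)] + [("g0", "g1")]
dist = degree_distribution(edges)
print(sorted(dist.items()))  # few nodes with many links, many nodes with few
```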

21 Scale-free networks exhibit robustness
- Robustness: the ability of complex systems to maintain their function even when the structure of the system changes significantly
- Tolerant to random removal of nodes (mutations)
- Vulnerable to targeted attack on hubs (mutations): drug targets
- Hubs are crucial components in such networks
Haiyuan Yu et al., Trends in Genetics (2004)
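
The contrast between removing an arbitrary node and removing a hub can be made concrete with a small sketch. It measures the size of the largest connected component that survives a deletion, which is one common proxy for whether a network still holds together; that choice of measure, the function name `largest_component_size`, and the star-shaped toy network are all assumptions for illustration.

```python
def largest_component_size(nodes, edges, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    alive = set(nodes) - set(removed)
    neighbours = {n: set() for n in alive}
    for a, b in edges:
        if a in alive and b in alive:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        # Depth-first walk over one component, counting its nodes.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        best = max(best, size)
    return best

# Toy network (assumed): a single hub connected to six peripheral nodes.
nodes = ["hub"] + [f"g{i}" for i in range(6)]
edges = [("hub", f"g{i}") for i in range(6)]
after_leaf = largest_component_size(nodes, edges, removed=["g0"])  # 6
after_hub = largest_component_size(nodes, edges, removed=["hub"])  # 1
print(after_leaf, after_hub)
```

Losing a peripheral node leaves the rest connected, while losing the hub fragments the toy network completely.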

22 Summary I - Introduction
- Transcriptional networks are made up of motifs that have specific information-processing tasks
- Transcriptional networks are scale-free, which confers robustness to such systems, with hubs assuming importance
Madan Babu M, Luscombe N et al. Current Opinion in Structural Biology (2004)

23 Dynamic nature of the regulatory network in yeast
- Static network: across all cellular conditions
- Conditions: cell cycle, sporulation, stress
- Are there differences in the sub-networks under different conditions?
- How are the networks used under different conditions?

24 Dataset - gene regulatory network in yeast
- Total: 3,962 genes (142 TFs + 3,820 TGs), 7,074 regulatory interactions
- Individual experiments: TRANSFAC DB (288 genes, 356 interactions) + Kepes dataset (477 genes, 906 interactions)
- ChIP-chip experiments: Snyder lab (1,560 genes, 2,124 interactions) + Young lab (2,416 genes, 4,358 interactions)

25 Integrating gene regulatory network with expression data
- Gene regulatory network: 142 TFs, 3,820 TGs, 7,074 interactions
- Gene expression data for 5 cellular conditions: cell cycle, sporulation, DNA damage, diauxic shift, stress
- After integration: 142 TFs, 1,808 TGs, 4,066 interactions
[Figure legend: transcription factors and target genes in 1, 2, 3, 4 or 5 conditions]

26 Back-tracking method to find active sub-networks
- Start from the gene regulatory network
- Identify differentially regulated genes
- Find TFs that regulate the genes
- Find TFs that regulate these TFs
- Result: active sub-network
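
The back-tracking steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `active_subnetwork` and the toy edges are hypothetical, and the sketch repeats the "find TFs that regulate these TFs" step until no new TFs appear, whereas the slide lists the upstream search for two levels only.

```python
def active_subnetwork(edges, diff_genes):
    """Back-track from differentially regulated genes to their upstream TFs.

    edges: iterable of (tf, target) pairs from the full regulatory network.
    diff_genes: genes found to be differentially regulated in one condition.
    Returns the set of (tf, target) edges that make up the active sub-network.
    """
    regulators = {}
    for tf, target in edges:
        regulators.setdefault(target, set()).add(tf)

    active_nodes = set(diff_genes)
    frontier = set(diff_genes)
    active_edges = set()
    while frontier:
        next_frontier = set()
        for gene in frontier:
            for tf in regulators.get(gene, ()):
                active_edges.add((tf, gene))
                if tf not in active_nodes:  # visit each TF once (assumed rule)
                    active_nodes.add(tf)
                    next_frontier.add(tf)
        frontier = next_frontier
    return active_edges

# Toy network (assumed names): TF1 -> TF2 -> geneA, TF2 -> geneC, TF3 -> geneB
network = [("TF1", "TF2"), ("TF2", "geneA"), ("TF2", "geneC"), ("TF3", "geneB")]
sub = active_subnetwork(network, {"geneA"})
print(sorted(sub))
```

Only edges that lie upstream of the differentially regulated gene are kept, so TF3 and geneC drop out of the toy result.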

27 Active sub-networks: how different are they?
- Binary processes: DNA damage, diauxic shift, stress
- Multi-stage processes: cell cycle, sporulation

28 Network motifs (Milo et al., 2002; Lee et al., 2002)
- Single input motif (SIM): 23%
- Feed-forward motif (FF): 27%
- Multi-input motif (MIM): 50%

29 Sub-networks: Network motifs
- Network motifs are used preferentially in the different cellular conditions

30 Condition-specific networks are scale-free
[Figure panels: cell cycle, sporulation, diauxic shift, DNA damage, stress]
- Do different proteins become hubs under different conditions?
- Is it the same protein that acts as a regulatory hub?

31 Regulatory hubs change with conditions
- Cluster TFs according to the number of target genes active in each condition
- Different TFs become key regulators in different conditions
[Table residue: columns CC, SP, DS, DD, SR, TF; example row 250, 45, 20, 30, 15, Swi6]

32 Hubs regulate other hubs to initiate cellular events
- Suggests a structure which transfers weight between hubs to trigger cellular events

33 Network Parameters
- Connectivity
- Path length
- Clustering coefficient

34 Network Parameters - Connectivity
- Outgoing connections = 49.8: on average, each TF regulates ~50 genes. Changes between conditions
- Incoming connections = 2.1: on average, each gene is regulated by ~2 TFs. Remains constant
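
A minimal sketch of the two connectivity averages, assuming the network is stored as (TF, target) pairs: the outgoing average divides the number of interactions by the number of TFs, and the incoming average divides it by the number of genes that have at least one regulator. The function name `average_connectivity` and the toy edges are assumptions. As a sanity check, the totals quoted elsewhere in the deck (7,074 interactions, 142 TFs) reproduce the 49.8 outgoing figure; the incoming figure depends on how target genes are counted and is not recomputed here.

```python
def average_connectivity(edges):
    """Return (mean outgoing links per TF, mean incoming links per target)."""
    unique = set(edges)                      # ignore duplicated interactions
    tfs = {tf for tf, _ in unique}
    targets = {tg for _, tg in unique}       # genes with at least one regulator
    return len(unique) / len(tfs), len(unique) / len(targets)

# Toy network (assumed): T1 regulates a, b, c; T2 regulates c.
toy_edges = [("T1", "a"), ("T1", "b"), ("T1", "c"), ("T2", "c")]
avg_out, avg_in = average_connectivity(toy_edges)
print(avg_out, avg_in)

# Same arithmetic on the totals quoted in the deck: 7,074 interactions, 142 TFs.
deck_avg_out = round(7074 / 142, 1)
print(deck_avg_out)
```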

35 Network parameters: Connectivity
- “Binary conditions” → greater connectivity. Binary: quick, large-scale turnover of genes
- “Multi-stage conditions” → lower connectivity. Multi-stage: controlled, ticking over of genes at different stages

36 Network Parameters – Path length
- Path length: number of intermediate TFs between the starting TF and the final target; 1 intermediate TF = 1
- Indication of how immediate a regulatory response is
- Average path length = 4.7
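
The path-length definition on this slide can be sketched with a breadth-first search over directed (TF, target) pairs. The reading used here, that a direct TF-to-target link counts as 0 and each intermediate TF adds 1, is an interpretation of "1 intermediate TF = 1"; the function name `intermediate_tfs` and the toy edges are assumptions for illustration.

```python
from collections import deque

def intermediate_tfs(edges, start, target):
    """Fewest intermediate TFs on a directed path start -> ... -> target.

    Returns None when the target cannot be reached from the starting TF.
    """
    targets_of = {}
    for tf, tg in edges:
        targets_of.setdefault(tf, []).append(tg)

    queue = deque([(start, 0)])  # (node, intermediates needed to leave it)
    seen = {start}
    while queue:
        node, steps = queue.popleft()
        for nxt in targets_of.get(node, []):
            if nxt == target:
                return steps
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None

# Toy network (assumed): A -> B -> C -> g, with a shortcut A -> C.
toy = [("A", "B"), ("B", "C"), ("C", "g"), ("A", "C")]
print(intermediate_tfs(toy, "A", "g"))  # shortest route is A -> C -> g
```

Averaging this value over all reachable TF and target pairs would give a network-wide figure comparable in spirit to the average quoted on the slide, though the exact averaging rule used by the authors is not stated here.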

37 Network parameters: Path length
- “Binary conditions” → shorter path length → “faster”, direct action
- “Multi-stage conditions” → longer path length → “slower”, indirect action → intermediate TFs regulate different stages

38 Network Parameters – Clustering coefficient
- Ratio of existing links to the maximum number of links for neighboring nodes
- Measure of inter-connectedness of the network
- Example: 4 neighbours, 6 possible links, 1 existing link; clustering coefficient = existing links / possible links = 1/6 = 0.17
- Average coefficient = 0.11
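
The worked example on this slide (4 neighbours, 6 possible links, 1 existing link) can be reproduced with a short sketch. It treats links as undirected, which is an assumption, and the function name `clustering_coefficient` and node labels are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(node, edges):
    """Existing links among a node's neighbours divided by possible links."""
    links = {frozenset(e) for e in edges if e[0] != e[1]}  # undirected, no self-links
    neighbours = {n for link in links if node in link for n in link if n != node}
    possible = len(neighbours) * (len(neighbours) - 1) // 2
    if possible == 0:
        return 0.0  # fewer than two neighbours: no pairs to connect
    existing = sum(
        1 for pair in combinations(neighbours, 2) if frozenset(pair) in links
    )
    return existing / possible

# Slide example (labels assumed): X has 4 neighbours, one pair of them is linked.
example = [("X", "a"), ("X", "b"), ("X", "c"), ("X", "d"), ("a", "b")]
cc = clustering_coefficient("X", example)
print(round(cc, 2))  # 0.17
```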

39 Network parameters: Clustering coefficient
- “Binary conditions” → smaller coefficients → less TF-TF inter-regulation
- “Multi-stage conditions” → larger coefficients → more TF-TF inter-regulation

40 Sub-networks have evolved both their local structure and global structure to respond to cellular conditions efficiently
- Binary conditions: more target genes, shorter path lengths, less inter-regulation between TFs
- Multi-stage conditions: fewer target genes, longer path lengths, more inter-regulation between TFs

41 Implications
- First overview of the dynamics of the transcriptional regulatory network of a eukaryote
- Key regulatory hubs identified under different conditions can serve as good drug targets
- Provides insights into engineering regulatory interactions
- Methods developed to reconstruct and compare active networks are generically applicable

42 Conclusions - II
- Network motifs are preferentially used under the different cellular conditions, and different proteins act as regulatory hubs in different cellular conditions
- Sub-networks have evolved both their local structure and global structure to respond to cellular conditions efficiently
- Integration of data can uncover organizing principles in living systems
Luscombe N, Madan Babu M et al. Nature (2004)

43 Acknowledgements
- Balaji, Lakshminarayan, Aravind
- Nick Luscombe, Sarah Teichmann, Haiyuan Yu, Mike Snyder, Mark Gerstein
- National Center for Biotechnology Information, National Institutes of Health
- MRC Laboratory of Molecular Biology

