C enter For C omputational B iology and B ioinformatics Protein Intrinsic Disorder, Cell Signaling and Alternative Splicing
Outline of Talk Examples of intrinsically disordered proteins Prediction of natural disordered regions Disorder and cell signaling Disorder and molecular recognition Disorder and alternative splicing Protein isoforms and functional diversity via the linkage of alternative splicing and intrinsic disorder
Molecular Recognition Element (MoRE) Cyclin A CDK p27kip1 3D structure from: Russo A et al., Nature 382: (1996)
Disorder and Function CategoryChangeExamplesDescriptions Molecular Recognition D O 113 Inter- and Intra-protein, ssDNA, dsDNA, tRNA, rRNA, mRNA, nRNA, bilayers, ligands, co-factors, metals Protein Modification Variable 36 Acetylation, fatty acylation, glycosylation, methylation, phosphorylation, ADP- ribosylation, ubiquitination, proteolytic digestion Entropic Chains Variable17 Linkers, spacers, bristles, clocks, springs, detergents, self-transport Dunker AK et al., Adv Protein Chem 62: (2002)
Prediction of Disorder Predictor Validation on Out-of-Sample Data Prediction Attribute Selection or Extraction Separate Training and Testing Sets Predictor Training Disordered Sequence Data
p53 MoREs PONDR ® VL-XT Score Oldfield et al., Biochemistry 44: (2005)
Protein Interaction Domains
GYF Domain and CD2 Chain B Freund et al., (2002) Embo J. 21:
GYF Domain of CD2 Binding Protein Freund et al., (1999) Nat. Struct. Biol. 6:
CD2: Binding Partner of GYF Domain Freund et al., (1999) Nat. Struct. Biol. 6: Consensus sequence (GYF binding sites) has the sequence: ppppghr. The peptide in the crystal structure has the aa sequence: shrppppghrv.
Analysis of Signaling Interactions Examined each interaction on Pawson’s website. Almost all of the interactions involved ordered regions binding to disordered partners. Conclusion: if Pawson’s examples are typical, then a very significant proportion of protein-protein signaling interactions use disordered regions.
Parallel Paradigms Catalysis AA seq → 3-D Structure → Function Signaling AA seq → Disordered → Function Ensemble
Alternative Splicing and Intrinsic Disorder Find proteins with both ordered and disordered regions. Find mRNA alternative splicing information for these proteins and map to the ordered and disordered regions. For alternatively spliced regions of mRNA, do they code for ordered protein more often or do they code for disordered protein more often?
Alternative Splicing 5’ UTR3’ UTR Coding Sequence
Alternative Splicing 5’ UTR3’ UTR Coding Sequence mRNA Protein sequence Transcription Translation
Alternative Splicing 5’ UTR3’ UTR Coding Sequence mRNA 1 mRNA 2 Isoform 1Isoform 2 Transcription Translation
Alternative Splicing 5’ UTR3’ UTR Coding Sequence mRNA 1 mRNA 2 Isoform 1Isoform 2 Transcription Translation Folding AS region
Structural Studies of AS Pyrophosphorylase Sulphotransferase RAC1 Tumor necrosis factor Glutathione S-transferase Disordered AS regions Structured AS regions
Studying the Relationship ID AS DisProt Database of proteins with experimentally determined structure and disorder ASG (AS Gallery) SwissProt (VarSplic) ASED dataset: 46 proteins 74 characterized AS regions >19,000 charaterized residues, 35% ID
Results on ASED Distribution of structurally characterized AS regions
Enlarging the Dataset PONDR® VSL1 ID predictor (> 80% accuracy) ASED dataset Validation ASSP dataset 558 AS human proteins from SwissProt 1,266 AS regions Analysis
Global Results AS regions disorder distributions in ASED and ASSP
Alternative Splicing and Disorder Ordered Proteins: active site residues non-local in sequence, become associated by protein folding Disordered Proteins and regions: functional residues localized in squence Functional regions for signaling and regulation are located one after another Alternative splicing edits functional sets and thereby leads to regulatory and signaling diversity
Breast Cancer Protein 1 (BRCA1)
Summary Protein signaling interactions involve intrinsic disorder (ID) a high percentage of the time. Alternative splicing (AS) often occurs in regions of pre-mRNA that code for intrinsic disorder. AS + ID facilitate regulatory and signaling diversity. Is AS + ID the critical combination for the evolution of multi-cellular organisms?
Acknowledgements Indiana University Predrag Radivojac Pedro Romero Marc Cortese Gerard Go Amrita Mohan Jie Sun Siama Zaida Jack Yang University of Idaho Celeste J. Brown Chris Williams Molecular Kinetics Vladimir Uversky Yugong Cheng Temple University Zoran Obradovic Slobodan Vucetic Vladimir Vacic Kang Peng Rockefeller University Lilia Iakoucheva Sebat University of Wisconsin John Markley Chris Oldfield UCSF Ethan Garner PNNL Richard Smith Eric Ackerman
Support NSF CSE II NIH R01 LM USDA INGEN ®, Lilly Endowment Molecular Kinetics