Quantitative analysis of domain interactomes
Jason Lee
Capstone presentation, Sp '07

Protein domain
Domain architecture of proteins
Example: protein PKC, a protein with three domains
Each domain carries out a specific function
Modularity lets a protein combine domains to achieve desired functions

Domain interaction
Ex) Pkinase domain
Different interfaces mediate domain-domain interaction
–distinct ways of interaction
Possible units of interaction
(a) pdb_1ung: cell division kinase 5 and CDK5 activator

Domain interaction (cont’d)
(b) pdb_1buh: CDK2 and CKSHS1
(c) pdb_1b6c: TGF-beta receptor R4 and FK506 binding protein

Purpose and pertinence of current study
Characterize domain interactions
Characterize protein interactions that are mediated by domain interactions
Use domain interaction information to predict protein interactions
Gain an evolutionary perspective

Data and methods
Databases: ipfam, BIND
–BIND: a compilation of known protein interactions
–ipfam: known domain interactions obtained from structural information
Five species were examined: human, mouse, fruit fly, yeast and E. coli
Take the intersection of ipfam and BIND: compile the protein interactions that involve a known ipfam domain pair
[Table: protein interactions among proteins, giving the number of proteins and interactions for Human, Mouse, Fruit fly, Yeast, E. coli and Total]

Example
Protein TGFBR1 interacts with 132 other proteins according to BIND
TGFBR1 comprises the domains Activin_recp and Pkinase
Each BIND interaction is checked to see whether it involves any ipfam domain-domain interaction (DDI) pair
44 protein interactions are found to involve an ipfam pair
–TGFBR1+GI: Pkinase+PBD
–TGFBR1+GI: Pkinase+FKBP_C
–TGFBR1+GI: Activin_recp+TGF_beta
–…

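The check described in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the function name `ppis_with_ddi`, the partner names (`partner_A`, `partner_B`, `partner_C`) and the `SH3_1` domain assignment are hypothetical, and real input would come from parsed BIND and ipfam records.

```python
from itertools import product

def ppis_with_ddi(ppis, protein_domains, ddi_pairs):
    """Map each PPI to the known domain pairs that could mediate it.

    PPIs with no supporting domain pair are left out of the result.
    """
    # Store DDIs as unordered pairs so (A, B) and (B, A) match alike.
    known = {frozenset(pair) for pair in ddi_pairs}
    supported = {}
    for a, b in ppis:
        hits = {
            (da, db)
            for da, db in product(protein_domains.get(a, ()),
                                  protein_domains.get(b, ()))
            if frozenset((da, db)) in known
        }
        if hits:
            supported[(a, b)] = hits
    return supported

# Toy data: only the TGFBR1 domains and the two DDI pairs come from the slide;
# the partner proteins and their domain assignments are hypothetical.
protein_domains = {
    "TGFBR1": {"Activin_recp", "Pkinase"},
    "partner_A": {"FKBP_C"},
    "partner_B": {"TGF_beta"},
    "partner_C": {"SH3_1"},
}
ppis = [("TGFBR1", "partner_A"), ("TGFBR1", "partner_B"), ("TGFBR1", "partner_C")]
ddi_pairs = [("Pkinase", "FKBP_C"), ("Activin_recp", "TGF_beta")]

result = ppis_with_ddi(ppis, protein_domains, ddi_pairs)
for ppi, pairs in sorted(result.items()):
    print(ppi, sorted(pairs))
```

In this toy run the interaction with `partner_C` is dropped because none of its domain pairs appears in the DDI set, which mirrors how most of the 132 BIND interactions of TGFBR1 fall outside ipfam.
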
Obtained domain interactions
1884 domain-domain interactions among 1587 domains
[Table: number of domains and interactions for Human, Mouse, Fruit fly, Yeast and E. coli]

Low coverage of DDI over PPI
1650 PPIs in human involve known domain pairs, while 9900 do not
–14.29% of total human protein interactions
Across the 5 species, 4604 PPIs involve at least one domain pair, while the rest have none
–9.39% of total interactions
[Table: ipfam interactions, non-ipfam interactions, % and total interactions for Human, Mouse, Fruit fly, Yeast, E. coli and Total]

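The human coverage figure follows directly from the two counts on this slide. A short Python check, where the function name `ddi_coverage` is only an illustrative choice:

```python
def ddi_coverage(with_pair, without_pair):
    """Percentage of PPIs that involve at least one known domain pair."""
    total = with_pair + without_pair
    return 100.0 * with_pair / total

# Human counts from the slide: 1650 PPIs with a known pair, 9900 without.
human = ddi_coverage(1650, 9900)
print(f"{human:.2f}%")  # prints 14.29%
```
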
Possible explanations of low coverage of DDI
High false-positive (FP) rate in PPI data
Incomplete coverage of ipfam DDI data
Many PPIs are not mediated by DDI
Possible expedience of protein interactions
–Domain interaction may be too restrictive to meet all the physiological and molecular demands of organisms

Domain interaction graph (H. sapiens)
[Figure: entire domain interactome]

Protein interaction graph (H. sapiens)
[Figure: the graph has many subgraphs; only the largest subgraph is shown]

Comparison of node degree distribution
Both graphs show a power-law distribution

Comparison of graph topologies
Both the domain and the protein interaction graphs show the scale-free property
On average, a domain interacts with half as many partners as a protein does
[Table comparing the PPI and DDI graphs by subgraphs, avg. node degree, avg. node degree (excl. single-partner nodes), largest subgraph and nodes; largest subgraph: 5422 (68.50) for PPI, 71 (15.14) for DDI]

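The quantities compared on this slide can be computed from a plain edge list. The sketch below is one possible way to do it, not the code used in the study: the name `degree_stats` is hypothetical, duplicate edges are collapsed, and a self-interaction is assumed to count as one partner. The toy edges reuse domain pairs that appear elsewhere in this presentation.

```python
from collections import Counter, defaultdict

def degree_stats(edges):
    """Return per-node degrees, both averages, and the degree histogram."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degrees = {node: len(partners) for node, partners in neighbors.items()}
    avg_all = sum(degrees.values()) / len(degrees)
    # Second average: leave out nodes that have a single partner.
    multi = [d for d in degrees.values() if d > 1]
    avg_multi = sum(multi) / len(multi) if multi else 0.0
    # histogram[k] = number of nodes with degree k; a straight line on a
    # log-log plot of this histogram suggests a power law.
    histogram = Counter(degrees.values())
    return degrees, avg_all, avg_multi, histogram

edges = [
    ("Pkinase", "Cyclin_N"),
    ("Pkinase", "Ank"),
    ("Pkinase", "FKBP_C"),
    ("Ras", "PH"),
]
degrees, avg_all, avg_multi, histogram = degree_stats(edges)
print(degrees["Pkinase"], round(avg_all, 2), avg_multi, dict(histogram))
```

The same function applies to a protein interaction edge list, so the PPI and DDI columns of the comparison can be produced with one routine.
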
Phylogenetic tree of five species
[Figure: tree of human, mouse, fruit fly, yeast and E. coli, with groupings labeled mammal, multi-cellular, eukaryotes, prokaryote and single-cell]

Measuring commonality of domain composition and interactomes between species
Inner product of domains and domain pairs between two species S and T
IP_domain = |Common_domains| / sqrt(|Domains_S| * |Domains_T|)
IP_pair = |Common_domain_pairs| / sqrt(|Domain_pairs_S| * |Domain_pairs_T|)

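Both measures have the same form, so a single function covers them. The Python sketch below assumes each species is represented as a set of domain names or of unordered domain pairs; the helper names and the toy sets are hypothetical. With sets, this quantity equals the cosine similarity of the two binary presence vectors.

```python
from math import sqrt

def inner_product(items_s, items_t):
    """|S ∩ T| / sqrt(|S| * |T|) for two collections of hashable items."""
    s, t = set(items_s), set(items_t)
    if not s or not t:
        return 0.0
    return len(s & t) / sqrt(len(s) * len(t))

def unordered(pairs):
    """Treat (A, B) and (B, A) as the same domain pair."""
    return {frozenset(pair) for pair in pairs}

# Toy domain sets for two species S and T (hypothetical assignments).
domains_s = {"Pkinase", "Ras", "SH2", "PH"}
domains_t = {"Pkinase", "Ras", "AAA", "Trypsin"}
ip_domain = inner_product(domains_s, domains_t)

# Toy domain-pair sets for the same two species.
pairs_s = unordered([("Pkinase", "Cyclin_N"), ("PH", "Ras")])
pairs_t = unordered([("Cyclin_N", "Pkinase")])
ip_pair = inner_product(pairs_s, pairs_t)

print(ip_domain, round(ip_pair, 4))
```

The value is 1 when the two species share exactly the same items and 0 when they share none, which makes the pairwise matrices directly comparable across species with very different domain counts.
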
Evolutionary consideration
[Tables: common domains and common domain pairs, each as a pairwise matrix over Human, Mouse, Fruit fly, Yeast and E. coli]
Common domains and domain pairs reflect evolutionary relationship

Ontological characterization
Use the GO controlled vocabulary to compare how the domain compositions of species reflect their physiology
Correlation between physiology and domain composition
Differential domains: domains that are present exclusively in one lineage or species and not in the other
[Table of differential domains by GO category; columns: Multi-cellular, single-cell]
Response to stimulus 103
Cell communication 105
Regulation of cellular process 84
Signal transducer 113
Enzyme regulator 92
Transport 1729

Ontological characterization (cont’d)
Categories of other differential domains unique to multicellular species
–Cell adhesion (2)
–Regulation of biological processes (2)
–Cell differentiation (1)
–Cell death (1)
–Cell homeostasis (1)
–Coagulation (1)
Domains involved in multicellularity are conspicuous

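Differential domains amount to a set difference between lineages. The sketch below is one reading of that definition, not the study's code: it assumes a lineage's domain set is the union over its species, and every domain assignment in the toy data is hypothetical. Mapping the resulting domains to GO categories is a separate step that is not shown.

```python
def differential_domains(lineage_a, lineage_b):
    """Return (domains only in lineage A, domains only in lineage B).

    Each lineage is a dict mapping a species name to its set of domains.
    A lineage's domain set is taken as the union over its species, which
    is an assumption about how the lineages were pooled.
    """
    union_a = set().union(*lineage_a.values())
    union_b = set().union(*lineage_b.values())
    return union_a - union_b, union_b - union_a

# Toy domain assignments, for illustration only.
multicellular = {
    "human": {"Pkinase", "Cadherin", "SH2"},
    "fruit fly": {"Pkinase", "Cadherin"},
}
single_cell = {
    "yeast": {"Pkinase", "AAA"},
    "E. coli": {"AAA", "ABC_tran"},
}

only_multi, only_single = differential_domains(multicellular, single_cell)
print(sorted(only_multi), sorted(only_single))
```
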
Domain node degree and
Ten domains with largest degrees
[Table columns: Domain, Degree, Instances (copy number), Occur. in intrn. (interaction frequency), Associativity. Domains listed: RAS, Pkinase, RNA_pol_RPB1_, Ubiquitin, RNA_pol_RPB2_, Trypsin, AAA, RNA_pol_L, GTP_EFTU, SNARE]

Correlation among node degree, copy number, etc. (all five species)
Correlation between DDI node degree and interaction frequency:
Correlation between DDI node degree and number of instances:
When RNA polymerase domains are excluded
–Degree and interaction frequency:
–Degree and number of instances:
Associativity: the number of other domains with which a domain appears together in peptide sequences
–Ex) domain Pkinase associates with 45 domains
Node degree and associativity:
Having a large number of domain partners does not mean that a domain mediates many protein interactions, nor that it is associated with many other domains

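Associativity and the correlations can be sketched as below. The slide does not say which correlation coefficient was used, so Pearson's r is an assumption here, and the function names and toy proteins (`p1` to `p3`) are hypothetical.

```python
from math import sqrt

def associativity(protein_domains):
    """Count, per domain, the distinct other domains sharing a sequence."""
    partners = {}
    for domains in protein_domains.values():
        for d in domains:
            partners.setdefault(d, set()).update(x for x in domains if x != d)
    return {d: len(p) for d, p in partners.items()}

def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if a series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy proteins with their domain content (hypothetical).
protein_domains = {
    "p1": ["Activin_recp", "Pkinase"],
    "p2": ["Pkinase", "SH2"],
    "p3": ["SH2"],
}
assoc = associativity(protein_domains)
print(assoc)

# Correlating two per-domain series, e.g. node degree vs. associativity,
# only requires that both lists follow the same domain order.
r = pearson([1, 2, 3], [2, 4, 6])
print(round(r, 6))
```
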
Interaction propensity
Defined between a pair of domains
Only hetero-domain pairs are considered, because homo-domain pairs may be crystallization artifacts
Interaction propensity = |pair_occurrences| / ( |domain_0| * |domain_1| )
[Table columns: Domain_0, Domain_1, Pairs, |Domain_0|, |Domain_1|, i-prop (%). Pairs listed: Cyclin_N and Pkinase; Ank and Pkinase; PH and Ras; Cyclin_C and Pkinase; FGF and IG; ANK and TIG; ANK and RHD; SH2 and STAT_bind]
Sufficient selectivity can be encoded at the molecular level in domain interactions
Protein interactions mediated by domain interactions are very specific

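The propensity formula can be written as a small Python function. This sketch assumes that |domain_0| and |domain_1| are the instance counts of the two domains and that the result is reported as a percentage, as the i-prop (%) column suggests; the function names and the numbers in the example are made up for illustration.

```python
def is_hetero(domain_0, domain_1):
    """Only pairs of two different domains are scored."""
    return domain_0 != domain_1

def interaction_propensity(pair_occurrences, count_0, count_1):
    """Percentage of possible domain_0 x domain_1 pairings seen to interact."""
    if count_0 <= 0 or count_1 <= 0:
        raise ValueError("domain counts must be positive")
    return 100.0 * pair_occurrences / (count_0 * count_1)

# Made-up numbers: 6 observed interacting pairs between a domain with
# 10 instances and a domain with 20 instances.
if is_hetero("Cyclin_N", "Pkinase"):
    i_prop = interaction_propensity(6, 10, 20)
    print(i_prop)  # prints 3.0
```

Dividing by the product of the counts normalizes for abundance, so a common domain such as Pkinase does not score highly just because it has many instances.
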
Discussion
On average, a domain has fewer interaction partners than a protein
Only a small number of protein interactions are mediated by domain interactions
Domain composition and domain interactomes reflect the evolutionary relationships between species
Correlations among domain node degree, domain copy number, occurrences in interactions and number of associated domains were all very low
Domain interaction provides a scaffold; specificity is fine-tuned by atomic- and residue-level coding

Acknowledgement
Prof. Sun Kim
Prof. Haixu Tang
Prof. Predrag Radivojac
Prof. Mehmet Dalkilic
Dr. John Colburne
Prof. Marty Siegel
Linda Hostetter