Published by Louis Lesage. Modified over 6 years ago.
1
Functional Coherence in Domain Interaction Networks
Prof. Ananth Grama
2
Dept. of Computer Science, Purdue University
Outline
Motivation
Protein and Domain Interaction Networks
Formal framework
  Properties for term- and set-similarity measures
  New similarity metric
Results
  Comparison of measures
  Comparison of PPI and DDI networks
3
Motivation
Extracting functional information from protein-protein interactions
  Data from high-throughput experiments are noisy, incomplete, generic, and static
Typical proteins are composed of multiple domains
  Each domain is an independent unit (function, evolution, folding)
Behind protein-protein interactions there are protein domains interacting physically with one another.
Understanding protein interactions at the domain level gives a global view of the protein interaction network and possibly of protein functions.
[Figure: proteins p1 and p2 with domains d1, d2, d3, d4; domain-domain interaction]
4
Motivation
How does functional modularity manifest itself in a network of molecular interactions?
  Explore the relationship between functional similarity and network proximity
The functional annotations available for domains and proteins differ vastly
  Do current similarity measures work in an unbiased manner, given the incompleteness of annotation?
  Are they statistically meaningful and biologically interpretable?
Annotation for domains is derived from proteins; as such, it is more general, scarce, and incomplete
5
Formal Framework
C = { ci | 0 ≤ i < N } is a finite partially ordered set of concepts (ontology).
Concepts are related by a binary relationship, denoted by ⪯, e.g. c3 ⪯ c1, c6 ⪯ c3, c5 ⪯ r
Set of ancestors: Ai = { ck | ci ⪯ ck }
Two concepts (ci, cj) are comparable (~) if either ci ⪯ cj or cj ⪯ ci
The concepts in Ai need not all be comparable, because the ontology is a DAG (as opposed to a tree)
[Figure: example ontology DAG with root C0 = r and concepts C1 to C6]
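The ancestor set and the comparability test can be sketched in a few lines. This is only an illustration: the ordering is written as ⪯ (an assumed symbol), the ancestor set is taken to be reflexive (each concept is its own ancestor), and the `PARENTS` table is a hypothetical toy ontology chosen to agree with the slide's examples, not the layout of the original figure.

```python
# Hypothetical toy ontology (a DAG). Each concept maps to its direct parents.
# It is consistent with the slide's examples: c3 ⪯ c1, c6 ⪯ c3, c5 ⪯ r.
PARENTS = {
    "r": [],
    "c1": ["r"],
    "c2": ["r"],
    "c3": ["c1", "c2"],   # two parents: this is what makes it a DAG, not a tree
    "c4": ["c2"],
    "c5": ["c2"],
    "c6": ["c3", "c4"],
}

def ancestors(concept, parents=PARENTS):
    """Return A_i = {c_k | concept ⪯ c_k}; assumed reflexive, so it includes concept."""
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def comparable(ci, cj, parents=PARENTS):
    """Two concepts are comparable if one is an ancestor of the other."""
    return cj in ancestors(ci, parents) or ci in ancestors(cj, parents)
```

In this toy ontology both c1 and c2 belong to the ancestor set of c3, yet `comparable("c1", "c2")` is false, which is the DAG-versus-tree point made on the slide.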
6
Properties for term-similarity
Similarity (δ) of two terms is based on the underlying taxonomical relationship
Properties:
1. The measure should be symmetric.
2. More specific terms should have at least as much self-similarity as more general terms.
3. A term should not be less similar to itself than to any other term.
4. Terms with more specific common ancestors should be more similar to each other than terms with less specific common ancestors.
Existing measures
  Distance based: count the number of edges between the nodes
    δE(ci,cj) = 2*MAX - min[len(ci,cj)]
    Fails property (4), as distance is uniform over all edges
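A small sketch makes the failure of property (4) concrete for the distance-based measure. It assumes that MAX is the maximum depth of the ontology and that len(ci,cj) is a path length counted over undirected edges; the slide does not define either, and the `EDGES` table is a hypothetical toy ontology.

```python
from collections import deque

# Hypothetical toy ontology as (child, parent) edges; r is the root.
EDGES = [("c1", "r"), ("c2", "r"), ("c3", "c1"), ("c6", "c3"), ("c7", "c3")]
MAX_DEPTH = 3  # assumption: MAX is the depth of the deepest concept (c6, c7)

def build_adjacency(edges):
    """Undirected adjacency list, since edge counting ignores direction."""
    adj = {}
    for child, parent in edges:
        adj.setdefault(child, set()).add(parent)
        adj.setdefault(parent, set()).add(child)
    return adj

def path_length(ci, cj, adj):
    """Minimum number of edges between ci and cj (breadth-first search)."""
    dist = {ci: 0}
    queue = deque([ci])
    while queue:
        node = queue.popleft()
        if node == cj:
            return dist[node]
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("concepts are not connected")

def delta_e(ci, cj, adj):
    """Distance-based similarity: 2*MAX minus the shortest path length."""
    return 2 * MAX_DEPTH - path_length(ci, cj, adj)

ADJ = build_adjacency(EDGES)
```

Here `delta_e("c1", "c2", ADJ)` and `delta_e("c6", "c7", ADJ)` are equal, although c6 and c7 share the specific ancestor c3 while c1 and c2 share only the root.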
7
Existing metrics for term-similarity
Information content: if Gc is the set of molecules associated with concept c, then
  IC(c) = -log2(|Gc| / |Gr|)
  δR(ci,cj) = max[ IC(c) ], c ∈ Ai and c ∈ Aj (c is a common ancestor)
Normalization:
  δL(ci,cj) = 2 * δR(ci,cj) / (IC(ci) + IC(cj))
Hybrid approach:
  δJC(ci,cj) = (1 - 2 * δR(ci,cj) + IC(ci) + IC(cj))^(-1)
All three satisfy the term-similarity properties
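The three information-content measures can be worked through on a toy example. The sketch below assumes that each set Gc already includes the molecules annotated to descendants of c and that ancestor sets are reflexive; the `G` and `ANCESTORS` tables are made-up illustration data, and the zero-denominator guard in `delta_l` is an added assumption.

```python
import math

# Hypothetical annotation sets G_c (propagated upward, so G_c3 ⊆ G_c1 ∩ G_c2).
G = {
    "r": {"m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"},
    "c1": {"m1", "m2", "m3", "m4"},
    "c2": {"m3", "m4", "m5", "m6"},
    "c3": {"m3", "m4"},
    "c6": {"m3"},
}
# Hypothetical reflexive ancestor sets A_i for the same concepts.
ANCESTORS = {
    "r": {"r"},
    "c1": {"c1", "r"},
    "c2": {"c2", "r"},
    "c3": {"c3", "c1", "c2", "r"},
    "c6": {"c6", "c3", "c1", "c2", "r"},
}

def ic(c):
    """IC(c) = -log2(|G_c| / |G_r|)."""
    return -math.log2(len(G[c]) / len(G["r"]))

def delta_r(ci, cj):
    """Information content of the most informative common ancestor."""
    return max(ic(c) for c in ANCESTORS[ci] & ANCESTORS[cj])

def delta_l(ci, cj):
    """Normalized form: 2 * delta_r / (IC(ci) + IC(cj))."""
    denom = ic(ci) + ic(cj)
    # Assumption: return 0 when both terms carry no information (e.g. the root).
    return 2 * delta_r(ci, cj) / denom if denom > 0 else 0.0

def delta_jc(ci, cj):
    """Hybrid form: 1 / (1 - 2 * delta_r + IC(ci) + IC(cj))."""
    return 1.0 / (1.0 - 2 * delta_r(ci, cj) + ic(ci) + ic(cj))
```

With these tables IC(c3) = 2 and IC(c6) = 3, so for the pair (c3, c6) the three measures evaluate to 2, 0.8, and 0.5 respectively.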
8
Properties for set-similarity
Let S be a set of concepts. We want a measure ρ(Si, Sj) to assess the semantic similarity of two sets
(i) The measure should be symmetric.
(ii) Adding a common annotation for two molecules should not decrease the similarity between these two molecules.
(iii) If new annotations are added for a new molecule, the similarity of this molecule to any other molecule should not decrease.
(iv) A set of annotations should be at least as similar to itself as it is to any other set.
9
Existing metrics for set-similarity
Average: violates properties (ii), (iii) and (iv)
Maximum: weakly satisfies (ii)
Average of maximums: fails properties (ii), (iii) and (iv)
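The formulas for these three aggregates are not preserved in the transcript, so the sketch below uses common formulations as an assumption: the mean over all term pairs, the maximum over all term pairs, and the mean of each term's best match in the other set. The `TABLE` of term similarities is invented purely to show how the plain average can violate property (iv).

```python
from itertools import product

# Hypothetical term-similarity values; self-similarity differs between terms.
TABLE = {("a", "a"): 3.0, ("b", "b"): 1.0, ("a", "b"): 0.0}

def delta(x, y):
    """Symmetric lookup into the toy table."""
    return TABLE.get((x, y), TABLE.get((y, x)))

def rho_average(si, sj, delta):
    """Mean term similarity over all pairs (assumed formulation)."""
    pairs = list(product(si, sj))
    return sum(delta(a, b) for a, b in pairs) / len(pairs)

def rho_maximum(si, sj, delta):
    """Best term similarity over all pairs (assumed formulation)."""
    return max(delta(a, b) for a, b in product(si, sj))

def rho_avg_of_max(si, sj, delta):
    """Mean of each term's best match in the other set (assumed formulation)."""
    best_i = [max(delta(a, b) for b in sj) for a in si]
    best_j = [max(delta(a, b) for a in si) for b in sj]
    return (sum(best_i) + sum(best_j)) / (len(si) + len(sj))

S = ["a", "b"]
T = ["a"]
```

On this toy data `rho_average(S, S, delta)` is 1.0 while `rho_average(S, T, delta)` is 1.5, so the set S comes out less similar to itself than to T.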
10
IC based set similarity
Extend the notion of minimum common ancestor (λ) to sets of terms [equation not preserved]
The information content of a set is defined in terms of the set of molecules associated with all terms in the MCA of Si, Sj [equation not preserved]
This satisfies all 4 properties and can be extended [extended forms not preserved]
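Because the equations on this slide are lost, the following is only one plausible reading, not the authors' definition. It assumes that the MCA of two sets is the union of the minimal common ancestors of every cross pair of terms, and that the set similarity is the information content of the molecules associated with all of those ancestors. The tables and every function name are hypothetical.

```python
import math

# Hypothetical reflexive ancestor sets for a toy DAG rooted at r.
ANCESTORS = {
    "r": {"r"},
    "c1": {"c1", "r"},
    "c2": {"c2", "r"},
    "c3": {"c3", "c1", "c2", "r"},
    "c4": {"c4", "c2", "r"},
    "c6": {"c6", "c3", "c4", "c1", "c2", "r"},
}
# Hypothetical annotation sets, propagated upward along the ontology.
G = {
    "r": {"m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"},
    "c1": {"m1", "m2", "m4"},
    "c2": {"m1", "m2", "m3", "m5"},
    "c3": {"m1", "m2"},
    "c4": {"m1", "m3"},
    "c6": {"m1"},
}

def mca(ci, cj):
    """Common ancestors of ci and cj that have no more specific common ancestor."""
    common = ANCESTORS[ci] & ANCESTORS[cj]
    return {c for c in common
            if not any(d != c and c in ANCESTORS[d] for d in common)}

def set_mca(si, sj):
    """Assumed extension to sets: union of mca over all cross pairs."""
    result = set()
    for ci in si:
        for cj in sj:
            result |= mca(ci, cj)
    return result

def rho_ic(si, sj):
    """Information content of the molecules associated with every term in the set MCA."""
    shared = set.intersection(*(G[c] for c in set_mca(si, sj)))
    if not shared:
        return math.inf  # assumption: no molecule carries all terms
    return -math.log2(len(shared) / len(G["r"]))
```

Under this reading, `rho_ic({"c1"}, {"c4"})` is 0, and adding the common annotation c3 to both sets raises the score to 2, in line with property (ii).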
11
Datasets
Protein-protein interactions
  Physical interactions extracted from the BioGRID database
  Binary data (no reliability score)
Domain-domain interactions
  DOMINE database
  Confidence score used to split the dataset
    Struct: only structure-based interactions
    HC+NA: High Confidence (HC) and structure-based interactions
    HC+MC: High Confidence (HC) and Medium Confidence (MC) interactions
    Comp-2: interactions predicted by at least two computational approaches
    Comp-1: interactions predicted by at least one computational approach
12
Comparison of Semantic Similarity Measures
Network distance and functional similarity are negatively related
The proposed information content based measure (ρJC) provides the sharpest decline in semantic similarity for distance < 4
For each network, we compute the distance between all pairs of molecules (proteins or domains) in the network. Then we group molecule pairs according to their distance and compute the average semantic similarity for each group.
[Figure: C. elegans PPI network]
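The grouping procedure described on this slide can be sketched as follows, assuming an unweighted, undirected network and shortest-path distance. The four-node path graph and the `SCORES` table are invented toy data, and `similarity` stands in for whichever semantic similarity measure is being evaluated.

```python
from collections import defaultdict, deque

def bfs_distances(adj, source):
    """Shortest-path distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_similarity_by_distance(adj, similarity):
    """Group molecule pairs by network distance and average their similarity."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if u < v:  # count each unordered pair once; skips u == v
                totals[d] += similarity(u, v)
                counts[d] += 1
    return {d: totals[d] / counts[d] for d in sorted(totals)}

# Toy network: a path a - b - c - d (hypothetical data).
ADJ = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
# Hypothetical semantic similarity scores per unordered pair.
SCORES = {
    ("a", "b"): 0.5, ("b", "c"): 1.0, ("c", "d"): 0.75,
    ("a", "c"): 0.5, ("b", "d"): 0.25, ("a", "d"): 0.0,
}

def similarity(u, v):
    return SCORES[(u, v)] if (u, v) in SCORES else SCORES[(v, u)]

profile = mean_similarity_by_distance(ADJ, similarity)
```

Pairs in different connected components are never reached by the search and are therefore left out of every group.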
13
Comparison of Semantic Similarity Measures
The proposed metric (ρJC) gives a large similarity score to a larger fraction of pairs at close distances (1, 2), and a low similarity score to a large fraction of pairs at distance > 2
A comparison of the distribution of semantic similarity scores for the average information content (resnik) and self-normalized information content (rho_JC) measures is shown.
[Figure: structural DDI network]
14
Comparison of PPI and DDI Networks
We compare the relationship between network proximity and functional similarity comprehensively, using several PPI and DDI networks.
[Figure: relation between network proximity and semantic similarity with respect to molecular function]
15
Comparison of PPI and DDI Networks
[Figure: relation between network proximity and semantic similarity with respect to biological process]
16
Comparison of PPI and DDI Networks
Immediate and indirect neighbors perform similar functions
Functional similarity is stronger in the Struct DDI network
After normalization, the relationship between functional similarity and network distance is stronger in computationally inferred DDI networks than in PPI networks
Network proximity in DDI networks is likely to be a better indicator of functional modularity than network proximity in PPI networks.
DDI networks that are based on structural information are relatively more reliable than PPI networks, which may come from noisy high-throughput screening.
17
Summary
We present necessary properties for any admissible metric for term- and set-similarity
Current metrics are not admissible; we develop a new metric for set-similarity
The proposed metric provides a highly intuitive biological interpretation
A comprehensive comparative analysis of PPIs and DDIs validates the role of DDIs in quantifying functional coherence