1
ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING
2
BACKGROUND Completion of sequencing projects Need for functional discovery Emerging area of study: Large scale genomic analysis Similarity of living systems
3
GENETIC NETWORKS Modelling genetic networks Interaction of genes and proteins Relationship between topology and function
4
MOTIVATION Common biological processes Comparison of networks Discovering missing interactions Discovering missing genes
5
GRAPH MATCHING Search-based Algorithm Pruning Techniques [Figure: graphs G1 and G2; node labels mpn132, mpn124, mpn141, mpn145, mpn134, mpn133, mge234, mge235, mge236, mge312, mge314, mge310, mge313, mge336, mge337]
6
ROADMAP Scale-Free Networks Modelling Genetic Networks Graph Matching Algorithm Results
7
SCALE-FREE NETWORKS
8
COMPLEX NETWORKS Small-world model –WWW –Human acquaintances network –Citation networks –Biological networks
9
SMALL-WORLD Features: –Characteristic path length –Clustering coefficient –Sparseness
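The first two features can be computed directly from an adjacency list. The following is a minimal sketch, assuming an undirected graph stored as a dict of neighbour sets; the function names and the toy graph are illustrative and not taken from the presentation.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected graph."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # fewer than two neighbours: contributes 0
        # Count links that exist among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    dist_sum, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else 0.0


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(clustering_coefficient(toy), characteristic_path_length(toy))
```

A small-world graph would show a high clustering coefficient together with a short characteristic path length.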
10
SMALL-WORLD Somewhere in between regular & random graphs
11
SMALL-WORLD Highly clustered Short diameter
12
SCALE-FREE NETWORKS Complex networks: biological, social, WWW, power grid, citation, etc. Power-law connectivity: $P(k) \sim k^{-\gamma}$ Hubs and authorities
13
SCALE-FREE NETWORKS Application for testing scale-free behavior: Yeast, Helicobacter Pylori, Mycoplasma Pneumoniae, Mycoplasma Genitalium Linear log-log graph Slope =
14
SCALE-FREE NETWORKS Slope is calculated by the least-squares method
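A scale-free test of this kind can be sketched as an ordinary least-squares line fit on the log-log degree distribution. The code below is an illustration under stated assumptions (base-10 logarithms, no binning, nodes of degree zero ignored); the function names are hypothetical and the presentation's own tool may differ.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: P(k)} for the positive degrees in the input."""
    counts = Counter(d for d in degrees if d > 0)
    n = sum(counts.values())
    return {k: c / n for k, c in counts.items()}


def loglog_slope(pk):
    """Least-squares slope of log10 P(k) against log10 k."""
    xs = [math.log10(k) for k in pk]
    ys = [math.log10(p) for p in pk.values()]
    if len(xs) < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic check: values following k^-2 exactly give a slope of -2.
synthetic = {1: 1.0, 2: 0.25, 4: 0.0625}
print(loglog_slope(synthetic))
```

A roughly linear log-log plot with a negative slope is what a power-law degree distribution predicts.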
15
TOPOLOGY & FUNCTIONALITY Small diameter –ease of dissemination of information –ease of restoring after disturbance Cliquishness –Alternate paths are found Heterogeneity –Random node removal has little effect on the network –Hubs are vulnerable to attack
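The heterogeneity point can be illustrated on a toy network by comparing the largest connected component after removing a peripheral node with the one left after removing the hub. This is a sketch on a hypothetical star graph, not data from the presentation.

```python
def largest_component(adj, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)  # treating removed nodes as visited skips them
    best = 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


# Hypothetical star network: one hub linked to five low-degree nodes.
leaves = ["l1", "l2", "l3", "l4", "l5"]
star = {"hub": set(leaves)}
star.update({leaf: {"hub"} for leaf in leaves})

print(largest_component(star))                   # intact network
print(largest_component(star, removed=["l1"]))   # one peripheral node removed
print(largest_component(star, removed=["hub"]))  # targeted removal of the hub
```

Removing a low-degree node leaves the rest connected, while removing the hub fragments the network into isolated nodes.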
16
BIOLOGICAL ASPECTS Multifunctionality –Grouped into functional units Stability Reason: Most of the interactions are between hubs and authorities
17
MODELLING GENETIC NETWORKS
18
TYPES OF GENETIC NETWORKS Categorized by data sources –Metabolic pathways –Gene expression arrays –Protein interactions –Gene interactions
19
INTERACTION MAPS High level perspective –Nodes: Genes or proteins –Edges: Presence of an interaction Data sources –Two-hybrid analysis –Fusion analysis –Chromosomal proximity –Phylogenetic analysis
20
GRAPH MATCHING
21
PROBLEM DEFINITION Attributed Relational Graph (ARG): $G = \{V, E, X\}$. $V = \{v_1, v_2, \ldots, v_n\}$: nodes. $E = \{e_1, e_2, \ldots, e_m\}$: edges. $X = \{x_1, x_2, \ldots, x_n\}$: attributes.
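One way to hold such a graph in memory is a small container with one attribute entry per node. The class below is a minimal sketch; the class name, method names, and attribute values are assumptions for illustration, and only the gene identifiers are borrowed from later slides.

```python
from dataclasses import dataclass, field


@dataclass
class ARG:
    """Attributed relational graph G = {V, E, X} (illustrative layout)."""
    nodes: list = field(default_factory=list)       # V
    edges: set = field(default_factory=set)         # E, stored as frozenset pairs
    attributes: dict = field(default_factory=dict)  # X, node -> attribute vector

    def add_node(self, name, attribute):
        self.nodes.append(name)
        self.attributes[name] = attribute

    def add_edge(self, u, v):
        if u not in self.attributes or v not in self.attributes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.add(frozenset((u, v)))

    def neighbors(self, node):
        return {w for e in self.edges if node in e for w in e if w != node}

    def degree(self, node):
        return len(self.neighbors(node))


# Attribute vectors here are placeholders, not real composition values.
g = ARG()
for name in ("MG098", "MG099", "MG100"):
    g.add_node(name, attribute=[0.0])
g.add_edge("MG098", "MG099")
g.add_edge("MG099", "MG100")
print(g.neighbors("MG099"), g.degree("MG099"))
```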
22
INEXACT SUBGRAPH MATCHING Allow for: mismatching attribute values, missing nodes, missing links. Also called error-correcting subgraph isomorphism. NP-complete.
23
SEARCH TECHNIQUES Cost function Pruning (Structure Constraints) Backtracking
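The three ingredients can be combined in a small branch-and-bound search. The sketch below is a generic illustration and not the presentation's algorithm: it assumes scalar node attributes, a fixed penalty for an unmatched node, and a fixed penalty for a G1 link whose mapped endpoints are not linked in G2. All names and penalty values are hypothetical.

```python
def inexact_match(nodes1, edges1, attr1, nodes2, edges2, attr2,
                  node_penalty=1.0, edge_penalty=0.5):
    """Map each G1 node to a distinct G2 node or to None, minimising total cost."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    best = {"cost": float("inf"), "mapping": None}

    def step_cost(u, target, mapping):
        # Cost function: node deletion, attribute mismatch, missing links.
        if target is None:
            return node_penalty
        cost = abs(attr1[u] - attr2[target])
        for v, w in mapping.items():
            if w is None:
                continue
            if frozenset((u, v)) in e1 and frozenset((target, w)) not in e2:
                cost += edge_penalty
        return cost

    def search(i, mapping, cost):
        if cost >= best["cost"]:
            return  # pruning: costs never decrease, so this branch cannot win
        if i == len(nodes1):
            best["cost"], best["mapping"] = cost, dict(mapping)
            return
        u = nodes1[i]
        used = {w for w in mapping.values() if w is not None}
        for target in [w for w in nodes2 if w not in used] + [None]:
            extra = step_cost(u, target, mapping)
            mapping[u] = target
            search(i + 1, mapping, cost + extra)
            del mapping[u]  # backtracking: undo the choice and try the next

    search(0, {}, 0.0)
    return best["cost"], best["mapping"]


# Hypothetical toy input: two three-node paths; G2 lacks the link y-z.
attr_a = {"a": 1.0, "b": 2.0, "c": 3.0}
attr_b = {"x": 1.0, "y": 2.0, "z": 3.0}
cost, mapping = inexact_match(["a", "b", "c"], [("a", "b"), ("b", "c")], attr_a,
                              ["x", "y", "z"], [("x", "y")], attr_b)
print(cost, mapping)
```

Because every cost term is non-negative, discarding a partial mapping as soon as it reaches the best complete cost found so far never loses the optimum.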
24
ATTRIBUTED GRAPH MATCHING TOOL
25
ATTRIBUTE MATCHING Amino acid sequence content –Composition: array of 20, percentage of each amino acid –Amino acids grouped into classes: array of 6 –Amino acid triples grouped into classes: array of 216 (6 x 6 x 6) Example sequence: MKVLNKNEL
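The three attribute vectors can be sketched as follows. The slide does not say which six classes were used, so the grouping in `ASSUMED_CLASSES` is a hypothetical physicochemical grouping chosen only for illustration. Overlapping triples and the sum-of-absolute-differences comparison are likewise assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: illustrative six-class grouping, not the one used in the talk.
ASSUMED_CLASSES = ["AVLIM", "FWY", "STNQ", "KRH", "DE", "CGP"]
CLASS_OF = {aa: i for i, group in enumerate(ASSUMED_CLASSES) for aa in group}


def _percentages(counts):
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


def composition(seq):
    """Array of 20: percentage of each amino acid (unknown letters ignored)."""
    counts = [0] * 20
    for aa in seq:
        idx = AMINO_ACIDS.find(aa)
        if idx >= 0:
            counts[idx] += 1
    return _percentages(counts)


def class_composition(seq):
    """Array of 6: percentage of residues falling in each class."""
    counts = [0] * 6
    for aa in seq:
        if aa in CLASS_OF:
            counts[CLASS_OF[aa]] += 1
    return _percentages(counts)


def triple_composition(seq):
    """Array of 216 (6 x 6 x 6): percentage of each class triple."""
    counts = [0] * 216
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        if all(aa in CLASS_OF for aa in window):
            a, b, c = (CLASS_OF[aa] for aa in window)
            counts[a * 36 + b * 6 + c] += 1
    return _percentages(counts)


def composition_difference(x, y):
    """ASSUMPTION: compare two vectors by summed absolute difference."""
    return sum(abs(p - q) for p, q in zip(x, y))


seq = "MKVLNKNEL"  # example sequence shown on the slide
print(composition(seq))
print(class_composition(seq))
print(sum(1 for v in triple_composition(seq) if v > 0))
```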
26
ATTRIBUTE MATCHING Difference in amino acid composition values of gene pairs for M. Genitalium and M. Pneumoniae. Score observations
27
STRUCTURAL CONSTRAINTS Effect of scale-free behaviour –Connectivity information: highly heterogeneous, thus start with the most connected node and work around it –Pruning strategy: comparability is determined by the power law
28
STRUCTURAL CONSTRAINTS Neighborhood connectivity –Choose the neighbor at the next stage Backtracking –Component by component –Go back to the neighbor with the most connectivity within the component
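One reading of this strategy is a visiting order that seeds each component at its most connected unvisited node and then always expands to the most connected neighbor already reachable. The sketch below implements that reading only. The tie-breaking by name and the toy graph are assumptions, and the actual tool may order nodes differently.

```python
def matching_order(adj):
    """Order nodes hub-first, expanding through neighbors, one component at a time."""
    order, seen = [], set()
    # Seeds are tried from highest to lowest degree (ties broken by name).
    for seed in sorted(adj, key=lambda n: (-len(adj[n]), n)):
        if seed in seen:
            continue
        seen.add(seed)
        order.append(seed)
        frontier = set(adj[seed]) - seen
        while frontier:
            # Next stage: the most connected neighbor of the visited region.
            nxt = min(frontier, key=lambda n: (-len(adj[n]), n))
            frontier.discard(nxt)
            seen.add(nxt)
            order.append(nxt)
            frontier |= set(adj[nxt]) - seen
    return order


# Hypothetical network: a hub h with neighbors a, b, c, a tail a-d,
# and a separate two-node component x-y.
net = {
    "h": {"a", "b", "c"},
    "a": {"h", "d"},
    "b": {"h"},
    "c": {"h"},
    "d": {"a"},
    "x": {"y"},
    "y": {"x"},
}
print(matching_order(net))
```

Placing hubs first means the most constrained nodes are matched early, so incompatible candidates are rejected before much of the search tree is expanded.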
29
TEST CASE Mycoplasma Genitalium: –smallest genome (470 ORFs) Mycoplasma Pneumoniae: –Very similar, superset (688 ORFs)
30
TEST CASE... Mycoplasma Genitalium: –232 nodes –211 links Mycoplasma Pneumoniae: –267 nodes –257 links Inputs: MGE links, MPN links, MGE synonyms, MPN synonyms, MGE amino acid sequences, MPN amino acid sequences
31
RESULTS MGE MPN
32
DISCOVERY OF MISSING DATA Missing link: the link between MPN632 and MPN637 is missing in our data but exists in the literature
33
DISCOVERY OF MISSING DATA Missing node with known COG
MPN236---MPN237---MPN238---MPN678
MG098----MG099----MG100----MG459
MG459 is the ortholog of MPN678
34
DISCOVERY OF MISSING DATA Missing node without known ortholog
35
CONCLUSION Large-scale genomics Interaction data captures system structure and dynamics Graph matching exploits the scale-free characteristics Novel interactions and genes can be identified
36
ACKNOWLEDGEMENT YASEMİN TÜRKELİ