Published by Ellen Skinner. Modified over 6 years ago.
1
BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data
Pavan Kumar A, Big Data Analytics Team, C-DAC KP, Bengaluru
2
Outline
- Pharmacogenomics
- Biological Data Repositories
- Graph Databases (What and Why)
- Big Data Platform for Pharmacogenomics Databases
- Neo4j & Pharmacogenomics Graph Database
- MapReduce: BLAST
- Web Application for querying and visualization
3
Pharmacogenomics
Pharmacogenomics = Pharma + Gene + Omics
Drug therapy consists of three major processes:
- Pharmacokinetic process
- Pharmacodynamic process
- Therapeutic process
4
Pharmacogenomics
Pharmacogenomics leads us to Personalized Medicine
What is Personalized Medicine, and why do we need it?
5
Pharmacogenomics : ADR
Errors in Health Care
Some facts about ADRs (adverse drug reactions):
- 1.6–41.4% of patients undergoing therapy are prone to ADRs
- $17–29 billion is spent annually on preventable ADRs
- In the US, ADRs are responsible for ~100,000 deaths annually
6
Pharmacogenomics : ADR
Factors for ADRs
- Genetic factors: Pharmacokinetics, Pharmacodynamics, SNPs (Single Nucleotide Polymorphisms)
- Environmental factors: Tobacco, Alcohol, Pollution, Diet habits and so on
- Physiological factors: Age, Gender, Disease state, Pregnancy, Starvation, Microbial Composition and so on
7
Pharmacogenomics : Pharmacokinetics
What the body does to the drug. This is captured by the movement of the drug into, through and out of the body, referred to as ABSORPTION, BIOAVAILABILITY, DISTRIBUTION, METABOLISM and EXCRETION.
8
Pharmacogenomics : Pharmacodynamics
What the drug does to the body. This is captured by actions like:
- Receptor binding
- Post-receptor effects
- Chemical interactions
9
Pharmacogenomics : SNPs
Single Nucleotide Polymorphisms
The most common type of genetic variation among people
10
Pharmacogenomics : SNPs
Diseases caused by SNPs
- Autoimmune Diseases
- Genetic Diseases
- Cancers
- Neurodegenerative Disorders
- Cardiovascular Diseases
- Neuro-psychological Disorders
- Digestive Disorders
- Addiction and Dependence
- Female-Specific Diseases
11
Pharmacogenomics : SNPs
12
Pharmacogenomics : Microbial Composition
Microbes in our body make up as many as 100 trillion cells (10-fold the number of human cells)
Image source: Digestive System Connection.htm
13
Pharmacogenomics
Finally…
- Protein structural variations
- SNPs
- Metabolomics
- Gene Expression
- Environmental factors: Chemicals, Diet, Tobacco, Alcohol, etc.
- Physiological factors: Age, Gender, Disease state, Pregnancy, Circadian rhythm, Starvation
- Microbial Composition
14
How Do We Study?
Data related to different domains are stored as Open Data Repositories. There are two ways to use them:
- Download the data (data format: XML, CSV or Excel)
- Query a database via a web application
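A downloaded export can be brought into one common record shape before any further processing. The following is a minimal sketch using only the Python standard library; the sample CSV and XML layouts, the column names and the `entry` tag are assumptions made for illustration, since each real repository publishes its own schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample exports; real repository files have their own schemas.
CSV_EXPORT = """gene,chemical,disease
SCN5A,sodium arsenite,Atrial Fibrillation
"""

XML_EXPORT = """<entries>
  <entry gene="SCN5A" organism="Human"/>
</entries>
"""


def read_csv_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_xml_records(text, tag="entry"):
    """Parse XML text and return the attributes of every <tag> element."""
    root = ET.fromstring(text)
    return [dict(element.attrib) for element in root.iter(tag)]


csv_records = read_csv_records(CSV_EXPORT)
xml_records = read_xml_records(XML_EXPORT)
print(csv_records)
print(xml_records)
```

Both readers return plain dictionaries, so records from different file formats can be handled by the same downstream code.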
15
Biological Data Repositories
The following are some Pharmacogenomics databases:
- PharmGKB: Pharmacogenomics Knowledge Base
- DrugBank: chemical, pharmaceutical and pharmacological data
- IGVdb: Indian Genome Variation Database
- CTD: Comparative Toxicogenomics Database
- STITCH (Search Tool for Interactions of Chemicals): chemical-protein interaction networks
- TTD: Therapeutic Target Database
- KEGG (Kyoto Encyclopedia of Genes and Genomes)
16
Integration of Biological Data Repositories
Data is spread across many repositories. The user has to navigate many pages, often across many websites. So there is a need to integrate all the data to get consolidated information in one place.
17
Interlinked Biological Data
- Databases
- Consortiums
- Tools
- Information from Articles, Literature
Pasha and Scaria et al., Omics for Personalized Medicine
18
Integrating Databases
Integrating many databases based on Internationalized Resource Identifiers (IRI)
Sample for SCN5A (Sodium channel protein type 5 subunit alpha)

Source databases:
| Database | Gene | Organism | Len | Interacting Chemical | Disease/Disorder | Pathway/s | Chrom_Start | Chrom_End |
|---|---|---|---|---|---|---|---|---|
| Uniprot | SCN5A | Human | 2016 | | | | | |
| CTD | SCN5A | | | sodium arsenite | Atrial Fibrillation | Developmental Biology | | |
| PharmGKB | SCN5A | | | | | | | |

Integrated record:
| Database | Gene | Organism | Len | Interacting Chemical | Disease/Disorder | Pathway/s | Chrom_Start | Chrom_End |
|---|---|---|---|---|---|---|---|---|
| My_DB | SCN5A | Human | 2016 | sodium arsenite | Atrial Fibrillation | Developmental Biology | | |
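The SCN5A sample can be reproduced with a small merge routine. This is a sketch under stated assumptions, not the BDPGx implementation: the gene symbol stands in for the shared IRI as the join key, the function name `integrate` and the "first source to supply a field wins" rule are hypothetical, and only values that appear in the sample are used.

```python
# Per-source records for one gene, restricted to the values shown in the sample.
# Field names mirror the column headers; empty cells are simply absent.
SOURCES = [
    {"Database": "Uniprot", "Gene": "SCN5A", "Organism": "Human", "Len": 2016},
    {
        "Database": "CTD",
        "Gene": "SCN5A",
        "Interacting Chemical": "sodium arsenite",
        "Disease/Disorder": "Atrial Fibrillation",
        "Pathway/s": "Developmental Biology",
    },
    {"Database": "PharmGKB", "Gene": "SCN5A"},
]


def integrate(records, key="Gene", target="My_DB"):
    """Merge records that share the same key into one consolidated row each.

    Assumption: the key (here the gene symbol) plays the role of the shared
    identifier; the first source that supplies a field keeps its value.
    """
    merged = {}
    for record in records:
        row = merged.setdefault(record[key], {"Database": target, key: record[key]})
        for field, value in record.items():
            if field in ("Database", key):
                continue  # the source name and the key are not copied as data
            row.setdefault(field, value)
    return list(merged.values())


integrated = integrate(SOURCES)
print(integrated)
```

A production pipeline would also need a policy for conflicting values between sources; the sketch sidesteps that by keeping the first value seen.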
19
NoSQL database family
20
Graph Databases
Graph databases belong to the NoSQL database family.
Data is represented pictorially in the form of Nodes and Edges (with or without properties)
Image Source :
21
Why Graph Databases?
Graph databases are well suited for interconnected data. Some of the use cases of graph databases:
- Fraud Detection
- Graph-Based Search
- Network and IT Operations
- Real-Time Recommendation Engines
- Social Networks
- Identity and Access Management
Ref : https://neo4j.com/use-cases/
22
Graph Databases : Properties
Two important properties of graph database technologies:
- Native Graph Storage (some graph databases are not native and instead serialize to an RDBMS)
- Native Graph Processing (a.k.a. "index-free adjacency"): connected nodes physically "point" to each other
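Index-free adjacency can be illustrated with an in-memory model in which each node holds direct references to its neighbours, so a traversal follows pointers instead of consulting a global index. This is a conceptual sketch only, not how any particular database lays out its storage; the `Node` class and the relationship type names are hypothetical.

```python
class Node:
    """A graph node that stores direct references to its neighbours."""

    def __init__(self, label, name):
        self.label = label
        self.name = name
        self.relationships = []  # outgoing (relationship type, target Node) pairs

    def connect(self, rel_type, target):
        """Add an outgoing relationship that points straight at the target node."""
        self.relationships.append((rel_type, target))


def neighbours(node, rel_type):
    """Follow the node's own pointers; no global lookup table is involved."""
    return [target.name for kind, target in node.relationships if kind == rel_type]


gene = Node("Gene", "SCN5A")
chemical = Node("Chemical", "sodium arsenite")
disease = Node("Disease", "Atrial Fibrillation")

gene.connect("INTERACTS_WITH", chemical)    # hypothetical relationship type
gene.connect("ASSOCIATED_WITH", disease)    # hypothetical relationship type

print(neighbours(gene, "INTERACTS_WITH"))  # ['sodium arsenite']
```

The cost of finding a node's neighbours here depends only on how many relationships that node has, not on the total size of the graph.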
23
Graph Databases
24
Graph Database : Neo4j
Most biological data is interconnected, so graph databases are well suited to it.
World's Leading Graph Database:
- Open source, with a welcoming UI
- Native graph storage with a native GPE (Graph Processing Engine)
- Easy to represent connected data
- Faster retrieval, traversal and navigation of highly connected data
- Represents semi-structured data
25
Graph Database : Neo4j
In Neo4j, Cypher Query Language (CQL) is used to create nodes, labels, edges and properties.
Example:
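One way to picture such a statement is to generate it from Python. The sketch below builds a parameterised Cypher `MERGE` statement that creates two labelled nodes, an edge between them and a property on that edge. The helper name `merge_relationship_query`, the `INTERACTS_WITH` relationship type and the `name` and `source` properties are assumptions for illustration; they are not taken from the BDPGx schema.

```python
def merge_relationship_query(start_label, rel_type, end_label):
    """Return a Cypher statement that creates two nodes and an edge between them.

    Labels and relationship types cannot be supplied as Cypher parameters,
    so they are validated and interpolated; property values use $parameters.
    """
    for identifier in (start_label, rel_type, end_label):
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    return (
        f"MERGE (a:{start_label} {{name: $start_name}}) "
        f"MERGE (b:{end_label} {{name: $end_name}}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r.source = $source "
        "RETURN a, r, b"
    )


# Hypothetical labels and relationship type, with values from the SCN5A sample.
query = merge_relationship_query("Gene", "INTERACTS_WITH", "Chemical")
params = {"start_name": "SCN5A", "end_name": "sodium arsenite", "source": "CTD"}
print(query)
print(params)
```

The `query` string and `params` dictionary could then be handed to a Neo4j client session to run against a live database; no connection is opened in this sketch.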
26
Pharmacogenomics Graph Database
27
Pharmacogenomics Graph Database
28
Pharmacogenomics Graph Database
29
Pharmacogenomics Graph Database
30
Pharmacogenomics Graph Database
31
BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data
Tools and Technologies:
32
BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data
BDPGx has:
- Nodes
- Properties
- Relationships
- 15 Relationship types
33
Conclusion
- Biological data is generated from various sources and is available in different formats
- Finding correlations among the available data can give better insights
- BDPGx gives researchers user-friendly access to the most appropriate information
34
THANK YOU