BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data


BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data Pavan Kumar A Big Data Analytics Team C-DAC KP, Bengaluru

Outline: Pharmacogenomics; Biological Data Repositories; Graph Databases (What and Why); Big Data Platform for Pharmacogenomics Databases; Neo4j & Pharmacogenomics Graph Database; MapReduce : BLAST; Web Application for Querying and Visualization

Pharmacogenomics: Pharmacogenomics = Pharma + Gene + Omics. Drug therapy consists of three major processes: the pharmacokinetic process, the pharmacodynamic process, and the therapeutic process.

Pharmacogenomics: Pharmacogenomics leads us to personalized medicine. What is personalized medicine, and why do we need it?

Pharmacogenomics : ADR (adverse drug reaction) errors in health care. Some facts about ADRs: 1.6–41.4% of patients undergoing therapy are prone to ADRs; $17–29 billion is spent annually on preventable ADRs; in the US, ADRs are responsible for ~100,000 deaths annually.

Pharmacogenomics : ADR. Factors for ADRs. Genetic factors: pharmacokinetics, pharmacodynamics, SNPs (single nucleotide polymorphisms). Environmental factors: tobacco, alcohol, pollution, dietary habits, and so on. Physiological factors: age, gender, disease state, pregnancy, starvation, microbial composition, and so on.

Pharmacogenomics : Pharmacokinetics. What the body does to the drug. This covers the movement of the drug into the body, through the body, and out of the body, referred to as ABSORPTION, BIOAVAILABILITY, DISTRIBUTION, METABOLISM and EXCRETION.

Pharmacogenomics : Pharmacodynamics. What the drug does to the body. This covers actions such as receptor binding, post-receptor effects, and chemical interactions.

Pharmacogenomics : SNPs. Single nucleotide polymorphisms are the most common type of genetic variation among people.

Pharmacogenomics : SNPs. Diseases caused by SNPs: Autoimmune Diseases, Genetic Diseases, Cancers, Neurodegenerative Disorders, Cardiovascular Diseases, Neuro-psychological, Digestive Disorders, Addiction, Dependence, Female-Specific Diseases.

Pharmacogenomics : SNPs

Pharmacogenomics : Microbial Composition. Microbes in our body make up to 100 trillion cells (10-fold the number of human cells). Image source: http://www.freegrab.net/Immune Digestive System Connection.htm

Pharmacogenomics: Finally… Factors that feed into pharmacogenomics: protein structural variations; SNPs; metabolomics; gene expression; environmental factors (chemicals, diet, tobacco, alcohol, etc.); physiological factors (age, gender, disease state, pregnancy, circadian rhythm, starvation); microbial composition.

How Do We Study? Data from different domains is stored in open data repositories. There are two ways to use it: download the data (formats: XML, CSV or Excel), or query a database via a web application.

Biological Data Repositories. The following are some pharmacogenomics databases:
PharmGKB – Pharmacogenomics Knowledge Base
DrugBank – chemical, pharmaceutical and pharmacological data
IGVdb – Indian Genome Variation Database
CTD – Comparative Toxicogenomics Database
STITCH (Search Tool for Interactions of Chemicals) – chemical-protein interaction networks
TTD – Therapeutic Target Database
KEGG – Kyoto Encyclopedia of Genes and Genomes

Integration of Biological Data Repositories. Data is spread across many repositories, so a user has to navigate many web pages or many websites. There is therefore a need to integrate the data and present consolidated information in one place.

Interlinked Biological Data: Databases; Consortiums; Tools; Information from Articles and Literature. Source: Pasha and Scaria et al. 2013, Omics for personalized medicine.

Integrating Databases. Integrating many databases based on Internationalized Resource Identifiers (IRI). Sample for SCN5A (sodium channel protein type 5 subunit alpha):

Source records:
Database | Gene | Organism | Len | Interacting Chemical | Disease/Disorder | Pathway/s | Chrom_Start | Chrom_End
Uniprot | SCN5A | Human | 2016 | - | - | - | - | -
CTD | SCN5A | - | - | sodium arsenite | Atrial Fibrillation | Developmental Biology | - | -
PharmGKB | SCN5A | - | - | - | - | - | 38564558 | 38666167

Integrated record:
Database | Gene | Organism | Len | Interacting Chemical | Disease/Disorder | Pathway/s | Chrom_Start | Chrom_End
My_DB | SCN5A | Human | 2016 | sodium arsenite | Atrial Fibrillation | Developmental Biology | 38564558 | 38666167
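
The SCN5A sample above can be sketched in Python as a merge of per-source records on a shared key. This is a minimal illustration, not the BDPGx implementation: the field names mirror the slide's columns, the `merge_records` helper is hypothetical, and joining on the gene symbol is an assumption standing in for the IRI-based linking the slide mentions.

```python
from typing import Dict, List

# Per-source rows for SCN5A, copied from the slide's sample.
SOURCES: List[Dict[str, str]] = [
    {"Database": "Uniprot", "Gene": "SCN5A", "Organism": "Human", "Len": "2016"},
    {"Database": "CTD", "Gene": "SCN5A",
     "Interacting Chemical": "sodium arsenite",
     "Disease/Disorder": "Atrial Fibrillation",
     "Pathway/s": "Developmental Biology"},
    {"Database": "PharmGKB", "Gene": "SCN5A",
     "Chrom_Start": "38564558", "Chrom_End": "38666167"},
]


def merge_records(rows, key="Gene", name="My_DB"):
    """Hypothetical helper: combine rows that share `key` into one record each."""
    merged = {}
    for row in rows:
        record = merged.setdefault(row[key], {"Database": name})
        for field, value in row.items():
            if field == "Database":
                continue  # the source name is replaced by the integrated name
            # Keep the first value seen; later sources only fill gaps.
            record.setdefault(field, value)
    return merged


integrated = merge_records(SOURCES)
print(integrated["SCN5A"])
```

Keeping the first value seen for each field is a deliberate simplification; a real integration would also need a rule for resolving conflicting values between sources.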

NoSQL database family

Graph Databases. Graph databases belong to the NoSQL database family. They represent data pictorially in the form of nodes and edges (with or without properties). Image source: https://www.3pillarglobal.com/insights/exploring-the-different-types-of-nosql-databases

Why Graph Databases? Graph databases are well suited for interconnected data. Some use cases of graph databases: Fraud Detection; Graph-Based Search; Network and IT Operations; Real-Time Recommendation Engines; Social Networks; Identity and Access Management. Ref: https://neo4j.com/use-cases/

Graph Databases : Properties. Two important properties of graph database technologies: (1) Native graph storage. Not every product has it; some serialize the graph to an RDBMS. (2) Native graph processing (a.k.a. "index-free adjacency"). Connected nodes physically "point" to each other.
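
The idea behind index-free adjacency can be illustrated with a small in-memory Python sketch in which each node holds direct references to its neighbours, so traversal follows pointers rather than looking anything up in a global index. This is a toy model only, not how Neo4j stores data on disk; the `Node` class, the `reachable` function, and the relationship names are hypothetical, while the example entities come from the SCN5A sample earlier in the deck.

```python
class Node:
    """A graph node that keeps direct references to its neighbours."""

    def __init__(self, label, name):
        self.label = label
        self.name = name
        self.neighbors = []  # list of (relationship_type, Node) pairs

    def connect(self, rel_type, other):
        # Store a reference to the other node itself, not a key to look up.
        self.neighbors.append((rel_type, other))


def reachable(start, max_hops):
    """Return names of nodes within `max_hops` of `start`, by pointer chasing."""
    seen = {start.name}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _rel_type, other in node.neighbors:
                if other.name not in seen:
                    seen.add(other.name)
                    next_frontier.append(other)
        frontier = next_frontier
    seen.discard(start.name)
    return seen


chemical = Node("Chemical", "sodium arsenite")
gene = Node("Gene", "SCN5A")
disease = Node("Disease", "Atrial Fibrillation")
chemical.connect("INTERACTS_WITH", gene)    # hypothetical relationship name
gene.connect("ASSOCIATED_WITH", disease)    # hypothetical relationship name

print(reachable(chemical, 1))
print(reachable(chemical, 2))
```

Each hop costs time proportional to the number of neighbours of the nodes visited, independent of the total graph size, which is the property the slide attributes to native graph processing.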

Graph Databases

Graph Database : Neo4j. Most biological data is interconnected, so graph databases are well suited to it. Neo4j is billed as the world's leading graph database. Features: open source with a welcoming UI; native graph storage with a native GPE (Graph Processing Engine); easy representation of connected data; faster retrieval, traversal, and navigation of highly connected data; support for semi-structured data.

Graph Database : Neo4j. In Neo4j, the Cypher Query Language (CQL) is used to create nodes, labels, edges, and properties.
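
As a rough illustration of what such a Cypher statement might look like, the Python sketch below builds a parameterized query that creates two labelled nodes with properties and an edge between them. The labels (`Gene`, `Chemical`), the relationship type (`INTERACTS_WITH`), and the property names are assumptions for illustration and are not taken from the BDPGx schema. The sketch only builds and checks the query text; sending it to a Neo4j server through a driver is left out.

```python
import re


def gene_chemical_statement():
    """Return a parameterized Cypher statement and its parameters."""
    # MERGE creates each node or relationship only if it does not exist yet.
    query = (
        "MERGE (g:Gene {symbol: $gene}) "
        "MERGE (c:Chemical {name: $chemical}) "
        "MERGE (c)-[r:INTERACTS_WITH {source: $source}]->(g) "
        "RETURN g, c, r"
    )
    params = {"gene": "SCN5A", "chemical": "sodium arsenite", "source": "CTD"}
    return query, params


def placeholders(query):
    """Collect the $name placeholders used in a Cypher query string."""
    return set(re.findall(r"\$(\w+)", query))


query, params = gene_chemical_statement()
print(query)
print(params)
```

Passing values as parameters rather than formatting them into the query string keeps the statement reusable for every gene and chemical pair and avoids quoting problems.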

Pharmacogenomics Graph Database


BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data Tools and Technologies:

BDPGx - A Big Data Platform for Graph-based Pharmacogenomics Data. BDPGx has 4,107,474 nodes; 3,994,226 properties; 46,840,614 relationships; 15 relationship types.
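
A quick calculation on these counts gives a feel for how densely connected the graph is. The Python sketch below uses only the four figures from the slide; the derived averages are simple ratios, and the deck itself does not report them.

```python
# Counts as reported on the slide.
NODES = 4_107_474
PROPERTIES = 3_994_226
RELATIONSHIPS = 46_840_614
RELATIONSHIP_TYPES = 15

# Simple ratios; these say nothing about how the values are distributed.
rels_per_node = RELATIONSHIPS / NODES
props_per_node = PROPERTIES / NODES
rels_per_type = RELATIONSHIPS / RELATIONSHIP_TYPES

print(f"{rels_per_node:.2f} relationships per node on average")
print(f"{props_per_node:.2f} properties per node on average")
print(f"{rels_per_type:,.0f} relationships per relationship type on average")
```

An average of about 11.4 relationships per node is consistent with the deck's point that the data is highly interconnected.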

Conclusion. Biological data is generated from various sources and is available in different formats. Finding correlations among the available data can give better insights. BDPGx gives researchers user-friendly access to the most appropriate information.

THANK YOU