Canadian Bioinformatics Workshops www.bioinformatics.ca.

Module 3 Pathway and Network Analysis

Classes of Gene Set Analysis
Khatri et al. PLOS Comp Bio. 8:
– DAVID
– GSEA
– Reactome FI network
– PARADIGM

Limitations of Gene Set Enrichment Analysis
– Many possible gene sets: diseases, molecular function, biological process, cellular compartment, pathways...
– Gene sets overlap heavily, so lists of enriched gene sets have to be sorted through.
– “Bags of genes” obscure the regulatory relationships among them.
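One way to see how heavily enriched gene sets overlap is to compute a pairwise Jaccard index and flag near-duplicate sets. This is a minimal sketch: the function names, the 0.5 threshold, and the gene set memberships are illustrative assumptions, not taken from any database named in these slides.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index: shared genes divided by genes present in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlapping_pairs(gene_sets, threshold=0.5):
    """Return (name1, name2, jaccard) for every pair at or above the threshold."""
    pairs = []
    for (n1, g1), (n2, g2) in combinations(sorted(gene_sets.items()), 2):
        score = jaccard(g1, g2)
        if score >= threshold:
            pairs.append((n1, n2, score))
    return pairs

# Hypothetical enriched gene sets (memberships are illustrative only).
enriched = {
    "cell_cycle": {"CDK1", "CCNB1", "AURKB", "BUB1"},
    "mitosis": {"CDK1", "CCNB1", "AURKB", "PLK1"},
    "wnt_signaling": {"CTNNB1", "APC", "AXIN1"},
}
redundant = overlapping_pairs(enriched)
print(redundant)
```

Pairs reported here are candidates for merging or for reporting only the more specific set.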

Pathway Databases
Advantages:
– Usually curated.
– Biochemical view of biological processes.
– Cause and effect captured.
– Human-interpretable visualizations.
Disadvantages:
– Sparse coverage of genome.
– Different databases disagree on boundaries of pathways.

KEGG

Reactome
Hand-curated pathways in human.
Rigorous curation standards: every reaction traceable to primary literature.
Automatically projected pathways to non-human species.
22 species; 1112 human pathways; 5078 proteins.
Features:
– Google-map style reaction diagrams with overlays;
– Find pathways containing your gene list;
– Calculate gene overrepresentation in pathways;
– Find corresponding pathways in other species.
Open access.

Reactome

Pathway Commons

Pathway Colorization
Main feature offered by all pathway databases.
– Upload a gene list.
– The database calculates an enrichment score for each pathway and displays a ranked list.
– Browse into pathways of interest; download colorized pictures.
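The enrichment score is not specified on the slide. A common choice for gene overrepresentation is a one-sided hypergeometric test, sketched below under that assumption. The function names, the toy pathways, and the universe size are hypothetical, and no particular database is claimed to compute its score this way.

```python
from math import comb

def hypergeom_pvalue(universe_size, pathway_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe in which pathway_size genes belong to the pathway."""
    total = comb(universe_size, list_size)
    upper = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def rank_pathways(gene_list, pathways, universe_size):
    """Score every pathway against the uploaded list; smallest p-value first.
    Assumes all genes in gene_list belong to the universe."""
    genes = set(gene_list)
    results = []
    for name, members in pathways.items():
        members = set(members)
        hits = genes & members
        p = hypergeom_pvalue(universe_size, len(members), len(genes), len(hits))
        results.append((name, len(hits), p))
    return sorted(results, key=lambda r: r[2])

# Toy input: gene and pathway names are placeholders.
toy_pathways = {"pathway_A": {"g1", "g2", "g3"}, "pathway_B": {"g7", "g8", "g9"}}
ranked = rank_pathways(["g1", "g2", "g4"], toy_pathways, universe_size=20)
for name, n_hits, p in ranked:
    print(name, n_hits, round(p, 4))
```

In practice the p-values would also need correcting for the number of pathways tested.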

Example from Reactome

Networks
Pathways capture only the “well understood” portion of biology.
Networks cover less well understood relationships:
– Genetic interactions
– Physical interactions
– Coexpression
– GO term sharing
– Adjacency in pathways

Network Databases
Can be built automatically or via curation.
Popular sources of curated networks:
– BioGRID: curated interactions from literature; 529,000 genes, 167,000 interactions.
– IntAct: curated interactions from literature; 60,000 genes, 203,000 interactions.
– MINT: curated interactions from literature; 31,000 genes, 83,000 interactions.

Uncurated Interaction Sources
Text mining approaches:
– Computationally extract gene relationships from text, such as PubMed abstracts.
– Much faster than hand curation.
– Not perfect: gene names are hard to recognize (is hedgehog a gene or a species?), and natural language processing is difficult.
– Popular resources: iHOP, PubGene.

Uncurated Interaction Sources
Experimental techniques:
– Yeast two-hybrid (Y2H) protein interactions.
– Protein complex pulldowns/mass spec.
– Genetic screens, such as synthetic lethals and enhancer/suppressor screens.
Not perfect:
– Y2H takes proteins out of their natural context; physical interaction != biological interaction.
– Protein complex pulldowns are plagued by “sticky” proteins such as actin.
– Genetic screens are highly sensitive to genetic background (“network effects”).

Integrative Approaches
Combine multiple sources of evidence to increase accuracy.
Simple example:
– “Party hubs” are Y2H interactions that have been filtered for those partners that share the same temporal-spatial location.
Complex example:
– Combine multiple sources of curated and uncurated evidence.

Example: Reactome FI Network
Curated Human Data – Version proteins, 4166 reactions, 3870 complexes, 1112 pathways.
Only ~25% of genome!
Goal: add a “corona” of uncurated interaction data around a scaffold of curated pathway data.

Expanding Reactome’s Coverage
Curated pathways → annotated functional interactions.
Uncurated information → naïve Bayes classifier → predicted functional interactions.
Uncurated information:
– Human PPI
– PPI inferred from fly, worm & yeast
– PPI from text mining
– Gene co-expression
– GO annotation on biological processes
– Protein domain-domain interactions
Other sources labelled on the diagram: CellMap, TRED, GeneWays.
Wu et al. (2010) Genome Biology
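The idea of scoring a protein pair from several weak evidence types with a naïve Bayes classifier can be sketched as follows. This is not the classifier from Wu et al. (2010). The feature names, the training pairs, and the add-one smoothing are assumptions made for illustration, and both classes are assumed to be present in the training data.

```python
from math import exp, log

# Hypothetical binary evidence types for a protein pair.
FEATURES = ["human_ppi", "ortholog_ppi", "coexpression", "shared_go_bp"]

def train(pairs):
    """pairs: list of (features_dict, is_fi). Returns class priors and
    per-class feature probabilities with add-one (Laplace) smoothing."""
    counts = {True: 0, False: 0}
    on = {True: {f: 0 for f in FEATURES}, False: {f: 0 for f in FEATURES}}
    for feats, label in pairs:
        counts[label] += 1
        for f in FEATURES:
            if feats.get(f):
                on[label][f] += 1
    prior = {c: counts[c] / len(pairs) for c in counts}
    prob = {c: {f: (on[c][f] + 1) / (counts[c] + 2) for f in FEATURES} for c in counts}
    return prior, prob

def posterior_fi(feats, prior, prob):
    """P(functional interaction | evidence), assuming independent features."""
    logp = {}
    for c in (True, False):
        total = log(prior[c])
        for f in FEATURES:
            p = prob[c][f]
            total += log(p if feats.get(f) else 1 - p)
        logp[c] = total
    m = max(logp.values())
    z = sum(exp(v - m) for v in logp.values())
    return exp(logp[True] - m) / z

# Toy training set: positives stand in for pairs annotated in curated pathways.
training = [
    ({"human_ppi": 1, "coexpression": 1}, True),
    ({"human_ppi": 1, "ortholog_ppi": 1, "shared_go_bp": 1}, True),
    ({"coexpression": 1, "shared_go_bp": 1}, True),
    ({}, False),
    ({"coexpression": 1}, False),
    ({}, False),
]
prior, prob = train(training)
print(posterior_fi({f: 1 for f in FEATURES}, prior, prob))
print(posterior_fi({}, prior, prob))
```

A pair would be added to the predicted set when its posterior exceeds a chosen cutoff; a stricter cutoff trades missed interactions for fewer false positives.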

Integrated Functional Interaction (FI) Network
– 10,956 proteins (9,542 genes).
– 209,988 FIs.
– ~50% coverage of genome.
– False-positive rate < 1%.
– False-negative rate ~80%.
Figure: 5% of the network is shown.

Active Network Extraction & Analysis
Workflow:
– Start from the Reactome Functional Interaction network.
– Extract mutated, overexpressed, underexpressed, expanded/deleted genes.
– Add linker genes to form the disease subnetwork.
– Apply community clustering algorithms to obtain disease “modules”.
Uses: disease gene prediction, sample classification, hypothesis generation.
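The extraction workflow can be sketched on a toy interaction list. Two simplifications are assumed here: a linker is defined as an unaltered gene adjacent to at least two altered genes, and connected components stand in for a real community clustering algorithm. The edges and the altered gene list are illustrative and are not taken from the Reactome FI network.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency map from a list of (gene, gene) interactions."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def extract_subnetwork(edges, altered, add_linkers=True):
    """Keep altered genes and, optionally, linker genes (assumed definition:
    unaltered genes adjacent to at least two altered genes)."""
    adj = build_adjacency(edges)
    altered = set(altered)
    nodes = set(altered)
    if add_linkers:
        for gene, neighbours in adj.items():
            if gene not in altered and len(neighbours & altered) >= 2:
                nodes.add(gene)
    return {g: adj[g] & nodes for g in nodes}

def modules(subnet):
    """Connected components, a simple stand-in for community clustering."""
    seen, result = set(), []
    for start in sorted(subnet):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            g = stack.pop()
            if g in component:
                continue
            component.add(g)
            stack.extend(subnet[g] - component)
        seen |= component
        result.append(component)
    return result

# Toy network and altered gene list (illustrative only).
edges = [("TP53", "MDM2"), ("MDM2", "CDKN2A"), ("KRAS", "RAF1"),
         ("RAF1", "MAP2K1"), ("SMAD4", "TGFBR2"), ("EGFR", "GRB2")]
altered = {"TP53", "CDKN2A", "KRAS", "MAP2K1", "SMAD4"}
subnet = extract_subnetwork(edges, altered)
found = modules(subnet)
print(found)
```

Without linkers, none of the altered genes in this toy example would be connected to each other, which is why the linker step matters before clustering.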

Pancreatic Cancer Module Map (43 Cases)
Christina Yung
Modules:
– p53, SMAD, TGFβ, TNF signaling
– KRAS, MAPK signaling
– Integrin signaling
– Heterotrimeric G-protein signaling
– Rho GTPase signaling
– Transcription & translation
– Cell cycle
– Wnt & Cadherin signaling
– Hedgehog signaling
– Transcription
– Zinc fingers
– Ca2+ signaling
Non-silent mutations: blue, in primary tumour only; green, in xenograft only; red, in primary & xenograft.

Glioblastoma stem cells (GSC)
In collaboration with Peter Dirks lab (SickKids). Irina Kalatskaya

Glioblastoma Stem Cell Network
Network labels: collagen, GPCR, Beta-catenin, complement, IL-1, BMP, TP53/RB1/JUN/SP1, CREB1, FGF, small Rho proteins, ribosomal proteins, HOX, GLI2.

Network Classification of Disease
Traditional: associate active genes with clinical behavior to create gene-based prognostic signatures.
Limitation: too many genes reduce statistical power.
New idea: look for associations between active modules and clinical behavior.

Using the Reactome FI Network to Find a Breast Cancer Survival Signature
Guanming Wu
– Expression analysis of tumours from multiple patients
– Disease module map
– Principal component analysis on modules
– Correlate principal components with clinical parameters
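The last two steps, principal component analysis on a module and correlation with a clinical parameter, can be sketched in plain Python. The first principal component is obtained by power iteration, and a Pearson correlation against survival time stands in for whatever survival statistic was actually used. The expression values and survival times are invented for illustration, and the sketch assumes the module's expression is not constant across samples.

```python
from math import sqrt

def first_pc_scores(expr, iterations=200):
    """expr: one row per gene in the module, one value per sample.
    Returns each sample's score along the first principal component."""
    x = []
    for row in expr:
        mean = sum(row) / len(row)
        x.append([v - mean for v in row])  # centre each gene across samples
    n_genes, n_samples = len(x), len(x[0])
    w = [1.0] * n_genes  # gene loadings, refined by power iteration
    for _ in range(iterations):
        scores = [sum(w[g] * x[g][s] for g in range(n_genes)) for s in range(n_samples)]
        w = [sum(x[g][s] * scores[s] for s in range(n_samples)) for g in range(n_genes)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(w[g] * x[g][s] for g in range(n_genes)) for s in range(n_samples)]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Invented data: three module genes measured in five tumours.
module_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.2, 1.9, 3.1, 4.2, 4.8],
    [0.8, 2.2, 2.9, 3.9, 5.1],
]
survival_months = [60, 48, 30, 20, 10]
scores = first_pc_scores(module_expr)
r = pearson(scores, survival_months)
print(round(r, 3))
```

The sign of a principal component is arbitrary, so the magnitude of the correlation is what carries the information. Survival data with censoring would call for a proper survival model rather than a plain correlation.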

Module-Based Signatures of Breast Cancer Survival
NEJM: van de Vijver et al 2002
– 295 samples, ~12,000 genes
– Event: death
GSE4922: Ivshina et al. Cancer Res
– 249 samples, ~13,000 genes
– Event: recurrence or death

Building the Network
Built based on the NEJM data set.
– 27 modules selected based on size cutoff 7 and average correlation cutoff
Validated using GSE4922.

PC Analysis Identifies Module 2 as Explaining Much of Variation in Survival

Same Signature Predicts Survival in Independent Data Set

And Three More Data Sets as Well…

Module 2: Kinetochore + Aurora B Signaling

Integration of Multiple Data Sets
Experimental samples can be interrogated many ways:
– RNA expression
– Genome/exome sequencing
– Copy number changes/loss of heterozygosity
– shRNA knockdown screens
Integrate multiple functional data types using network/pathway relationships?

PARADIGM
Vaske, Benz et al. Bioinformatics 26:i

Vaske, Benz et al. Bioinformatics 26:i
Factor graph: directed graph connecting genes; each gene is activated, inactivated, or unchanged in a single patient.
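To make the three-state factor graph idea concrete, the sketch below computes gene-state marginals for a two-gene toy graph by brute-force enumeration. It is not PARADIGM's model or code. The factor shapes, the 0.8 strength, and the gene names are assumptions chosen only to show how evidence on one gene shifts belief about a gene it activates.

```python
from itertools import product

STATES = (-1, 0, 1)  # inactivated, unchanged, activated

def marginals(genes, factors):
    """Brute-force marginals: weight every joint assignment by the product
    of its factors, then normalise. Only feasible for a handful of genes."""
    totals = {g: {s: 0.0 for s in STATES} for g in genes}
    z = 0.0
    for values in product(STATES, repeat=len(genes)):
        assignment = dict(zip(genes, values))
        weight = 1.0
        for factor in factors:
            weight *= factor(assignment)
        z += weight
        for g in genes:
            totals[g][assignment[g]] += weight
    return {g: {s: totals[g][s] / z for s in STATES} for g in genes}

def evidence(gene, observed, strength=0.8):
    """Factor favouring the state observed for one gene in one patient."""
    other = (1 - strength) / 2
    return lambda a: strength if a[gene] == observed else other

def activates(source, target, strength=0.8):
    """Factor favouring the target sharing the state of its activator."""
    other = (1 - strength) / 2
    return lambda a: strength if a[target] == a[source] else other

# Hypothetical patient: gene A is observed as activated, and A activates B.
beliefs = marginals(["A", "B"], [evidence("A", 1), activates("A", "B")])
print(beliefs["B"])
```

Enumeration grows as 3 to the power of the number of genes, so real pathway-scale graphs need an approximate inference method such as belief propagation.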

PARADIGM: The Bad News
Distributed in source code form only:
– Requires several third-party math/graph libraries (all open source).
– I have not gotten it to compile yet!
No documentation.
No repositories of formatted pathway data.
No examples of converting experimental data into input files.

Take Home Messages
Pathway/network analysis can provide context to altered gene lists.
Pathway/network analysis methods differ greatly in complexity, power, and usability:
– SIMPLE: Pathway diagram colorization
– MODERATE: Reactome FI network extraction
– COMPLEX: PARADIGM
This type of analysis is a work in progress, but it promises the ability to integrate data across many dimensions.

URLs
KEGG –
Biocarta –
WikiPathways –
Reactome –
NCI/PID – pid.nci.nih.gov
Ingenuity –
Pathway Commons –
PARADIGM –
BioGRID –
IntAct –
MINT – mint.bio.uniroma2.it
iHOP –
PubGene –
