Distinguishing Regulators of Biomolecular Pathways Mentor: Dr. Xiwei Wu City of Hope Sean Caonguyen SoCalBSI 8/21/08.

Presentation transcript:

Distinguishing Regulators of Biomolecular Pathways Mentor: Dr. Xiwei Wu City of Hope Sean Caonguyen SoCalBSI 8/21/08

Expression Pattern Analysis
- Microarray technology is a powerful tool for investigating cellular activity at different levels
- DNA microarrays can be used to identify genetic "signatures" for disease
Pan et al. (2005)

A Traditional Approach to DNA Microarray Analysis
Individual Gene Analysis
[Diagram: Gene Expression Data -> Threshold -> Genes Selected -> Biological Interpretation]
- Two-step process
- Selects genes using an arbitrarily chosen cut-off
- From the selected genes, one infers the biological meaning of the gene expression data
Jiang Z and Gentleman R. (2006) and Nam D, et al. (2007)
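To illustrate why an arbitrary cut-off matters, here is a minimal sketch of threshold-based gene selection. The gene names, log2 fold-change values, and the two cut-offs are all made up for illustration and do not come from the presentation.

```python
# Hypothetical log2 fold changes (gene names and values are invented).
LOG2_FOLD_CHANGE = {"A": 2.1, "B": 0.9, "C": -0.2, "D": 0.8, "E": -1.6, "F": 0.1}


def select_genes(fold_changes, cutoff):
    """Return genes whose absolute log2 fold change meets the cut-off."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= cutoff)


# Two equally defensible cut-offs give different gene lists,
# and therefore potentially different biological interpretations.
strict = select_genes(LOG2_FOLD_CHANGE, cutoff=1.0)
lenient = select_genes(LOG2_FOLD_CHANGE, cutoff=0.75)
print("cut-off 1.00:", strict)
print("cut-off 0.75:", lenient)
```

With these invented values, genes B and D are selected only under the more lenient cut-off, which is the kind of threshold sensitivity the slide refers to.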

Emerging Approach to DNA Microarray Analysis
Gene Set Analysis (GSA)
[Diagram: Gene Expression Data + Gene Set Database -> Biological Interpretation (assess gene set directly)]
- Rank all genes based on their phenotype association
- Calculate a maximal enrichment score for each gene set
- Rank the gene sets by score for biological interpretation
Jiang Z and Gentleman R. (2006) and Nam D, et al. (2007)
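The three steps above can be sketched with a simplified, unweighted running-sum score. This is an assumption made for illustration: it is not the weighted statistic of Subramanian et al., and the slides do not say which scoring method the project used. The ranked gene list and the gene sets are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximal deviation from zero of an unweighted running sum.

    Walk down the ranked list, stepping up for genes in the set and
    down for genes outside it, and keep the most extreme value reached.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Step 1: genes already ranked by phenotype association (hypothetical order).
RANKED = ["A", "B", "D", "F", "C", "G", "E", "P"]
# Hypothetical gene sets standing in for a gene set database.
GENE_SETS = {"top": {"A", "B", "D"}, "mixed": {"B", "G"}, "bottom": {"E", "P"}}

# Step 2: one maximal enrichment score per gene set.
scores = {name: enrichment_score(RANKED, genes) for name, genes in GENE_SETS.items()}
# Step 3: rank the gene sets by score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

No single-gene threshold is applied here; every gene contributes to the score through its position in the ranking.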

Biological Significance of Gene Set Analyses
- Ability to identify subtle changes in gene expression that are undetectable by traditional approaches
- No arbitrary threshold
- Generates results that are easier to interpret

Current Problem with GSA
- Reduces a gene set to a list of names
- No distinction between up-regulation and down-regulation
- Directionality is lost
[Figure: example pathway with nodes A, B, C, D, E, F, G and P, shown under two patterns of up-regulated and down-regulated genes. One pattern suggests that the pathway is activated (higher probability); the other suggests a lower probability of pathway activation.]

Enriched Gene Set Analysis
[Diagram: Gene Expression Data + Gene Set Database -> Curated Analysis -> Biological Interpretation (assess gene set directly)]

Useful Tools for the Pathway Analysis Program
National Cancer Institute (NCI) Pathway Interaction Database
- Contains information about molecular interactions and biological processes in signaling pathways
- Focuses on cancer research in human cells
- Can be searched by biomolecule or process, or browsed by viewing pathways
- Data formats
  - Graphics: SVG or GIF
  - Text: XML or BioPAX

Segment of the Phosphoinositide 3-Kinase (PI3K) Signaling Pathway
[Figure: non-lipid kinase pathway of Class IB PI3K, shown with a key to icons and the corresponding XML script]

Project Objective
- Create a program to distinguish the activators and inhibitors in each signaling pathway
- Requires extensive use of an XML parser in Python

Approach to Project
1. Identify all the elements in the pathway
2. Record the pairwise interactions
   - Linking each interaction
3. Determine the role of each molecule
   - Finding each leaf node
   - Using a traceback method
[Figure: example pathway with nodes A, B, C, D, E, F, G and P]

1) Identify the Elements in the Pathway
- Properly assign each ID to reference a "preferred symbol"
- Locate each interaction ID
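As a rough sketch of this step, the snippet below uses Python's standard xml.etree.ElementTree module to map molecule IDs to preferred symbols and to collect the components of each interaction. The XML layout, element names, and attribute names here are hypothetical and heavily simplified; the actual Pathway Interaction Database schema may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified pathway XML (NOT the real database schema).
PATHWAY_XML = """
<Pathway>
  <MoleculeList>
    <Molecule id="m1"><Name type="preferred symbol" value="A"/></Molecule>
    <Molecule id="m2">
      <Name type="alias" value="B-alias"/>
      <Name type="preferred symbol" value="B"/>
    </Molecule>
    <Molecule id="m3"><Name type="preferred symbol" value="G"/></Molecule>
  </MoleculeList>
  <InteractionList>
    <Interaction id="i1">
      <Component role="activator" molecule_idref="m1"/>
      <Component role="output" molecule_idref="m2"/>
    </Interaction>
    <Interaction id="i2">
      <Component role="activator" molecule_idref="m2"/>
      <Component role="output" molecule_idref="m3"/>
    </Interaction>
  </InteractionList>
</Pathway>
"""

root = ET.fromstring(PATHWAY_XML.strip())

# Assign each molecule ID to its preferred symbol, ignoring aliases.
symbols = {}
for molecule in root.iter("Molecule"):
    for name in molecule.findall("Name"):
        if name.get("type") == "preferred symbol":
            symbols[molecule.get("id")] = name.get("value")

# Locate each interaction ID and resolve its components to symbols.
interactions = {}
for interaction in root.iter("Interaction"):
    interactions[interaction.get("id")] = [
        (symbols[c.get("molecule_idref")], c.get("role"))
        for c in interaction.findall("Component")
    ]

print(symbols)
print(interactions)
```

Resolving IDs to symbols once, up front, keeps the later steps free of database-specific identifiers.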

2) Record the Pairwise Interactions
How can we store each interaction?
- Memory efficient
- Easy extraction of data
Answer: a sparse matrix
[Figure: example pathway with nodes A, B, C, D, E, F, G and P]

Sparse Matrix Initialization
[Figure: the example pathway (nodes A, B, C, D, E, F, G and P) next to its sparse matrix, with regulators A to P along one axis and outputs A to P along the other]
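One way to sketch such a regulator-by-output sparse matrix in Python is a dictionary of dictionaries that stores only the interactions that exist. The edge list below is hypothetical: the figure's actual connections are not recoverable from the transcript, so these edges were chosen only to be consistent with the role table on the next slide. The +1/-1 encoding for activation and inhibition is also an assumption.

```python
from collections import defaultdict

# Hypothetical edges as (regulator, output, sign): +1 activates, -1 inhibits.
EDGES = [
    ("A", "B", +1), ("B", "D", +1), ("B", "G", +1), ("E", "G", -1),
    ("D", "F", +1), ("C", "F", +1), ("F", "P", +1),
]
NODES = ["A", "B", "C", "D", "E", "F", "G", "P"]


def build_sparse_matrix(edges):
    """Store only non-zero cells: matrix[regulator][output] = sign."""
    matrix = defaultdict(dict)
    for regulator, output, sign in edges:
        matrix[regulator][output] = sign
    return matrix


matrix = build_sparse_matrix(EDGES)
stored_cells = sum(len(row) for row in matrix.values())
dense_cells = len(NODES) ** 2

print("E -> G:", matrix["E"]["G"])
print("A -> P:", matrix["A"].get("P", 0))  # absent cell reads as zero
print(stored_cells, "cells stored instead of", dense_cells)
```

Both goals from the slide are visible here: only 7 of the 64 possible cells are stored, and looking up any regulator-output pair is a direct dictionary access.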

3) Determine the Role of Each Molecule
- Identify each leaf node
- Trace back from each leaf node
[Figure: the example pathway (nodes A, B, C, D, E, F, G and P) and its regulator-by-output sparse matrix]

Leaf node   Activators      Inhibitors
P           A, B, C, D, F   none
G           A, B            E
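A traceback of this kind could look like the sketch below. It assumes a hypothetical edge list (chosen to reproduce the table above, since the figure's real edges are not in the transcript) and assumes that a molecule's net role is the product of the signs along its path to the leaf. The slides do not spell out the author's exact rule, so treat this as one plausible reading.

```python
# Hypothetical edges as (regulator, output, sign): +1 activates, -1 inhibits.
EDGES = [
    ("A", "B", +1), ("B", "D", +1), ("B", "G", +1), ("E", "G", -1),
    ("D", "F", +1), ("C", "F", +1), ("F", "P", +1),
]


def leaf_nodes(edges):
    """Leaf nodes are outputs that never act as regulators."""
    regulators = {r for r, _, _ in edges}
    outputs = {o for _, o, _ in edges}
    return sorted(outputs - regulators)


def trace_roles(edges, leaf):
    """Walk upstream from a leaf, multiplying edge signs along the way."""
    incoming = {}
    for regulator, output, sign in edges:
        incoming.setdefault(output, []).append((regulator, sign))
    roles = {"activator": set(), "inhibitor": set()}
    seen = set()  # (node, net sign) pairs already expanded; guards against cycles
    stack = [(leaf, +1)]
    while stack:
        node, sign = stack.pop()
        for regulator, edge_sign in incoming.get(node, []):
            net = sign * edge_sign
            if (regulator, net) in seen:
                continue
            seen.add((regulator, net))
            roles["activator" if net > 0 else "inhibitor"].add(regulator)
            stack.append((regulator, net))
    return roles


leaves = leaf_nodes(EDGES)
roles_by_leaf = {leaf: trace_roles(EDGES, leaf) for leaf in leaves}
for leaf in leaves:
    print(leaf, roles_by_leaf[leaf])
```

Under this rule a molecule reached through paths of opposite net sign would be listed as both activator and inhibitor, which is one way ambiguous roles can arise.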

Locate Activated Pathways for Better Biological Interpretation
Gene Expression Data
- Up-regulation of B and D
- Down-regulation of E
Enriched Gene Set Analysis

Leaf node   Activators      Inhibitors
P           A, B, C, D, F   none
G           A, B            E

Result: possible activation of the pathway, because the up-regulated genes (B, D) are activators and the down-regulated gene (E) is an inhibitor
[Figure: example pathway with B and D marked as up-regulated and E marked as down-regulated]
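The reasoning on this slide can be sketched as a simple tally: an up-regulated activator or a down-regulated inhibitor counts as evidence for activation, and the opposite combinations count against it. The roles and expression changes are taken from the slide; the tallying rule itself is an assumption for illustration, not necessarily how the program scores pathways.

```python
# Roles per leaf node, as listed on the slide.
ROLES = {
    "P": {"activator": {"A", "B", "C", "D", "F"}, "inhibitor": set()},
    "G": {"activator": {"A", "B"}, "inhibitor": {"E"}},
}
# Expression changes from the slide's example.
EXPRESSION = {"B": "up", "D": "up", "E": "down"}


def activation_evidence(roles, expression):
    """Split changed genes into those supporting and opposing activation."""
    support, against = [], []
    for gene, direction in sorted(expression.items()):
        if gene in roles["activator"]:
            (support if direction == "up" else against).append(gene)
        elif gene in roles["inhibitor"]:
            (support if direction == "down" else against).append(gene)
    return support, against


evidence = {leaf: activation_evidence(r, EXPRESSION) for leaf, r in ROLES.items()}
for leaf, (support, against) in evidence.items():
    print(f"leaf {leaf}: supports activation {support}, against {against}")
```

For both leaf nodes every changed gene points toward activation, which matches the slide's conclusion of possible pathway activation. A plain gene list without roles could not distinguish this case from one where E was up-regulated.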

Results
For each pathway menu, one can:
- find a list of proteins with associated roles for each node
- look at each protein in an interaction
- find a list of all interactions in a pathway

Percentage of Inhibitors   Number of Pathways   Percentage
0%                         55                   46.6%
0-5%                       38                   32.2%
>=5%                       25                   21.2%
>=10%                      8                    6.8%
>=20%                      2                    1.7%
Total                      118                  100%
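The percentages in the table follow directly from the pathway counts, as the short check below shows. The counts are taken from the table; the reading that the first three rows partition the 118 pathways, with the >=10% and >=20% rows as subsets of the >=5% row, is an inference from the totals.

```python
TOTAL_PATHWAYS = 118
COUNTS = {"0%": 55, "0-5%": 38, ">=5%": 25, ">=10%": 8, ">=20%": 2}

# Percentage of all pathways in each row, rounded as in the table.
percentages = {row: round(100 * n / TOTAL_PATHWAYS, 1) for row, n in COUNTS.items()}

# The first three rows account for every pathway.
partition_total = COUNTS["0%"] + COUNTS["0-5%"] + COUNTS[">=5%"]

# Pathways with at least one inhibitor.
with_inhibitors = TOTAL_PATHWAYS - COUNTS["0%"]
with_inhibitors_pct = round(100 * with_inhibitors / TOTAL_PATHWAYS, 1)

print(percentages)
print(partition_total, with_inhibitors, with_inhibitors_pct)
```

The last figure, 63 of 118 pathways, is the basis for the statement that roughly half of the pathways include inhibitors.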

Conclusion
- Successfully parsed the XML files
- The pathway analysis program works
- ~50% of pathways include inhibitors
- About 20% of the pathways contain >=5% inhibitors
- Average total molecules = 60

Future Directions
- Improvements to software
  - Ambiguous roles: proteins in different complexes may have different roles
  - Fine-tune the overall role of proteins in each pathway
- Run the program with a real expression data set
- Improve prognoses and drugs for diseases

References
- Pan KH, Lih CJ, Cohen SN. Effects of threshold choice on biological conclusions reached during analysis of gene expression by DNA microarrays. Proc Natl Acad Sci 2005, 102:
- Subramanian A, Tamayo P, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci 2005, 102:
- Nam D, Kim SY. Gene-set approach for expression pattern analysis. Brief Bioinform 2008, 9:
- Dupuy A, Simon RM. Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst 2007, 99:
- Jiang Z, Gentleman R. Extensions to gene set enrichment. Bioinformatics 2007, 23:
- Dinu I, Potter JD, et al. Improving gene set analysis of microarray data by SAM-GS. BMC Bioinformatics 2007, 8:242.
- Liu Q, Dinu I, et al. Comparative evaluation of gene-set analysis methods. BMC Bioinformatics 2007, 8:431.

Acknowledgements
- Mentor: Xiwei Wu
- SoCalBSI Faculty and Staff: Jamil Momand, Sandy Sharp, Nancy Warter-Perez, Wendie Johnston
- Funding for SoCalBSI: DOE and NASA; LA / Orange County Biotechnology Center; NSF, NIH, and Economic & Workforce Development
- Funding at City of Hope: National Cancer Institute, National Institutes of Health