
Molecular components of B cell antigen receptor-mediated endocytosis revealed by CLASSIFI: a tool for functional classification of microarray gene clusters

Jamie A. Lee †, Robert Sinkovits §††, Dennis Mock §††, Eva Rab †, Jennifer Cai †, Peng Yang †, Brian Saunders §††, Robert C. Hsueh ‡††, Sangdun Choi ||††, Tamara I. A. Roach* ††, Shankar Subramaniam §¶††, and Richard H. Scheuermann †§††

† Department of Pathology, Laboratory of Molecular Pathology and ‡ Department of Pharmacology, University of Texas Southwestern Medical Center, Dallas, Texas 75390; § San Diego Supercomputer Center and ¶ Department of Bioengineering, University of California, San Diego, California 92122; || Division of Biology, California Institute of Technology, Pasadena, CA; * San Francisco Veterans Administration Medical Center, San Francisco, CA; †† Alliance for Cellular Signaling

[Figure 1B panel text. CLASSIFI steps: Remove duplicate probe IDs; Select primary GO annotation and determine full GO ancestry for each probe ID; Enumerate n, f, g, c for each ontology; Calculate solution of hypergeometric equation for each ontology; Rank order ontologies based on p values.]

$$P = 1 - \sum_{i=0}^{n-1} \frac{\binom{f}{i}\binom{g-f}{c-i}}{\binom{g}{c}}$$

[Figure panel labels: Included on microarray; Not included on microarray; Anti-IgM; CD40L; LPS.]

Abstract

Antigen recognition by B lymphocytes leads to a complex series of phenotypic changes that help orchestrate the immune response to infection. Gene expression microarrays provide excellent tools to identify the genes that control these phenotypic changes. One approach to the analysis of microarray data is to group genes together into gene clusters based on their similarity in expression patterns in comparison with the experimental variables. To understand the biological significance of these gene clusters, we developed CLASSIFI (CLuster ASSIgnment For biological Inference), a bioinformatics tool that classifies gene clusters based on the probability of co-clustering of probes with similar gene ontology annotation.
We applied CLASSIFI to an Alliance for Cellular Signaling (AfCS) data set that examines the in vitro responses of B cells to stimulation with three ligands that cause a strong proliferative response: CD40L, LPS, and anti-IgM. CLASSIFI analysis revealed an overrepresentation of gene ontologies related to intracellular transport, including genes involved in endocytosis and vesicle acidification such as ATPase H+ pump subunits and SNARE-related genes, in the cluster of genes that are upregulated only in response to anti-IgM. Based on this gene expression data analysis, we hypothesized that anti-IgM, unlike CD40L and LPS, specifically stimulates a complex biological process that includes endocytosis, endosome acidification, vesicle fusion and vesicle transport. The predicted effect of these ligands on receptor endocytosis has been verified experimentally.

Figure 1. Experimental methodology and analysis of microarray data. B cells were negatively selected from mouse spleens using anti-CD43-coated magnetic beads and cultured for 4 hours in the presence or absence of anti-IgM, LPS or anti-CD40. Fluorescently labeled cRNA prepared from these cells was mixed with a reference cRNA (from total spleen) and hybridized to a custom spotted cDNA microarray.
A. Fluorescence values were filtered to remove features too close to background and normalized to the spleen reference. Significance Analysis of Microarrays (SAM) was used to identify genes whose expression was significantly different between untreated and treated conditions. Genes that were significantly upregulated or downregulated were assigned values of “+1” or “-1”, respectively. The genes were clustered together based on the categorical expression patterns, and analyzed using CLASSIFI.
B. The steps involved in CLASSIFI analysis of clustered microarray data are detailed.
g = number of probes in entire data set, c = number of probes in a specific gene cluster, f = number of probes with a given ontology in entire data set, n = number of probes with a given ontology in the specific gene cluster.

Figure 2. Clustering and CLASSIFI results for data from 3 ligands. Clustering of categorical data from B cells stimulated with CD40L, LPS, and anti-IgM results in 19 gene clusters. Red = upregulated. Green = downregulated. Black = no change. Following CLASSIFI analysis, the gene ontology with the lowest p value in each gene cluster is listed. GO id = a unique Gene Ontology identifier that corresponds to a defined molecular function (MF), biological process (BP), or cellular component (CC). Expt = the expected number of occurrences of a given GO id in a given cluster of size (n) based on a random distribution. Prob = the probability that the GO id co-cluster pattern has occurred by chance.

Figure 3. Intracellular transport-related genes in Gene Cluster 18. Selection of genes found in Gene Cluster 18 with functions related to endosome acidification, transport and fusion.

Figure 4. Coordinate expression of endocytosis genes. Since the microarray methodology is inherently “noisy”, it is important to verify the expression pattern of potentially interesting genes by a parallel methodology. The ligand-specific expression pattern of four genes found in Gene Cluster 18 was verified by real-time PCR analysis (left side). Based on the CLASSIFI analysis, we hypothesize that anti-IgM might also induce the upregulation of other genes involved in endosome processing. Indeed, anti-IgM was found to induce the mRNA levels of four other components of the ATPase H+ pump, while CD40L and LPS did not (right side).

Figure 5. Internalization through the B cell antigen receptor (BCR). WEHI-231 cells were treated with a non-stimulating anti-IgM mAb conjugated to FITC. Polyclonal anti-IgM was then added to stimulate the cells.
Following a 1-hour stimulation, cells were harvested and washed with acid to remove surface-bound antibody. Cells that were washed with acid (dotted lines) or not (solid lines) were compared to unstimulated cells (black lines). Anti-IgM, but not CD40L or LPS, stimulates internalization of the BCR, as predicted from the gene expression data.

Conclusions

We have applied CLASSIFI, a tool for functional classification of microarray gene clusters, to a microarray data set comparing B cell responses to CD40L, LPS, and anti-IgM.

CLASSIFI analysis reveals significant co-clustering of gene ontologies related to intracellular transport in Gene Cluster 18, which contains genes that are upregulated specifically in response to anti-IgM.

Several genes within Gene Cluster 18 are related to various aspects of endosome internalization, acidification and trafficking, including ATPase H+ pump subunits and SNARE-related genes, leading to the hypothesis that activation of B cells through the antigen receptor induces endocytosis, antigen processing and presentation.

ATPase H+ pump subunit genes that were not included on the microarray were also found to be upregulated in a ligand-specific manner, indicating that genes involved in the same biological process are coordinately expressed.

Anti-IgM stimulates ligand-specific receptor internalization, indicating that CLASSIFI analysis is useful in predicting biological responses to ligand stimulation from the gene expression data analysis.

[Figure 1A panel text. Analysis pipeline: Raw Data; Basic filtering; Normalization; Statistical filtering; Correlation clustering; CLASSIFI.]
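
The hypergeometric calculation from the Figure 1 legend can be illustrated with a minimal Python sketch. This is not the CLASSIFI implementation: the function names (`classifi_p_value`, `expected_count`, `rank_ontologies`) and the example ontology labels are hypothetical, and the counts g, c, f and n are assumed to have been enumerated already as defined in the legend. The sketch sums the upper tail of the hypergeometric distribution directly, which equals one minus the sum from i = 0 to n - 1 but avoids losing precision when the p value is very small. The expected count is taken to be f × c / g, the mean of that distribution, which is an assumption about how the Expt column was derived.

```python
from math import comb


def classifi_p_value(g, c, f, n):
    """Probability of seeing at least n probes with a given ontology in a
    cluster of c probes, when f of the g probes on the array carry it."""
    if not (0 <= f <= g and 0 <= c <= g and 0 <= n <= min(f, c)):
        raise ValueError("counts are inconsistent with the definitions of g, c, f, n")
    # Upper tail: sum over i = n .. min(f, c). By Vandermonde's identity this
    # equals 1 - sum over i = 0 .. n-1 of C(f,i) * C(g-f,c-i) / C(g,c).
    upper = sum(comb(f, i) * comb(g - f, c - i) for i in range(n, min(f, c) + 1))
    return upper / comb(g, c)


def expected_count(g, c, f):
    """Mean of the hypergeometric distribution (assumed meaning of Expt)."""
    return f * c / g


def rank_ontologies(counts, g, c):
    """Rank ontologies for one cluster by ascending p value.

    counts maps an ontology label to its (f, n) pair for that cluster.
    """
    scored = [(label, classifi_p_value(g, c, f, n)) for label, (f, n) in counts.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy numbers, not taken from the AfCS data set.
ranked = rank_ontologies({"term_a": (5, 5), "term_b": (5, 2)}, g=10, c=5)
print(ranked[0][0])  # label of the ontology with the lowest p value
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free; for arrays with many thousands of probes, a library survival function for the hypergeometric distribution would likely be a more practical choice.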