Summarizing Differential Expression Using Mann-Whitney U-tests.


RNA-Seq… at its Most Basic Form
- Samples from two conditions
- Isolate RNA
- Generate cDNA
- Create sequencing library by fragmenting, size selection, and adding adaptors
- Run sequencer
- Generate short reads
- Identify differentially expressed genes
- Profound biological discovery

Heat stress experiment analyzed with tag-based RNA-seq
[Figure labels: individual; stress; control]

Gene Ontology enrichment analysis (classic)
Input:
- list of significant genes (“our list”)
- all GO annotations for all genes in a genome (or transcriptome)
Enrichment test: whether “our list” contains more representatives of a certain GO category than expected by chance (Fisher’s exact, hypergeometric, or similar test)
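The enrichment test described here can be sketched as a one-sided hypergeometric tail probability, which is the same quantity a one-sided Fisher's exact test reports. This is a minimal sketch, not the code used for the slides: the function name `hypergeom_enrichment_p` and all the counts in the usage line are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """One-sided enrichment p-value P(X >= k).

    N: genes analyzed in total
    K: genes annotated with the GO category
    n: genes in "our list" (the significant genes)
    k: genes in "our list" that carry the GO category
    """
    upper = min(K, n)
    total = comb(N, n)  # number of ways to draw a list of n genes
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts, for illustration only.
p = hypergeom_enrichment_p(N=10000, K=200, n=500, k=25)
print(f"one-sided enrichment p-value: {p:.3g}")
```

With these made-up counts, about 10 annotated genes would be expected in the list by chance (500 × 200 / 10000), so observing 25 gives a small p-value.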

Mann-Whitney U-test
- Uses ranks to test whether the distributions of group X and group Y are different
- Robust to outliers and does not require normally distributed data
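To make the rank-based idea concrete, here is a minimal sketch of the U statistic with a two-sided p-value from the normal approximation. It assumes no tie correction and no continuity correction, so it will differ slightly from statistical packages on small or heavily tied samples. The function names and the example values are hypothetical.

```python
from math import erfc, sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U for group x, two-sided p) via the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumes no ties
    z = (u_x - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u_x, p_two_sided

# Made-up values; the outlier 150.0 only contributes its rank, not its size.
u, p = mann_whitney_u([1.2, 3.4, 2.2, 150.0], [0.3, 0.9, 1.1, 0.5])
print(u, p)
```

Because only ranks enter the statistic, replacing 150.0 with 4.0 would give exactly the same U, which is what "robust to outliers" means here.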

Gene Ontology enrichment analysis (rank-based)
Input:
- list of genes with measures to rank them
- GO annotations for all genes in a genome (or transcriptome)
Enrichment test: whether a GO category is significantly enriched with either top- or bottom-ranking genes (two-sided Mann-Whitney U test, or permutations)
Advantages:
- no need to choose a “significance cutoff”
- can keep track of direction of change
[Figure labels: control; stress; genes annotated with the GO term]
The MWU test determines whether genes annotated with the GO term in question (stripes on the white box to the left) are significantly “bunched up” either at the top or at the bottom of the ranked list.
“Delta rank”: mean rank of GO-term genes minus mean rank of all other genes (how much shift in ranks there is).
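The delta rank definition translates directly into a few lines of code. The sketch below assumes that rank 1 goes to the lowest score and that all scores are distinct (ties are not averaged); the function name `delta_rank` and the toy scores are hypothetical.

```python
def delta_rank(gene_scores, term_genes):
    """Mean rank of GO-term genes minus mean rank of all other genes.

    gene_scores: dict mapping gene name -> measure used for ranking
    term_genes: set of gene names annotated with the GO term
    Assumption: rank 1 is the lowest score; ties are not handled.
    """
    ordered = sorted(gene_scores, key=gene_scores.get)
    rank = {gene: position for position, gene in enumerate(ordered, start=1)}
    inside = [rank[g] for g in ordered if g in term_genes]
    outside = [rank[g] for g in ordered if g not in term_genes]
    if not inside or not outside:
        raise ValueError("need at least one gene inside and outside the term")
    return sum(inside) / len(inside) - sum(outside) / len(outside)

# Toy scores: the two annotated genes sit at the top of the ranking.
scores = {"geneA": -1.5, "geneB": 0.2, "geneC": 2.0, "geneD": 3.1}
print(delta_rank(scores, {"geneC", "geneD"}))
```

Under this ranking direction, a positive delta rank means the term's genes are shifted toward higher scores and a negative one toward lower scores.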

[Workflow figure: control vs. treatment → Differential Expression Analysis (DESeq, edgeR) → gene table with columns Name, p-value, -log(p), Rank → delta rank]
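The table columns in this workflow (p-value, -log(p), rank) can be derived as follows. This is a sketch under assumptions: the logarithm is taken base 10, the score is given the sign of the fold change so that direction is kept, and the gene names and numbers are made up.

```python
from math import log10

def signed_neg_log_p(p_value, log_fold_change):
    """-log10(p), made negative for down-regulated genes (assumed convention)."""
    score = -log10(p_value)
    return score if log_fold_change >= 0 else -score

def rank_by_score(scores):
    """Map each gene to its rank, with rank 1 for the lowest score."""
    ordered = sorted(scores, key=scores.get)
    return {gene: position for position, gene in enumerate(ordered, start=1)}

# Hypothetical differential expression results: gene -> (p-value, log fold change).
results = {
    "geneA": (0.001, 2.1),
    "geneB": (0.200, -0.4),
    "geneC": (0.010, -1.8),
}
scores = {gene: signed_neg_log_p(p, lfc) for gene, (p, lfc) in results.items()}
print(rank_by_score(scores))
```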

Some GO categories in your data might share the same genes (and some may overlap completely).
- Clustering GO categories according to the proportion of shared genes brings similar biological processes together
- Merging identical or very similar categories reduces redundancy
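One way to act on shared genes is to merge categories greedily whenever their overlap is large. The sketch below is an illustration only: the overlap measure (shared genes divided by the size of the smaller category), the 0.75 threshold, and the function names are assumptions, not the procedure used by GO_MWU.

```python
def shared_fraction(a, b):
    """Shared genes as a fraction of the smaller category (both non-empty)."""
    return len(a & b) / min(len(a), len(b))

def merge_similar(categories, threshold=0.75):
    """Greedily merge categories whose shared fraction reaches the threshold.

    categories: dict mapping category name -> set of gene names
    Returns a list of (set of merged category names, set of genes).
    """
    groups = [({name}, set(genes)) for name, genes in categories.items()]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                if shared_fraction(groups[i][1], groups[j][1]) >= threshold:
                    names = groups[i][0] | groups[j][0]
                    genes = groups[i][1] | groups[j][1]
                    groups[i] = (names, genes)
                    del groups[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return groups

# Toy categories: A and B overlap almost completely, C is unrelated.
toy = {"A": {"g1", "g2", "g3", "g4"}, "B": {"g1", "g2", "g3"}, "C": {"g7", "g8"}}
for names, genes in merge_similar(toy):
    print(sorted(names), sorted(genes))
```

The same `shared_fraction` values, turned into distances as 1 minus the fraction, could feed a hierarchical clustering to draw a dendrogram of categories.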

Run R Script GO_MWU.R
- go to ~/Desktop/Mann-Whitney_U-tests/MWU_go
- open the file GO_MWU.R
- execute commands by highlighting and pressing control + enter

heats.csv (differential expression dataset)
gene,logP
isogroup0,0.6
isogroup1,3.5
isogroup10,6.8
isogroup100,6.4
isogroup1000,1.7
isogroup10000,0.1
isogroup10001,-0.2
isogroup10002,0.6
isogroup10003,-0.4

amil_defog_iso2go.tab (links genes with their GO terms)
V1  V2
isogroup15359  GO: ;GO: ;GO: ;
isogroup0  GO:
isogroup100  GO: ;GO:
isogroup10001  GO:
isogroup10002  GO: ;GO: ;GO: ;GO:
isogroup10003  GO: ;GO: ;GO: ;GO: ;
isogroup10004  GO: ;GO: ;GO: ;GO:
isogroup10006  GO:
isogroup10007  GO: ;GO: ;GO: ;GO: ;GO: ;GO: ;GO:

go.obo (links GO terms with names, namespaces, and definitions)
id: GO:
name: mitochondrial genome maintenance
namespace: biological_process
def: "The maintenance of the structure and integrity of the mitochondrial genome; includes replication and segregation of the mitochondrial chromosome." [GOC:ai, GOC:vw]
is_a: GO: ! mitochondrion organization

[Term]
id: GO:
name: reproduction
namespace: biological_process
alt_id: GO:
alt_id: GO:
def: "The production of new individuals that contain some portion of genetic material inherited from one or more parent organisms." [GOC:go_curators, GOC:isa_complete, GOC:jl, ISBN: ]
subset: goslim_generic
subset: goslim_pir
subset: goslim_plant
subset: gosubset_prok
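The first two input files have simple layouts that can be read with a few lines of code. The sketch below assumes the CSV has a `gene,logP` header row, and that the annotation table is tab-delimited with no header row and semicolon-separated GO identifiers. The function names are hypothetical, and the GO identifiers in the example are placeholders, not the real annotations of these genes.

```python
import csv
import io

def read_scores(handle):
    """Parse a CSV with header 'gene,logP' into {gene: score}."""
    reader = csv.DictReader(handle)
    return {row["gene"]: float(row["logP"]) for row in reader}

def read_gene2go(handle):
    """Parse 'gene<TAB>GO:..;GO:..' lines into {gene: set of GO ids}."""
    table = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        gene, _, terms = line.partition("\t")
        # Drop empty pieces left by a trailing semicolon.
        table[gene] = {t.strip() for t in terms.split(";") if t.strip()}
    return table

# In-memory stand-ins for the two files; GO ids are placeholders.
scores_text = "gene,logP\nisogroup0,0.6\nisogroup1,3.5\nisogroup10001,-0.2\n"
gene2go_text = "isogroup0\tGO:9999991\nisogroup10001\tGO:9999992;GO:9999993;\n"

scores = read_scores(io.StringIO(scores_text))
gene2go = read_gene2go(io.StringIO(gene2go_text))
print(scores)
print(gene2go)
```

For real files, the `io.StringIO(...)` objects would be replaced with `open(path, newline="")` handles.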

GO_MWU: response of adult corals to 3 days of heat stress
[Figure panels: Molecular function; Cellular component]
- Dendrograms: sharing of genes between categories.
- Fractions: genes with an unadjusted p<0.05 / total number of genes within the category.
- Legend: FDR-adjusted p-values
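The slide does not say which FDR procedure produced the adjusted p-values. As an illustration, the sketch below applies the Benjamini-Hochberg adjustment, which is an assumption here, to a list of per-category p-values. The function name and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for pos in range(m - 1, -1, -1):
        i = order[pos]
        rank = pos + 1
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for four GO categories.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```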

KOG-MWU: same idea as GO_MWU (“KOGMWU” package)
- The non-hierarchical and [mostly] non-overlapping nature of KOG class annotations allows for quantitative comparisons of diverse datasets based on KOG delta-ranks.
- “categories enriched with either up- or down-regulated genes”
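One simple way to compare two datasets on KOG delta-ranks is to correlate the delta-rank values of the KOG classes they share. The slide does not specify the comparison, so the use of Pearson correlation here is an assumption. The function names, class labels, and numbers are made up, and this is not the KOGMWU package's API.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def compare_delta_ranks(dataset_a, dataset_b):
    """Correlate delta-ranks over the KOG classes present in both datasets."""
    shared = sorted(set(dataset_a) & set(dataset_b))
    return pearson([dataset_a[k] for k in shared], [dataset_b[k] for k in shared])

# Made-up delta-ranks per KOG class for two hypothetical experiments.
heat_stress = {"class1": 420.0, "class2": -310.0, "class3": 55.0, "class4": -120.0}
other_study = {"class1": 380.0, "class2": -150.0, "class3": 10.0, "class5": 90.0}
print(compare_delta_ranks(heat_stress, other_study))
```

A correlation near 1 would suggest the two experiments shift the same functional classes in the same direction; a value near -1 would suggest opposite responses.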

Questions