Analysis of GO annotation at cluster level by Agnieszka S. Juncker.

The DNA Array Analysis Pipeline
[Figure: pipeline diagram. Steps shown:]
Question, Experimental Design
Array design, Probe design, Buy Chip/Array
Sample Preparation
Hybridization
Image analysis
Normalization
Expression Index Calculation
Comparable Gene Expression Data
Statistical Analysis, Fit to Model (time series)
Advanced Data Analysis: Clustering, PCA, Classification, Promoter Analysis, Meta analysis, Survival analysis, Regulatory Network
GO annotations

Gene Ontology
Gene Ontology (GO) is a collection of controlled vocabularies describing the biology of a gene product in any organism.
There are 3 independent sets of vocabularies, or ontologies:
– Molecular Function (MF), e.g. "DNA binding" and "catalytic activity"
– Cellular Component (CC), e.g. "organelle membrane" and "cytoskeleton"
– Biological Process (BP), e.g. "DNA replication" and "response to stimulus"

Gene Ontology structure

GO structure, example 2

KEGG pathways
KEGG PATHWAYS: a collection of manually drawn pathway maps representing our knowledge of the molecular interaction and reaction networks, for a large selection of organisms.
1. Metabolism
   – Carbohydrate, Energy, Lipid, Nucleotide, Amino acid, Other amino acid, Glycan, PK/NRP, Cofactor/vitamin, Secondary metabolite, Xenobiotics
2. Genetic Information Processing
3. Environmental Information Processing
4. Cellular Processes
5. Human Diseases
6. Drug Development

KEGG pathway example 1

KEGG pathway example 2

Cluster analysis and GO
Analysis example:
– Partitioning clustering of genes into e.g. 15 clusters based on expression profiles
– Assignment of GO terms to genes in clusters
– Looking for GO terms overrepresented in clusters
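The three steps above can be sketched in R roughly as follows. This is only an illustration on simulated data: the expression matrix, the gene names and the single GO term annotation (`has_term`) are made up, k-means is used as one possible partitioning method, and the overrepresentation p-value is taken from the hypergeometric distribution via `phyper()`.

```r
set.seed(1)

# Hypothetical expression matrix: 300 genes x 7 time points (simulated)
expr <- matrix(rnorm(300 * 7), nrow = 300,
               dimnames = list(paste0("gene", 1:300), paste0("t", 1:7)))

# Step 1: partitioning clustering of genes into 15 clusters
n_clusters <- 15
km <- kmeans(expr, centers = n_clusters, nstart = 5, iter.max = 50)

# Step 2: assign a GO term to genes (hypothetical annotation:
# TRUE if the gene is annotated with the term of interest)
has_term <- sample(rep(c(TRUE, FALSE), times = c(40, 260)))

# Step 3: look for overrepresentation of the term in each cluster
cluster_size    <- tabulate(km$cluster, nbins = n_clusters)
term_in_cluster <- tabulate(km$cluster[has_term], nbins = n_clusters)

# P(X >= observed) when drawing cluster_size genes from all genes
p_over <- phyper(term_in_cluster - 1,
                 m = sum(has_term),    # genes with the term
                 n = sum(!has_term),   # genes without the term
                 k = cluster_size,     # genes in the cluster
                 lower.tail = FALSE)

data.frame(cluster = 1:n_clusters, cluster_size, term_in_cluster, p_over)
```

In a real analysis this test would be repeated for every GO term, so the resulting p-values would normally need a multiple-testing correction (e.g. with `p.adjust()`).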

Hypergeometric test
The hypergeometric distribution arises from sampling without replacement from a fixed population.
Example: an urn contains 100 balls, 20 of which are white. We draw 10 balls and want to calculate the probability of drawing 7 or more white balls, given the distribution of balls in the urn.
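The urn example can be computed in R with the built-in hypergeometric functions. A minimal sketch: the variable names are chosen here for illustration, and `phyper()` is called with `q = 6` because with `lower.tail = FALSE` it returns P(X > q), i.e. P(X >= 7).

```r
white <- 20    # white balls in the urn
other <- 80    # remaining balls (100 - 20)
drawn <- 10    # balls drawn without replacement

# P(X >= 7) via the upper tail of the hypergeometric distribution
p_tail <- phyper(6, m = white, n = other, k = drawn, lower.tail = FALSE)

# The same probability as an explicit sum over 7, 8, 9 and 10 white balls
p_sum <- sum(dhyper(7:10, m = white, n = other, k = drawn))

p_tail
all.equal(p_tail, p_sum)
```

In the GO setting the urn is the full gene set, the white balls are the genes annotated with a given GO term, and the drawn balls are the genes in one cluster.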

Yeast cell cycle
Time series experiment: [figure: sampling at successive time points]
Gene expression profiles: [figure: expression of Gene1 and Gene2 plotted against time]

R stuff
Indexing of a matrix (used when you wish to select a subset of your data, e.g. specific rows or columns):

Example 1
    rowindex <- 1:10
    colindex <- 1:5
    datamatrix[rowindex, colindex]  # first 10 rows, first 5 columns
    datamatrix[1:10, 1:5]           # gives the same as above

A "missing" rowindex (or columnindex) means that all rows (or columns) are selected.

Example 2
    datamatrix[1:5, ]   # first 5 rows, all columns
    datamatrix[, 5:10]  # all rows, columns 5 to 10
    datamatrix[, ]      # is the same as datamatrix
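To try the indexing examples without real data, a small stand-in matrix can be built first. The dimensions and contents of `datamatrix` below are arbitrary assumptions made only so that the expressions run.

```r
# Hypothetical data: 20 rows (genes) x 12 columns (samples)
datamatrix <- matrix(1:240, nrow = 20, ncol = 12)

rowindex <- 1:10
colindex <- 1:5

sub1 <- datamatrix[rowindex, colindex]  # first 10 rows, first 5 columns
sub2 <- datamatrix[1:10, 1:5]           # same selection, indices written inline

first_rows <- datamatrix[1:5, ]         # first 5 rows, all columns
some_cols  <- datamatrix[, 5:10]        # all rows, columns 5 to 10

dim(sub1)
dim(first_rows)
dim(some_cols)
identical(sub1, sub2)
identical(datamatrix[, ], datamatrix)
```

Note that selecting a single row or column, e.g. `datamatrix[1, ]`, returns a plain vector unless `drop = FALSE` is added.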