EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. T +44 (0) 1223 494467, F +44 (0) 1223 494496. Gene Co-expression.


Gene Co-expression on Microarrays: Fiction or Fact?
Tineke Casneuf 1,2, Yves Van de Peer 1 and Wolfgang Huber 2

Introduction
Microarray co-expression signatures of genes are an important tool for studying gene function and relations between genes. In addition to real biological co-expression, correlated signals can result from technical deficiencies such as hybridisation of probes with off-target transcripts. We investigated the nature and scale of off-target transcript hybridisation in relation to signal correlation, using data from Affymetrix GeneChips.

Acknowledgements
This work was supported by a grant from the Fund for Scientific Research, Flanders (3G031805) and by the European Commission through a Marie Curie Host Fellowship program (MEST-CT BIOSTAR).

Tineke Casneuf, PhD student, European Bioinformatics Institute, June 2007
VIB / Ghent University, Bioinformatics & Evolutionary Genomics, Technologiepark 927, B-9052 Gent, BELGIUM (1)(2) T + 32 (0) F + 32 (0)

Highly correlated pairs returned by our custom-made definition show longer common paths than those returned by Affymetrix' definition. We propose that these correlation relationships result from real biological co-expression rather than from cross-hybridisation. The latter is likely the case for Affymetrix probe sets, as reporters with perfect sequence identity to off-target genes are retained.

Conclusions
We reveal a positive relation between off-target reporter alignment strength and expression correlation. This relation is present even between gene pairs that do not share longer stretches of sequence similarity and where the reporter to off-target alignment is based only on short near-matches. Furthermore, this effect can be observed within probe sets. We show that omitting reporters liable to cross-hybridisation results in biologically more relevant expression relationships.
Applying this finding is essential for enriching for true biological expression correlations, and it ensures that reliable co-expression links are identified for downstream co-expression analyses.

Reporters with unequal off-target responses
We also studied the behaviour of different reporters within a probe set and found a positive relation between the alignment scores a_i of reporter x_i to Y's transcript sequence and the Pearson correlation coefficients of the reporters' signal patterns to the expression pattern of Y. We illustrate these findings with an example:
The summarised expression values of probe set _at (X), designed to target AT5G04790, and of the off-target gene AT1G75180 (Y); ρ_XY = 0.7.
The background-corrected, normalised signal profiles of _at's reporters. The colour of each profile corresponds to its a_i and is explained in the legend.
For each of these reporters, the Pearson correlation coefficient ρ_XiY, calculated between its signal profile and that of Y, is plotted against its off-target sensitivity score a_i.
These plots demonstrate that _at's reporters show unequal responses to AT1G75180: four of them have perfect sequence identity and show an expression profile with ρ > 0.8 to this off-target. This is contrasted by reporters with lower alignment strength. The relation between off-target sensitivity and signal correlation differs between the reporters of a probe set.

More stringent probe set definitions return biologically more relevant co-expression links
In addition to probe sets defined and annotated by the manufacturer Affymetrix, we evaluated the use of a more stringent custom-made definition, where probe sets were constructed solely from perfectly matching reporters while excluding the reporters most liable to cross-hybridisation (with a_i = 23 to an off-target).

Expression correlation and off-target sensitivity
These boxplots depict the expression correlation ρ as a function of off-target sensitivity Q75_XY.
They reveal a positive relation between the two variables: a gene whose expression is measured by reporters that align well to a different transcript tends to have an expression signal that is correlated with that of the other transcript. Figure A shows the data for all probe set pairs; for Figure B, gene pairs with a BLAST hit in at least one direction with an E-value < were omitted.

We compared gene pairs with considerably different correlation coefficients in the two probe set definitions: a first set with a high ρ in Affymetrix' definition and a low ρ in the custom-made definition (blue), and a second set with a high ρ in the custom-made definition and a low ρ in Affymetrix' (orange). This plot shows the cumulative distribution of the lengths of the longest common path down the Biological Process branch of the Gene Ontology tree for the annotations of the gene pairs in both sets.

Illustration of our approach
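The quantities used in the panels above can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the authors' code. The function names and toy profiles are hypothetical, and two readings of the text are assumptions: that Q75_XY is the 75th percentile of the alignment scores of X's reporters to Y's transcript, and that the value 23 acts as a lower bound for excluding a reporter.

```python
from math import sqrt
from statistics import quantiles


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def q75(alignment_scores):
    """Off-target sensitivity of a probe set to one off-target transcript.

    ASSUMPTION: Q75_XY is read here as the 75th percentile of the
    reporters' alignment scores a_i to the off-target transcript Y.
    """
    return quantiles(alignment_scores, n=4, method="inclusive")[2]


def reporter_correlations(reporter_profiles, y_profile):
    """Correlation of each reporter's signal profile with Y's expression."""
    return [pearson(profile, y_profile) for profile in reporter_profiles]


def stringent_probe_set(reporters, alignment_scores, cutoff=23):
    """Drop the reporters most liable to cross-hybridisation.

    ASSUMPTION: a reporter is excluded when its alignment score to an
    off-target is at or above `cutoff`; the text only gives the value 23.
    """
    return [r for r, a in zip(reporters, alignment_scores) if a < cutoff]


# Toy data (hypothetical): expression of off-target Y over five arrays,
# three reporters of probe set X, and their alignment scores a_i to Y.
y_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
reporter_names = ["x1", "x2", "x3"]
reporter_profiles = [
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
alignment_scores = [25, 12, 23]

rho_per_reporter = reporter_correlations(reporter_profiles, y_profile)
sensitivity = q75(alignment_scores)
kept = stringent_probe_set(reporter_names, alignment_scores)
```

Plotting `rho_per_reporter` against `alignment_scores` gives the kind of per-reporter scatter described in the example, and grouping probe set pairs by `sensitivity` gives the grouping behind the boxplots.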