Whole genome transcriptome variation in Arabidopsis thaliana
Xu Zhang, Borevitz Lab



Arabidopsis thaliana has adapted to highly variable environments

Transcription and splicing
Chromosomal DNA is transcribed into nuclear RNA (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3). RNA splicing then yields messenger RNA: either Exon 1, Exon 2, Exon 3, or Exon 1, Exon 3.

Whole genome tiling array
Genetic hybridization polymorphisms could affect the estimation of gene expression.
- High density and resolution: 1.6M unique probes at 35 bp spacing
- No bias toward known transcripts

The experiment
Crosses: Col ♀ x Col ♂, Van ♀ x Van ♂, Col ♀ x Van ♂, Van ♀ x Col ♂
- Parental strains and reciprocal F1 hybrids
- mRNA from total RNA; genomic DNA

Double-stranded random labeling
Poly(A) mRNA (AAAAA), random reverse transcription, double-stranded cDNA, random priming

Outline
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling


Single Feature Polymorphisms (SFPs) and indels
Figure labels: SFP; deletion or duplication in Van

Sequence polymorphisms
SFPs and indels (>200 bp) were removed before gene expression analysis.
SFPs table, columns: FDR, Col > Van, Van > Col, Total (first FDR value: 11.82%).
Indels table, rows by model selection (BIC, AIC); columns: deletion, duplication, Total.

Deletions vs duplications

Distribution of indels along chromosomes


Additive, dominant and maternal effects of gene expression

The linear model
Gene probe intensity ~ additive + dominant + maternal + ε
Figure: intensity by genotype (Col, Van, F1c, F1v), illustrating the additive, dominant, and maternal effects.
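The slide gives the model formula but not the contrast coding, so the following is only a minimal sketch under an assumed coding: Col = mu - a, Van = mu + a, F1c = mu + d - m, F1v = mu + d + m. The function name `genotype_effects` and the coding are hypothetical. With four genotype groups and four parameters the model is saturated, so the least-squares estimates can be read directly off the group means.

```python
from statistics import mean


def genotype_effects(col, van, f1c, f1v):
    """Estimate additive, dominant and maternal effects for one gene.

    Each argument is a list of replicate intensities for one genotype.
    Assumed coding (an illustration, not taken from the slides):
        Col = mu - a          Van = mu + a
        F1c = mu + d - m      F1v = mu + d + m
    """
    m_col, m_van = mean(col), mean(van)
    m_f1c, m_f1v = mean(f1c), mean(f1v)

    mu = (m_col + m_van) / 2             # mid-parent value
    additive = (m_van - m_col) / 2       # half the parental difference
    dominant = (m_f1c + m_f1v) / 2 - mu  # F1 deviation from mid-parent
    maternal = (m_f1v - m_f1c) / 2       # half the reciprocal-cross difference
    return {"mu": mu, "additive": additive,
            "dominant": dominant, "maternal": maternal}


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    print(genotype_effects([7.9, 8.1], [9.9, 10.1], [9.4, 9.6], [10.4, 10.6]))
```

Under this coding, a gene with no dominance has an F1 mean equal to the mid-parent value, and a gene with no maternal effect has identical means in the two reciprocal F1 hybrids.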

Gene expression variation between genotypes
Table columns: Delta, Sig+, Sig-, Total, False, FDR; rows: additive, dominant, maternal.

The pattern of gene expression inheritance
Figure: mean gene intensity across Col, Van, F1v, and F1c. Classes: Van dominant, Col dominant, over dominant, F1v dominant, F1c dominant, maternal, paternal.

The pattern of gene expression inheritance

Enrichment in GO functional categories
GO enrichment for genes with additive, dominant, and maternal effects. Defense response genes are highly expressed in F1 hybrid lines, while many growth-related pathways are down-regulated.


Default expression status of exon and intron
- Exons: correction for gene expression
  - corrected by gene mean
  - corrected by gene median
  - splicing index (mean exon / mean gene)
- Introns: direct comparison
Exon/intron probe intensity ~ additive + dominant + maternal + ε
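The three exon corrections listed above can be sketched as follows. Only the splicing index (mean exon / mean gene) is defined on the slide; treating the other two corrections as simple subtraction of the gene mean or median from log-scale intensities is an assumption, and the function name `exon_corrections` is hypothetical.

```python
from statistics import mean, median


def exon_corrections(exon_probes, gene_probes):
    """Return three gene-level corrections of one exon's intensity.

    exon_probes: intensities of the probes inside the exon.
    gene_probes: intensities of all probes in the gene (exon included).
    The subtraction forms assume log-scale intensities (an assumption,
    not stated in the slides).
    """
    exon_mean = mean(exon_probes)
    gene_mean = mean(gene_probes)
    return {
        "minus_gene_mean": exon_mean - gene_mean,
        "minus_gene_median": exon_mean - median(gene_probes),
        "splicing_index": exon_mean / gene_mean,  # mean exon / mean gene
    }


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    print(exon_corrections([2, 4], [2, 4, 6, 8, 10]))
```

Each corrected value would then replace the raw probe intensity on the left-hand side of the linear model, so that a gene-wide expression difference is not mistaken for a splicing difference.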

Differential exon splicing
Exon probe intensity ~ additive + dominant + maternal + ε
Table columns: Delta, Sig+, Sig-, Total, False, FDR; rows: corrected by gene mean, corrected by gene median, splicing index.

Differential intron splicing
Intron probe intensity ~ additive + dominant + maternal + ε
Table columns: Delta, Sig+, Sig-, Total, False, FDR.

Differential exon splicing is predominantly additive in F1 hybrids

Some dominant effect in differential intron splicing in F1 hybrids

Comparison for enrichment in known alternatively spliced exons
For each of two thresholds, exons are cross-tabulated as called / not called against known / not known, with a fold enrichment and p-value per table.
- Corrected by gene mean: p-values 5.97E and E-03 (digits partly lost)
- Corrected by gene median polish: p-values 3.60E and E-03 (digits partly lost)
- Splicing index: p-values 6.84E and E-02 (digits partly lost)
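A fold enrichment and p-value for one such 2x2 table might be computed as below. The slides do not say which test was used, so the one-sided hypergeometric (Fisher-type) test here is an assumption, as are the function name `enrichment` and the counts in the example.

```python
from math import comb


def enrichment(called_known, called_unknown, uncalled_known, uncalled_unknown):
    """Fold enrichment and one-sided p-value for a 2x2 table.

    Fold enrichment = fraction of called exons that are known, divided by
    the fraction of all exons that are known. The p-value is the upper
    hypergeometric tail: P(at least `called_known` known exons among the
    called set) if calls were independent of annotation.
    """
    n_called = called_known + called_unknown
    n_known = called_known + uncalled_known
    total = n_called + uncalled_known + uncalled_unknown

    fold = (called_known / n_called) / (n_known / total)

    upper = min(n_called, n_known)
    tail = sum(
        comb(n_known, k) * comb(total - n_known, n_called - k)
        for k in range(called_known, upper + 1)
    )
    p_value = tail / comb(total, n_called)
    return fold, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    fold, p = enrichment(10, 90, 20, 880)
    print(f"fold enrichment = {fold:.2f}, p = {p:.2e}")
```

Exact integer arithmetic through `math.comb` avoids floating-point underflow in the tail sum, at the cost of speed for very large tables.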

Experimentally determined FDR for differential splicing
Genes shown: AT1G21350, AT1G34180, AT1G76170, AT1G29120, AT1G51350, AT1G80960, AT1G07350
Table columns: # of significant calls, estimated FDR, # tested, # confirmed, experimental FDR; rows: exon (corrected by mean), exon (corrected by median), exon (splicing index), intron.


Enrichment of differentially spliced genes in chloroplast thylakoid

Chloroplast thylakoid

Differentially spliced genes located in the chloroplast thylakoid
Photosynthesis-related genes: AT5G38660, APE1 (Acclimation of Photosynthesis to Environment); the mutant has altered acclimation responses.

Splicing regulators tend to be differentially spliced
- AT1G07350: transformer serine/arginine-rich ribonucleoprotein, putative
- AT1G55310: SC35-like splicing factor, 33 kD (SCL33)
- AT2G29210: splicing factor PWI domain-containing protein
- AT5G04430: KH domain-containing protein NOVA, putative


Generalized tiling array HMM
- 3-state HMM
- Discrete distribution for emission probability
- Transition probability accounts for probe spacing
- Baum-Welch parameter estimation (by Jake Byrnes)
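The ingredients listed above can be illustrated with a small forward-backward pass. This is a sketch under stated assumptions, not the lab's implementation: the state labels, the exponential form of the spacing-dependent transition, the `rate` value, and the emission table are all hypothetical, and the parameters are fixed by hand rather than estimated with Baum-Welch.

```python
import math

STATES = ("lower", "equal", "higher")  # hypothetical labels for the 3 states


def transition_matrix(gap_bp, rate=0.002):
    """Spacing-aware transitions (assumed form): the probability of
    leaving a state grows with the gap between adjacent probes."""
    n = len(STATES)
    p_leave = 1.0 - math.exp(-rate * gap_bp)
    return [[1.0 - p_leave if i == j else p_leave / (n - 1)
             for j in range(n)] for i in range(n)]


def _normalise(values):
    total = sum(values)
    return [v / total for v in values]


def posteriors(symbols, positions, emission, start=None):
    """Posterior state probabilities per probe via forward-backward.

    symbols:   discrete observation index per probe
    positions: probe coordinates in bp (increasing)
    emission:  emission[state][symbol], a discrete distribution per state
    """
    n, T = len(STATES), len(symbols)
    start = start or [1.0 / n] * n
    mats = [None] + [transition_matrix(positions[t] - positions[t - 1])
                     for t in range(1, T)]

    # Forward pass, rescaled at every probe to avoid underflow.
    fwd = [_normalise([start[s] * emission[s][symbols[0]] for s in range(n)])]
    for t in range(1, T):
        A, prev = mats[t], fwd[-1]
        fwd.append(_normalise([
            emission[j][symbols[t]] * sum(prev[i] * A[i][j] for i in range(n))
            for j in range(n)]))

    # Backward pass, also rescaled.
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        A = mats[t + 1]
        bwd[t] = _normalise([
            sum(A[i][j] * emission[j][symbols[t + 1]] * bwd[t + 1][j]
                for j in range(n))
            for i in range(n)])

    return [_normalise([f * b for f, b in zip(fr, br)])
            for fr, br in zip(fwd, bwd)]


if __name__ == "__main__":
    emission = [[0.80, 0.15, 0.05],   # lower
                [0.10, 0.80, 0.10],   # equal
                [0.05, 0.15, 0.80]]   # higher
    symbols = [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
    positions = [35 * i for i in range(len(symbols))]
    for pos, post in zip(positions, posteriors(symbols, positions, emission)):
        print(pos, [round(p, 3) for p in post])
```

Making the transition matrix a function of the gap means that widely separated probes constrain each other less than adjacent ones, which is one plausible way to handle uneven probe spacing.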

An example of HMM detected segments

A good model also needs a better array
- Array density is not enough to distinguish exon/intron boundaries
- Probe quality

- Differential segments: ≥3 consecutive probes with posterior probability >0.99.
- Differentially expressed genes: annotated genes for which ≥33% of probes reside within the observed differential segments.
- Differentially spliced genes: annotated genes for which <33% of probes reside within a differential segment, or annotated genes containing ≥2 differential segments with different states.
- Novel gene boundaries: differential segments with ≥5 probes extending beyond an annotated gene boundary.
- Novel transcripts: differential segments with ≥5 probes outside any annotated gene boundary.
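The first three definitions can be turned into a short classification sketch. The function names, the state labels, and the use of probe indices instead of genomic coordinates are hypothetical. Where the two gene-level rules overlap (a gene with at least 33% coverage and segments in different states), this sketch labels the gene as differentially spliced; that precedence is an assumption.

```python
def call_segments(states, posteriors, baseline="equal",
                  min_probes=3, threshold=0.99):
    """Return (start, end, state) runs of consecutive confident probes.

    states:     most likely state label per probe
    posteriors: posterior probability of that state per probe
    `end` is exclusive. A run must keep the same non-baseline state.
    """
    segments, start = [], None
    for i in range(len(states) + 1):
        confident = (i < len(states) and states[i] != baseline
                     and posteriors[i] > threshold)
        # Close the open run when confidence drops or the state changes.
        if start is not None and (not confident or states[i] != states[start]):
            if i - start >= min_probes:
                segments.append((start, i, states[start]))
            start = None
        if confident and start is None:
            start = i
    return segments


def classify_gene(gene_start, gene_end, segments, min_fraction=0.33):
    """Classify a gene spanning probe indices [gene_start, gene_end)."""
    overlaps = [(max(s, gene_start), min(e, gene_end), state)
                for s, e, state in segments
                if s < gene_end and e > gene_start]
    if not overlaps:
        return "unchanged"
    if len({state for _, _, state in overlaps}) >= 2:
        return "differentially spliced"
    covered = sum(e - s for s, e, _ in overlaps)
    fraction = covered / (gene_end - gene_start)
    if fraction >= min_fraction:
        return "differentially expressed"
    return "differentially spliced"


if __name__ == "__main__":
    states = ["equal"] * 2 + ["higher"] * 4 + ["equal"] * 4
    segs = call_segments(states, [0.999] * len(states))
    print(segs, classify_gene(0, 10, segs))
```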

Length distribution of segments called by HMM

Comparison of annotation-based analysis and HMM
Table columns: Col > Van, Van > Col, Total.
Annotation rows: differential expression, differential exonic splicing, differential intronic splicing.
HMM rows: differential expression, differential splicing, un-annotated transcript, un-annotated 5', un-annotated 3' (surviving cell text: 28836).

Comparison of annotation-based analysis and HMM
Figure labels, for both Annotation and HMM: Expression (Col>Van), Expression (Van>Col), Splicing (Col>Van), Splicing (Van>Col).

Acknowledgements
Justin Borevitz, Yan Li, Christos Noutsos, Geoff Morris, Andy Cal, Jake Byrnes, Josh Rest