Comparative Genomics II: Functional comparisons Caterino and Hayes, 2007.


Overview
I. Comparing genome sequences
   Concepts and terminology
   Methods
     - Whole-genome alignments
     - Quantifying evolutionary conservation (PhastCons, PhyloP, GERP)
     - Identifying conserved elements
   Utility and limitations of conservation
   Available datasets at UCSC
II. Comparative analyses of function
   Evolutionary dynamics of gene regulation
   Case studies
   Insights into regulatory variation within and across species

Functional variation within and among species
[Figure labels: Human, Chimp, Rhesus, Mouse]

Modularity of developmental gene expression
[Figure: gene A expressed in forebrain (brain TFs), neural tube (neural TFs) and limb (limb TFs)]
- Regulatory changes introduce variance without disrupting protein function
- Regulatory variation contributes to human phenotypic variation overall

Regulatory mutations affecting pleiotropic genes cause discrete developmental changes
Lettice et al. Hum Mol Genet 12:1725 (2003)
Sagai et al. Development 132:797 (2005)

Patterns of selection on gene expression and regulation
[Figure panels: Neutral, Constrained, Directional]
Romero et al., Nat Rev Genet. 13:505 (2012)

Comparative approaches to identify conserved and variant regulatory functions
[Figure panels: Regulatory conservation, Regulatory rewiring]
Visel and Pennacchio, Nat Genet 42:557 (2010)

Genetic drivers of gene regulatory variation
Furey and Sethupathy, Science 2013

Comparative analysis of ChIP-seq datasets
[Figure: H3K4me2 and H3K27ac tracks in human and mouse]
- Compare TF binding, histone modifications and DNase hypersensitivity in equivalent tissues
- Requires a statistical framework to reliably quantify changes in ChIP-seq signals
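
One reason a statistical framework is needed is that raw read counts depend on sequencing depth. The sketch below is an illustration only, not the method of any study cited in these slides: it scales counts to counts per million and takes a log2 ratio with a pseudocount. The function names, the pseudocount and the example numbers are all assumptions.

```python
import math

def cpm(counts, library_size):
    """Scale raw read counts per region to counts per million mapped reads."""
    return [c * 1e6 / library_size for c in counts]

def log2_fold_change(a, b, pseudocount=1.0):
    """Per-region log2 ratio of two normalized signals; the pseudocount avoids division by zero."""
    return [math.log2((x + pseudocount) / (y + pseudocount)) for x, y in zip(a, b)]

# Hypothetical read counts at three orthologous regions.
human_counts = [120, 15, 60]
mouse_counts = [60, 30, 30]

# The human library is assumed to be sequenced twice as deeply as the mouse library.
human_cpm = cpm(human_counts, library_size=20_000_000)
mouse_cpm = cpm(mouse_counts, library_size=10_000_000)

lfc = log2_fold_change(human_cpm, mouse_cpm)
```

Regions 1 and 3 differ two-fold in raw counts but not after depth normalization; only region 2 keeps a nonzero log2 ratio. A real analysis would also need replicates and a model of experimental variation.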

Issues in comparative functional genomics
- Input data are noisy: ChIP-seq and RNA-seq data are signal-based and subject to considerable experimental variation
- Are biological states comparable within and across species (e.g., human liver vs. mouse liver), and how do the differences compare with variation across tissues?
- How do epigenetic states and gene expression diverge among individuals and across species (neutral? constrained?)
- Can we identify variants or substitutions that drive regulatory changes?

Science 328: 232 (2010)
- 10 human lymphoblastoid cell lines
- 3 major population groups: European, East Asian, Nigerian
- 9 females, 1 male
- 9 analyzed by HapMap and 1000 Genomes
- Targets: RNA Polymerase II, NF-κB

Variation in TF binding is common
[Figure: Pol II; pairwise difference in binding; fraction of regions bound vs. number of individuals]
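
The two quantities named in the figure can be sketched as follows. This is a simplified illustration that treats binding as a presence/absence call per region; the slide does not say how the study defined a binding difference, and the individuals, calls and function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bound (1) / unbound (0) calls for four regions in three individuals.
calls = {
    "ind1": [1, 1, 0, 1],
    "ind2": [1, 0, 0, 1],
    "ind3": [1, 1, 1, 1],
}

def pairwise_difference(calls):
    """Fraction of regions whose call differs, for every pair of individuals."""
    result = {}
    for a, b in combinations(sorted(calls), 2):
        mismatches = sum(x != y for x, y in zip(calls[a], calls[b]))
        result[(a, b)] = mismatches / len(calls[a])
    return result

def individuals_bound_per_region(calls):
    """Histogram: number of individuals bound -> number of regions with that count."""
    per_region = [sum(column) for column in zip(*calls.values())]
    return Counter(per_region)

differences = pairwise_difference(calls)
histogram = individuals_bound_per_region(calls)
```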

Science 342: 747 (2013)
- 10 human lymphoblastoid cell lines
- 1 population group (Nigerian)
- All analyzed by HapMap and 1000 Genomes
- Targets: RNA Polymerase II; H3K4me1, H3K4me3, H3K27ac, H3K27me3; DNase hypersensitivity

Measuring allelic imbalance in histone modification profiles
[Figure: ChIP-seq reads assigned to the G allele and the T allele; allelic imbalance]
- Need to map reads reliably to individual alleles
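
A common way to test for allelic imbalance at a heterozygous site is an exact binomial test of the two allele read counts against a 50:50 expectation. The sketch below assumes that approach; the slide does not state which test the study used, and the read counts and function names are hypothetical.

```python
from math import comb

def allelic_ratio(ref_reads, alt_reads):
    """Fraction of allele-assigned reads that carry the reference allele."""
    return ref_reads / (ref_reads + alt_reads)

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k out of n."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so that ties in probability are counted despite rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical site: 30 reads carry the G allele and 10 carry the T allele.
g_reads, t_reads = 30, 10
ratio = allelic_ratio(g_reads, t_reads)
p_value = binomial_two_sided_p(g_reads, g_reads + t_reads)
```

The test is only meaningful if reads from both alleles map equally well, which is why reliable allele-specific mapping matters: reads carrying the non-reference allele tend to map less efficiently to a reference genome.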

Cis-quantitative trait loci: ~1200 identified

Science 328: 1036 (2010)
- Targets: CCAAT/enhancer binding protein α (CEBPA) and hepatocyte nuclear factor 4 α (HNF4A)
- Both are essential for normal liver development and function
- Tissue: adult liver from 4 mammal species plus chicken

Lineage-specific gain and loss of CEBPA binding in liver
- Lineage-specific: 0 bp overlap in multiple species alignment
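
The 0 bp overlap criterion can be sketched as an interval comparison. This illustration assumes the binding sites of every species have already been projected into one shared alignment coordinate system as half-open (start, end) intervals; the coordinates, species labels and function names are hypothetical.

```python
def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def lineage_specific(peaks_by_species, species):
    """Peaks of one species that overlap by 0 bp with every peak of all other species."""
    others = [
        peak
        for name, peaks in peaks_by_species.items()
        if name != species
        for peak in peaks
    ]
    return [
        peak
        for peak in peaks_by_species[species]
        if all(overlap_bp(peak, other) == 0 for other in others)
    ]

# Hypothetical binding sites in shared alignment coordinates.
peaks = {
    "human": [(100, 200), (500, 600)],
    "mouse": [(150, 250)],
    "dog": [(900, 1000)],
}

human_only = lineage_specific(peaks, "human")
mouse_only = lineage_specific(peaks, "mouse")
```

The pairwise scan is quadratic in the number of peaks; genome-scale data would normally use sorted intervals or an interval tree.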

Widespread variation in CEBPA binding in mammals

Cell 154: 530 (2013)

Single TF binding events may not indicate regulatory function
[Figure label: Enhancer-associated histone modification]
- Many TFs are present at high concentrations in the nucleus
- TF motifs are abundant in the genome
- Single TF binding events may be incidental

Combinatorial TF binding events are more conserved

Many TF binding changes do not have obvious genetic causes (in mammalian liver)

Many TF binding changes do not have obvious genetic causes (in mouse liver)

Cell 154: 185 (2013)
- Species: human, rhesus, mouse
- Developmental stages: bud stage; digit specification; digit separation

Identifying human-lineage changes in promoter and enhancer function
- Compare H3K27ac signal at orthologous sites
- 'Stable marking': 1.5-fold or less change in H3K27ac among human, rhesus and mouse
- Human gain: require significant, reproducible gain in human versus all 12 datasets in rhesus and mouse
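
The two definitions above can be sketched as simple threshold checks. The 1.5-fold stability cutoff comes from the slide; everything else is an assumption made for illustration, including the `min_gain_fold` parameter, the replicate values and the decision to stand in for the significance test with a plain fold comparison.

```python
def is_stable(human, rhesus, mouse, max_fold=1.5):
    """Stable marking: the largest signal is at most max_fold times the smallest."""
    values = [human, rhesus, mouse]
    if min(values) <= 0:
        return False  # a fold change is undefined without positive signal
    return max(values) / min(values) <= max_fold

def is_human_gain(human_replicates, other_datasets, min_gain_fold=1.5):
    """Simplified stand-in for 'significant, reproducible gain': every human
    replicate must exceed every rhesus and mouse dataset by min_gain_fold
    (hypothetical threshold; no statistical test is performed here)."""
    return min(human_replicates) >= min_gain_fold * max(other_datasets)

# Hypothetical normalized H3K27ac signals at one orthologous site.
stable_site = is_stable(human=10.0, rhesus=12.0, mouse=9.0)
changed_site = is_stable(human=30.0, rhesus=10.0, mouse=10.0)

# Two human replicates against 12 rhesus and mouse datasets.
gain = is_human_gain([30.0, 28.0], [10.0] * 12)
not_reproducible = is_human_gain([30.0, 12.0], [10.0] * 12)
```

Requiring the weakest human replicate to beat the strongest non-human dataset is a conservative way to express reproducibility across all 12 comparisons.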

Mapping active promoters and enhancers in human limb
[Figure labels: ENCODE cell lines; H3K27ac]

Gains in promoter and enhancer activity
[Figure labels: Bone morphogenesis; Chondrogenesis; Digit malformations in mouse]

Human-specific H3K27ac marking correlates with changes in enhancer function

Epigenetic signatures reflect tissue identity and species relationships
[Figure panels: H3K27ac signal in human and mouse; H3K27ac in human, rhesus, mouse (primate vs. mouse)]

Nature 478: 343 (2011)
- Species: human, chimpanzee, bonobo, gorilla, orangutan, macaque, mouse, opossum, platypus, chicken
- Custom gene models based on Ensembl + RNA-seq
- 5,636 1:1 orthologs in amniotes
- 13,277 1:1 orthologs in primates
- Only constitutive exons

Global patterns of gene expression differences

Gene expression recapitulates species phylogenies

Gene expression divergence rates are tissue-specific
[Figure panels: liver, testis, brain]

Gene expression divergence increases with evolutionary time
- Conservation of core organ functions restricts divergence
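
Expression divergence between two species is often summarized as one minus the rank correlation of expression levels across orthologous genes. The sketch below assumes that measure; the slide does not state which metric the study used, and the expression values and function names are hypothetical.

```python
def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def expression_distance(x, y):
    """1 - Spearman correlation: 0 for identical gene rankings, up to 2 for reversed."""
    return 1 - pearson(ranks(x), ranks(y))

# Hypothetical expression levels of four 1:1 orthologs in one tissue.
human = [10, 50, 5, 80]
chimp = [12, 45, 6, 90]
mouse = [40, 10, 30, 20]

d_human_chimp = expression_distance(human, chimp)
d_human_mouse = expression_distance(human, mouse)
```

In this toy data the human and chimp rankings agree exactly, so their distance is zero, while the human and mouse distance is larger. A matrix of such distances is the kind of input a tree-building method would take.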

Summary
- Comparative functional genomics identifies regulatory differences within and among species
- TF binding is variable within species and highly variable among species
- Epigenetic comparisons provide more insight into biologically relevant regulatory diversity and divergence
- Gene regulation and expression diverge with increasing phylogenetic distance; they mirror neutral expectation