Structure of proximal and distant regulatory elements in the human genome Ivan Ovcharenko Computational Biology Branch National Center for Biotechnology.

Presentation transcript:

Structure of proximal and distant regulatory elements in the human genome Ivan Ovcharenko Computational Biology Branch National Center for Biotechnology Information National Institutes of Health September 23, 2010

The Genome Sequence: The Ultimate Code of Life. 3 billion letters; ~45% is “junk” (repetitive elements); ~3% codes for proteins; gene regulatory elements (REs) reside SOMEWHERE in the remaining ~50%.

Distant Regulatory Elements

Hirschsprung disease is associated with a noncoding SNP in RET. There has long been interest in the genetics of eye color. A recent study showed that blue and brown eye color are associated with a mutation within an intron of the HERC2 gene. The mutation does not affect the expression of HERC2 itself but that of the gene immediately downstream, OCA2, which is responsible for pigmentation. This is just one example of gene regulation; the genotypes and corresponding phenotypes are shown in the picture. Regulatory elements (REs) orchestrate the temporal and spatial expression of genes, and it is becoming increasingly evident that many diseases with a genetic basis can be linked to mutations in regulatory elements. This project aims to provide deeper insight into the rules of gene regulation.

Hundreds of noncoding disease SNPs

REGULATORY ELEMENT (RE): Combinations of binding sites define the biological function of regulatory elements. Transcription factors (TFs) bind to very short binding sites (TFBS) of 6-10 nucleotides. Combinatorial binding of multiple TFs to an RE defines a specific pattern of gene expression. Correlating patterns of TFBS in REs with biological function will “decode” the gene regulatory encryption. [Figure: Proteins A, B and C bound to TFBS (uppercase) within a regulatory element upstream of a GENE; DNA sequence aCTGACTgaaaaCTGATATTGacagtTTGTTGTTGttaa]
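To make the idea of short binding sites concrete, here is a minimal sketch of scanning the example sequence from the slide with a position weight matrix. The matrix values, the uniform background, and the threshold are illustrative assumptions; they are not taken from the talk or from any motif database.

```python
import math

# Hypothetical position frequency matrix for a 6-bp site with consensus CTGACT.
# The probabilities are made up for illustration only.
PFM = [
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def log_odds(window):
    """Sum of per-position log2(motif probability / background probability)."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PFM, window))


def scan(seq, threshold):
    """Return (position, site, score) for every window scoring at or above threshold."""
    seq = seq.upper()
    width = len(PFM)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = log_odds(window)
        if score >= threshold:
            hits.append((i, window, score))
    return hits


if __name__ == "__main__":
    for pos, site, score in scan("aCTGACTgaaaaCTGATATTGacagtTTGTTGTTGttaa", 8.0):
        print(pos, site, round(score, 2))
```

With this toy matrix, only an exact match to the consensus clears a threshold of 8.0; real matrices and thresholds would be chosen per factor.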

Homotypic TFBS clusters: are known to occur widely in nature (Arnone and Davidson, 1997); provide redundancy for key regulatory events, a cornerstone of developmental stability; respond to various concentrations of TFs (e.g. allow low-abundance TFs to bind). Berman et al. (2002) PNAS 99:757

Searching the human genome for homotypic TFBS clusters. [Figure: E2F_Q6_01 cluster]

Homotypic TFBS clusters in the human genome: ~700 TRANSFAC & JASPAR PWMs were used to annotate putative TFBS in the non-repetitive, non-exonic part of the human genome. A 2-state HMM was trained to identify genomic regions with an elevated density of TFBS events. [Figure: sites of TFBS “A” forming a TFBS cluster; distance labels: < 500 bps, < 3kb]
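The segmentation step can be pictured with a small two-state Viterbi decoder. This is a sketch under stated assumptions, not the model used in the talk: the input is a 0/1 track (1 = a predicted site falls in that bin), and the emission, transition, and start probabilities are invented defaults rather than trained values.

```python
import math


def viterbi_two_state(obs, p_site=(0.02, 0.4), p_stay=(0.95, 0.9), p_start=(0.9, 0.1)):
    """Label each bin as background (0) or cluster (1).

    obs     : sequence of 0/1 flags, 1 meaning a predicted TFBS in that bin
    p_site  : assumed probability of seeing a site in (background, cluster)
    p_stay  : assumed probability of remaining in (background, cluster)
    p_start : assumed initial state probabilities
    """
    def emit(state, o):
        p = p_site[state]
        return math.log(p if o else 1.0 - p)

    def trans(a, b):
        return math.log(p_stay[a] if a == b else 1.0 - p_stay[a])

    if not obs:
        return []

    # Best log-probability of any path ending in each state at the first bin.
    score = [math.log(p_start[s]) + emit(s, obs[0]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at bin t + 1
    for o in obs[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cands = [score[prev] + trans(prev, s) for prev in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            new_score.append(cands[best] + emit(s, o))
            ptr.append(best)
        score = new_score
        back.append(ptr)

    # Trace the best path backwards from the better final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    track = [0] * 20 + [1] * 10 + [0] * 20
    print("".join(map(str, viterbi_two_state(track))))
```

With these assumed parameters a dense run of sites is labeled as a cluster, while a single isolated site is not, because entering and leaving the cluster state costs more than one unlikely background emission.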

126,000 homotypic TFBS clusters. Only 33 PWMs have more than 1000 clusters; 272 (40%) of TFs have at least 5 clusters. Median length: 597 bps. Median number of TFBS per cluster: 5. Total genome span: 50.4 Mb (1.6%). [Figure labels: Direct, Indirect, Human specific]

Homotypic TFBS clusters are strongly associated with promoters: 2290 clusters (47% of 4894 total) are in promoters; 51% of human promoters contain at least 1 cluster.

Fraction of clusters in promoters: p-value < 0.005 for 78 TFs
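The transcript does not say which test produced these p-values. One common way to ask whether clusters fall in promoters more often than chance is a one-sided binomial test, sketched below. The background promoter fraction of 0.05 is a purely hypothetical placeholder; the counts 2290 and 4894 are the ones quoted on the promoter-association slide.

```python
import math


def log10_binomial_tail(k, n, p):
    """log10 of P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1 and k <= n.

    Terms are summed in log space so that very small tails do not underflow.
    """
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log(1.0 - p)
        for i in range(k, n + 1)
    ]
    m = max(log_terms)
    total = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return total / math.log(10)


if __name__ == "__main__":
    ASSUMED_PROMOTER_FRACTION = 0.05  # hypothetical background, not from the talk
    result = log10_binomial_tail(2290, 4894, ASSUMED_PROMOTER_FRACTION)
    print("log10(p) =", result)
```

Returning log10 of the p-value rather than the p-value itself keeps extreme enrichments representable as ordinary floats.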

SNP density in clusters

Comparing TFBS to inter-site regions within clusters to avoid ascertainment bias
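The comparison on this slide can be sketched as a per-base SNP density computed separately for binding sites and for the spacer sequence between them, both inside the same cluster, so that both regions share the same local SNP discovery conditions. The coordinates and SNP positions below are invented for illustration, and the half-open interval convention is an assumption.

```python
def snp_densities(cluster, sites, snps):
    """Return (SNPs per bp inside sites, SNPs per bp in inter-site regions).

    cluster : (start, end) half-open interval of the cluster
    sites   : non-overlapping (start, end) half-open intervals inside the cluster
    snps    : iterable of SNP positions; positions outside the cluster are ignored
    """
    c_start, c_end = cluster
    site_bp = sum(end - start for start, end in sites)
    gap_bp = (c_end - c_start) - site_bp

    in_cluster = [pos for pos in snps if c_start <= pos < c_end]
    site_snps = sum(
        1 for pos in in_cluster if any(start <= pos < end for start, end in sites)
    )
    gap_snps = len(in_cluster) - site_snps
    return site_snps / site_bp, gap_snps / gap_bp


if __name__ == "__main__":
    # Hypothetical cluster with three 10-bp sites and six SNPs (one outside the cluster).
    in_sites, between_sites = snp_densities(
        (0, 100), [(10, 20), (40, 50), (70, 80)], [5, 15, 33, 60, 90, 150]
    )
    print(in_sites, between_sites)
```

A lower density inside sites than between them, aggregated over many clusters, would be the kind of signal the next slide interprets as negative selection.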

Two lines of evidence of negative selection acting on TFBS within TFBS clusters

Overlap with in vivo developmental enhancers (http://enhancer.lbl.gov), selected by “deep” or “ultra” conservation: 346 ENHANCERS, 503 NEGATIVES

LBL enhancers overlapping conserved homotypic clusters: p-value < 10^-100

Breaking the code. TF – tissue associations.

3-fold stronger association with p300 binding than expected. [Figure label: enhancer]

Tissue-specific association of NOBOX and E2F4. [Figure panels: E2F4 HCT, NOBOX HCT] 25-fold difference, P = 2.99·10^-50

Experimental validation, E2F4 & NRF1 clusters. [Figure panels B and C, with labels: diencephalon; caudal somites; pancreas; subregions of forebrain, midbrain, hindbrain; neural tube] Lawrence Berkeley Lab: Axel Visel, Len Pennacchio

Summary: Homotypic TFBS clusters are abundant in the human genome; they span 50.4 Mb (1.6% of the genome), about as much as coding DNA. ~50% of human promoters contain a homotypic cluster of binding sites. ~50% of validated enhancers contain a homotypic cluster of binding sites.

Acknowledgements: Valer Gotea; Lawrence Berkeley Lab: Axel Visel, Len Pennacchio

SNP ascertainment bias leads to low SNP density in clusters