Understanding the Human Genome: Lessons from the ENCODE project


Understanding the Human Genome: Lessons from the ENCODE project
University of Brawijaya, 4th December 2013
Austen Ganley, INMS

Glossary
• Genome
• Genes
• DNA/RNA
• Protein
• Cell
• Transcription
• Chromatin
• Histones
• Nucleosomes
• Non-coding RNA
• Sequencing
• Microarray
• Transcription start site
• Active/open
• Inactive/repression

[Figure: gene structure, labelled with promoter, transcriptional start site, exon, intron, and transcriptional terminator]

Introduction
• Individual scientists worked together as a consortium
• The aim was to understand 1% of the human genome (2007), then 100% (2012)
• Looked at: transcription, chromatin/transcription factors, replication, and evolution

Genes
• Now estimated to be about 21,000 protein-coding genes (taking up about 3% of the whole genome)
• In addition, there are about 9,000 small RNAs (such as microRNAs) and about 10,000 long non-coding RNAs

Transcription
• Transcription was measured by two different methods:
• Whole genome microarrays
• RNA-sequencing

Detecting transcription using tiled microarrays

Transcription
• ENCODE found that at least 62% of the whole genome is transcribed (remember, genes account for only about 3% of the whole genome)
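A figure such as "62% of the genome is transcribed" is a coverage calculation: overlapping transcribed regions are merged so that no base is counted twice, and the merged length is divided by the genome length. The sketch below illustrates that arithmetic only. The function names, the half-open (start, end) coordinates, and the toy data are assumptions for illustration, not ENCODE's actual pipeline.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_fraction(intervals, genome_length):
    """Fraction of the genome covered by at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


# Hypothetical toy data: a 1,000 bp "genome" with three transcripts,
# two of which overlap.
transcripts = [(0, 300), (200, 500), (700, 820)]
fraction = covered_fraction(transcripts, genome_length=1000)
print(f"{fraction:.0%} of the toy genome is transcribed")
```

Merging first matters because overlapping transcripts are common; summing raw transcript lengths would overstate the transcribed fraction.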

Transcriptional start sites
• The goal is to identify the transcription start sites
• Not easy to do!
• Use a technique called CAGE (Cap Analysis of Gene Expression)

CAGE
• Makes use of the 5' cap on mRNA
• First, mRNA is reverse-transcribed to form cDNA (an RNA-DNA hybrid)
• Then, biotin is attached to the 5' cap, and the cDNA is fragmented
• The biotin-labelled fragments are isolated (representing the 5' end of the mRNA), and these fragments are sequenced
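Once the 5' fragments have been sequenced and mapped, each read marks a position where a transcript began, and nearby positions can be grouped into candidate transcription start sites. The following is a minimal sketch of that grouping step, assuming reads on a single strand of a single chromosome. The function name and the `max_gap` and `min_tags` thresholds are hypothetical, and real CAGE analyses are more involved.

```python
from collections import Counter


def call_tss_clusters(tag_starts, max_gap=20, min_tags=2):
    """Group 5' tag positions into clusters.

    Returns a list of (cluster_start, cluster_end, tag_count) tuples.
    Positions closer than max_gap join the same cluster; clusters
    supported by fewer than min_tags tags are discarded as noise.
    """
    counts = Counter(tag_starts)
    clusters = []
    for pos in sorted(counts):
        if clusters and pos - clusters[-1][1] <= max_gap:
            clusters[-1][1] = pos          # extend the current cluster
            clusters[-1][2] += counts[pos]
        else:
            clusters.append([pos, pos, counts[pos]])
    return [tuple(c) for c in clusters if c[2] >= min_tags]


# Hypothetical mapped 5' positions of CAGE tags.
tags = [100, 100, 103, 110, 500, 2000, 2004, 2004]
tss_clusters = call_tss_clusters(tags)
print(tss_clusters)
```

The lone tag at position 500 is dropped here because a single read is weak evidence for a start site.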

• About 60,000 transcription start sites were found
• Only half of these match known genes
• What do the other ones do? They may explain the high level of transcription
• The transcription start sites are often far upstream of the gene start, and can overlap genes

Overlapping Genes
• [Figure: transcriptional start sites from the DONSON gene, showing an overlapping gene that starts far upstream]
• The DONSON gene is a known gene
• However, some transcripts start in the ATP5O gene, and include some ATP5O exons
• Two genes are skipped over

Chromatin: histones and nucleosomes
• Nucleosomes are formed from DNA that is packaged around histones
• Histones are a set of proteins that usually associate as an octamer
Image sources: www.mun.ca/biochem/courses/3107/Topics/supercoiling.html; www.palaeos.com/Eukarya/Eukarya.Origins.5.html

DNase I hypersensitive sites (DHS)
• DNase I preferentially digests nucleosome-depleted regions (DNase I hypersensitive sites)
• These are associated with gene transcription
• Chromatin is digested with DNase I, which only digests nucleosome-free regions
• The DNA released by digestion is isolated, and put on a microarray or sequenced
• This finds the open, active regions of the genome
Image sources: Hebbes Lab, University of Portsmouth, UK; Gilbert, Developmental Biology, Sinauer

DNase I hypersensitive sites
• In total, there are about 3 million DNase I hypersensitive sites in the genome, covering about 15% (versus about 40,000 genes covering about 4%)
• Transcriptional start sites are regions of DNase I hypersensitivity, as expected
• Most DNase I hypersensitive sites are not associated with transcriptional start sites, though
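Separating hypersensitive sites that sit at a transcription start site from those that do not comes down to a distance test between two sets of coordinates. The sketch below shows one way to do it, assuming a single chromosome, half-open (start, end) DHS intervals, and a hypothetical `window` of 1,000 bp for what counts as "near" a TSS. None of these choices come from the slides.

```python
from bisect import bisect_left


def split_dhs_by_tss(dhs_sites, tss_positions, window=1000):
    """Split DHS intervals into TSS-proximal and distal lists.

    A site is proximal if any TSS lies within `window` bp of it.
    """
    tss_sorted = sorted(tss_positions)
    proximal, distal = [], []
    for start, end in dhs_sites:
        # Index of the first TSS at or after (start - window) ...
        i = bisect_left(tss_sorted, start - window)
        # ... which is near the site only if it is not beyond end + window.
        if i < len(tss_sorted) and tss_sorted[i] <= end + window:
            proximal.append((start, end))
        else:
            distal.append((start, end))
    return proximal, distal


# Hypothetical toy coordinates.
dhs = [(900, 1100), (5000, 5200), (20000, 20300), (41000, 41150)]
tss = [1000, 40500]
proximal, distal = split_dhs_by_tss(dhs, tss)
print("proximal:", proximal)
print("distal:", distal)
```

Sorting the TSS positions once and using binary search keeps the lookup fast, which matters when there are millions of sites to classify.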

[Figure: transcription start sites on the genome, showing transcribed regions, transcription start sites, DNase I hypersensitive regions, and genes]

Histone Modification Effects
• Modifications occur on the histone tails
• They alter the strength of DNA-histone binding, and influence the binding of other proteins to the DNA
• Thus they can activate or silence gene expression

The “Histone Code”
• The combination of histone modifications determines a gene’s transcriptional status (the histone code)
• Some modifications are associated with active gene expression: H3K4me2, H3K4me3, H3ac, H4ac
• Some with repression: H3K27me3, H3K4me1 (although H3K4me1 is also widely associated with enhancers)
Image source: www.nature.com/nrm/index.html

ChIP (Chromatin immunoprecipitation)
• A method to find where your protein of interest binds
• Cross-link the sample, and fragment the DNA into pieces
• Immunoprecipitate using an antibody to your protein of interest
• Reverse the cross-links, and isolate the DNA
• To find where in the genome the protein was bound, hybridise the DNA to a microarray (ChIP-chip) OR sequence it (ChIP-seq)
Image source: www.rndsystems.com/product_detail_objectname_exactachip_assayprinciple.aspx
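After ChIP-seq, bound regions show up as places where the immunoprecipitated sample has many more reads than a control sample. The following is a deliberately simple sketch of that comparison, assuming mapped read positions on one chromosome and a control ("input") sample. The function names, bin size, fold threshold, and pseudocount are all hypothetical, and real peak callers use statistical models rather than a fixed fold cut-off.

```python
from collections import Counter


def bin_counts(read_positions, bin_size):
    """Count reads per fixed-width bin (bin index = position // bin_size)."""
    return Counter(pos // bin_size for pos in read_positions)


def enriched_bins(chip_reads, input_reads, bin_size=200,
                  min_fold=3.0, pseudocount=1.0):
    """Return (bin_start, bin_end, fold) for bins enriched in ChIP over input."""
    chip = bin_counts(chip_reads, bin_size)
    control = bin_counts(input_reads, bin_size)
    # Scale the control so both samples have the same total read depth.
    scale = len(chip_reads) / max(len(input_reads), 1)
    hits = []
    for b in sorted(chip):
        fold = (chip[b] + pseudocount) / (control[b] * scale + pseudocount)
        if fold >= min_fold:
            hits.append((b * bin_size, (b + 1) * bin_size, round(fold, 2)))
    return hits


# Hypothetical toy read positions: the ChIP sample piles up around 200-400.
chip_reads = [210, 220, 250, 260, 290, 300, 1010, 1500]
input_reads = [50, 230, 700, 1020, 1490, 1800, 2100, 2500]
peaks = enriched_bins(chip_reads, input_reads)
print(peaks)
```

The pseudocount avoids division by zero in bins where the control has no reads, at the cost of slightly damping the fold change.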

Histone modification profiles
• Histone modifications associated with active transcription were enriched around transcription start sites
• Histone modifications associated with gene repression were depleted around transcription start sites
• This is as expected
• Around DNase I hypersensitive sites that are not near transcription start sites, they found almost the opposite pattern

[Figure: enrichment of active histone marks and depletion of inactive histone marks at a transcription start site]
[Figure: enrichment of inactive histone marks but little enrichment of active histone marks at a DNase I hypersensitive site]

Histone modification profiles
• They also found other patterns
• Combining all the results (plus results for transcription factor binding), they concluded that the human genome can be divided into seven different types of chromatin state
• Which state a region is in depends on its combination of histone modifications and transcription factor binding

The seven chromatin states

The seven chromatin states
• Enhancer (yellow)
• Gene body (green)
• Inactive region (grey)
• Promoter (red)
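The idea that a region's state follows from its combination of marks can be illustrated with a toy lookup. This is only a sketch: ENCODE derived its states with statistical segmentation models, not hand-written rules. The rule table below is an assumption based on commonly reported associations (H3K36me3 in particular does not appear in these slides), and it covers only the four state names listed above.

```python
def assign_state(marks):
    """Assign a toy chromatin state from the set of marks present in a region.

    The rules are illustrative assumptions, checked in priority order.
    """
    marks = set(marks)
    if "H3K4me3" in marks:
        return "Promoter"
    if "H3K4me1" in marks:
        return "Enhancer"
    if "H3K36me3" in marks:
        return "Gene body"
    return "Inactive region"


# Hypothetical regions, each with the marks detected there by ChIP.
regions = {
    "region_A": ["H3K4me3", "H3ac"],
    "region_B": ["H3K4me1"],
    "region_C": ["H3K36me3"],
    "region_D": ["H3K27me3"],
}
states = {name: assign_state(marks) for name, marks in regions.items()}
for name, state in states.items():
    print(name, "->", state)
```

A fixed priority order is the simplest way to resolve regions carrying several marks; a model-based segmentation instead learns which combinations occur together from the data.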

Grand Summary: ENCODE
Transcription start sites:
• Twice as many transcription start sites as traditional “genes”
• Transcripts span large regions, even between genes
DNase I hypersensitive sites:
• Found at more than just transcription start sites
• Two types: those found at TSS, and those found at other regions
• These have different chromatin profiles
Transcription:
• A lot of non-coding transcription (~60% of the genome transcribed), much more than needed just to transcribe all the genes
Histone modifications:
• Active marks correlate with TSS/DHS
• Distal DHS have a different histone modification profile
Chromatin states:
• The genome can be divided into seven different types
• These are determined by the combination of histone modifications and transcription factor binding that occur
ENCODE overview:
• The genome can be generalised into seven different states
• The function of some of these states is known, e.g. promoter
• The function of others is not known, but may explain the high level of transcription and open chromatin structure