Epigenome
Background: Genome-Wide Association Studies (GWAS)
What is a genome-wide association study?
A genome-wide association study (GWAS) rapidly scans markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. Researchers can use this information to develop better ways to detect, treat, and prevent the disease. GWAS are particularly useful for finding genetic variations that contribute to common, complex diseases such as asthma, cancer, diabetes, heart disease, and mental illnesses.
A Catalog of Published Genome-Wide Association Studies
Why are such studies possible now?
- Completion of the Human Genome Project and the International HapMap Project.
- The resulting research tools include:
  - computerized databases that contain the reference human genome sequence,
  - a map of human genetic variation, and
  - a set of new technologies that can quickly and accurately analyze whole-genome samples for genetic variations that contribute to the onset of a disease.
How will genome-wide association studies benefit human health?
- Lead toward personalized medicine.
- Provide patients with individualized information about their risks of developing certain diseases.
- Tailor prevention programs to each person's unique genetic makeup.
- Select the treatments most likely to be effective and least likely to cause adverse reactions in that particular patient.
What have genome-wide association studies found?
- In 2005, three studies found that age-related macular degeneration (a common form of blindness) is associated with variation in the gene for complement factor H, which produces a protein involved in regulating inflammation.
- Other studies have found genetic variations that contribute to the risk of type 2 diabetes, Parkinson's disease, heart disorders, obesity, Crohn's disease, and prostate cancer, as well as genetic variations that influence response to antidepressant medications.
How are genome-wide association studies conducted?
- Use two groups of participants: people with the disease being studied and similar people without the disease.
- Obtain DNA from each participant, e.g. from a blood sample or mouth cells.
- Each participant's complete set of DNA, or genome, is:
  - purified from the blood or cells,
  - placed on tiny chips, and
  - scanned on automated laboratory machines.
- Single nucleotide polymorphisms (SNPs) are identified.
- Genetic variations that are significantly more frequent in people with the disease than in people without it are said to be "associated" with the disease.
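The last step, comparing how often a variant occurs in people with and without the disease, can be sketched as a chi-square test on allele counts. This is only a minimal illustration under stated assumptions, not the procedure of any particular study: the function name `allelic_chi_square`, the example counts, and the 5e-8 threshold (a commonly used genome-wide convention that does not appear in these slides) are all hypothetical choices.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected == 0:
                raise ValueError("every row and column needs a non-zero total")
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 1,000 cases and 1,000 controls, two alleles per person.
chi2, p = allelic_chi_square(case_alt=600, case_ref=1400, ctrl_alt=400, ctrl_ref=1600)
GENOME_WIDE_THRESHOLD = 5e-8  # assumed convention, not taken from the slides
print(f"chi2={chi2:.2f}, p={p:.3g}, associated={p < GENOME_WIDE_THRESHOLD}")
```

A real study would repeat a test like this for every SNP on the chip, which is why a far stricter threshold than the usual 0.05 is typically applied.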
Associated genetic variations can point to the region of the human genome where the disease-causing problem resides. However, the associated variants may not directly cause the disease; they may simply be inherited together with the actual causal variants. Additional steps, such as sequencing the DNA base pairs in that region of the genome, are often needed to identify the exact genetic change involved in the disease.
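How strongly an associated marker is "connected" to a nearby variant is commonly summarized by the linkage disequilibrium statistic r². The slides do not name this measure, so the sketch below is an added illustration under stated assumptions: the function name `ld_r_squared` and the toy haplotypes are hypothetical, and alleles are coded 0/1.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two biallelic loci.

    haplotypes: list of (allele_at_locus_a, allele_at_locus_b) pairs,
    each allele coded as 0 or 1, one pair per observed chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    return d * d / denom


# Toy data: the marker allele always travels with the hypothetical causal allele.
linked = [(1, 1)] * 4 + [(0, 0)] * 6
# Toy data: the two loci vary independently of each other.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r_squared(linked), ld_r_squared(unlinked))
```

An r² near 1 means the marker is a good proxy for the other variant, which is why an association signal alone cannot tell the two apart.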
Histone modification patterns denote complex chromatin states
A vast resource for the normal epigenome
- ~3,000 data sets from over 400 cellular states (cell types, differentiation states, developmental time points)
- ~80 highly information-rich ‘complete epigenomes’ with multiple data types per cell/tissue state:
  - DNase I
  - 6-30 histone modifications
  - DNA methylation
  - RNA
Most epigenomic features are highly cell-selective
Most epigenomic features are highly cell- and lineage-selective
Connecting epigenomic data to genes
The epigenome can ‘remember’ earlier cellular states.
Developmental persistence of enhancer chromatin accessibility
Regulatory DNA variation associated with common diseases and traits
Identification of disease- and trait-associated variation by GWAS
GWAS disease/trait-associated variants × maps of regulatory DNA in >300 diverse cell and tissue types (Maurano et al., Science 2012)
Disease-associated variation is concentrated in non-coding regulatory DNA
Disease- and trait-associated SNPs are concentrated in regulatory DNA
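One simple way to make "concentrated in regulatory DNA" concrete is to compare the fraction of associated SNPs that fall inside regulatory regions with the fraction of the sequence those regions cover. The sketch below is an illustration under stated assumptions, not the analysis from Maurano et al.: the function names, the single toy chromosome, and all coordinates are hypothetical, and regions are assumed to be sorted, non-overlapping, half-open intervals.

```python
from bisect import bisect_right


def fraction_in_regions(positions, regions):
    """Fraction of SNP positions that fall inside any region.

    regions: sorted, non-overlapping (start, end) intervals, end exclusive.
    """
    starts = [start for start, _ in regions]
    hits = 0
    for pos in positions:
        # Index of the last region starting at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits / len(positions)


def fold_enrichment(positions, regions, sequence_length):
    """Observed fraction of SNPs in regions divided by the fraction of sequence covered."""
    covered = sum(end - start for start, end in regions) / sequence_length
    return fraction_in_regions(positions, regions) / covered


# Hypothetical regulatory regions covering 3% of a 10,000 bp toy chromosome.
regulatory_regions = [(100, 200), (500, 600), (900, 1000)]
# Hypothetical associated SNP positions; three of the five fall in the regions.
snp_positions = [150, 550, 950, 3000, 7000]

print(fold_enrichment(snp_positions, regulatory_regions, sequence_length=10_000))
```

A value well above 1 would indicate concentration in the regions; a real analysis would also need a matched background set of SNPs and a significance test.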
GWAS variants selectively localize in regulatory DNA of pathologically relevant cell types
Disease-associated variation clusters in pathogenic or target cell types (Maurano et al., Science 2012)
Variants associated with diseases and traits with developmental contributions preferentially localize in fetal regulatory DNA.
Surveying the normal epigenomic landscape: developing cells and tissues
Most variants lie in regulatory DNA of fetal origin (Maurano et al., Science 2012)
Fetal regulatory variants are enriched in traits and diseases with known links to intrauterine exposures (Maurano et al., Science 2012)
Correcting genetic variation for epigenetic circuitry
Regulatory DNA with disease-associated variants mainly controls distant genes
Regulatory GWAS variants linked to distant genes with pathogenic potential
Disease-associated variants selectively localize to relevant transcription factor recognition sites
Within regulatory DNA, disease-associated variants systematically localize within relevant transcription factor (TF) recognition sites
Disease-associated variants cluster in regulatory pathways and form regulatory networks
Epigenomic data enable pinpointing of disease/trait-relevant cell types
Summary & Implications
- The Roadmap Epigenomics Project has created a vast, high-quality atlas of the epigenomic states of normal cells and tissues
  - A powerful, enabling resource for diverse investigators
- Roadmap data can be integrated to reveal important insights into cellular phenotypes and functions
  - Many novel features of the data await exploration and discovery
- Disease-associated variation is concentrated in regulatory DNA
  - Enables a coherent approach to understanding the role of non-coding variants
- Reference maps of normal cells enable pathogenic insights
  - Reference maps are a powerful tool that, when combined with genetic data, may obviate the need to perform deep profiling of disease populations
- Although it has covered significant ground, the Roadmap is only a start
  - Only a fraction of the true diversity of human cell types and states has been covered
Thank you.