The NIEHS Environmental Genome Project: Enabling Studies of Gene-Environment Interaction
Douglas A. Bell, Ph.D., Environmental Genomics Section, National Institute of Environmental Health Sciences; Professor, Dept. of Epidemiology, UNC School of Public Health
NIEHS’s Environmental Genome Project: Resequencing of ~500 Candidate Genes Potentially Involved in Environmental Disease
- Concept and rationale
- Examples of gene-environment interaction
- Resequencing studies, accomplishments, and accessing data
Modulation of Response to Exposure
[Diagram: Exposure -> Early Effects -> Disease, with Genetic Susceptibility modulating the pathway]
Genetic Modulation of Exposure, Damage, and Biological Response
Genetic variation in:
- Metabolism or distribution, which affects dose to the tissue
- Detection and repair of damage
- Differences in growth and recovery from damage
[Diagram: Exposure -> Target tissue -> Biological Response -> Disease]
Genetic Modulation of Exposure Risk
[Chart: risk with Exposure and with No Exposure for Sensitive Genotype and Resistant Genotype; labeled levels: Background Risk Level (low), 2-Fold Risk, 4-Fold Risk]
Benzo[a]pyrene Metabolism
[Pathway diagram: CYP450 converts benzo[a]pyrene to a DNA-reactive PAH-oxide; GST + glutathione converts the PAH-oxide to an inactive glutathione conjugate]
Benzo[a]pyrene Metabolism: GSTM1 Null
[Pathway diagram: CYP450 converts benzo[a]pyrene to a DNA-reactive PAH-oxide; the GST + glutathione step that yields the inactive conjugate is marked GSTM1 Null]
Bladder Cancer Risk Associated with Smoking and GSTM1 Null Genotype
[Bar chart: GSTM1 (+) vs. GSTM1 null; categories: Nonsmokers, Packyears Smoking, >50 Packyears Smoking; starred values: 4.3*, 3.5*, 5.9*; labels: Exposure Risk, Genetic Risk]
* P<0.001; Bell et al., JNCI 85:1559, 1993
Examples of Gene-Environment Interaction (gene modifies environmental effect)
- Malaria and sickle cell gene.
- HIV infection and CCR5 receptor variant.
- LPS sensitivity and Toll receptor (TLR4).
- Adverse drug response and CYP2D6 poor metabolism.
- Alcohol intolerance and aldehyde dehydrogenase.
- Smoking, GSTM1 null, NAT2 slow genotypes, and bladder cancer risk.
Variation in Risk Estimates in Human Populations
Phenotypic variation in response due to: Physiology, Metabolism, Repair, Growth, Timing of Exposure
[Plot: Risk vs. Exposure]
Example: Metabolism Polymorphisms
[Plot: frequency vs. Activity; No Phenotypic Polymorphism; Range of Enzyme Activity in Human Populations]
Distribution of Polymorphic Enzyme Activity in a Population
[Plot: frequency vs. Activity (High to Low) for genotypes +/+, +/-, -/-]
Examples: N-Acetyltransferase 2, GSTM1, CYP2D6
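The three genotype classes on this slide (+/+, +/-, -/-) can be related to an allele frequency if one assumes Hardy-Weinberg equilibrium. That assumption, the function name, and the allele frequency used below are illustrative additions and are not stated in the talk. A minimal sketch:

```python
def hardy_weinberg(q):
    """Expected genotype frequencies for a two-allele locus.

    q is the frequency of the '-' (low-activity or null) allele.
    Assumes Hardy-Weinberg equilibrium, which is an assumption of this
    sketch and not a claim made in the slides.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p = 1.0 - q
    return {"+/+": p * p, "+/-": 2 * p * q, "-/-": q * q}


# Hypothetical allele frequency, chosen only for illustration.
for genotype, freq in hardy_weinberg(0.5).items():
    print(f"{genotype}: {freq:.2f}")
```

Under this assumption, a '-' allele frequency of 0.5 puts a quarter of the population in the -/- class, which shows how a common null allele can leave a large share of people without the enzyme activity.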
How does frequency of a risk factor impact exposure-induced (G x E) risk in the population?
[Plot: frequency vs. Activity, with population fractions 5% and 95%]
Effects of Exposure in High and Low Risk Human Populations
[Plots: Risk vs. Exposure with High Risk, Low Risk, and Average curves; frequency vs. Activity with population fractions 5% and 95%]
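One way to read the Average curve is as a frequency-weighted mean of the subgroup risks. The sketch below uses the 5% / 95% split shown on the slide; the linear dose-response form and the slope values are hypothetical and serve only to make the arithmetic concrete.

```python
def average_risk(exposure, groups):
    """Frequency-weighted mean risk across genotype subgroups.

    groups is a list of (frequency, slope) pairs. Risk is modeled as
    slope * exposure, a deliberately simple linear assumption.
    """
    total = sum(freq for freq, _ in groups)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("subgroup frequencies must sum to 1")
    return sum(freq * slope * exposure for freq, slope in groups)


# The 5% / 95% split is from the slide; the slopes are hypothetical.
HIGH_RISK = (0.05, 4.0)
LOW_RISK = (0.95, 1.0)

for dose in (0.0, 1.0, 2.0):
    avg = average_risk(dose, [HIGH_RISK, LOW_RISK])
    print(f"exposure={dose}: high={HIGH_RISK[1] * dose:.2f}, "
          f"low={LOW_RISK[1] * dose:.2f}, average={avg:.2f}")
```

With these illustrative numbers the average stays close to the low-risk curve, so an estimate based on the population average would understate the risk carried by the 5% high-risk subgroup.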
How will genetic data be used in public health risk assessment?
Given detailed information on the relationship between genotype and phenotype, more accurate risk assessments may be possible.
Risk Assessment Process
[Flow diagram: Animal toxicology (dose/response) -> Risk Model (extrapolation to humans) -> Effects in Humans?; Exposure Assessment; Hazard/Risk Assessment; Risk Management (more/less control, engineering design). Human Genetic Susceptibility (S, R): replace default assumptions about variability.]
Incorporating Human Genetic Polymorphism Information Into Risk Assessment
[Diagram: Chemical X; Cancer - Yes/No; Dose?; Extrapolate to Humans; Susceptible human subgroup? Supporting information: Biochemistry; Mechanism of toxicity; Genes, pathways; Human genetics]
Incorporating Genetics Into Risk Assessment: Issues
A polymorphism may have different effects depending on the chemical, the target organ/disease, and the population being considered.
Thus, a protective allele for one chemical may convey risk for a different chemical. Similarly, one organ system may be protected at the risk of another; e.g., immune system response could increase DNA damage or neurotoxicity.
D.A.Bell NIEHS
GST Theta 1 (GSTT1): One gene with 2 effects
[Diagram. Detoxication: Ethylene oxide + Glutathione, via GSTT1, to an Inactive conjugate. Activation: Methylene chloride + Glutathione, via GSTT1, to an Unstable, DNA Reactive intermediate. Other labels: HCHO; Cl-; (also Methyl chloride)]
Activation vs. Detoxication
Effects of polymorphism dependent on chemical and toxicity pathway:
- Activation: If the activation pathway is missing (null genotypes), some individuals may have zero risk even if they have exposure.
- Detoxication: Since this process will never be 100% efficient, both functional and low activity genotypes will exhibit risk associated with exposure.
The Effect of GSTT1 Genotype on Metabolism of Methyl Chloride
[Figure from Lof, A. et al., Pharmacogenetics 10:645: Measure exhaled methyl chloride; T1 +: Metabolism to DNA reactive forms; T1 Null: No Metabolism]
Smoking, GSTT1 Polymorphism, and Markers of Genotoxicity in Erythrocytes
Background: Ethylene oxide-hemoglobin adducts are a good measure of smoking exposure in blood.
Experiment: To test whether GST genotypes modulate the effects of smoking in erythrocytes, we measured ethylene oxide-hemoglobin adducts in freshly collected human erythrocytes from nonsmokers and smokers.
Results: Ethylene oxide adducts (HEV) were ~50% higher in GSTT1 null individuals.
GSTT1 null genotypes have higher levels of smoking-induced hemoglobin adducts
Study design: 16 nonsmokers, 32 smokers; HEVal hemoglobin adducts measured by mass spectrometry.
P = for difference in slopes; nonparametric analysis similar.
Fennell et al., CEBP 9:705, 2000
Incorporating Genetics Into Risk Assessment: Needs
- Identify genes involved in toxicological response.
- Detailed population genetic information, including:
  - Identify polymorphisms.
  - Determine frequency in populations.
  - Population-based risk estimates in large studies (n=2000).
- Determine functional relationship between genotype and phenotype:
  - Biochemical
  - In vitro, in vivo quantitative measurements of a cellular phenotype (tumors, adducts, mutation, cell death, gene expression).
- Consider role of multiple genes, multiple pathways, etc.
- Incorporate kinetic or other functional data into risk model.
Environmental Genomics
[Diagram: Discovery (Phenotype-directed, Genotype-directed) -> Functional Analysis -> Disease Risk Characterization. Labels: Phenotype; Genotype; CTTATGT A/C GGGTAT; Altered Binding; Effects in Populations]
Polymorphism and Function
[Gene diagram: Promoter, Exon 1, Exon 2, 3’ UTR; Transcription Factors]
- Gene deletions, duplications (e.g., GSTM1, CYP2D6)
- Coding region changes: amino acid substitutions, deletions, stops
- Regulatory polymorphisms alter transcription factor binding and mRNA/protein level
Effects of polymorphism: altered function; quantity of protein
Phenotype-Directed Approach to Find SNPs That Alter Gene Expression Level
[Sequence figure: C TGGGCCCCGCCCCCTTATGTAGGGTATAAAGCCC .... CCCGTCACC ATG; SP1/Oct site marked]
Liu, X. et al
Sequence-Directed Approaches to Catalogue All Significant SNPs in the Human Population
Resequencing Projects: Describing candidate gene polymorphisms in diverse populations.
~9 million SNPs in dbSNP now; by 2006, expect ~20 million human SNPs.
- A SNP every ~100 bases.
Haplotype Map: Describing which SNPs occur together on chromosomes in populations (haplotypes).
SNP Discovery Projects
- The SNP Consortium: ~1 million SNPs across genome
- NIEHS: Environmental/toxicology genes
- NHLBI: Heart disease genes, inflammation
- NIGMS: Pharmacogenetic genes
SNP data are entered into the NCBI dbSNP database.
[Screenshots: HapMap; UCSC]
- U Wash EGP Website
HapMap Website
- Characterize the large-scale genetic structure across the genome.
- Genotyping SNPs at 1 kb intervals across the genome in European, African, and Asian populations.
Bioinformatic Tools Available For Picking Haplotype Tagging SNPs
- HapMap Website
- Seattle SNPs or EGP website
- Many other freely available programs
NIEHS Environmental Genome Project: Resequencing of candidate environmental disease genes
Accomplishments:
- Total genes sequenced = 437
- Total kilobases sequenced = 11,001 kb
- Total SNPs identified = 59,475
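The three totals above imply some per-gene and per-kilobase averages. The sketch below simply divides the reported totals; the function name and the derived quantities are illustrative and are not figures quoted in the talk.

```python
def resequencing_summary(genes, kilobases, snps):
    """Derive simple averages from the reported resequencing totals."""
    return {
        "kb_per_gene": kilobases / genes,
        "snps_per_gene": snps / genes,
        "snps_per_kb": snps / kilobases,
        "bases_per_snp": kilobases * 1000 / snps,
    }


# Totals as reported on the Accomplishments slide.
summary = resequencing_summary(genes=437, kilobases=11_001, snps=59_475)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

These are crude averages over all resequenced genes: roughly 25 kb and 136 SNPs per gene, or about one SNP per 185 bases in the sequenced samples. They say nothing about how SNPs are distributed within or between genes.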
NIEHS’s Environmental Genome Project: Summary
- Gene-environment interaction affects disease risk.
- Effects of G x E interactions can be complex.
- Resequencing projects are providing many new candidate gene polymorphisms.
- Determining the important functional SNPs that affect disease risk is a difficult challenge.
Strategies For Incorporating SNPs Into Epidemiology Studies
1. Whole genome association studies
   - Test 10, ,000 SNPs in case-control studies.
   - Identify candidate regions and genes; follow up with candidate gene studies.
2. High resolution candidate gene studies
   - Test functional SNPs and additional haplotype tagging SNPs in case/control or other design.
   - Bioinformatics to identify 1500 SNPs, 150 genes (10 SNPs/gene).
   - Coding SNPs, regulatory SNPs, haplotype tag SNPs.
Bioinformatic Identification of SNPs That Affect Gene Expression
- Application to p53 response elements
- Application to NRF2 response elements
p53-inducible genes contain p53 Response Elements (RRRCWWGYYY).
Following UV exposure, p53 binds the RE of the target gene.
Using bioinformatic methods, identify SNPs that disrupt p53 response elements.
[Diagram: SEI1 gene with p53 bound at the response element, RNA Pol, ATG, SEI1 mRNA; consensus RRRCWWGYYY shown alongside variant RRRCWWAYYY]
Test SNPs Against p53 Response Element Consensus
[Workflow diagram: dbSNP Data, Binding Site Consensus, and NCBI/Ensembl Genome Data; Access database; Build Table of All Promoter SNPs; Test SNPs against consensus RRRCWWGYYYRRRCWWGYYY; Filter: Best Hits]
Example sequence: AAAGGACAAGTTGAAACTTGCACAAGCAGCCTCCATTCTG
DNA ambiguity code: R = A or G; Y = C or T; W = A or T
Dan Tomso
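The test against the consensus can be illustrated by counting mismatches between a 10-base half-site and RRRCWWGYYY, once for the reference allele and once for the variant allele. This is a minimal sketch and not the pipeline used in the talk; the function names, the example half-site, and the SNP position are hypothetical.

```python
# Ambiguity codes from the slide, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT",
}

HALF_SITE = "RRRCWWGYYY"  # p53 response element half-site consensus


def mismatches(seq, consensus=HALF_SITE):
    """Count positions where seq violates the ambiguity-coded consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must be the same length")
    return sum(base not in IUPAC[code]
               for base, code in zip(seq.upper(), consensus))


def snp_effect(seq, pos, ref, alt, consensus=HALF_SITE):
    """Return (mismatches with ref allele, mismatches with alt allele)."""
    if seq[pos].upper() != ref.upper():
        raise ValueError("reference allele does not match the sequence")
    variant = seq[:pos] + alt + seq[pos + 1:]
    return mismatches(seq, consensus), mismatches(variant, consensus)


# Hypothetical half-site that fits the consensus, with a G->A change
# at the conserved G (0-based position 6), as in RRRCWWAYYY.
print(snp_effect("GGGCATGTCC", 6, "G", "A"))  # (0, 1)
```

A SNP that raises the mismatch count, especially inside the CWWG core, would be a candidate for disrupting p53 binding. A fuller version would score both half-sites and rank promoter SNPs by the change in score.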
Mismatch with consensus CWWG motif
Dan Tomso
Do SNPs in putative p53 response elements affect p53-induced expression in Saos2 cells?
[Figure: Weak vs. Strong; Saos2 Osteosarcoma Cells (p53 null)]
Mike Resnick, Alberto Inga, Daniel Menendez
Environmental Genomics Section: Douglas A. Bell; Gary S. Pittman; Merrill ‘Chip’ Miller, III; Daniel J. Tomso; Michelle R. Campbell; Xuemei Liu; Xuting Wang; Monica Horvath
Phylogenetic Footprinting of NRF2/ARE Genes
[Venn diagram: ~4000 Human ARE-containing genes; ~4000 Mouse ARE-containing genes; ~2100 Rat ARE-containing genes; overlaps labeled Human/mouse/rat and human/mouse]
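The overlaps in this diagram amount to set intersections over genes predicted to contain an ARE in each species. The sketch below assumes orthologs have already been mapped to shared identifiers; the gene identifiers are placeholders, not data from the project.

```python
def conserved_genes(*species_sets):
    """Genes predicted to carry an ARE in every species supplied."""
    if not species_sets:
        return set()
    return set.intersection(*(set(s) for s in species_sets))


# Placeholder identifiers standing in for ortholog-mapped gene lists.
human = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
mouse = {"GENE_A", "GENE_B", "GENE_C", "GENE_E"}
rat = {"GENE_A", "GENE_B", "GENE_F"}

print(sorted(conserved_genes(human, mouse)))       # human/mouse overlap
print(sorted(conserved_genes(human, mouse, rat)))  # human/mouse/rat overlap
```

Requiring the element in more species shrinks the candidate list, which is the point of footprinting: sites conserved across species are more likely to be functional.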
Gene x Environment Interaction
Pharmacogenetics:
- Adverse drug reactions (toxicity)
- Reduced efficacy
Environmental disease:
- Modification of exposure-induced toxicity
- Modification of exposure-induced disease
Can we generalize about risk associated with a specific gene?