Molecular and Genetic Epidemiology Kathryn Penney, ScD January 5, 2012.

Definitions
- Genetic Epidemiology
  - ‘a science which deals with the etiology, distribution, and control of disease in groups of relatives and with inherited causes of disease in populations’ - Morton, 1982
- Molecular Epidemiology
  - seeks to identify human (cancer) risk and (carcinogenic) mechanisms to improve (cancer) prevention strategies
  - is multi-disciplinary and translational, going from the bench to the field and back
  - uses biomarkers and state-of-the-art technologies to gain mechanistic information from epidemiological studies

Genetic and Molecular Epidemiology
Diagram: Genetic variation, Exposure, Biological Factors/Mechanism, Disease. Association?

Genetic Studies

Twin studies
- Determine if a disease has a genetic component
- Estimate the genetic contribution to disease (heritability)
  - Genetics (heritable component)
  - Shared environment
  - Unique environment
- Twins
  - Monozygotic (MZ) share 100% of their genes
  - Dizygotic (DZ) share ~50% of their genes on average
- Use correlation of trait/disease
  - R_MZ = genetics + shared environment
  - R_DZ = ½ genetics + shared environment
  - Genetics = 2 × (R_MZ − R_DZ)
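The twin-correlation arithmetic on this slide can be sketched in a few lines. This is a minimal illustration, assuming the simple decomposition shown above; the function name and the example correlations are hypothetical, and the unique-environment term (1 − R_MZ) is the usual remainder in this kind of model rather than something stated on the slide.

```python
def twin_components(r_mz, r_dz):
    """Split trait variance using twin correlations (simple slide model).

    r_mz = genetics + shared environment
    r_dz = 1/2 genetics + shared environment
    """
    genetics = 2 * (r_mz - r_dz)      # heritability estimate
    shared_env = r_mz - genetics      # what remains of the MZ correlation
    unique_env = 1 - r_mz             # assumption: remainder not shared by MZ twins
    return genetics, shared_env, unique_env


# Hypothetical correlations, chosen only to illustrate the arithmetic.
genetics, shared_env, unique_env = twin_components(r_mz=0.75, r_dz=0.5)
print(genetics, shared_env, unique_env)
```

A larger gap between the MZ and DZ correlations translates directly into a larger estimated genetic contribution.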

Heritability Lichtenstein et al, 2000

Association studies
- Family based
  - Parent-child trios, siblings
- Population based
  - Case-control
- Types of studies
  - Candidate gene/SNPs
  - Genome-wide association study (GWAS)
- Single nucleotide polymorphisms (SNPs) vs. mutations/rare variants
  - Germline variation
  - SNPs > 1% population frequency
Figure: A/A and A/C genotypes among cases and controls

Samples
- Blood
  - DNA, RNA, biomarkers (dietary, hormones)
- Tissue
  - Tumor and normal
  - DNA, RNA, proteins

Candidate genes
- Select a gene of interest
- Select SNPs to genotype
  - Literature
  - tagSNPs
  - Haplotype tagSNPs
Figure: five haplotypes (CGAACG, CGAACG, CGACCG, CTACCA, CTACCA) with three SNPs: G/T, A/C, G/A
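The idea behind tagSNPs can be illustrated with the five haplotypes from this slide's figure. The sketch below assumes the three variable columns are the G/T, A/C, and G/A SNPs and computes the pairwise squared correlation (r²) between them; the function name and 0-based column indices are illustrative choices, and real tools such as Tagger work on much larger panels with their own selection rules.

```python
from itertools import combinations

# Haplotypes as they appear in the slide figure.
haplotypes = ["CGAACG", "CGAACG", "CGACCG", "CTACCA", "CTACCA"]
snp_positions = [1, 3, 5]  # 0-based columns of the G/T, A/C and G/A sites


def r_squared(haps, i, j):
    """Squared correlation (r^2) between two biallelic sites."""
    n = len(haps)
    a, b = haps[0][i], haps[0][j]  # use the first haplotype's alleles as reference
    p_a = sum(h[i] == a for h in haps) / n
    p_b = sum(h[j] == b for h in haps) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haps) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


ld = {(i, j): r_squared(haplotypes, i, j) for i, j in combinations(snp_positions, 2)}
for pair, value in ld.items():
    print(pair, round(value, 3))
```

In this toy panel the G/T and G/A sites always travel together (r² of 1, up to floating-point rounding), so genotyping either one captures the other; the A/C site is only partly correlated with them and would need its own assay.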

Candidate genes
- The International HapMap Project
  - Catalog of common genetic variants
  - Describes what these variants are, where they occur, and how they are distributed among people within populations and among populations

Candidate genes
- Haploview – visualize correlations between SNPs in HapMap or study data
- Tagger – method to select tagSNPs in HapMap or study data

Candidate genes
- Are the SNPs associated with outcome?
- Are the SNPs associated with intermediate phenotypes/biomarkers/tumor markers?
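One simple way to ask whether a SNP is associated with outcome is an allelic chi-square test on a 2x2 table of allele counts in cases and controls. The sketch below is one possible approach, not necessarily the analysis used in the lecture; the counts are hypothetical, and the p-value uses the closed form for a chi-square variable with 1 degree of freedom.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts."""
    n = case_a + case_b + control_a + control_b
    numerator = n * (case_a * control_b - case_b * control_a) ** 2
    denominator = (
        (case_a + case_b)
        * (control_a + control_b)
        * (case_a + control_a)
        * (case_b + control_b)
    )
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts (A vs. C) in cases and controls.
chi2, p_value = allelic_chi_square(case_a=300, case_b=700, control_a=250, control_b=750)
odds_ratio = (300 * 750) / (700 * 250)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, OR={odds_ratio:.2f}")
```

The same function applies to an intermediate phenotype once it has been dichotomized, although continuous biomarkers are usually better handled with regression.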

Genotyping technology
- TaqMan
  - PCR-based fluorescent assay
  - Single SNP assay
- Sequenom
  - PCR-based single-base extension
  - MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization – Time Of Flight)
  - Multiplex (≤36-40 SNPs) assay

Genome-wide Association Study (GWAS)
- Estimated 10 million SNPs in the genome
- Genotype 350k – 1 million SNPs across the entire genome
- Test association of each SNP with outcome
- Adjust for the number of tests performed
  - p < 5×10^-8 considered “genome-wide” significant
- Replicate findings in a different population
  - Same SNP, same direction, approximately the same magnitude of effect
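The genome-wide threshold is commonly motivated as a Bonferroni correction of a 0.05 significance level for roughly one million independent tests. The slide does not spell out this derivation, so the sketch below treats the number of tests as an assumption; the helper name is hypothetical.

```python
alpha = 0.05
n_tests = 1_000_000  # assumed number of independent tests across the genome

# Bonferroni correction: divide the significance level by the number of tests.
threshold = alpha / n_tests


def is_genome_wide_significant(p_value, threshold=threshold):
    """True when a SNP's p-value passes the corrected threshold."""
    return p_value < threshold


print(threshold)
print(is_genome_wide_significant(3e-9), is_genome_wide_significant(1e-6))
```

A p-value of 1e-6 would look striking in a single-SNP study but does not survive correction for a million tests, which is why replication in a second population matters.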

GWAS results Amundadottir et al, 2009

Published Genome-Wide Associations through 6/2010: 904 published GWA at p < 5×10^-8 for 165 traits (NHGRI GWA Catalog)

Genotyping technology
- Illumina
  - 1 million SNP chip
  - tagSNPs selected from HapMap data
- Affymetrix
  - 1 million SNP chip
  - Selected based on distance
technote_intelligent_snp_selection.pdf

Whole Genome Sequencing
- Human Genome Project
  - First draft genome announced in 2000; project completed 2003
- 1000 Genomes Project
  - Goal: to create a complete and detailed catalogue of human genetic variation
- Knome (founded by George Church and Harvard University)
  - knomeDiscovery – sequencing (30x) and interpretation for ~$5,000
- The Personal Genome
  - Interpretation (counseling?)
  - Screening?
  - High-risk groups?
  - Drug efficacy?
  - May help individuals alter behavior – but for now, we can’t do anything about our genes!

Bias in Genetic Studies

Diagram: Genetic polymorphism, Disease, ??? (CONFOUNDING)

Bias in Genetic Studies
Diagram: Genetic polymorphism, Disease, Race/Ethnicity (CONFOUNDING)

Population Stratification
- Example:
  - Prostate cancer is more common in African Americans than in Caucasians
  - Frequency of many SNPs is different in African American and Caucasian populations
  - If we ignored race/ethnicity, what might happen in our study?

Population Stratification
Figure 1. The effects of population structure at a SNP locus. If the study population consists of subpopulations that differ genetically, and if disease prevalence also differs across these subpopulations, then the proportions of cases and controls sampled from each subpopulation will tend to differ, as will allele or genotype frequencies between cases and controls at any locus at which the subpopulations differ. The figure shows an example of this scenario with two populations in which the cases have an excess of individuals from population 2 and population 2 has a lower frequency of allele A than population 1. In this example, the structure mimics the signal of association in that there is a significant difference in allele and genotype frequencies between cases and controls. (Marchini, 2004)
Slide labels: Caucasian, African American
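The scenario in the figure caption can be reproduced with a small calculation. The sketch below uses hypothetical allele frequencies and sample sizes: within each population the frequency of allele A is the same in cases and controls (no true association), yet pooling the populations makes cases and controls look different.

```python
# Hypothetical subpopulations: frequency of allele A and numbers sampled.
populations = {
    "population 1": {"freq_A": 0.8, "cases": 200, "controls": 800},
    "population 2": {"freq_A": 0.3, "cases": 800, "controls": 200},
}


def pooled_allele_frequency(group):
    """Frequency of allele A in cases or controls, ignoring population."""
    total = sum(p[group] for p in populations.values())
    weighted = sum(p["freq_A"] * p[group] for p in populations.values())
    return weighted / total


freq_cases = pooled_allele_frequency("cases")
freq_controls = pooled_allele_frequency("controls")
print(f"allele A in cases: {freq_cases:.2f}, in controls: {freq_controls:.2f}")
```

Because cases are drawn mostly from population 2, where allele A is rarer, the pooled case frequency falls well below the pooled control frequency even though the allele has no effect on disease in either population.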

Adjusting for Ethnicity
- Defining & measuring ethnicity
  - Self-report
  - Ancestry (where are your grandparents from?)
  - Genotype many (hundreds) “ancestry informative markers”
- Control for ethnicity
  - In design
    - Restrict to one ethnicity
    - Match on ethnicity
  - In analysis
    - Stratify by ethnicity
    - Include ethnicity in regression model
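Stratifying in the analysis can be sketched with a Mantel-Haenszel pooled odds ratio, which is one common way to combine stratum-specific 2x2 tables; the slide does not name a specific method, so this is an illustrative choice. The counts are hypothetical and constructed so that the SNP is unrelated to disease within each group.

```python
def odds_ratio(a, b, c, d):
    """a, b = carrier cases/controls; c, d = non-carrier cases/controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts; within each group the SNP is unrelated to disease.
strata = [
    (160, 640, 40, 160),   # group 1: carrier frequency 0.8, fewer cases
    (240, 60, 560, 140),   # group 2: carrier frequency 0.3, more cases
]

# Collapsing the strata gives the crude (unadjusted) table.
crude = tuple(sum(column) for column in zip(*strata))
crude_or = odds_ratio(*crude)
adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, stratified OR = {adjusted_or:.2f}")
```

The crude odds ratio suggests a strong protective effect, while the stratified estimate returns to the null, which is the behavior expected when ethnicity confounds the SNP-disease association.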

Misclassification
- Non-differential
  - Of exposure: the degree of misclassification is the same regardless of disease status
  - Likelihood that exposure is wrong is similar among those who do and do not develop disease
- Differential
  - Of exposure: the degree of misclassification varies according to disease status

Misclassification
- Laboratory tests do not always work perfectly – some % of samples may fail genotyping
  - Missing or incorrect exposure information
- Non-differential or differential misclassification?
- What can we do to ensure that the misclassification is non-differential?
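The practical difference between the two kinds of misclassification can be shown with expected counts. The sketch below assumes a binary exposure (for example, carrier vs. non-carrier) measured with a given sensitivity and specificity; all counts and error rates are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


def observe(exposed, unexposed, sensitivity, specificity):
    """Expected counts after imperfect exposure (genotype) measurement."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed


# Hypothetical true counts: OR = (60 * 60) / (40 * 40) = 2.25
cases, controls = (60, 40), (40, 60)
true_or = odds_ratio(*cases, *controls)

# Non-differential: the same error rates in cases and controls.
nd_or = odds_ratio(*observe(*cases, 0.9, 0.9), *observe(*controls, 0.9, 0.9))

# Differential: exposure measured less accurately in controls only.
diff_or = odds_ratio(*observe(*cases, 0.9, 0.9), *observe(*controls, 0.7, 0.9))

print(f"true {true_or:.2f}, non-differential {nd_or:.2f}, differential {diff_or:.2f}")
```

With equal error rates the estimate is pulled toward 1, whereas unequal error rates can push it in either direction; here it moves away from the null. Handling case and control samples identically in the laboratory (same plates, same batches, blinded to status) is one way to keep errors non-differential.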

Gene x Environment Interaction: An Example of Effect Modification
Given equal exposure to the same risk factor, individuals may have different risk of disease depending on their genetic background
- The effect of an exposure on a disease outcome is modified by genotype

Gene-environment interaction
All genotypes combined (OR = 1)
        D+    D-
E+     100   100
E-     100   100
Stratify on genotype
AA genotype (OR = 1)
        D+    D-
E+      40    20
E-      80    40
AT/TT genotype (OR = 2.25)
        D+    D-
E+      60    80
E-      20    60
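The odds ratios on this slide can be checked directly from the cell counts. The sketch below assumes the first stratum table (40, 20, 80, 40) belongs to the AA genotype and the second (60, 80, 20, 60) to AT/TT, which is the assignment consistent with the odds ratios shown; the helper name is illustrative.

```python
def odds_ratio(a, b, c, d):
    """a = E+D+, b = E+D-, c = E-D+, d = E-D-."""
    return (a * d) / (b * c)


aa = (40, 20, 80, 40)      # AA genotype
at_tt = (60, 80, 20, 60)   # AT/TT genotype

# Collapsing over genotype gives the unstratified table.
combined = tuple(x + y for x, y in zip(aa, at_tt))

print("combined:", combined, odds_ratio(*combined))
print("AA:", odds_ratio(*aa))
print("AT/TT:", odds_ratio(*at_tt))
```

The exposure shows no effect overall or among AA individuals, but it is associated with disease among AT/TT carriers, which is the pattern the slide describes as effect modification by genotype.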

Effect Modification is Biological
Diagram: CYP1A1, GSTM1, Metabolism, DNA damage, Lung Cancer

GWAS follow-up

- Dozens of GWAS for many diseases have now been performed
- Thousands of samples and hundreds of thousands of SNPs
- Replication is necessary to determine which significant results are real
- Once we know the results are real, then what???
Eeles RA et al. (2008)

GWAS follow-up
- Risk prediction model development
- Understand biological function: candidate genes/regions!
  - Some associated SNPs are not in gene regions
  - Many types of biological data and techniques can be employed to determine the function of the risk SNPs
    - Fine mapping
    - Expression (RNA and protein)
    - Enhancer activity

GWAS follow-up – 8q24 story
(A) Haploview output of the 1.18-Mb 8q24 "desert" showing the five cancer-specific regions reported to date (Ghoussaini et al.)

GWAS follow-up – 8q24 story
8q24 variation not associated with MYC mRNA expression in prostate tumor or normal tissue (Pomerantz et al.)

GWAS follow-up – 8q24 story (Pomerantz et al, 2009)
(a) ChIP assay on Colo205, demonstrating a pattern consistent with enhancer activity.
(b) Luciferase reporter assay demonstrating enhancer activity in two CRC lines. Error bars denote one standard deviation from the mean of replicate assays.
(c) Representative luciferase assay showing increased enhancer activity of G over T alleles, performed on a total of 18 clones (nine G and nine T over 3 d) (P = 0.024). Error bars denote one standard deviation from the mean of assays performed in triplicate.
(d) Mass spectrometry plots from Sequenom analysis showing preferential binding of TCF7L2 to risk allele (G) in immunoprecipitated DNA, as evidenced by differential peak heights (right panel) compared to control input DNA (left panel) (P = ).

GWAS follow-up (and beyond)
Diagram: GWAS results, mRNA expression

Thank you! Questions?