Post-GWAS and Mechanistic Analyses Session 1, Day 4, Liu
I identified a SNP, now what? Congratulations! But…
More Questions: A genotype-phenotype association does not mean a cause-effect relationship.
Are You Sure? If you are dedicated to doing real science, every step down the road costs time, money, labor, value, reputation and life! Validate the result carefully before you jump into any mechanistic study. If you can't do it yourself, ask other people and collaborate.
You Only Identified a Marker! Is the top SNP the one I should follow up? Should I ignore SNPs that do not reach significance? The P value is not a smart judge! Did I miss any other SNPs? Fine mapping: narrow down among the top SNP, genotyped SNPs, untyped SNPs, unidentified SNPs, and other types of variants.
“Balloon” “Skyscraper”
Things To Do: imputation, sequencing, LD, haplotype.
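The LD item above can be made concrete with a small calculation. The sketch below computes D, r² and D' for two biallelic SNPs from phased haplotypes; the function name `ld_stats` and the toy haplotypes are assumptions for illustration, not taken from the slides.

```python
def ld_stats(haplotypes):
    """Return (D, r2, D') for two biallelic SNPs.

    haplotypes: list of (a, b) pairs, alleles coded 0/1 at SNP A and SNP B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    return d, r2, d_prime


# Made-up haplotypes: the (0, 1) haplotype is never observed.
toy_haplotypes = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] * 2
d, r2, d_prime = ld_stats(toy_haplotypes)
print(f"D={d:.3f} r2={r2:.3f} D'={d_prime:.3f}")
```

In this toy data D' is 1 while r² is well below 1, which is the usual pattern when one of the four possible haplotypes is absent.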
Genotype Imputation: predicting or imputing genotypes that are not directly assayed in a sample of individuals (“in silico genotyping”). Covers missing data and untyped SNPs. No cost. Increases the possibility of identification.
Application: before GWAS, increase the chance of identification; after GWAS, fine mapping; alternative GWAS, using existing control data; meta-analysis, combining data.
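For the meta-analysis use case, one common way to combine per-study results for the same (possibly imputed) SNP is an inverse-variance weighted fixed-effect estimate. The slides do not say which method is meant, so treat the sketch below as an assumption; the function name and the numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. log odds ratios).
    ses:   per-study standard errors.
    Returns (pooled beta, pooled SE, z score, two-sided p value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Hypothetical results for the same SNP in two studies.
beta, se, z, p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

Imputation matters here because studies genotyped on different arrays only share a SNP set after the untyped SNPs have been filled in.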
[Figure: genotype grid with axes SNPs and Patients; unknown entries shown as ?]
Process Marchini et al., 2011
Accuracy Howie, et al., Nature Genetics 44, 955–959 (2012)
Strategies: SNP tagging-based approaches. PLINK, SNPMSTAT, UNPHASED, TUNA. Simple multinomial model of haplotype frequencies. Low accuracy but quick.
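The idea behind tagging-based imputation can be sketched with haplotype frequencies from a reference panel: count the reference haplotypes that carry the observed tag allele, and read off the allele distribution at the untyped SNP. This is a toy illustration only, not the algorithm of PLINK or any other listed tool; `impute_from_tag` and the reference counts are hypothetical.

```python
from collections import Counter


def impute_from_tag(reference_haplotypes, tag_allele):
    """Estimate P(untyped allele | tag allele) from reference haplotypes.

    reference_haplotypes: list of (tag_allele, untyped_allele) pairs.
    Returns a dict mapping each untyped allele to its conditional probability.
    """
    counts = Counter(u for t, u in reference_haplotypes if t == tag_allele)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("tag allele not seen in the reference panel")
    return {allele: c / total for allele, c in counts.items()}


# Made-up reference panel of 100 haplotypes over a tag SNP and an untyped SNP.
reference = [("A", "C")] * 45 + [("A", "T")] * 5 + [("G", "T")] * 50
print(impute_from_tag(reference, "A"))
print(impute_from_tag(reference, "G"))
```

The quality of the guess depends entirely on how strongly the tag SNP is in LD with the untyped SNP, which is one reason this family of methods is quick but less accurate.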
Strategies: Hidden Markov model (HMM)-based approaches. High accuracy. Rare SNPs. IMPUTE v1, IMPUTE v2, fastPHASE/BIMBAM, MACH, BEAGLE.
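To give a feel for the HMM idea, the sketch below treats one haploid target as a mosaic copied from reference haplotypes and runs a forward-backward pass to get the allele probability at an untyped site. It is a heavily simplified illustration under stated assumptions (haploid data, a constant switch rate, a constant mismatch rate), not the model used by IMPUTE, MACH or BEAGLE; all names and parameter values are hypothetical.

```python
def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def impute_hmm(reference, observed, switch=0.05, error=0.01):
    """Return P(allele == 1) at every site for one haploid target.

    reference: list of K haplotypes, each a list of 0/1 alleles over M sites.
    observed:  list of length M with 0/1 at typed sites, None at untyped sites.
    switch, error: assumed per-site switch and mismatch probabilities.
    """
    k, m = len(reference), len(observed)

    def emit(h, j):
        # Untyped sites carry no information about which haplotype is copied.
        if observed[j] is None:
            return 1.0
        return 1.0 - error if reference[h][j] == observed[j] else error

    def step(vec):
        # Stay on the same haplotype, or jump uniformly to any haplotype.
        total = sum(vec)
        return [(1.0 - switch) * v + switch * total / k for v in vec]

    # Forward pass over sites.
    fwd = [normalise([emit(h, 0) / k for h in range(k)])]
    for j in range(1, m):
        pred = step(fwd[-1])
        fwd.append(normalise([pred[h] * emit(h, j) for h in range(k)]))

    # Backward pass (the transition matrix is symmetric, so step() is reused).
    bwd = [[1.0] * k for _ in range(m)]
    for j in range(m - 2, -1, -1):
        weighted = [bwd[j + 1][h] * emit(h, j + 1) for h in range(k)]
        bwd[j] = normalise(step(weighted))

    # Posterior over copied haplotypes, then the allele-1 probability per site.
    result = []
    for j in range(m):
        post = normalise([fwd[j][h] * bwd[j][h] for h in range(k)])
        result.append(sum(post[h] for h in range(k) if reference[h][j] == 1))
    return result


# Made-up panel of three reference haplotypes; site 2 is untyped in the target.
reference = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 1, 0, 1, 0]]
probs = impute_hmm(reference, [1, 1, None, 1, 1])
print(round(probs[2], 3))
```

Because the model uses all flanking typed sites at once rather than a single tag, it can separate haplotypes that a single tag SNP cannot, which is consistent with the higher accuracy noted on the slide.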
IMPUTE v1, IMPUTE v2, SNPTEST http://www.stats.ox.ac.uk/~marchini/#software
fastPHASE, BIMBAM http://stephenslab.uchicago.edu/software.html
MACH, MACH2DAT, MACH2QTL http://www.sph.umich.edu/csg/abecasis/MACH/
BEAGLE http://www.stat.auckland.ac.nz/bbrowning/beagle/beagle.html
SNPMSTAT http://www.bios.unc.edu/lin/software/SNPMStat/
GEDI http://dna.engr.uconn.edu/
PLINK http://pngu.mgh.harvard.edu/purcell/plink/
UNPHASED http://homepages.lshtm.ac.uk/frankdudbridge/software/unphased/
TUNA http://www.stat.uchicago.edu/wen/tuna/
ProbABEL http://mga.bionet.nsc.ru/yurii/ABEL/
Other Variants: known or unknown. CNVs, nucleotide repeats, indels. 1000 Genomes (1KG). Other shared NGS data. Re-sequencing.
Other Factors: in common disease, gene X environment and multiple genes.
Other Factors: Gene X Environment (GXE), such as age, gender, diet, smoking, and drinking.
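A first look at a possible gene X environment effect is to compute the SNP's odds ratio separately within each exposure stratum and compare. The sketch below does that for made-up case/control counts; the counts, the stratum names and the helper `odds_ratio` are hypothetical, and a real analysis would normally fit a regression model with an interaction term and covariates.

```python
def odds_ratio(case_carrier, case_noncarrier, control_carrier, control_noncarrier):
    """Odds ratio for carrying the risk allele, from a 2x2 table of counts."""
    return (case_carrier * control_noncarrier) / (case_noncarrier * control_carrier)


# Made-up counts per stratum:
# (case carriers, case non-carriers, control carriers, control non-carriers)
counts = {
    "smokers": (60, 40, 30, 70),
    "non_smokers": (35, 65, 30, 70),
}

ors = {stratum: odds_ratio(*c) for stratum, c in counts.items()}
# A ratio far from 1 suggests the SNP effect differs by exposure.
interaction_ratio = ors["smokers"] / ors["non_smokers"]

for stratum, value in ors.items():
    print(f"{stratum}: OR = {value:.2f}")
print(f"ratio of odds ratios = {interaction_ratio:.2f}")
```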
Other Factors Gene X Gene (GXG) Epistasis SNP1 SNP2 Disease X +
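One reason epistasis matters after a GWAS is that a pair of SNPs can have a strong joint effect while each SNP on its own shows little or no marginal signal. The toy penetrance table below is an extreme, made-up case chosen to show this; the model, the carrier frequency and the function name are all assumptions.

```python
def marginal_risk(penetrance, freq_other, snp_index, carrier):
    """Disease risk given carrier status at one SNP, averaged over the other SNP.

    penetrance: dict mapping (snp1_carrier, snp2_carrier) to P(disease).
    freq_other: carrier frequency at the other SNP (SNPs assumed independent).
    snp_index:  0 to condition on SNP1, 1 to condition on SNP2.
    """
    risk = 0.0
    for other in (0, 1):
        p_other = freq_other if other == 1 else 1.0 - freq_other
        key = (carrier, other) if snp_index == 0 else (other, carrier)
        risk += p_other * penetrance[key]
    return risk


# Made-up penetrances: risk is high only when exactly one SNP is carried.
penetrance = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.1}

for snp_index in (0, 1):
    carriers = marginal_risk(penetrance, 0.5, snp_index, 1)
    noncarriers = marginal_risk(penetrance, 0.5, snp_index, 0)
    print(f"SNP{snp_index + 1}: carriers {carriers:.2f}, non-carriers {noncarriers:.2f}")
```

In this example a single-SNP test would see identical risk in carriers and non-carriers of either SNP, even though the joint genotype changes risk fourfold.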
Other GWAS Approaches: prior knowledge-based grouping, PWAS; gene-based grouping, PORCE.
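As a rough illustration of gene-based grouping, the sketch below collapses SNP-level p values to one value per gene by taking the smallest p value in the gene and applying a Bonferroni correction for the number of SNPs tested in it. This is only one simple option and is not the method of any tool named on the slide; it is conservative when SNPs are in LD. The SNP IDs, gene names and p values are hypothetical.

```python
from collections import defaultdict


def gene_based_min_p(snp_pvalues, snp_to_gene):
    """Collapse SNP p values to gene level (minimum p, Bonferroni-corrected).

    snp_pvalues: dict mapping SNP ID to p value.
    snp_to_gene: dict mapping SNP ID to gene name; unmapped SNPs are skipped.
    """
    by_gene = defaultdict(list)
    for snp, p in snp_pvalues.items():
        gene = snp_to_gene.get(snp)
        if gene is not None:
            by_gene[gene].append(p)
    # Multiply the best p value by the number of SNPs in the gene, capped at 1.
    return {gene: min(1.0, min(ps) * len(ps)) for gene, ps in by_gene.items()}


# Made-up inputs for illustration.
snp_pvalues = {"rs1": 0.001, "rs2": 0.2, "rs3": 0.04, "rs4": 0.5}
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B"}
print(gene_based_min_p(snp_pvalues, snp_to_gene))
```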
You Only Determined a “Locus”! Direction: DNA (genome), via transcription, to RNA (transcriptome), via translation, to protein (proteome), via catalysis, to metabolites (metabolome/lipidome), to clinical endpoint dysregulation. Genetic effect; environmental effect.
Which Gene? The closest? Gene desert? mRNA vs. other? Candidate variants: top SNP, genotyped SNPs, untyped SNPs, unidentified SNPs, other types of variants. Knowledge integration. Candidate gene.
How? Functional Pathway: How does the SNP change the gene function? Quantitative regulation. Regulatory region: TF binding, stability, miRNA targeting, methylation, etc. Gene/protein expression: eQTL, pQTL, mQTL. Qualitative regulation. Gene function: AA change, frame shift, splicing, CNV, gene deletion, etc. Enzyme activity, substrate binding, phosphorylation.
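An eQTL check asks whether expression of a candidate gene changes with the number of copies of the SNP allele. A minimal sketch, assuming additive coding (0/1/2 allele copies) and ordinary least squares with no covariates, is shown below; the function name and data are hypothetical, and real eQTL analyses adjust for covariates and multiple testing.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope and intercept of expression on allele dosage.

    dosages:    allele counts per individual (0, 1 or 2).
    expression: expression level of the candidate gene per individual.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx                  # expression change per allele copy
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for six individuals.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]
slope, intercept = eqtl_slope(dosages, expression)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
```

The same regression applies to pQTL or mQTL mapping by swapping expression for protein or metabolite levels.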
Integration of Different Omics: DNA (genome), via transcription, to RNA (transcriptome), via translation, to protein (proteome), via catalysis, to metabolites (metabolome/lipidome), to clinical endpoint dysregulation. Genetic effect; environmental effect; QTL mapping.
Why? Signaling Pathway: proof-of-concept. Cells: gene knock-down, overexpression, CRISPR/Cas9. Model species: KO, TG, mice, yeast, etc. Signaling pathway: dependency test. [Diagram: Gene 1, Gene 2, Gene 3, Gene 4, each with knockdown; omics; …. disease]
Tools
SNPnexus http://snp-nexus.org/
SNAP https://www.broadinstitute.org/mpg/snap/
GTExPortal http://www.gtexportal.org/home/