networking in biochemistry: building a mouse model of diabetes
Brian S. Yandell, UW-Madison
BMI Chair Talk, October 2008 © Brian S. Yandell


1 networking in biochemistry: building a mouse model of diabetes
Brian S. Yandell, UW-Madison, October 2008
www.stat.wisc.edu/~yandell
Real knowledge is to know the extent of one’s ignorance. Confucius (on a bench in Seattle)

2 outline
1. how did I get here?
2. what problems caught my eye?
3. what have I done, anyway?
4. how do I work in teams?
5. what challenges remain?

3 how did I get here?
Biostatistics, School of Public Health, UC-Berkeley 1981
  – RA/TA with EL Scott, J Neyman, CL Chiang, S Selvin
  – PhD 1981 non-parametric inference for hazard rates (Kjell A Doksum)
  – Annals of Statistics (1983) 50 citations to date (2 in 2008)
research evolution
  – early career focus on survival analysis
  – shift to non-parametric regression (1984-99)
  – shift to statistical genomics (1991--)
joined Biometry Program at UW-Madison in 1982
  – attracted by chance to blend statistics, computing and biology
  – valued balance of mathematical theory against practice
  – enjoyed developing methodology driven by collaboration

4 Yandell “Lab” Projects
Bayesian QTL Model Selection
  – R software development (Whipple Neely)
  – collaboration with UAB & Jackson Labs
  – data analysis of SCD1, ins10
meta-analysis for fine mapping Sorcs1
  – Chr 19 QTL introgressed as congenic lines
  – combined analysis across to increase power
QTL-based causal biochemical networks
  – algorithm development (Elias Chaibub)
  – data analysis with Christine Ferrara, Duke U

5 (collaboration diagram)
Rosetta: Schadt, Zhang, Zhu
UAB: Allison, Yi
stat/hort: Yandell
BMI: Kendziorski, Broman, Craven
Jax: Churchill, von Smith
Duke: Newgard, Ferrara
biochem: Attie, Keller, Zhu

6 Pareto diagram of QTL effects
(figure: QTL ranked 1 to 5 by effect size, running from major QTL on the linkage map through minor QTL to polygenes (modifiers))

7 problems of single QTL approach
wrong model: biased view
  – fool yourself: bad guess at locations, effects
  – detect ghost QTL between linked loci
  – miss epistasis completely
low power
bad science
  – use best tools for the job
  – maximize scarce research resources
  – leverage already big investment in experiment

8 advantages of multiple QTL approach
improve statistical power, precision
  – increase number of QTL detected
  – better estimates of loci: less bias, smaller intervals
improve inference of complex genetic architecture
  – patterns and individual elements of epistasis
  – appropriate estimates of means, variances, covariances
      asymptotically unbiased, efficient
  – assess relative contributions of different QTL
improve estimates of genotypic values
  – less bias (more accurate) and smaller variance (more precise)
  – mean squared error = MSE = (bias)^2 + variance
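The MSE identity on the slide above can be checked numerically. The following is a minimal Python sketch, not part of the talk; the list of estimates and the true value are made up purely for illustration.

```python
from statistics import fmean

def mse_decomposition(estimates, truth):
    """Return (mse, bias, variance) for repeated estimates of a known true value."""
    mean_est = fmean(estimates)
    bias = mean_est - truth
    # variance of the estimates around their own mean (divisor n, not n - 1)
    variance = fmean([(e - mean_est) ** 2 for e in estimates])
    # mean squared error around the true value
    mse = fmean([(e - truth) ** 2 for e in estimates])
    return mse, bias, variance

# hypothetical repeated estimates of a QTL effect whose true value is 2.0
estimates = [2.4, 2.6, 2.2, 2.8, 2.5]
mse, bias, variance = mse_decomposition(estimates, truth=2.0)
print(f"MSE={mse:.3f}  bias^2={bias ** 2:.3f}  variance={variance:.3f}")
```

The identity holds exactly only when the variance uses divisor n, which is why the sketch does not call `statistics.variance` (divisor n - 1).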

9 QTL mapping idea
observe phenotype y, marker genotypes m
genetic architecture γ identifies model
  – number and location of QTL
  – gene action and epistasis (pairwise interactions)
missing data: genotypes q at loci λ may be unknown
  – pr(q | m, λ, γ)
  – form of genotype model well known
phenotype y depends on genotype q
  – pr(y | q, µ, γ)
  – often linear model in q
  – possible interactions among QTL (epistasis)

11 how does phenotype y improve guess of QTL genotypes q?
what are probabilities for genotype q between markers?
recombinants AA:AB all 1:1 if ignore y
and if we use y?
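The two questions on this slide can be worked through on a toy case. The sketch below is an illustration rather than the talk's own calculation. It assumes a backcross (genotypes AA and AB), the Haldane map function with no interference, and a normal phenotype model whose genotype means and standard deviation are hypothetical.

```python
import math

def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def geno_prob_from_markers(left, right, d_left, d_right):
    """pr(q | flanking markers) in a backcross, ignoring the phenotype y."""
    r1, r2 = haldane_r(d_left), haldane_r(d_right)
    weights = {}
    for q in ("AA", "AB"):
        p_left = 1.0 - r1 if q == left else r1      # pr(left marker | q)
        p_right = 1.0 - r2 if q == right else r2    # pr(right marker | q)
        weights[q] = 0.5 * p_left * p_right         # 0.5 is the backcross prior
    total = sum(weights.values())
    return {q: w / total for q, w in weights.items()}

def normal_pdf(y, mean, sd):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def update_with_phenotype(prior, y, means, sd):
    """pr(q | markers, y): weight the marker-based probabilities by pr(y | q)."""
    weights = {q: prior[q] * normal_pdf(y, means[q], sd) for q in prior}
    total = sum(weights.values())
    return {q: w / total for q, w in weights.items()}

# recombinant flanking markers AA:AB, putative QTL midway in a 10 cM interval
prior = geno_prob_from_markers("AA", "AB", d_left=5.0, d_right=5.0)
# hypothetical genotype means and residual SD; this individual has y = 12.5
post = update_with_phenotype(prior, y=12.5, means={"AA": 10.0, "AB": 12.0}, sd=1.0)
print(prior, post)
```

With markers alone the recombinant is split evenly between AA and AB; once the phenotype is used, the genotype whose mean lies closer to y becomes much more probable.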

12 Gibbs sampler for loci indicators
QTL at pseudomarkers
loci indicators γ
  – γ = 1 if QTL present
  – γ = 0 if no QTL present
Gibbs sampler on loci indicators γ
  – relatively easy to incorporate epistasis
  – Yi et al. (2005, 2007 Genetics)
      (earlier work of Yi, Ina Hoeschele)

13 likelihood and posterior
likelihood relates “known” data (y,m,q) to unknown values of interest (µ,λ,γ)
  – pr(y,q | m,µ,λ,γ) = pr(y | q,µ,γ) pr(q | m,λ,γ)
  – mix over unknown genotypes (q)
posterior turns likelihood into a distribution
  – weight likelihood by priors
  – rescale to sum to 1.0
  – posterior = likelihood * prior / constant

14 Bayes theorem for QTLs
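The equation on this slide did not survive in the transcript. One way to write Bayes theorem in the notation of the neighbouring slides (phenotype y, markers m, genotypes q, effects µ, loci λ, architecture γ) is sketched below. The factorization of the prior into three terms is an assumption made here for illustration and may differ from what the slide showed.

```latex
\[
\mathrm{pr}(q,\mu,\lambda,\gamma \mid y,m)
  = \frac{\mathrm{pr}(y \mid q,\mu,\gamma)\,
          \mathrm{pr}(q \mid m,\lambda,\gamma)\,
          \mathrm{pr}(\mu \mid \gamma)\,
          \mathrm{pr}(\lambda \mid m,\gamma)\,
          \mathrm{pr}(\gamma)}
         {\mathrm{pr}(y \mid m)}
\]
```

The first two factors in the numerator are the likelihood, the remaining three are the priors, and the denominator is the constant that rescales the posterior to sum to 1.0.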

15 why use a Bayesian approach?
first, do both classical and Bayesian
  – always nice to have a separate validation
  – each approach has its strengths and weaknesses
classical approach works quite well
  – selects large effect QTL easily
  – directly builds on regression ideas for model selection
Bayesian approach is comprehensive
  – samples most probable genetic architectures
  – formalizes model selection within one framework
  – readily (!) extends to more complicated problems

16 Markov chain sampling
construct Markov chain around posterior
  – posterior is stable distribution of Markov chain
  – use MC samples to estimate posterior
sample QTL model unknowns from full conditionals
  – update unknowns one at a time or in batches

17 Bayes posterior vs. maximum likelihood
LOD: classical Log ODds
  – maximize likelihood over effects µ
  – R/qtl scanone/scantwo: method = “em”
LPD: Bayesian Log Posterior Density
  – average posterior over effects µ
  – R/qtl scanone/scantwo: method = “imp”

18 LOD & LPD: 1 QTL
n.ind = 100, 10 cM marker spacing

19 marginal LOD or LPD
what is contribution of a QTL adjusting for all others?
  – improvement in LPD due to QTL at locus
  – contribution due to main effects, epistasis, GxE?
how does adjusted LPD differ from unadjusted LPD?
  – raised by removing variance due to unlinked QTL
  – raised or lowered due to bias of linked QTL
  – analogous to Type III adjusted ANOVA tests
can ask these same questions using classical LOD
  – see Broman’s newer tools for multiple QTL inference

20 1-QTL LOD vs. marginal LPD
(figure; panel label: 1-QTL LOD)

21 hyper data: scanone

22 what is best estimate of QTL?
find most probable pattern
  – 1,4,6,15,6:15 has posterior of 3.4%
estimate locus across all nested patterns
  – Exact pattern seen ~100/3000 samples
  – Nested pattern seen ~2000/3000 samples
estimate 95% confidence interval using quantiles

    > best <- qb.best(qbHyper)
    > summary(best)$best
        chrom locus locus.LCL locus.UCL     n.qtl
    247     1  69.9  24.44875   95.7985 0.8026667
    245     4  29.5  14.20000   74.3000 0.8800000
    248     6  59.0  13.83333   66.7000 0.7096667
    246    15  19.5  13.10000   55.7000 0.8450000
    > plot(best)

Manichaikul et al. 2008 Genetics (in review)

23 what patterns are “near” the best?
size & shade ~ posterior
distance between patterns
  – sum of squared attenuation
  – match loci between patterns
  – squared attenuation = (1-2r)^2
  – sq.atten in scale of LOD & LPD
multidimensional scaling
  – MDS projects distance onto 2-D
  – think mileage between cities
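The squared attenuation on this slide depends only on the recombination fraction r between matched loci. A minimal Python sketch follows, with several assumptions that are not from the talk: the Haldane map function converts cM to r, loci are matched by chromosome, loci on a chromosome absent from the other pattern contribute nothing, and the second pattern is hypothetical. The slide does not spell out how the sum is turned into the distance passed to MDS, so the sketch stops at the sum.

```python
import math

def squared_attenuation(d_cm):
    """(1 - 2r)^2 for two loci d_cm apart, with r from the Haldane map function."""
    r = 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))
    return (1.0 - 2.0 * r) ** 2

def summed_squared_attenuation(pattern_a, pattern_b):
    """Sum squared attenuation over loci matched by chromosome.

    Each pattern maps chromosome -> QTL position in cM (one QTL per chromosome).
    """
    shared = set(pattern_a) & set(pattern_b)
    return sum(squared_attenuation(abs(pattern_a[c] - pattern_b[c])) for c in shared)

# positions taken from the qb.best summary on the previous slide
best = {1: 69.9, 4: 29.5, 6: 59.0, 15: 19.5}
# hypothetical competing pattern: same chromosomes 1, 4, 6 with shifted loci, no chr 15
other = {1: 64.9, 4: 29.5, 6: 49.0}

print(summed_squared_attenuation(best, best))
print(summed_squared_attenuation(best, other))
```

Under the Haldane assumption the squared attenuation is 1 for identical loci and decays toward 0 as matched loci move apart, so larger sums mean more similar patterns.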

24 Software for Bayesian QTLs
R/qtlbim: www.qtlbim.org
Properties
  – cross-compatible with R/qtl
  – new MCMC algorithms
      Gibbs with loci indicators; no reversible jump
  – epistasis, fixed & random covariates, GxE
  – extensive graphics
Software history
  – initially designed (Satagopan, Yandell 1996)
  – major revision and extension (Gaffney 2001)
  – R/bim to CRAN (Wu, Gaffney, Jin, Yandell 2003)
  – R/qtlbim to CRAN (Yi, Yandell et al. 2006)
Publications
  – Yi et al. (2005); Yandell et al. (2007); Yi et al. (2007ab)

25 glucose, insulin (figure courtesy AD Attie)
BTBR mouse is insulin resistant
B6 is not
make both obese…

26 studying diabetes in an F2
mouse model: segregating panel from inbred lines
  – B6.ob x BTBR.ob → F1 → F2
  – selected mice with ob/ob alleles at leptin gene (Chr 6)
  – sacrificed at 14 weeks, tissues preserved
physiological study (Stoehr et al. 2000 Diabetes)
  – mapped body weight, insulin, glucose at various ages
gene expression studies
  – RT-PCR for a few mRNA on 108 F2 mice liver tissues
      (Lan et al. 2003 Diabetes; Lan et al. 2003 Genetics)
  – Affymetrix microarrays on 60 F2 mice liver tissues
      U47 A & B chips, RMA normalization
      design: selective phenotyping (Jin et al. 2004 Genetics)

27 log10(ins10), Chr 19
(figure legend) black=all, blue=male, red=female, purple=sex-adjusted
solid=512 mice, dashed=311 mice

28 Sorcs1 study in mice: 11 sub-congenic strains
marker regression
meta-analysis
within-strain permutations
Nature Genetics 2006 Clee, Yandell et al.

29 we were lucky!
BTBR background needed to see SORCS1
epistatic interaction of chr 19 and 8 … discovered much later

30 Sorcs1 gene & SNPs

31 Sorcs1 study in humans
Diabetes 2007 Goodarzi et al.

32 2M observations
30,000 traits
60 mice

33 experimental context
B6 x BTBR obese mouse cross
  – model for diabetes and obesity
  – 500+ mice from intercross (F2)
  – collaboration with Rosetta/Merck
genotypes
  – 5K SNP Affymetrix mouse chip
  – care in curating genotypes! (map version, errors, …)
phenotypes
  – clinical phenotypes (>100 / mouse)
  – gene expression traits (>40,000 / mouse / 4-6 tissues)
  – other molecular traits (proteomic, miRNA, metabolomic)

34 QTL mapping thousands of gene expression traits
PLoS Genetics 2006 Lan, Chen et al.

35 (figure legend) red=trans, blue=cis QTLs on chr n
gray scale for variance

36 Chaibub Neto et al. (2008) Genetics

37 causal phenotype networks
goal: mimic biochemical pathways with directed (causal) networks
problem: association (correlation) does not imply causation
resolution: bring in driving causes
  – genotypes (at conception)
  – processes earlier in time

38 Causal vs Reactive? (Elias Chaibub, Brian Yandell)
y1 causes y2: y1 ~ g1 and y2 ~ g2*y1

39 Ferrara et al.

40 inferring phenotype networks
build in prior pathway knowledge (PPI, TF)
  – co-map correlated traits
      Banerjee, Yandell, Yi (2008 Genetics)
  – pathways induce correlation structure
ramp up to 100s, 1000s of phenotypes?
  – danger of mixing unrelated pathways
  – want closely linked upstream (causal) drivers

41 (collaboration diagram)
Rosetta: Schadt, Zhang, Zhu
UAB: Allison, Yi
stat/hort: Yandell
BMI: Kendziorski, Broman, Craven
Jax: Churchill, von Smith
Duke: Newgard, Ferrara
biochem: Attie, Keller, Zhu

42 why build Web eQTL tools?
common storage/maintenance of data
  – one well-curated copy
  – central repository
  – reduce errors, ensure analysis on same data
automate commonly used methods
  – biologist gets immediate feedback
  – statistician can focus on new methods
  – codify standard choices

43 how does one build tools?
no one solution for all situations
use existing tools wherever possible
  – new tools take time and care to build!
  – downloaded databases must be updated regularly
human component is key
  – need informatics expertise
  – need continual dialog with biologists
build bridges (interfaces) between tools
  – Web interface uses PHP
  – commands are created dynamically for R
continually rethink & redesign organization

44 steps in using Web tools
user enters data on Web page
PHP tool interprets user data
PHP builds R script
R run on script
  – creates plots, summaries, warnings
PHP grabs results & displays on page
user examines, saves
user modifies data and reruns
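The "builds R script" step can be sketched as a small template function. The talk's tool is written in PHP; the version below uses Python only for illustration, and the function name, file names and validation rule are hypothetical. The generated script relies on standard R/qtl calls (`read.cross`, `calc.genoprob`, `sim.geno`, `find.pheno`, `scanone`); the script is only built as text here, not executed.

```python
import re

ALLOWED_METHODS = {"em", "imp"}  # the two scanone methods named in the talk

def build_r_script(cross_csv, trait, method="em", plot_file="lod.png"):
    """Return the text of an R script that runs a one-QTL genome scan for one trait."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    # Accept only simple names so that user input cannot inject R code.
    for value in (cross_csv, trait, plot_file):
        if not re.fullmatch(r"[A-Za-z0-9_.\-]+", value):
            raise ValueError(f"unsafe value: {value!r}")
    # "imp" needs imputed genotypes; "em" needs genotype probabilities.
    if method == "imp":
        prep = "cross <- sim.geno(cross, step = 2, n.draws = 16)"
    else:
        prep = "cross <- calc.genoprob(cross, step = 2)"
    lines = [
        "library(qtl)",
        f'cross <- read.cross("csv", file = "{cross_csv}")',
        prep,
        f'col <- find.pheno(cross, "{trait}")',
        f'out <- scanone(cross, pheno.col = col, method = "{method}")',
        f'png("{plot_file}")',
        "plot(out)",
        "dev.off()",
        "print(summary(out))",
    ]
    return "\n".join(lines) + "\n"

script = build_r_script("cross.csv", "ins10", method="imp")
print(script)
```

A web front end would write this text to a file, run it with Rscript, and then pick up the plot and printed summary for display. Validating every user-supplied value before it is pasted into the script matters because the commands are created dynamically.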

45 raw data or fancy results?
raw data flexible but slow
  – LOD profiles for 100 (1000) traits?
fancy results from sophisticated analysis
  – IM, MIM, BIM, MOM analysis
  – too complicated to put in biologists’ hands?
      methods are unrefined, state-of-art, research tools
      use of methods involved many subtle choices
  – batch computation over weeks
compute once, save, display many times

47 LOD profiles: many traits

48 1.5 LOD interval
approximate 95% CI
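A 1.5 LOD interval can be read off a LOD profile by walking outward from the peak until the curve drops more than 1.5 below its maximum. The Python sketch below is a simplified illustration on a made-up profile for a single chromosome. It reports grid positions only and does not interpolate or widen to the next flanking position, so a production tool may return slightly wider limits.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (lower, peak, upper) positions of the LOD support interval.

    positions and lods are parallel lists for one chromosome, ordered by position.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop
    lo = peak
    while lo > 0 and lods[lo - 1] >= threshold:
        lo -= 1  # extend left while the profile stays within `drop` of the peak
    hi = peak
    while hi < len(lods) - 1 and lods[hi + 1] >= threshold:
        hi += 1  # extend right in the same way
    return positions[lo], positions[peak], positions[hi]

# hypothetical LOD profile on a 10 cM grid
positions = [0, 10, 20, 30, 40, 50, 60]
lods = [0.5, 1.2, 3.0, 4.6, 3.4, 1.0, 0.3]
interval = lod_support_interval(positions, lods)
print(interval)
```

On a grid this coarse the interval is crude; a denser pseudomarker grid gives limits closer to where the profile actually crosses the threshold.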

49 (figure legend) red=trans, blue=cis QTLs on chr n
gray scale for variance

50 what challenges remain?
from eQTL to candidate pathways
  – statistical issues
      networks, correlated traits
      better model selection approaches
  – biological evidence (Weiss 2007 Genetics)
      Mouse to human to mouse
      KOs, etc.
upgrade informatics environment
  – harden local code (R, Python, PHP, …)
  – build on other high throughput systems
      Swertz, Jansen (2007); Stein (2008) Nat Rev Gen

51 many thanks
Karl Broman Jackson Labs Gary Churchill Hao Wu Randy von Smith U AL Birmingham David Allison Nengjun Yi Tapan Mehta Samprit Banerjee Ram Venkataraman Daniel Shriner Michael Newton Hyuna Yang Daniel Sorensen Daniel Gianola Liang Li my students Jaya Satagopan Fei Zou Patrick Gaffney Chunfang Jin Elias Chaibub Neto W Whipple Neely Jee Young Moon USDA Hatch, NIH/NIDDK (Attie), NIH/R01 (Yi, Broman) Tom Osborn David Butruille Marcio Ferrera Josh Udahl Pablo Quijada Alan Attie Jonathan Stoehr Hong Lan Susie Clee Jessica Byers Mark Keller

