Complex trait analysis, development, and genomics


1 Complex trait analysis, development, and genomics
The Complex Trait Consortium and the Collaborative Cross Rob Williams, Gary Churchill, and members of the Complex Trait Consortium Background test slide.

2 Elias Zerhouni: The NIH Roadmap. Science 302:63 (2003)
Material included in handouts and on the CD also see
“Solving the puzzle of complex diseases, from obesity to cancer, will require a holistic understanding of the interplay between factors such as genetics, diet, infectious agents, environment, behavior, and social structures.”
Zerhouni quote from the Science article.

3 What is the CTC?
A group of ~150 mouse geneticists, most of whom have interests in pervasive diseases and differences in disease susceptibility.
General aim: Improve resources for complex trait analysis using mice.
Main catalysts and models: ENU mutagenesis programs; sequencing and SNP consortia.
Catalyze genotyping of strains. Simulation studies of crosses. Planning a collaborative cross. Improved use of resources.
Project 1. Catalyzed and assisted in SNP and SSLP analysis of 48 to 68 strains.
Project 2. Initiated simulation studies of novel types of crosses by Broman, Churchill, Gaile, Flint and Mott, and Williams.
Project 3. The development of a collaborative cross.
Project 4. Improved resource, data, methods, and idea sharing.

4 The short chronology of the CTC
Established Nov 2001, Edinburgh (n = 20)
1st CTC Conference, May 2002, Memphis (n = 80; hosted by R Williams)
CTC Collaborative Cross design workshop, Aug 2002, JHU (K Broman and R Reeves, hosts)
CTC Satellite meeting at IMGC, Nov (n = 40)
2nd CTC Conference, July 2003, Oxford (n = 80; hosted by R Mott and J Flint)
CTC strain selection workshop, Sept (M Daly, host)
3rd CTC Conference, July 2004, TJL

5 Lusis et al. 2002: Genetic Basis of Common Human Disease
Are mouse models appropriate? Yes and No.
“If you want to understand where the war on cancer has gone wrong, the mouse is a pretty good place to start.” (Clifton Leaf, Fortune, March 2004)

6 Mixing mouse genomes (reluctantly)
Current practice: Keep it simple: high power with low n

7 Genetic dissection
$V_P = V_G + V_E$
$V_P = V_G + V_E + 2\,\mathrm{Cov}(G,E) + V_{G \times E} + V_{tech}$
Aim 1: Convert genetic variation into a small set of responsible gene loci called QTLs.
Aim 2: Develop mechanistic insights into virtually any genetically modulated process or disease.
The full equation is $V_P = V_G + V_E + 2\,\mathrm{Cov}(G,E) + V_{G \times E} + V_{tech}$, where $V_{G \times E}$ is the gene-by-environment interaction variance and $V_{tech}$ is the technical (measurement) variance. Note: Only a subset of genes will be variable in a particular sample population. The fixed, invariant genes are initially genetically invisible. We are tracking down the genes associated with “natural” variation. These genes are critical in the clinic: they are the susceptibility genes. These normal variants may not have big effects. The size of the gene effect is NOT a measure of its importance. In fact, one can argue that the most important genes will not tolerate much sequence variation.
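The variance partition on this slide can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it simulates a purely additive phenotype (P = G + E) with invented variances, so the interaction and technical terms are zero and only Vp = Vg + Ve + 2 Cov(G,E) is tested. The function name and all numbers are hypothetical.

```python
import random
import statistics

def simulate_variance_components(n=5000, seed=1):
    """Simulate P = G + E and return (Vp, Vg, Ve, Cov(G,E)).

    Assumption: no gene-by-environment interaction and no technical
    variance, so those terms of the slide's equation are zero here.
    """
    rng = random.Random(seed)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]  # genetic values (hypothetical)
    e = [rng.gauss(0.0, 2.0) for _ in range(n)]  # environmental deviations (hypothetical)
    p = [gi + ei for gi, ei in zip(g, e)]        # phenotypes

    vg = statistics.pvariance(g)
    ve = statistics.pvariance(e)
    vp = statistics.pvariance(p)

    # Population covariance between G and E in this particular sample.
    mg, me = statistics.fmean(g), statistics.fmean(e)
    cov_ge = sum((gi - mg) * (ei - me) for gi, ei in zip(g, e)) / n
    return vp, vg, ve, cov_ge

vp, vg, ve, cov_ge = simulate_variance_components()
print(f"Vp = {vp:.3f}")
print(f"Vg + Ve + 2*Cov(G,E) = {vg + ve + 2 * cov_ge:.3f}")
print(f"Fraction of variance that is genetic (Vg/Vp) = {vg / vp:.3f}")
```

The two printed totals agree because the identity holds exactly for any sample. The covariance term is small here only because G and E were drawn independently.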

8 20 generations brother-sister matings
Standard recombinant inbred (RI) strains
[Diagram: chromosome pairs through the generations of the cross.]
Parental strains, fully inbred: C57BL/6J (B), female, and DBA/2J (D), male.
F1: isogenic. F2: heterogeneous.
20 generations of brother-sister matings give the BXD RI strain set (BXD1, BXD2, … BXD80): inbred, isogenic siblings.
Recombined chromosomes are needed for mapping.
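Why 20 generations? A rough sketch, assuming the textbook recursion for full-sib mating, H(t) = H(t-1)/2 + H(t-2)/4, where H is the expected fraction of loci still heterozygous. It starts from an F1 that is heterozygous at every locus where the parents differ. The function name is hypothetical, and real breeding programs will deviate from this expectation.

```python
def sib_mating_heterozygosity(generations):
    """Expected heterozygosity from the F2 onward under brother-sister mating.

    Returns a list whose first entry is the F2 and whose last entry is the
    generation reached after `generations` rounds of sib mating.
    Uses the standard recursion H(t) = H(t-1)/2 + H(t-2)/4.
    """
    h_prev2, h_prev1 = 1.0, 0.5  # F1 (all heterozygous), F2
    values = [h_prev1]
    for _ in range(generations):
        h = 0.5 * h_prev1 + 0.25 * h_prev2
        values.append(h)
        h_prev2, h_prev1 = h_prev1, h
    return values

h = sib_mating_heterozygosity(20)
for gen, value in enumerate(h, start=2):
    print(f"F{gen}: expected heterozygosity {value:.4f}")
```

Under this model the expected residual heterozygosity falls below 1% after 20 generations of sib mating, which is why the resulting lines can be treated as effectively inbred.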

9 Proposal for a Collaborative Cross

10 Integrative and cumulative analysis/synthesis
[Diagram: a 1K Reference Population at the center, linked to physiology, anatomy, pathology, development, pharmacokinetics, endocrine profile, immune response, environment, pathogens, metabolism, epigenetic modifications, proteomics, transcriptome, cancer susceptibility, and meta-analysis.]

11 Design criteria for a Collaborative Cross
Broad utility: a resource that combines diverse haplotypes and that harbors a broad spectrum of alleles
Freedom from genotyping. Lowering the entry barrier into this field
Unrestricted access to strains, tissues, data, and statistical analysis suites (on-line mapping)
Improved power and precision for trait mapping. Epistasis!
Powerful new approaches to analysis of complex systems. Pleiotropy
Analysis of gene-by-environment interactions
A systems biology resource
A new type of complex animal model to study common human diseases

12 A set of 420 RI lines

13 Mapping with sequence data in hand

14 B6 and D2 haplotype contrast map of Chr 1
Three pairs of homologous chromosomes from three strains of mice (strains 1, 2, 3) are shown with color-coding. There are green and red regions, each of which geneticists would call a haplotype. Genes in stretches of one color are inherited in blocks. There may be as many as 1000 genes in each of these blocks, and we refer to the common derivation of a chunk of a chromosome as linkage disequilibrium. This is just an intimidating way of saying that genes that are close together on a chromosome tend to stay together and do not assort randomly as expected from the law of independent assortment. Each part of the genome of the mouse has a unique strain distribution pattern (SDP). The SDP for the QTL in this figure is GRR on the left and hhR on the right. The difference between left and right is that the right side is the F1 hybrid of the left side.

15 Celera SNP DB

16 Coincidence analysis !

17 Integrative and cumulative analysis/synthesis
[Diagram: a 1K Reference Population at the center, linked to physiology, anatomy, pathology, development, pharmacokinetics, endocrine profile, immune response, environment, pathogens, metabolism, epigenetic modifications, proteomics, transcriptome, cancer susceptibility, and meta-analysis.]

18 www.webqtl.org QTL/QT gene
Wilt Chamberlain: 7 feet 1 inch. Willie Shoemaker: 4 feet 11 inches. A 1.44-fold difference.
Phenotypes: from highly complex, such as body size, to highly specific, such as a transcript expression difference.
This slide juxtaposes some extraordinarily complex differences between two humans with some very simple differences in allele types at a single gene locus. Neither Wilt Chamberlain nor Willie Shoemaker would appreciate being called a mutant. They are at the extremes of the normal range of variation and they illustrate the extraordinary scope of morphological variation. Biochemical and molecular variation is also significant. Mice have just as much variation at virtually all levels, but do not have such stylish attire. Clinicians live with and appreciate the tremendous variation among their patients and their families (more so than bench scientists). They are confronted daily by the complex mixture of environmental and genetic factors (and the gene-by-environment interactions) that produce differences in disease susceptibility, progression, and prognosis. Complex trait analysis is a suite of methods that now, for the first time, makes it possible to extract single genes and loci that contribute to the variation. Those contributions can be complex and contingent, and are usually individually quite small. The number of genes that contribute to the size difference between Wilt and Willie: 10 to 100 genes, each contributing 10% to 1%. I’ll illustrate how we use the simple genotypes on the right to help discover these genes.

19 Grin2b Cis QTL Trans QTL

20 Ret mRNA correlations in a small data set

21 Ret and Sh3d5

22 Ret GO analysis

23 The App neighborhood Handdrawn sketch of the App neighborhood
Many of the data types in the previous slide are hot-linked and it is easy to generate a small web of correlations between any transcript of interest and many other transcripts. In this case, we have used green lines between transcripts that have positive correlations, and red lines between transcripts that have negative correlations. Correlations have been multiplied by 100. The correlation of 0.96 between App and Hsp84-1 reads 96. These are Pearson product-moment correlations and they are sensitive to outliers. If you prefer, you can recompute Spearman rank-order correlations.
Where did Ndr4 (lower left) come from? It is not in the list in the previous slide. Actually it is. Nomenclature changes rapidly. If you click on R74996 in the previous slide (the active WebQTL version) you will see that it now has a new symbol and name.
What are all of the conventions in this correlation network sketch?
1. The official gene symbol = App.
2. Our estimate of the location of the gene in the Mouse Genome Sequencing Consortium version 3 build (MGSCv3): chromosome followed by the megabase position relative to the centromere. (Mice have only one chromosome arm, so this is an unambiguous coordinate.)
3. The pair of numbers: the top is the highest expression among the strain set. The lower number is the lowest expression of that transcript among the strain set.
4. Vertical number on the right side of each box: this is the probe set ID given by Affymetrix. We have truncated these probe set IDs so you will not see the usual “_at” suffix. A single gene may be represented by more than 10 probe sets. Thus this ID number is essential to identify the actual data source.
5. Lower right corner: a two-digit number followed by plus and minus signs. These numbers are the correlation value (absolute value) of the 100th best correlated transcript. The plus and minus signs indicate the mean polarity of the correlations.
6. The set of numbers that read etc. These are the approximate locations of additive effect QTLs detected by WebQTL that we will describe in other slides. Read this as: App has a suggestive QTL on Chr 2 at about 140 Mb, and the D allele inherited from DBA/2J confers a higher expression level at this marker. If there is no star symbol, then it is not even formally “suggestive” but does make an interesting-looking blip on the QTL radar screen.
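The note above that Pearson correlations are sensitive to outliers can be made concrete. This is a minimal sketch with invented numbers, not WebQTL data: eight weakly related points plus one extreme pair. The helper names `pearson`, `ranks`, and `spearman` are written for this illustration and are not WebQTL functions.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical expression values for two transcripts across nine strains.
# The last strain is an outlier on both transcripts.
transcript_a = [1, 2, 3, 4, 5, 6, 7, 8, 30]
transcript_b = [5, 3, 6, 2, 7, 1, 8, 4, 40]

print(f"Pearson:  {pearson(transcript_a, transcript_b):.2f}")
print(f"Spearman: {spearman(transcript_a, transcript_b):.2f}")
```

With these made-up values the single extreme strain pulls the Pearson coefficient above 0.9, while the rank-based Spearman coefficient stays below 0.5.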

24 Associational Networks
QTL networks add layer of shared causality

25 Integrative and cumulative analysis/synthesis
[Diagram: a 1K Reference Population at the center, linked to physiology, anatomy, pathology, development, pharmacokinetics, endocrine profile, immune response, environment, pathogens, metabolism, epigenetic modifications, proteomics, transcriptome, cancer susceptibility, and meta-analysis.]

26 Cost Components: 24–28 M over 7–8 yrs
Per diem for 8,000 to 10,000 cages (~1500 K/year)
Genotyping intermediate generations (~500 K/year)
Prospective tissue harvesting and cryopreservation (~500 K/year)
Molecular phenotyping of select tissue as proof-of-principle (500 K/year)
Bioinformatics, statistical modeling, administration, colony management (~500 K/year)
Cryopreservation of final lines at F25+ (~200 K)
Sequencing of parental strains (unfunded)
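As a quick arithmetic check, the listed components can be summed against the headline figure of 24–28 M over 7–8 years. The sketch assumes the approximate per-year figures are constant over the whole period and that cryopreservation of the final lines is a one-time cost; the variable names are invented for this illustration.

```python
# Annual cost components from the slide, in thousands of dollars per year.
annual_costs_k = {
    "per diem, 8,000 to 10,000 cages": 1500,
    "genotyping intermediate generations": 500,
    "tissue harvesting and cryopreservation": 500,
    "molecular phenotyping (proof of principle)": 500,
    "bioinformatics, modeling, admin, colony management": 500,
}
one_time_costs_k = 200  # cryopreservation of final lines (assumed one-time)

annual_total_k = sum(annual_costs_k.values())
totals_m = {
    years: (annual_total_k * years + one_time_costs_k) / 1000
    for years in (7, 8)
}

print(f"Annual total: {annual_total_k} K/year")
for years, total in totals_m.items():
    print(f"{years} years: about {total:.1f} M")
```

The totals come out at about 24.7 M for 7 years and 28.2 M for 8 years, roughly in line with the slide's range.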

27 NIH Portfolio

28 Collaborators
Ken Manly (UTHSC), David Threadgill (UNC Chapel Hill), Bob Hitzemann (OHSU), Gary Churchill (TJL), Fernando Pardo Manuel de Villena (UNC), Karl Broman (JHU), Dan Gaile (SUNY Buffalo), Kent Hunter (NCI), Jay Snoddy (ORNL), Jim Cheverud (Wash U), Tim Wiltshire (GNF)
Lu Lu, Elissa Chesler, David Airey, Siming Shou, Jing Gu, Yanhua Qu
Supported by: NIAAA-INIA Program, NIMH, NIDA, and the National Science Foundation (P20-MH 62009), NEI, a Human Brain Project and the William and Dorothy Dunavant Endowment.

29

30 Chromosome models
[Diagram: chromosome models for B, D, F1, and F2 animals, with marker positions indicated.]
Each marker samples a piece of chromosome (~50 Mb). The genotype/haplotype can be estimated for $50 per case. Example: BDHBH at D1Mit004.
Genotypes as a vector: BDHBHBDHBBBHHHDBHHHHBHHH
[Diagram: PCR primers F1 and R1 flanking CA repeats of different lengths: F1-CACACACA-R1 and F1-CACA-R1.]
Genotyping is simple now and relies on PCR reactions and microsatellites. Trinucleotide repeat expansion diseases arise from the difficulties that polymerase has with slippery, boring repeats. Microsatellites are much more polymorphic than the SNPs that everyone is talking about these days.
NEXT SLIDE: RI strains. What would happen if we cloned every F2 mouse? Then we would have terrific advantages. We could use the same mice to map different traits. We could study the development of a particular genotype of mouse. We could see how particular genotypes respond to particular treatments. We could explore the interaction between genes and environment with a fixed genetic resource. That is what RI strains are good for.
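The idea of treating genotypes as a vector can be sketched in code. The genotype string below is the one shown on the slide. Everything else is assumed for illustration: the additive coding (B = -1, H = 0, D = +1), the invented phenotype values, and the function names. The regression slope is a rough stand-in for a single-marker additive effect, not the mapping method used by the authors.

```python
GENOTYPES = "BDHBHBDHBBBHHHDBHHHHBHHH"  # one marker scored across 24 animals

# Assumed additive coding: B homozygote, heterozygote, D homozygote.
ADDITIVE_CODE = {"B": -1, "H": 0, "D": 1}

def encode(genotypes):
    """Turn a genotype string into a numeric vector using ADDITIVE_CODE."""
    return [ADDITIVE_CODE[g] for g in genotypes]

def regression_slope(x, y):
    """Least-squares slope of y on x (change in trait per D allele)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

codes = encode(GENOTYPES)

# Hypothetical phenotypes: baseline 10, additive effect 1.5 per D allele,
# plus a small deterministic wiggle standing in for noise.
phenotypes = [10 + 1.5 * c + 0.2 * ((i % 3) - 1) for i, c in enumerate(codes)]

print("Genotype vector:", codes)
print(f"Estimated additive effect: {regression_slope(codes, phenotypes):.2f}")
```

Once genotypes are numeric, the same vector can be correlated against any trait measured on the same animals. That reuse is the advantage of fixed genotypes described in the notes above.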

31 Mouse Strain Lineages
[Figure: mouse strain lineages, with groups labeled China, Japan, Castle’s Mice, Swiss, Wild, C57-related, Other Strains, and Wild Derived.]

32 Fernando’s Interval on Chr 11
[Figure: SNP density (# of SNPs/10 kb) across an interval on Chr 11 that includes Cct6b and Ap2b1. Strains shown: CAST/Ei, CASA/Rk, CALB/Rk, JF1/Ms, MOLC/Rk, CZECHI/Ei, PWK/Ph, SKIVE/Ei, PERA/Ei, PERC/Rk, ZALENDE/Ei, TIRANO/Ei, LEWES/Ei, RBA/Dn, DDK/Pas, BALB/cJ, C57BL/6J, DBA/2J, A/J, 129X1/SvJ. Fernando Pardo Manuel de Villena (March 2004).]

33 Visualizing QTLs
Cerebellar size QTLs: significant collective effects, modest individual effects. Clues to CNS development and even to abnormalities such as autism.
There is interest here because of the curious association of autism with a somewhat larger than normal cerebellum. This is a case where the QTLs that we are discovering are being passed to colleagues studying the genetics of autism in humans to see if we can track down some good candidate genes. It is a long shot, but… The Chr 8 QTL may be a gene characterized by Jim Morgan at St. Jude Children’s Research Hospital called Cerebellin 1 that has remarkably high expression in the cerebellum. Airey et al., J Neurosci 2001

34 Information flow
[Diagram: information flow between levels, each characterized by structure, activity, and function. Genetics: QTL, genes. Genomics: cis-regulation, mRNA expression, cell type specificity, transcripts. Proteomics: expression, structure, interaction, localization, pathways. Physiomics: cell-cell interactions, tissue dynamics. Systems biology and biological process: physiology, pharmacology, pathology, clinical outcome.]

35 Variation at different levels of organization
Levels: whole body-brain; functional regions; behaviors; cell populations; single cell properties; protein expression; transcript level.
We really want to get down to Levels 3 and 4, but we should work down from higher levels. Variation in cell number may actually be due to a global variation: whole body size differences or whole brain differences. We need the basic data before we can explore the more refined phenotypes. On the right side of this slide I illustrate one line of research supported by the NIMH in which we are mapping and characterizing the genes (or gene loci) that modulate the size of different cell populations in the brain. Here in red is the population we are interested in: the dentate granule cells of the hippocampus. This is the only cortical population of neurons that is generated at a low level even in adult humans. Virtually all other neuron populations have no renewal capacity. The question we are asking is what genes make this cell population and brain region so unusual. Can we discover the genes that modulate the proliferation of these neurons using a model such as the mouse?

