Why you should know about experimental crosses. To save you from embarrassment.


1 Why you should know about experimental crosses

2 To save you from embarrassment

3 Why you should know about experimental crosses To save you from embarrassment To help you understand and analyse human genetic data

4 Why you should know about experimental crosses To save you from embarrassment To help you understand and analyse human genetic data It’s interesting

5 Experimental crosses

6 Inbred strain crosses Recombinant inbreds Alternatives

7 Inbred Strain Cross

8 Backcross

9 F2 cross F2 F1 F0 Generation

10 Data conventions AA = A BB = B AB = H Missing data = -

11 Data conventions Genotype file

12 Data conventions Genotype file Phenotype file

13 Data conventions Genotype file Phenotype file Map file
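The genotype codes listed under the data conventions (AA = A, BB = B, AB = H, missing = -) can be sketched as a small encoder. This is only an illustration: the names `encode_genotype` and `encode_row` are hypothetical, and treating "BA" as a heterozygote is an assumption not stated on the slides.

```python
# Hypothetical helper: map diploid genotype calls to the one-letter
# convention AA = A, BB = B, AB = H, missing = -.
# Assumption: "BA" is treated the same as "AB" (heterozygote).
CODES = {"AA": "A", "BB": "B", "AB": "H", "BA": "H"}


def encode_genotype(call):
    """Return the one-letter code for a call, or '-' if missing or unrecognised."""
    if call is None:
        return "-"
    return CODES.get(call.strip().upper(), "-")


def encode_row(calls):
    """Encode one individual's calls across markers as a compact string."""
    return "".join(encode_genotype(c) for c in calls)


if __name__ == "__main__":
    print(encode_row(["AA", "AB", "BB", None, "BA"]))
```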

14 Use the latest mouse build and convert physical to genetic distance: 1 Mb = 1.6 cM. Use our genetic map: http://gscan.well.ox.ac.uk/

15 Analysis If you can’t see the effect it probably isn’t there

16 [Figure: phenotype by genotype class (AA, AB, BB)]

17 Red = Hom Blue = Het Backcross genotypes

18 Statistical analysis

19 Linear models Also known as ANOVA ANCOVA regression multiple regression linear regression

20

21

22

23

24 QTL snp

25 0 +1

26 QTL snp 0 +1

27 QTL snp 0 +1

28 QTL snp 0 +1

29 qq qQ QQ

30 QTL snp 0 +1 0 1 0

31 QTL snp 0 +1 0 1 0

32 qq qQ QQ 80 90 100 90100

33 qq qQ QQ 80 90 100 90100 9010

34 Hypothesis testing H0: H1:

35 H0: y ~ 1 H1: y ~ 1 + x

36 Hypothesis testing H0: y ~ 1 H1: y ~ 1 + x H1 vs H0: Does x explain a significant amount of the variation?

37 Hypothesis testing H0: y ~ 1 H1: y ~ 1 + x H1 vs H0: Does x explain a significant amount of the variation? Likelihood ratio: LOD score.

38 Hypothesis testing H0: y ~ 1 H1: y ~ 1 + x H1 vs H0: Does x explain a significant amount of the variation? Likelihood ratio: LOD score; chi-square test, p-value, logP.

39 Hypothesis testing H0: y ~ 1 H1: y ~ 1 + x H1 vs H0: Does x explain a significant amount of the variation? Likelihood ratio: LOD score; chi-square test, p-value, logP. SS explained / SS unexplained: F-test (or t-test), linear models only.

40 Hypothesis testing H0: y ~ 1 + x H1: y ~ 1 + x + x2 H1 vs H0: Does x2 explain a significant extra amount of the variation?
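The comparison of H0: y ~ 1 against H1: y ~ 1 + x can be sketched with nothing more than residual sums of squares. The function names and the toy data below are hypothetical. The LOD formula used, (n/2) log10(RSS0/RSS1), assumes a linear model with normally distributed residuals; converting F or LOD to a p-value is left out.

```python
import math


def rss_null(y):
    """Residual sum of squares for H0: y ~ 1 (intercept only)."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y)


def rss_simple(y, x):
    """Residual sum of squares for H1: y ~ 1 + x (least-squares line)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def compare(y, x):
    """Return (F statistic, LOD score) for H1 vs H0."""
    n = len(y)
    rss0, rss1 = rss_null(y), rss_simple(y, x)
    # SS explained / SS unexplained, each scaled by its degrees of freedom (1 and n - 2).
    f_stat = (rss0 - rss1) / (rss1 / (n - 2))
    # Log10 likelihood ratio under normal residuals (assumption).
    lod = (n / 2) * math.log10(rss0 / rss1)
    return f_stat, lod


if __name__ == "__main__":
    x = [-1, -1, 0, 0, 1, 1]            # made-up genotype codes
    y = [80, 82, 89, 91, 99, 101]       # made-up phenotypes
    print(compare(y, x))
```

Testing H0: y ~ 1 + x against H1: y ~ 1 + x + x2 follows the same pattern, with the residual degrees of freedom reduced by one more parameter.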

41 PRACTICAL: hypothesis test for identifying QTLs
H0: phenotype ~ 1
H1: phenotype ~ a
H2: phenotype ~ a + d
Test: H1 vs H0, H2 vs H1, H2 vs H0
To start:
1. Copy the folder faculty\valdar\AnimalModelsPractical to your own directory.
2. Start R.
3. File -> Change Dir… and change directory to your AnimalModelsPractical directory.
4. Open Firefox, then File -> Open File, and open “f2cross_and_thresholds.R” in the AnimalModelsPractical directory.
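The three tests in this practical (H1 vs H0, H2 vs H1, H2 vs H0) can be sketched for a single F2 marker as follows. This is not the practical's R script. It assumes the additive term a is coded -1, 0, +1 for genotypes A, H, B and the dominance term d is 1 for heterozygotes only, so that phenotype ~ a + d fits one mean per genotype class. All names and data are hypothetical, and individuals with missing genotypes are assumed to have been removed beforehand.

```python
ADDITIVE = {"A": -1.0, "H": 0.0, "B": 1.0}  # assumed coding of the additive term a


def rss_h0(y):
    """H0: phenotype ~ 1."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y)


def rss_h1(y, geno):
    """H1: phenotype ~ a (straight line through the additive code)."""
    a = [ADDITIVE[g] for g in geno]
    n = len(y)
    ma, my = sum(a) / n, sum(y) / n
    slope = sum((u - ma) * (v - my) for u, v in zip(a, y)) / sum((u - ma) ** 2 for u in a)
    return sum((v - my - slope * (u - ma)) ** 2 for u, v in zip(a, y))


def rss_h2(y, geno):
    """H2: phenotype ~ a + d, which is equivalent to a separate mean per genotype class."""
    total = 0.0
    for g in set(geno):
        group = [v for v, gg in zip(y, geno) if gg == g]
        m = sum(group) / len(group)
        total += sum((v - m) ** 2 for v in group)
    return total


def f_stat(rss_reduced, rss_full, df_extra, df_resid):
    """F statistic for a nested pair of linear models."""
    return ((rss_reduced - rss_full) / df_extra) / (rss_full / df_resid)


def marker_tests(y, geno):
    n = len(y)
    r0, r1, r2 = rss_h0(y), rss_h1(y, geno), rss_h2(y, geno)
    return {
        "H1 vs H0": f_stat(r0, r1, 1, n - 2),
        "H2 vs H1": f_stat(r1, r2, 1, n - 3),
        "H2 vs H0": f_stat(r0, r2, 2, n - 3),
    }


if __name__ == "__main__":
    geno = ["A", "A", "A", "H", "H", "H", "B", "B", "B"]   # made-up genotypes
    y = [80, 79, 81, 95, 96, 94, 100, 101, 99]             # made-up phenotypes
    print(marker_tests(y, geno))
```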

42 PRACTICAL: Chromosome scan of F2 cross

43 Two problems in QTL analysis Missing genotype problem Model selection problem

44 Missing genotype problem

45 Solutions to the missing genotype problem Maximum likelihood interval mapping Haley-Knott regression Multiple imputation

46 Interval mapping

47 qq genotype10 qQ genotype20

48 Interval mapping qq genotype10 qQ genotype20

49 Interval mapping Which is the true situation? qq genotype10 qQ genotype20 qq qQ

50 Interval mapping. Which is the true situation? [Figure labels: qq genotype10, qQ genotype20; qq or qQ, 0.5] ML interval mapping: “mixture” model. Fit both situations and then weight them.

51 Interval mapping. Which is the true situation? [Figure labels: qq genotype10, qQ genotype20; qq or qQ, 0.5] ML interval mapping: “mixture” model. Fit both situations and then weight them. Haley-Knott regression: fit the “average” situation (which is technically false, but quicker).
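The difference between the mixture model and the "average" model can be sketched for a backcross, where the unobserved QTL genotype is either qq (coded 0) or qQ (coded 1). The sketch assumes no crossover interference and the Haldane map function. The function names and the phenotype means of 10 and 20 are illustrative assumptions.

```python
import math


def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def prob_het(m1, m2, r1, r2):
    """P(QTL genotype is qQ | flanking marker genotypes m1, m2), each coded 0 or 1.

    r1 and r2 are the recombination fractions marker1-QTL and QTL-marker2.
    """
    def step(a, b, r):
        return 1.0 - r if a == b else r

    weights = {q: step(m1, q, r1) * step(q, m2, r2) for q in (0, 1)}
    return weights[1] / (weights[0] + weights[1])


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(y, p_het, mu_hom, mu_het, sigma):
    """ML interval mapping: fit both situations and weight them."""
    return (1.0 - p_het) * normal_pdf(y, mu_hom, sigma) + p_het * normal_pdf(y, mu_het, sigma)


def haley_knott_mean(p_het, mu_hom, mu_het):
    """Haley-Knott regression: a single normal centred on the weighted average mean."""
    return (1.0 - p_het) * mu_hom + p_het * mu_het


if __name__ == "__main__":
    r = haldane_r(5.0)                       # QTL assumed 5 cM from each marker
    p = prob_het(0, 1, r, r)                 # markers disagree, QTL midway
    print(p, haley_knott_mean(p, 10.0, 20.0))
    print(mixture_density(12.0, p, 10.0, 20.0, 2.0))
```

When the two flanking markers disagree and the putative QTL sits midway between them, both genotypes get weight 0.5, which is the hardest case for the "average" approximation.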

52 Imputation

53

54 Key references Maximum likelihood methods Linear regression Imputation

55 r/qtl http://www.rqtl.org/ Broman, Sen & Churchill

56 Is interval mapping necessary?

57 QTL logP score

58 QTL

59 logP score

60 Significance Thresholds

61 Lander, E. & Kruglyak, L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature Genetics 11, 241-247, 1995.

62 Thresholds: permutation test

SUBJECT.NAME  Sex  Phenotype  m1  m2  m3  m4
F2$798        F    -0.738004  -1   1   1  -1
F2$364        F     0.413330   0   0   0   0
F2$367        F     1.417480  -1   1   1  -1
F2$287        F     0.811208   1  -1  -1   1
F2$205        M     1.198270   0   0   0   0

63 Thresholds: permutation test

Original data:

SUBJECT.NAME  Sex  Phenotype  m1  m2  m3  m4
F2$798        F    -0.738004  -1   1   1  -1
F2$364        F     0.413330   0   0   0   0
F2$367        F     1.417480  -1   1   1  -1
F2$287        F     0.811208   1  -1  -1   1
F2$205        M     1.198270   0   0   0   0

After shuffling the Phenotype column:

SUBJECT.NAME  Sex  Phenotype  m1  m2  m3  m4
F2$798        F     0.413330  -1   1   1  -1
F2$364        F     1.417480   0   0   0   0
F2$367        F     1.198270  -1   1   1  -1
F2$287        F    -0.738004   1  -1  -1   1
F2$205        M     0.811208   0   0   0   0
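The shuffle shown in the tables can be turned into a genome-wide threshold roughly as follows. The phenotype column is permuted, every marker is rescanned, and the largest statistic is recorded. The threshold is an upper quantile of those maxima. This is a minimal sketch with hypothetical function names and simulated data, using a single-marker F statistic as the scan statistic. The number of permutations is kept small for speed.

```python
import math
import random


def f_stat(y, x):
    """F statistic for y ~ 1 + x against y ~ 1 (0.0 if x does not vary)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0.0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = sum((b - my) ** 2 for b in y)
    rss1 = rss0 - sxy * sxy / sxx
    return (rss0 - rss1) / (rss1 / (n - 2))


def max_stat(y, markers):
    """Largest statistic over all marker columns (one genome scan)."""
    return max(f_stat(y, column) for column in markers)


def permutation_threshold(y, markers, n_perm=200, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the scan maximum under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(y)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # genotype rows stay fixed, phenotypes move
        null_max.append(max_stat(shuffled, markers))
    null_max.sort()
    index = min(n_perm - 1, int(math.ceil((1.0 - alpha) * n_perm)) - 1)
    return null_max[index]


def simulate_cross(n=60, n_markers=4, seed=0):
    """Made-up data: unlinked markers coded -1/0/1 and a phenotype with no QTL."""
    rng = random.Random(seed)
    markers = [[rng.choice([-1, 0, 0, 1]) for _ in range(n)] for _ in range(n_markers)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return y, markers


if __name__ == "__main__":
    y, markers = simulate_cross()
    print("observed max:", max_stat(y, markers))
    print("5% threshold:", permutation_threshold(y, markers))
```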

64 Permutation tests to establish thresholds. Churchill, G.A. & Doerge, R.W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971, 1994. “An empirical method is described, based on the concept of a permutation test, for estimating threshold values that are tailored to the experimental data at hand.”

65 PRACTICAL: significance thresholds by permutation

66 Two problems in QTL analysis Missing genotype problem Model selection problem

67 The model problem How QTL genotypes combine to produce the phenotype

68 The model problem. Linked QTL corrupt the position estimates. Unlinked QTL decrease the power of QTL detection.

69 Composite interval mapping. Zeng, Z.-B. Precision mapping of quantitative trait loci. Genetics 136, 1457-1468, 1994. http://statgen.ncsu.edu/qtlcart/cartographer.html

70 Composite interval mapping [Figure: markers M1 and M2 with QTL Q]

71 [Figure: markers M-1, M1, M2 and M3 with QTL Q]

72 Model selection. Inclusion of covariates: gender, environment and other things too many to enumerate here

73 Inclusion of covariates H0: phenotype ~ covariates H1: phenotype ~ covariates + LocusX

74 Inclusion of covariates H0: phenotype ~ covariates H1: phenotype ~ covariates + LocusX H1 vs H0: how much extra does LocusX explain?

75 Inclusion of covariates H0: phenotype ~ covariates H1: phenotype ~ covariates + LocusX H0: startle ~ Sex + BodyWeight + TestChamber + Age H1: startle ~ Sex + BodyWeight + TestChamber + Age + Locus432 H1 vs H0: how much extra does LocusX explain?
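The question "how much extra does LocusX explain?" can be sketched as a comparison of two least-squares fits, one with the covariates only and one with the locus added. The sketch below uses a small pure-Python solver so that it runs without extra packages. All names and data are hypothetical, and only two covariates (sex and body weight) are used to keep the example short.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(y, columns):
    """Residual sum of squares for y ~ 1 + columns (intercept added here)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * v for r, v in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((v - sum(b * e for b, e in zip(beta, r))) ** 2 for r, v in zip(rows, y))


def f_for_locus(y, covariates, locus):
    """F statistic for H1: y ~ covariates + locus against H0: y ~ covariates."""
    n = len(y)
    rss0 = rss(y, covariates)
    rss1 = rss(y, covariates + [locus])
    df_resid = n - (len(covariates) + 2)   # intercept + covariates + locus
    return (rss0 - rss1) / (rss1 / df_resid)


if __name__ == "__main__":
    # Made-up values for eight animals.
    sex = [0, 1, 0, 1, 0, 1, 0, 1]
    weight = [20, 25, 21, 26, 19, 24, 22, 27]
    locus = [-1, -1, 0, 0, 1, 1, 0, 1]
    startle = [130.5, 144.7, 142.2, 156.6, 148.1, 163.3, 143.8, 168.8]
    print(f_for_locus(startle, [sex, weight], locus))
```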

76 PRACTICAL: Inclusion of gender effects in a genome scan. To start: in Firefox, choose File -> Open File and open “gxe.R”.

77 Experimental crosses Inbred strain crosses Recombinant inbreds Alternatives

78 Recombinant Inbreds: F0 parental generation, F1 generation, F2 generation, then interbreeding for approximately 20 generations to produce recombinant inbreds

79 RI strain genotypes http://www.well.ox.ac.uk/mouse/INBREDS SNP SELECTOR http://gscan.well.ox.ac.uk/gs/strains.cgi

80

81 RI strain phenotypes

82 RI analysis

83 Power of RIs Effect size of a QTL that can be detected with RI strain sets, at P= 0.00013

84 Experimental crosses Inbred strain crosses Recombinant inbreds Alternatives

85 Why do we need alternatives? Classical strategies don’t find genes because of poor resolution

86

87

88

89 One locus may contain many QTL

90 New approaches Chromosome substitution strains

91 New approaches Chromosome substitution strains Collaborative cross

92 New approaches Chromosome substitution strains Collaborative cross In silico mapping

93 Resources
R: http://www.r-project.org/
R help: http://news.gmane.org/gmane.comp.lang.r.general
R/qtl: http://www.rqtl.org
Composite interval mapping (QTL Cartographer): http://statgen.ncsu.edu/qtlcart/index.php
Markers: http://www.well.ox.ac.uk/mouse/inbreds
Gscan (HAPPY and associated analyses): http://gscan.well.ox.ac.uk
General reading:
Lynch & Walsh (1998) Genetics and analysis of quantitative traits (Sinauer).
Dalgaard (2002) Introductory statistics with R (Springer-Verlag).

94 END SECTION

95

96 New approaches Advanced intercross lines Genetically heterogeneous stocks

97 F2 Intercross x Avg. Distance Between Recombinations F1 F2 F2 intercross ~30 cM

98 Advanced intercross lines (AILs) F0 F1 F2 F3 F4

99 Chromosome scan for F12 [Figure: typical chromosome; x-axis: position along whole chromosome (Mb), 0 to 100 cM; y-axis: goodness of fit (logP); significance threshold and QTL marked]

100

101 PRACTICAL: AILs

102 Genetically Heterogeneous Mice

103 F2 Intercross x Avg. Distance Between Recombinations F1 F2 F2 intercross ~30 cM

104 Pseudo-random mating for 50 generations Heterogeneous Stock F2 Intercross x Avg. Distance Between Recombinations: F1 F2 HS ~2 cM F2 intercross ~30 cM

105 Pseudo-random mating for 50 generations Heterogeneous Stock F2 Intercross x Avg. Distance Between Recombinations: F1 F2 HS ~2 cM F2 intercross ~30 cM

106 Genome scans with single marker association

107 High resolution mapping

108 Relation Between Marker and Genetic Effect Observable effect QTL Marker 1

109 Relation Between Marker and Genetic Effect Observable effect QTL Marker 2 Marker 1

110 Relation Between Marker and Genetic Effect No effect observable Observable effect QTL Marker 2 Marker 1

111 Hidden Chromosome Structure Observed chromosome structure Multipoint method (HAPPY) calculates the probability that an allele descends from a founder using multiple markers

112 M1 Q M2 m1 q m2 M1 ? m2 recombination

113 Haplotype reconstruction using HAPPY m183 m184 m185 allele A typical chromosome from an HS mouse

114 Haplotype reconstruction using HAPPY A typical chromosome from an HS mouse actual path another plausible path

115 Haplotype reconstruction using HAPPY A typical chromosome from an HS mouse actual path another plausible path

116 Haplotype reconstruction using HAPPY marker interval A typical chromosome from an HS mouse average over all paths

117 Haplotype reconstruction using HAPPY chromosome genotypes haplotype proportions predicted by HAPPY

118 HAPPY model for additive effects

119 Phenotype y is modeled as is effect of strain s

120 HAPPY effects models Additive model Additive model with covariate effects Full (ie, additive & dominance) model with covariate effects

121 Genome scans with HAPPY

122 Many peaks mean red cell volume

123 Ghost peaks

124 family effects, cage effects, odd breeding …complex pattern of linkage disequilibrium

125

126 How to select peaks: a simulated example

127 Simulate 7 x 5% QTLs (i.e., 35% genetic effect) + 20% shared environment effect + 45% noise = 100% variance
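A variance split of this kind can be sketched as below. This is not the simulation used for the slides. It assumes seven unlinked biallelic loci with genotype codes of -1 or +1 in equal proportion, a shared environment effect drawn once per family, and independent normal noise. The function names and the family sizes are hypothetical.

```python
import random


def simulate_phenotypes(n_families=1000, family_size=10, seed=0):
    """Simulate phenotypes with 7 x 5% QTL + 20% shared environment + 45% noise."""
    rng = random.Random(seed)
    qtl_effect = 0.05 ** 0.5   # a +/-1 code has variance 1, so each QTL adds 0.05
    env_sd = 0.20 ** 0.5       # shared by all members of a family
    noise_sd = 0.45 ** 0.5     # independent per individual
    phenotypes = []
    for _ in range(n_families):
        shared_env = rng.gauss(0.0, env_sd)
        for _ in range(family_size):
            genetic = sum(qtl_effect * rng.choice((-1.0, 1.0)) for _ in range(7))
            phenotypes.append(genetic + shared_env + rng.gauss(0.0, noise_sd))
    return phenotypes


def variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    print(variance(simulate_phenotypes()))   # should be close to 1.0
```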

128 Simulated example: 1D scan

129 Peaks from 1D scan phenotype ~ covariates + ?

130 1D scan: condition on 1 peak phenotype ~ covariates + peak 1 + ?

131 1D scan: condition on 2 peaks phenotype ~ covariates + peak 1 + peak 2 + ?

132 1D scan: condition on 3 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + ?

133 1D scan: condition on 4 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + ?

134 1D scan: condition on 5 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + ?

135 1D scan: condition on 6 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + ?

136 1D scan: condition on 7 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + ?

137 1D scan: condition on 8 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + ?

138 1D scan: condition on 9 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + ?

139 1D scan: condition on 10 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + peak 10 + ?

140 1D scan: condition on 11 peaks phenotype ~ covariates + peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + peak 10 + peak 11 + ?

141 Peaks chosen by forward selection

142 Bootstrap sampling. 10 subjects: 1 2 3 4 5 6 7 8 9 10

143 Bootstrap sampling. 10 subjects: 1 2 3 4 5 6 7 8 9 10. Sample with replacement. Bootstrap sample from 10 subjects: 1 2 2 3 5 5 6 7 7 9

144

145 Forward selection on a bootstrap sample

146

147

148 Bootstrap evidence mounts up…

149 In 1000 bootstraps… Bootstrap Posterior Probability (BPP)

150 Model averaging by bootstrap aggregation
Choosing only one model: very data-dependent, arbitrary; can’t get all the true QTLs in one model.
Bootstrap aggregation: averages over models; true QTLs get included more often than false ones.
References: Broman & Speed (2002); Hackett et al. (2001).
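Bootstrap aggregation of forward selection can be sketched as follows. Subjects are resampled with replacement, forward selection is run on each resample, and the fraction of resamples in which each marker is chosen is reported. That fraction is what the slides appear to call the bootstrap posterior probability. All function names, the entry threshold `f_enter`, the cap `max_terms`, and the simulated data are assumptions for illustration.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(y, rows, terms):
    """Residual sum of squares for y ~ 1 + the marker columns listed in terms."""
    design = [[1.0] + [row[t] for t in terms] for row in rows]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * v for r, v in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((v - sum(b * e for b, e in zip(beta, r))) ** 2 for r, v in zip(design, y))


def forward_select(y, rows, f_enter=10.0, max_terms=3):
    """Add the best remaining marker while its F-to-enter is at least f_enter."""
    chosen, current, n = [], rss(y, rows, []), len(y)
    while len(chosen) < max_terms:
        candidates = [t for t in range(len(rows[0])) if t not in chosen]
        if not candidates:
            break
        best = min(candidates, key=lambda t: rss(y, rows, chosen + [t]))
        new = rss(y, rows, chosen + [best])
        if (current - new) / (new / (n - len(chosen) - 2)) < f_enter:
            break
        chosen.append(best)
        current = new
    return chosen


def bootstrap_inclusion(y, rows, n_boot=50, seed=1):
    """Fraction of bootstrap resamples in which each marker is selected."""
    rng = random.Random(seed)
    n, counts = len(y), [0] * len(rows[0])
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample subjects with replacement
        for t in forward_select([y[i] for i in idx], [rows[i] for i in idx]):
            counts[t] += 1
    return [c / n_boot for c in counts]


def simulate(n=120, n_markers=6, seed=0):
    """Made-up data: only marker 0 affects the phenotype."""
    rng = random.Random(seed)
    rows = [[rng.choice([-1.0, 0.0, 0.0, 1.0]) for _ in range(n_markers)] for _ in range(n)]
    y = [1.5 * row[0] + rng.gauss(0.0, 1.0) for row in rows]
    return y, rows


if __name__ == "__main__":
    y, rows = simulate()
    print(bootstrap_inclusion(y, rows))
```

Markers with a real effect should be selected in nearly every resample, while spurious markers are selected only occasionally, which is the sense in which true QTLs "get included more often than false ones".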

151 PRACTICAL: http://gscan.well.ox.ac.uk

152 ADDITIONAL SLIDES FROM HERE

153 An individual’s phenotype follows a mixture of normal distributions

154 m Maternal chromosome Paternal chromosome

155 m Chromosome 1 Chromosome 2 ABCDEFABCDEF Strains

156 Markers m ABCDEFABCDEF Strains

157 Markers m ABCDEFABCDEF Strains

158 Markers m 0.5 cM

159 Markers m 0.5 cM 1 cM

160 Markers m 0.5 cM 1 cM

161 Analysis Probabilistic Ancestral Haplotype Reconstruction (descent mapping): implemented in HAPPY http://www.well.ox.ac.uk/~rmott/happy.html

162

163 M1 Q M2 m1 q m2 M1 ? m2 recombination

164 M1 Q M2 m1 q m2 m1 Q M2 M1 q m2 M1 q m2

165 M1 Q M2 m1 q m2 M1 Q m2 M1 Q m2 m1 q M2 m1 Q M2 M1 q m2 M1 q m2

166 M1 Q M2 m1 q m2 M1 m2 M1 m2 m1 m2 m1 M2 M1 m2 M1 m2 Qq Qqq Q

167 M1 ? m2 m1 ? m2 cM distances determine probabilities

168 M1 ? m2 m1 ? m2 cM distances determine probabilities Eg,

169 Interval mapping M1 M2 m1 m2 M1 m1 M2 m2 LOD score

170 Interval mapping M1 M2 m1 m2 M1 m1 M2 m2 LOD score Q q

171 Interval mapping M1 M2 m1 m2 M1 m1 M2 m2 LOD score Q q

172 Interval mapping M1 M2 m1 m2 M1 m1 M2 m2 LOD score Q q

