Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women
Michael Elliott, Xiaobi Huang, Sioban Harlow
University of Michigan School of Public Health
September 30, 2008
Outline
- Has the onset of menopause changed since the early 20th century?
  - Tremin I: U. Minnesota undergraduates, 1930s.
  - Tremin II: U. Minnesota undergraduates, 1960s.
- Develop a statistical model for menstrual cycle length.
  - Bayesian methods.
  - Apply to complete data from Tremin I.
- Future work
  - Accounting for missing data (hormone use, dropout).
  - Including Tremin II data.
  - Relating to existing suggestions for final menstrual period (FMP) markers (60 days, 90 days, etc.).
Modeling Menstrual Cycle Length: Observed Data
Characterized by:
- Stable trend during a woman’s 20s and 30s.
- “Breakdown” several months to several years before FMP:
  - Increase in variability.
  - Increase in mean length.
Modeling Menstrual Cycle Length: Observed Data
The pattern is easier to see on a log scale.
Modeling Menstrual Cycle Length: Linear Changepoint Model
- There appears to be a common pattern in how menstrual cycle length changes with age.
- A linear changepoint model can be implemented as a linear spline with one changepoint.
Modeling Menstrual Cycle Length: Linear Changepoint Model
- Different slopes and intercepts on either side of the changepoint θ.
- But the means converge at the “knot” θ.
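A one-changepoint linear spline can be sketched in a few lines of Python. This is a minimal illustration under an assumed parametrization, not the authors' specification: the function name, the argument names, and the choice of age 35 as the origin for the intercept are all hypothetical. The two segments have different slopes but share the same value at the knot θ.

```python
import numpy as np

def changepoint_mean(age, intercept, slope_before, slope_after, theta, origin=35.0):
    """Piecewise-linear mean that is continuous at the changepoint theta.

    Assumed parametrization (illustrative only): `intercept` is the mean at
    age `origin`, `slope_before` applies up to theta, `slope_after` beyond it.
    """
    age = np.asarray(age, dtype=float)
    years_before = np.minimum(age, theta) - origin   # stops growing once age passes theta
    years_after = np.maximum(age - theta, 0.0)       # zero until age passes theta
    return intercept + slope_before * years_before + slope_after * years_after

# Illustrative values on the log scale (not estimates from the slides).
ages = np.array([35.0, 47.0, 50.0])
log_mean = changepoint_mean(ages, intercept=3.3, slope_before=0.0,
                            slope_after=0.25, theta=47.0)
print(np.exp(log_mean))  # back-transform from the log scale to days
```

Writing the "before" term as `min(age, theta) - origin` is one way to make the two lines meet at θ without a separate constraint; other equivalent parametrizations exist.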
Modeling Menstrual Cycle Length: Linear Changepoint Model
- The variance can be modeled via a linear changepoint model, just like the mean.
- Note that the changepoints θ are estimated from the data, not fixed in advance.
Modeling Menstrual Cycle Length: Hierarchical Model
- Although the overall pattern is the same, women differ in the ages at which their cycle lengths begin to change, as well as in their means and variances at “baseline”.
- This suggests constructing a hierarchical model in which each woman has her own parameters governing the mean and variance changepoint models.
Modeling Menstrual Cycle Length: Hierarchical Model
- Start at age 35 and take the log of cycle length to improve the normal approximation.
- Linear changepoint models for both the mean and the variance of the log length (in days) of the tth menstrual cycle for the ith woman, as a function of her age in years at the start of that cycle.
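One way to write such a subject-level model is sketched below. All notation here is assumed for illustration and is not taken from the slides: $y_{it}$ is the length in days of cycle $t$ for woman $i$, $x_{it}$ is her age in years at the start of that cycle, $\theta_i$ and $\phi_i$ are her mean and variance changepoints, and $(u)_+ = \max(u, 0)$. Modeling the variance on the log scale and centering age at 35 are likewise assumptions.

```latex
\begin{aligned}
\log y_{it} &\sim N\!\left(\mu_{it},\, \sigma^2_{it}\right), \\
\mu_{it} &= \alpha_i + \beta_{1i}\,\bigl(\min(x_{it}, \theta_i) - 35\bigr)
            + \beta_{2i}\,(x_{it} - \theta_i)_+ , \\
\log \sigma^2_{it} &= a_i + b_{1i}\,\bigl(\min(x_{it}, \phi_i) - 35\bigr)
            + b_{2i}\,(x_{it} - \phi_i)_+ .
\end{aligned}
```

Under this form each woman has eight parameters: an intercept, two slopes, and a changepoint for the mean, and the same four for the log variance.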
Modeling Menstrual Cycle Length: Hierarchical Model
- Each of the individual parameters is then assumed to follow a common distribution.
- Allows information about cycle parameters to be shared across women.
- Accommodates “within-woman” correlation in cycle lengths.
- Relates cycle parameters to baseline covariates via regression coefficients.
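A common way to express this population level is a multivariate normal regression on the baseline covariates. The sketch below is an assumption about the form, not the authors' stated model: $\psi_i$ collects the eight subject-level parameters for woman $i$ (intercept, two slopes, and changepoint for the mean, and the same four for the variance), $z_i$ is her vector of baseline covariates including a leading 1, $B$ is a matrix of regression coefficients, and $\Sigma$ is an $8 \times 8$ covariance matrix.

```latex
\psi_i \;\sim\; N_8\!\left(B^\top z_i,\; \Sigma\right), \qquad i = 1, \dots, n .
```

In a model of this form the columns of $B$ play the role of the covariate effects, and the off-diagonal entries of $\Sigma$ correspond to correlations among the random effects.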
Bayesian Models
- Considering a model of this form is easier from a Bayesian perspective.
- The classical or “frequentist” approach to statistics considers the observed data y to be random, governed by fixed (unknown) parameters θ.
  - Determine the joint distribution of the data, $f(y \mid \theta)$, and consider it as a function of θ: the likelihood $L(\theta) = f(y \mid \theta)$.
  - A point estimate of θ is made by maximizing $L(\theta)$ or $\log L(\theta)$.
  - Inference about θ is made by considering the repeated sampling properties of y for different θ.
  - Ex: a 95% confidence interval; in hypothetical repeated samples, 95% of such intervals will contain θ.
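Bayesian Models
- Bayesian statistics also models the data distribution $f(y \mid \theta)$, but considers θ to have a probability distribution of its own, the prior $p(\theta)$.
- Prior information contained in $p(\theta)$ is updated from the data y to obtain a posterior distribution of θ: $p(\theta \mid y) \propto f(y \mid \theta)\, p(\theta)$.
- Hierarchical models give the parameters of the prior their own “hyperprior” distributions.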
Bayesian Models
- Prior or hyperprior distributions encode prior knowledge about parameters, but can be chosen to be very weakly informative if little prior information is available, or if it is to be ignored.
- Modern computational techniques such as Markov chain Monte Carlo allow complex models such as the one used here to be fit (relatively) painlessly.
- Results are from 3,000 draws of a Gibbs sampler, with 1,000 draws discarded as “burn-in”.
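The mechanics of a Gibbs sampler with burn-in can be illustrated on a much simpler model than the hierarchical changepoint model. The sketch below is a toy stand-in, not the authors' sampler: it assumes normal data with unknown mean and variance, a weakly informative normal prior on the mean, and an inverse-gamma prior on the variance. The function name, prior constants, and defaults are hypothetical.

```python
import numpy as np

def gibbs_normal(y, n_draws=3000, burn_in=1000, seed=0):
    """Gibbs sampler for a toy model y_j ~ N(mu, sigma2); returns post-burn-in draws."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, ybar = y.size, y.mean()
    # Assumed weakly informative priors: mu ~ N(m0, s0_sq), sigma2 ~ Inv-Gamma(a0, b0).
    m0, s0_sq, a0, b0 = 0.0, 1e6, 0.01, 0.01
    mu, sigma2 = ybar, y.var()
    draws = np.empty((n_draws, 2))
    for k in range(n_draws):
        # Full conditional of mu given sigma2: normal.
        prec = 1.0 / s0_sq + n / sigma2
        cond_mean = (m0 / s0_sq + n * ybar / sigma2) / prec
        mu = rng.normal(cond_mean, np.sqrt(1.0 / prec))
        # Full conditional of sigma2 given mu: inverse-gamma,
        # drawn as the reciprocal of a gamma variate with scale = 1 / rate.
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * np.sum((y - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
        draws[k] = mu, sigma2
    return draws[burn_in:]  # discard the burn-in draws

# Simulated log cycle lengths (illustrative values only).
y_sim = np.random.default_rng(1).normal(3.3, 0.1, size=200)
posterior = gibbs_normal(y_sim)
print(posterior.mean(axis=0))  # posterior means of (mu, sigma2)
```

Here the defaults keep 2,000 of 3,000 draws, which is one reading of the burn-in described above. The changepoint model needs additional steps for the changepoints and subject-level parameters, but the alternating draw-from-full-conditional pattern is the same.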
Results
- Fit the above model to the 106 women with complete data in Tremin I.
  - Pregnancies, abortions.
  - No hormone use or gaps in reporting.
- Subject-level parameters (random effects).
  - Fit of predicted means and variances.
- Regression against parity, menarche, and means and standard errors at age 25-29.
- Correlation of subject-level parameters.
Results: Subject-level estimates of trends
i
(3.33, 3.39) | (-.014, -.006) | .089 (.007, .303) | 49.9 (47.2, 51.4) | -4.8 (-5.3, -4.3) | .01 ( ) | .76 (.56, .99) | 47.3 (46.5, 48.0)
(3.06, 3.23) | .037 (-.010, .084) | .279 (-.136, .727) | 42.0 (41.4, 42.1) | -3.6 (-4.1, -3.0) | .52 (.36, .66) | 1.27 (.19, 2.61) | 42.0 (41.0, 42.1)
Results: Subject-level estimates of trends
Figure legend: mean cycle length, posterior mean (95% posterior predictive interval); 2.5th and 97.5th percentiles, posterior mean (95% PPI).
Results: Subject-level estimates of trends
- The one-changepoint model provides a reasonably good fit to the data.
- Subject-level differences in mean and variance trends appear to be captured.
- Uncertainty in the location of the changepoints is reflected in the smoothness of the “elbow” for the mean and variances.
Results: Subject-level estimates of trends
- Posterior means and 80% posterior predictive intervals for the mean changepoint and the variance changepoint.
- Variability in cycle length increases well in advance of increases in mean cycle length itself.
- Changepoints for some subjects are well estimated, while there is a great deal of uncertainty for others.
Results: Population Mean for trend parameters (unadjusted)
Parameter | Estimate (interval)
Mean intercept | 3.30 (3.27, 3.32)
Mean slope before changepoint | (-.023, .015)
Mean slope after changepoint | .264 (.151, .425)
Changepoint for mean | 47.0 (46.4, 47.6)
Exp(Variance intercept) | .0088 (.0067, .0112)
Exp(Variance slope before changepoint) | 1.09 (1.02, 1.15)
Exp(Variance slope after changepoint) | 3.05 (2.60, 3.68)
Variance changepoint | 45.2 (44.6, 45.8)
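Because the model is fit to log cycle length, the population values above are easier to read after back-transforming. The short calculation below is an interpretive sketch, assuming the mean parameters are on the natural-log scale, that slopes are per year of age, and that the variance rows are already exponentiated as labeled. Variable names are hypothetical. Under a lognormal model, the exponentiated mean intercept is the median cycle length, about 27 days here.

```python
import math

# Values quoted in the unadjusted population table.
mean_intercept = 3.30       # log cycle length at the intercept
mean_slope_after = 0.264    # change in log cycle length per year after the changepoint
var_intercept = 0.0088      # Exp(variance intercept): variance of log cycle length
var_ratio_before = 1.09     # Exp(variance slope before changepoint)
var_ratio_after = 3.05      # Exp(variance slope after changepoint)

typical_days = math.exp(mean_intercept)           # median cycle length in days
yearly_length_ratio = math.exp(mean_slope_after)  # multiplicative change in length per year
log_sd = math.sqrt(var_intercept)                 # SD of log cycle length at the intercept
sd_ratio_before = math.sqrt(var_ratio_before)     # yearly SD multiplier before the changepoint
sd_ratio_after = math.sqrt(var_ratio_after)       # yearly SD multiplier after the changepoint

print(f"median length at intercept: {typical_days:.1f} days")
print(f"length multiplier per year after changepoint: {yearly_length_ratio:.2f}")
print(f"log-scale SD at intercept: {log_sd:.3f}")
print(f"SD multiplier per year, before/after: {sd_ratio_before:.2f} / {sd_ratio_after:.2f}")
```

Read this way, the variance grows slowly before its changepoint and roughly triples per year afterwards, while the mean length rises by about 30% per year after its own changepoint.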
Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29)
Parameter | Intercept | Nulliparous | Menarche | Mean 25-29 | SD
Mean intercept | 3.28 (3.22, 3.31) | -.00 (-.02, .02) | .02 (-.03, .07) | .023 (.015, .029) | (-.007, -.003)
Mean slope before changepoint | (-.044, .028) | .000 (-.016, .012) | (-.050, .035) | (-.008, .004) | .000 (-.002, .002)
Mean slope after changepoint | .307 (.135, .446) | (-.061, .018) | .006 (-.126, .098) | (-.024, .009) | .001 (-.004, .005)
Changepoint for mean | (46.11, 48.03) | .09 (-.31, .38) | -.02 (-1.16, .81) | .15 (-.01, .27) | (-.001, -.000)
Exp(Variance intercept) | .0091 (.0054, .0133) | 1.09 (.87, 1.23) | .920 (.507, 1.395) | (.973, 1.123) | .972 (.928, 1.006)
Exp(Variance slope before changepoint) | 1.00 (.89, 1.09) | 1.00 (.95, 1.03) | (.973, 1.228) | .984 (.969, .997) | (.998, 1.007)
Exp(Variance slope after changepoint) | 3.20 (2.41, 3.95) | .96 (.86, 1.04) | .946 (.691, 1.197) | (.981, 1.054) | .995 (.982, 1.004)
Variance changepoint | (43.65, 45.53) | .172 (-.235, .484) | .58 (-.57, 1.47) | .181 (.018, .302) | (-.087, .005)
Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29)
- Parity is not associated with cycle structure.
- Weak evidence that earlier menarche is associated with increasing variability before the changepoint.
- Higher historical mean:
  - Higher mean at 35.
  - Decline in variability before the changepoint.
  - Later changepoints for both mean and variance.
- Higher historical variance:
  - Lower mean at 35.
  - Earlier changepoints for both mean and variance.
Results: Correlations among random effects
Correlation matrix (rows and columns): Mean intercept; Mean slope before changepoint; Mean slope after changepoint; Changepoint for mean; Exp(Variance intercept); Exp(Variance slope before changepoint); Exp(Variance slope after changepoint); Variance changepoint.
Results: Correlations among random effects
- Longer cycles at age 35 are associated with somewhat later changepoints in both mean and variance.
- Highly variable cycles at age 35 are associated with slower increases, or even declines, in variability before the variability changepoint, but more rapid increases thereafter.
- Later mean changepoints are associated with slower increases, or even declines, in variability before the variability changepoint, and slower increases thereafter.
- Later mean changepoints are strongly associated with later variance changepoints.
Next Steps: Modeling
- Account for missing data (hormone use, dropout).
  - Impute missing cycles under the model, and then obtain draws from the posterior distribution of the parameters conditional on the observed and imputed data.
  - Use results from alternative non-model-based approaches.
Next Steps: Modeling
- Model checking
  - “Eyeball” approach shows good fit.
  - Formalize with posterior predictive checks.
    - Generate predictive data under the model and compute posterior distributions of statistics of interest (chi-square measures, etc.).
    - Compare with the posterior distribution of the same statistics using the observed (fixed) data.
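A posterior predictive check of this kind can be sketched as follows. This is a simplified illustration, not the planned analysis: it assumes a single normal mean and standard deviation per posterior draw rather than the full changepoint model, and the function names are hypothetical. For each draw, a replicated data set is generated and its chi-square discrepancy is compared with that of the observed data under the same draw.

```python
import numpy as np

def chi_square_discrepancy(y, mu, sigma):
    """Sum of squared standardized residuals for data y under (mu, sigma)."""
    y = np.asarray(y, dtype=float)
    return float(np.sum(((y - mu) / sigma) ** 2))

def posterior_predictive_pvalue(y_obs, mu_draws, sigma_draws, seed=0):
    """Share of posterior draws whose replicated discrepancy is at least the observed one."""
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs, dtype=float)
    exceed = 0
    n_draws = 0
    for mu, sigma in zip(mu_draws, sigma_draws):
        # Replicate a data set of the same size under this posterior draw.
        y_rep = rng.normal(mu, sigma, size=y_obs.size)
        t_rep = chi_square_discrepancy(y_rep, mu, sigma)
        t_obs = chi_square_discrepancy(y_obs, mu, sigma)
        if t_rep >= t_obs:
            exceed += 1
        n_draws += 1
    return exceed / n_draws
```

Values near 0 or 1 suggest that the observed data are more extreme than data the model tends to generate; values in between give no such signal.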
Next Steps: Analysis
- Include Tremin II data.
  - Add as a covariate to the population model.
  - Assess secular trends pre- and post-birth control use.
Next Steps: Analysis
- Relate to existing suggestions for FMP markers (60 days, 90 days, etc.).
  - Consider the predictive value of the model for pre-FMP subjects.
  - “Cross-validation” with Tremin data.
  - Validation with other data sources.
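One simple way to connect a fitted model to threshold-based markers is to compute the probability that a cycle exceeds a given length. The sketch below assumes log cycle length is normal with a known mean and standard deviation at a given age. The function name and the illustrative parameter values are hypothetical, and this is not the authors' proposed procedure.

```python
import math

def prob_cycle_exceeds(days, mu, sigma):
    """P(cycle length > days) when log cycle length ~ Normal(mu, sigma^2)."""
    z = (math.log(days) - mu) / sigma
    # Upper-tail standard normal probability via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Illustrative log-scale mean and SD (not estimates from the slides).
for threshold in (60, 90):
    print(threshold, prob_cycle_exceeds(threshold, mu=3.8, sigma=0.5))
```

Evaluating this at each posterior draw, rather than at a single value of the mean and SD, would carry parameter uncertainty through to the exceedance probability.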
Next Steps: Analysis
- Look for cycle behavior that might reflect disease.