Presentation is loading. Please wait.

Presentation is loading. Please wait.

DIY fractional polynomials Patrick Royston MRC Clinical Trials Unit, London 10 September 2010.

Similar presentations


Presentation on theme: "DIY fractional polynomials Patrick Royston MRC Clinical Trials Unit, London 10 September 2010."— Presentation transcript:

1 DIY fractional polynomials Patrick Royston MRC Clinical Trials Unit, London 10 September 2010

2 Overview Introduction to fractional polynomials Going off-piste: DIY fractional polynomials Examples

3 3

4 Fractional polynomial models A fractional polynomial of degree 1 with power p 1 is defined as FP1 = β 1 X p 1 A fractional polynomial of degree 2 with powers (p 1,p 2 ) is defined as FP2 = β 1 X p 1 + β 2 X p 2 Powers (p 1,p 2 ) are taken from a predefined set S = {2,  1,  0.5, 0, 0.5, 1, 2, 3} where 0 means log X  Also, there are ‘repeated’ powers FP2 models Example: FP1 [power 0.5] = β 1 X 0.5 Example: FP2 [powers (0.5, 3)] = β 1 X 0.5 + β 2 X 3 Example: FP2 [powers (3, 3)] = β 1 X 3 + β 2 X 3 lnX

5 Some examples of fractional polynomial (FP2) curves Royston P, Altman DG (1994) Applied Statistics 43: 429-467.

6 FP analysis for the prognostic effect of age in breast cancer

7 FP function selection procedure Simple functions are preferred. More complicated functions are accepted only if the fit is much better Effect of age significant at 5% level? χ 2 dfP-value Any effect? Best FP2 versus null17.61 40.0015 Linear function suitable? Best FP2 versus linear17.03 3 0.0007 FP1 sufficient? Best FP2 vs. best FP111.20 2 0.0037

8 Fractional polynomials in Stata fracpoly command Basic syntax:. fracpoly [, fp_options]: regn_cmd [yvar] xvar1 [xvars] … xvar1 is a continuous predictor which may have a curved relationship with yvar xvars are other predictors, all modelled as linear Can use the fp_option compare to compare the fit of different FP models uses the FP function selection procedure

9 Example (auto data) fracpoly, compare: regress mpg displacement Fractional polynomial model comparisons: -------------------------------------------------------------------------- displacement df Deviance Res. SD Dev. dif. P (*) Powers -------------------------------------------------------------------------- Not in model 0 468.789 5.7855 70.818 0.000 Linear 1 417.801 4.12779 19.830 0.000 1 m = 1 2 400.592 3.67467 2.621 0.284 -2 m = 2 4 397.971 3.6355 -- -- -2 3 -------------------------------------------------------------------------- (*) P-value from deviance difference comparing reported model with m = 2 model Show FP1 and FP2 models in Stata (+ fracplot)

10 But what if fracpoly can’t fit my model … ? fracpoly supports only some of Stata’s rich set of regression-type commands Provided we know what the command we want to fit looks like with a transformed covariate, we can fit an FP model to the data We just create the necessary transformed covariate values, fit the model using them, and assess the fit A new, simple command fracpoly_powers helps by generating strings (local macros) with the required powers:. fracpoly_powers [, degree (#) s (list_of_powers) ]

11 Fitting an FP2 model in the auto example // Store FP2 powers in local macros fracpoly_powers, degree(2) local np = r(np) forvalues j = 1 / `np' { local p`j' `r(p`j')' } // Compute deviance for each model with covariate displacement local x displacement local y mpg local devmin 1e30 quietly forvalues j = 1 / `np' { fracgen `x' `p`j'', replace regress `y' `r(names)' local dev = -2 * e(ll) if `dev' < `devmin' { local pbest `p`j'' local devmin `dev' } di "Best model has powers `pbest', deviance = " `devmin'

12 A real example: modelling fetal growth Prospective longitudinal study of n = 50 pregnant women There are about 6 repeated measurements on each fetus at different gestational ages ( gawks ) gawks = gestational age in weeks Wish to model how y = log fetal abdominal circumference changes with gestational age There is considerable curvature!

13 The raw data

14 A mixed model for fetal growth Multilevel (mixed) model to fit this relationship:. xtmixed y FP(gawks) || id: FP(gawks), covariance(unstructured) But how do we implement “ FP(gawks) ” here? We want the best-fitting FP function of gawks, with random effects for the parameters (β’s) of the FP model

15 Fitting an FP2 mixed model to the fetal AC data [First run fracpoly_powers to create local macros with powers] // Compute deviance for each FP model with covariate gawks gen x = gawks gen y = ln(ac) local devmin 1e30 forvalues j = 1 / `np' { qui fracgen x `p`j'', replace adjust(mean) qui xtmixed y `r(names)' || id: `r(names)', /// nostderr covariance(unstructured) local dev = -2 * e(ll) if `dev' < `devmin' { local p `p`j'' local devmin `dev' } di "powers = `p`j''" _col(20) " deviance = " %9.3f `dev' } di _n "Best model has powers `p', deviance = " `devmin'

16 Plots of some results

17 An “ignorant” example! I know almost nothing about “seemingly unrelated regression” (Stata’s sureg command) It fits a set of linear regression models which have correlated error terms The syntax therefore has a set of “equations”. sureg (depvar1 varlist1) (depvar2 varlist2)... (depvarN varlistN) There may be non-linearities lurking in these “equations” How can we fit FP models to varlist1, varlist2, … ?

18 Example: modelling learning scores Stata FAQ from UCLA ( http://www.ats.ucla.edu/stat/stata/faq/sureg.htm ): What is seemingly unrelated regression and how can I perform it in Stata? Example: High School and Beyond study

19 Example: modelling learning scores Contains data from hsb2.dta obs: 200 highschool and beyond (200 cases) vars: 11 5 Jul 2010 13:23 size: 9,600 (99.9% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- id float %9.0g female float %9.0g fl race float %12.0g rl ses float %9.0g sl schtyp float %9.0g scl type of school prog float %9.0g sel type of program read float %9.0g reading score write float %9.0g writing score math float %9.0g math score science float %9.0g science score socst float %9.0g social studies score ------------------------------------------------------------------------------- [It is unclear to me what “ses” (low, middle, high) is]

20 Example (ctd.) As an example, suppose we wish to model 2 outcomes ( read, math ) as predicted by “ socst female ses ” and “ science female ses ” using sureg as follows:. sureg (read socst female ses) (math science female ses) Are there non-linearities in read as a function of socst ? In math as a function of science ? For simplicity here, will restrict ourselves to FP1 functions of socst and science not necessary in principle We fit the 8 × 8 = 64 FP1 models and look for the best- fitting combination

21 Stata gen x1 = socst gen x2 = science gen y1 = read gen y2 = math local devmin 1e30 forvalues j = 1 / `np' { qui fracgen x1 `p`j'', replace adjust(mean) local x1vars `r(names)' forvalues k = 1 / `np' { qui fracgen x2 `p`k'', replace adjust(mean) local x2vars `r(names)' qui sureg (y1 `x1vars' female ses) (y2 `x2vars' female ses) local dev = -2 * e(ll) if `dev' < `devmin' { local px1 `p`j'' local px2 `p`k'' local devmin `dev' } [Run fpexample3.do in Stata]

22 Comments The results suggest that there is indeed curvature in both relationships Can reject the null hypothesis of linearity at the 1% significance level FP1 vs linear: χ 2 = 10.08 (2 d.f.), P = 0.0065 Shows the importance of considering non-linearity

23 read as a function of socst (adjusted female ses )

24 math as a function of science (adjusted female ses )

25 Conclusions Fractional polynomial models are a simple yet very useful extension of linear functions and ordinary polynomials If you are willing to do some straightforward do-file programming, you can apply them in a bespoke manner to a wide range of Stata regression-type commands and get useful results For (much) more, see Royston & Sauerbrei (2008) book


Download ppt "DIY fractional polynomials Patrick Royston MRC Clinical Trials Unit, London 10 September 2010."

Similar presentations


Ads by Google