Multiple Regression
EPP 245/298 Statistical Analysis of Laboratory Data

2 October 26, 2006, EPP 245 Statistical Analysis of Laboratory Data

Cystic Fibrosis Data

Lung function data for cystic fibrosis patients (7-23 years old).

age      a numeric vector. Age in years.
sex      a numeric vector code. 0: male, 1: female.
height   a numeric vector. Height (cm).
weight   a numeric vector. Weight (kg).
bmp      a numeric vector. Body mass (% of normal).
fev1     a numeric vector. Forced expiratory volume.
rv       a numeric vector. Residual volume.
frc      a numeric vector. Functional residual capacity.
tlc      a numeric vector. Total lung capacity.
pemax    a numeric vector. Maximum expiratory pressure.

3 Some Stata Commands

. insheet using "C:\TD\CLASS\K30Bench2005\cystfibr.csv"
(11 vars, 25 obs)
. graph matrix age sex height weight bmp fev1 rv frc tlc pemax
. graph export cystfibr-scm.wmf
. regress pemax age sex height weight bmp fev1 rv frc tlc
. rvfplot
. graph export cystfibr-rvf.wmf

5

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  9,    15) =    2.93
       Model |  17101.3907     9  1900.15452           Prob > F      =  0.0320
    Residual |  9731.24928    15  648.749952           R-squared     =  0.6373
-------------+------------------------------           Adj R-squared =  0.4197
       Total |    26832.64    24  1118.02667           Root MSE      =  25.471

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -2.54196   4.801699    -0.53   0.604    -12.77654    7.692618
         sex |  -3.736782   15.45982    -0.24   0.812    -36.68861    29.21505
      height |  -.4462549   .9033548    -0.49   0.628     -2.37171      1.4792
      weight |   2.992816   2.007957     1.49   0.157    -1.287044    7.272675
         bmp |  -1.744944   1.155237    -1.51   0.152    -4.207274    .7173865
        fev1 |   1.080697   1.080947     1.00   0.333    -1.223288    3.384682
          rv |    .196972   .1962136     1.00   0.331    -.2212474    .6151915
         frc |  -.3084314   .4923899    -0.63   0.540    -1.357936    .7410729
         tlc |   .1886017   .4997351     0.38   0.711    -.8765585    1.253762
       _cons |   176.0582   225.8911     0.78   0.448    -305.4174    657.5338
------------------------------------------------------------------------------

6 T-test of additional value of variable

In the regression output from slide 5, the t and P>|t| columns test, for each variable, whether it adds anything to a model that already contains all the other variables.
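As a rough check on how the t and confidence-interval columns arise, the sketch below recomputes them for weight from the coefficient and standard error reported on slide 5. The critical value 2.13145 (the 97.5% point of Student's t with 15 degrees of freedom) is hard-coded from standard tables, which is an assumption of this sketch rather than something Stata prints.

```python
# Coefficient and standard error for `weight`, copied from the slide 5 output.
coef = 2.992816
std_err = 2.007957

# 97.5% quantile of Student's t with 15 residual degrees of freedom.
# Hard-coded from standard t tables (an assumption of this sketch) so that
# no statistics library is needed.
T_CRIT_DF15 = 2.13145

# t statistic: the coefficient measured in units of its standard error.
t_stat = coef / std_err

# Two-sided 95% confidence interval: coefficient +/- critical value * SE.
ci_low = coef - T_CRIT_DF15 * std_err
ci_high = coef + T_CRIT_DF15 * std_err

print(f"t = {t_stat:.2f}")
print(f"95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

The interval contains zero, which is the same statement as P>|t| = 0.157 being above 0.05.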

7 Test of whole model

In the regression output from slide 5, the F statistic tests the whole model, that is, whether all nine coefficients are zero: F(9, 15) = 2.93, Prob > F = 0.0320.
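The summary statistics in the upper right of the Stata output follow from the sums of squares and degrees of freedom in the ANOVA block. The sketch below recomputes them for the full model; the function name `anova_summary` is made up for this illustration.

```python
def anova_summary(ss_model, df_model, ss_resid, df_resid):
    """Recompute F, R-squared, adjusted R-squared and root MSE
    from the Model and Residual rows of a regression ANOVA table."""
    ss_total = ss_model + ss_resid
    df_total = df_model + df_resid
    ms_model = ss_model / df_model      # mean square for the model
    ms_resid = ss_resid / df_resid      # mean square error
    ms_total = ss_total / df_total
    return {
        "F": ms_model / ms_resid,
        "R2": ss_model / ss_total,
        "adj_R2": 1 - ms_resid / ms_total,
        "root_MSE": ms_resid ** 0.5,
    }

# Model and Residual rows copied from the slide 5 output.
full = anova_summary(17101.3907, 9, 9731.24928, 15)
for name, value in full.items():
    print(f"{name:>8} = {value:.4f}")
```

Adjusted R-squared uses mean squares rather than sums of squares, which is why it can rise when an uninformative variable is dropped even though R-squared itself can only fall.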

9 Least significant variable

In the regression output from slide 5, sex has the largest p-value (t = -0.24, P>|t| = 0.812), so it is the first variable to be dropped.

10

. regress pemax age height weight bmp fev1 rv frc tlc

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  8,    16) =    3.49
       Model |  17063.4886     8  2132.93607           Prob > F      =  0.0159
    Residual |  9769.15144    16  610.571965           R-squared     =  0.6359
-------------+------------------------------           Adj R-squared =  0.4539
       Total |    26832.64    24  1118.02667           Root MSE      =   24.71

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -2.114515   4.330841    -0.49   0.632    -11.29549    7.066459
      height |   -.394836    .851725    -0.46   0.649    -2.200412     1.41074
      weight |   2.834909   1.841995     1.54   0.143    -1.069947    6.739765
         bmp |  -1.741637   1.120651    -1.55   0.140    -4.117312     .634038
        fev1 |    1.26509   .7429407     1.70   0.108    -.3098737    2.840054
          rv |   .1779046   .1742911     1.02   0.323    -.1915759    .5473852
         frc |  -.2483218   .4122804    -0.60   0.555    -1.122317    .6256736
         tlc |   .2084044   .4782484     0.44   0.669    -.8054369    1.222246
       _cons |   153.0385   198.7149     0.77   0.452    -268.2183    574.2953
------------------------------------------------------------------------------

Least significant variable: tlc (P>|t| = 0.669)

11

. regress pemax age height weight bmp fev1 rv frc

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  7,    17) =    4.16
       Model |  16947.5458     7  2421.07798           Prob > F      =  0.0077
    Residual |  9885.09416    17  581.476127           R-squared     =  0.6316
-------------+------------------------------           Adj R-squared =  0.4799
       Total |    26832.64    24  1118.02667           Root MSE      =  24.114

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -2.663193   4.043832    -0.66   0.519    -11.19493    5.868546
      height |  -.4895733   .8036502    -0.61   0.550    -2.185127    1.205981
      weight |   3.155659   1.647815     1.92   0.072    -.3209274    6.632245
         bmp |  -1.962543   .9753332    -2.01   0.060    -4.020316    .0952305
        fev1 |   1.247861   .7239953     1.72   0.103    -.2796361    2.775357
          rv |   .1595988   .1650733     0.97   0.347    -.1886753    .5078729
         frc |  -.1764595    .368749    -0.48   0.638    -.9544518    .6015328
       _cons |   198.2942   165.3311     1.20   0.247    -150.5238    547.1123
------------------------------------------------------------------------------

Least significant variable: frc (P>|t| = 0.638)

12

. regress pemax age height weight bmp fev1 rv

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  6,    18) =    5.04
       Model |  16814.3899     6  2802.39832           Prob > F      =  0.0034
    Residual |  10018.2501    18  556.569447           R-squared     =  0.6266
-------------+------------------------------           Adj R-squared =  0.5022
       Total |    26832.64    24  1118.02667           Root MSE      =  23.592

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -1.819342   3.560301    -0.51   0.616    -9.299258    5.660573
      height |  -.4101508   .7693006    -0.53   0.600    -2.026391     1.20609
      weight |   2.874434   1.506126     1.91   0.072    -.2898203    6.038688
         bmp |  -1.949083   .9538193    -2.04   0.056    -3.952983    .0548169
        fev1 |   1.411959   .6238279     2.26   0.036     .1013452    2.722573
          rv |   .0955779   .0946057     1.01   0.326    -.1031813    .2943371
       _cons |   166.9049   148.4762     1.12   0.276    -145.0321    478.8418
------------------------------------------------------------------------------

Least significant variable: age (P>|t| = 0.616)

13

. regress pemax height weight bmp fev1 rv

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  5,    19) =    6.23
       Model |  16669.0534     5  3333.81068           Prob > F      =  0.0014
    Residual |  10163.5866    19   534.92561           R-squared     =  0.6212
-------------+------------------------------           Adj R-squared =  0.5215
       Total |    26832.64    24  1118.02667           Root MSE      =  23.128

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      height |  -.4485274   .7505918    -0.60   0.557    -2.019534    1.122479
      weight |   2.338692   1.060094     2.21   0.040     .1198889    4.557495
         bmp |  -1.641001   .7246036    -2.26   0.035    -3.157614   -.1243885
        fev1 |   1.471767   .6007182     2.45   0.024     .2144491    2.729084
          rv |    .110117   .0884543     1.24   0.228      -.07502     .295254
       _cons |   137.0958   133.8559     1.02   0.319    -143.0677    417.2594
------------------------------------------------------------------------------

Least significant variable: height (P>|t| = 0.557)

14

. regress pemax weight bmp fev1 rv

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  4,    20) =    7.96
       Model |  16478.0401     4  4119.51002           Prob > F      =  0.0005
    Residual |  10354.5999    20  517.729996           R-squared     =  0.6141
-------------+------------------------------           Adj R-squared =  0.5369
       Total |    26832.64    24  1118.02667           Root MSE      =  22.754

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   1.748914   .3806332     4.59   0.000     .9549274    2.542901
         bmp |  -1.377243   .5653421    -2.44   0.024    -2.556526   -.1979604
        fev1 |   1.547698   .5776112     2.68   0.014     .3428223    2.752574
          rv |   .1257152   .0831456     1.51   0.146    -.0477234    .2991538
       _cons |    63.9467   53.27673     1.20   0.244    -47.18661      175.08
------------------------------------------------------------------------------

Least significant variable: rv (P>|t| = 0.146)

15

. regress pemax weight bmp fev1

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  3,    21) =    9.28
       Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
    Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
-------------+------------------------------           Adj R-squared =  0.5086
       Total |    26832.64    24  1118.02667           Root MSE      =   23.44

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
         bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
        fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
       _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
------------------------------------------------------------------------------
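Each single-variable deletion can also be read as a partial F-test between nested models, and for one dropped variable that F equals the square of the variable's t statistic. The sketch below checks this for rv (slide 14 versus slide 15) and then computes the joint F for all six dropped variables (slide 5 versus slide 15). The helper name `partial_f` is made up for this illustration, and no p-value is computed.

```python
def partial_f(rss_reduced, rss_full, df_resid_full, n_dropped):
    """Partial F statistic comparing a reduced model with the larger
    model it is nested in, using their residual sums of squares."""
    extra_ss = rss_reduced - rss_full          # fit lost by dropping terms
    return (extra_ss / n_dropped) / (rss_full / df_resid_full)

# Dropping rv: slide 14 model (with rv) versus slide 15 model (without).
f_rv = partial_f(rss_reduced=11538.1881, rss_full=10354.5999,
                 df_resid_full=20, n_dropped=1)
t_rv = 0.1257152 / 0.0831456                   # coef / std. err. on slide 14

# Dropping all six variables at once: slide 5 model versus slide 15 model.
f_six = partial_f(rss_reduced=11538.1881, rss_full=9731.24928,
                  df_resid_full=15, n_dropped=6)

print(f"F for rv alone      = {f_rv:.3f}, t squared = {t_rv ** 2:.3f}")
print(f"F for six variables = {f_six:.3f} on (6, 15) df")
```

The joint F for the six dropped variables is below 1, so taken together they add little beyond weight, bmp and fev1 in these data.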

16

. stepwise, pr(.05): regress pemax age sex height weight bmp fev1 rv frc tlc
                      begin with full model
p = 0.8123 >= 0.0500  removing sex
p = 0.6688 >= 0.0500  removing tlc
p = 0.6384 >= 0.0500  removing frc
p = 0.6156 >= 0.0500  removing age
p = 0.5572 >= 0.0500  removing height
p = 0.1462 >= 0.0500  removing rv

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  3,    21) =    9.28
       Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
    Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
-------------+------------------------------           Adj R-squared =  0.5086
       Total |    26832.64    24  1118.02667           Root MSE      =   23.44

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
      weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
         bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
       _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
------------------------------------------------------------------------------
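The backward-elimination loop that the manual steps on slides 9-15 and the stepwise log follow can be sketched as below. This is only an illustration of the control flow, not Stata's implementation: instead of refitting a regression, the hypothetical `p_values` function looks up the P>|t| values printed on slides 5 and 10-15, so it works only for the models shown in the deck.

```python
# P>|t| values copied from the regression tables on slides 5 and 10-15.
# Each dict is one fitted model; a real implementation would refit the
# regression at every step instead of consulting this lookup table.
SLIDE_MODELS = [
    {"age": 0.604, "sex": 0.812, "height": 0.628, "weight": 0.157,
     "bmp": 0.152, "fev1": 0.333, "rv": 0.331, "frc": 0.540, "tlc": 0.711},
    {"age": 0.632, "height": 0.649, "weight": 0.143, "bmp": 0.140,
     "fev1": 0.108, "rv": 0.323, "frc": 0.555, "tlc": 0.669},
    {"age": 0.519, "height": 0.550, "weight": 0.072, "bmp": 0.060,
     "fev1": 0.103, "rv": 0.347, "frc": 0.638},
    {"age": 0.616, "height": 0.600, "weight": 0.072, "bmp": 0.056,
     "fev1": 0.036, "rv": 0.326},
    {"height": 0.557, "weight": 0.040, "bmp": 0.035, "fev1": 0.024,
     "rv": 0.228},
    {"weight": 0.000, "bmp": 0.024, "fev1": 0.014, "rv": 0.146},
    {"weight": 0.000, "bmp": 0.019, "fev1": 0.043},
]
P_VALUES_BY_MODEL = {frozenset(model): model for model in SLIDE_MODELS}


def p_values(predictors):
    """Hypothetical stand-in for refitting: look up the slide's p-values.
    Raises KeyError for any model that does not appear in the deck."""
    return P_VALUES_BY_MODEL[frozenset(predictors)]


def backward_eliminate(predictors, p_remove):
    """Repeatedly drop the predictor with the largest p-value until
    every remaining predictor has p < p_remove."""
    current = list(predictors)
    removed = []
    while current:
        pvals = p_values(current)
        worst = max(current, key=lambda name: pvals[name])
        if pvals[worst] < p_remove:
            break                      # everything left is significant
        current.remove(worst)
        removed.append(worst)
    return current, removed


all_predictors = ["age", "sex", "height", "weight", "bmp",
                  "fev1", "rv", "frc", "tlc"]
kept, removed = backward_eliminate(all_predictors, p_remove=0.05)
print("removed in order:", removed)
print("kept:", kept)
```

With the removal threshold at 0.05 the loop drops sex, tlc, frc, age, height and rv in that order, the same path as the stepwise log, and keeps weight, bmp and fev1.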

17

. stepwise, pr(.1) pe(.05): regress pemax age sex height weight bmp fev1 rv frc tlc
                      begin with full model
p = 0.8123 >= 0.1000  removing sex
p = 0.6688 >= 0.1000  removing tlc
p = 0.6384 >= 0.1000  removing frc
p = 0.6156 >= 0.1000  removing age
p = 0.5572 >= 0.1000  removing height
p = 0.1462 >= 0.1000  removing rv

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  3,    21) =    9.28
       Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
    Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
-------------+------------------------------           Adj R-squared =  0.5086
       Total |    26832.64    24  1118.02667           Root MSE      =   23.44

------------------------------------------------------------------------------
       pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
      weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
         bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
       _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
------------------------------------------------------------------------------

18 Cautionary Notes

The significance levels are not necessarily believable after variable selection.

The original full-model F statistic is significant, indicating that there is some relationship between pemax and the predictors: F(9, 15) = 2.93, p = 0.0320.

After variable selection, F(3, 21) = 9.28, p = 0.0004. This p-value is biased toward significance, because the three retained variables were chosen for their apparent association with pemax in these same data.

19

set obs 25
generate x1 = invnormal(uniform())
generate x2 = invnormal(uniform())
generate x3 = invnormal(uniform())
generate x4 = invnormal(uniform())
generate x5 = invnormal(uniform())
generate x6 = invnormal(uniform())
generate x7 = invnormal(uniform())
generate x8 = invnormal(uniform())
generate x9 = invnormal(uniform())
generate y = invnormal(uniform())
regress y x1 x2 x3 x4 x5 x6 x7 x8 x9
stepwise, pr(.1): regress y x1 x2 x3 x4 x5 x6 x7 x8 x9

20

. regress y x1 x2 x3 x4 x5 x6 x7 x8 x9

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  9,    15) =    0.91
       Model |  12.3235639     9  1.36928488           Prob > F      =  0.5397
    Residual |  22.5105993    15  1.50070662           R-squared     =  0.3538
-------------+------------------------------           Adj R-squared = -0.0340
       Total |  34.8341632    24  1.45142347           Root MSE      =   1.225

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -.0441858   .2998066    -0.15   0.885    -.6832085     .594837
          x2 |  -.9078136   .4347798    -2.09   0.054    -1.834525    .0188976
          x3 |   .2076754   .3789522     0.55   0.592    -.6000421    1.015393
          x4 |  -.0056383   .3319125    -0.02   0.987    -.7130931    .7018166
          x5 |   -.330546   .3854497    -0.86   0.405    -1.152113    .4910207
          x6 |   .0202964   .3470704     0.06   0.954    -.7194666    .7600594
          x7 |   -.073401   .3135234    -0.23   0.818    -.7416603    .5948583
          x8 |  -.0552909   .3026913    -0.18   0.858    -.7004621    .5898803
          x9 |  -.3190092   .3137931    -1.02   0.325    -.9878434     .349825
       _cons |  -.2490392   .3078424    -0.81   0.431    -.9051898    .4071113
------------------------------------------------------------------------------

21

. stepwise, pr(.1): regress y x1 x2 x3 x4 x5 x6 x7 x8 x9
                      begin with full model
p = 0.9867 >= 0.1000  removing x4
p = 0.9545 >= 0.1000  removing x6
p = 0.8456 >= 0.1000  removing x1
p = 0.8165 >= 0.1000  removing x7
p = 0.7506 >= 0.1000  removing x8
p = 0.5023 >= 0.1000  removing x3
p = 0.2866 >= 0.1000  removing x5
p = 0.2081 >= 0.1000  removing x9

      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  1,    23) =    7.23
       Model |  8.33379862     1  8.33379862           Prob > F      =  0.0131
    Residual |  26.5003646    23  1.15218977           R-squared     =  0.2392
-------------+------------------------------           Adj R-squared =  0.2062
       Total |  34.8341632    24  1.45142347           Root MSE      =  1.0734

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.6644002   .2470417    -2.69   0.013    -1.175445   -.1533555
       _cons |  -.1523124    .214703    -0.71   0.485    -.5964594    .2918346
------------------------------------------------------------------------------
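In this output, stepwise retains x2 with p = 0.013 even though y was generated independently of every predictor. To get a feel for how often pure noise produces such a hit, the sketch below runs a small Monte Carlo experiment in plain Python. It is a simpler stand-in for the Stata run, not a reimplementation of stepwise: for each simulated data set it only asks whether the single best of nine noise predictors would look significant at the 0.05 level in a simple regression. The critical value 2.0687 (the 97.5% point of Student's t with 23 degrees of freedom) is hard-coded from standard tables, and all names are made up for the illustration.

```python
import math
import random

N_OBS = 25          # observations per data set, as on slide 19
N_PREDICTORS = 9    # independent noise predictors x1..x9

# |r| needed for two-sided p < 0.05 in a simple regression with 23
# residual degrees of freedom: r = t / sqrt(t^2 + df), where
# t = 2.0687 is taken from standard t tables (an assumption here).
T_CRIT_DF23 = 2.0687
R_CRIT = T_CRIT_DF23 / math.sqrt(T_CRIT_DF23 ** 2 + (N_OBS - 2))


def correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def share_with_spurious_hit(n_datasets, seed=0):
    """Fraction of pure-noise data sets in which the best single
    predictor reaches p < 0.05 against y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_datasets):
        y = [rng.gauss(0, 1) for _ in range(N_OBS)]
        best = max(
            abs(correlation([rng.gauss(0, 1) for _ in range(N_OBS)], y))
            for _ in range(N_PREDICTORS)
        )
        if best > R_CRIT:
            hits += 1
    return hits / n_datasets


share = share_with_spurious_hit(2000)
print(f"critical |r| = {R_CRIT:.3f}")
print(f"share of noise data sets with a 'significant' predictor = {share:.2f}")
```

With nine independent chances at the 0.05 level, the expected share is 1 - 0.95^9, about 0.37, so a nominally significant predictor in pure noise is far from rare.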

