1 Stat 112 Notes 5 Today: –Chapter 3.7 (Cautions in interpreting regression results) –Normal Quantile Plots –Chapter 3.6 (Fitting a linear time trend to time- series data)

2 Association vs. Causality A high R² in a simple linear regression of Y on X means that X has a strong linear relationship with Y; in other words, changes in X are strongly associated with changes in the mean of Y. It does not imply that changes in X cause changes in Y. Alternative explanations for a high R²: –The reverse is true: Y causes X. –There may be a lurking (confounding) variable related to both X and Y that is the common cause of both.
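The lurking-variable explanation can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the variable z, the coefficients, and the noise levels are hypothetical and do not come from the course data. Here z drives both x and y, and x never appears in the formula for y, yet the regression of y on x still shows a high R².

```python
import random


def r_squared(x, y):
    """Squared sample correlation, which equals R^2 in simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical lurking variable z is the common cause of x and y.
z = [rng.gauss(0, 1) for _ in range(1000)]
x = [zi + rng.gauss(0, 0.3) for zi in z]      # x depends only on z
y = [2 * zi + rng.gauss(0, 0.3) for zi in z]  # y depends only on z, not on x

r2 = r_squared(x, y)
print(f"R^2 of y on x: {r2:.3f}")
```

Intervening on x in this setup would leave y unchanged, even though the fitted regression looks strong.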

3

4 More on Checking Normality To check normality, we have thus far examined a histogram of the residuals; the histogram should have approximately a bell shape if normality holds. Another tool for checking normality is the normal quantile plot.

5 Normal Quantile Plots Normal quantile (probability) plot: a scatterplot of the ordered residuals, with the x-axis giving the expected value of the kth ordered residual on the standard normal scale (residual / RMSE) and the y-axis giving the actual residual. JMP implementation: save the residuals, then click Analyze, Distribution, the red triangle next to Residuals, and Normal Quantile Plot. If the residuals follow an approximately normal distribution, they should fall within the two red bands.
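The coordinates of a normal quantile plot can be sketched outside JMP as follows. This is an illustration only: the residuals are hypothetical, and the expected normal score of the kth ordered residual is approximated with Blom's plotting position (k - 0.375) / (n + 0.25), which is one common choice and may differ from the formula JMP uses. The confidence bands are not computed here.

```python
from statistics import NormalDist


def normal_quantile_points(residuals):
    """Return (approximate expected normal score, ordered residual) pairs."""
    n = len(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    ordered = sorted(residuals)
    return [
        # Blom's approximation to the expected kth order statistic
        (std_normal.inv_cdf((k - 0.375) / (n + 0.25)), r)
        for k, r in enumerate(ordered, start=1)
    ]


# Hypothetical residuals, for illustration only
residuals = [-1.2, 0.4, 2.1, -0.3, 0.0, -2.0, 1.1]
points = normal_quantile_points(residuals)
for score, resid in points:
    print(f"{score:6.2f}  {resid:6.2f}")
```

If the residuals are roughly normal, these points should lie close to a straight line when plotted.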

6 Normality does not appear to hold. Some of the residuals fall outside the confidence bands.

7 Normality appears reasonable. No observations fall outside of confidence bands.

8 Time Series Data (Chapter 3.6) Cross-sectional data: Data gathered on different individuals at the same point in time. Time series: Data gathered on a single individual (person, firm, and so on) over a sequence of time periods, which may be days, weeks, months, quarters, years or any other measure of time. One goal in analyzing time series is to understand the trend in Y over time: E(Y|Time), i.e., we treat Time as our explanatory variable in the regression analysis. Simple Linear Regression Model for Trend in Time Series Data:
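In the usual simple linear regression notation, a linear trend model with Time as the explanatory variable can be written as below. The symbols β0, β1, e_t and σ are assumed here for illustration and may not match the notation used in the course.

```latex
Y_t = \beta_0 + \beta_1 t + e_t,
\qquad
E(Y_t \mid t) = \beta_0 + \beta_1 t,
\qquad
e_t \sim N(0, \sigma^2) \ \text{independently}
```

Under this form, β1 is the change in the mean of Y from one time period to the next.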

9 Hurricane Data Is there a trend in the number of hurricanes in the Atlantic over time (possibly an increase because of global warming)? hurricane.JMP contains data on the number of hurricanes in the Atlantic basin from 1950 to 2006.

10

11 Inferences for Hurricane Data Residual plots and normal quantile plots indicate that the assumptions of linearity, constant variance and normality in the simple linear regression model are reasonable. 95% confidence interval for the slope (change in mean hurricanes between year t and year t+1): (-0.029, 0.057). Hypothesis test of H0: β1 = 0 vs. Ha: β1 > 0 (is the mean number of hurricanes per year increasing over time?): test statistic t = 0.66. The t-ratio is on the same side as the alternative, so p-value = Prob>|t|/2 = 0.5129/2 = 0.2565. We do not reject the null hypothesis; there is not strong evidence of an increasing trend in hurricanes over time.
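The conversion from the two-sided Prob>|t| to a one-sided p-value can be sketched as a small helper. The function name and its `alternative` argument are hypothetical; only the numbers 0.5129 and 0.66 come from the slide. When the t-ratio falls on the same side as the alternative, the one-sided p-value is half the two-sided one; otherwise it is one minus that half.

```python
def one_sided_p_value(two_sided_p, t_ratio, alternative="greater"):
    """Convert a two-sided p-value (Prob>|t|) into a one-sided p-value.

    alternative="greater" tests slope > 0; alternative="less" tests slope < 0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    same_side = t_ratio > 0 if alternative == "greater" else t_ratio < 0
    half = two_sided_p / 2
    return half if same_side else 1 - half


# Values reported for the hurricane regression
p = one_sided_p_value(0.5129, 0.66, alternative="greater")
print(f"one-sided p-value: {p:.5f}")
```

This is consistent with the confidence interval for the slope containing zero: neither approach gives strong evidence of an increasing trend.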

