Download presentation
Presentation is loading. Please wait.
Published byBeatrice Fox Modified over 9 years ago
1
Time Series Analysis – Chapter 4 Hypothesis Testing Hypothesis testing is basic to the scientific method and statistical theory gives us a way of conducting tests of scientific hypotheses. Scientific philosophy today rests on the idea of falsification: For a theory to be a valid scientific theory it must be possible, at least in principle, to make observations that would prove the theory false. For example, here is a simple theory: All swans are white
2
Time Series Analysis – Chapter 4 Hypothesis Testing All swans are white This is a valid scientific theory because there is a way to falsify it: I can observe one black swan and the theory would fall. For more information on the history and philosophy of falsification I suggest reading Karl Popper.
3
Time Series Analysis – Chapter 4 Hypothesis Testing Besides the idea of falsification, we must keep in mind the other basic tenant of the scientific method: All evidence that supports a theory or falsifies it must be empirically based and reproducible.
4
In other words, data! Just holding a belief (no matter how firm) that a theory is true or false is not a justifiable stance. This chapter gives us the most basic statistical tools for taking data or empirical evidence and using it to substantiate or nullify (show to be false) a hypothesis.
5
All evidence that supports a theory or falsifies it must be empirically based and reproducible. I have just used the word hypothesis and this chapter is concerned with hypothesis testing, not theory testing. This is because theories are composed of many hypotheses and, usually, a theory is not directly supported or attacked but one or more of it’s supporting hypotheses are scrutinized.
6
Discrimination or Not Activity Null Hypothesis H o : No Discrimination Alternative Hypothesis H a : Discrimination How do we choose which hypothesis to support?
7
Discrimination or Not Activity Null Hypothesis H o : No Discrimination Alternative Hypothesis H a : Discrimination How do we choose which hypothesis to support?
8
The p-value p-value measures amount of support for alternative hypothesis. The smaller the p-value the more support for the alternative hypothesis. Typical level of support is 5% or 0.05
9
Fourth Graders Feet Data Set
10
Fourth Graders Feet Data Set – Minitab Output Predictor variable (x): Childs Age Response variable (y): Foot Length The regression equation is Foot Length = 18.1 + 0.0358 Childs Age Predictor Coef SE Coef T P Constant 18.138 3.753 4.83 0.000 Childs Age 0.03575 0.02922 1.22 0.229
11
Fourth Graders Feet Data Set – Minitab Output
12
Fourth Graders Feet Data Set – One Tailed Alternative
13
Statistical vs. Practical Significance 401K data set Predictor variables x 1 : mrate x 2 : age x 3 : totemp Response variable (y): prate
14
Statistical vs. Practical Significance The regression equation is: prate = 80.3 + 5.44 mrate + 0.269 age - 0.000130 totemp Predictor Coef SE Coef T P Constant 80.2943 0.7777 103.25 0.000 mrate 5.4414 0.5244 10.38 0.000 age 0.26941 0.04515 5.97 0.000 totemp -0.00012978 0.00003672 -3.53 0.000
15
Statistical vs. Practical Significance The regression equation is: prate = 80.3 + 5.44 mrate + 0.269 age - 0.000130 totemp All predictors are statistically significant.
16
Statistical vs. Practical Significance The regression equation is: prate = 80.3 + 5.44 mrate + 0.269 age - 0.000130 totemp If total number of employees increases by ten thousand then participation rate decreases by - 0.000130*10,000 = 1.3% (other predictors held constant)
17
Boeing 747 Jet What does an empty Boeing 747 jet weigh?
18
Boeing 747 Jet What does an empty Boeing 747 jet weigh? My point estimate: 250,000 lbs Answer: 358,000 lbs I am wrong! A point estimate is almost always wrong!
19
Boeing 747 Jet What does an empty Boeing 747 jet weigh? My confidence interval estimate: (0, ∞) Answer: 358,000 lbs I am right! But, my interval is not useful!
20
Point and Interval Estimates – Minitab will compute both 401K data set Predictor variables x 1 : age Response variable (y): prate In Minitab go to Regression -> General Regression and select the correct model variables then click on the Results box and make sure the “Display confidence intervals” box is selected.
21
Point and Interval Estimates – Minitab will compute both 401K data set Predictor variables x 1 : age Response variable (y): prate Regression Equation prate = 83.4231 + 0.298893 age Coefficients Term Coef SE Coef T P 95% CI Constant 83.4231 0.737593 113.102 0.000 (81.9763, 84.8699) age 0.2989 0.045938 6.506 0.000 ( 0.2088, 0.3890)
22
Confidence Intervals
25
Where does 1.960 come from? t distribution with n – k – 1 degrees of freedom where k is the number of predictors in the model. For our model, n = 1533 and k = 1 We also need to know the confidence level of the interval (typically 95%) Then, use a t table!
26
Testing Linear Combinations of Parameters TWOYEAR data set Predictor variables x 1 : jc – # years attending a two-year college x 2 : univ – # years attending a four-year college x 3 : exper – months in workforce Response variable (y): log(wage)
27
Testing Linear Combinations of Parameters
35
The ANOVA F Test
36
Multiple Linear Regression Assumptions
37
MLR Assumption 2: Data comes from a random sample
38
Multiple Linear Regression Assumptions MLR Assumption 3: None of the independent or predictor variables are perfectly correlated (if they were, Minitab would not run a regression analysis).
39
Multiple Linear Regression Assumptions MLR Assumption 4: The error, u, has an expected value of zero.
40
Multiple Linear Regression Assumptions MLR Assumption 5: The error, u, has the same variance given any values of the explanatory variables. This is the assumption of homoskedasticity.
41
Multiple Linear Regression Assumptions
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.