Power 16: Review (Post-Midterm, Cumulative) and Projects

Presentation transcript:

1 Power 16

2 Review Post-Midterm Cumulative

3 Projects

4 Logistics
Put the PowerPoint slide show on a high-density floppy disk, or send it as an attachment, for a WINTEL machine.
E-mail the slide show as a PowerPoint attachment.

5 Assignments
1. Project choice
2. Data Retrieval
3. Statistical Analysis
4. PowerPoint Presentation
5. Executive Summary
6. Technical Appendix
7. Graphics Power_13

6 PowerPoint Presentations: Member 4
1. Introduction: Members 1, 2, 3
   What
   Why
   How
2. Executive Summary: Member 5
3. Exploratory Data Analysis: Member 3
4. Descriptive Statistics: Member 3
5. Statistical Analysis: Member 3
6. Conclusions: Members 3 & 5
7. Technical Appendix: Table of Contents, Member 6

7 Executive Summary and Technical Appendix

8

9 Technical Appendix
Table of contents
Spreadsheet of data used and sources or, if extensive, a subsample of the data
Descriptive statistics and histograms for the variables in the study
If time series data, a plot of each variable against time
If relevant, a plot of the dependent variable vs. each of the explanatory variables

10 Technical Appendix (Cont.)
Statistical results, for example regression
Plot of the actual, fitted, and error values, and other diagnostics
Brief summary of the conclusions and meanings drawn from the exploratory, descriptive, and statistical analysis

11 Post-Midterm Review
Project I: Power 16
Contingency Table Analysis: Power 14, Lab 8
ANOVA: Power 15, Lab 9
Survival Analysis: Power 12, Power 11, Lab 7
Multi-variate Regression: Power 11, Lab 6

12 Slide Show Challenger disaster

13 Project I
Number of o-rings failing on launch i: y_i(#) = a + b*temp_i + e_i
OLS estimates are biased because of the zeros, even if the equation is divided by 6
Two ways to proceed:
  Tobit, non-linear estimation: y_i(#) = a + b*temp_i + e_i
  Bernoulli variable: probability models
Probability models: y_i(0,1) = a + b*temp_i + e_i

14 Project I (Cont.)
Probability models: y_i(0,1) = a + b*temp_i + e_i
  OLS, linear probability model: linear approximation to the sigmoid
  Probit: non-linear estimate of the sigmoid
  Logit: non-linear estimate of the sigmoid
Significant dependence on temperature
  t-test (or z-test) on the slope, H_0: b = 0
  F-test
  Wald test
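The linear probability model and the slope t-test listed above can be sketched in a few lines. This is a minimal illustration only: the temperatures and 0/1 outcomes below are made-up values, not the shuttle launch record, and `ols_simple` is a hypothetical helper name.

```python
import math

def ols_simple(x, y):
    """Fit y = a + b*x by least squares; return (a, b, t-statistic for b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)  # estimated error variance
    se_b = math.sqrt(s2 / sxx)                     # standard error of the slope
    return a, b, b / se_b

# Made-up illustration data (NOT the actual launch data).
temps = [53, 57, 63, 66, 67, 70, 72, 75, 78, 81]
failed = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]  # 1 = one or more o-rings failed

a, b, t_b = ols_simple(temps, failed)
p_at_31 = a + b * 31                   # linear extrapolation to 31 F
p_clipped = min(1.0, max(0.0, p_at_31))
print(f"slope={b:.4f}, t={t_b:.2f}, p(31F)={p_at_31:.2f}, clipped={p_clipped:.2f}")
```

With these toy numbers the extrapolated "probability" exceeds 1, which is the usual reason the linear model is treated as an approximation to the sigmoid and probit or logit are preferred for extrapolation.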

15 Project I (Cont.)
Plots of number or probability vs. temperature
Label the axes
Answer all parts, a-f
The most frequent sins:
  Did not explicitly address significance
  Did not answer b, 66°: all launches at lower temperatures had one or more o-ring failures
  Did not execute c, estimate the linear probability model

16 Challenger Disaster
Failure of o-rings that sealed grooves on the booster rockets
Was there any relationship between o-ring failure and temperature?
Engineers knew that the rubber o-rings hardened and were less flexible at low temperatures
But was there launch data that showed a problem?

17 Challenger Disaster
What: Was there a relationship between launch temperature and o-ring failure prior to the Challenger disaster?
Why: Should the launch have proceeded?
How: Analyze the relationship between launch temperature and o-ring failure

18 Launches Before Challenger
Data
  number of o-rings that failed
  launch temperature

19

20

21

22 Exploratory Analysis Launches where there was a problem

Orings temperature


25 Exploratory Analysis: All Launches
Plot of failures per observation versus temperature range shows temperature dependence
Mean temperature for the 7 launches with o-ring failures was lower (63.7) than for the 17 launches without o-ring failures
Contingency table analysis

26 Launches and O-Ring Failures (Yes/No)

27 Launches and O-Ring Failures (Yes/No) Expected/Observed

28 Launches and O-Ring Failures
Chi-square (2 dof) = 9.08; critical value (α = 0.05) = 6
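As a rough sketch of how a chi-square statistic for a contingency table is formed from observed and expected counts: the 2-by-3 table below is hypothetical (rows: failure yes/no; columns: low, medium, high temperature) and is not the table from the slides. The 5% critical value for 2 degrees of freedom is 5.99.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical counts, for illustration only.
observed = [
    [8, 5, 2],  # launches with a failure: low, medium, high temperature
    [2, 5, 8],  # launches without a failure
]
CRITICAL_5PCT_2DOF = 5.99

stat, dof = chi_square_independence(observed)
print(f"chi-square={stat:.2f}, dof={dof}, reject={stat > CRITICAL_5PCT_2DOF}")
```

The same comparison applies to the slide's result: 9.08 exceeds the critical value, so independence of failure and temperature range is rejected at the 5% level.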

Number of O-ring Failures Vs. Temperature

30 Logit Extrapolated to 31F: Probit extrapolated to 31F:

31 Extrapolating OLS to 31F: OLS: Tobit:

32 Conclusions
From extrapolating the probability models (linear probability, probit, or logit) to 31 F, there was a high probability of one or more o-rings failing
From extrapolating the number of o-rings failing (OLS or Tobit) to 31 F, 3 or more o-rings would fail
There had been only one launch out of 24 where as many as 3 o-rings had failed
Decision theory argument: expected cost/benefit ratio:

33 Conclusions Decision theory argument: expected cost/benefit ratio:

34 Ways to Analyze Challenger
Difference in mean temperatures for failures and successes
Difference in probability of one or more o-ring failures for high and low temperature ranges
Probability models: LPM (OLS), probit, logit
Number of o-ring failures per launch vs. temperature: OLS, Tobit
Contingency table analysis
ANOVA

35 Contingency Table Analysis Challenger example

36 Launches and O-Ring Failures (Yes/No)

37 ANOVA and O-Rings
Probability one or more o-rings fail
  Low temp: degrees
  Medium temp: degrees
  High temp: degrees
Average number of o-rings failing per launch
  Low temp: degrees
  Medium temp: degrees
  High temp: degrees

38 Probability one or more o-rings fails

39 Number of o-rings failing per launch

40

41 Outline ANOVA and Regression (Non-Parametric Statistics) (Goodman Log-Linear Model)

42 ANOVA and Regression: One-Way
Salesaj = c(1)*convenience + c(2)*quality + c(3)*price + e
E[Salesaj | convenience=1, quality=0, price=0] = c(1) = mean for city(1)
c(1) = mean for city(1) (convenience)
c(2) = mean for city(2) (quality)
c(3) = mean for city(3) (price)
Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)

43 One-Way ANOVA and Regression Regression Coefficients are the City Means; F statistic

44 ANOVA and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e
E[Salesaj | convenience=0, quality=0] = c(1) = mean for city(3) (price, the omitted one)
E[Salesaj | convenience=1, quality=0] = c(1) + c(2) = mean for city(1) (convenience)
c(1) = mean for city(3), the omitted city
c(2) = mean for city(1) minus mean for city(3)
Test that the mean for city(1) = mean for city(3) using the t-statistic for c(2)

45 ANOVA and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*price + e
E[Salesaj | convenience=0, price=0] = c(1) = mean for city(2) (quality, the omitted one)
E[Salesaj | convenience=1, price=0] = c(1) + c(2) = mean for city(1) (convenience)
c(1) = mean for city(2), the omitted city
c(2) = mean for city(1) minus mean for city(2)
Test that the mean for city(1) = mean for city(2) using the t-statistic for c(2)
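The link between the dummy-variable regressions and the city means can be checked numerically. The sketch below uses made-up sales figures (three observations per strategy), and `solve` and `ols` are hypothetical helper names; it fits both specifications by solving the normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) c = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up sales by marketing strategy (illustration only).
sales = {"convenience": [10, 12, 14], "quality": [20, 22, 24], "price": [15, 16, 17]}
y = [v for group in sales.values() for v in group]
labels = [name for name, group in sales.items() for _ in group]

# Specification 1: three dummies, no intercept -> coefficients are group means.
X1 = [[float(l == "convenience"), float(l == "quality"), float(l == "price")]
      for l in labels]
means = ols(X1, y)

# Specification 2: intercept plus two dummies, price omitted.
X2 = [[1.0, float(l == "convenience"), float(l == "quality")] for l in labels]
alt = ols(X2, y)

print("group means:", [round(c, 6) for c in means])
print("intercept and differences from omitted group:", [round(c, 6) for c in alt])
```

In the second fit the intercept reproduces the mean of the omitted group and each slope is the difference between a group mean and that omitted mean, which is why the t-statistic on a slope tests equality of two means.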

46 ANOVA and Regression: Two-Way
Series of regressions; compare to Table 11, Lecture 15
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR = 501,136.7
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + e, SSR = 502,746.3
Test for interaction effect: F(2, 54) = [(502,746.3 - 501,136.7)/2]/(501,136.7/54) = (1609.6/2)/9280.3 = 0.09

Table of Two-Way ANOVA for Apple Juice Sales

48 ANOVA and Regression: Two-Way
Series of regressions
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e, SSR = 515,918.3
Test for media effect: F(1, 54) = [(515,918.3 - 502,746.3)/1]/(501,136.7/54) = 13172/9280.3 = 1.42
Salesaj = c(1) + e, SSR =
Test for strategy effect: F(2, 54) = [( )/2]/(501,136.7/54) = ( /2)/(9280.3) = 5.32
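The F-tests in this series of regressions all have the same form: the increase in the sum of squared residuals from dropping a set of terms, per restriction, divided by the mean squared error of the full model. A minimal sketch using the SSR values quoted on slides 46 and 48 (the function name `f_from_ssr` is an assumption made for illustration):

```python
def f_from_ssr(ssr_restricted, ssr_less_restricted, num_restrictions,
               ssr_full, df_full):
    """F = [(SSR_restricted - SSR_less_restricted)/q] / (SSR_full/df_full)."""
    numerator = (ssr_restricted - ssr_less_restricted) / num_restrictions
    mse_full = ssr_full / df_full
    return numerator / mse_full

SSR_FULL = 501136.7            # strategy, media, and interaction terms
SSR_NO_INTERACTION = 502746.3  # strategy and media terms only
SSR_STRATEGY_ONLY = 515918.3   # strategy terms only
DF_FULL = 54

# Interaction effect: drop the two interaction terms.
f_interaction = f_from_ssr(SSR_NO_INTERACTION, SSR_FULL, 2, SSR_FULL, DF_FULL)

# Media effect: drop the television term as well.
f_media = f_from_ssr(SSR_STRATEGY_ONLY, SSR_NO_INTERACTION, 1, SSR_FULL, DF_FULL)

print(f"F interaction = {f_interaction:.2f}")
print(f"F media       = {f_media:.2f}")
```

The strategy-effect test follows the same pattern but needs the SSR of the constant-only regression, which the transcript does not preserve.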

49 Survival Analysis
Density, f(t)
Cumulative distribution function (CDF), F(t)
  Probability of failing by time t*: F(t*)
Survivor function, S(t) = 1 - F(t)
  Probability of surviving longer than t*: S(t*)
Kaplan-Meier estimates: (# at risk - # ending)/# at risk
Applications
  Testing a new drug

50 Chemotherapy Drug Taxol
Current standard for ovarian cancer is taxol and a platinate such as cisplatin
Previous standard was cyclophosphamide and cisplatin
Kaplan-Meier survival curves comparing the two regimens
Lab 7: (# at risk - # ending)/# at risk

51 Taxol (Bristol-Myers Squibb) interrupts cell division (mitosis). It is a cyclical hydrocarbon.

52 Top panel: European, Canadian, and Scottish: 342 at risk for Tc, 292 survived 1 year
Bottom panel: Gynecologic Oncology Group: 196 at risk for Tc, 168 survived 1 year
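The Kaplan-Meier ratio (# at risk - # ending)/# at risk can be applied directly to the counts on this slide. The sketch below assumes that every patient not counted as surviving the first year is an event (no censoring), which the transcript does not state; the multi-period counts are hypothetical and only show how the per-period ratios multiply into a survival curve.

```python
def km_step(at_risk, ending):
    """One-period survival ratio: (# at risk - # ending) / # at risk."""
    return (at_risk - ending) / at_risk

def kaplan_meier(periods):
    """Cumulative survival after each period from (at_risk, ending) pairs."""
    survival = 1.0
    curve = []
    for at_risk, ending in periods:
        survival *= km_step(at_risk, ending)
        curve.append(survival)
    return curve

# One-year survival from the slide's counts (assuming no censoring).
top_panel = km_step(342, 342 - 292)
bottom_panel = km_step(196, 196 - 168)
print(f"top panel: {top_panel:.3f}, bottom panel: {bottom_panel:.3f}")

# Hypothetical three-period example: (at risk, ending) in each period.
curve = kaplan_meier([(100, 10), (90, 9), (81, 0)])
print([round(s, 3) for s in curve])
```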

Final

54 Nonparametric Statistics What to do when the sample of observations is not distributed normally?

55 3 Nonparametric Techniques
Wilcoxon Rank Sum Test for independent samples
  Data Analysis Plus
Sign Test for matched pairs: rated data
  Eviews, Descriptive Statistics
Wilcoxon Signed Rank Sum Test for matched pairs: quantitative data
  Eviews

56 Wilcoxon Rank Sum Test for Independent Samples Testing the difference between the means of two populations when they are non-normal A New Painkiller Vs. Aspirin, Xm17-02

57 Rating scheme

58 Ratings

59 Rank the 30 Ratings
30 total ratings for both samples
3 ratings of 1
5 ratings of 2
etc.

continued

Rank Sum

63 Rank Sum, T
E(T) = n_1(n_1 + n_2 + 1)/2 = 15*31/2 = 232.5
VAR(T) = n_1*n_2(n_1 + n_2 + 1)/12 = 15*15*31/12 = 581.25, σ_T = 24.1
For sample sizes larger than 10, T is approximately normal
Z = [T - E(T)]/σ_T = (T - 232.5)/24.1 = 1.83
Null hypothesis: the central tendency for the two drugs is the same
Alternative hypothesis: the central tendency for the new drug is greater than for aspirin (one-tailed test)
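A short sketch of the rank-sum arithmetic: tied ratings receive the average of the ranks they occupy, and the rank sum T for one sample is standardized with the normal approximation. The helper names are hypothetical, and the rank sum passed in below (276.5) is an assumed value chosen for illustration, since the transcript does not preserve the observed T.

```python
import math

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(t, n1, n2):
    """Return (E(T), sigma_T, z) under the normal approximation."""
    expected = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return expected, sigma, (t - expected) / sigma

# Three ratings of 1 share rank 2; five ratings of 2 share rank 6.
tie_ranks = average_ranks([1, 1, 1, 2, 2, 2, 2, 2])
print(tie_ranks)

# Assumed rank sum for illustration; n1 = n2 = 15 as on the slide.
expected, sigma, z = rank_sum_z(276.5, 15, 15)
print(f"E(T)={expected}, sigma={sigma:.1f}, z={z:.2f}")
```

For a one-tailed test at the 5% level the z value is compared with 1.645.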
