Statistics for Decision Making Descriptive Statistics & Excel Intro to Probability Instructor: John Seydel, Ph.D. QM 2113 -- Fall 2003.

2 Statistics for Decision Making Descriptive Statistics & Excel Intro to Probability Instructor: John Seydel, Ph.D. QM 2113 -- Fall 2003

3 Student Objectives Use Excel to assist in performing basic univariate data analysis (for description) Perform descriptive bivariate analyses with help from Excel Quantitative variables Qualitative variables Understand concepts of events and probability Use standard probability notation Relate probability to relative frequency Define probability distribution

4 Miscellaneous Items Exam 1 Collect homework version Return exam  Grading  Comments Need help with Excel basics? (Probably!) Homework: Data analyses (using Excel) Probability distribution material

5 Univariate Analysis: Questions About the Exercise/Homework? Looking at the variation in Credits (from Exam 1) Univariate analysis tools Histogram (informal, visual analysis) Descriptive statistics  Measures of location  Measures of variation The analysis (via Excel) Basic descriptive statistics (n, min, max, xbar, s) Histogram:  No good Excel function  Need to create a flexible table/chart
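The basic descriptive statistics named on this slide (n, min, max, xbar, s) and a flexible frequency table for a histogram can be sketched outside Excel as well. The following is a minimal illustration in Python; the `credits` values and the `frequency_table` helper are hypothetical and do not reproduce the Exam 1 data.

```python
import statistics

# Hypothetical credit-hour values; the actual Exam 1 data is not reproduced here.
credits = [12, 15, 9, 18, 15, 12, 16, 14, 15, 6]

n = len(credits)
minimum = min(credits)
maximum = max(credits)
xbar = statistics.mean(credits)   # measure of location
s = statistics.stdev(credits)     # sample standard deviation (n - 1 denominator)


def frequency_table(values, start, width, bins):
    """Count how many values fall into each of `bins` classes of equal width."""
    counts = [0] * bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < bins:
            counts[idx] += 1
    return counts


# Classes 5-9, 10-14, 15-19; changing start/width/bins rebuilds the table.
table = frequency_table(credits, start=5, width=5, bins=3)

print(n, minimum, maximum, round(xbar, 2), round(s, 2))
print(table)
```

Keeping the class start, width, and count as parameters mirrors the idea of a flexible table: the histogram can be redrawn with different classes without touching the data.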

6 Bivariate Analysis: Questions About the Exercise/Homework? Hypothesis: experienced students are likely to be more familiar with issues Appropriate analysis Examine Level vs Credits That is, Level = β0 + β1 Credits Tools Scatterplot (i.e., XY chart) Regression (using Excel functions)  b0  b1  R²  s_yx Reference material: see Handouts page Data from Exam 1
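The four regression quantities listed here (b0, b1, R², s_yx) follow from the usual least-squares formulas. The sketch below computes them directly; the `simple_regression` function and the sample values are assumptions for illustration, not the course's data or method. Excel's INTERCEPT, SLOPE, RSQ, and STEYX functions should return the corresponding values.

```python
import math


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r2, s_yx)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = ybar - b1 * xbar          # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst           # share of variation in y explained by x
    s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate
    return b0, b1, r2, s_yx


# Hypothetical observations: credits completed and familiarity level.
credits = [15, 30, 45, 60, 90]
level = [1, 2, 2, 3, 4]

b0, b1, r2, s_yx = simple_regression(credits, level)
print(f"b0={b0:.3f} b1={b1:.4f} R2={r2:.3f} s_yx={s_yx:.3f}")
```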

7 Bivariate Analysis: Qualitative Data Hypothesis: students with certain majors may be more likely to favor publishing instructor evaluations (from Exam 1) Appropriate analysis Examine Favor vs Major But not, Favor = β0 + β1 Major That is, regression applies only to quantitative data Tools Crosstabulation (i.e., joint frequency table) Contingency analysis  Special versions of crosstabulation  Chi-square analysis  Beyond our scope, at least for now Excel feature that’s helpful: PivotTable
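A crosstabulation is just a count of each (Major, Favor) combination, which is what a PivotTable produces. As a rough sketch of the same idea in Python, with made-up responses standing in for the Exam 1 data:

```python
from collections import Counter

# Hypothetical (major, favor) survey responses; not the actual Exam 1 data.
responses = [
    ("Accounting", "Yes"), ("Accounting", "No"), ("Accounting", "Yes"),
    ("Marketing", "Yes"), ("Marketing", "Yes"),
    ("Finance", "No"), ("Finance", "No"), ("Finance", "Yes"),
]

joint = Counter(responses)  # joint frequency of each (major, favor) pair
majors = sorted({m for m, _ in responses})
answers = sorted({a for _, a in responses})

# Rows are majors, columns are answers; missing combinations count as 0.
crosstab = {m: {a: joint[(m, a)] for a in answers} for m in majors}
row_totals = {m: sum(crosstab[m].values()) for m in majors}
col_totals = {a: sum(crosstab[m][a] for m in majors) for a in answers}

for m in majors:
    print(m, crosstab[m], "total:", row_totals[m])
print("column totals:", col_totals)
```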

8 Now, Let’s Consider Probability Just a numeric way of expressing how certain we feel that a particular event will occur; measures chance Uses a scale of 0 to 1 (computations) Conversationally: 0% to 100% Alternatively, in terms of odds Can determine probability Theoretically Subjectively Empirically (i.e., using relative frequencies) Probability allows us to develop inferences based upon descriptive statistics

9 Some Foundations Basic notation: P(... ) is the probability that whatever’s inside the parentheses will occur, e.g.,  P(B) = probability that event B will occur  P(x=5) = probability that x will be 5  P(Raise) = probability that JoJo will get a raise  P(75) = probability that exam score will be 75 Definitive rules: 0.00 ≤ P(... ) ≤ 1.00 or 0% ≤ P(... ) ≤ 100% For an exhaustive & mutually exclusive set of events, Σ P(... ) = 1.00 Keep these in mind when doing calculations (i.e., the voice of reason)

10 Relative Frequency Regardless of method used to determine probability, it can be interpreted as relative frequency Recall that relative frequency is the observed proportion of time some event has occurred  Sites developed in-house  Incomes between $10,000 and $20,000 Probability is just the proportion of time we expect something to happen in the future given similar circumstances Note also, proportions are probabilities Example: ASU student credit-hours
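To make the link between relative frequency and probability concrete, the sketch below turns a frequency table into relative frequencies and checks the two rules from the Foundations slide: each value lies between 0 and 1, and the values for an exhaustive, mutually exclusive set of classes sum to 1. The class labels and counts are invented for illustration; they are not the ASU credit-hour figures.

```python
# Hypothetical frequency table of credit-hour classes; not the actual ASU data.
frequencies = {"0-29": 120, "30-59": 150, "60-89": 130, "90+": 100}

total = sum(frequencies.values())

# Relative frequency = observed proportion of students in each class.
relative = {cls: count / total for cls, count in frequencies.items()}

# Rule 1: every probability lies between 0 and 1.
all_in_range = all(0.0 <= p <= 1.0 for p in relative.values())

# Rule 2: the classes are exhaustive and mutually exclusive, so they sum to 1.
sums_to_one = abs(sum(relative.values()) - 1.0) < 1e-9

for cls, p in relative.items():
    print(f"P({cls}) = {p:.2f}")
print(all_in_range, sums_to_one)
```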

11 Getting some use out of Probability: Distributions Random variables are either Discrete  Limited possible outcomes  Examples: daily sales, defects, emergencies,... Or continuous  Infinite possible outcomes  Examples: waiting time, gas mileage, earnings/share,... Normal distributions: the most well known continuous distribution Let’s take a closer look at normally distributed random variables...

12 An Example You need a car that gets at least 30 mpg Suppose a particular model of car has been tested Average mpg = μ (not x-bar) = 34 Standard deviation = σ = 3 mpg Typically histograms for this type of thing look like a bell (histogram figure) That is, mpg is approximately normally distributed (aka the “bell curve”) Note: Percentage of area indicates probability!

13 If Something’s Normally Distributed It’s described by μ (the population/process average) and σ (the population/process standard deviation) Histogram is symmetric Thus no skew (average = median) So P(x > μ) =... ? Shape of histogram can be described by $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$ We determine probabilities based upon distance from the mean (i.e., the number of standard deviations)
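The density formula and the symmetry claim on this slide can be checked numerically. This is a small sketch assuming the mpg parameters from the example (mean 34, standard deviation 3); `normal_pdf` is a hypothetical helper written from the formula, compared against `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x, written directly from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


mu, sigma = 34.0, 3.0  # assumed from the mpg example

# Symmetry: the curve has the same height at equal distances from the mean.
left = normal_pdf(mu - 2.0, mu, sigma)
right = normal_pdf(mu + 2.0, mu, sigma)

# Half of the area lies above the mean.
upper_half = 1.0 - NormalDist(mu, sigma).cdf(mu)

print(left, right, upper_half)
```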

14 Back to Our Problem at Hand We need a car that gets at least 30 mpg How likely is it that this model of vehicle will meet our needs? That is, P(x > 30) =... ? First, sketch  Number line with Average Also x value of concern  Curve approximating histogram Identify areas of importance Then determine how many sigma 30 is from mu Now use the table Finally, put it all together
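The steps on this slide (count the standard deviations, look up the area, put it together) can be traced in a few lines. The sketch below stands in for the printed table by computing the area between the mean and z with `statistics.NormalDist`; `prob_greater` is a hypothetical helper name, and the result will differ slightly from a table lookup because the table rounds z to two decimals.

```python
from statistics import NormalDist


def prob_greater(x, mu, sigma):
    """P(X > x) built the table way: 0.5 plus or minus the area from mean to x."""
    z = (x - mu) / sigma                             # how many sigma x is from mu
    area_mean_to_x = NormalDist().cdf(abs(z)) - 0.5  # what the table would give
    if z < 0:
        # x is below the mean: the whole upper half plus the area back down to x.
        return 0.5 + area_mean_to_x
    # x is at or above the mean: the upper half minus the area up to x.
    return 0.5 - area_mean_to_x


p_meets_need = prob_greater(30.0, mu=34.0, sigma=3.0)
print(f"z = {(30.0 - 34.0) / 3.0:.2f}, P(x > 30) = {p_meets_need:.4f}")
```

With a probability of roughly 0.91, this model would meet the 30 mpg requirement about nine times out of ten, which is the number the decision is then based on.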

15 Comments on the Problem A sketch is essential! Use to identify regions of concern Enables putting together results of calculations, lookups, etc. Doesn’t need to be perfect; just needs to indicate relative positioning Make it large enough to work with; needs annotation (probabilities, comments, etc.) Now, what do we do with the probability we’ve just determined? Make a decision!

16 Some Other Exercises Let x ~ N(34,3) as with the mpg problem Determine Tail probabilities  F(30) which is the same as P(x ≤ 30)  P(x > 40) Tail complements  P(x > 30)  P(x < 40) Other  P(32 < x < 33)  P(30 < x < 35)  P(20 < x < 30)
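For checking hand calculations on these exercises, the following sketch evaluates each probability with `statistics.NormalDist`, reading N(34,3) as mean 34 and standard deviation 3 as in the mpg problem. Answers worked from a printed table may differ in the third or fourth decimal because of rounding z.

```python
from statistics import NormalDist

# x ~ N(34, 3): mean 34 mpg, standard deviation 3 mpg (as in the mpg problem).
mpg = NormalDist(mu=34, sigma=3)

results = {
    # Tail probabilities
    "F(30) = P(x <= 30)": mpg.cdf(30),
    "P(x > 40)": 1 - mpg.cdf(40),
    # Tail complements
    "P(x > 30)": 1 - mpg.cdf(30),
    "P(x < 40)": mpg.cdf(40),
    # Intervals: area up to the upper bound minus area up to the lower bound
    "P(32 < x < 33)": mpg.cdf(33) - mpg.cdf(32),
    "P(30 < x < 35)": mpg.cdf(35) - mpg.cdf(30),
    "P(20 < x < 30)": mpg.cdf(30) - mpg.cdf(20),
}

for label, p in results.items():
    print(f"{label}: {p:.4f}")
```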

17 Keep In Mind Probability = proportion of area under the normal curve What we get when we use tables is always the area between the mean and z standard deviations from the mean Because of symmetry P(x > μ) = P(x < μ) = 0.5000 Tables show probabilities rounded to 4 decimal places If z < -3.89 then probability ≈ 0.5000 If z > 3.89 then probability ≈ 0.5000 Theoretically, P(x = a) = 0 P(30 ≤ x ≤ 35) = P(30 < x < 35)

18 Summary of Objectives Use Excel to assist in performing basic univariate data analysis (for description) Perform descriptive bivariate analyses with help from Excel Quantitative variables Qualitative variables Understand concepts of events and probability Use standard probability notation Relate probability to relative frequency Define probability distribution

19 Appendix

20 Exam Comments Overall: pretty good! Exhibits Titles  Main  Vertical axis (or columns)  Horizontal axis (or rows) Names, not codes Units of measure (exhibits & answers) Main trouble spots Empirical Rule Histogram interpretation Confidence interval estimate Other issues/questions?

