BNAD 276: Statistical Inference in Management Winter, 2015

1 BNAD 276: Statistical Inference in Management Winter, 2015
Welcome Syllabi Green sheet Seating Chart

2 Daily group portfolios
Beginning of each lecture (first 5 minutes): Meet in groups of 3 or 4. Quiz one another on class material. Discuss the questions and determine the correct answer for each question.
Five copies (one for each group member, and typed) of multiple choice questions based on lecture. Include 4 options (a, b, c, and d). Include a name and describe a person in a certain situation. They can be funny or serious, and must be clear and have only one correct answer.
Example: Margaret was interested in taking a Statistics course. It is likely she was interested in studying which of the following?
a. economic theories of communism
b. theological perspectives of life after death
c. musical compositions of the 12th century
d. statistical techniques and inference

3 Please start portfolios

4 Homework Assignment
Go to D2L. Click on “Content”. Click on “Interactive Online Homework Assignments”. Complete the module: Seven Prototypical Designs

5 Height of Daughters (inches)
Height of Mothers (in)
Example description: This shows the strong positive (r = +0.8) relationship between the heights of daughters (in inches) and the heights of their mothers (in inches).
Description includes: both variables; strength (weak, moderate, strong); direction (positive, negative); estimated value (actual number).
Scatterplot checklist: both axes have real numbers listed; both axes and values are labeled; variable name is listed clearly.
Correlation worksheet (hand in):
1. Describe one positive correlation. Draw a scatterplot (label axes).
2. Describe one negative correlation. Draw a scatterplot (label axes).
3. Describe one zero correlation. Draw a scatterplot (label axes).
4. Describe one perfect correlation (positive or negative). Draw a scatterplot (label axes).
5. Describe a curvilinear relationship. Draw a scatterplot (label axes).

6 Overview Frequency distributions
The normal curve Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency 2) dispersion or 3) shape Mean, Median, Mode, Trimmed Mean Standard deviation, Variance, Range Mean Absolute Deviation Skewed right, skewed left unimodal, bimodal, symmetric

7 Another example: How many kids in your family?
Number of kids in family 1 4 3 2 1 8 4 2 2 14 14 4 2 1 4 2 2 3 1 8

8 Mean: The balance point of a distribution. Found
Measures of Central Tendency (Measures of location): the mean, median and mode. Measures of “location”: where on the number line the scores tend to cluster.
Mean: The balance point of a distribution. Found by adding up all observations and then dividing by the number of observations.
Mean for a sample: Σx / n = mean = x̄ (x-bar)
Mean for a population: ΣX / N = mean = µ (mu)
Note: Σ = add up; x or X = scores; n or N = number of scores

9 Number of kids in family
Measures of Central Tendency (Measures of location): the mean, median and mode.
Mean: The balance point of a distribution. Found by adding up all observations and then dividing by the number of observations.
Number of kids in family: 1 4 3 2 1 8 4 2 2 14
Mean for a sample: Σx / n = mean = x̄, so 41 / 10 = mean = 4.1
Note: Σ = add up; x or X = scores; n or N = number of scores
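
As a quick check of the arithmetic on this slide, here is a minimal Python sketch of the sample mean. The variable names are illustrative and not part of the lecture.

```python
# Family sizes from the slide (number of kids in each of 10 families).
kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]

total = sum(kids)   # Σx: add up all observations
n = len(kids)       # n: number of observations
mean = total / n    # x-bar = Σx / n

print(total, n, mean)  # 41 10 4.1
```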

10 How many kids are in your family? What is the most common family size?
Number of kids in family 1 3 1 4 2 4 2 8 2 14 How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least)

11 Number of kids in family
1 4 3 2 1 8 4 2 2 14 How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least)
Unordered: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14
Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14

12 Number of kids in family
Number of kids in family: 1 4 3 2 1 8 4 2 2 14 How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least)
Unordered: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14
Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14
The two middle values are 2 and 3, so the median is 2.5.
If there appear to be two medians, take the mean of the two. Median always has a percentile rank of 50% regardless of shape of distribution. Median is also called the 2nd Quartile.

13 Number of kids in family
Number of kids in family: 1 4 3 2 1 8 4 2 2 14 How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least)
Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14
Lower half: 1, 1, 2, 2, 2
Upper half: 3, 4, 4, 8, 14
1st Quartile: Middle number of lower half of scores
2nd Quartile: Middle number of all scores (Median) = 2.5
3rd Quartile: Middle number of upper half of scores
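
The quartile definitions above can be sketched in Python. This follows the slide's "middle of each half" rule; note that library routines such as `statistics.quantiles` use interpolation rules that may give slightly different quartiles, so the halves are computed by hand here.

```python
from statistics import median

kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]
ordered = sorted(kids)           # least to most

half = len(ordered) // 2
lower_half = ordered[:half]      # scores below the middle
upper_half = ordered[-half:]     # scores above the middle

q1 = median(lower_half)          # 1st quartile: middle of lower half
q2 = median(ordered)             # 2nd quartile: the median
q3 = median(upper_half)          # 3rd quartile: middle of upper half

print(ordered)
print(q1, q2, q3)
```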

14 Number of kids in family
Mode: The value of the most frequent observation
Number of kids in family: 1 3 1 4 2 4 2 8 2 14
Score   f
1       2
2       3
3       1
4       2
5       0
6       0
7       0
8       1
9       0
10      0
11      0
12      0
13      0
14      1
Please note: The mode is “2” because it is the most frequently occurring score. It occurs “3” times. “3” is not the mode, it is just the frequency for the value that is the mode.
Bimodal distribution: If there are two most frequent observations
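
A small Python sketch of the same frequency count, using only the standard library. It separates the mode (the score) from its frequency (how often it occurs), which is the distinction the note above is making.

```python
from collections import Counter
from statistics import multimode

kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

freq = Counter(kids)                       # score -> frequency (f)
mode_value, mode_frequency = freq.most_common(1)[0]

print(sorted(freq.items()))
print(mode_value, mode_frequency)          # the mode is 2; it occurs 3 times

# multimode returns every most-frequent value, so a bimodal
# data set would give a list with two entries.
print(multimode(kids))
```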

15 What about central tendency for qualitative data?
Mode is good for nominal or ordinal data Median can be used with ordinal data Mean can be used with interval or ratio data

16 Overview Frequency distributions
The normal curve Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency 2) dispersion or 3) shape Mean, Median, Mode, Trimmed Mean Skewed right, skewed left unimodal, bimodal, symmetric

18 A little more about frequency distributions
An example of a normal distribution

23 Measure of central tendency: describes how scores tend to
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Normal distribution In all distributions: mode = tallest point median = middle score mean = balance point In a normal distribution: mode = mean = median

24 Positively skewed distribution
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Positively skewed distribution In all distributions: mode = tallest point median = middle score mean = balance point In a positively skewed distribution: mode < median < mean Note: mean is most affected by outliers or skewed distributions With Bill Gates our Average Income would be $38 million a year
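
To see why the mean is the measure most affected by outliers, here is a minimal sketch. The incomes are hypothetical numbers chosen for illustration, not the class data behind the $38 million figure.

```python
from statistics import mean, median

# Hypothetical yearly incomes for five people (illustrative only).
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]

# Add one extreme earner to create a strong positive skew.
with_outlier = incomes + [1_000_000_000]

print(mean(incomes), median(incomes))
print(mean(with_outlier), median(with_outlier))
```

The median barely moves when the extreme score is added, while the mean jumps far above every typical score, which matches the ordering mode < median < mean for a positively skewed distribution.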

25 Measure of central tendency: describes how scores tend to
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Negatively skewed distribution In all distributions: mode = tallest point median = middle score mean = balance point In a negatively skewed distribution: mean < median < mode Note: mean is most affected by outliers or skewed distributions

26 Mode: The value of the most frequent observation
Bimodal distribution: Distribution with two most frequent observations (2 peaks) Example: Ian coaches two boys’ baseball teams. One team is made up of 10-year-olds and the other is made up of 16-year-olds. When he measured the height of all of his players, he found a bimodal distribution.

27 Overview Frequency distributions
The normal curve Mean, Median, Mode, Trimmed Mean Standard deviation, Variance, Range Mean Absolute Deviation Skewed right, skewed left unimodal, bimodal, symmetric

28 Frequency distributions
The normal curve

29 Some distributions are more
Variability: Some distributions are more variable than others. Let’s say this is our distribution of heights of men on U of A baseball team. Mean is 6 feet tall. What might this be?
[Figure: height distributions drawn on an axis running 5’, 5’6”, 6’, 6’6”, 7’]

30 Dispersion: Variability
Some distributions are more variable than others. The larger the variability the wider the curve tends to be. The smaller the variability the narrower the curve tends to be.
Range: The difference between the largest and smallest observations. Range for distribution A? Range for distribution B? Range for distribution C?
[Figure: three distributions, A, B and C, drawn on an axis running 5’, 5’6”, 6’, 6’6”, 7’]

31 84” – 70” = 14” Wildcats Basketball team:
Tallest player = 84” (same as 7’0”) (Kaleb Tarczewski and Dusan Ristic). Shortest player = 70” (same as 5’10”) (Parker Jackson-Cartwright). Fun fact: Mean is 78. Range: The difference between the largest and smallest scores. xmax - xmin = Range. 84” - 70” = 14”, so the range is 14”.

32 No reference is made to numbers between the min and max
Wildcats Baseball team: Tallest player = 77” (same as 6’5”) (Austin Schnabel). Shortest player = 69” (same as 5’9”) (Justin Behnke and Ernie DeLaTrinidad). Fun fact: Mean is 72. Range: The difference between the largest and smallest score. xmax - xmin = Range. 77” - 69” = 8”, so the range is 8”. Please note: No reference is made to numbers between the min and max
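
The range calculation for both teams can be sketched as below. Only the tallest and shortest heights come from the slides; the middle heights in the last line are hypothetical, added to show that scores between the min and max do not change the range.

```python
def value_range(scores):
    """Range = largest score minus smallest score (xmax - xmin)."""
    return max(scores) - min(scores)

basketball_range = value_range([84, 70])   # inches
baseball_range = value_range([77, 69])     # inches

# Hypothetical in-between heights: the range ignores them entirely.
with_middle_scores = value_range([70, 74, 78, 81, 84])

print(basketball_range, baseball_range, with_middle_scores)  # 14 8 14
```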

33 Generally, (on average) how far away is each score from the mean?
Variability Standard deviation: The average amount by which observations deviate on either side of their mean Generally, (on average) how far away is each score from the mean? Mean is 6’

34 Let’s build it up again… U of A Baseball team
Deviation scores. Let’s build it up again… U of A Baseball team. Diallo is 6’0”. Diallo’s deviation score is 0: 6’0” - 6’0” = 0.
[Axis: 5’8” 5’10” 6’0” 6’2” 6’4”]

35 Let’s build it up again… U of A Baseball team
Deviation scores. Let’s build it up again… U of A Baseball team. Diallo is 6’0”. Diallo’s deviation score is 0. Preston is 6’2”. Preston’s deviation score is 2”: 6’2” - 6’0” = 2”.
[Axis: 5’8” 5’10” 6’0” 6’2” 6’4”]

36 Let’s build it up again… U of A Baseball team
Deviation scores. Let’s build it up again… U of A Baseball team. Diallo is 6’0”. Diallo’s deviation score is 0. Preston is 6’2”. Preston’s deviation score is 2”. Mike is 5’8”. Mike’s deviation score is -4”: 5’8” - 6’0” = -4”. Hunter is 5’10”. Hunter’s deviation score is -2”: 5’10” - 6’0” = -2”.
[Axis: 5’8” 5’10” 6’0” 6’2” 6’4”]

37 Let’s build it up again… U of A Baseball team
Deviation scores. Let’s build it up again… U of A Baseball team. Diallo’s deviation score is 0. Preston’s deviation score is 2”. Mike’s deviation score is -4”. Hunter’s deviation score is -2”. Shea is 6’4”. Shea’s deviation score is 4”: 6’4” - 6’0” = 4”. David is 6’0”. David’s deviation score is 0: 6’0” - 6’0” = 0.
[Axis: 5’8” 5’10” 6’0” 6’2” 6’4”]

38 Let’s build it up again… U of A Baseball team
Deviation scores. Let’s build it up again… U of A Baseball team. Diallo’s deviation score is 0. Preston’s deviation score is 2”. Mike’s deviation score is -4”. Hunter’s deviation score is -2”. Shea’s deviation score is 4”. David’s deviation score is 0.
[Axis: 5’8” 5’10” 6’0” 6’2” 6’4”]

39 Let’s build it up again… U of A Baseball team
Deviation scores Diallo is 0” Let’s build it up again… U of A Baseball team Preston is 2” Mike is -4” Hunter is -2 Shea is 4 David is 0” 5’8” 5’10” 6’0” 6’2” 6’4”

40 Standard deviation: The average amount
Deviation scores Standard deviation: The average amount by which observations deviate on either side of their mean Diallo is 0” Preston is 2” Mike is -4” Hunter is -2 Shea is 4 David is 0” 5’8” 5’10” 6’0” 6’2” 6’4”

43 How far away is each score from the mean?
Deviation scores (x - µ): The amount by which observations deviate on either side of their mean. How far away is each score from the mean? Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
How do we find each deviation score? (x - µ): Find the distance of each person from the mean (subtract the mean from their score).
5’8” - 6’0” = -4”; 5’9” - 6’0” = -3”; 5’10” - 6’0” = -2”; 5’11” - 6’0” = -1”; 6’0” - 6’0” = 0; 6’1” - 6’0” = +1”; 6’2” - 6’0” = +2”; 6’3” - 6’0” = +3”; 6’4” - 6’0” = +4”

44 How far away is each score from the mean?
Deviation scores (x - µ): The amount by which observations deviate on either side of their mean. How far away is each score from the mean? Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
5’8” - 6’0” = -4”; 5’9” - 6’0” = -3”; 5’10” - 6’0” = -2”; 5’11” - 6’0” = -1”; 6’0” - 6’0” = 0; 6’1” - 6’0” = +1”; 6’2” - 6’0” = +2”; 6’3” - 6’0” = +3”; 6’4” - 6’0” = +4”
Remember: it’s relative to the mean. Based on difference from the mean.

45 How far away is each score from the mean?
Standard deviation: The average amount by which observations deviate on either side of their mean. How far away is each score from the mean? Deviation scores (x - µ): Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
5’8” - 6’0” = -4”; 5’9” - 6’0” = -3”; 5’10” - 6’0” = -2”; 5’11” - 6’0” = -1”; 6’0” - 6’0” = 0; 6’1” - 6’0” = +1”; 6’2” - 6’0” = +2”; 6’3” - 6’0” = +3”; 6’4” - 6’0” = +4”
How do we find the average height? Σx / N = average height
How do we find the average spread? Σ(x - µ) / N = average deviation
Add up the deviation scores: Σ(x - µ) = ? For a population, Σ(x - µ) = 0. For a sample, Σ(x - x̄) = 0.

46 How far away is each score from the mean?
Standard deviation: The average amount by which observations deviate on either side of their mean. How far away is each score from the mean? Deviation scores (x - µ): Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
5’8” - 6’0” = -4”; 5’9” - 6’0” = -3”; 5’10” - 6’0” = -2”; 5’11” - 6’0” = -1”; 6’0” - 6’0” = 0; 6’1” - 6’0” = +1”; 6’2” - 6’0” = +2”; 6’3” - 6’0” = +3”; 6’4” - 6’0” = +4”
Big problem: Σ(x - µ) = 0 and Σ(x - x̄) = 0, so the average deviation Σ(x - µ) / N always comes out to 0.
Square the deviations: use Σ(x - µ)² for a population and Σ(x - x̄)² for a sample.
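
The "big problem" and its fix can be checked with a short Python sketch. It assumes the six named players are the whole group, with heights in inches read off the slides (6’0” = 72, and so on).

```python
# Heights in inches for the six players named on the slides.
heights = {"Diallo": 72, "Preston": 74, "Mike": 68,
           "Hunter": 70, "Shea": 76, "David": 72}

mean = sum(heights.values()) / len(heights)      # 72.0, i.e. 6'0"

# Deviation score = score minus the mean (x - mu).
deviations = {name: h - mean for name, h in heights.items()}

sum_of_deviations = sum(deviations.values())     # positives cancel negatives
sum_of_squares = sum(d ** 2 for d in deviations.values())

print(deviations)
print(sum_of_deviations, sum_of_squares)
```

Squaring makes every deviation non-negative, so the sum no longer cancels to zero.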

47 These would be helpful to know by heart – please memorize
Standard deviation: The average amount by which observations deviate on either side of their mean. These would be helpful to know by heart; please memorize these formulas.
Population: σ = √( Σ(x - µ)² / N )
Sample: s = √( Σ(x - x̄)² / (n - 1) )

48 What do these two formulas have in common?
Standard deviation: The average amount by which observations deviate on either side of their mean. What do these two formulas have in common?

50 How do these formulas differ?
Standard deviation: The average amount by which observations deviate on either side of their mean. “n-1” is “Degrees of Freedom”. How do these formulas differ?
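
One way to see the difference is to compute both versions on the same scores. This is a sketch under the assumption that the population formula divides the sum of squares by N and the sample formula divides by n - 1 (the degrees of freedom); Python's `statistics.pstdev` and `statistics.stdev` are used as a cross-check.

```python
from math import sqrt
from statistics import pstdev, stdev

# Player heights in inches from the deviation-score slides.
heights = [72, 74, 68, 70, 76, 72]

n = len(heights)
mean = sum(heights) / n
sum_of_squares = sum((x - mean) ** 2 for x in heights)

sigma = sqrt(sum_of_squares / n)        # population: divide by N
s = sqrt(sum_of_squares / (n - 1))      # sample: divide by n - 1

print(round(sigma, 3), round(pstdev(heights), 3))
print(round(s, 3), round(stdev(heights), 3))
```

Dividing by the smaller number n - 1 always makes the sample value a little larger than the population value for the same scores.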

51 “Sum of Squares”
Standard deviation: The average amount by which observations deviate on either side of their mean. Generally, (on average) how far away is each score from the mean? The numerator of each formula is the “Sum of Squares”. “n-1” is “Degrees of Freedom”.
Deviation scores: Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”. Remember, it’s relative to the mean: based on difference from the mean. Remember, we are thinking in terms of “deviations”. Please memorize these.

52 Standard deviation (definitional formula) - Let’s do one
Step 1: Find the mean. ΣX = 45, so ΣX / N = 45 / 9 = 5
Step 2: Subtract the mean from each score (each of these is a deviation score)
Step 3: Square the deviations
X    X - µ         (X - µ)²
1    1 - 5 = -4    16
2    2 - 5 = -3    9
3    3 - 5 = -2    4
4    4 - 5 = -1    1
5    5 - 5 = 0     0
6    6 - 5 = 1     1
7    7 - 5 = 2     4
8    8 - 5 = 3     9
9    9 - 5 = 4     16
Sums: ΣX = 45; Σ(X - µ) = 0; Σ(X - µ)² = 60 (this numerator is called the “sum of squares”)
Step 4: Find the standard deviation
a) 60 / 9 = 6.67. This is the Variance!
b) square root of 6.67 = 2.58. This is the standard deviation!

53 Another example: How many kids in your family?
3 4 2 1 4 2 2 3 1 8

54 Standard deviation - Let’s do one
Definitional formula. How many kids?
Step 1: Find the mean. Σx = 30, so 30 / 10 = 3
Step 2: Subtract the mean from each score (deviations)
Step 3: Square the deviations
X    X - µ         (X - µ)²
3    3 - 3 = 0     0
4    4 - 3 = 1     1
2    2 - 3 = -1    1
1    1 - 3 = -2    4
4    4 - 3 = 1     1
2    2 - 3 = -1    1
2    2 - 3 = -1    1
3    3 - 3 = 0     0
1    1 - 3 = -2    4
8    8 - 3 = 5     25
Step 4: Add up the squared deviations. Σ(x - µ) = 0; Σ(x - µ)² = 38
Step 5: Find the standard deviation
a) 38 / 10 = 3.8. This is the Variance!
b) square root of 3.8 = 1.95. This is the standard deviation!
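
The definitional steps used in the two worked examples (scores 1 through 9, and the kids-per-family data) can be sketched as one Python function. The function name and return layout are illustrative choices, not part of the lecture.

```python
from math import sqrt

def population_sd(scores):
    """Return (mean, sum of squares, variance, standard deviation)."""
    n = len(scores)
    mean = sum(scores) / n                      # find the mean
    deviations = [x - mean for x in scores]     # subtract the mean
    squared = [d ** 2 for d in deviations]      # square the deviations
    sum_of_squares = sum(squared)               # add them up
    variance = sum_of_squares / n               # divide by N
    return mean, sum_of_squares, variance, sqrt(variance)

one_to_nine = list(range(1, 10))
kids = [3, 4, 2, 1, 4, 2, 2, 3, 1, 8]

for data in (one_to_nine, kids):
    mean, ss, variance, sd = population_sd(data)
    print(mean, ss, round(variance, 2), round(sd, 2))
```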

55 These would be helpful to know by heart – please memorize areas
1 sd above and below mean 68% 2 sd above and below mean 95% 3 sd above and below mean 99.7% These would be helpful to know by heart – please memorize areas
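
These areas can be reproduced with Python's standard-library normal distribution. The sketch below shows that 68%, 95% and 99.7% are rounded values for a normal curve.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1

areas = {}
for k in (1, 2, 3):
    # Area under the curve between k sd below and k sd above the mean.
    areas[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(k, round(areas[k] * 100, 1))
```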

56 Raw scores, z scores & probabilities
Please note spatially where 1 standard deviation falls on the curve 68% 95% 99.7%

57 Raw scores, z scores & probabilities
Please note spatially where 1 standard deviation falls on the curve

58 Raw scores, z scores & probabilities
1 sd above and below mean 68% z = +1 z = -1 Mean = 50 S = 10 (Note S = standard deviation) If we go up one standard deviation z score = +1.0 and raw score = 60 If we go down one standard deviation z score = -1.0 and raw score = 40

59 Raw scores, z scores & probabilities
2 sd above and below mean 95% z = -2 z = +2 Mean = 50 S = 10 (Note S = standard deviation) If we go up two standard deviations z score = +2.0 and raw score = 70 If we go down two standard deviations z score = -2.0 and raw score = 30

60 Raw scores, z scores & probabilities
3 sd above and below mean 99.7% z = +3 z = -3 Mean = 50 S = 10 (Note S = standard deviation) If we go up three standard deviations z score = +3.0 and raw score = 80 If we go down three standard deviations z score = -3.0 and raw score = 20

61 z score = (raw score - mean) / standard deviation
z score: A score that indicates how many standard deviations an observation is above or below the mean of the distribution. z score = (raw score - mean) / standard deviation
1 sd above and below mean (68%): If we go up one standard deviation, z score = +1.0 and raw score = 105. If we go down one standard deviation, z score = -1.0 and raw score = 95.
2 sd above and below mean (95%): If we go up two standard deviations, z score = +2.0 and raw score = 110. If we go down two standard deviations, z score = -2.0 and raw score = 90.
3 sd above and below mean (99.7%): If we go up three standard deviations, z score = +3.0 and raw score = 115. If we go down three standard deviations, z score = -3.0 and raw score = 85.
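
A minimal sketch of the z score formula and its inverse. The first example uses Mean = 50 and S = 10 from the earlier slides; the second assumes a mean of 100 and a standard deviation of 5, which is inferred from the raw scores on this slide rather than stated on it.

```python
def z_score(raw, mean, sd):
    """How many standard deviations a raw score is from the mean."""
    return (raw - mean) / sd

def raw_score(z, mean, sd):
    """Convert a z score back to a raw score."""
    return mean + z * sd

print(z_score(60, mean=50, sd=10))      # one sd above the mean
print(z_score(30, mean=50, sd=10))      # two sd below the mean

# Assumed mean 100 and sd 5 (inferred, not stated on the slide).
print([raw_score(z, mean=100, sd=5) for z in (-3, -2, -1, 1, 2, 3)])
```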

62 Writing Assignment – Pop Quiz
1. What is a “deviation score”?
2. Preston has a deviation score of 2: What does that tell us about Preston? Is he taller or shorter than the mean? And by how much? Are most people in the group taller or shorter than Preston?
3. Mike has a deviation score of -4: What does that tell us about Mike? Are most people in the group taller or shorter than Mike?
4. Diallo has a deviation score of 0: What does that tell us about Diallo? Are most people in the group taller or shorter than Diallo?
5. Please write the formula for the standard deviation of a population
6. Please draw 3 curves showing 1, 2 & 3 standard deviations from mean

63 Writing Assignment – Pop Quiz
7. What does this symbol refer to? What is it called? What does it mean? Is it referring to a sample or population?
8. What does this symbol refer to? What is it called? What does it mean? Is it referring to a sample or population?
9. What does this symbol refer to? What is it called? What does it mean? Is it referring to a sample or population?
10. What does this symbol refer to? What is it called? What does it mean? Is it referring to a sample or population?
11. What does this symbol refer to?

64 Writing Assignment – Pop Quiz
12. What does this refer to? What are they called? What do they refer to? How are they different?
13. What does this refer to? What are they called? How are they different?
14. What do these two refer to? What are they called? How are they different?
15. What does this refer to? What is it called? Use it for sample data or population?

65 Writing Assignment – Pop Quiz
16. What does this refer to? What are they called? What do they refer to? How are they different?
17. What does this refer to? What are they called? What do they refer to? How are they different?

66 Writing Assignment – Pop Quiz
1. What is a “deviation score”? Answer: How far away each score is from the mean.
2. Preston has a deviation score of 2: What does that tell us about Preston? Is he taller or shorter than the mean? And by how much? Are most people in the group taller or shorter than Preston? Answer: Preston is 2” taller than the mean (taller than most).
3. Mike has a deviation score of -4: What does that tell us about Mike? Are most people in the group taller or shorter than Mike? Answer: Mike is 4” shorter than the mean (shorter than most).
4. Diallo has a deviation score of 0: What does that tell us about Diallo? Are most people in the group taller or shorter than Diallo? Answer: Diallo is exactly the same height as the mean (half taller, half shorter).
5. Please write the formula for the standard deviation of a population
6. Please draw 3 curves showing 1, 2 & 3 standard deviations from mean

67 Writing Assignment – Pop Quiz
7. What does this symbol refer to? The standard deviation (population). What is it called? sigma. Is it referring to a sample or population? population
8. What does this symbol refer to? The mean (population). What is it called? mu. Is it referring to a sample or population? population
9. What does this symbol refer to? The mean (sample). What is it called? x-bar. Is it referring to a sample or population? sample
10. What does this symbol refer to? The standard deviation (sample). What is it called? s. Is it referring to a sample or population? sample
11. What does this symbol refer to? Each individual score

68 Writing Assignment – Pop Quiz
12. What does this refer to? Variance (one formula for a population, one for a sample). What are they called? Sigma squared (population) and S squared (sample). What do they refer to? How are they different?
13. What does this refer to? Deviation scores (one for a population, one for a sample). What are they called? How are they different?
14. What do these two refer to? Sum of squares (one for a sample, one for a population). What are they called? How are they different?
15. What does this refer to? Degrees of freedom. What is it called? Use it for sample data or population?

69 Writing Assignment – Pop Quiz
16. What does this refer to? Standard deviation (one formula for a sample, one for a population). What are they called? What do they refer to? How are they different?
17. What does this refer to? Variance (one formula for a sample, one for a population). What are they called? What do they refer to? How are they different?

70 Connecting intentions of studies with Experimental Methodologies
Appropriate statistical analyses. Appropriate graphs. Today I want to present some “typical designs”. We will spend the next couple of weeks filling in the details. We’ll come back to these distinctions over and over again, and build on them for the rest of the semester. Let’s get this overview down well! Don’t worry about calculation details for now.

71 Homework Assignment
Homework Assignment: Create an example of each type. Identify IV (one or two). Identify DV (one or two). Draw a possible graph for each. Think about this as we work through each type of study.
Study Type 1: Confidence Intervals
Study Type 2: t-test
Study Type 3: One-way Analysis of Variance (ANOVA)
Study Type 4: Two-way Analysis of Variance (ANOVA)
Study Type 5: Correlation
Study Type 6: Simple and Multiple regression
Study Type 7: Chi Square
We’ll come back to these distinctions over and over again, and build on them for the rest of the class. Let’s get this overview down well! Don’t worry about calculation details for now.

72 Study Type 1: Confidence Intervals
Remember, this is just an introduction to the idea. Don’t worry about calculation details for now; we will get to those soon. Study Type 1: Confidence Intervals. On average newborns weigh 7 pounds, and are 20 inches long. My sister just had a baby - guess how much it weighs? Guess the mean. Makes sense, right?!? On average you would be right most often if you always guessed the mean. Point estimate versus confidence interval: Guessing a single number versus a range of numbers. What if you really needed to be right?!!? You could guess a range with smallest and largest possible scores (how wide a range to be completely sure?). Confidence interval: Guessing a range (max and min) and assigning a level of confidence that the score falls in that range

73 Study Type 1: Confidence Intervals
Remember, this is just an introduction to the idea. Don’t worry about calculation details for now; we will get to those soon. Study Type 1: Confidence Intervals. Confidence Intervals: A range of values that, with a known degree of certainty, includes an unknown population characteristic, such as a population mean.
100% Confidence Interval: We can be 100% confident that our population mean falls between these two scores (guess absurdly large and small values)
99% Confidence Interval: We can be 99% confident that our population mean falls between these two scores
95% Confidence Interval: We can be 95% confident that our population mean falls between these two scores
Which has a wider interval relative to raw scores, 95% or 99%?
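
For readers who want a preview of the calculation, here is a rough sketch of a z-based confidence interval for a mean. Everything numeric in it is hypothetical: the sample mean of 7 pounds echoes the newborn example, but the standard deviation (1.2 pounds) and sample size (100) are made-up values, and the course may later use a different method (for example, t-based intervals).

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, level):
    """z-based interval for a mean, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values for illustration only.
low95, high95 = z_confidence_interval(7.0, sigma=1.2, n=100, level=0.95)
low99, high99 = z_confidence_interval(7.0, sigma=1.2, n=100, level=0.99)

print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

Asking for more confidence widens the interval, so the 99% interval comes out wider than the 95% one.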

74 Study Type 1: Confidence Intervals
Remember, this is just an introduction to the idea. Don’t worry about calculation details for now; we will get to those soon. Study Type 1: Confidence Intervals. Confidence Intervals: A range of values that, with a known degree of certainty, includes an unknown population characteristic, such as a population mean.
You can use the mean of a sample to guess: the mean of the population; the mean of a smaller sample; the most likely score for an individual.
In this sample of 10,000 newborns, the mean weight is 7 pounds. What do you think the minimum and maximum weights would be to capture 95% of all newborns?
In this sample of 1000 flights, the mean number of empty seats is 12. What do you think the minimum and maximum number of empty seats are likely to be in the flights today with a 95% level of certainty?
This sample of 500 households produced a mean income of $35,000 a year. What do you think the minimum and maximum income levels are so that we are 95% confident that we captured Mabel’s?

75 We are looking to compare two means
Study Type 1: Confidence Intervals Study Type 2: t-test Comparing Two Means? Use a t-test We are looking to compare two means

76 Study Type 2: t-test analysis
Single Independent Variable (categorical) comparing two groups Single Dependent Variable (numeric) Used to test the effect of the IV on the DV Andrea was interested in the effect of vacation time on productivity of the workers in her department. She randomly assigned workers into two groups, she allowed one group to go on vacation while the other group had no vacation. After the vacation she measured productivity for the two groups. Independent Variable Dependent Variable Between or within Quasi or true Causal relationship? Productivity Yes Vacation No Vacation

77 workers in her department. She randomly assigned workers into two
Andrea was interested in the effect of vacation time on productivity of the workers in her department. She randomly assigned workers into two groups, she allowed one group to go on vacation while the other group had no vacation. After the vacation she measured productivity for the two groups. This is an example of a true experiment. Dependent variable is always quantitative If “true” experiment (randomly assigned to groups) we can conclude that vacation had an effect - it increased productivity In t-test, independent variable is qualitative (with two groups) If “quasi” experiment (not randomly assigned to groups), we can conclude only that data suggest that vacation may have had an effect; productivity increased for those who went on vacation, but we can’t rule out other explanations.

78 p < 0.05 is most common value
Study Type 2: t-test analysis. Single Independent Variable (categorical) comparing two groups. Single Dependent Variable (numerical/continuous). Comparing two means (2 bars on graph). Used to test the effect of the IV on the DV. Please note: a t-test allows us to compare two means. If the means are statistically different, we say that there is a “real” difference that is not just due to chance: we say there is a statistically significant difference (p < 0.05). p < 0.05 is the most common value; the “p value” can vary (p < 0.01, or p < 0.001)
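
As a preview only, the sketch below computes an independent-samples t statistic by hand for Andrea's vacation study. The productivity scores are hypothetical, and the pooled-variance form is one common version of the test, assumed here rather than taken from the slides.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical productivity scores (illustrative only).
vacation = [52, 55, 58, 60, 61, 63]
no_vacation = [48, 50, 51, 53, 55, 57]

n1, n2 = len(vacation), len(no_vacation)
df = n1 + n2 - 2                                   # degrees of freedom

# Pooled variance: weighted average of the two sample variances.
pooled = ((n1 - 1) * variance(vacation) +
          (n2 - 1) * variance(no_vacation)) / df

standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
t = (mean(vacation) - mean(no_vacation)) / standard_error

print(df, round(t, 2))
```

With 10 degrees of freedom, the two-tailed critical value from a standard t table at p < 0.05 is about 2.228, so a t statistic larger than that would count as a statistically significant difference between the two means.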

79 Comparing more than two means
Study Type 1: Confidence Intervals Study Type 2: t-test Study Type 3: One-way Analysis of Variance (ANOVA) Comparing more than two means

80 Study Type 3: One-way ANOVA
Single Independent Variable comparing more than two groups Single Dependent Variable (numerical/continuous) Used to test the effect of the IV on the DV Ian was interested in the effect of incentives for girl scouts on the number of cookies sold. He randomly assigned girl scouts into one of three groups. The three groups were given one of three incentives and looked to see who sold more cookies. The 3 incentives were 1) Trip to Hawaii, 2) New Bike or 3) Nothing. This is an example of a true experiment How could we make this a quasi-experiment? Independent Variable: Type of incentive Levels of Independent Variable: None, Bike, Trip to Hawaii Dependent Variable: Number of cookies sold Levels of Dependent Variable: 1, 2, 3 up to max sold Between participant design Causal relationship: Incentive had an effect – it increased sales

81 Study Type 3: One-way ANOVA
Single Independent Variable comparing more than two groups Single Dependent Variable (numerical/continuous) Used to test the effect of the IV on the DV Ian was interested in the effect of incentives for girl scouts on the number of cookies sold. He randomly assigned girl scouts into one of three groups. The three groups were given one of three incentives and looked to see who sold more cookies. The 3 incentives were 1) Trip to Hawaii, 2) New Bike or 3) Nothing. This is an example of a true experiment Dependent variable is always quantitative Sales Sales None New Bike Trip Hawaii None New Bike Trip Hawaii In an ANOVA, independent variable is qualitative (& more than two groups)
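
As a preview of what a one-way ANOVA computes, the sketch below builds the F ratio by hand for three incentive groups. The cookie sales figures are hypothetical, and the calculation details are an assumption about the standard method, which the slides defer until later.

```python
from statistics import mean

# Hypothetical boxes of cookies sold per scout (illustrative only).
groups = {
    "None": [10, 12, 11, 9],
    "New Bike": [14, 15, 13, 16],
    "Trip to Hawaii": [20, 18, 19, 21],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Variability of the group means around the grand mean.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
# Variability of individual scores around their own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(df_between, df_within, round(f_ratio, 1))
```

A large F ratio means the group means differ by much more than the scores vary within each group.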

82 Comparing two independent variables
Study Type 1: Confidence Intervals Study Type 2: t-test Study Type 3: One-way Analysis of Variance (ANOVA) Study Type 4: Two-way Analysis of Variance (ANOVA) Comparing two independent variables Each one has multiple levels

83 Study Type 4: Two-way ANOVA
“Two-way” = “Two IVs”. Ian was interested in the effect of incentives (and age) for Girl Scouts on the number of cookies sold. He randomly assigned Girl Scouts to one of three groups. Each group was given one of three incentives, and he looked to see who sold more cookies. The 3 incentives were: 1) Trip to Hawaii, 2) New Bike or 3) Nothing. He also measured the scouts’ ages. Independent Variable #1: incentive. Independent Variable #2: age. Dependent Variable: number of cookies sold.

84 Study Type 4: Two-way ANOVA
Multiple Independent Variables (categorical), each variable comparing two or more groups. Single Dependent Variable (numerical/continuous). Used to test the effect of two IVs on the DV. Independent Variable #1: Type of incentive. Levels of Independent Variable #1: None, Bike, Trip to Hawaii. Independent Variable #2: Age. Levels of Independent Variable #2: Elementary girls versus college. Dependent Variable: Number of cookies sold. Levels of Dependent Variable: 1, 2, 3 up to max sold. Between participant design. Results: Incentive had an effect; it increased sales. Data suggest age had an effect; older girls sold more.

85 Study Type 4: Two-way ANOVA
Two Independent Variables (categorical). Single Dependent Variable (numerical/continuous). Used to test the effect of two IVs on the DV. [Figure: two graphs of Sales by incentive (None, New Bike, Trip to Hawaii), with separate results for Elementary and College.] Dependent variable is always quantitative. In an ANOVA, both independent variables are qualitative (each with two or more groups).
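
A two-way ANOVA splits the variability in sales into a part for each IV plus a part for their interaction. The sketch below assumes a balanced design (same number of scouts in every cell) and uses invented data; the interaction term is standard for a two-way ANOVA but is not discussed on the slide, and the function name `two_way_anova` is hypothetical.

```python
from statistics import mean

def two_way_anova(cells):
    """Balanced two-way ANOVA.

    cells maps (level_a, level_b) -> list of scores, same length in every cell.
    Returns F ratios for factor A, factor B, and the A x B interaction.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # scores per cell (balanced design assumed)
    grand = mean([x for scores in cells.values() for x in scores])
    mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}
    cell_mean = {key: mean(scores) for key, scores in cells.items()}

    ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a for b in levels_b
    )
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_a = len(levels_a) - 1
    df_b = len(levels_b) - 1
    df_within = len(levels_a) * len(levels_b) * (n - 1)
    ms_within = ss_within / df_within
    return {
        "A": (ss_a / df_a) / ms_within,
        "B": (ss_b / df_b) / ms_within,
        "AxB": (ss_ab / (df_a * df_b)) / ms_within,
    }

# Hypothetical cookie sales keyed by (age, incentive); numbers are made up.
cookie_sales = {
    ("Elementary", "None"): [8, 10, 12],
    ("Elementary", "New Bike"): [12, 14, 16],
    ("Elementary", "Trip to Hawaii"): [18, 20, 22],
    ("College", "None"): [12, 14, 16],
    ("College", "New Bike"): [16, 18, 20],
    ("College", "Trip to Hawaii"): [22, 24, 26],
}
f_ratios = two_way_anova(cookie_sales)
print(f_ratios)
```

Here factor A is age and factor B is incentive. The made-up data were built so that age and incentive add up independently, which is why the interaction F comes out at essentially zero.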

86 Study Type 1: Confidence Intervals
Study Type 2: t-test Study Type 3: One-way Analysis of Variance (ANOVA) Study Type 4: Two-way Analysis of Variance (ANOVA) Study Type 5: Correlation

87 Study Type 5: Correlation plots relationship between two
Pretty much all correlations are “quasi-experimental”. Correlation plots the relationship between two continuous/quantitative variables. Neutral relative to causality, but especially useful for predictions. Example: relationship between amount of money spent on advertising and amount of money made in sales. [Figure: scatterplot of Dollars spent on Advertising and Dollars in Sales, showing a positive correlation.] Dependent variable is always quantitative. In correlation, both variables are quantitative. Describe strength and direction of correlation: in this case positive/strong. Graphing correlations uses scatterplots (not bar graphs).
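
The strength and direction of a correlation are summarized by Pearson's r, which can be computed directly from the two lists of scores. A minimal sketch, with invented advertising and sales figures and a hypothetical helper name `pearson_r`:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and the variation of each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data in thousands of dollars (made up for illustration).
advertising = [1, 2, 3, 4, 5]
sales = [3, 4, 6, 7, 10]
r = pearson_r(advertising, sales)
print(f"r = {r:+.2f}")
```

The sign of r gives the direction (positive or negative) and its distance from zero gives the strength; a value near +1, as with these made-up numbers, would be described as strong and positive.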

88 Study Type 1: Confidence Intervals
Study Type 2: t-test Study Type 3: One-way Analysis of Variance (ANOVA) Study Type 4: Two-way Analysis of Variance (ANOVA) Study Type 5: Correlation Study Type 6: Simple and Multiple regression

89 You probably make this much
Study Type 6: Regression: Using the correlation to predict the value of one variable based on its relationship with the other variable. The predicted variable goes on the “Y” axis and is called the dependent variable. The predictor variable goes on the “X” axis and is called the independent variable. [Figure: scatterplot with Expenses per year on the X axis and Yearly Income on the Y axis: if you spend this much, you probably make this much.]

90 Dustin spends $12 for his own Birthday
[Figure: scatterplot of Expenses per year and Yearly Income with two examples.] Angelina Jolie buys Brad Pitt a $24 million heart-shaped island for his 50th birthday: if you spend this much, you probably make this much. Dustin spends $12 for his own birthday: if you spend this much, you probably make this much.

91 Study Type 6: Regression: Using the correlation to predict
the value of one variable based on its relationship with the other variable. The predicted variable goes on the “Y” axis and is called the dependent variable. The predictor variable goes on the “X” axis and is called the independent variable. [Figure: Yearly Income (Dependent Variable, predicted) plotted against Expenses per year: if you spend this much, you probably make this much.] Multiple regression will use multiple independent variables to predict the dependent variable. Independent Variable 1 (Predictor): if you spend this much. Independent Variable 2 (Predictor): if you save this much.
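
Simple regression fits a straight line through the scatterplot and uses it to predict Y from X. The sketch below fits the least-squares line by hand for one predictor only (multiple regression is not shown); the expense and income figures are invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data in thousands of dollars (made up for illustration).
expenses = [20, 30, 40, 50, 60]   # predictor, X axis
income = [30, 38, 52, 58, 72]     # predicted variable, Y axis

slope, intercept = fit_line(expenses, income)
# "If you spend this much, you probably make this much."
predicted_income = intercept + slope * 45
print(f"income = {intercept:.1f} + {slope:.2f} * expenses")
print(f"Predicted income for expenses of 45: {predicted_income:.1f}")
```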

92 Study Type 1: Confidence Intervals
Study Type 2: t-test Study Type 3: One-way Analysis of Variance (ANOVA) Study Type 4: Two-way Analysis of Variance (ANOVA) Study Type 5: Correlation Study Type 6: Simple and Multiple regression Study Type 7: Chi Square

93 Study Type 7: Chi-squared is used to evaluate whether the frequencies found in your sample match what you would expect to find. It is used with nominal or ordinal data when we simply count how many participants (or objects or events) fall into each category. We are comparing frequencies, not means. What is your favorite type of restaurant? (Do university students show the same results as the general population?) What is your political affiliation? (Do the proportions change when the country is at war or otherwise stressed?) What is the most popular ride at Disneyland? (Are all the rides at Disneyland equally popular?) Do more children, teens or adults play video games? What is the most popular ride at Disneyland? Just count how many people ride each one. a. Dumbo b. Small World c. Space Mountain d. Splash Mountain. We could gather this data using clickers.
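
The Disneyland question can be sketched as a chi-squared goodness-of-fit calculation: count riders per ride, then compare the observed counts with the counts expected if all rides were equally popular. The rider counts below are invented, and the equal-popularity expectation is just one possible hypothesis.

```python
# Hypothetical rider counts (made up for illustration).
observed = {
    "Dumbo": 20,
    "Small World": 30,
    "Space Mountain": 60,
    "Splash Mountain": 90,
}

total = sum(observed.values())
# If all rides were equally popular, each would get the same share of riders.
expected = total / len(observed)

# Chi-squared statistic: sum of (observed - expected)^2 / expected over categories.
chi_squared = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

print(f"chi-squared({df}) = {chi_squared:.1f}")
```

The further the observed frequencies fall from the expected ones, the larger the statistic; it is then compared against the critical value for its degrees of freedom (7.815 for df = 3 at the 0.05 level).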

94 Study Type 5: Correlation Study Type 1: Confidence Intervals
Connecting intentions of studies with: experimental methodologies, appropriate statistical analyses, appropriate graphs. Study Type 1: Confidence Intervals. Study Type 2: t-test. Study Type 3: One-way ANOVA. Study Type 4: Two-way ANOVA. Study Type 5: Correlation. Study Type 6: Regression. Study Type 7: Chi-squared. Remember, when p < 0.05 we say the results are statistically significant: there is a “real” difference (not just due to chance).
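
The matching of study design to analysis can be summarized as a small decision rule. This is only a rough sketch of the rules of thumb stated on the slides: the function name `choose_analysis` and its arguments are hypothetical, and confidence intervals (Study Type 1) are not covered.

```python
def choose_analysis(dv_kind, iv_kinds, groups=None, predicting=False):
    """Suggest an analysis from the slides' rules of thumb.

    dv_kind:    "quantitative" or "frequency" (counts per category)
    iv_kinds:   list with one entry per IV, "categorical" or "quantitative"
    groups:     number of groups when there is a single categorical IV
    predicting: True when the goal is to predict one variable from another
    """
    if dv_kind == "frequency":
        return "chi-squared"            # comparing frequencies, not means
    if all(kind == "quantitative" for kind in iv_kinds):
        return "regression" if predicting else "correlation"
    if len(iv_kinds) == 2:
        return "two-way ANOVA"          # "two-way" = two IVs
    if groups == 2:
        return "t-test"                 # comparing two means
    return "one-way ANOVA"              # one IV, more than two groups

# Matinee vs. evening concession sales: one categorical IV, two groups.
matinee_vs_evening = choose_analysis("quantitative", ["categorical"], groups=2)
# Age and amount spent on treats: both variables quantitative.
age_and_spending = choose_analysis("quantitative", ["quantitative"])
# Couple age and time of day: two categorical IVs.
couples_and_time = choose_analysis("quantitative", ["categorical", "categorical"])
# Four driving styles and gas mileage: one categorical IV, four groups.
driving_styles = choose_analysis("quantitative", ["categorical"], groups=4)
print(matinee_vs_evening, age_and_spending, couples_and_time, driving_styles)
```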

95 What type of analysis is this?
Marietta is a manager of a movie theater. She wanted to know whether there is a difference in concession sales for afternoon (matinee) movies vs. evening movies. She took a random sample of 25 purchases from the matinee movie (mean of $7.50) and 25 purchases from the evening show (mean of $10.50). She compared these two means. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: t-test. Let’s try another one. This is an example of a _____. a. between participant design b. within participant design c. mixed participant design. Answer: between participant design.

96 What type of analysis is this?
Marietta is a manager of a movie theater. She wanted to know whether there is a difference in concession sales for people of all ages. She simply measured their age and how much they spent on treats. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: correlation.

97 What type of analysis is this?
Marietta is a manager of a movie theater. She wanted to know whether there is a difference in concession sales for afternoon (matinee) movies and evening movies. She took a random sample of 25 purchases from the matinee movie (mean of $7.50) and 25 purchases from the evening show (mean of $10.50). Which of the following would be the appropriate graph for these data? [Figure: four candidate graphs, a through d, of concession purchase by movie time (matinee, evening).] Two means: t-test.

98 What type of analysis is this?
Gabriella is a manager of a movie theater. She wanted to know whether there is a difference in concession sales between teenage couples and middle-aged couples. She also wanted to know whether time of day makes a difference (matinee versus evening shows). She gathered the data for a sample of 25 purchases from each pairing. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: two-way ANOVA. What are the two IVs? What are the levels of each?

99 What type of analysis is this?
Gabriella is a manager of a movie theater. She wanted to know whether there is a difference in concession sales between teenage couples and middle-aged couples. She also wanted to know whether time of day makes a difference (matinee versus evening shows). She gathered the means for a sample of 25 purchases from each pairing. [Figure: four candidate graphs, a through d, of concession purchase by movie time (matinee, evening) and age group (teenagers, older couples).] Four means.

100 What type of analysis is this?
A pharmaceutical firm tested whether fish-oil capsules taken daily decrease cholesterol. They measured cholesterol levels for 30 male subjects, had them take the fish oil daily for 2 months, and then tested their cholesterol levels again. Then they compared the mean cholesterol before and after taking the capsules. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: t-test. Let’s try another one. This is an example of a _____. a. between participant design b. within participant design c. mixed participant design. Answer: within participant design.
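
Because the same subjects are measured twice, the before/after comparison is a within-participant (paired) t-test, which works on each subject's difference score. A minimal sketch with invented cholesterol values for 5 subjects (the study on the slide had 30); `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom."""
    # One difference score per subject: positive means cholesterol went down.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical cholesterol levels (made up for illustration).
before = [220, 210, 240, 200, 230]
after = [210, 205, 225, 198, 222]

t_stat, df = paired_t(before, after)
print(f"t({df}) = {t_stat:.2f}")
```

The Marietta example, by contrast, compares two separate samples of purchases, so it would call for an independent-samples t-test rather than this paired version.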

101 What type of analysis is this?
Elaina was interested in the relationship between grade point average (GPA) and starting salary. She recorded GPA and starting salary for 100 students and looked to see if there was a relationship. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: correlation. [Figure: scatterplot of GPA and Starting Salary, titled “Relationship between GPA and Starting salary”.]

102 What type of analysis is this?
An automotive firm tested whether driving styles can affect gas efficiency in their cars. They observed 100 drivers and found there were four general driving styles. They recruited a sample of 100 drivers, all of whom drove with one of these 4 driving styles. Then they asked all 100 drivers to use the same model car for a month and recorded their gas mileage. Then they compared the mean mpg for each driving style. This is an example of a _____. a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA. Answer: one-way ANOVA. Let’s try another one. This is an example of a _____. a. between participant design b. within participant design c. mixed participant design. Answer: between participant design. Let’s try another one. This is an example of a _____. a. true experimental design b. quasi-experimental design c. mixed design. Answer: quasi-experimental design.

104 Thank you! See you next time!!

