MGMT 276: Statistical Inference in Management Spring 2015
Schedule of readings. Before next exam (February 17th): Please read chapters & Appendix D & E in Lind. Please read Chapters 1, 5, 6 and 13 in Plous: Chapter 1: Selective Perception; Chapter 5: Plasticity; Chapter 6: Effects of Question Wording and Framing; Chapter 13: Anchoring and Adjustment.
By the end of lecture today (2/5/15): Questionnaire design and evaluation; Surveys and questionnaire design; Correlational methodology; Positive, negative and zero correlation; Strength and direction.
No homework due Tuesday (February 10th)
Designed our study / observation / questionnaire. Collected our data. Organize and present our results.
Scatterplot displays relationships between two continuous variables. Correlation: a measure of how two variables co-occur; it can also be used for prediction. Range between -1 and +1. The closer to zero, the weaker the relationship and the worse the prediction. Positive or negative.
Correlation: Range between -1 and +1. +1 or -1: perfect relationship = perfect predictor. 0: no relationship = very poor predictor. Strong relationship = good predictor. Weak relationship = poor predictor.
Height of Mothers by Height of Daughters: Positive Correlation (scatterplot axes: Height of Mothers, Height of Daughters). Positive correlation: as values on one variable go up, so do values for the other variable. Negative correlation: as values on one variable go up, the values for the other variable go down.
Brushing Teeth by Number of Cavities: Negative Correlation (scatterplot axes: Brushing Teeth, Number of Cavities).
Perfect correlation = +1.00 or -1.00. One variable perfectly predicts the other. Positive correlation: height in inches and height in feet. Negative correlation: speed (mph) and time to finish race.
Correlation: The more closely the dots approximate a straight line (the less spread out they are), the stronger the relationship is. Perfect correlation = +1.00 or -1.00: one variable perfectly predicts the other, there is no variability in the scatterplot, and the dots approximate a straight line.
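The claim that a perfect correlation equals +1.00 or -1.00 can be checked numerically. The following is a minimal sketch, assuming Python; `pearson_r` is a hypothetical helper name and the heights are made-up illustration values, not data from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights; feet are an exact rescaling of inches
height_inches = [60, 63, 66, 69, 72]
height_feet = [h / 12 for h in height_inches]

r_positive = pearson_r(height_inches, height_feet)
# Negating one variable flips the direction but not the strength
r_negative = pearson_r(height_inches, [-h for h in height_inches])

print(round(r_positive, 2), round(r_negative, 2))
```

Because every dot falls exactly on a straight line in both cases, the coefficient sits at the end of the range rather than somewhere inside it.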
Correlation
Correlation does not imply causation (scatterplot axes: Number of Birthday Cakes, Number of Birthdays). Is it possible that they are causally related? Yes, but the correlational analysis does not answer that question. What if it's a perfect correlation: isn't that causal? No, it feels more compelling, but is neutral about causality.
Number of bathrooms in a city and number of crimes committed: Positive correlation. Positive correlation: as values on one variable go up, so do values for the other variable. Negative correlation: as values on one variable go up, the values for the other variable go down.
Linear vs curvilinear relationship. A linear relationship is a relationship that can be described best with a straight line. A curvilinear relationship is a relationship that can be described best with a curved line.
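One way to see why the linear/curvilinear distinction matters: the correlation coefficient measures straight-line association, so a curved pattern can produce an r near zero even when one variable is completely determined by the other. The sketch below is an illustration under assumed data (a made-up x range, a straight line, and a parabola); `pearson_r` is a hypothetical helper, not something from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up x values, symmetric around zero
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # best described by a straight line
y_curved = [v ** 2 for v in x]      # best described by a curved line (U shape)

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(round(r_linear, 2), round(r_curved, 2))
```

This is why a scatterplot is worth drawing before relying on r alone: the U-shaped data has a clear pattern that the coefficient does not register.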
Correlation: How do numerical values change? Let's estimate the correlation coefficient for each of the following: r = +1.0, r = -1.0, r = +.80, r = -.50
This shows a strong positive relationship (r = 0.97) between the price of the house and its eventual sales price. Description includes: both variables; strength (weak, moderate, strong); direction (positive, negative); estimated value (actual number).
This shows a moderate negative relationship (r = -0.48) between the amount of pectin in orange juice and its sweetness.
This shows a strong negative relationship (r = -0.91) between the distance that a golf ball is hit and the accuracy of the drive.
This shows a moderate positive relationship (r = 0.61) between the length of stay in a hospital and the number of services provided.
r = +0.97, r = -0.48, r = -0.91, r = +0.61
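The description checklist (both variables, strength, direction, actual number) can be turned into a small template. This is only a sketch: the function name `describe_correlation` is hypothetical, and the cutoffs of 0.3 and 0.7 for weak/moderate/strong are assumptions chosen to agree with the examples in these slides, not values stated in the lecture.

```python
def describe_correlation(r, var_x, var_y):
    """Build a one-sentence description of a correlation coefficient."""
    size = abs(r)
    # Assumed cutoffs for illustration only; the lecture does not give exact ones
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        return f"This shows no relationship (r = 0) between {var_x} and {var_y}"

    return (f"This shows a {strength} {direction} relationship "
            f"(r = {r:+.2f}) between {var_x} and {var_y}")

sentence = describe_correlation(
    -0.48, "the amount of pectin in orange juice", "its sweetness")
print(sentence)
```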
Scatterplot: Height of Daughters (inches) by Height of Mothers (inches). This shows a strong positive relationship (r = +0.8) between the heights of daughters and the heights of their mothers (both in inches). Both axes and values are labeled. Both axes have real numbers listed. Variable name is listed clearly. Description includes: both variables; strength (weak, moderate, strong); direction (positive, negative); estimated value (actual number).
Break into groups of 2 or 3. Each person hands in their own worksheet. Be sure to list your name and the names of all others in your group. Use examples that are different from those in lecture. 1. Describe one positive correlation; draw a scatterplot (label axes). 2. Describe one negative correlation; draw a scatterplot (label axes). 3. Describe one zero correlation; draw a scatterplot (label axes). 4. Describe one perfect correlation (positive or negative); draw a scatterplot (label axes). 5. Describe one curvilinear relationship; draw a scatterplot (label axes).
Review of Homework Worksheet. Notice Gillian asked 1300 people. The sample proportion (count / 1300) = .10; .10 x 100 = 10%; .10 x 1,000,000 = 100,000.
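The arithmetic on this slide (proportion, then percentage, then projection to a larger population) can be sketched as follows. The count of 130 is an assumption picked so that the proportion comes out to .10 with 1300 respondents; the slide's original count is not recoverable, and `scale_proportion` is a hypothetical helper name.

```python
def scale_proportion(count, sample_size, population):
    """Return (proportion, percent, projected count in the population)."""
    proportion = count / sample_size
    percent = proportion * 100
    projected = proportion * population
    return proportion, percent, projected

# 130 is an assumed count that yields a proportion of .10 out of 1300
proportion, percent, projected = scale_proportion(130, 1300, 1_000_000)

print(round(proportion, 2), round(percent, 1), round(projected))
```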
Age by Dollars Spent: Strong, Negative, Down, estimated r = -.9
Review of Homework Worksheet: =correl(A2:A11,B2:B11). Strong, Negative, Down. This shows a strong negative relationship (r = ) between the amount spent on snacks and the age of the moviegoer. Description includes: both variables; strength (weak, moderate, strong); direction (positive, negative); correlation r (actual number).
Must be complete and must be stapled. Hand in your homework.
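For readers working outside Excel, the same Pearson coefficient that `=correl(A2:A11,B2:B11)` computes can be sketched in Python using z-scores. The ten age and dollar values below are made up to resemble the homework pattern (older moviegoers spending less); they are not the worksheet data, and `correl` here is a hypothetical Python helper rather than a library function.

```python
from statistics import mean, stdev

def correl(xs, ys):
    """Pearson r as the average product of z-scores (dividing by n - 1)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)  # sample standard deviations
    n = len(xs)
    products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical stand-ins for columns A (age) and B (dollars spent on snacks)
ages = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
dollars = [20, 18, 17, 15, 12, 11, 9, 8, 5, 4]

r = correl(ages, dollars)
print(f"r = {r:.3f}")
```

With these made-up numbers the result is strongly negative, matching the "Strong, Negative, Down" reading of the scatterplot; real worksheet data would give its own value.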
Hand in Homework and Correlation worksheet
Overview: Frequency distributions; The normal curve; Mean, Median, Mode, Trimmed Mean; Standard deviation, Variance, Range; Mean Absolute Deviation; Skewed right, skewed left; unimodal, bimodal, symmetric. Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency, 2) dispersion or 3) shape.
Another example: How many kids are in your family? (Axis: Number of kids in family)
Measures of Central Tendency (measures of location): the mean, median and mode. Measures of "location": where on the number line the scores tend to cluster. Mean: the balance point of a distribution, found by adding up all observations and then dividing by the number of observations. Mean for a sample: Σx / n = mean = x̄ (x-bar). Mean for a population: ΣX / N = mean = µ (mu). Note: Σ = add up; x or X = scores; n or N = number of scores.
Mean for a sample: Σx / n = mean = x̄. Number of kids in family: (1 + 3 + 1 + 4 + 2 + 4 + 2 + 8 + 2 + 14) / 10 = 41 / 10 = mean = 4.1
How many kids are in your family? What is the most common family size? Median: the middle value when observations are ordered from least to most (or most to least). Number of kids in family: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14. Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14. If there appear to be two medians, take the mean of the two: (2 + 3) / 2, so median = 2.5. The median always has a percentile rank of 50% regardless of the shape of the distribution.
Mode: the value of the most frequent observation. Number of kids in family, score (frequency f): 1 (2), 2 (3), 3 (1), 4 (2), 8 (1), 14 (1). Please note: the mode is "2" because it is the most frequently occurring score. It occurs "3" times. "3" is not the mode; it is just the frequency for the value that is the mode. Bimodal distribution: if there are two most frequent observations.
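The three measures of central tendency can be checked against the family-size data from the slides. This is a minimal sketch assuming Python's standard `statistics` module; the variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median, mode

# Number of kids in family, as listed on the slide
kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

kids_mean = mean(kids)      # sum of scores divided by number of scores
kids_median = median(kids)  # with 10 scores, the mean of the two middle values
kids_mode = mode(kids)      # the most frequently occurring score

# The frequency of the mode is a separate number from the mode itself
mode_frequency = Counter(kids)[kids_mode]

print(kids_mean, kids_median, kids_mode, mode_frequency)
```

The last line makes the slide's warning concrete: the mode is the score (2), while 3 is only how often that score occurs.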
What about central tendency for qualitative data? Mode is good for nominal or ordinal data. Median can be used with ordinal data. Mean can be used with interval or ratio data.
A little more about frequency distributions An example of a normal distribution
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution. In all distributions: mode = tallest point; median = middle score; mean = balance point. In a normal distribution: mode = mean = median.
In a positively skewed distribution: mode < median < mean. Note: the mean is most affected by outliers or skewed distributions.
In a negatively skewed distribution: mean < median < mode.
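The family-size data used earlier in the lecture is itself positively skewed, so it can illustrate both the ordering mode < median < mean and the note that the mean is most affected by outliers. The sketch below assumes Python's `statistics` module and treats the family of 14 as the outlier; that choice is an illustration, not something the slides specify.

```python
from statistics import mean, median, mode

# Number of kids in family, from the earlier slide
kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

# Positive skew: the long tail to the right pulls the mean upward
ordering_holds = mode(kids) < median(kids) < mean(kids)

# Drop the largest family (assumed outlier) and compare the shifts
without_outlier = [k for k in kids if k != 14]
mean_shift = mean(kids) - mean(without_outlier)
median_shift = median(kids) - median(without_outlier)

print(ordering_holds, round(mean_shift, 2), median_shift)
```

Removing a single extreme score moves the mean by more than it moves the median, which is the sense in which the mean is the least resistant of the three measures.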
Mode: the value of the most frequent observation. Bimodal distribution: a distribution with two most frequent observations (2 peaks). Example: Ian coaches two boys' baseball teams. One team is made up of 10-year-olds and the other is made up of 16-year-olds. When he measured the height of all of his players, he found a bimodal distribution.
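A bimodal distribution like the one in Ian's example can be detected by asking for every most-frequent value rather than a single mode. This sketch assumes Python 3.8 or later (for `statistics.multimode`); the heights are invented to mimic one cluster of 10-year-olds and one of 16-year-olds.

```python
from statistics import multimode

# Hypothetical player heights in inches: a younger team and an older team
heights_in = [54, 55, 55, 55, 56, 68, 69, 69, 69, 70]

# multimode returns all values tied for the highest frequency
modes = multimode(heights_in)

print(modes)
```

Two values come back because each team contributes its own peak, which is what the two humps in a bimodal histogram represent.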