Published by Lesley Lyons. Modified over 9 years ago.
1
Types of data and how to present them
47:269: Research Methods I
Dr. Leonard
March 31, 2010
2
Scientific Theory
1. Formulate theories
2. Develop testable hypotheses (operational definitions)
3. Conduct research, gather data
4. Evaluate hypotheses based on data
5. Cautiously draw conclusions
3
Scales of Measurement
/ Nominal
  / Categories
/ Ordinal
  / Categories that can be ranked
/ Interval
  / Scores with equidistant intervals between them
/ Ratio
  / Scores with equidistant intervals and absolute zero
4
            Responses are distinct   Responses can be ranked   Equal intervals   Absolute zero
Nominal     YES                      NO                        NO                NO
Ordinal     YES                      YES                       NO                NO
Interval    YES                      YES                       YES               NO
Ratio       YES                      YES                       YES               YES
5
Two major approaches to using data
/ Descriptive statistics
  / Describe or summarize data to characterize sample
  / Organize responses to show trends in data
/ Inferential statistics
  / Draw inferences about population from sample (is population distinct from sample?)
  / Significance tests
  / Capture impact of random error on responses
  / Margin of error
/ Note: Statistics describe responses from a sample; parameters describe responses from a population (e.g., a census)
6
Descriptive Statistics
/ N, total number of cases (responses) in a sample
  / Our class would be N = 33
/ f, or frequency, is the number of participants who gave a particular response, x
  / Can also be given as percentages or proportions
/ Can be univariate or bivariate
  / How participants vary on one variable (uni-)
  / How participants vary on two variables (bi-)
/ Descriptive statistics are a good first step for analyzing any data!
  / They are the only statistics appropriate for nominal data
7
Frequency distribution (nominal data)

x (response)    f (frequency)    %
Democrat        479              47.9
Republican      411              41.1
Independent     101              10.1
Green party     9                0.9
Total           n = 1,000        100%
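As a rough illustration of how f and % come from raw responses, the sketch below rebuilds a list of responses from the counts on this slide and tallies it again. The `responses` list and the `frequency_table` helper are assumptions made for the example, not part of the course material.

```python
from collections import Counter

# Hypothetical raw responses, rebuilt from the counts shown on the slide.
responses = (
    ["Democrat"] * 479
    + ["Republican"] * 411
    + ["Independent"] * 101
    + ["Green party"] * 9
)


def frequency_table(values):
    """Return (N, rows); each row is (response x, frequency f, percent)."""
    n = len(values)
    counts = Counter(values)
    # most_common() lists responses from highest to lowest frequency.
    rows = [(x, f, 100 * f / n) for x, f in counts.most_common()]
    return n, rows


n, rows = frequency_table(responses)
for x, f, pct in rows:
    print(f"{x:<12} {f:>5} {pct:>6.1f}%")
print(f"{'Total':<12} {n:>5}")
```

Because the data are nominal, the categories have no inherent order; sorting by frequency is only a presentation choice.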
8
Frequency distribution (interval or ratio data)
/ When you need to present a wide range of scores, show responses grouped in intervals to make it easier to grasp “big picture” of data

Interval      f
0.90 - 1.1    1
1.2 - 1.4     3
1.5 - 1.7     3
1.8 - 2.0     5
2.1 - 2.3     6
2.4 - 2.6     7
2.7 - 2.9     10
3.0 - 3.2     14
3.3 - 3.5     12
3.6 - 3.8     3

Raw scores:
2.7  1.9  1.0  3.3  1.3  1.8  2.6  3.7
3.1  2.2  3.0  3.4  3.1  2.2  1.9  3.1
3.4  3.0  3.5  3.0  2.4  3.0  3.4  2.4
3.2  3.3  2.7  3.5  3.2  3.1  3.3
2.1  1.5  2.7  2.4  3.4  3.3  3.0  3.8
1.4  2.6  2.9  2.1  2.6  1.5  2.8  2.3
3.1  1.6  2.8  2.3  2.8  3.2  2.8
3.8  1.4  1.9  3.3  2.9  2.0  3.2
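One way to group scores into intervals like these is sketched below. It uses only the first eight raw scores from the slide as a small sample, and the `grouped_frequencies` helper with its `start`, `width`, and `bins` parameters is a hypothetical name chosen for this example.

```python
# First eight raw scores from the slide, used here only as a small sample.
scores = [2.7, 1.9, 1.0, 3.3, 1.3, 1.8, 2.6, 3.7]


def grouped_frequencies(values, start=0.9, width=0.3, bins=10):
    """Count scores per interval, returning (low, high, f) for each interval.

    Scores are converted to whole tenths so that interval edges such as
    2.6 vs 2.7 are compared as integers rather than as floats.
    """
    start_t = round(start * 10)
    width_t = round(width * 10)
    counts = [0] * bins
    for v in values:
        index = (round(v * 10) - start_t) // width_t
        if 0 <= index < bins:  # scores outside the table are ignored
            counts[index] += 1
    table = []
    for i, f in enumerate(counts):
        low = (start_t + i * width_t) / 10
        high = (start_t + (i + 1) * width_t - 1) / 10
        table.append((low, high, f))
    return table


for low, high, f in grouped_frequencies(scores):
    print(f"{low:.1f} - {high:.1f}  {f}")
```

The defaults reproduce the slide's ten intervals (0.9 - 1.1 through 3.6 - 3.8); a different interval width would give a coarser or finer picture of the same data.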
9
/ Frequency distributions can be depicted graphically in…
Bar graphs
  / Bars not touching because of discrete data
  / Nominal and ordinal data
Histograms
  / Bars touching because of continuous data
  / Interval and ratio data
Frequency polygons (single line)
  / Interval and ratio data
12
What else can we do besides frequencies?
Measures of central tendency show the central or “typical” scores in a distribution
/ Mean - the average score
/ Median - the middle score
/ Mode - the most frequent score
/ The mean, median, and mode are related to the horizontal shape (skew) of the distribution.
  / In a normal distribution: Mean = Median = Mode
  / In a positively skewed distribution: Mode < Median < Mean
  / In a negatively skewed distribution: Mean < Median < Mode
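The skew orderings can be checked on small data sets. The two score lists below are hypothetical, chosen so that one has a long tail toward high scores and the other a long tail toward low scores; the ordering is a typical pattern rather than a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical scores: a long tail toward high values (positive skew)
# and a long tail toward low values (negative skew).
positively_skewed = [1, 2, 2, 3, 4, 5, 12]
negatively_skewed = [1, 8, 9, 10, 11, 11, 12]

for label, data in [
    ("positive skew", positively_skewed),
    ("negative skew", negatively_skewed),
]:
    print(
        label,
        "mode =", mode(data),
        "median =", median(data),
        "mean =", round(mean(data), 2),
    )
```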
13
Which measure of central tendency?
Different measures of central tendency are appropriate depending upon the level of measurement used:

Nominal    Ordinal    Interval/Ratio
Mode       Mode       Mode
           Median     Median
                      Mean
14
The Mean
/ The most informative and elegant measure of central tendency.
/ The average
/ The fulcrum point of the distribution
[Figure: two example score sets, 2 4 6 8 10 and 2 4 6 8 15]
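The "fulcrum" idea can be made concrete: distances of scores above the mean exactly balance distances below it, so the deviations sum to zero. The sketch below uses the two score sets that appear in this slide's figure; `deviations_from_mean` is a hypothetical helper name.

```python
from statistics import mean


def deviations_from_mean(scores):
    """Signed distance of each score from the mean of the set."""
    m = mean(scores)
    return [x - m for x in scores]


for scores in ([2, 4, 6, 8, 10], [2, 4, 6, 8, 15]):
    devs = deviations_from_mean(scores)
    print(scores, "mean =", mean(scores), "deviations =", devs, "sum =", sum(devs))
```

Changing one score from 10 to 15 moves the balance point from 6 to 7, which is why the mean is sensitive to extreme scores.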
15
The Median
/ The middlemost score in a distribution.
/ The scale value below which and above which 50% of the distribution falls
/ Not the fulcrum: The halfway point
[Figure: two example score sets, 2 4 6 8 10 and 2 4 6 8 15]
16
The Median
/ If N is odd, then median is the center score
/ If N is even, then median is the average of the two centermost scores
[Figure: example score sets with odd and even N]
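The odd/even rule can be written out directly. This is a minimal sketch; `median_of` is a hypothetical helper, and the even-N example set is an assumption added for illustration.

```python
def median_of(scores):
    """Median by the odd/even rule: center score, or mean of the two center scores."""
    if not scores:
        raise ValueError("median_of() needs at least one score")
    ordered = sorted(scores)  # the rule only works on ordered scores
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # N is odd: the single center score
    return (ordered[mid - 1] + ordered[mid]) / 2  # N is even: average the two


print(median_of([2, 4, 6, 8, 15]))      # odd N
print(median_of([2, 4, 6, 8, 10, 15]))  # even N
```

Python's standard library offers `statistics.median`, which follows the same rule; the explicit version is shown only to expose the two cases.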
17
The Median
/ If the median occurs at a value where there are tied scores, use the tied score as the median
[Figure: example scores 2 4 6 8 8 10 15]
18
The Mode
/ The most frequent score in the distribution
[Figure: example scores 2 4 6 8 8 10 15]
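A short check of the tied-median case and the mode, using the scores from the figure (2, 4, 6, 8, 15, 10, 8). The sketch assumes Python 3.8 or later for `statistics.multimode`; the bimodal set at the end is hypothetical.

```python
from statistics import median, multimode

scores = [2, 4, 6, 8, 15, 10, 8]

ordered = sorted(scores)
tied_median = median(scores)    # the middle position falls on one of the tied 8s
modes = multimode(scores)       # every score that shares the highest frequency

print(ordered)
print("median =", tied_median)
print("mode(s) =", modes)

# Hypothetical set with two equally frequent scores: both are reported.
bimodal = multimode([2, 2, 4, 8, 8])
print("bimodal example =", bimodal)
```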
19
One more thing…
These measures of central tendency vary in their sampling stability: how closely a sample statistic (e.g., the sample mean, x) matches the corresponding population value (e.g., the population mean, μ).

Least sampling stability
  Mode
  Median
  Mean
Most sampling stability

Note: Roman (r, s, x) characters are used for sample statistics while Greek (ρ, σ, μ) characters are used for population statistics.
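Sampling stability can be explored with a small simulation: draw many samples from the same population and see how much each statistic moves from sample to sample. Everything here is an assumption made for illustration, including the roughly normal population (μ = 50, σ = 10), the sample size of 25, and the `sampling_stability` helper. It also assumes Python 3.8 or later, where `statistics.mode` returns the first mode when several scores tie.

```python
import random
from statistics import mean, median, mode, pstdev


def sampling_stability(n_samples=2000, sample_size=25, seed=0):
    """Spread of each statistic across repeated samples (smaller = more stable)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = {"mean": [], "median": [], "mode": []}
    for _ in range(n_samples):
        # Hypothetical population: whole-number scores, roughly normal.
        sample = [round(rng.gauss(50, 10)) for _ in range(sample_size)]
        results["mean"].append(mean(sample))
        results["median"].append(median(sample))
        results["mode"].append(mode(sample))
    # Standard deviation of each statistic over all the samples drawn.
    return {name: pstdev(values) for name, values in results.items()}


spread = sampling_stability()
for name, value in spread.items():
    print(f"{name:<6} varies by about {value:.2f} points across samples")
```

Under these assumptions the mean should move least and the mode most, which matches the ordering on the slide; other populations, especially heavily skewed ones, can behave differently.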
20
Review of central tendency
/ Which one is the only appropriate measure for nominal data?
  / The mode
/ How do you find the median when there is an odd number of scores?
  / Simply locate the score in the middle
/ …when there is an even number of scores?
  / Average the two middle scores
/ Which measure is most sensitive to extreme scores and why?
  / The mean because it takes all scores into account and can be swayed by positive or negative skew
/ Which measure has the most sampling stability and why?
  / The mean because it is the most accurate representation of the overall sample
21
Application of central tendency
/ In 2006, the median home price in Boston was $386,300. (San Francisco was $518,400; Washington, D.C. was $258,700.)
/ How do you interpret these numbers?
/ Why are housing prices framed in terms of the median rather than the mean or the mode?
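One way to think about the last question is to see what a single very expensive sale does to each measure. The prices below are hypothetical and are not drawn from the 2006 figures.

```python
from statistics import mean, median

# Hypothetical sale prices in dollars; the last one is a single luxury sale.
prices = [250_000, 300_000, 350_000, 400_000, 3_000_000]

typical_by_median = median(prices)
typical_by_mean = mean(prices)

print("median price:", typical_by_median)
print("mean price:  ", typical_by_mean)
```

Four of the five homes sold for $400,000 or less, yet the mean lands well above all four of them, while the median stays at the middle sale.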
22
Measures of variability
/ Measures of central tendency
  …indicate the typical scores in a distribution
  …are related to skew (horizontal)
/ Measures of variability
  …show the dispersion of scores in a distribution
  …are related to kurtosis (vertical)
23
Measures of variability
/ Range - the difference between the highest and lowest score
/ Variance - the average squared distance of scores from the mean
/ Standard deviation - the square root of the variance; roughly the typical distance of scores from the mean
24
Measures of variability
Range = Highest Score – Lowest Score
Most sensitive to extreme scores!
[Figure: two example score sets, 2 4 6 8 10 and 2 4 6 8 15]
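Applied to the two score sets in the figure, the range formula shows how one extreme score changes the result. `score_range` is a hypothetical helper name.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


print(score_range([2, 4, 6, 8, 10]))
print(score_range([2, 4, 6, 8, 15]))
```

Only the two end scores enter the calculation, so replacing 10 with 15 changes the range while the other four scores have no effect on it.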
25
Measures of variability
/ Again, variance summarizes the distance of all scores from the mean (it requires squaring the distance of each score from the mean, then averaging the squared distances)
/ Not as useful as the standard deviation, which is expressed in the original units and approximates the average distance scores fall from the mean
26
Measures of variability
/ Standard deviation, like the mean, is the most informative and elegant measure of variability.
/ The average distance of scores from the mean score: deviation is distance!
/ Also like the mean, standard deviation has the most sampling stability
[Figure: example scores 2 4 6 8 10]
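The calculation behind variance and standard deviation can be sketched step by step. The slides do not say whether to divide by N or by N - 1, so the `sample` flag below is an assumption: `sample=True` divides by N - 1 (the usual choice for a sample), and `sample=False` divides by N.

```python
from math import sqrt


def variance(scores, sample=True):
    """Average squared distance of scores from the mean."""
    n = len(scores)
    m = sum(scores) / n
    squared_distances = [(x - m) ** 2 for x in scores]
    divisor = n - 1 if sample else n  # assumption: the slides do not specify
    return sum(squared_distances) / divisor


def standard_deviation(scores, sample=True):
    """Square root of the variance, back in the original score units."""
    return sqrt(variance(scores, sample))


for scores in ([2, 4, 6, 8, 10], [2, 4, 6, 8, 15]):
    print(
        scores,
        "variance =", variance(scores),
        "standard deviation =", round(standard_deviation(scores), 2),
    )
```

Python's `statistics` module provides the same calculations as `variance`/`stdev` (dividing by N - 1) and `pvariance`/`pstdev` (dividing by N).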
27
How would these standard deviations differ?
[Figure: two distributions. First: Mean = 6, Range = 8. Second: Mean = 7.9, Range = 10]
28
Standard deviation and shape of distribution
[Figure: two distributions with the same mean]
Wide distribution (scores 0, 5, 10, 15, 20, 25, 30): Mean = 15, Std. Dev. = 10
Narrow distribution (scores 14, 14, 14, 15, 15, 16, 16): Mean = 15, Std. Dev. = 0.9
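The contrast on this slide can be reproduced numerically. The two score sets below are assumptions, chosen so that the results land near the slide's reported values (mean of about 15 in both cases, standard deviations of about 10 and 0.9). Both divisors are printed because the slide does not say which one it used.

```python
from statistics import mean, pstdev, stdev

# Assumed score sets: same center, very different spread.
wide = [0, 5, 10, 15, 20, 25, 30]
narrow = [14, 14, 14, 15, 15, 16, 16]

for label, scores in [("wide", wide), ("narrow", narrow)]:
    print(
        label,
        "mean =", round(mean(scores), 1),
        "SD (divide by N) =", round(pstdev(scores), 2),
        "SD (divide by N - 1) =", round(stdev(scores), 2),
    )
```

The two sets share nearly the same mean, so the mean alone cannot tell them apart; the standard deviation is what captures the flat, spread-out shape versus the tall, narrow one.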
29
Properties of Normal Distributions
/ All normal distributions are single peaked, symmetric, and bell-shaped
/ Normal distributions can have different values for mean and standard deviation but…
/ All normal distributions follow the 68-95-99.7 rule
  / 68.3% of data within 1 standard deviation of the mean
  / 95.4% of data within 2 standard deviations of the mean
  / 99.7% of data within 3 standard deviations of the mean
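These percentages can be recovered from the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `share_within` is a hypothetical helper, and the mean of 100 with standard deviation 15 in the second loop is only an example.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")

# The same shares hold for any mean and standard deviation.
for k in (1, 2, 3):
    print(f"mean 100, SD 15, within {k} SD: {100 * share_within(k, 100, 15):.1f}%")
```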
30
[Figure: normal curve centered on the mean, with bands marking 68.3%, 95.4%, and 99.7% of the data]