Published by Marilynn White. Modified over 8 years ago.
1
An Introduction to Statistics
2
Two Branches of Statistical Methods
Descriptive statistics: techniques for describing data in abbreviated, symbolic fashion.
Inferential statistics: drawing inferences based on data, that is, using statistics to draw conclusions about the population from which the sample was taken.
3
Populations and Samples
A parameter is a characteristic of a population, e.g., the average height of all Americans.
A statistic is a characteristic of a sample, e.g., the average height of a sample of Americans.
Inferential statistics infer population parameters from sample statistics, e.g., we use the average height of the sample to estimate the average height of the population.
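A minimal sketch of the parameter/statistic distinction, assuming a simulated population. The height values (mean 66 inches, standard deviation 4) and the sample size of 200 are made-up numbers for illustration, not figures from the presentation.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 simulated heights in inches (made-up numbers).
population = [random.gauss(66, 4) for _ in range(100_000)]
parameter = statistics.mean(population)   # characteristic of the population

# Draw a random sample and compute the same characteristic on it.
sample = random.sample(population, 200)
statistic = statistics.mean(sample)       # characteristic of the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the population is rarely available in full, which is why the sample statistic is used as the estimate.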
4
Descriptive Statistics: Numerical Data Properties
Central Tendency: Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation
Shape: Skewness, Kurtosis
5
Ordering the Data: Frequency Tables
Frequency table (distribution): a listing, in order of magnitude, of each score achieved and the number of times the score occurred.
Grouped frequency table (distribution): the number of scores falling in each of several equally sized intervals.
Why frequency tables?
They give some order to a set of data.
They let us examine the data for outliers.
They are an introduction to distributions.
6
Frequency Tables
7
Grouped Frequency Tables
Range    Number   Percent   Cumulative
30-39       1       2.08       2.08
40-49       3       6.25       8.33
50-59       4       8.33      16.67
60-69      12      25.00      41.67
70-79      19      39.58      81.25
80-89       7      14.58      95.83
90-100      2       4.17     100.00
Total      48     100
8
Making a Frequency Table
1) List each possible value, from highest to lowest.
2) Go one by one through the scores, making a mark for each score next to its value on the list.
3) Make a table showing how many times each value on your list was used.
4) Calculate the percentage of scores for each value.
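The four steps for making a frequency table can be sketched in a few lines. The `scores` list is hypothetical data invented for this illustration, and the sketch assumes integer scores so that "each possible value" can be enumerated.

```python
from collections import Counter

scores = [7, 8, 8, 5, 7, 9, 8, 6, 7, 8]  # hypothetical integer scores

counts = Counter(scores)   # steps 2 and 3: tally how often each value occurs
total = len(scores)

table = []
# Step 1: walk through each possible value, from highest to lowest.
for value in range(max(scores), min(scores) - 1, -1):
    freq = counts.get(value, 0)
    percent = 100 * freq / total   # step 4: percentage of scores with this value
    table.append((value, freq, percent))

print("Value  Frequency  Percent")
for value, freq, percent in table:
    print(f"{value:>5}  {freq:>9}  {percent:>7.1f}")
```

Values that never occur still get a row with frequency 0, which matches listing every possible value in step 1.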
9
Making a Stem-and-Leaf Plot
Each data point is broken down into a “stem” and a “leaf.”
Select one or more leading digits for the stem values. The trailing digit(s) become the leaves.
First, the stems are aligned in a column.
Then record the leaf for every observation beside the corresponding stem value.
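A rough sketch of the procedure, assuming non-negative integers with the ones digit as the leaf and everything in front of it as the stem. The helper name `stem_and_leaf` is hypothetical; the data are the twelve values read off the small stem-and-leaf example in this presentation.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]}, using the ones digit as the leaf.

    Assumes non-negative integers (an assumption of this sketch).
    """
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # leading digit(s), trailing digit
        plot[stem].append(leaf)
    return dict(plot)

data = [21, 23, 24, 32, 32, 33, 36, 43, 48, 48, 52, 55]
plot = stem_and_leaf(data)

# Align the stems in a column and list the leaves beside each stem.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Stems with no observations are still printed, so the column keeps an even scale.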
10
Stem and Leaf Plot
Stem-and-leaf of Shoes   N = 139
Leaf Unit = 1.0
  12   0  223334444444
  63   0  555555555555566666666677777778888888888888999999999
 (33)  1  000000000000011112222233333333444
  43   1  555555556667777888
  25   2  0000000000023
  12   2  5557
   8   3  0023
   4   3
   4   4  00
   2   4
   2   5  0
   1   5
   1   6
   1   7
   1   7  5
11
Stem and Leaf / Histogram
Stem  Leaf
2     1 3 4
3     2 2 3 6
4     3 8 8
5     2 5
By rotating the stem-and-leaf plot, we can see the shape of the distribution of scores.
12
Histograms Depicts information from a frequency table or a grouped frequency table as a bar graph
13
Frequency Polygons Depicts information from a frequency table or a grouped frequency table as a line graph
14
Shapes of Frequency Distributions
Frequency tables, histograms, and polygons describe how the frequencies are distributed.
Distributions are a fundamental concept in statistics.
Unimodal: one peak.
Bimodal: two peaks.
15
Symmetrical vs. Skewed Frequency Distributions
Symmetrical distribution: approximately equal numbers of observations above and below the middle.
Skewed distribution: one side is more spread out than the other, like a tail.
Direction of the skew: positive or negative (right or left), named for the side with the fewer scores, the side that looks like a tail.
16
Symmetrical vs. Skewed Symmetric Skewed Right Skewed Left
17
Positively Skewed
Positively skewed distribution: scores cluster toward the low end of the variable.
18
Skewed Frequency Distributions
Positively skewed, AKA skewed right: the tail trails to the right.
19
Negatively Skewed
Negatively skewed distribution: scores cluster toward the high end of the variable.
20
Skewed Frequency Distributions
Negatively skewed, AKA skewed left: the tail trails to the left.
21
Kurtosis
How peaked or flat the curve is.
Leptokurtic: high and thin.
Mesokurtic: normal shape.
Platykurtic: flat and spread out.
22
Comparing the Kurtosis of Three Curves Curve A: Mesokurtic (Intermediate)
23
Comparing the Kurtosis of Three Curves
Curve A: Mesokurtic (Intermediate)
Curve B: Leptokurtic (High & Peaked)
24
Comparing the Kurtosis of Three Curves
Curve A: Mesokurtic (Intermediate)
Curve B: Leptokurtic (High & Peaked)
Curve C: Platykurtic (Broad & Flat)
25
The Normal Curve
Seen often in the social sciences and in nature generally.
Characteristics: bell-shaped, unimodal, symmetrical, average tails.
26
Central Tendency
Gives information concerning the average or typical score of a set of scores.
Measures: mean, median, mode.
27
Central Tendency: The Mean
The mean is a measure of central tendency.
It is what most people mean by “average”: the sum of a set of numbers divided by the number of numbers in the set.
28
Central Tendency: The Mean
So if N = the number of numbers in X (10 for this example), then $M = \sum X / N$.
29
Central Tendency: The Mean
Important conceptual point: the mean is the balance point of the data. If we subtract the mean from each individual score (X), some of the resulting deviations are positive and some are negative, and they add up to zero. Equivalently, the sum of the absolute values of the negative deviations equals the sum of the positive deviations.
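The balance-point property is easy to check numerically. The data set below is hypothetical, chosen so the mean is a whole number; with other data, floating-point rounding can leave the sum of deviations a tiny distance from zero.

```python
X = [2, 4, 4, 5, 10]   # hypothetical scores
M = sum(X) / len(X)    # the mean

deviations = [x - M for x in X]   # each score minus the mean

total = sum(deviations)                        # all deviations together
below = sum(-d for d in deviations if d < 0)   # absolute size of the negative side
above = sum(d for d in deviations if d > 0)    # size of the positive side

print("mean:", M)
print("deviations:", deviations)
print("sum of deviations:", total)
print("negative side:", below, "positive side:", above)
```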
30
Central Tendency: The Median
The middlemost or most central item in the set of ordered numbers; it separates the distribution into two equal halves.
If n is odd, the median is the middle value of the sequence: if X = [1,2,4,6,9,10,12,14,17] then 9 is the median.
If n is even, the median is the average of the two middle values: if X = [1,2,4,6,9,10,11,12,14,17] then 9.5 is the median, i.e., (9+10)/2.
The median is not affected by extreme values.
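The odd/even rule can be written out directly. This is a minimal sketch using the two data sets from the slide; the third list, with 17 replaced by 1700, is a hypothetical variation added to illustrate that an extreme value leaves the median unchanged.

```python
import statistics

def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                                 # odd n: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the middle pair

odd = [1, 2, 4, 6, 9, 10, 12, 14, 17]
even = [1, 2, 4, 6, 9, 10, 11, 12, 14, 17]
with_outlier = [1, 2, 4, 6, 9, 10, 12, 14, 1700]   # hypothetical extreme value

print(median(odd), median(even), median(with_outlier))
print(statistics.median(odd), statistics.median(even))   # standard-library equivalent
```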
31
Central Tendency: The Mode
The mode is the most frequently occurring number in a distribution: if X = [1,2,4,7,7,7,8,10,12,14,17] then 7 is the mode.
The mode is not affected by extreme values.
There may be no mode or several modes.
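A small sketch of finding the mode, including the "no mode or several modes" cases. The helper `modes` is hypothetical, and it assumes one common convention: when every value occurs exactly once, the data are reported as having no mode. The second and third lists are made-up examples.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequent values (possibly empty)."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    # Convention assumed here: if nothing repeats, there is no mode.
    if top == 1 and len(values) > 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

X = [1, 2, 4, 7, 7, 7, 8, 10, 12, 14, 17]   # data set from the slide
print(modes(X))                 # one mode
print(modes([1, 1, 2, 2, 3]))   # several modes (hypothetical data)
print(modes([1, 2, 3]))         # no mode under the convention above
```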
32
Mean, Median, Mode
Negatively skewed: typically mean < median < mode.
Symmetric (not skewed): mean, median, and mode coincide.
Positively skewed: typically mode < median < mean.
33
When to Use What
The mean is a great measure, but there are times when its use is inappropriate or impossible.
Nominal data: mode.
The distribution is bimodal: mode.
You have ordinal data: median or mode.
There are a few extreme scores: median.
34
Measures of Central Tendency: Overview
Mean: sum of the values divided by the number of values.
Median: midpoint of ranked values.
Mode: most frequently observed value.
35
Variability
How tightly clustered or how widely dispersed the values are in a data set.
Example
Data set 1: [0,25,50,75,100]
Data set 2: [48,49,50,51,52]
Both have a mean of 50, but data set 1 clearly has greater variability than data set 2.
36
Variability: The Range
The range is one measure of variability: the difference between the maximum and minimum values in a set.
Example
Data set 1: [0,25,50,75,100]; R: 100-0 = 100
Data set 2: [48,49,50,51,52]; R: 52-48 = 4
The range ignores how the data are distributed and takes only the extreme scores into account.
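The range calculation for the two data sets, as a short sketch. The helper name `value_range` and the third list are hypothetical; the third list is included only to show that a differently distributed set with the same extremes has the same range.

```python
import statistics

def value_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)

set1 = [0, 25, 50, 75, 100]
set2 = [48, 49, 50, 51, 52]
set3 = [0, 50, 50, 50, 100]   # hypothetical: same extremes as set1, different shape

for name, data in [("set1", set1), ("set2", set2), ("set3", set3)]:
    print(name, "mean:", statistics.mean(data), "range:", value_range(data))
```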
37
Quartiles
Split ordered data into 4 quarters, with 25% of the data in each.
Q1 = first quartile
Q2 = second quartile = median
Q3 = third quartile
38
Variability: Interquartile Range
Difference between the third and first quartiles: Interquartile Range = Q3 - Q1.
Measures the spread in the middle 50% of the data.
Not affected by extreme values.
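A sketch of the interquartile range using Python's `statistics.quantiles`. Several quartile conventions exist and they can give slightly different cut points; the `method="inclusive"` choice here is an assumption of this sketch, not something the presentation specifies. The list with 1000 in it is a hypothetical example of an extreme value.

```python
import statistics

def iqr(values):
    """Interquartile range Q3 - Q1, using the 'inclusive' quartile convention."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

set1 = [0, 25, 50, 75, 100]
set2 = [48, 49, 50, 51, 52]
extreme = [0, 25, 50, 75, 1000]   # hypothetical: one extreme score

print("IQR set1:", iqr(set1))
print("IQR set2:", iqr(set2))
print("IQR with extreme score:", iqr(extreme))
print("range with extreme score:", max(extreme) - min(extreme))
```

The extreme score changes the range a great deal but leaves the interquartile range where it was.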
39
Variability: Standard Deviation “The Standard Deviation tells us approximately how far the scores vary from the mean on average” The typical deviation in a given distribution
40
Variability: Standard Deviation
The standard deviation is calculated as the square root of the sum of squares (SS) divided by N: $SD = \sqrt{SS / N}$, where $SS = \sum (X - M)^2$.
41
let X = [3, 4, 5, 6, 7], M = 5
$(X - M) = [-2, -1, 0, 1, 2]$: subtract M from each number in X
$(X - M)^2 = [4, 1, 0, 1, 4]$: squared deviations from the mean
$\sum (X - M)^2 = 10$: sum of squared deviations from the mean (SS)
$\sum (X - M)^2 / N = 10/5 = 2$: average squared deviation from the mean
$\sqrt{\sum (X - M)^2 / N} = \sqrt{2} \approx 1.41$: square root of the average squared deviation
42
Variability: Standard Deviation
let X = [1, 3, 5, 7, 9], M = 5
$(X - M) = [-4, -2, 0, 2, 4]$: subtract M from each number in X
$(X - M)^2 = [16, 4, 0, 4, 16]$: squared deviations from the mean
$\sum (X - M)^2 = 40$: sum of squared deviations from the mean (SS)
$\sum (X - M)^2 / N = 40/5 = 8$: average squared deviation from the mean
$\sqrt{\sum (X - M)^2 / N} = \sqrt{8} \approx 2.83$: square root of the average squared deviation
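The same steps can be followed in code for both worked examples. This sketch divides by N, as the slides do, which is the population form of the standard deviation; the function name `population_sd` is hypothetical. The standard library's `statistics.pstdev` uses the same definition and is shown as a cross-check.

```python
import math
import statistics

def population_sd(X):
    """Standard deviation computed step by step, dividing SS by N."""
    N = len(X)
    M = sum(X) / N                           # the mean
    deviations = [x - M for x in X]          # subtract M from each number in X
    squared = [d ** 2 for d in deviations]   # squared deviations from the mean
    SS = sum(squared)                        # sum of squared deviations
    variance = SS / N                        # average squared deviation
    return math.sqrt(variance)               # square root of the average

first = [3, 4, 5, 6, 7]
second = [1, 3, 5, 7, 9]

print(round(population_sd(first), 2), round(population_sd(second), 2))
print(round(statistics.pstdev(first), 2), round(statistics.pstdev(second), 2))
```

Many software packages default to dividing by N - 1 (the sample form, `statistics.stdev` in Python), which gives slightly larger values than the ones on these slides.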
43
Variability: Standard Deviation
The square of the standard deviation is called the variance.
Variance: $SD^2 = \sum (X - M)^2 / N$
Standard deviation: $SD = \sqrt{\sum (X - M)^2 / N}$
44
Standard Deviation & Standard Scores
Z scores express how far a particular score is from the mean in units of standard deviation.
Z scores are expressed in the following way: $Z = (X - M) / SD$
If $(X - M) = SD$, then $(X - M)/SD = 1$, and X is said to be one standard deviation above the mean.
45
Standard Deviation & Standard Scores
Z scores provide a common scale to express deviations from a group mean.
46
Standard Deviation and Standard Scores
Let’s say someone has an IQ of 145 and is 52 inches tall.
IQ in a population has a mean of 100 and a standard deviation of 15.
Height in a population has a mean of 64 inches with a standard deviation of 4 inches.
How many standard deviations is this person away from the average IQ?
How many standard deviations is this person away from the average height?
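The two questions can be worked by computing a Z score for each measurement, that is, the distance from the mean divided by the standard deviation. A minimal sketch, with `z_score` as a hypothetical helper name:

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

iq_z = z_score(145, mean=100, sd=15)     # (145 - 100) / 15
height_z = z_score(52, mean=64, sd=4)    # (52 - 64) / 4

print("IQ Z score:", iq_z)
print("Height Z score:", height_z)
```

The person is three standard deviations above the mean on IQ and three standard deviations below the mean on height. The common scale makes the two deviations directly comparable even though the original units differ.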