Published by Ashley Craig. Modified over 9 years ago.
1
STATISTICS I COURSE INSTRUCTOR: TEHSEEN IMRAAN
2
CHAPTER 4 DESCRIBING DATA
3
INTRODUCTION We continue our study of descriptive statistics with further ways to display and describe dispersion: dot plots, stem-and-leaf displays, quartiles, percentiles, and box plots. Dot plots, stem-and-leaf displays, and box plots give additional insight into where the values are concentrated, how they are dispersed, and the general shape of the data. Finally, we consider bivariate data, where we observe two variables for each individual or observation selected.
4
DOT PLOTS A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line. To develop a dot plot we display a dot for each observation along a horizontal number line indicating the value of each piece of data. For multiple observations we pile the dots on top of each other.
5
STEPS TO CONSTRUCT A DOT PLOT Sort the data from smallest to largest. Draw and label a number line. Place a dot for each observation.
6
FOR EXAMPLE Length of Service (in years)
7   6   2   10   6   6
5   8   4   8    4   7
6   5   3   3    7   5
7
Step 1: Sort the data from smallest to largest. 2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 10
8
Step 2: Draw the number line and label it as shown.
9
Step 3: Place a dot for each observation.
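The three steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `dot_plot` is hypothetical, the values are the length-of-service data from the example, and the plot is drawn as plain text rather than as a chart.

```python
from collections import Counter

# Length of service (in years), taken from the example slide.
years = [7, 6, 2, 10, 6, 6, 5, 8, 4, 8, 4, 7, 6, 5, 3, 3, 7, 5]


def dot_plot(values):
    """Return a text dot plot with one column per integer on the number line."""
    counts = Counter(values)               # how many dots to pile up at each value
    low, high = min(values), max(values)
    axis = range(low, high + 1)            # Step 2: the labelled number line
    width = len(str(high)) + 1             # column width, wide enough for the largest label
    rows = []
    # Step 3: build the piles from the top row down to the row just above the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("." if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)


print(dot_plot(years))
```

Sorting (Step 1) is implicit here, because counting the occurrences of each value gives the same piles as sorting first.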
10
STEM AND LEAF DISPLAYS A statistical technique for displaying a set of data. Each numerical value is divided into two parts: The leading digit(s) become the stem, and the trailing digits the leaf. The stems are located along the main vertical axis, and the leaf for each observation along the horizontal axis. To develop a stem-and-leaf chart the first step is to locate the largest value and the smallest value. This will provide the range of the stem values. The stem is the leading digit or digits of the number, and the leaf is the trailing digit. For example, the number 15 has a stem value of 1 and a leaf value of 5. For another problem the number 231 has a stem value of 23 and a leaf value of 1.
11
FOR EXAMPLE
$12   $28   $32   $24   $17   $6
$34   $18   $22   $42   $36   $26
12
FOR EXAMPLE
Leading Digit   Trailing Digit
0               6
1               2 7 8
2               2 4 6 8
3               2 4 6
4               2
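The display can be reproduced with a short Python sketch. The function name `stem_and_leaf` is hypothetical, and the sketch assumes whole-number values whose last digit is the leaf and whose remaining digits are the stem.

```python
from collections import defaultdict

# Dollar amounts from the example slide.
amounts = [12, 28, 32, 24, 17, 6, 34, 18, 22, 42, 36, 26]


def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted leaves (trailing digit)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)     # e.g. 28 -> stem 2, leaf 8
        stems[stem].append(leaf)
    # Include every stem between the smallest and largest, even if it has no leaves.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}


for stem, leaves in stem_and_leaf(amounts).items():
    print(f"{stem} | {''.join(map(str, leaves))}")
```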
13
OTHER MEASURES OF DISPERSION QUARTILES: – First Quartile: the point below which one-fourth, or 25%, of the ranked data values lie. (It is designated Q1.) – Third Quartile: the point below which three-fourths, or 75%, of the ranked data values lie. (It is designated Q3.) – Logically, the median is the Second Quartile (designated Q2). The values corresponding to Q1, Q2, and Q3 divide a set of data into four equal parts.
14
DECILES AND PERCENTILES Just as quartiles divide a distribution into four equal parts, deciles divide a distribution into ten equal parts, and percentiles divide a distribution into 100 equal parts. To find a quartile, decile, or percentile for ungrouped data, first order the data from smallest to largest. Then use text formula [4-1] to locate the value.
15
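DECILES AND PERCENTILES Location of a Percentile: $L_p = (n + 1)\frac{P}{100}$ [4-1], where $L_p$ is the location of the desired percentile in the ordered data, $n$ is the number of observations, and $P$ is the desired percentile.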
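A minimal sketch of the location procedure follows. It assumes the location formula takes the common textbook form L_p = (n + 1)P/100 and that a fractional location is resolved by linear interpolation between the two neighbouring ordered values; the function name `percentile` is hypothetical, and the data are the length-of-service values from the dot plot example.

```python
# Length of service (in years) from the dot plot example.
years = [7, 6, 2, 10, 6, 6, 5, 8, 4, 8, 4, 7, 6, 5, 3, 3, 7, 5]


def percentile(values, p):
    """Value at the p-th percentile, assuming location L_p = (n + 1) * p / 100."""
    data = sorted(values)                  # order the data from smallest to largest
    n = len(data)
    location = (n + 1) * p / 100           # 1-based position in the ordered data
    whole = int(location)
    fraction = location - whole
    if whole < 1:                          # location falls before the first value
        return data[0]
    if whole >= n:                         # location falls at or beyond the last value
        return data[-1]
    lower, upper = data[whole - 1], data[whole]
    return lower + fraction * (upper - lower)


q1 = percentile(years, 25)   # first quartile
q2 = percentile(years, 50)   # median
q3 = percentile(years, 75)   # third quartile
print(q1, q2, q3)
```

Deciles use the same function: the third decile, for example, is `percentile(years, 30)`.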
16
BOX PLOTS A graphical display based on five statistics: the minimum value, Q1 (the first quartile), Q2 (the median), Q3 (the third quartile), and the maximum value. These five pieces of information are all we need to construct a box plot.
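The five statistics can be computed with Python's standard library, as in the sketch below. The helper name `five_number_summary` is hypothetical; it uses `statistics.quantiles` with `method="exclusive"`, which places quartiles at (n + 1)-based positions, and reuses the length-of-service data as an illustration.

```python
import statistics

# Length of service (in years) from the dot plot example.
years = [7, 6, 2, 10, 6, 6, 5, 8, 4, 8, 4, 7, 6, 5, 3, 3, 7, 5]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)


minimum, q1, median, q3, maximum = five_number_summary(years)
print(minimum, q1, median, q3, maximum)
```

The box of the plot runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.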
17
RELATIVE DISPERSION Coefficient of variation: The ratio of the standard deviation to the arithmetic mean, expressed as a percent.
18
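FORMULA FOR CV Coefficient of Variation: $CV = \frac{s}{\bar{X}} \times 100$ [4-2], where $s$ is the standard deviation and $\bar{X}$ is the arithmetic mean. Multiplying by 100 converts the decimal to a percent.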
19
COEFFICIENT OF VARIATION Characteristics of the coefficient of variation are: – It reports the variation relative to the mean. – It is useful for comparing distributions with different units.
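As a rough illustration, the coefficient of variation can be computed for two data sets measured in different units. The sketch assumes the sample standard deviation is intended; the function name `coefficient_of_variation` is hypothetical, and the two data sets are the years-of-service and dollar-amount examples from earlier slides.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation as a percent of the arithmetic mean (sample s assumed)."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Two data sets in different units: years of service and dollar amounts.
years = [7, 6, 2, 10, 6, 6, 5, 8, 4, 8, 4, 7, 6, 5, 3, 3, 7, 5]
amounts = [12, 28, 32, 24, 17, 6, 34, 18, 22, 42, 36, 26]

print(f"CV of years:   {coefficient_of_variation(years):.1f}%")
print(f"CV of amounts: {coefficient_of_variation(amounts):.1f}%")
```

Because each result is a percent of its own mean, the two values can be compared directly even though one variable is in years and the other in dollars.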
20
SKEWNESS Four shapes of distribution. Coefficient of skewness: a measure that describes the degree and direction of skewness in a distribution.
21
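Text Formula [4-3] is for Pearson's Coefficient of Skewness: $sk = \frac{3(\bar{X} - \text{Median})}{s}$, where $\bar{X}$ is the arithmetic mean and $s$ is the standard deviation.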
22
Characteristics of the coefficient of skewness are: The coefficient of skewness, designated sk, measures the amount of skewness and may range from -3.0 to +3.0. A value near -3, such as -2.57, indicates considerable negative skewness. A value such as 1.63 indicates moderate positive skewness. A value of 0, which will occur when the mean and median are equal, indicates the distribution is symmetrical and that there is no skewness.
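These characteristics can be checked with a small sketch. It assumes Pearson's coefficient takes the usual form sk = 3(mean − median) / s with the sample standard deviation; the function name `pearson_skewness` is hypothetical, and the dollar amounts come from the stem-and-leaf example.

```python
import statistics


def pearson_skewness(values):
    """Pearson's coefficient of skewness, assuming sk = 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)           # sample standard deviation (assumed)
    return 3 * (mean - median) / s


# Symmetric data: mean equals median, so there is no skewness.
print(pearson_skewness([1, 2, 3, 4, 5]))

# A long right tail pulls the mean above the median, giving a positive value.
print(pearson_skewness([1, 2, 2, 3, 12]))

# Dollar amounts from the stem-and-leaf example.
amounts = [12, 28, 32, 24, 17, 6, 34, 18, 22, 42, 36, 26]
print(pearson_skewness(amounts))
```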
23
SUMMARY OF CHARTS
24
RELATIONSHIP BETWEEN TWO VARIABLES Bivariate data: A collection of paired data values. Scatter diagram: A graph in which paired data values are plotted on an X,Y Axis. The steps to follow in developing a scatter diagram are: – We need two variables. – We scale one variable (x) along the horizontal axis (X – Axis) of a graph and the corresponding variable (y) along the vertical axis (Y – Axis). – Place a dot for each (x, y) pair of observations.
25
GRAPH
26
CONTINGENCY TABLE A table used to classify sample observations according to two or more identifiable characteristics. When we study the relationship between two or more variables and one or both are nominal or ordinal scale, we tally the results into a two-way table, referred to as a contingency table.
27
CONTINGENCY TABLE
                     Gender
Bought Lunch     Boys   Girls   Total
0 up to 10        10      5      15
10 up to 20       20     25      45
Total             30     30      60
28
CONTINGENCY TABLE A contingency table is a cross tabulation that simultaneously summarizes two variables of interest and their relationship. A survey of 60 school children classified each as to gender and the number of times lunch was purchased at school during a four-week period. Each respondent is classified according to two criteria – the number of times lunch was purchased and gender.
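The tallying step can be sketched as follows. The 60 records are rebuilt from the cell counts in the table (they are not the original survey responses), and the function name `contingency_table` is hypothetical.

```python
from collections import Counter

# One (gender, lunches bought) pair per child, rebuilt from the table's cell counts.
records = (
    [("Boys", "0 up to 10")] * 10
    + [("Girls", "0 up to 10")] * 5
    + [("Boys", "10 up to 20")] * 20
    + [("Girls", "10 up to 20")] * 25
)


def contingency_table(pairs):
    """Cross-tabulate (column, row) pairs into nested dicts with totals."""
    cells = Counter(pairs)                               # tally each combination
    columns = sorted({col for col, _ in pairs})
    rows = sorted({row for _, row in pairs})
    table = {r: {c: cells[(c, r)] for c in columns} for r in rows}
    for r in rows:                                       # row totals
        table[r]["Total"] = sum(table[r][c] for c in columns)
    # Column totals, including the grand total in the last cell.
    table["Total"] = {c: sum(table[r][c] for r in rows) for c in columns + ["Total"]}
    return table


for row_label, row in contingency_table(records).items():
    print(f"{row_label:<12}", row)
```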