1
Chapter 2 Organizing Data
Nutan S. Mishra 11/22/2018 University of South Alabama
2
Raw Data
Data recorded in the form in which they were collected, without ranking or processing, are called raw data. This is also called ungrouped data. Example: consider the status of the following 20 students. The four statuses are F (freshman), SO (sophomore), J (junior), and SE (senior).
J F SO SE
3
Grouped Data
Status (x)    Number of students (frequency f)
F             6
SO            4
J
SE
This is categorical (qualitative) data, where x = status and f = frequency. The data set consists of 20 members. The sum of the frequencies of all the categories is equal to the size of the data set, i.e. Σf = f1 + f2 + f3 + f4 = 20.
4
Relative frequency
Status (x)    Number of students (frequency f)    Relative frequency    Percentage
F             6                                   6/20                  (6/20)*100
SO            4                                   4/20                  (4/20)*100
J
SE
Relative frequency of a category = frequency of that category / sum of all frequencies
5
Graphical presentation
6
Organizing Data (quantitative)
Consider the following table.
GPA (x)    Number of students (f)
0 to 1     10
1 to 2     34
2 to 3     40
3 to 4     16
This is the organized data for the quantitative variable GPA. The table shows that there are 10 students whose GPA falls between 0 and 1, and so on. The values of x are divided into four distinct classes. Each class is an interval of values.
7
Organizing Data (quantitative)
This is called a frequency distribution table, or just a frequency distribution.
Class width = upper boundary - lower boundary
Class midpoint = (upper boundary + lower boundary)/2
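The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names `class_width` and `class_midpoint` are made up here, and the class boundaries are taken from the GPA table on the previous slide.

```python
# Classes from the GPA table, written as (lower boundary, upper boundary).
classes = [(0, 1), (1, 2), (2, 3), (3, 4)]

def class_width(lower, upper):
    """Class width = upper boundary - lower boundary."""
    return upper - lower

def class_midpoint(lower, upper):
    """Class midpoint = (upper boundary + lower boundary) / 2."""
    return (upper + lower) / 2

widths = [class_width(lo, hi) for lo, hi in classes]
midpoints = [class_midpoint(lo, hi) for lo, hi in classes]

print(widths)     # [1, 1, 1, 1]
print(midpoints)  # [0.5, 1.5, 2.5, 3.5]
```

Every class here has the same width, which is the usual choice when building a frequency distribution.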
8
Construction of frequency distribution
That is, how should the data be divided into classes?
How many classes? This depends on the size of the data set; the number usually varies between 5 and 20.
What should the class width be? Approximately (largest value - smallest value)/number of classes.
9
Example of classification
Consider the following raw data on the GPA of 30 students. We know that the variable x = GPA varies between 0 and 4. The number of classes can be 3, 4, or 5; let it be 3. Then the class width is approximately (largest value - smallest value)/3.
1.68 2.35 2.59 3.99 2.87 3.23 3.78 2.00 2.74
1.54 2.15 3.22 1.78 3.66 3.01 3.27 2.55 2.29
3.39 3.94 3.00 2.75 1.77 1.99 3.45 2.28 3.07
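As a rough sketch of this classification step, the Python snippet below computes the approximate class width and tallies the values into 3 classes. Several things here are assumptions rather than part of the slides: only the 27 GPA values that appear in the listing above are used (the slide speaks of 30 students), the width is rounded up to a whole number, the first class starts at the minimum rounded down, and a value that sits exactly on a boundary (such as 2.00) is placed in the upper class.

```python
import math

# The 27 GPA values that appear in the listing (the slide mentions 30 students).
gpas = [
    1.68, 2.35, 2.59, 3.99, 2.87, 3.23, 3.78, 2.00, 2.74,
    1.54, 2.15, 3.22, 1.78, 3.66, 3.01, 3.27, 2.55, 2.29,
    3.39, 3.94, 3.00, 2.75, 1.77, 1.99, 3.45, 2.28, 3.07,
]

num_classes = 3

# Approximate width = (largest value - smallest value) / number of classes.
approx_width = (max(gpas) - min(gpas)) / num_classes  # about 0.82

# Assumption: round the width up and start the first class at floor(min).
width = math.ceil(approx_width)   # 1
lower = math.floor(min(gpas))     # 1

# Tally each value into its class; a boundary value goes to the upper class.
counts = [0] * num_classes
for x in gpas:
    index = min(int((x - lower) // width), num_classes - 1)
    counts[index] += 1

for i, f in enumerate(counts):
    lo = lower + i * width
    print(f"{lo} to {lo + width}: {f}")
```

With these 27 values the classes 1 to 2, 2 to 3, and 3 to 4 receive 5, 10, and 12 values, so the tallies will not fully match a table built from all 30 observations.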
10
Example of classification
The frequency table is as follows.
x         f     r.f.     percent
1 to 2    5     5/30     (5/30)*100 = 16.67
2 to 3    12    12/30    (12/30)*100 = 40
3 to 4    13    13/30    (13/30)*100 = 43.33
Relative frequency of a class = frequency of that class / sum of all frequencies = f/Σf
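The relative frequency and percent columns can be checked with a short Python sketch. The frequencies are copied from the table above; the variable names are chosen only for this illustration.

```python
# Class frequencies from the table above.
frequencies = {"1 to 2": 5, "2 to 3": 12, "3 to 4": 13}

total = sum(frequencies.values())  # sum of all frequencies = 30

# Relative frequency of a class = f / sum of all frequencies.
relative = {cls: f / total for cls, f in frequencies.items()}

# Percentage = relative frequency * 100, rounded to two decimals.
percent = {cls: round(100 * f / total, 2) for cls, f in frequencies.items()}

for cls in frequencies:
    print(cls, frequencies[cls], f"{frequencies[cls]}/{total}", percent[cls])
```

The relative frequencies always add up to 1 (and the percentages to 100, up to rounding), which is a quick way to check a table built by hand.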
11
Important note about classification
Most statistical software accepts only raw data as input and classifies the data for us. Thus, if we are using software to analyze our data, we do not have to worry about the classification part. The software lets us change the number and width of the classes according to the needs of the problem.
12
Cumulative frequency distribution
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class. Example: Application 2.34 from the textbook.
Frequency
Number of checks    Number of students
0 to 99             39
100 to 199          21
200 to 299          18
300 to 399          15
400 to 499          7
Cumulative frequency
Number of checks    Number of students
0 to 99             39
0 to 199            60
0 to 299            78
0 to 399            93
0 to 499            100
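The cumulative column is just a running total of the frequency column. A minimal Python sketch, using the frequencies from the example above and the standard-library function `itertools.accumulate`:

```python
from itertools import accumulate

# Upper limits and frequencies of the classes in the example above.
upper_limits = [99, 199, 299, 399, 499]
frequencies = [39, 21, 18, 15, 7]

# Running total: each entry counts all values up to that class's upper limit.
cumulative = list(accumulate(frequencies))

for upper, cf in zip(upper_limits, cumulative):
    print(f"0 to {upper}: {cf}")

print(cumulative)  # [39, 60, 78, 93, 100]
```

The last cumulative frequency always equals the size of the data set, here 100 students.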
13
Frequency Curve
Number of checks    Modified classes    Midpoint of class    Number of students
0 to 99             0 to 100            50                   39
100 to 199          100 to 200          150                  21
200 to 299          200 to 300          250                  18
300 to 399          300 to 400          350                  15
400 to 499          400 to 500          450                  7
14
Ogive (Cumulative frequency curve)
15
Stem and Leaf display
A way of organizing and displaying quantitative data. Each value is divided into two parts: a stem and a leaf.
To draw a stem-and-leaf plot, it is helpful to know the range of the data, that is, the maximum and minimum values. If the exact maximum and minimum cannot be found, approximate values can still give us some idea of the range of the data.
Consider the following data set of scores of 30 students in a Statistics exam:
75 52 80 96 65 79 71 87 93 95 69 72 81
61 76 86 68 50 92 83 84 77 64 57 98
16
Stem and leaf display
In this data set the values range from the 50s to the 90s. Thus we would like to count the number of values in the 50s, in the 60s, in the 70s, and so on. So we take the tens digit as the stem and the units digit as the leaf. The resulting stem and leaf plot is as follows:
Stem    Leaves    Leaves arranged in order
5       2 0 7     0 2 7
6
7
8
9
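The splitting into stems and leaves can be sketched in Python as follows. This is an illustration under one stated assumption: it uses only the 25 scores that appear in the listing on the previous slide (the slide speaks of 30 students), so rows other than the 5 stem may differ from the full plot.

```python
from collections import defaultdict

# The 25 exam scores that appear in the listing (the slide mentions 30 students).
scores = [
    75, 52, 80, 96, 65, 79, 71, 87, 93, 95, 69, 72, 81,
    61, 76, 86, 68, 50, 92, 83, 84, 77, 64, 57, 98,
]

# Tens digit = stem, units digit = leaf.
stems = defaultdict(list)
for score in scores:
    stem, leaf = divmod(score, 10)
    stems[stem].append(leaf)

# Print one row per stem, with the leaves arranged in order.
for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in sorted(stems[stem]))
    print(f"{stem} | {leaves}")
```

For the stem 5 this gives the leaves 2, 0, 7 in order of appearance and 0 2 7 once sorted, which agrees with the first row of the plot above.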