Class Data (Major)
Public Health, Other, Psychology, Cog Science, Biology, Biology, Psychology, Public Health, Other, Cog Science, Cog Science, Psychology, Other, Biology, Public Health, Other, Psychology, Public Health, Biology, Cog Science
Ungrouped data: A set of scores or categories distributed individually, where the frequency for each individual score or category is counted.
Natural, distinct categories/groupings.
What kind of data is this? What is the scale of measurement?
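As a rough illustration of counting ungrouped data, the sketch below tallies the 20 majors listed on this slide. The variable names (`majors`, `frequency`) are made up for the example, and the list is only the sample shown here, not the full class of 80.

```python
from collections import Counter

# The sample of majors as listed on the slide (20 responses).
majors = [
    "Public Health", "Other", "Psychology", "Cog Science", "Biology",
    "Biology", "Psychology", "Public Health", "Other", "Cog Science",
    "Cog Science", "Psychology", "Other", "Biology", "Public Health",
    "Other", "Psychology", "Public Health", "Biology", "Cog Science",
]

# Ungrouped data: count the frequency of each individual category.
frequency = Counter(majors)

for major, count in frequency.items():
    print(f"{major:<15}{count}")
print(f"{'Total':<15}{sum(frequency.values())}")
```

Because the categories are names with no inherent order, the order of the rows in the printed table carries no meaning.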

Class Data (Major)
Major              Count of Major
Psychology         22
Biology            17
Cognitive Science  12
Public Health
Other
Total              80
A bar chart is a graphical display used to summarize the frequency of discrete and categorical data that are distributed in whole units or classes.
Notice: each category is represented by a different rectangle. Also notice: the rectangles do not touch along the x-axis. This shows that they are discrete, distinct categories.

Class Data (Major)
Major              Count of Major
Psychology         22
Biology            17
Cognitive Science  12
Public Health
Other
Total              80
A pie chart is a graphical display in the shape of a circle that is used to summarize the relative percent of categorical data.

Class Data (Year)
Sophomore, Senior, Freshman, Junior, Sophomore, Freshman, Junior, Sophomore, Junior, Freshman, Sophomore, Freshman, Junior, Senior
Natural, distinct categories/groupings. Ordered?
What kind of data is this? What is the scale of measurement?

Class Data (Year)
Year       Count of Year
Freshman   11
Sophomore  60
Junior     7
Senior     2
Total      80
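The relative percents that a pie chart displays can be computed directly from a count table. The sketch below uses the Year counts from this slide; the function name `relative_percent` is an illustrative choice, not something from the slides.

```python
# Counts from the Class Data (Year) table.
year_counts = {"Freshman": 11, "Sophomore": 60, "Junior": 7, "Senior": 2}

def relative_percent(counts):
    """Convert a table of counts into percents of the total."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

percents = relative_percent(year_counts)
for year, pct in percents.items():
    print(f"{year:<10}{pct:6.2f}%")
```

Each percent corresponds to the share of the circle that category's slice would occupy.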

Definitions
Grouped data: A set of scores distributed into intervals, where the frequency of each score can fall into only one interval.
Interval: A range of values within which the frequency of a subset of scores is contained.

Steps to Summarize Grouped Data
Step 1: Find the real range. The real range is one more than the difference between the largest and smallest value in a list of data.
Step 2: Find the interval width. The interval width is the range of scores in each interval. There should be between 5 and 20 intervals.
Step 3: Construct the frequency distribution.

Steps to Summarize Grouped Data
Step 1: Find the real range.
94, 84, 95, 65, 90, 62, 92, 58, 88, 86, 97, 96, 73, 93, 91, 64, 71, 78, 100, 87, 85, 47, 35, 81, 69, 53, 77
The range is 100 - 35 = 65. The real range is 65 + 1 = 66.
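Step 1 can be checked with a few lines of code. The sketch below uses the 27 values that appear on this slide; the later table totals 40 scores, so this list is presumably incomplete, but it does contain the smallest and largest values. The function name `real_range` is illustrative.

```python
# Scores as they appear on the slide (27 values).
scores = [
    94, 84, 95, 65, 90, 62, 92, 58, 88, 86, 97, 96, 73, 93,
    91, 64, 71, 78, 100, 87, 85, 47, 35, 81, 69, 53, 77,
]

def real_range(values):
    """One more than the difference between the largest and smallest value."""
    return max(values) - min(values) + 1

print("range:", max(scores) - min(scores))
print("real range:", real_range(scores))
```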

Steps to Summarize Grouped Data (cont.)
Step 2: Find the interval width. The number of intervals is 6. The interval width is the real range divided by the number of intervals: 66/6 = 11.
Step 3: Construct the frequency distribution.
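One way to turn Steps 2 and 3 into code is sketched below. It assumes whole-number scores, a smallest score of 35, and a real range of 66, as in the example; `build_intervals` and `interval_for` are hypothetical helper names. Rounding the width up when the division is not exact is a choice made for this sketch, not a rule stated on the slides.

```python
def build_intervals(smallest, real_range, n_intervals):
    """Return (width, intervals) with equal-width, non-overlapping intervals."""
    # Ceiling division: round the width up so every score is covered.
    width = -(-real_range // n_intervals)
    intervals = []
    lower = smallest
    for _ in range(n_intervals):
        intervals.append((lower, lower + width - 1))
        lower += width
    return width, intervals

width, intervals = build_intervals(smallest=35, real_range=66, n_intervals=6)

def interval_for(score, smallest=35):
    """Find the single interval that contains a given score."""
    return intervals[(score - smallest) // width]

print("width:", width)
for lower, upper in reversed(intervals):
    print(f"{lower}-{upper}")
print("94 falls in", interval_for(94))
```

Each score maps to exactly one interval, which is what lets the frequencies be counted without overlap.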

Steps to Summarize Grouped Data (cont.)
Intervals  f(x)
90-100     17
79-89      10
68-78      5
57-67      5
46-56      2
35-45      1
Total      40
Rules for a simple frequency distribution:
Each interval is defined (it has a lower and an upper boundary).
Each interval is of equal width.
No interval overlaps.
Book version: slightly easier to construct and understand.
My version: used for creating the visualization.

Cumulative Frequency
Distributes the sum of frequencies across a series of intervals.
Add from bottom up: discuss in terms of “less than”, “at or below” a certain value, or “at most”.
Add from top down: discuss in terms of “greater than”, “at or above” a certain value, or “at least”.
Intervals  f(x)  Cum. Freq. (bottom up)  Cum. Freq. (top down)
90-100     17    40                      17
79-89      10    23                      27
68-78      5     13                      32
57-67      5     8                       37
46-56      2     3                       39
35-45      1     1                       40
Total      40
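Both cumulative columns are running sums over the frequency column, taken in opposite directions. The sketch below assumes a frequency of 5 for the 57-67 interval, inferred from the cumulative values around it (8 - 3 = 5, consistent with the total of 40); the variable names are illustrative.

```python
from itertools import accumulate

# Intervals in table order (highest first) with their frequencies.
intervals = ["90-100", "79-89", "68-78", "57-67", "46-56", "35-45"]
freq = [17, 10, 5, 5, 2, 1]  # 57-67 is inferred from the cumulative columns

# Top down: running total starting from the highest interval ("at least").
top_down = list(accumulate(freq))

# Bottom up: running total starting from the lowest interval ("at most"),
# then flipped back into table order.
bottom_up = list(accumulate(reversed(freq)))[::-1]

for row in zip(intervals, freq, bottom_up, top_down):
    print("{:<8}{:>4}{:>6}{:>6}".format(*row))
```

For example, the bottom-up value for 68-78 says that 13 scores are at or below 78, while the top-down value says that 32 scores are at or above 68.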

Relative Frequency
Distributes the proportion of scores in each interval.
Equals the frequency in an interval divided by the total frequency (the number of scores).
Often used to summarize large data sets.
Intervals  f(x)  Relative Frequency  Cum. Rel. Frequency
90-100     17    0.425               1.000
79-89      10    0.250               0.575
68-78      5     0.125               0.325
57-67      5     0.125               0.200
46-56      2     0.050               0.075
35-45      1     0.025               0.025
Total      40
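The relative and cumulative relative columns follow from the frequencies by division. In the sketch below, the 57-67 frequency of 5 is an inference (its cumulative relative frequency of 0.200 minus 0.075 for the interval below gives 0.125, that is 5/40), and the names are illustrative. The cumulative column is built from integer running totals to avoid accumulating floating-point error.

```python
# Intervals in table order (highest first) with their frequencies.
intervals = ["90-100", "79-89", "68-78", "57-67", "46-56", "35-45"]
freq = [17, 10, 5, 5, 2, 1]  # 57-67 is inferred, see the note above
total = sum(freq)

# Relative frequency: each interval's share of all scores.
relative = [f / total for f in freq]

# Cumulative relative frequency, added from the bottom up.
running = 0
cum_relative = []
for f in reversed(freq):
    running += f
    cum_relative.append(running / total)
cum_relative.reverse()  # back into table order

for row in zip(intervals, freq, relative, cum_relative):
    print("{:<8}{:>4}{:>8.3f}{:>8.3f}".format(*row))
```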

Class Data (Excitement)
3, 4, 2, 5, 1, 4, 5, 3, 2, 4, 3, 1, 5, 3, 5, 2, 4, 1
Viewed as scores on a continuum from 1 to 5, any categories/groupings are arbitrary.

Creating a Histogram
Histogram: Summarizes the frequency of continuous, grouped (or ungrouped) data.
Rules for creating a histogram:
Rule 1: Vertical rectangles represent each interval, and the height of the rectangle equals the frequency recorded for each interval.
Rule 2: The base of each rectangle begins and ends at the lower and upper boundaries of each interval.
Rule 3: Each rectangle touches adjacent rectangles at the boundaries of each interval.
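The three rules can be expressed as a small computation of bar geometry. The sketch below uses the Excitement intervals 1-2, 3-4, and 5-6 with frequencies 9, 61, and 10 from the tables that follow, and assumes whole-number scores, so each boundary sits half a unit beyond the stated limit. `real_limits` is a hypothetical helper name, and no plotting library is used.

```python
def real_limits(lower, upper, unit=1):
    """Extend stated interval limits by half a measurement unit on each side."""
    half = unit / 2
    return lower - half, upper + half

intervals = [(1, 2), (3, 4), (5, 6)]
frequencies = [9, 61, 10]

# Rules 1 and 2: one rectangle per interval, as (left edge, right edge, height).
bars = [
    (*real_limits(lower, upper), f)
    for (lower, upper), f in zip(intervals, frequencies)
]

# Rule 3: each rectangle's right edge is the next rectangle's left edge.
touching = all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))

for left, right, height in bars:
    print(f"{left} to {right}: height {height}")
print("adjacent bars touch:", touching)
```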

Intervals  f(x)
1-5        80
Total      80
Frequency = number of observations: 80 observations between 0.5 and 5.5 (i.e., all data). Not very useful.
Point out why the boundaries are set at .5s even though those values are not actual response options (clearer on the next slide).

Intervals  f(x)
5-6        10
3-4        61
1-2        9
Total      80
More informative: e.g., 61 observations between 2.5 and 4.5 (i.e., 3’s and 4’s).

Intervals  f(x)
5          10
4          17
3          44
2          6
1          3
Total      80
Even more informative: e.g., 10 observations between 4.5 and 5.5 (i.e., 5’s).
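The three Excitement tables differ only in interval width, which can be seen by regrouping the same 80 scores. The sketch below rebuilds the scores from the per-score counts (10, 17, 44, 6, 3 for scores 5 down to 1); `grouped_frequencies` is an illustrative helper, and it lists only intervals that contain at least one score.

```python
from collections import Counter

# Per-score counts from the finest table, expanded into 80 individual scores.
score_counts = {1: 3, 2: 6, 3: 44, 4: 17, 5: 10}
scores = [score for score, n in score_counts.items() for _ in range(n)]

def grouped_frequencies(values, start, width):
    """Count values per interval of the given width, highest interval first."""
    counts = Counter((v - start) // width for v in values)
    table = {}
    for k in sorted(counts, reverse=True):
        lower = start + k * width
        upper = lower + width - 1
        label = f"{lower}-{upper}" if width > 1 else str(lower)
        table[label] = counts[k]
    return table

for width in (5, 2, 1):
    print(f"width {width}:", grouped_frequencies(scores, start=1, width=width))
```

Narrower intervals reveal more of the distribution's shape: the single 1-5 interval only confirms that there are 80 observations.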

Relative Frequency = Frequency/Total, e.g., 10/80 = 0.125

Definitions
Frequency polygon: A figure that summarizes the frequency of continuous data at the midpoint of each interval.
Ogive: A figure that summarizes the cumulative frequency of continuous data at the upper boundary of each interval.
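To make the frequency polygon definition concrete, the sketch below computes the points that would be connected for the Excitement data grouped as 1-2, 3-4, and 5-6 (frequencies 9, 61, 10). The names are illustrative and nothing is plotted.

```python
intervals = [(1, 2), (3, 4), (5, 6)]
frequencies = [9, 61, 10]

# Each point sits above the midpoint of its interval, at the interval's frequency.
midpoints = [(lower + upper) / 2 for lower, upper in intervals]
polygon_points = list(zip(midpoints, frequencies))

for x, y in polygon_points:
    print(f"midpoint {x}: frequency {y}")
```

Joining these points with straight lines gives the polygon. An ogive would instead place cumulative frequencies above the upper boundaries 2.5, 4.5, and 6.5.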

Connect midpoints of each class/group

Frequency Polygon by itself

e.g., the cumulative frequency up to 3.5 is 3 + 6 + 44 = 53. The ogive always starts at 0 and ends at the total.

e.g., the cumulative relative frequency up to 3.5 is 53/80 = 0.6625. The relative frequency ogive always starts at 0 and ends at 1.
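The points of both ogives can be derived from the per-score Excitement counts (3, 6, 44, 17, 10 for scores 1 to 5). The sketch below assumes upper boundaries half a unit above each score and adds a starting point of 0 at the lowest boundary; the variable names are illustrative.

```python
from itertools import accumulate

score_counts = {1: 3, 2: 6, 3: 44, 4: 17, 5: 10}
ordered = sorted(score_counts)
total = sum(score_counts.values())

# Cumulative frequency, placed at the upper boundary of each interval.
upper_boundaries = [score + 0.5 for score in ordered]
cumulative = list(accumulate(score_counts[score] for score in ordered))

# The ogive starts at 0 at the lowest boundary and ends at the total.
ogive = [(ordered[0] - 0.5, 0)] + list(zip(upper_boundaries, cumulative))

# Dividing by the total gives the relative frequency ogive (0 to 1).
relative_ogive = [(x, count / total) for x, count in ogive]

for (x, count), (_, proportion) in zip(ogive, relative_ogive):
    print(f"up to {x}: {count:>3}  ({proportion:.4f})")
```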

Scatterplot
Scatterplot (called a scattergram in the book): A display of paired data points (x, y) that summarizes the relationship between two variables. Data points are plotted to see whether a pattern emerges.

Measures of Central Tendency
The “center” of a distribution.
A measure of central tendency is a statistical measure that tends toward the center of a distribution.
These measures are used to locate a single score that is most representative or descriptive of all the scores in a distribution.
They can help us know if the distribution tends to be composed of high or low scores.
The types: mean, median, mode.

Mode
Most frequent value in a data set.
53, 67, 75, 75, 84, 91, 91, 91, 94, 99: Mode = 91 (one mode)
53, 75, 75, 75, 84, 91, 91, 91, 94, 99: Mode = 75 and 91 (two modes)
53, 72, 73, 75, 84, 91, 92, 93, 94, 99: Mode = none (no mode)
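The three cases on this slide can be reproduced with a short function. The sketch below follows the slide's convention that a data set in which every value occurs once has no mode; `modes` is an illustrative name.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: treated here as "no mode"
    return sorted(v for v, n in counts.items() if n == highest)

print(modes([53, 67, 75, 75, 84, 91, 91, 91, 94, 99]))  # one mode
print(modes([53, 75, 75, 75, 84, 91, 91, 91, 94, 99]))  # two modes
print(modes([53, 72, 73, 75, 84, 91, 92, 93, 94, 99]))  # no mode
```

Python's `statistics.multimode` uses a different convention for the third case: when all values are unique it returns every value rather than an empty list.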
