Download presentation
Presentation is loading. Please wait.
Published byDarren Marshall Modified over 6 years ago
1
Section 1.4: Types of Data and Some Simple Graphical Displays
2
Definitions Variable – Is any characteristic whose value may change
Data – Result from making observations either on a single variable or simultaneously on two or more variables Univariate Data Set – Consists of observations on a single variable made on individuals in a sample or population
3
Univariate Data Sets Two Types of Univariate Data Sets:
Categorical Data Set – Individual observations are categorical responses (nonnumerical) Numerical Data Set – Individual observations are numerical (quantitative) in nature
4
Airline Safety Violations
The Federal Aviation Administration (FAA) monitors airlines and can take administrative actions for safety violations. Information about the tines assessed by the FAA appeared in the article “Just How Safe Is That Jet?” Violations that could lead to a fine were categorized as Security (S), Maintenance (M), Flight Operations (F), Hazardous Materials (h), or Other (O). Data for the variable type of violation for 20 administrative actions are given in the following list.
5
Airline Safety Violations List
S S M H M O S M S S F S O M S M S M S M What type of data set is this list? Because type of violation is nonnumerical, this is a categorical data set.
6
More than one variable If an observation consists of two or more responses or values: Two: Bivariate data set More than two: Multivariate data set
7
Revisiting Airline Safety Violations Example of Bivariate Data Set
The same article gave data on both the number of violations and the average fine per violation for 10 major airlines. The resulting data are given in the following table: Airline # of Violations Average Fine Alaska 258 America West 257 American 1745 Continental 973 Delta 1280 Northwest 1097 Southwest 535 TWA 642 United 1110 US Airways 891
8
Numerical Data Two types of Numerical Data
Discrete – If the possible values are isolated points on the number line Continuous – if the set of possible values forms an entire interval on the number line
9
Definitions Frequency Distribution for Categorical Data – A table that displays the possible categories along with the associated frequencies or relative frequencies Frequency – For a particular category is the number of times the category appears in the data set Relative Frequency – For a particular category is the fraction or proportion of the time that the category appears in the data set.
10
Definitions continued
Relative Frequency is calculated as: When the table includes relative frequencies it is sometimes referred to as a relative frequency distribution.
11
Preferred Leisure Activities
Below is a report on physical activity patterns in urban woman. The following coding is used: W = walking, T = weight training, C = cycling, G = gardening, A = aerobics W T A W G T W W C W T W A T T W G W W C A W A W W W T W W T The corresponding frequency distribution is given.
12
Frequency Distribution for Preferred Activity
13
Bar Charts A graph of the frequency distribution of categorical data.
How to construct 1. Draw a horizontal line, and write the category names or labels below the line at regularly spaced intervals. 2. Draw a vertical line and label the scale using either frequency or relative frequency. 3. Place a rectangular bar above each category label. The height is determined by the frequency and weight should be the same.
14
Why students drop out An article examined the reasons that college seniors leave before graduating. 42 seniors that dropped out were interviewed. Here is the data from the interview. Reason for leaving Frequency Academic Problems 7 Poor teaching 3 Needed a break 2 Economic Reasons 11 Family Responsibilities 4 To Attend another school 9 Personal Problems Other
15
Bar Chart for why students drop out
16
Dotplots A picture of numerical data in which each observation is represented by a dot on or above a horizontal measurement scale. How to construct 1. Draw a horizontal line and mark it with an appropriate measurement scale 2. Locate each value in the data set along the measurement scale, and represent it by a dot. If there are two or more observations with the same value, stack the dots vertically.
17
Graduation Rates for NCAA Division I Schools in California and Texas
The data is of freshmen who earned a bachelor's degree by 6 years. California: Texas:
18
Dotplot of Graduation Rates for California and Texas
19
Activity: Head Sizes
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.