Published by Audra Walton. Modified over 9 years ago.
1
Math 227 Statistics
2
Chapter 1
3
Outline
1 The Nature of Probability and Statistics
1-1 Descriptive and Inferential Statistics
1-2 Variables and Types of Data
1-3 Data Collection and Sampling Techniques
1-4 Observational and Experimental Studies
1-5 Uses and Misuses of Statistics
1-6 Computers and Calculators
4
1-1 Descriptive and Inferential Statistics
Descriptive statistics consists of the collection, organization, summarization, and presentation of data.
Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.
Bluman, Chapter 1
5
Example 1: Determine whether the results given are examples of descriptive or inferential statistics.
a) In the 1996 presidential election, voters in Massachusetts cast 1,571,763 votes for Bill Clinton, 718,107 for Bob Dole, and 227,217 for H. Ross Perot.
Descriptive Statistics
b) Massachusetts Institute of Technology Professor Richard Larson studies the physics and psychology of queues. He estimates that people spend an average of 30 minutes a day in line.
Inferential Statistics
6
II. Parameter and Statistic
Population – consists of all subjects (human or otherwise) that are being studied.
Sample – a group of subjects selected from a population (subset).
Parameter – a characteristic or measure obtained by using all the data values for a specific population.
Statistic – a characteristic or measure obtained by using the data values from a sample.
7
Example 1: A national organization of personnel managers has estimated that about 25% of all resumes contain a major fabrication. Is 25% the value of a parameter or a statistic?
Statistic
8
Example 2: Consider the problem of estimating the average grade point average (GPA) of the 750 seniors at a college.
a) What is the population? How many data values are in the population?
Population – seniors at a college
Data values – 750
b) What is the parameter of interest?
Their average GPA
9
c) Suppose that a sample of 10 seniors is selected, and their GPAs are 2.72, 2.81, 2.65, 2.69, 3.17, 2.74, 2.57, 2.17, 3.48, 3.10. Calculate a statistic that you would use to estimate the parameter.
Sample mean = 28.10 / 10 = 2.81
d) Suppose that another sample of 10 seniors was selected. Would it be likely that the value of the statistic is the same as in part (c)? Why or why not? Would the value of the parameter remain the same?
No, because another group of 10 seniors would have different GPAs.
Yes, the parameter would be the same because we’re still looking at the GPA of all seniors.
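The arithmetic behind part (c) can be checked with a short sketch. It assumes the statistic in question is the sample mean; the variable names are illustrative only.

```python
from statistics import mean

# GPAs of the 10 sampled seniors from part (c)
sample_gpas = [2.72, 2.81, 2.65, 2.69, 3.17, 2.74, 2.57, 2.17, 3.48, 3.10]

# The sample mean is a statistic: it is computed from the sample only,
# and it is used to estimate the parameter (the mean GPA of all 750 seniors)
sample_mean = mean(sample_gpas)
print(round(sample_mean, 2))
```

A different sample of 10 seniors would generally give a different value here, while the population mean stays fixed.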
10
Section 1 - 2
I. Variables
A variable – a characteristic that changes for different individuals or objects under study.
Variables can be classified as qualitative (categorical) or quantitative (numerical).
Quantitative variables can be further classified as discrete or continuous.
Discrete variables assume values that can be counted (e.g., number of books, number of desks).
Continuous variables assume all values between any two specific values (e.g., length, time).
11
1-2 Variables and Types of Data
Data
  Qualitative (categorical)
  Quantitative (numerical, can be ranked)
    Discrete (countable): 5, 29, 8000, etc.
    Continuous (can be decimals): 2.59, 312.1, etc.
12
1-2 Recorded Values and Boundaries
Variable | Recorded Value | Boundaries
Length | 15 centimeters (cm) | 14.5-15.5 cm
Temperature | 86 degrees Fahrenheit (°F) | 85.5-86.5 °F
Time | 0.43 second (sec) | 0.425-0.435 sec
Mass | 1.6 grams (g) | 1.55-1.65 g
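The pattern in the table (each boundary lies half a unit beyond the last recorded decimal place) can be sketched as follows. The function name `boundaries` is hypothetical, and the recorded value is passed as text so that the number of decimal places is preserved.

```python
from decimal import Decimal

def boundaries(recorded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) boundaries for a recorded value given as text."""
    value = Decimal(recorded)
    # Half of one unit in the last recorded decimal place,
    # e.g. "15" -> 0.5, "1.6" -> 0.05, "0.43" -> 0.005
    half_unit = Decimal(1).scaleb(value.as_tuple().exponent) / 2
    return value - half_unit, value + half_unit

for recorded in ["15", "86", "0.43", "1.6"]:
    low, high = boundaries(recorded)
    print(recorded, "->", low, "-", high)
```

Using `Decimal` rather than floats keeps values such as 0.425 exact, which matters when the result is compared against the table.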
13
Example 1: Classify each variable as qualitative or quantitative. If the variable is quantitative, further classify it as discrete or continuous.
a) Number of people in the classroom
Quantitative – Discrete, because the number of people can be counted.
b) Weights of newborn babies in a hospital
Quantitative – Continuous, because the measurements can take any value within a range.
c) Eye colors of students in Math 227
Qualitative
14
II. Measurement Scales
Nominal level of measurement – categorical data in which no ordering or ranking can be imposed on the data (e.g., eye colors).
Ordinal level of measurement – categorical data that can be ranked (e.g., rating scale: poor, good, excellent).
15
1-2 Variables and Types of Data
Determine the measurement level.
Variable | Nominal | Ordinal | Interval | Ratio | Level
Hair Color | Yes | No | | | Nominal
Zip Code | Yes | No | | | Nominal
Letter Grade | Yes | Yes | No | | Ordinal
ACT Score | Yes | Yes | Yes | No | Interval
Height | Yes | Yes | Yes | Yes | Ratio
Age | Yes | Yes | Yes | Yes | Ratio
Temperature (F) | Yes | Yes | Yes | No | Interval
16
Interval level of measurement – numerical data that can be ranked; the differences between units of measure do exist; however, there is no true zero (e.g., sea level, temperature).
Ratio level of measurement – numerical data that can be ranked. The differences and ratios between units of measure do exist, and there exists a true zero.
17
Example 1: Classify each as nominal-level, ordinal-level, interval-level, or ratio-level data.
a) Sizes of cars
Categorical – ordinal
b) Nationality of each student
Categorical – nominal
c) IQ of each student
Numerical – interval
d) Weights of newborn babies
Numerical – ratio
18
1-3 Data Collection and Sampling Techniques
Some Sampling Techniques
Random – random number generator
Systematic – every kth subject
Stratified – divide population into “layers”
Cluster – use intact groups
Convenience – mall surveys
19
II. Methods of Sampling
Random Sampling – each experimental unit has an equal chance of being selected (e.g., a lottery).
Systematic Sampling – an initial experimental unit is randomly selected, then every kth unit is chosen for sampling.
e.g. A quality control engineer selects every 200th TV remote control from an assembly line and conducts a quality test.
20
Stratified Sampling – the population is divided into subgroups (or strata) that share the same characteristics, then a sample from each subgroup (or stratum) is selected.
e.g. A General Motors research team partitioned all registered cars into categories of subcompact, compact, mid-sized, and full-size. The team surveyed 200 car owners from each category.
21
Cluster Sampling – the population area is divided into sections (or clusters), some of those clusters are randomly selected, and then a sample or all of the members from the selected clusters are chosen.
e.g. Two of nine colleges in the L.A. district are randomly selected, then all faculty from the two selected colleges are interviewed.
Convenience Sampling – use results that are very easy to get.
e.g. An NBC television news reporter gets a reaction to a breaking story by polling people as they pass the front of his studio.
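The four probability-based methods above can be sketched in a few lines each. This is a minimal illustration, not a survey-grade implementation: the function names, the numbered population, and the seed are all assumptions made for the example. Convenience sampling is omitted because it involves no random selection.

```python
import random

def simple_random_sample(population, n, rng):
    # Every unit has an equal chance of being selected
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting point among the first k units, then every kth unit
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # strata: dict mapping stratum name -> list of units in that stratum
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters, then keep every member of each one
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(227)          # fixed seed (arbitrary) for repeatability
units = list(range(1, 1001))      # hypothetical population of 1000 units
print(len(simple_random_sample(units, 50, rng)))
print(len(systematic_sample(units, 200, rng)))  # every 200th unit
```

The structural difference to notice is that stratified sampling draws from every subgroup, while cluster sampling keeps only some subgroups but takes all of their members.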
22
Example 1: Identify which of these types of sampling is used: random, systematic, convenience, stratified, or cluster.
a) A marketing expert for MTV is planning a survey in which 500 people will be randomly selected from each age group of 10-19, 20-29, and so on.
Stratified
b) A news reporter stands on a street corner and obtains a sample of city residents by asking five passing adults about their smoking habits.
Convenience
c) In a Gallup poll of 1059 adults, the interview subjects were selected by using a computer to randomly generate telephone numbers that were then called.
Random
23
d) At a police sobriety checkpoint, every 10th driver was stopped and interviewed.
Systematic
e) A market researcher randomly selects 10 blocks in the Village of Newport, then asks all adult residents of the selected blocks whether they own a DVD player.
Cluster
f) General Foods plans to conduct a market survey of 100 men and 100 women in Orange County.
Stratified
g) CNN is planning an exit poll in which 100 polling stations will be randomly selected and all voters will be interviewed as they leave the premises.
Cluster
24
h) An executive mixes all the returned surveys in a bin, then obtains a sample group by pulling 50 of those surveys.
Random
i) The Dutchess County Commissioner of Jurors obtains a list of 42,763 car owners and constructs a pool of jurors by selecting every 150th name on that list.
Systematic
25
1-4 Observational and Experimental Studies
In an observational study, the researcher merely observes and tries to draw conclusions based on the observations.
In an experimental study, the researcher manipulates the independent (explanatory) variable and tries to determine how the manipulation influences the dependent (outcome) variable.
A confounding variable influences the dependent variable but cannot be separated from the independent variable.
26
Section 1 - 4
I. Observational and Experimental Studies
Observational Study – The experimenter records the outcomes of an experiment without control.
Experimental Study – The experimenter intervenes by administering treatment to the subjects in order to study its effect on the subjects.
27
An Independent Variable – the variable that is being manipulated by the researcher.
A Dependent Variable – the outcome variable.
A Treatment Group – the group that is being treated.
A Control Group – the group that is not being treated.
Confounding Factors – factors other than the treatment that can influence a study.
28
Example 1: Lipitor is a drug that is supposed to lower the cholesterol level. To test the effectiveness of the drug, 100 patients were randomly selected and 50 were randomly chosen to use Lipitor. The other 50 were given a placebo that contained no drug at all.
a) What is the treatment?
Lipitor
b) Identify the treatment group and the control group.
Treatment group – The group given Lipitor.
Control group – The group given a placebo.
c) Is this an observational or experimental study?
Experimental
d) What factor could confound the result?
Changes in eating habits, diet, exercise, smoking, genes.
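The random assignment described in the Lipitor example can be sketched as a shuffle followed by a split. The patient identifiers and the seed below are hypothetical; only the 100-patient, 50/50 design comes from the example.

```python
import random

rng = random.Random(0)  # fixed seed (arbitrary) so the split is repeatable
patients = [f"patient_{i:03d}" for i in range(1, 101)]  # hypothetical IDs

# Shuffle a copy of the list, then split it in half
shuffled = patients[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:50]  # would receive Lipitor
control_group = shuffled[50:]    # would receive the placebo

print(len(treatment_group), len(control_group))
```

Because chance alone decides who lands in each group, confounding factors such as diet or exercise tend to be spread across both groups rather than concentrated in one.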
29
1-5 Uses and Misuses of Statistics
Suspect Samples
Is the sample large enough?
How was the sample selected?
Is the sample representative of the population?
Ambiguous Averages
What particular measure of average was used and why?
30
1-5 Uses and Misuses of Statistics
Changing the Subject
Are different values used to represent the same data?
Detached Statistics
One third fewer calories... than what?
Implied Connections
Studies suggest that some people may understand what this statement means.
31
1-5 Uses and Misuses of Statistics
Misleading Graphs
Are the scales for the x-axis and y-axis appropriate for the data?
Faulty Survey Questions
Do you feel that statistics teachers should be paid higher salaries?
Do you favor increasing tuition so that colleges can pay statistics teachers higher salaries?
32
Section 1 - 5
I. Bias
Statistics can be misused in ways that are deceptive:
1) Using samples that are not representative of the population.
2) Using a questionnaire or interview process that is flawed.
3) Basing conclusions on samples that are far too small.
4) Using graphs that produce a misleading impression.
33
Outline
2 Frequency Distributions and Graphs
2-1 Organizing Data
2-2 Histograms, Frequency Polygons, and Ogives
2-3 Other Types of Graphs
2-4 Paired Data and Scatter Plots
34
Objectives
2 Frequency Distributions and Graphs
1 Organize data using a frequency distribution.
2 Represent data in frequency distributions graphically using histograms, frequency polygons, and ogives.
3 Represent data using bar graphs, Pareto charts, time series graphs, and pie graphs.
4 Draw and interpret a stem and leaf plot.
5 Draw and interpret a scatter plot for a set of paired data.
35
Section 2 - 1 The most convenient method of organizing the data is to construct a frequency distribution. The most useful method of presenting the data is by constructing statistical graphs.
36
2-1 Organizing Data Data collected in original form is called raw data. A frequency distribution is the organization of raw data in table form, using classes and frequencies. Nominal- or ordinal-level data that can be placed in categories is organized in categorical frequency distributions. 36 Bluman, Chapter 2
37
Categorical Frequency Distribution
Twenty-five army inductees were given a blood test to determine their blood type.
Raw Data:
A, B, B, AB, O
O, O, B, AB, B
B, B, O, A, O
A, O, O, O, AB
AB, A, O, B, A
Construct a frequency distribution for the data.
38
Categorical Frequency Distribution
Class | Frequency | Percent
A | 5 | 20
B | 7 | 28
O | 9 | 36
AB | 4 | 16
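The blood-type table can be reproduced by counting each category and dividing by the total. This is a minimal sketch using the standard-library `Counter`; the variable names are illustrative.

```python
from collections import Counter

# Raw data for the 25 inductees, in the order given on the slide
raw = ("A B B AB O  O O B AB B  B B O A O  "
       "A O O O AB  AB A O B A").split()

counts = Counter(raw)
n = len(raw)

for blood_type in ["A", "B", "O", "AB"]:
    f = counts[blood_type]
    # Percent = frequency / total number of values * 100
    print(f"{blood_type:>2}  {f:>2}  {100 * f / n:.0f}%")
```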
39
Section 2 - 1
I. Categorical Frequency Distributions – count how many times each distinct category has occurred and summarize the results in a table format.
Example 1: Letter grades for Math 227, Spring 2005:
C A B C D F B B A C C F C B D A C C C F C C
Construct a frequency distribution for the categorical data.
Letter Grade | Frequency
A | 3
B | 4
C | 10
D | 2
F | 3
40
II. Ungrouped Frequency Distribution – count how many times each distinct value has occurred and summarize the results in a table format.
41
Example 2: The number of incoming telephone calls per day over the first 25 days of business:
4, 4, 1, 10, 12, 6, 4, 6, 9, 12, 12, 1, 1, 1, 12, 10, 4, 6, 4, 8, 8, 9, 8, 4, 1
(a) Construct an ungrouped frequency distribution.
Number of Calls | Frequency
1 | 5
2 | 0
3 | 0
4 | 6
5 | 0
6 | 3
7 | 0
8 | 3
9 | 2
10 | 2
11 | 0
12 | 4
42
(b) What is the percentage of days on which there were fewer than 8 telephone calls?
Total number of days = 25
Days with fewer than 8 telephone calls = 14
Percentage = 14 / 25 = 0.56 = 56%
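Both parts of the telephone-call example can be checked with a short sketch. It lists every value from 1 to 12 so that values with zero frequency still appear, as they do in the table; the variable names are illustrative.

```python
from collections import Counter

calls = [4, 4, 1, 10, 12, 6, 4, 6, 9, 12, 12, 1, 1,
         1, 12, 10, 4, 6, 4, 8, 8, 9, 8, 4, 1]

freq = Counter(calls)
# Part (a): one row per value from 1 to 12, including values never observed
table = {value: freq.get(value, 0) for value in range(1, 13)}

# Part (b): add the frequencies of all values below 8, then divide by the total
days_under_8 = sum(f for value, f in table.items() if value < 8)
percentage = 100 * days_under_8 / len(calls)

print(table)
print(days_under_8, percentage)
```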
43
Grouped Frequency Distribution
Grouped frequency distributions are used when the range of the data is large.
The smallest and largest possible data values in a class are the lower and upper class limits.
Class boundaries separate the classes. To find a class boundary, average the upper class limit of one class and the lower class limit of the next class.
44
Grouped Frequency Distribution
The class width can be calculated by subtracting
  successive lower class limits (or boundaries)
  successive upper class limits (or boundaries)
  upper and lower class boundaries
The class midpoint Xm can be calculated by averaging
  upper and lower class limits (or boundaries)
45
Rules for Classes in Grouped Frequency Distributions
1. There should be 5-20 classes.
2. The class width should be an odd number.
3. The classes must be mutually exclusive.
4. The classes must be continuous.
5. The classes must be exhaustive.
6. The classes must be equal in width (except in open-ended distributions).
46
Example 1: Construct a grouped frequency table for the following data values.
44, 32, 35, 38, 35, 39, 42, 36, 36, 40, 51, 58, 58, 62, 63, 72, 78, 81, 25, 84, 20
1. Let the number of classes be 5.
2. Range = High – Low = 84 – 20 = 64
Class width = 64 / 5 = 12.8 ≈ 13 (round up)
Reorder: 20, 25, 32, 35, 35, 36, 36, 38, 39, 40, 42, 44, 51, 58, 58, 62, 63, 72, 78, 81, 84
Each lower class limit is the previous one plus 13.
Class Limit | Frequency
20 – 32 | 3
33 – 45 | 9
46 – 58 | 3
59 – 71 | 2
72 – 84 | 4
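The steps of this example (range, rounded-up width, class limits, counts) can be sketched as below. The sketch assumes whole-number data, so each upper limit is one less than the next lower limit; variable names are illustrative.

```python
import math

data = [44, 32, 35, 38, 35, 39, 42, 36, 36, 40, 51,
        58, 58, 62, 63, 72, 78, 81, 25, 84, 20]
num_classes = 5

low, high = min(data), max(data)
# Range / number of classes = 64 / 5 = 12.8, rounded up to 13
width = math.ceil((high - low) / num_classes)

table = []
for i in range(num_classes):
    lower = low + i * width
    upper = lower + width - 1  # whole-number data: limits are one unit apart
    frequency = sum(lower <= x <= upper for x in data)
    table.append((lower, upper, frequency))

for lower, upper, frequency in table:
    print(f"{lower} - {upper}: {frequency}")
```

Note that `math.ceil` leaves the width unchanged when the range divides evenly, which matches the rule of rounding up only when there is a remainder.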
47
IV. Class Boundaries, Class Mark, and Relative Frequency
Class Boundaries – closing the gap between one class and the next class. The class limits should have the same decimal place value as the data, but the class boundaries have an additional place value and end with a 5.
e.g. data are whole numbers
lower class boundary = lower class limit – 0.5
upper class boundary = upper class limit + 0.5
e.g. data have one decimal place
lower class boundary = lower class limit – 0.05
upper class boundary = upper class limit + 0.05
48
e.g. data have two decimal places
lower class boundary = lower class limit – 0.005
upper class boundary = upper class limit + 0.005
Class Mark – the midpoint of each class
Class Mark = (lower class limit + upper class limit) / 2
Cumulative frequency – the sum of the frequencies accumulated up to the upper boundary of a class.
Relative frequency – the frequency of each class divided by the total number.
Relative frequency = f / n
49
Example 1 : Complete the table
50
Chapter 2 Frequency Distributions and Graphs
Section 2-1 Example 2-2 Page #41
51
Constructing a Grouped Frequency Distribution
The following data represent the record high temperatures for each of the 50 states. Construct a grouped frequency distribution for the data using 7 classes.
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
52
Constructing a Grouped Frequency Distribution
STEP 1 Determine the classes.
Find the class width by dividing the range by the number of classes, 7.
Range = High – Low = 134 – 100 = 34
Width = Range/7 = 34/7 ≈ 4.9, which rounds up to 5
Rounding Rule: Always round up if there is a remainder.
53
Constructing a Grouped Frequency Distribution
For convenience sake, we will choose the lowest data value, 100, for the first lower class limit. The subsequent lower class limits are found by adding the width to the previous lower class limits.
The first upper class limit is one less than the next lower class limit. The subsequent upper class limits are found by adding the width to the previous upper class limits.
Class Limits
100 - 104
105 - 109
110 - 114
115 - 119
120 - 124
125 - 129
130 - 134
54
Constructing a Grouped Frequency Distribution
The class boundary is midway between an upper class limit and the subsequent lower class limit (for example, 104.5 is midway between 104 and 105).
Class Limits | Class Boundaries
100 - 104 | 99.5 - 104.5
105 - 109 | 104.5 - 109.5
110 - 114 | 109.5 - 114.5
115 - 119 | 114.5 - 119.5
120 - 124 | 119.5 - 124.5
125 - 129 | 124.5 - 129.5
130 - 134 | 129.5 - 134.5
55
Constructing a Grouped Frequency Distribution
STEP 2 Tally the data.
STEP 3 Find the frequencies.
Class Limits | Class Boundaries | Frequency
100 - 104 | 99.5 - 104.5 | 2
105 - 109 | 104.5 - 109.5 | 8
110 - 114 | 109.5 - 114.5 | 18
115 - 119 | 114.5 - 119.5 | 13
120 - 124 | 119.5 - 124.5 | 7
125 - 129 | 124.5 - 129.5 | 1
130 - 134 | 129.5 - 134.5 | 1
56
Constructing a Grouped Frequency Distribution
STEP 4 Find the cumulative frequencies by keeping a running total of the frequencies.
Class Limits | Class Boundaries | Frequency | Cumulative Frequency
100 - 104 | 99.5 - 104.5 | 2 | 2
105 - 109 | 104.5 - 109.5 | 8 | 10
110 - 114 | 109.5 - 114.5 | 18 | 28
115 - 119 | 114.5 - 119.5 | 13 | 41
120 - 124 | 119.5 - 124.5 | 7 | 48
125 - 129 | 124.5 - 129.5 | 1 | 49
130 - 134 | 129.5 - 134.5 | 1 | 50
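Steps 1 through 4 for the record-high-temperature data can be sketched end to end. The class width of 5 and the starting limit of 100 come from the example; the variable names are illustrative, and `itertools.accumulate` is used here for the running total.

```python
from itertools import accumulate

temps = [112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
         110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
         107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
         116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
         120, 113, 120, 117, 105, 110, 118, 112, 114, 114]

width = 5
# Class limits: 100-104, 105-109, ..., 130-134
limits = [(lower, lower + width - 1) for lower in range(100, 135, width)]
frequencies = [sum(lo <= t <= hi for t in temps) for lo, hi in limits]
cumulative = list(accumulate(frequencies))  # running total of the frequencies

for (lo, hi), f, cf in zip(limits, frequencies, cumulative):
    # Boundaries sit half a unit outside the whole-number limits
    print(f"{lo}-{hi}  {lo - 0.5}-{hi + 0.5}  {f:>2}  {cf:>2}")
```

The last cumulative frequency should equal the number of data values, which is a quick check that no value was missed or counted twice.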
57
2-2 Histograms, Frequency Polygons, and Ogives The histogram is a graph that displays the data by using vertical bars of various heights to represent the frequencies of the classes. The class boundaries are represented on the horizontal axis.
58
Section 2 - 3
Histogram – a graph that displays the data by using contiguous vertical bars.
x-axis: class boundaries
y-axis: frequency
Polygon – a graph that displays data by using lines that connect points plotted for the frequencies at the midpoints of the classes.
x-axis: midpoints
y-axis: frequency
Ogive – a line graph that represents the cumulative frequencies for the classes in a frequency distribution.
x-axis: class boundaries
y-axis: cumulative frequency
Relative Frequency Graphs – use relative frequency instead of frequency.
59
Example 1: The following data are the numbers of English-language Sunday newspapers per state in the United States as of February 1, 1996.
2 3 3 4 4 4 4 4 5 6
6 6 7 7 7 8 10 11 11 11
12 12 13 14 14 14 15 15 16 16
16 16 16 16 18 18 19 21 21 23
27 31 35 37 38 39 40 44 62 85
60
a) Using 1 as the starting value and a class width of 15, construct a grouped frequency distribution.
61
b) Construct a histogram for the grouped frequency distribution. (x-axis: class boundaries; y-axis: frequency) c) Construct a frequency polygon (x-axis: class marks; y-axis: frequency)
62
d) Construct an ogive (x-axis: class boundaries; y-axis: cumulative frequency)
63
e) Construct a (i) relative frequency histogram, (ii) relative frequency polygon, and (iii) relative cumulative frequency ogive.
(i) Histogram using Relative Frequencies (x-axis: class boundaries; y-axis: relative frequency)
64
(ii) Polygon using Relative Frequency (x-axis: class marks; y-axis: relative frequency)
65
(iii) Ogive using Relative Cumulative Frequency (x-axis: class boundaries; y-axis: relative cumulative frequency)
66
Chapter 2 Frequency Distributions and Graphs
Section 2-2 Example 2-4 Page #51
67
Histograms
Construct a histogram to represent the data for the record high temperatures for each of the 50 states (see Example 2–2 for the data).
68
Histograms
Histograms use class boundaries and frequencies of the classes.
Class Limits | Class Boundaries | Frequency
100 - 104 | 99.5 - 104.5 | 2
105 - 109 | 104.5 - 109.5 | 8
110 - 114 | 109.5 - 114.5 | 18
115 - 119 | 114.5 - 119.5 | 13
120 - 124 | 119.5 - 124.5 | 7
125 - 129 | 124.5 - 129.5 | 1
130 - 134 | 129.5 - 134.5 | 1
69
Histograms
Histograms use class boundaries and frequencies of the classes.
70
2.2 Histograms, Frequency Polygons, and Ogives
The frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the class midpoints. The frequencies are represented by the heights of the points. The class midpoints are represented on the horizontal axis.
71
Chapter 2 Frequency Distributions and Graphs
Section 2-2 Example 2-5 Page #53
72
Frequency Polygons
Construct a frequency polygon to represent the data for the record high temperatures for each of the 50 states (see Example 2–2 for the data).
73
Frequency Polygons
Frequency polygons use class midpoints and frequencies of the classes.
Class Limits | Class Midpoints | Frequency
100 - 104 | 102 | 2
105 - 109 | 107 | 8
110 - 114 | 112 | 18
115 - 119 | 117 | 13
120 - 124 | 122 | 7
125 - 129 | 127 | 1
130 - 134 | 132 | 1
74
Frequency Polygons
Frequency polygons use class midpoints and frequencies of the classes. A frequency polygon is anchored on the x-axis before the first class and after the last class.
75
2.2 Histograms, Frequency Polygons, and Ogives
The ogive is a graph that represents the cumulative frequencies for the classes in a frequency distribution. The upper class boundaries are represented on the horizontal axis.
76
Chapter 2 Frequency Distributions and Graphs
Section 2-2 Example 2-6 Page #54
77
Ogives
Construct an ogive to represent the data for the record high temperatures for each of the 50 states (see Example 2–2 for the data).
78
Ogives
Ogives use upper class boundaries and cumulative frequencies of the classes.
Class Limits | Class Boundaries | Frequency | Cumulative Frequency
100 - 104 | 99.5 - 104.5 | 2 | 2
105 - 109 | 104.5 - 109.5 | 8 | 10
110 - 114 | 109.5 - 114.5 | 18 | 28
115 - 119 | 114.5 - 119.5 | 13 | 41
120 - 124 | 119.5 - 124.5 | 7 | 48
125 - 129 | 124.5 - 129.5 | 1 | 49
130 - 134 | 129.5 - 134.5 | 1 | 50
79
Ogives
Ogives use upper class boundaries and cumulative frequencies of the classes.
Class Boundaries | Cumulative Frequency
Less than 104.5 | 2
Less than 109.5 | 10
Less than 114.5 | 28
Less than 119.5 | 41
Less than 124.5 | 48
Less than 129.5 | 49
Less than 134.5 | 50
80
Ogives
Ogives use upper class boundaries and cumulative frequencies of the classes.
81
Constructing Statistical Graphs
Step 1 Draw and label the x and y axes.
Step 2 Choose a suitable scale for the frequencies or cumulative frequencies, and label it on the y axis.
Step 3 Represent the class boundaries for the histogram or ogive, or the midpoint for the frequency polygon, on the x axis.
Step 4 Plot the points and then draw the bars or lines.
82
2.2 Histograms, Frequency Polygons, and Ogives
If proportions are used instead of frequencies, the graphs are called relative frequency graphs.
Relative frequency graphs are used when the proportion of data values that fall into a given class is more important than the actual number of data values that fall into that class.
83
Chapter 2 Frequency Distributions and Graphs
Section 2-2 Example 2-7 Page #57
84
Construct a histogram, frequency polygon, and ogive using relative frequencies for the distribution (shown here) of the miles that 20 randomly selected runners ran during a given week.
Class Boundaries | Frequency
5.5 - 10.5 | 1
10.5 - 15.5 | 2
15.5 - 20.5 | 3
20.5 - 25.5 | 5
25.5 - 30.5 | 4
30.5 - 35.5 | 3
35.5 - 40.5 | 2
85
Histograms
The following is a frequency distribution of miles run per week by 20 selected runners. Divide each frequency by the total frequency to get the relative frequency.
Class Boundaries | Frequency | Relative Frequency
5.5 - 10.5 | 1 | 1/20 = 0.05
10.5 - 15.5 | 2 | 2/20 = 0.10
15.5 - 20.5 | 3 | 3/20 = 0.15
20.5 - 25.5 | 5 | 5/20 = 0.25
25.5 - 30.5 | 4 | 4/20 = 0.20
30.5 - 35.5 | 3 | 3/20 = 0.15
35.5 - 40.5 | 2 | 2/20 = 0.10
Σf = 20, Σrf = 1.00
86
Histograms
Use the class boundaries and the relative frequencies of the classes.
87
Frequency Polygons
The following is a frequency distribution of miles run per week by 20 selected runners.
Class Boundaries | Class Midpoints | Relative Frequency
5.5 - 10.5 | 8 | 0.05
10.5 - 15.5 | 13 | 0.10
15.5 - 20.5 | 18 | 0.15
20.5 - 25.5 | 23 | 0.25
25.5 - 30.5 | 28 | 0.20
30.5 - 35.5 | 33 | 0.15
35.5 - 40.5 | 38 | 0.10
88
Frequency Polygons
Use the class midpoints and the relative frequencies of the classes.
89
Ogives
The following is a frequency distribution of miles run per week by 20 selected runners.
Class Boundaries | Frequency | Cumulative Frequency | Cum. Rel. Frequency
5.5 - 10.5 | 1 | 1 | 1/20 = 0.05
10.5 - 15.5 | 2 | 3 | 3/20 = 0.15
15.5 - 20.5 | 3 | 6 | 6/20 = 0.30
20.5 - 25.5 | 5 | 11 | 11/20 = 0.55
25.5 - 30.5 | 4 | 15 | 15/20 = 0.75
30.5 - 35.5 | 3 | 18 | 18/20 = 0.90
35.5 - 40.5 | 2 | 20 | 20/20 = 1.00
Σf = 20
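The relative and cumulative relative frequencies for the runners' data can be checked with a short sketch. The class boundaries and frequencies are taken from the example; the variable names are illustrative.

```python
from itertools import accumulate

boundaries = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
              (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
frequencies = [1, 2, 3, 5, 4, 3, 2]

n = sum(frequencies)                        # total frequency
relative = [f / n for f in frequencies]     # relative frequency = f / n
cumulative = list(accumulate(frequencies))  # running total of frequencies
# Divide the integer running totals by n rather than summing the
# rounded relative frequencies, so rounding error does not accumulate
cumulative_relative = [cf / n for cf in cumulative]

for (lo, hi), rf, crf in zip(boundaries, relative, cumulative_relative):
    print(f"{lo}-{hi}  {rf:.2f}  {crf:.2f}")
```

The final cumulative relative frequency is 1.00 by construction, which is why a relative-frequency ogive always ends at 1 on the vertical axis.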
90
Ogives
Ogives use upper class boundaries and cumulative frequencies of the classes.
Class Boundaries | Cum. Rel. Frequency
Less than 10.5 | 0.05
Less than 15.5 | 0.15
Less than 20.5 | 0.30
Less than 25.5 | 0.55
Less than 30.5 | 0.75
Less than 35.5 | 0.90
Less than 40.5 | 1.00
91
Ogives
Use the upper class boundaries and the cumulative relative frequencies.
92
Shapes of Distributions
93
Shapes of Distributions
94
2.3 Other Types of Graphs
Bar Graphs
95
2.3 Other Types of Graphs
Pareto Charts
96
2.3 Other Types of Graphs
Time Series Graphs
97
2.3 Other Types of Graphs
Pie Graphs
98
Section 2 – 4 Graphs Related to Categorical Data
I. Pareto Chart
x-axis: categorical variables
y-axis: frequencies, which are arranged in order from highest to lowest
II. Pie Graph
A pie graph is a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution.
99
Example 1: Grades received for Math 227:
C A B B D C C C C B B A F F
Grade | Frequency
A | 2
B | 4
C | 5
D | 1
F | 2
a) Construct a Pareto chart.
Arrange the frequencies in descending order.
Grade | Frequency
C | 5
B | 4
A | 2
F | 2
D | 1
101
b) Construct a pie graph
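The numbers behind both graphs can be sketched as follows: the Pareto ordering for part (a), and the share of the circle for each wedge in part (b). Converting each share to degrees (share × 360) is a common convention assumed here, not something stated on the slide; ties are broken alphabetically, which is also an assumption.

```python
from collections import Counter

grades = "C A B B D C C C C B B A F F".split()
counts = Counter(grades)
n = len(grades)

# Pareto order: highest frequency first (ties broken alphabetically here)
pareto = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for grade, f in pareto:
    percent = 100 * f / n   # wedge size as a percentage of the circle
    degrees = 360 * f / n   # the same wedge expressed in degrees
    print(f"{grade}  {f}  {percent:.1f}%  {degrees:.1f} deg")
```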
103
III. Time Series Graph
A time series graph represents data that occur over a specific period of time.
Example 1: The percentages of voters voting in the last 5 Presidential elections are shown here. Construct a time series graph.
Year | % of voters voting
1984 | 74.63%
1988 | 72.48%
1992 | 78.01%
1996 | 65.97%
2000 | 67.50%
104
2.3 Other Types of Graphs Stem and Leaf Plots A stem and leaf plot is a data plot that uses part of a data value as the stem and part of the data value as the leaf to form groups or classes. It has the advantage over grouped frequency distribution of retaining the actual data while showing them in graphic form.
105
IV. Stem and Leaf Plot
The digits of each data value to the left of the vertical bar are called the stems.
The digits of each data value to the right of the appropriate stem are called the leaves.
Example 1: The test scores on a 100-point test were recorded for 20 students:
61 93 91 86 55 63 86 82 76 57 94 89 67 62 72 87 68 65 75 84
Construct an ordered stem-and-leaf plot.
Reorder the data:
55 57 61 62 63 65 67 68 72 75 76 82 84 86 86 87 89 91 93 94
Stem | Leaf
5 | 5 7
6 | 1 2 3 5 7 8
7 | 2 5 6
8 | 2 4 6 6 7 9
9 | 1 3 4
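The ordered plot can be built mechanically by sorting the scores and splitting each one into a tens digit and a units digit. This sketch assumes two-digit whole-number data; the variable names are illustrative.

```python
from collections import defaultdict

scores = [61, 93, 91, 86, 55, 63, 86, 82, 76, 57,
          94, 89, 67, 62, 72, 87, 68, 65, 75, 84]

plot = defaultdict(list)
for score in sorted(scores):        # sorting first gives ordered leaves
    stem, leaf = divmod(score, 10)  # tens digit is the stem, units digit the leaf
    plot[stem].append(leaf)

# Print every stem in range, even one with no leaves, so gaps stay visible
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Every original score can be read back from the plot by joining a stem with one of its leaves, which is the sense in which the plot retains the data.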
106
Example 2: Use the data in Example 1 to construct a double stem and leaf plot, e.g. split each stem into two parts, with leaves 0 – 4 on one part and 5 – 9 on the other.
0 – 4 → belongs to the first stem
5 – 9 → belongs to the second stem
Stem | Leaf
5 |
5 | 5 7
6 | 1 2 3
6 | 5 7 8
7 | 2
7 | 5 6
8 | 2 4
8 | 6 6 7 9
9 | 1 3 4
9 |
A stem-and-leaf plot portrays the shape of a distribution and retains the original data values. It is also useful for spotting outliers. Outliers are data values that are extremely large or extremely small in comparison to the norm.
107
Chapter 2 Frequency Distributions and Graphs
Section 2-3 Example 2-13 Page #80
108
At an outpatient testing center, the number of cardiograms performed each day for 20 days is shown. Construct a stem and leaf plot for the data.
25 31 20 32 13
14 43 2 57 23
36 32 33 32 44
32 52 44 51 45
109
Unordered Stem Plot
0 | 2
1 | 3 4
2 | 5 0 3
3 | 1 2 6 2 3 2 2
4 | 3 4 4 5
5 | 7 2 1
Ordered Stem Plot
0 | 2
1 | 3 4
2 | 0 3 5
3 | 1 2 2 2 2 3 6
4 | 3 4 4 5
5 | 1 2 7
110
V. Misleading Graphs
Is the picture misleading?
Yes, the bars do not start at zero on the y-axis.
111
This is the proper picture –
112
2.4 Paired Data and Scatter Plots
A scatter plot is a graph of ordered pairs of data values that is used to determine if a relationship exists between the two variables.
113
Section 2 - 5 Paired Data and Scatter Plots
I. Scatter Plot – a graph of ordered pairs of values that is used to determine if a relationship exists between two variables.
Example 1: A researcher wishes to determine if there is a relationship between the number of days an employee missed in a year and the person’s age. Draw a scatter plot and comment on the nature of the relationship.
Age (x) | Days missed (y)
22 | 0
30 | 4
25 | 1
35 | 2
65 | 14
50 | 7
27 | 3
53 | 8
42 | 6
58 | 4
The data show a positive linear relationship.
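The direction of the relationship can also be checked numerically. The sketch below computes the Pearson correlation coefficient r, a measure that these slides do not introduce; it is used here only as an assumption-labeled sanity check, where a value well above 0 is consistent with the positive linear pattern read from the plot.

```python
import math

ages = [22, 30, 25, 35, 65, 50, 27, 53, 42, 58]
days_missed = [0, 4, 1, 2, 14, 7, 3, 8, 6, 4]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(days_missed) / n

# Sums of products and squares of deviations from the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, days_missed))
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in days_missed)

# Pearson r: positive when x and y tend to increase together
r = sxy / math.sqrt(sxx * syy)
print(round(r, 2))
```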
114
Chapter 2 Frequency Distributions and Graphs Section 2-4 Example 2-16 Page #95
115
A researcher is interested in determining if there is a relationship between the number of wet bike accidents and the number of wet bike fatalities. The data are for a 10-year period. Draw a scatter plot for the data.
116
Step 1 Draw and label the x and y axes. Step 2 Plot the points for pairs of data.
117
2.4 Paired Data and Scatter Plots
Analyzing the Scatter Plot
1. A positive linear relationship exists when the points fall approximately in an ascending straight line and both the x and y values increase at the same time.
118
2.4 Paired Data and Scatter Plots
Analyzing the Scatter Plot
2. A negative linear relationship exists when the points fall approximately in a descending straight line from left to right.
119
2.4 Paired Data and Scatter Plots
Analyzing the Scatter Plot
3. A nonlinear relationship exists when the points fall in a curved line.
120
2.4 Paired Data and Scatter Plots
Analyzing the Scatter Plot
4. No relationship exists when there is no discernible pattern of the points.