Download presentation
1
SULIDAR FITRI, M.Sc FEBRUARY 20,2013
STATISTIKA CHATPER 2 (Summarizing and Graphing Data Summarizing and Graphing Data) 2-2 Frequency Distributions 2-3 Histograms 2-4 Statistical Graphics SULIDAR FITRI, M.Sc FEBRUARY 20,2013
2
Introduction To Statistics Math 13
Essentials of Statistics 3rd edition by Mario F. Triola Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
3
Important Characteristics of Data
Overview Important Characteristics of Data 1. Center: An average value that indicates where the middle of the data set is located. 2. Variation: A measure of spread - the amount by which the values vary among themselves. 3. Distribution: The nature (shape) of the distribution of data: bell-shaped, uniform, or skewed. 4. Outliers: Data values that lie very far away from the vast majority of other values. 5. Time: Changing characteristics of the data values over time. Center of the data: mean Median ============= Variation range quartiles (interquartile range) variance standard deviation== Skewed : has tail (tdk simetris) Vast=luas sekali
4
Frequency Distributions
Section 2-2 Frequency Distributions Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
5
Definition Frequency Distribution Table
lists data values (individually or by groups) in one column, and their corresponding frequencies (counts) in the second column. Frequencies are denoted: F (for a population) or f (for a sample). The total sum of the frequencies in all classes must add up to the population size (or the sample size):
6
Frequency Distribution: Ages of Best Actresses
Original Data Frequency Distribution
7
Frequency Distributions
Definitions
8
Lower Class Limit Lower Class Limits
is the smallest number that can actually belong to a class. Lower Class Limits
9
Upper Class Limit Upper Class Limits
is the largest number that can actually belong to a class. Upper Class Limits
10
Class Width Class Width Editor: Substitute Table 2-2 10
is the difference between two consecutive lower class limits (or two consecutive upper class limits). Editor: Substitute Table 2-2 Class Width 10 Consecutive= berturut-turut
11
Class Boundaries Class Boundaries Editor: Substitute Table 2-2
are the numbers used to separate classes: class boundary falls in the middle of the gap created by class limits. Editor: Substitute Table 2-2 Class Boundaries 20.5 30.5 40.5 50.5 60.5 70.5 80.5
12
Class Midpoints Class Midpoints
are the numbers that fall in the middle of each class: class midpoint can be found as (lower class limit + upper class limit) Class Midpoints 25.5 35.5 45.5 55.5 65.5 75.5
13
Reasons for Constructing Frequency Distributions
1. Large data sets can be summarized. 2. We can gain some insight into the nature of data. 3. We have a basis for constructing important graphs.
14
Constructing A Frequency Distribution
1. Decide on the number of classes (best: between 5 and 20). 2. Calculate the class width (round up): class width (maximum data value) – (minimum data value) number of classes 3. Choose the starting point, which will be the lower limit of the first class. List the lower class limits by adding the calculated class width to the lower limit of the first class. List the upper class limits, which should be one less then the next lower class limit. Enter a count (frequency) of data values in each class.
15
Grouping with Equal Class widths
Ex. 1 Given the ages of the male Oscar award recipients, construct a frequency distribution table with 6 classes: (1) Smallest data value = 29 Largest data value = 76 (2) Class width = (3) Let’s choose the starting point to be 29 (4) then the lower class limits are: Age of Actor 29 – 36 37 – 44 45 – 52 53 – 60 61 – 68 69 – 76 Age of Actor 29 – 36 37 – 44 45 – 52 53 – 60 61 – 68 69 – 76 (5) then the upper class limits are: (6) Now, tally the values in each class and fill in the 2nd column:
16
Frequency Distribution: Ages of Best Actors
Solution: Frequency Distribution: Ages of Best Actors Age of Actor Frequency, f 29 – 36 15 37 – 44 32 45 – 52 17 53 – 60 8 61 – 68 3 69 – 76 1 n = 76 (total)
17
Definition Relative Frequency Table
lists the same classes as the Frequency Distribution table in one column, and their corresponding relative frequencies (percentages) in the second column. Relative Frequency is denoted sometimes p-hat, sometimes p: and is equal to the percent of the subjects in a class out of the whole sample:
18
Frequency and Relative Frequency
distribution tables:
19
Relative Frequency Distribution:
Ex. 2 Given the Frequency Distribution table constructed in Example 1, create the corresponding Relative Frequency Distribution table: Frequency Distribution: Ages of Best Actors Age of Actor Frequency 29 – 36 15 37 – 44 32 45 – 52 17 53 – 60 8 61 – 68 3 69 – 76 1 n = 76 (1) Calculate the relative frequency in each class by finding the ratio f/n : Relative Frequency: Ages of Best Actors Age of Actor Relative Frequency 29 – 36 19.7% 37 – 44 42.1% 45 – 52 22.4% 53 – 60 10.5% 61 – 68 3.9% 69 – 76 1.3% 99.9% (total) (2) Write the relative frequency values in the second column:
20
Definition Cumulative Frequency Table
replaces both columns of the Frequency Distribution table with cumulative groups and cumulative frequencies. The cumulative groups of values are found as all the values that are less than each of the lower class limits of the original groups; Cumulative frequencies in each such cumulative group are found by adding the frequency in the original group to the cumulative frequency in the previous group.
21
Cumulative Frequency Distribution
Frequencies
22
Frequency Tables
23
Critical Thinking Interpreting Frequency Distributions
In later chapters, there will be frequent references to data with normal distribution. One key characteristic of a normal distribution is that it has a “bell” shape: The frequencies start low, then increase to some maximum frequency, then decrease to a low frequency. The frequencies are distributed approximately symmetric, nearly evenly distributed on both sides of the maximum frequency.
24
Comparing Two Samples Ex. 3
Given their corresponding Relative Frequency Distribution tables, make a comparison of the ages of Oscar-winning actresses and actors: Ages of Best Actresses and Actors Age Relative Frequency for Actresses Relative Frequency for Actors 21 – 30 37% 4% 31 – 40 39% 33% 41 – 50 16% 51 – 60 3% 18% 61 – 70 71 – 80 1% 101% 99% (3) Neither distribution appears to be normal – data values are not distributed symmetrically on each side of the maximum frequencies. (1) actresses tend to be somewhat younger than actors. 37% of the actresses are in the youngest age group while only 4% of the actors fall into that age. (2) The highest relative frequency for the actresses (39%) corresponds to the age group from 31 to 40; the highest relative frequency for actors (39%) corresponds to the age group from 41 to 50 years old, ten years older that the most frequent age group for actresses.
25
Recap In this Section we have discussed
Important characteristics of data: C V D O T Frequency distributions; Procedures for constructing frequency distributions; Relative frequency distributions; Cumulative frequency distributions; Comparing samples using their frequency distributions.
26
CARA MENENTUKAN BANYAK KELAS RUMUS STURGES
K = 1 + 3,3 LOG n K: Banyaknya Kelas N: Jumlah data yang kita miliki Kasus: Sampel yang berupa penjualan produk suatu perusahaan terhadap 80 pelanggan. Maka, bagaimana menentukan jumlah kelas yang sesuai???
27
Answer: K = 1 + 3,3 log n = 1 + 3,3 log 80 = 1 + 3,3 (1,9031)
Jumlah data yang dimiliki (n)= 80 Maka: K = 1 + 3,3 log n = 1 + 3,3 log 80 = 1 + 3,3 (1,9031) = 7,280 … dibulatkan menjadi 7 kelas
28
Section 2-3 Histograms Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
29
Key Concept A histogram is an important type of graph that portrays the nature of the distribution:
30
Relative Frequency Histogram
Has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies
31
Key Concept A histogram makes visible the nature of the distribution, as well as where its center is and whether there are any outliers: The shape of the distribution of the ages of Best Actresses is skewed, heavier on the left indicating that actresses who win Oscars tend to be disproportionally younger. disproportionally = tidak seimbang
32
Normal Distribution: One key characteristic of a normal distribution is that it is “bell-shaped”:
33
Recap In this Section we have discussed Histograms
Relative Frequency Histograms
34
Section 2-4 Statistical Graphics
Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
35
Key Concept This section presents other graphs beyond histograms commonly used in statistical analysis. The main objective is to understand a data set by using a suitable graph that is effective in revealing some important characteristic.
36
Frequency Polygon Uses line segments connected to points directly above class midpoint values
37
Ogive A line graph that depicts cumulative frequencies
Insert figure 2-6 from page 58
38
Dot Plot Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values
39
Stemplot (or Stem-and-Leaf Plot)
Represents data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit) Stem=batang Leaf= daun
40
Pareto Chart A bar graph for qualitative data, with the bars arranged in order according to frequencies: complaints against phone carriers: Slamming=membanting Cramming=menjejalkan
41
Pie Chart A graph depicting qualitative data as slices of a pie
42
Pie Chart analysis Ex. 1 What percent of total complaints corresponds to complaints due to Access Charges? (1) Percent is the ratio of the Access Charges complaints out of the total number of complaints, so we need to find the total number of all complaints first: Make up =menyusun (2) Now the proportion of the Access Charges complaints is: (3) Complaints due to Access Charges make up 2.9% of all complaints against phone carriers.
43
Scatter Plot (or Scatter Diagram)
A plot of paired (x,y) data with a horizontal x-axis and a vertical y-axis: Number of cricket chirps per minute related to the temperature: cricket chirps =celetukan kriket
44
Time-Series Graph Data that have been collected at different points in time
45
Other Graphs
46
Recap In this section we have discussed graphs that are pictures of distributions. Keep in mind that a graph is a tool for describing, exploring and comparing data.
47
QUIZZ
48
1. A sample value that lies very far away from the majority of the other sample values is
The center. A distribution. An outlier. A variance.
49
2. A table that lists data values along with their counts is
An ogive. A frequency distribution. A cumulative table. A histogram.
50
3. The smallest numbers that can actually belong to different classes are
Upper class limits. Class boundaries. Midpoints. Lower class limits.
51
4. A bar graph where the horizontal scale represents the classes of data values and the vertical scale represents the frequencies is called A frequency distribution. A histogram. A dot plot. A pie chart.
52
5. The pie chart below shows the percent of the total population of 12,200 of Springfield inhabitants living in the given types of housing. Find the number of people who live in single family housing (to nearest whole number.) Apartments 35% Single family 39% 4758 people 39 people . 5368 people 7442 people Duplex 2% Condo 18% Townhouse 6%
53
6. Berdasarkan table distribusi frekuensi yang telah kalian buat sebelumnya (sertakan table tersebut dalam lembar jawaban apa adanya !) a. Tentukan jumlah kelas yang sesuai dengan menggunakan rumus Sturges! b. Buatlah graphic berdasarkan table distribusi frekuensi anda: Polygon untuk frekuensi ! Histogram untuk frekuensi relatif ! Ogive untuk frekuensi kumulatif !
54
ANSWER
55
A sample value that lies very far away from the majority of the other sample values is
The center. A distribution. An outlier. A variance.
56
A sample value that lies very far away from the majority of the other sample values is
The center. A distribution. An outlier. A variance.
57
A table that lists data values along with their counts is
An ogive. A frequency distribution. A cumulative table. A histogram.
58
A table that lists data values along with their counts is
An ogive. A frequency distribution. A cumulative table. A histogram.
59
The smallest numbers that can actually belong to different classes are
Upper class limits. Class boundaries. Midpoints. Lower class limits.
60
The smallest numbers that can actually belong to different classes are
Upper class limits. Class boundaries. Midpoints. Lower class limits.
61
A bar graph where the horizontal scale represents the classes of data values and the vertical scale represents the frequencies is called A frequency distribution. A histogram. A dot plot. A pie chart.
62
A bar graph where the horizontal scale represents the classes of data values and the vertical scale represents the frequencies is called A frequency distribution. A histogram. A dot plot. A pie chart.
63
The pie chart below shows the percent of the total population of 12,200 of Springfield inhabitants living in the given types of housing. Find the number of people who live in single family housing (to nearest whole number.) Apartments 35% Single family 39% Condo 18% Duplex 2% Townhouse 6% 4758 people 39 people . 5368 people 7442 people
64
The pie chart below shows the percent of the total population of 12,200 of Springfield living in the given types of housing. Find the number of people who live in single family housing (round to nearest whole number.) Apartments 35% Single family 39% Condo 18% Duplex 2% Townhouse 6% 4758 people 39 people . 5368 people 7442 people
65
Any Queries ?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.