Published by Phoebe Neal. Modified over 9 years ago.
1
Percentages of State Residents in 2000 Who Were 65 or Older

AL 13.0   AK  5.7   AZ 13.0   AR 14.0   CA 10.6
CO  9.7   CT 13.8   DE 13.0   FL 17.6   GA  9.6
HI 13.3   ID 11.3   IL 12.1   IN 12.4   IA 14.9
KS 13.3   KY 12.5   LA 11.6   ME 14.4   MD 11.3
MA 13.5   MI 12.3   MN 12.1   MS 12.1   MO 13.5
MT 13.4   NE 13.6   NV 11.0   NH 12.0   NJ 13.2
NM 11.7   NY 12.9   NC 12.0   ND 14.7   OH 13.3
OK 13.2   OR 12.8   PA 15.6   RI 14.5   SC 12.1
SD 14.3   TN 12.4   TX  9.9   UT  8.5   VT 12.7
VA 11.2   WA 11.2   WV 15.3   WI 13.1   WY 11.7
2
Statistics and Data (Graphical) Section 9.6
3
Statistics
Statistics measures characteristics of individuals (people, animals, things, etc.). These characteristics are called variables, and they come in two varieties:
Categorical Variable – identifies individuals as belonging to a distinct class (e.g., gender, school grade, etc.)
Quantitative Variable – takes on numerical values for the characteristic being measured (e.g., height, weight, etc.)
4
Leading Causes of Death in the U.S. in 1999

Cause of Death    Number of Deaths    Percentage
Heart Disease     725,192             30.3
Cancer            549,838             23.0
Stroke            167,366              7.0
Other             949,003             39.7

What type of variable is the cause of death? Categorical!!! The counts and percentages record how many individuals fall into each category.
We can display categorical data using a bar chart or a circle graph (also called a pie chart).
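The Percentage column can be recomputed from the Number of Deaths column. The following is a minimal Python sketch, not part of the original slides; the variable names are illustrative, and the counts are copied from the table.

```python
# Counts copied from the table; the dictionary name is illustrative.
deaths = {
    "Heart Disease": 725_192,
    "Cancer": 549_838,
    "Stroke": 167_366,
    "Other": 949_003,
}

total = sum(deaths.values())

# Share of all deaths in each category, rounded to one decimal place.
percentages = {cause: round(100 * n / total, 1) for cause, n in deaths.items()}

for cause, pct in percentages.items():
    print(f"{cause:<14}{deaths[cause]:>10,}{pct:>7.1f}%")
```

Each percentage is the category count divided by the total count, which is exactly what the slices of a circle graph represent.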
5
Leading Causes of Death in the U.S. in 1999: Bar Chart
[Bar chart. Categories (Causes of Death): Heart Disease, Cancer, Stroke, Other. Scale: Number of Deaths (thousands), marked at 200, 400, 600, 800, 1000.]
6
Leading Causes of Death in the U.S. in 1999: Circle Graph
[Circle graph. Heart Disease 30.3%, Cancer 23.0%, Stroke 7.0%, Other 39.7%.]
7
Stemplots Stemplot (also called a stem-and-leaf plot) – a quick way to organize and analyze a small set of quantitative data. Each number in the data set is split into a stem, consisting of its initial digit or digits, and a leaf, which is its final digit. Now, let’s create a stemplot from the “Do Now” data…
8
To create the stem-and-leaf plot:
1. Use the whole number part of each number as the stem, and the tenths digit as the leaf.
2. Write the stems in order down the first column and, for each number, write the leaf in the appropriate stem row.
3. Finally, arrange the leaves in each stem row in ascending order.
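The three steps above can be sketched in Python. This is an illustration rather than part of the original lesson: the function name `stemplot` is hypothetical, it assumes every value has exactly one decimal place, and the sample uses only the first eight states from the table (AL through DE).

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for values that have one decimal place."""
    leaves = defaultdict(list)
    for v in values:
        tenths = round(v * 10)            # 13.8 -> 138, avoids float truncation
        stem, leaf = divmod(tenths, 10)   # 138 -> stem 13, leaf 8 (step 1)
        leaves[stem].append(leaf)         # step 2
    rows = []
    # List every stem from smallest to largest, including leafless ones.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(d) for d in sorted(leaves.get(stem, [])))  # step 3
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows

sample = [13.0, 5.7, 13.0, 14.0, 10.6, 9.7, 13.8, 13.0]  # AL through DE
rows = stemplot(sample)
print("\n".join(rows))
```

Stems with no data are still printed, so gaps in the distribution remain visible.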
9
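Stem | Leaf
 5 | 7
 6 |
 7 |
 8 | 5
 9 | 6 7 9
10 | 6
11 | 0 2 2 3 3 6 7 7
12 | 0 0 1 1 1 1 3 4 4 5 7 8 9
13 | 0 0 0 1 2 2 3 3 3 4 5 5 6 8
14 | 0 3 4 5 7 9
15 | 3 6
16 |
17 | 6

Notes:
The “leafless stems” (6, 7, and 16) are still written in the stem column.
Spacing among the leaves should be uniform, so that the length of each row reflects how many values it holds.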
10
By looking at both the stemplot and the table, answer the following questions about the distribution of senior citizens among the 50 states.
1. Judging from the stemplot, what was the approximate average national percentage of residents who were 65 or older?
   12-13%
2. In how many states were more than 15% of the residents 65 or older?
   3 states
3. Which states were in the bottom tenth of all states in this statistic?
   Bottom 5 states in the stemplot: AK, CO, GA, TX, UT
4. The numbers 5.7 and 17.6 are so far above or below the other numbers in this stemplot that statisticians would call them outliers. Quite often there is some special circumstance that explains the presence of outliers. What could explain the two outliers in this stemplot?
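The answers to questions 1 through 3 can be checked against the table. The sketch below is an addition, not part of the slides. It uses the unweighted mean of the 50 state values, which is only a rough stand-in for the national percentage because states differ in population. Iowa is keyed by its postal code, IA.

```python
from statistics import mean

# Values copied from the state table (percent of residents 65 or older, 2000).
pct_65_plus = {
    "AL": 13.0, "AK": 5.7, "AZ": 13.0, "AR": 14.0, "CA": 10.6,
    "CO": 9.7, "CT": 13.8, "DE": 13.0, "FL": 17.6, "GA": 9.6,
    "HI": 13.3, "ID": 11.3, "IL": 12.1, "IN": 12.4, "IA": 14.9,
    "KS": 13.3, "KY": 12.5, "LA": 11.6, "ME": 14.4, "MD": 11.3,
    "MA": 13.5, "MI": 12.3, "MN": 12.1, "MS": 12.1, "MO": 13.5,
    "MT": 13.4, "NE": 13.6, "NV": 11.0, "NH": 12.0, "NJ": 13.2,
    "NM": 11.7, "NY": 12.9, "NC": 12.0, "ND": 14.7, "OH": 13.3,
    "OK": 13.2, "OR": 12.8, "PA": 15.6, "RI": 14.5, "SC": 12.1,
    "SD": 14.3, "TN": 12.4, "TX": 9.9, "UT": 8.5, "VT": 12.7,
    "VA": 11.2, "WA": 11.2, "WV": 15.3, "WI": 13.1, "WY": 11.7,
}

# Question 1: unweighted average of the state percentages.
average = mean(pct_65_plus.values())

# Question 2: states where more than 15% of residents were 65 or older.
over_15 = sorted(s for s, p in pct_65_plus.items() if p > 15)

# Question 3: bottom tenth = the 5 lowest of 50 states.
ranked = sorted(pct_65_plus, key=pct_65_plus.get)
bottom_tenth = ranked[: len(ranked) // 10]

print(f"average: {average:.1f}%")
print("over 15%:", over_15)
print("bottom tenth:", bottom_tenth)
```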
11
The average annual salaries for the top 15 U.S. metro areas are shown below. Make a stemplot that provides a good visualization of the data. What is the average of the 15 numbers? Why is the stemplot a better summary of the data than the average?

San Jose, CA        76,076
San Francisco, CA   59,314
New York, NY        56,377
New Haven, CT       50,585
Middlesex, NJ       48,977
Newark, NJ          48,733
Jersey City, NJ     47,514
Boulder, CO         45,565
Washington, D.C.    45,333
Boston, MA          45,191
Seattle, WA         45,171
Trenton, NJ         44,576
Oakland, CA         44,170
Bergen, NJ          43,789
Hartford, CT        42,349

Round the data to $1000 units, then create a split stemplot, in which each stem appears twice (leaves 0–4 on the first row, leaves 5–9 on the second):

Stem | Leaf
4 | 2 4 4
4 | 5 5 5 5 6 8 9 9
5 | 1
5 | 6 9
6 |
6 |
7 |
7 | 6
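The rounding and splitting can be sketched in Python as follows. This is an illustrative addition: the function name `split_stemplot` and its `unit` parameter are hypothetical, and the salaries are copied from the slide.

```python
salaries = [76076, 59314, 56377, 50585, 48977, 48733, 47514, 45565,
            45333, 45191, 45171, 44576, 44170, 43789, 42349]

def split_stemplot(values, unit=1000):
    """Split stemplot: each stem appears twice, leaves 0-4 then leaves 5-9."""
    rounded = sorted(round(v / unit) for v in values)   # 76076 -> 76
    rows = {}
    for stem in range(rounded[0] // 10, rounded[-1] // 10 + 1):
        rows[(stem, 0)] = []   # leaves 0-4
        rows[(stem, 1)] = []   # leaves 5-9
    for r in rounded:
        stem, leaf = divmod(r, 10)
        rows[(stem, leaf // 5)].append(leaf)
    # Dictionaries keep insertion order, so rows come out stem by stem.
    return [f"{stem} | {' '.join(map(str, leaves))}".rstrip()
            for (stem, _), leaves in rows.items()]

rows = split_stemplot(salaries)
print("\n".join(rows))
```

Splitting the stems spreads the eleven salaries in the $40,000s over two rows, which shows the cluster around $45,000 more clearly than a single row would.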
12
Answer (for the metro-area salary data on the previous slide): The average of the 15 numbers is about $49,581, but this is misleading. The salaries are actually fairly tightly clustered around $45,000, and the few highest salaries skew the average upward.
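One way to see the skew numerically is to compare the mean with the median. The median is not mentioned on the slide and is brought in here only as a point of comparison. A short sketch using Python's standard `statistics` module:

```python
from statistics import mean, median

salaries = [76076, 59314, 56377, 50585, 48977, 48733, 47514, 45565,
            45333, 45191, 45171, 44576, 44170, 43789, 42349]

avg = mean(salaries)      # pulled upward by the few largest salaries
mid = median(salaries)    # middle value of the sorted list

# How many metro areas actually pay more than the average?
above_avg = sum(s > avg for s in salaries)

print(f"mean:   {avg:,.0f}")
print(f"median: {mid:,}")
print(f"{above_avg} of {len(salaries)} salaries are above the mean")
```

Only four of the fifteen salaries exceed the mean, which is why the mean describes the typical metro area poorly here.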
13
Mark McGwire and Barry Bonds entered the major leagues in 1986. From 1986 to 2001, they averaged 36.44 and 35.44 home runs per year, respectively. Compare their annual home run totals with a back-to-back stemplot.

Year      86  87  88  89  90  91  92  93  94  95  96  97  98  99  00  01
McGwire    3  49  32  33  39  22  42   9   9  39  52  58  70  65  32  29
Bonds     16  25  24  19  33  25  34  46  37  33  42  40  37  34  49  73

Mark McGwire |   | Barry Bonds
       9 9 3 | 0 |
             | 1 | 6 9
         9 2 | 2 | 4 5 5
   9 9 3 2 2 | 3 | 3 3 4 4 7 7
         9 2 | 4 | 0 2 6 9
         8 2 | 5 |
           5 | 6 |
           0 | 7 | 3

Which player do you think was more consistent as a home run hitter?
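A back-to-back stemplot can be built by computing the leaves for each player separately and printing the left-hand leaves in reverse, so that they read outward from the shared stem column. The sketch below is illustrative and the helper name `leaves_by_stem` is hypothetical. It also prints each player's standard deviation as one possible numerical measure of consistency, which is an assumption of this example and not a measure the slides use.

```python
from statistics import pstdev

mcgwire = [3, 49, 32, 33, 39, 22, 42, 9, 9, 39, 52, 58, 70, 65, 32, 29]
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

def leaves_by_stem(values, stems=range(8)):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    table = {stem: [] for stem in stems}
    for v in sorted(values):
        table[v // 10].append(v % 10)
    return table

left, right = leaves_by_stem(mcgwire), leaves_by_stem(bonds)

rows = []
for stem in range(8):
    # Left-hand leaves read outward from the stem, so reverse them.
    left_leaves = " ".join(str(d) for d in reversed(left[stem]))
    right_leaves = " ".join(str(d) for d in right[stem])
    rows.append(f"{left_leaves:>9} | {stem} | {right_leaves}".rstrip())

print("\n".join(rows))
print(f"McGwire spread (std. dev.): {pstdev(mcgwire):.1f}")
print(f"Bonds spread (std. dev.):   {pstdev(bonds):.1f}")
```

A smaller spread means the yearly totals stay closer to the player's average, which is one reasonable reading of "more consistent".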
14
Frequency Tables
Frequency Table for Mark McGwire’s Yearly HR Totals (to be filled in)

Home Runs    Frequency
0–9
10–19
20–29
30–39
40–49
50–59
60–69
70–79
Total
15
Frequency Tables and Histograms
16
Frequency Tables
First, think back to the stemplots we just completed – where does the visual impact of a stemplot come from?
The rows of leaves let us see how many leaves branch off each stem!!!

Ex. from last class: Mark McGwire HRs

0 | 3 9 9
1 |
2 | 2 9
3 | 2 2 3 9 9
4 | 2 9
5 | 2 8
6 | 5
7 | 0

The number of leaves for a particular stem is the frequency of observations within that stem's interval. We can also record this information in a frequency table, which gives a frequency distribution of the data.
17
Frequency Tables
Frequency Table for Mark McGwire’s Yearly HR Totals

Home Runs    Frequency
0–9          3
10–19        0
20–29        2
30–39        5
40–49        2
50–59        2
60–69        1
70–79        1
Total        16

Notes:
Higher frequencies in this table correspond to longer leaf rows in a stemplot.
Unlike a stemplot, a frequency table does not display what the numbers in each interval actually are.
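The counts in this table can be reproduced from the raw yearly totals. The following Python sketch is an illustration only; the function name `frequency_table` and its parameters are hypothetical.

```python
from collections import Counter

mcgwire = [3, 49, 32, 33, 39, 22, 42, 9, 9, 39, 52, 58, 70, 65, 32, 29]

def frequency_table(values, width, start, stop):
    """Count values in intervals start..start+width-1, and so on, below stop."""
    # Index of the interval that each value falls into.
    counts = Counter((v - start) // width for v in values)
    table = []
    for i, low in enumerate(range(start, stop, width)):
        table.append((f"{low}-{low + width - 1}", counts.get(i, 0)))
    return table

table = frequency_table(mcgwire, width=10, start=0, stop=80)
for interval, freq in table:
    print(f"{interval:<8}{freq}")
print(f"{'Total':<8}{sum(freq for _, freq in table)}")
```

Intervals with no observations, such as 10-19 here, are kept with a frequency of 0 so the table covers the full range.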
18
Histograms
Histogram – gives a visual display of information from a frequency table. A histogram is to quantitative data what a bar chart is to categorical data.
[Histogram for HR frequency of Mark McGwire. Axes: HR Intervals (0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79) and Frequency (1 to 5).]
What are the differences between a histogram and a bar chart?
19
Histograms
To create this histogram with your calculator:
1. Put the lowest value of each subinterval into L1 (start with 0, 10, 20, …).
2. Put the corresponding frequencies into L2.
3. Settings for STAT PLOT1 – Type: Histogram (it’s a picture!), Xlist: L1, Freq: L2
4. Settings for WINDOW: Xmin = –10, Xmax = 80, Xscl = 10, Ymin = –1, Ymax = 6, Yscl = 1
NOW GRAPH!!!
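For readers without the calculator at hand, the same two lists can drive a rough text histogram in Python. This sketch is an addition to the lesson and does not reproduce the calculator's STAT PLOT: the bars run horizontally, and the function name `text_histogram` is hypothetical. The lists `L1` and `L2` mirror the calculator lists and hold the values from Mark McGwire's frequency table.

```python
# L1: lowest value of each subinterval; L2: the corresponding frequencies.
L1 = [0, 10, 20, 30, 40, 50, 60, 70]
L2 = [3, 0, 2, 5, 2, 2, 1, 1]

def text_histogram(lows, freqs, width):
    """One row per interval; the bar length equals the frequency."""
    rows = []
    for low, freq in zip(lows, freqs):
        label = f"{low}-{low + width - 1}"
        rows.append(f"{label:>5} | {'#' * freq}".rstrip())
    return rows

rows = text_histogram(L1, L2, width=10)
print("\n".join(rows))
```

Here `width=10` plays the role of Xscl, the width of each interval on the horizontal axis.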
20
Histograms
Now, create a frequency table and histogram for the HR data for Barry Bonds from last class.

The stemplot: Barry Bonds

0 |
1 | 6 9
2 | 4 5 5
3 | 3 3 4 4 7 7
4 | 0 2 6 9
5 |
6 |
7 | 3

Frequency Table:

Home Runs    Frequency
0–9          0
10–19        2
20–29        3
30–39        6
40–49        4
50–59        0
60–69        0
70–79        1
Total        16
21
Histograms
[Histogram for HR frequency of Barry Bonds. Axes: HR Intervals (0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79) and Frequency (1 to 6).]
22
Make a histogram of Hank Aaron’s annual home run totals given below, using interval width 5.

Year  1954  1955  1956  1957  1958  1959  1960  1961  1962  1963  1964  1965
HR      13    27    26    44    30    39    40    34    45    44    24    32

Year  1966  1967  1968  1969  1970  1971  1972  1973  1974  1975  1976
HR      44    39    29    44    38    47    34    40    20    12    10

First, create a frequency table:

HR       Frequency
10–14    3
15–19    0
20–24    2
25–29    3
30–34    4
35–39    3
40–44    6
45–49    2
Total    23

Next, create the histogram (by hand and using a calculator).
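The width-5 frequency table can be checked against the raw totals with a few lines of Python. This is a sketch added for illustration; the names `WIDTH` and `bins` are arbitrary, and the intervals are assumed to start at 10, as in the table.

```python
# Hank Aaron's yearly home run totals, 1954 through 1976, from the slide.
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]

WIDTH = 5
bins = {low: 0 for low in range(10, 50, WIDTH)}   # lower edges 10, 15, ..., 45

for hr in aaron:
    low = hr - (hr - 10) % WIDTH   # lower edge of the interval containing hr
    bins[low] += 1

for low, freq in bins.items():
    label = f"{low}-{low + WIDTH - 1}"
    print(f"{label:<6} {'#' * freq} ({freq})")
print(f"Total  {sum(bins.values())}")
```

Halving the interval width from 10 to 5 doubles the number of bars, which shows more detail in the 40-44 peak than a width-10 histogram would.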
23
Histograms
[Histogram for HR frequency of Hank Aaron. Axes: HR Intervals (10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49) and Frequency (1 to 6).]