1
Introduction to Statistics
MATH 124 Sections
2
The term “data analysis” usually refers to an informal approach to statistics. It is a relatively new term in mathematics. Statistics once referred to numerical information about state or political territories; the word comes from the Latin statisticus, meaning “of the state.” Today, much of statistics involves making sense of data.
3
Statistic and statistics
A statistic is a numerical value of a quantity. Statistics is the science of obtaining, organizing, describing, and analyzing data for the purpose of making decisions, as well as making predictions.
4
Should we believe statistics?
In most instances, statistics are reliable and useful. But statistics can also be unreliable or misleading. Just as it is possible to lie when giving an account of an event, it is possible to “lie” with statistics by inappropriately manipulating data, withholding information that is crucial to the interpretation of data, or presenting statistical information in a way that hides important information about the data. A person who understands fundamental ideas in statistics is more likely to recognize these unethical uses of data than a person who does not have this understanding.
5
On the surface, statistics as a discipline appears to be a precise science that yields only one “right answer” to a question. One reason for this belief is that statistics is associated with mathematics, which is in turn associated with precision. Another reason is that a statistical analysis produces apparently precise numerical results. (For example, you’ve probably encountered statements such as “Families in that community have on average 2.6 children.”) There are, however, some gray areas in interpreting statistics. As a rule of thumb, expect precision in some aspects of statistics, such as computing statistics or making honest graphs, but expect gray areas in aspects such as interpreting graphs and interpreting statistics.
6
Relationship between probability and statistics
There is a close relationship between probability and statistics. Much statistical decision making is based on probabilities, as statistics typically deals with population samples instead of entire populations, and therefore there is an element of chance involved.
7
Conducting data analysis
The following framework holds for statistical problem solving:
1. Formulate the questions
2. Collect the data
3. Analyze the data
4. Interpret the results
8
1. Formulating questions
Conceiving the object to be measured clearly enough to imagine a way to measure it is often the most difficult part of a statistical study of a new concept. Curricula for grades K-8 often have children decide what data should be collected to answer the questions they have suggested for a statistics project. For example, how should we measure parents’ tolerance of violence on television?
9
2. Collecting data
It is of utmost importance to pick an unbiased sample. A sample is biased if the process of gathering the sample makes it likely that the sample will not reflect the population of interest. There are different types of sampling; the book discusses only random sampling, in which every member of the population has an equal chance of being selected for the sample. What can go wrong if a sample is biased? Consider this famous example: the 1936 U.S. presidential election, in which the Literary Digest poll predicted a win for Alf Landon, yet Franklin Roosevelt won in a landslide.
10
Types of data
Numerical (or quantitative)
Categorical (or qualitative)
For example: test scores are quantitative; favorite colors are qualitative.
11
3. & 4. Analyzing and interpreting data
In this class, we will discuss the following ways to analyze data:
Creating statistical graphs
Finding averages
We will use these to draw conclusions about the data.
12
Some terminology
A population is the entire group that is of interest.
A sample is the part of the population that is actually used to collect data.
A sample is biased if the process of gathering the sample makes it likely that the sample will not reflect the population of interest.
A sample statistic is the result of a calculation or count based on data gathered from the sample.
A population parameter is the same calculation or count based on the entire population.
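The difference between a population parameter and a sample statistic can be sketched in a few lines of Python. The population below is hypothetical: 600 people, of whom an assumed 35% answer "yes" to some question. The seed is fixed only so that the run is repeatable.

```python
import random

# Hypothetical population: 1 = "yes", 0 = "no".
# The 35% "yes" rate is an assumed value chosen only for illustration.
population = [1] * 210 + [0] * 390

# Population parameter: the proportion computed from every member.
parameter = sum(population) / len(population)

# Sample statistic: the same proportion, computed from a random sample only.
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 60)
statistic = sum(sample) / len(sample)

print(f"population parameter: {parameter:.2f}")
print(f"sample statistic:     {statistic:.2f}")
```

The statistic will usually be close to the parameter but not equal to it, because it depends on which 60 members happened to be chosen.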
13
Some types of bias:
self-selected (or voluntary) sample
convenience sample
14
Problem to consider
An elementary school with grades 1 through 6 has 100 students in each grade. A fifth-grade class is trying to raise some money to go on a field trip to Disneyland. They are considering several options to raise money and decide to do a survey to help them determine the best way to raise the most money. One option is to sell raffle tickets for a Wii U. How could they find out whether or not students were interested in buying a raffle ticket to win the game system?
15
Proposed surveys
Raffi asked 60 friends. (75% yes, 25% no)
Marta got the names of all 600 students in the school, put them in a hat, and pulled out 60 of them. (35% yes, 65% no)
Spence had blond hair so he asked the first 60 students he found who also had blond hair. (55% yes, 45% no)
Jinfa asked 60 students at an after-school meeting of the Games Club. The Games Club met once a week and played different games, especially computerized ones. Anyone who was interested in games could join. (90% yes, 10% no)
16
Abby sent out a questionnaire to every student in the school and then used the first 60 that were returned to her. (50% yes, 50% no)
SuLin set up a booth outside the lunchroom, and anyone who wished could stop by and fill out her survey. To advertise her survey, she posted a sign that said WIN A Wii U. She stopped collecting surveys when 60 students had completed the survey. (100% yes)
Jazmine asked the first 60 students she found whose telephone number ended in a 3 because 3 is her favorite number. (25% yes, 75% no)
17
Dong wanted the same number of boys and girls and some students from each grade. So he asked 5 boys and 5 girls from each grade to get his total of 60 students. (30% yes, 70% no)
Paula didn’t know many boys, so she decided to ask 60 girls. But she wanted to make sure she got some young girls and some older ones, so she asked 10 girls from each grade. (10% yes, 90% no)
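Dong's plan (5 boys and 5 girls from each of the 6 grades) can be sketched as follows. The roster is hypothetical, the 50/50 split of boys and girls in each grade is an assumption, and so is the choice to pick the 5 students in each group at random; the survey description does not say how Dong chose them.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical roster of (grade, gender, id); 50 boys and 50 girls per grade is assumed.
roster = [(grade, gender, i)
          for grade in range(1, 7)
          for gender in ("boy", "girl")
          for i in range(50)]

sample = []
for grade in range(1, 7):
    for gender in ("boy", "girl"):
        # Everyone in this grade/gender group is eligible.
        group = [s for s in roster if s[0] == grade and s[1] == gender]
        # Assumption: 5 students are drawn at random from the group.
        sample.extend(rng.sample(group, 5))

print(len(sample))  # 12 groups of 5 students each
```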
18
Questions
What is the population in this case?
What is the sample?
What is the sample statistic we are trying to find?
What is the population parameter we are trying to find?
What would your estimate for the population parameter be based on these nine surveys?
19
Discussion
For each sample, why do you think the percentages came out the way they did?
What kinds of biases could show up in the students’ samples?
Do you think the percentages would have changed if the sample size had changed?
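One way to explore the sample-size question is a small simulation. The sketch below assumes a hypothetical school of 600 students in which 35% would say yes (an illustrative value, not a known fact about the school) and compares how much the "yes" percentage varies across many random samples of 10 and of 60 students.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

# Assumed population: 210 "yes" (1) and 390 "no" (0) among 600 students.
population = [1] * 210 + [0] * 390

def spread(sample_size, trials=1000):
    """Range (max minus min) of the 'yes' proportion over repeated random samples."""
    proportions = []
    for _ in range(trials):
        s = rng.sample(population, sample_size)
        proportions.append(sum(s) / sample_size)
    return max(proportions) - min(proportions)

spread_10 = spread(10)
spread_60 = spread(60)
print(f"range of results with samples of 10: {spread_10:.2f}")
print(f"range of results with samples of 60: {spread_60:.2f}")
```

Larger random samples give results that cluster more tightly, but a larger sample does not remove bias: asking more Games Club members would still overstate interest.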
20
Types of samples
A random sample is one in which every member of the population has an equal chance of being selected for the sample.
A simple random sample is one in which every possible sample of a particular size has an equal chance of being selected.
21
Questions
Why do we need a sample? Why don’t we just use the entire population?
How do we choose a sample?
How big should a sample be?
Can you think of examples of parameters that are estimated by collecting data from a sample?
Can you think of examples of parameters that are calculated/counted by using the entire population?
Why can we assume that the sample statistic is a good estimate for the parameter?
22
Reasons for using a sample
It is not always possible to include all of the population. Gathering information is costly in terms of both time and money. Results are more timely. The discipline of statistics allows us to interpret results from samples and to make assertions about the whole population. A result from a relatively small, but carefully chosen, sample can give information about the whole population. In this statement, “relatively small” does not necessarily mean small in number; rather, it means far fewer than the whole population.
23
Random sampling
When statisticians want to find a random sample, they do not usually draw names from a hat, spin a spinner, or toss a die, although these are legitimate ways to sample randomly. To obtain a large sample, these methods would be very time-consuming. Instead, statisticians might use computer simulation software, a table of random numbers, or a computer or calculator with the capability of providing random numbers. This is a common middle school topic, but we will skip it in this class.
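As a rough illustration of letting a computer provide the random numbers, the sketch below mimics Marta's names-in-a-hat method for the school of 600 students. The ID numbers 1 through 600 are hypothetical stand-ins for the students' names.

```python
import random

# Hypothetical ID numbers standing in for the 600 students' names.
student_ids = list(range(1, 601))

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Draw 60 different students; every set of 60 is equally likely,
# so this is a simple random sample.
chosen = rng.sample(student_ids, 60)

print(sorted(chosen))
```

Python's `random.sample` draws without replacement, so no student can be picked twice, just as a name pulled from the hat cannot be pulled again.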
24
Types of data: qualitative
25
Frequency tables
Grade   Frequency
A       4
B       7
C       9
D       3
E       2
Total   25
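A frequency table like the one above can be built by counting how often each category occurs. In this sketch the list of 25 letter grades is hypothetical raw data, constructed to match the counts shown.

```python
from collections import Counter

# Hypothetical raw data: 25 letter grades matching the table's counts.
grades = ["A"] * 4 + ["B"] * 7 + ["C"] * 9 + ["D"] * 3 + ["E"] * 2

# Count how many times each grade appears.
counts = Counter(grades)

print(f"{'Grade':<8}{'Frequency'}")
for grade in sorted(counts):
    print(f"{grade:<8}{counts[grade]}")
print(f"{'Total':<8}{sum(counts.values())}")
```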
26
Bar graphs
27
Pie charts
28
Types of data: quantitative
29
Stem and leaf plots
Stem and leaf plots can be used to represent measurement data.
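A minimal sketch of how a stem and leaf plot is put together, assuming two-digit whole-number data: the tens digit is the stem and the ones digit is the leaf. The test scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative whole numbers by stem (tens) with sorted leaves (ones)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

scores = [84, 62, 97, 75, 81, 90, 78, 84, 93, 89]  # hypothetical test scores
plot = stem_and_leaf(scores)

# Print every stem from smallest to largest, even one with no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

For these scores the rows are `6 | 2`, `7 | 5 8`, `8 | 1 4 4 9`, and `9 | 0 3 7`.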
30
Line plots
Line plots are prominent in the Common Core Standards and are also used to represent measurement data. An example is below:
31
Line graphs
Line graphs usually represent the change of a quantity over time. These graphs are commonly used, but neither the book nor the CCSM gives them much attention.
32
Histograms
A histogram resembles a bar graph but differs in important ways: the data categories must be quantitative, the bars must follow the order of the categories, and the widths of the bars must have a specific meaning. At the K-12 level, it is assumed that all the bars have the same width.
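The equal-width idea can be sketched by counting how many data values fall in each interval. The scores, the starting value of 60, and the interval width of 10 are all hypothetical choices made for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width intervals beginning at start.

    Interval k covers start + k*width up to, but not including, start + (k+1)*width.
    Values outside all intervals are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

scores = [84, 62, 97, 75, 81, 90, 78, 84, 93, 89]  # hypothetical test scores
counts = histogram_counts(scores, start=60, width=10, n_bins=4)

# Text version of the histogram: intervals in numerical order, one bar each.
for k, c in enumerate(counts):
    low = 60 + 10 * k
    print(f"{low}-{low + 9}: {'#' * c}")
```

Unlike the categories of a bar graph, the intervals here cannot be reordered, because they lie along a number line.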