Chapter 1 The Nature of Probability and Statistics
Probability deals with events that occur by chance. Statistics is the theory and methods for dealing with data.
Chapter 1 Overview
1-1 Descriptive and Inferential Statistics
1-2 Variables and Types of Data
1-3 Data Collection and Sampling Techniques
Bluman Chapter 1
Introduction
What do you think about statistics?
Research domain: a group of subjects
Data: variation / level of uncertainty
Probability and decision making
Data analysis, data analyst
Why statistical methods? What statistical methods will be covered in this class?
There is variation within the data set of any group of subjects. To make a better decision, you have to know the probability (the level of uncertainty). Statistical methods account for the variation and help you make a better decision.
Data analyst jobs: 84,484 data analyst jobs available on Indeed.com.
Introduction
Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
happened, was, history / going to happen, being, future
Knowledge and claim
Collect: sampling
Organize: raw data, description of your data
1-1 Descriptive and Inferential Statistics (two applications)
Descriptive statistics consists of the collection, organization, summarization, and presentation of data.
Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, and making predictions.
1-1 Descriptive and Inferential Statistics Descriptive statistics consists of the collection, organization, summarization, and presentation of data.
Example: Classify each study as descriptive or inferential statistics.
Professional athlete salaries: In the Statistical Abstract of the United States, average professional athletes’ salaries in baseball, basketball, and football were compiled and compared for the years 1990 and 2000.

Average salary ($1000)
Sport              1990   2000
Baseball (MLB)      598   1720
Basketball (NBA)    750   3522
Football            395   1071

(descriptive statistics)
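As a small illustration of why this study is descriptive, the sketch below only summarizes the numbers already in the salary table and makes no claim beyond them. The helper name `growth_factor` and the choice of "times larger" as the summary are assumptions made for this sketch, not part of the original slide.

```python
# Average salaries in $1000, copied from the table above: (1990, 2000).
salaries = {
    "Baseball (MLB)": (598, 1720),
    "Basketball (NBA)": (750, 3522),
    "Football": (395, 1071),
}


def growth_factor(start, end):
    """How many times larger the later value is than the earlier one."""
    return end / start


# Descriptive summary: one number per sport, computed from the data itself.
summary = {
    sport: round(growth_factor(s1990, s2000), 2)
    for sport, (s1990, s2000) in salaries.items()
}

for sport, factor in summary.items():
    print(f"{sport}: the 2000 average is {factor} times the 1990 average")
```

Nothing here generalizes from a sample to a larger population, which is what would make a study inferential.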
Example: Classify each study as descriptive or inferential statistics.
“U.S. Credit and Debit Cards Share in 2010”: a pie chart describes the results. (descriptive statistics)
1-1 Descriptive and Inferential Statistics Inferential statistics consists of generalizing information from samples to populations, performing estimations and hypothesis tests, and making predictions.
1-1 Descriptive and Inferential Statistics
A population (the research domain) consists of all subjects (human or otherwise) that are studied.
A sample is a subset of the population.
Think about why we take a sample. When the population is large, the researcher saves time and money by using samples. Samples are also used when the units must be destroyed.
Example: Classify each study as descriptive or inferential statistics.
A survey found that 48 out of 100 university students like football best. Based on this result, we conclude that almost half of university students like football best.
Sample: the 100 students surveyed. Population: all university students. (inferential statistics)
Example: Classify each study as descriptive or inferential statistics.
A published report recently stated, "Based on a sample of 280 new cars, there is evidence to indicate that the average new car price of all foreign automobiles is significantly higher than the average new car price of all American cars."
Sample: the 280 new cars. Population: all new foreign and American cars. (inferential statistics)
1-1 Descriptive and Inferential Statistics
A variable is a characteristic or attribute of a unit or subject that can assume different values.
The values that a variable can assume are called data.
[Figure: example of a data set]
Example: According to The State of the News Media, 2006, the average age of viewers of “ABC World News Tonight” is 59 years. Suppose a rival network executive hypothesizes that the average age of ABC news viewers is less than 59. To test her hypothesis, she samples 500 ABC nightly news viewers and determines the age of each.
a. Describe the population. The population is ______________.
b. Describe the variable of interest. ___________________________.
c. Describe the sample. _____________________________.
Example: To estimate the average math test score of a class of 200 students, we randomly choose 50 students and calculate their average math test score.
a. Describe the population. The population is _________________.
b. Describe the variable of interest. _____________________.
c. Describe the sample. The sample is ________________________.
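The idea behind this example, using the average of 50 randomly chosen students to estimate the average of all 200, can be sketched as follows. The scores are made up with a random number generator purely for illustration; the score range of 40 to 100 and the fixed seed are assumptions, not data from the slide.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical scores for all 200 students in the class (invented data).
population = [random.randint(40, 100) for _ in range(200)]

# 50 students chosen at random, without replacement.
sample = random.sample(population, 50)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # the estimate we can compute

print(f"mean of all 200 scores = {population_mean:.1f}")
print(f"mean of the 50 sampled = {sample_mean:.1f}")
```

In a real study only the 50 sampled scores would be available; the full list is generated here just so the two averages can be compared.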
Example: “Cola wars” is the popular term for the intense competition between Coca-Cola and Pepsi displayed in their marketing campaigns, which have featured movie and television stars, rock videos, athletic endorsements, and claims of consumer preference based on taste tests. Suppose, as part of a Pepsi marketing campaign, 1000 cola consumers are given a blind taste test. Each consumer is asked to state a preference for brand A or brand B.
a. Describe the population. __________________.
b. Describe the variable of interest. __________________.
c. Describe the sample. ______________________.
1-2 Variables and Types of Data
Qualitative: categorical
Quantitative: numerical, can be ranked
    Discrete: countable (5, 29, 8000, etc.)
    Continuous: can be decimals (2.59, 312.1, etc.)
Qualitative
Qualitative variables are variables that can be placed into distinct categories.
For example: blood type, voting opinion, brand of car, gender, …
Quantitative
Quantitative variables are numerical and can be ordered or ranked.
For example: the number of students in a college, the number of pizzas sold in one day, a test score, height, temperature, income, …
Discrete
Discrete variables assume values that can be counted.
Continuous
Continuous variables can assume an infinite number of values between any two specific values.
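The four definitions above can be turned into a rough rule of thumb in code. This is only a sketch: the function name `classify` and the example values "AB", "Toyota", and 172.4 are hypothetical, and the rule looks only at how a value is stored. A code stored as a number (a zip code, for instance) is still qualitative, and a continuous measurement may be recorded as a whole number, so the variable's meaning always has the final say.

```python
from numbers import Integral, Real


def classify(value):
    """Rule of thumb: classify one observed value by how it is stored."""
    if isinstance(value, bool) or not isinstance(value, Real):
        return "qualitative"              # a category label, not a number
    if isinstance(value, Integral):
        return "quantitative, discrete"   # a count
    return "quantitative, continuous"     # a measurement with decimals


# Variable names come from the slides; the values are made up.
examples = {
    "blood type": "AB",
    "brand of car": "Toyota",
    "number of students in a college": 8000,
    "height (cm)": 172.4,
}

for variable, value in examples.items():
    print(f"{variable}: {classify(value)}")
```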
Example: Identify the types of data.
DDT: Chemical and manufacturing plants sometimes discharge toxic-waste materials such as DDT into nearby rivers and streams. These toxins can adversely affect the plants and animals inhabiting the river and the riverbank. The U.S. Army Corps of Engineers conducted a study of fish in the Tennessee River and its three creeks: Flint Creek, Limestone Creek, and Spring Creek. A total of 144 fish were captured, and the following variables were measured for each:
1. River/creek where each fish was captured
2. Species (catfish, bass, or buffalo fish)
3. Length (centimeters)
4. Weight (grams)
Example: Identify the types of data.
The following table shows some information about employees in a company.

Name      Age  Gender  Salary
McKinley  52   F       $101,000
Logan     33           $82,000
……
Elias     46   M       $58,700
Angie     21           $43,900
1-3 Data Collection and Sampling Techniques
Some sampling techniques:
Random: random number generator
Systematic: every kth subject
Stratified: divide the population into “layers”
Cluster: use intact groups
Convenience: mall surveys
Summary of Sampling Methods
Fair sample: unbiased, free from all prejudice and favoritism.
Random: Subjects are selected by random numbers.
Systematic: Subjects are selected by using every kth number after the first subject is randomly selected from 1 through k.
Stratified: Subjects are selected by dividing up the population into groups (strata), and subjects are randomly selected within groups.
Cluster: Subjects are selected by using an intact group that is representative of the population.
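The four methods summarized above can be sketched in a few lines each. This is a minimal illustration under stated assumptions: the subjects are simply numbered 1 to 100, the two strata and ten groups are invented for the example, and all function names are hypothetical.

```python
import random


def random_sample(population, n, rng):
    """Random: subjects are selected by random numbers."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Systematic: random start among the first k subjects, then every kth."""
    start = rng.randrange(k)  # index 0..k-1, i.e. subject 1 through k
    return population[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: draw a random sample within each group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, rng):
    """Cluster: pick one intact group and use every subject in it."""
    name = rng.choice(sorted(clusters))
    return name, clusters[name]


rng = random.Random(0)          # seeded generator so runs are repeatable
subjects = list(range(1, 101))  # hypothetical subjects numbered 1..100
strata = {"first half": subjects[:50], "second half": subjects[50:]}
clusters = {f"group {i}": subjects[i * 10:(i + 1) * 10] for i in range(10)}

simple = random_sample(subjects, 10, rng)
systematic = systematic_sample(subjects, 10, rng)
stratified = stratified_sample(strata, 5, rng)
cluster_name, cluster = cluster_sample(clusters, rng)

print("random:    ", sorted(simple))
print("systematic:", systematic)
print("stratified:", stratified)
print("cluster:   ", cluster_name, cluster)
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from one chosen group.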
Pg. 26-27: #6, 8, 9, 12. Pg. 46-47: #1-15 (odd).
Chapter 1
1. descriptive / inferential statistics
The national average annual medicine expenditure per person is $1052 (Source: The Greensburg Tribune Review).
The average life expectancy in New Zealand is 78.49 years (Source: World Factbook).
A published analysis recently stated, "Based on a sample of 250 newly hired truck drivers, there is evidence to indicate that, on average, independent truck drivers are overpaid relative to company-hired truck drivers."
2. population / sample
3. qualitative / quantitative
4. discrete / continuous
5. sampling methods