Download presentation
1
Chapter 1 The Where, Why, and How of Data Collection
Business Statistics: A Decision-Making Approach 7th Edition Chapter 1 The Where, Why, and How of Data Collection
2
What is Statistics? Statistics is the development and application of methods to collect, analyze and interpret data. Statistics is a discipline which is concerned with: designing experiments and other data collection summarizing information to aid understanding drawing conclusions from data estimating the present or predicting the future.
3
Populations and Samples
A Population is the set of all items or individuals of interest Examples: All likely voters in the next election All parts produced today All sales receipts for November A Sample is a subset of the population Examples: 1000 voters selected at random for interview A few parts selected for destructive testing Every 100th receipt selected for audit
4
Population vs. Sample Population Sample a b c d b c ef gh i jk l m n
o p q rs t u v w x y z b c g i n o r u y
5
Why Sample? Less time consuming than a census
Less costly to administer than a census It is possible to obtain statistical results of a sufficiently high precision based on samples.
6
Nonstatistical Sampling
Sampling Techniques Sampling Techniques Nonstatistical Sampling Statistical Sampling (Simple) Random Convenience Systematic Judgment Cluster Stratified Not interested in……
7
(Probability Sampling)
Statistical Sampling Items of the sample are chosen based on known or calculable probabilities Statistical Sampling (Probability Sampling) (Simple) Random Stratified Systematic Cluster Video Clip
8
Random Sampling Every possible sample of a given size has an equal chance of being selected Selection may be with replacement or without replacement The sample can be obtained using a table of random numbers or computer random number generator
9
Stratified Random Sampling
Watch Video Clip: Samples and Surveys (#14) More often used than “Systematic” and “Cluster” Stratified random sampling could be used to divide the employees into groups with similar characteristics that might affect preferences like marital status or age and then simple random samples can be taken from each group.
10
Systematic Random Sampling
Decide on sample size: n (sample) Divide frame of N (population) individuals into groups of k individuals: k=N/n Randomly select one individual from the 1st group Select every kth individual thereafter N = 64 n = 8 k = 8 First Group
11
Cluster Sampling Divide population into several “clusters,” each representative of the population Select a simple random sample of clusters All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique Population divided into 16 clusters. Randomly selected clusters for sample
12
Two Basic Divisions of Statistics
Descriptive statistics Descriptive statistics are numbers that are used to summarize and describe data. Examples: Average salary of various occupations Median house price in Bakersfield, CA Descriptive statistics do not infer the properties of the population from which the sample was drawn --- do not involve generalization
13
Descriptive Statistics
Collect data e.g., Survey, Observation, Experiments Present data e.g., Charts and graphs Characterize data e.g., Sample mean =
14
Inferential Statistics
You have been hired by the National Election Commission to examine how the American people feel about the fairness of the voting procedures in the U.S. Who will you ask? Ask every single American Ask randomly selected a small group (sample) of Americans and then draw inferences about the entire country from their responses.
15
Inferential Statistics
are used to draw inferences about a population from a sample (two main methods: estimation and hypothesis testing). Sample selection is “critical” matter…. Not from a particular state Not from a particular party
16
Tools for Collecting Data
Data Collection Methods Domino Pizza example from What is Statistics? (#1) Also Watch Video Clip: Samples and Surveys (#14) Experiments Written questionnaires Telephone surveys Direct observation and personal interview
17
Survey Design Steps Define the issue Define the population of interest
What are the purpose and objectives of the survey? How will the survey be administered? (e.g. phone, , face to face) Define the population of interest Develop survey questions Make questions clear and unambiguous Use universally-accepted definitions Limit the number of questions
18
Survey Design Steps Pre-test the survey
(continued) Pre-test the survey Pilot test with a small group of participants Assess clarity and length Determine the sample size and sampling method Select sample and administer the survey
19
Types of Questions Closed-end Questions
Select from a short list of defined choices Example: Major: __business __liberal arts __science __other Open-end Questions Respondents are free to respond with any value, words, or statement Example: What did you like best about this course? Demographic Questions Questions about the respondents’ personal characteristics Example: Gender: __Female __ Male
20
Observations and Interviews
Data collected is observed and recorded based on what takes place Very subjective Example: Observe reactions of customers to a new store layout Interviews Can be structured – fixed set of questions Can use a variety of questions Requires more time from the researcher
21
Data Collection Pitfalls
Interview bias Non response Selection bias Observer bias Measurement error Internal/External validity The objective is to collect accurate and reliable data!
22
Data (variable) Types Data Qualitative (Categorical) Quantitative
(Numerical) Discrete Continuous Examples: Marital Status Political Party Eye Color (Defined categories) Examples: Weight Voltage (Measured characteristics) Examples: Number of Children Defects per hour (Counted items)
23
Qualitative vs. Quantitative Variables (Data)
Qualitative variables (data) take on values that are names or labels. Quantitative variables are numerical. They represent a measurable quantity. Quantitative variables can be further classified as discrete or continuous: If a variable can take on any value between its minimum value and its maximum value, it is called a continuous variable; otherwise, it is called a discrete variable. Video clip (easy and simple): Introduction to Variables
24
Data Measurement Levels (Please see the Video Clip: Scales of Measurement)
Highest Level Complete Analysis Measurements Ratio/Interval Data Rankings Ordered Categories Higher Level Mid-level Analysis Ordinal Data Categorical Codes ID Numbers Category Names Lowest Level Basic Analysis Nominal Data
25
Data Types Time Series Data Cross Section Data
Ordered data values observed over time Cross Section Data Data values observed at a fixed point in time
26
Data Types Sales (in $1000’s) 2003 2004 2005 2006 Atlanta 435 460 475
490 Boston 320 345 375 395 Cleveland 405 390 410 Denver 260 270 285 280 Time Series Data Cross Section Data
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.