An Introduction to Business Statistics

Slides:



Advertisements
Similar presentations
Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 1 An Introduction to Business Statistics.
Advertisements

An Introduction to Business Statistics
Chapter 1 A First Look at Statistics and Data Collection.
Chapter One An Introduction to Business Statistics McGraw-Hill/Irwin Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 3 Goals After completing this chapter, you should be able to: Describe key data collection methods Know key definitions:  Population vs. Sample.
McGraw-Hill Ryerson Copyright © 2011 McGraw-Hill Ryerson Limited. Adapted by Peter Au, George Brown College.
MATH1342 S08 – 7:00A-8:15A T/R BB218 SPRING 2014 Daryl Rupp.
Chapter 1: Introduction to Statistics
Chapter 1 Introduction and Data Collection
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 1 An Introduction to Business Statistics.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 1 An Introduction to Business Statistics.
Chapter 1: The Nature of Statistics
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 1 An Introduction to Business Statistics.
Unit 1 – Intro to Statistics Terminology Sampling and Bias Experimental versus Observational Studies Experimental Design.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 7 Sampling and Sampling Distributions.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 1 An Introduction to Business Statistics.
Lecture 1 Stat Applications, Types of Data And Statistical Inference.
1 Introduction to Statistics. 2 What is Statistics? The gathering, organization, analysis, and presentation of numerical information.
Chapter 1: Section 2-4 Variables and types of Data.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 1 An Introduction to Business Statistics.
Statistics Terminology. What is statistics? The science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 1-1 Statistics for Managers Using Microsoft ® Excel 4 th Edition Chapter.
Chapter 1: Statistics, Data and Statistical Thinking
Sampling and Sampling Distributions
Elementary Statistics
Learning Objectives : After completing this lesson, you should be able to: Describe key data collection methods Know key definitions: Population vs. Sample.
Sources of Error In Sampling
Introduction to Statistics
Chapter 14 Sampling PowerPoint presentation developed by:
Sampling Why use sampling? Terms and definitions
Chapter 7 (b) – Point Estimation and Sampling Distributions
Part III – Gathering Data
Chapter 1: Statistics, Data, and Statistical Thinking
Section 5.1 Designing Samples
Sampling Methods and the Central Limit Theorem
Arrangements or patterns for producing data are called designs
Chapter 10 Samples.
Statistics – Chapter 1 Data Collection
Probability and Statistics
Graduate School of Business Leadership
Inference for Sampling
Basic Statistics.
Chapter 4 Sampling Design.
statistics Specific number
Chapter 1 Getting Started Understandable Statistics Ninth Edition
Information from Samples
Federalist Papers Activity
The Nature of Probability and Statistics
Arrangements or patterns for producing data are called designs
Introduction to Statistics
Gathering and Organizing Data
statistics Specific number
The Nature of Probability and Statistics
Chapter 1 The Where, Why, and How of Data Collection
Week 3 Lecture Statistics For Decision Making
Use your Chapter 1 notes to complete the following warm-up.
Chapter 1: Statistics.
Chapter 1 The Where, Why, and How of Data Collection
Statistics · the study of information Data · information
Section 5.1 Designing Samples
Chapter 5: Producing Data
MATH 2311 Section 6.1.
The Nature of Probability and Statistics
Gathering and Organizing Data
Chapter 7 Sampling and Sampling Distributions
The Where, Why, and How of Data Collection
Chapter 1: Statistics, Data and Statistical Thinking
Probability and Statistics
Chapter 1 The Where, Why, and How of Data Collection
Presentation transcript:

An Introduction to Business Statistics Chapter 1 An Introduction to Business Statistics

An Introduction to Business Statistics 1.1 Populations and Samples 1.2 Selecting a Random Sample 1.3 Ratio, Interval, Ordinal, and Nominative Scales of Measurement (Optional) 1.4 An Introduction to Survey Sampling (Optional) 1.5 More About Data Acquisition and Survey Sampling (Optional)

Populations and Samples A set of existing units (usually people, objects or events) Variable A measurable characteristic of the population Census An examination of the entire population of measurements Sample A selected subset of the units of a population

Sample from Population

Terminology Measurement Value Quantitative Qualitative Population of Measurement Census Sample Descriptive Statistics Statistical Inference

Measurement The process of determining the extent, quantity, amount, etc, of the variable of interest for some a particular item of the population. Produces data For example, collecting annual starting salaries of graduates from last year’s MBA program

Value The result of measurement. The specific measurement for a particular unit in the population For example, the starting salaries of graduates from last year’s MBA Program

Quantitative Measurements that represent quantities. (For example, “how much” or “how many.”) Annual starting salary is quantitative Age and number of children are also quantitative

Qualitative A descriptive category to which a population unit belongs: a descriptive attribute of a population unit. A person’s gender is qualitative A person’s hair color is also qualitative

Population of Measurements Measurement of the variable of interest for each and every population unit. Sometimes referred to as an observation For example, annual starting salaries of all graduates from last year’s MBA program

Census The process of collecting the population of all measurements is a census. Census usually too expensive, too time consuming, and too much effort for a large population

Sample For example, a university graduated 8,742 students A subset of population units. For example, a university graduated 8,742 students This is too large for a census So, we select a sample of these graduates and learn their annual starting salaries

Sample of Measurements Measured values of the variable of interest for the sample units For example, the actual annual starting salaries of the sampled graduates

Descriptive Statistics The science of describing the important aspects of a set of measurements. For example, for a set of annual starting salaries, want to know: How much to expect What is a high versus low salary If the population is small, could take a census and make statistical inferences But if the population is too large, then …

Statistical Inference The science of using a sample of measurements to make generalizations about the important aspects of a population of measurements. For example, use a sample of starting salaries to estimate the important aspects of the population of starting salaries

Selecting a Random Sample A random sample is a sample selected from a population so that: Each population unit has the same chance of being selected as every other unit Each possible sample (of the same size) has the same chance of being selected

Random Sample Example Randomly pick two different people from a group of 15: Number the people from 1 to 15 and write their numbers on 15 different slips of paper Thoroughly mix the papers and randomly pick two of them The numbers on the slips identifies the people for the sample

How to Pick? Sample with replacement Sample without replacement

Sample with Replacement Replace each sampled unit before picking next unit The unit is placed back into the population for possible reselection However, the same unit in the sample does not contribute new information

Sample Without Replacement A sampled unit is withheld from possibly being selected again in the same sample Guarantees a sample of different units Each sampled unit contributes different information Sampling without replacement is the usual and customary sampling method

Drawing the Random Sample If the population is large, use a table of random numbers In large sampling projects, tables of random numbers are often used to automate the sample selection process See Table 1.1 in the textbook for a table of random numbers Portion on next slide

Portion of Random Number Table

Using Random Number Tables For a demonstration of the use of random numbers, read Example 1.1, “Cell Phone Case: Estimating Cell Phone Costs,” in the textbook Use random numbers to randomly select 100 employees from a bank with 2,136 employees Random numbers can be computer-generated

Approximately Random Samples In general, must make a list identifying each and every individual population unit Called a frame If the population is very large, it may not be possible to list every individual population unit So instead draw a “systematic” sample

Systematic Sample Randomly enter the population and systematically sample every kth unit This usually approximates a random sample Read Example 1.2, “Marketing Research Case: Rating a New Bottle Design,” in the textbook

Example 1.2: Rating a New Bottle Design Wish to determine consumer reaction to a new bottle design Will use the “mall intercept method” Shoppers in a mall are intercepted and asked to participate in a consumer survey Asked to rate a new bottle

Example 1.2: Using Systematic Sample Cannot list and number every shopper As a result, cannot use random numbers Instead, will use a systematic sample Every 100th shopper is selected Using every 100th shopper is arbitrary Using widely spaced shoppers, can be reasonable sure not related

Problems With Non-Random Samples For presidential election of 1936, Literary Digest predicted Alf Landon would defeat Franklin D. Roosevelt Instead Roosevelt won in a landslide Literary Digest’s mistake was to sample names from telephone books and club membership rosters Many people did not have phones or belong to clubs As a result, they were not included in sample They voted overwhelmingly for Roosevelt

Voluntary Response Sample Participants select themselves to be in the sample Participants “self-select” For example, voting on American Idol Commonly referred to as a “non-scientific” sample Usually not representative of the population Over-represent individuals with strong opinions Usually, but not always, negative opinions

Sampling a Process Process Process Inputs Outputs A sequence of operations that takes inputs (labor, raw materials, methods, machines, etc) and turns them into outputs (products, services, and the like). Process Inputs Outputs

Process “Population” The “population” from a process is all output produced in the past, present, and the future. For example, all automobiles of a particular make and model. For instance, the Chrysler Sebring Cars will continue to be made over time

Population Size Finite Infinite

Finite Population Finite if it is of fixed and limited size Finite if it can be counted Even if very large For example, all the Chrysler Sebring cars actually made during just this model year is a finite population Because a specific number of cars was made between the start and end of the model year

Infinite Population Infinite if it is unlimited Infinite if listing or counting every element is impossible For example, all the Chrysler Sebring cars that could have possibly been made this model year is an infinite population

Statistical Control A process is in statistical control if it does not exhibit any unusual process variations. A process in statistical control displays a constant amount of variation around a constant level A process not in statistical control is “out of control”

Statistical Control Continued To determine if a process is in control or not, sample the process often enough to detect unusual variations. Issue: How often to sample? See Example 1.3, “The Car Mileage Case: Estimating Mileage,” in the textbook

Runs Plot A runs plot is a graph of actual individual measurements of process output over time Process output (the variable of interest) is plotted on the vertical axis against time plotted on the horizontal axis The constant process level is plotted as a horizontal line The variation is plotted as an up and down movement as time goes by of the individual measurements, relative to the constant level

Runs Plot

Temperature of Coffee Consider the coffee temperature case of Problem 1.12 Coffee was sampled every half hour from 10:00 AM to 9:30 PM, and its temperature measured The 24 timed measurements are graphed in the runs plot

Temperatures Recorded

Runs Plot

Results Over time, temperatures appear to have a fairly constant amount of variation around a fairly constant level The temperature is expected to be at the constant level shown by the horizontal blue line Sometimes the temperature is higher and sometimes lower than the constant level About the same amount of spread of the values (data points) around the constant level The points are as far above the line as below it The data points appear to form a horizontal band So, the process is in statistical control Coffee-making process is operating “consistently”

Outcome Because the coffee temperature has been and is presently in control, it will likely stay in control in the future If the coffee making process stays in control, then coffee temperature is predicted to be between 152o and 170o F In general, if the process appears from the runs plot to be in control, then it will probably remain in control in the future The sample of measurements was approximately random Future process performance is predictable

Out of Control If there is a trend in the process performance Future performance of the process will be outside established limits

Out of Control If, there is a constant level, but the amount of the variation is varying as time goes by Data points fan out from or neck down to the constant level

Statistical Process Control The real purpose is to see if the process is out of control so corrective action can be taken if necessary Must investigate further to find out why the process is out of control

Ratio, Interval, Ordinal, and Nominative Scales of Measurement (Optional)

Qualitative Variables Descriptive categorization of population or sample units Two types: Nominative Ordinal

Quantitative Variables Numerical values represent quantities measured with a fixed or standard unit of measure Two types: Interval Ratio

Qualitative Variables Nominative Identifier or name Unranked categorization Example: gender, car color

Qualitative Variables Ordinal All characteristics of nominative plus… Rank-order categories Ranks are relative to each other Example: Low (1), moderate (2) or high (3) risk

Interval Variable All of the characteristics of ordinal plus… Measurements are on a numerical scale with an arbitrary zero point The “zero” is assigned: it is nonphysical and not meaningful Zero does not mean the absence of the quantity that we are trying to measure

Interval Variable Continued Can only meaningfully compare values by the interval between them Cannot compare values by taking their ratios “Interval” is the arithmetic difference between the values Example: temperature 0 F means “cold,” not “no heat” 80 F is not twice as warm as 40 F

Ratio Variable All the characteristics of interval plus… Measurements are on a numerical scale with a meaningful zero point Zero means “none” or “nothing” Values can be compared in terms of their interval and ratio $30 is $20 more than $10 $0 means no money

Ratio Variable Continued In business and finance, most quantitative variables are ratio variables, such as anything to do with money Examples: Earnings, profit, loss, age, distance, height, weight

An Introduction to Survey Sampling (Optional) Already know some sampling methods Also called sampling designs, they are: Random sampling Systematic sampling Voluntary response sampling But there are other sample designs Stratified random sampling Cluster sampling

Stratified Random Sample Divide the population into non-overlapping groups, called strata, of similar units Separately, select a random sample from each and every stratum Combine the random samples from each stratum to make the full sample

Stratified Random Sample Continued Appropriate when the population consists of two or more different groups so that: The groups differ from each other with respect to the variable of interest Units within a group are similar to each other For example, divide population into strata by age, gender, income, etc

Stratified Random Sample Advantages Takes advantage of the fact that units in the same stratum are similar to each other As a result, a stratified sample can provide more accurate information than a random sample of the same size Stratification can make a sample easier to select

Cluster Sampling “Cluster” or group a population into subpopulations Cluster by geography, time, etc… Each cluster is a representative small-scale version of the population (i.e. heterogeneous group) A simple random sample is chosen from each cluster Combine the random samples from each cluster to make the full sample

Cluster Sampling Continued Appropriate for populations spread over a large geographic area so that… There are different sections or regions in the area with respect to the variable of interest A random sample of the cluster

Combination It is sometimes a good idea to combine stratification with multistage cluster sampling For example, we wish to estimate the proportion of all registered voters who favor a presidential candidate Divide United States into regions Use these regions as strata Take a multistage cluster sample from each stratum

Systematic Sampling Saw back in Example 1.2 To systematically select n units without replacement from a frame of N units, divide N by n and round down to a whole number Randomly select one unit within the first N/n interval Select every N/nth unit after that

More About Data Acquisition and Survey Sampling (Optional) Web searches… Cheap Fast Limited in type of information we are able to find Private data source… Internal company records If no affiliation, can be difficult to get data Data collection agency Cost money Buy subscription or individual reports

Experimental and Observational Studies Much of the data we need is not available from public or private source May hire consultants or statisticians to help obtain appropriate data Requires much more time and money that public or private sources

Initiating a Study First, define the variable of interest Called a response variable Next, define other variables that may be related to the variable of interest and will be measured Called independent variables If we manipulate the independent variables, we have an experimental study If unable to control independent variables, the study is observational

Types of Survey Questions Dichotomous questions ask for a yes/no response Multiple choice questions give the respondent a list of of choices to select from Open-ended questions allow the respondent to answer in their own words

Common Types of Surveys Phone survey Mail survey Self-administered survey Web-based survey Personal interview Mall survey

Errors Occurring in Surveys Random sampling should eliminate bias But even a random sample may not be representative because of: Under-coverage Too few sampled units or some of the population was excluded Non-response When a sampled unit cannot be contacted or refuses to participate Response bias Responses of selected units are not truthful