© 2002 Prentice-Hall, Inc.Chap 1-1 Basic Business Statistics (8 th Edition) Introduction and Data Collection
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-2 Chapter Topics Why a manager needs to know about statistics The growth and development of modern statistics Key definitions Descriptive versus inferential statistics
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-3 Chapter Topics Why data are needed Types of data and their sources Design of survey research Types of sampling methods Types of survey errors (continued)
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-4 Why a Manager Needs to Know about Statistics To know how to properly present information To know how to draw conclusions about populations based on sample information To know how to improve processes To know how to obtain reliable forecasts
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-5 The Growth and Development of Modern Statistics Needs of government to collect data on its citizens The development of the mathematics of probability theory The advent of the computer
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-6 Key Definitions A population (universe) is the collection of things under consideration A sample is a portion of the population selected for analysis A parameter is a summary measure computed to describe a characteristic of the population A statistic is a summary measure computed to describe a characteristic of the sample
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-7 Population and Sample PopulationSample Use parameters to summarize features Use statistics to summarize features Inference on the population from the sample
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-8 Statistical Methods Descriptive statistics Collecting and describing data Inferential statistics Drawing conclusions and/or making decisions concerning a population based only on sample data
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-9 Descriptive Statistics Collect data e.g. Survey Present data e.g. Tables and graphs Characterize data e.g. Sample mean =
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-10 Inferential Statistics Estimation e.g.: Estimate the population mean weight using the sample mean weight Hypothesis testing e.g.: Test the claim that the population mean weight is 120 pounds Drawing conclusions and/or making decisions concerning a population based on sample results.
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-11 Why We Need Data To provide input to survey To provide input to study To measure performance of service or production process To evaluate conformance to standards To assist in formulating alternative courses of action To satisfy curiosity
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-12 Data Sources Primary Data Collection Secondary Data Compilation Observation Experimentation Survey Print or Electronic
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-13 Types of Data
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-14 Design of Survey Research Choose an appropriate mode of response Reliable primary modes Personal interview Telephone interview Mail survey Less reliable self-selection modes (not appropriate for making inferences about the population) Television survey Internet survey Printed survey on newspapers and magazines Product or service questionnaires
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-15 Design of Survey Research Identify broad categories List complete and non-overlapping categories that reflect the theme Formulate accurate questions Make questions clear and unambiguous. Use universally-accepted definitions Test the survey Pilot test the survey on a small group of participants to assess clarity and length (continued)
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-16 Design of Survey Research Write a cover letter State the goal and purpose of the survey Explain the importance of a response Provide assurance of respondent’s anonymity Offer incentive gift for respondent participation (continued)
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-17 Reasons for Drawing a Sample Less time consuming than a census Less costly to administer than a census Less cumbersome and more practical to administer than a census of the targeted population
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-18 Types of Sampling Methods Quota Samples Non-Probability Samples JudgementChunk Probability Samples Simple Random Systematic Stratified Cluster
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-19 Probability Sampling Subjects of the sample are chosen based on known probabilities Probability Samples Simple Random SystematicStratifiedCluster
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-20 Simple Random Samples Every individual or item from the frame has an equal chance of being selected Selection may be with replacement or without replacement Samples obtained from table of random numbers or computer random number generators
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-21 Decide on sample size: n Divide frame of N individuals into groups of k individuals: k=n/n Randomly select one individual from the 1 st group Select every k-th individual thereafter Systematic Samples N = 64 n = 8 k = 8 First Group
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-22 Stratified Samples Population divided into two or more groups according to some common characteristic Simple random sample selected from each group The two or more samples are combined into one
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-23 Cluster Samples Population divided into several “clusters,” each representative of the population Simple random sample selected from each The samples are combined into one Population divided into 4 clusters.
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-24 Advantages and Disadvantages Simple random sample and systematic sample Simple to use May not be a good representation of the population’s underlying characteristics Stratified sample Ensures representation of individuals across the entire population Cluster sample More cost effective Less efficient (need larger sample to acquire the same level of precision)
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-25 Evaluating Survey Worthiness What is the purpose of the survey? Is the survey based on a probability sample? Coverage error – appropriate frame Nonresponse error – follow up Measurement error – good questions elicit good responses Sampling error – always exists
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-26 Types of Survey Errors Coverage error Non response error Sampling error Measurement error Excluded from frame. Follow up on non responses. Chance differences from sample to sample. Bad Question!
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-27 Chapter Summary Addressed why a manager needs to know about statistics Discussed the growth and development of modern statistics Addressed the notion of descriptive versus inferential statistics Discussed the importance of data
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-28 Chapter Summary Defined and described the different types of data and sources Discussed the design of survey Discussed types of sampling methods Described different types of survey errors (continued)
© 2002 Prentice-Hall, Inc.Chap 2-29 Basic Business Statistics (8 th Edition) Presenting Data in Tables and Charts
© 2002 Prentice-Hall, Inc. Chap 2-30 Chapter Topics Organizing numerical data The ordered array and stem-leaf display Tabulating and graphing Univariate numerical data Frequency distributions: tables, histograms, polygons Cumulative distributions: tables, the Ogive Graphing Bivariate numerical data
© 2002 Prentice-Hall, Inc. Chap 2-31 Chapter Topics Tabulating and graphing Univariate categorical data The summary table Bar and pie charts, the Pareto diagram Tabulating and graphing Bivariate categorical data Contingency tables Side by side bar charts Graphical excellence and common errors in presenting data (continued)
© 2002 Prentice-Hall, Inc. Chap 2-32 Organizing Numerical Data Numerical Data Ordered Array Stem and Leaf Display Frequency Distributions Cumulative Distributions Histograms Polygons Ogive Tables , 24, 32, 26, 27, 27, 30, 24, 38, 21 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
© 2002 Prentice-Hall, Inc. Chap 2-33 Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38 Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41 Stem-and-leaf display: Organizing Numerical Data (continued)
© 2002 Prentice-Hall, Inc. Chap 2-34 Tabulating and Graphing Numerical Data Numerical Data Ordered Array Stem and Leaf Display Histograms Ogive Tables , 24, 32, 26, 27, 27, 30, 24, 38, 21 21, 24, 24, 26, 27, 27, 30, 32, 38, 41 Frequency Distributions Cumulative Distributions Polygons
© 2002 Prentice-Hall, Inc. Chap 2-35 Tabulating Numerical Data: Frequency Distributions Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58 Find range: = 46 Select number of classes: 5 (usually between 5 and 15) Compute class interval (width): 10 (46/5 then round up) Determine class boundaries (limits): 10, 20, 30, 40, 50, 60 Compute class midpoints: 15, 25, 35, 45, 55 Count observations & assign to classes
© 2002 Prentice-Hall, Inc. Chap 2-36 Frequency Distributions, Relative Frequency Distributions and Percentage Distributions Class Frequency 10 but under but under but under but under but under Total Relative Frequency Percentage Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
© 2002 Prentice-Hall, Inc. Chap 2-37 Graphing Numerical Data: The Histogram Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58 No Gaps Between Bars Class Midpoints Class Boundaries
© 2002 Prentice-Hall, Inc. Chap 2-38 Graphing Numerical Data: The Frequency Polygon Class Midpoints Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
© 2002 Prentice-Hall, Inc. Chap 2-39 Tabulating Numerical Data: Cumulative Frequency Cumulative Cumulative Class Frequency % Frequency 10 but under but under but under but under but under Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
© 2002 Prentice-Hall, Inc. Chap 2-40 Graphing Numerical Data: The Ogive (Cumulative % Polygon) Class Boundaries (Not Midpoints) Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
© 2002 Prentice-Hall, Inc. Chap 2-41 Graphing Bivariate Numerical Data (Scatter Plot)
© 2002 Prentice-Hall, Inc. Chap 2-42 Tabulating and Graphing Categorical Data:Univariate Data Categorical Data Tabulating Data The Summary Table Graphing Data Pie Charts Pareto Diagram Bar Charts
© 2002 Prentice-Hall, Inc. Chap 2-43 Summary Table (for an Investor’s Portfolio) Investment Category AmountPercentage (in thousands $) Stocks Bonds CD Savings Total Variables are Categorical
© 2002 Prentice-Hall, Inc. Chap 2-44 Graphing Categorical Data: Univariate Data Categorical Data Tabulating Data The Summary Table Graphing Data Pie Charts Pareto Diagram Bar Charts
© 2002 Prentice-Hall, Inc. Chap 2-45 Bar Chart (for an Investor’s Portfolio)
© 2002 Prentice-Hall, Inc. Chap 2-46 Pie Chart (for an Investor’s Portfolio) Percentages are rounded to the nearest percent. Amount Invested in K$ Savings 15% CD 14% Bonds 29% Stocks 42%
© 2002 Prentice-Hall, Inc. Chap 2-47 Pareto Diagram Axis for line graph shows cumulative % invested Axis for bar chart shows % invested in each category
© 2002 Prentice-Hall, Inc. Chap 2-48 Tabulating and Graphing Bivariate Categorical Data Contingency tables: investment in thousands of dollars Investment Investor A Investor B Investor C Total Category Stocks Bonds CD Savings Total
© 2002 Prentice-Hall, Inc. Chap 2-49 Tabulating and Graphing Bivariate Categorical Data Side by side charts
© 2002 Prentice-Hall, Inc. Chap 2-50 Principles of Graphical Excellence Presents data in a way that provides substance, statistics and design Communicates complex ideas with clarity, precision and efficiency Gives the largest number of ideas in the most efficient manner Almost always involves several dimensions Tells the truth about the data
© 2002 Prentice-Hall, Inc. Chap 2-51 Using “chart junk” Failing to provide a relative basis in comparing data between groups Compressing the vertical axis Providing no zero point on the vertical axis Errors in Presenting Data
© 2002 Prentice-Hall, Inc. Chap 2-52 “Chart Junk” Good Presentation 1960: $ : $ : $ : $3.80 Minimum Wage $ Bad Presentation
© 2002 Prentice-Hall, Inc. Chap 2-53 No Relative Basis Good Presentation A’s received by students. Bad Presentation 0 FRSOJRSR Freq. 10 30 FRSOJRSR % FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
© 2002 Prentice-Hall, Inc. Chap 2-54 Compressing Vertical Axis Good Presentation Quarterly Sales Bad Presentation Q1Q2Q3 Q4 $ Q1Q2 Q3 Q4 $
© 2002 Prentice-Hall, Inc. Chap 2-55 No Zero Point on Vertical Axis Good Presentation Monthly Sales Bad Presentation J F MAMJ $ JFMAMJ $ Graphing the first six months of sales. 36
© 2002 Prentice-Hall, Inc. Chap 2-56 Chapter Summary Organized numerical data The ordered array and stem-leaf display Tabulated and graphed univariate numerical data Frequency distributions: tables, histograms, polygon Cumulative distributions: tables and the Ogive Graphed bivariate numerical data
© 2002 Prentice-Hall, Inc. Chap 2-57 Chapter Summary Tabulated and graphed univariate categorical data The summary table Bar and pie charts, the Pareto diagram Tabulated and graphed bivariate categorical data Contingency tables Side by side charts Discussed graphical excellence and common errors in presenting data (continued)
© 2002 Prentice-Hall, Inc.Chap 3-58 Basic Business Statistics (8 th Edition) Numerical Descriptive Measures
© 2002 Prentice-Hall, Inc. Chap 3-59 Chapter Topics Measures of central tendency Mean, median, mode, geometric mean, midrange Quartile Measure of variation Range, Interquartile range, variance and standard deviation, coefficient of variation Shape Symmetric, skewed, using box-and-whisker plots
© 2002 Prentice-Hall, Inc. Chap 3-60 Chapter Topics Coefficient of correlation Pitfalls in numerical descriptive measures and ethical considerations (continued)
© 2002 Prentice-Hall, Inc. Chap 3-61 Summary Measures Central Tendency Mean Median Mode Quartile Geometric Mean Summary Measures Variation Variance Standard Deviation Coefficient of Variation Range
© 2002 Prentice-Hall, Inc. Chap 3-62 Measures of Central Tendency Central Tendency AverageMedianMode Geometric Mean
© 2002 Prentice-Hall, Inc. Chap 3-63 Mean (Arithmetic Mean) Mean (arithmetic mean) of data values Sample mean Population mean Sample Size Population Size
© 2002 Prentice-Hall, Inc. Chap 3-64 Mean (Arithmetic Mean) The most common measure of central tendency Affected by extreme values (outliers) (continued) Mean = 5Mean = 6
© 2002 Prentice-Hall, Inc. Chap 3-65 Median Robust measure of central tendency Not affected by extreme values In an ordered array, the median is the “middle” number If n or N is odd, the median is the middle number If n or N is even, the median is the average of the two middle numbers Median = 5
© 2002 Prentice-Hall, Inc. Chap 3-66 Mode A measure of central tendency Value that occurs most often Not affected by extreme values Used for either numerical or categorical data There may may be no mode There may be several modes Mode = No Mode
© 2002 Prentice-Hall, Inc. Chap 3-67 Geometric Mean Useful in the measure of rate of change of a variable over time Geometric mean rate of return Measures the status of an investment over time
© 2002 Prentice-Hall, Inc. Chap 3-68 Example An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at end of year two:
© 2002 Prentice-Hall, Inc. Chap 3-69 Quartiles Split Ordered Data into 4 Quarters Position of i-th Quartile and Are Measures of Noncentral Location = Median, A Measure of Central Tendency 25% Data in Ordered Array:
© 2002 Prentice-Hall, Inc. Chap 3-70 Measures of Variation Variation VarianceStandard DeviationCoefficient of Variation Population Variance Sample Variance Population Standard Deviation Sample Standard Deviation Range Interquartile Range
© 2002 Prentice-Hall, Inc. Chap 3-71 Range Measure of variation Difference between the largest and the smallest observations: Ignores the way in which data are distributed Range = = Range = = 5
© 2002 Prentice-Hall, Inc. Chap 3-72 Measure of variation Also known as midspread Spread in the middle 50% Difference between the first and third quartiles Not affected by extreme values Interquartile Range Data in Ordered Array:
© 2002 Prentice-Hall, Inc. Chap 3-73 Important measure of variation Shows variation about the mean Sample variance: Population variance: Variance
© 2002 Prentice-Hall, Inc. Chap 3-74 Standard Deviation Most important measure of variation Shows variation about the mean Has the same units as the original data Sample standard deviation: Population standard deviation:
© 2002 Prentice-Hall, Inc. Chap 3-75 Comparing Standard Deviations Mean = 15.5 s = Data B Data A Mean = 15.5 s = Mean = 15.5 s = 4.57 Data C
© 2002 Prentice-Hall, Inc. Chap 3-76 Coefficient of Variation Measures relative variation Always in percentage (%) Shows variation relative to mean Is used to compare two or more sets of data measured in different units
© 2002 Prentice-Hall, Inc. Chap 3-77 Comparing Coefficient of Variation Stock A: Average price last year = $50 Standard deviation = $5 Stock B: Average price last year = $100 Standard deviation = $5 Coefficient of variation: Stock A: Stock B:
© 2002 Prentice-Hall, Inc. Chap 3-78 Shape of a Distribution Describes how data is distributed Measures of shape Symmetric or skewed Mean = Median =Mode Mean < Median < Mode Mode < Median < Mean Right-Skewed Left-SkewedSymmetric
© 2002 Prentice-Hall, Inc. Chap 3-79 Exploratory Data Analysis Box-and-whisker plot Graphical display of data using 5-number summary Median( ) X largest X smallest
© 2002 Prentice-Hall, Inc. Chap 3-80 Distribution Shape and Box-and-Whisker Plot Right-SkewedLeft-SkewedSymmetric
© 2002 Prentice-Hall, Inc. Chap 3-81 Coefficient of Correlation Measures the strength of the linear relationship between two quantitative variables
© 2002 Prentice-Hall, Inc. Chap 3-82 Features of Correlation Coefficient Unit free Ranges between –1 and 1 The closer to –1, the stronger the negative linear relationship The closer to 1, the stronger the positive linear relationship The closer to 0, the weaker any positive linear relationship
© 2002 Prentice-Hall, Inc. Chap 3-83 Scatter Plots of Data with Various Correlation Coefficients Y X Y X Y X Y X Y X r = -1 r = -.6r = 0 r =.6 r = 1
© 2002 Prentice-Hall, Inc. Chap 3-84 Pitfalls in Numerical Descriptive Measures Data analysis is objective Should report the summary measures that best meet the assumptions about the data set Data interpretation is subjective Should be done in fair, neutral and clear manner
© 2002 Prentice-Hall, Inc. Chap 3-85 Ethical Considerations Numerical descriptive measures: Should document both good and bad results Should be presented in a fair, objective and neutral manner Should not use inappropriate summary measures to distort facts
© 2002 Prentice-Hall, Inc. Chap 3-86 Chapter Summary Described measures of central tendency Mean, median, mode, geometric mean, midrange Discussed quartile Described measure of variation Range, interquartile range, variance and standard deviation, coefficient of variation Illustrated shape of distribution Symmetric, skewed, box-and-whisker plots
© 2002 Prentice-Hall, Inc. Chap 3-87 Chapter Summary Discussed correlation coefficient Addressed pitfalls in numerical descriptive measures and ethical considerations (continued)