Published by Hilary Bishop. Modified over 9 years ago.
2
Business Statistics
3
Outline
Dealing with decision problems in which uncertainty plays an important role.
- Descriptive Statistics
- Sampling and Sampling Distributions
- Point and Interval Estimation
- Hypothesis Testing
- Non-parametric Test - Chi-square Test
- Analysis of Variance
4
Outline (cont.)
- Time Series and Forecasting
- Survey and Sampling Methods
- Multivariate Analysis
- Bayesian Statistics and Decision Analysis
5
Descriptive Statistics Session 1
6
Descriptive Statistics
- Population and sample
- Measures of Central Tendency: Mean, Median, Mode
- Measures of Dispersion: Variance, Standard deviation, Percentile, Inter-quartile range
- Grouped data and histogram
- Other data representations
7
Population and Sample
Population: The population consists of the set of all measurements in which the investigator is interested. The population is also called the universe.
Sample: A sample is a subset of measurements selected from the population. Sampling from the population is often done randomly, i.e. such that every possible sample of n elements has an equal chance of being selected. A sample created in this way is called a simple random sample, or just a random sample.
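A minimal sketch of simple random sampling, assuming a made-up population of 100 ID numbers (the population, the sample size and the seed are illustrative and not from the slides). Python's `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every size-n subset is equally likely."""
    rng = random.Random(seed)          # seeded only to make the run repeatable
    return rng.sample(list(population), n)

# Hypothetical population: ID numbers 1..100 (illustrative only)
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```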
8
Example 1.1.
A medical manufacturer interested in marketing a new drug may be required by the Food and Drug Administration (FDA) to prove that the drug does not cause any serious side effects. The drug is tested on a randomly selected sample of people, and the test results are then used for statistical inference about the entire population of people who may use the drug if it is introduced.
9
Illustration for simple random sampling
10
Measures of Central Tendency
Mean
Arithmetic Mean (AM): Given a set of data x_1, x_2, ..., x_n, the arithmetic mean is defined as follows:
AM = (x_1 + x_2 + ... + x_n) / n
This kind of mean is the most frequently used.
Mode: The mode of a data set is the value that occurs most frequently.
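As a quick illustration of the arithmetic mean and the mode, here is a small sketch using Python's standard `statistics` module. The five data values are made up for the illustration.

```python
from statistics import mean, multimode

data = [2, 3, 3, 5, 7]            # illustrative values, not from the slides

am = sum(data) / len(data)        # arithmetic mean: sum divided by count
modes = multimode(data)           # all most-frequent values (handles ties)

print(am)          # 4.0
print(mean(data))  # same value via the standard library
print(modes)       # [3]
```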
11
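Measures of Central Tendency
Harmonic Mean (HM): the reciprocal of the arithmetic mean of the reciprocals of the data:
HM = n / (1/x_1 + 1/x_2 + ... + 1/x_n)
This kind of mean is used when averaging rates such as velocities, for example the average speed over several legs of equal distance.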
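A small sketch of why the harmonic mean suits velocities, assuming a hypothetical trip driven at 40 km/h one way and 60 km/h back over the same distance (the speeds and the distance are illustrative). The average speed for the whole trip is total distance over total time, which equals the harmonic mean of the two speeds.

```python
from statistics import harmonic_mean

speeds = [40, 60]   # km/h on two legs of equal length (hypothetical)

# Harmonic mean from its definition: n / sum of reciprocals
hm = len(speeds) / sum(1 / v for v in speeds)

# Cross-check with total distance / total time for legs of d km each
d = 120
avg_speed = (2 * d) / (d / 40 + d / 60)

print(hm, harmonic_mean(speeds), avg_speed)   # all approximately 48 km/h
```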
12
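Measures of Central Tendency
Population Mean: μ = (x_1 + x_2 + ... + x_N) / N, where N is the number of elements in the population.
Sample Mean: x̄ = (x_1 + x_2 + ... + x_n) / n, where n is the number of elements in the sample.
Median: The median of a set of observations is the point positioned so that half of the data lie below it and half above it.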
13
Example 1.2. Find the median of the following two sets of data.
Set 1: 15 20 7 9 18 (n=5)
Set 2: 20.7 22.5 15.8 40.3 33.4 21.1 (n=6)
Solution:
Set 1: Ordered: 7, 9, 15, 18, 20; the median is 15.
Set 2: Ordered: 15.8, 20.7, 21.1, 22.5, 33.4, 40.3; median = (21.1 + 22.5)/2 = 21.8.
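The two medians of Example 1.2 can be checked with a short sketch using `statistics.median`, which sorts the data and averages the two middle values when n is even.

```python
from statistics import median

set1 = [15, 20, 7, 9, 18]                      # n = 5 (odd)
set2 = [20.7, 22.5, 15.8, 40.3, 33.4, 21.1]    # n = 6 (even)

median1 = median(set1)   # middle value of the ordered data
median2 = median(set2)   # average of the two middle values

print(sorted(set1), median1)   # median is 15
print(sorted(set2), median2)   # median is approximately 21.8
```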
14
Measurements of Dispersion
Variance and Standard Deviation
The variance of a set of observations is the average squared deviation of the data points from their mean.
Sample Variance: s^2 = [(x_1 - x̄)^2 + (x_2 - x̄)^2 + ... + (x_n - x̄)^2] / (n - 1)
Note: The denominator is (n - 1), not n.
15
Measurements of Dispersion
Variance and Standard Deviation
Population Variance: σ^2 = [(x_1 - μ)^2 + (x_2 - μ)^2 + ... + (x_N - μ)^2] / N
The standard deviation of a set of observations is the square root of the variance of the set.
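The sketch below computes the sample variance (denominator n - 1) and the population variance (denominator N) directly from their definitions and compares them with the standard library. It reuses Set 1 of Example 1.2 purely as illustrative data; treating it as a sample or as a whole population is an assumption made for the demonstration.

```python
import math
from statistics import variance, pvariance

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

data = [15, 20, 7, 9, 18]             # Set 1 of Example 1.2
s2 = sample_variance(data)            # about 31.7
sigma2 = population_variance(data)    # about 25.36
s = math.sqrt(s2)                     # sample standard deviation

print(s2, variance(data))             # hand-rolled vs. library
print(sigma2, pvariance(data))
print(s)
```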
16
Measurements of Dispersion
Percentiles
The P-th percentile of a group of numbers is the value below which lie P% (P percent) of the numbers in the group. Its position in the ordered data is given by (n+1)*P/100, where n is the number of data points. (Example: GRE and GMAT test scores are reported with percentiles.)
17
Measurements of Dispersion
Quartiles
The percentage points that break the ordered data set into four quarters.
1st quartile Q1 is the 25th percentile.
2nd quartile Q2 is the 50th percentile (the median).
3rd quartile Q3 is the 75th percentile.
Inter-Quartile Range: IQR = Q3 - Q1
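A minimal sketch of the position rule (n+1)*P/100 for percentiles and quartiles. The slides give only the position, so two things here are assumptions: linear interpolation between the two neighbouring ordered values when the position is not an integer, and clamping to the smallest or largest value when the position falls outside 1..n. The function name `percentile` and the small data set are illustrative.

```python
def percentile(data, p):
    """P-th percentile using the position (n + 1) * P / 100 in the ordered data.

    Assumption: fractional positions are linearly interpolated, and
    positions outside 1..n are clamped to the minimum or maximum.
    """
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p / 100            # 1-based position
    if pos <= 1:
        return xs[0]
    if pos >= n:
        return xs[-1]
    lo = int(pos)                      # ordered value just below the position
    frac = pos - lo
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

data = [15, 20, 7, 9, 18]              # illustrative data, n = 5
q1 = percentile(data, 25)              # position 1.5 -> 8.0
q2 = percentile(data, 50)              # position 3   -> 15
q3 = percentile(data, 75)              # position 4.5 -> 19.0
iqr = q3 - q1                          # 11.0
print(q1, q2, q3, iqr)
```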
18
Measurements of Dispersion
Example 1.3. Given a data set of 22 points:
88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69, 77, 71, 45, 60, 78, 90, 81, 87, 44, 80, 41.
Find the 20th, 30th and 90th percentiles. Also find the IQR. What are the mean, mode and median? What is the variance of the set? (SPSS)
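Example 1.3 is meant to be worked in SPSS. As a cross-check, here is a sketch in Python using the standard `statistics` module. `quantiles(..., method="exclusive")` places the P-th percentile at position (n+1)*P/100 and interpolates linearly, which matches the position rule given earlier. SPSS may use a different percentile convention by default, so its output can differ slightly. The variance is computed as a sample variance (denominator n - 1), which is an assumption about what the exercise intends.

```python
from statistics import mean, median, multimode, quantiles, variance

data = [88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69,
        77, 71, 45, 60, 78, 90, 81, 87, 44, 80, 41]

# 99 cut points; pct[P - 1] is the P-th percentile
pct = quantiles(data, n=100, method="exclusive")
p20, p30, p90 = pct[19], pct[29], pct[89]   # 45, 53.8, 89.4
q1, q3 = pct[24], pct[74]                   # 50.25, 80.25
iqr = q3 - q1                               # 30

print("percentiles:", p20, p30, p90)
print("IQR:", iqr)
print("mean:", mean(data))                  # 1473 / 22, about 66.95
print("median:", median(data))              # 70
print("mode:", multimode(data))             # [45]
print("sample variance:", variance(data))
```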
19
Grouped Data and Histogram
Classes: We divide the data values into classes that have the same width and together cover all data points. Each class i is represented by a single observation value m_i.
Frequencies f_i: The number of observations in each class. The total of the frequencies is the number of observations N. The relative frequency of each class is the ratio of its frequency to N, i.e. f_i / N.
Histogram
20
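Grouped Data and Histogram
Mean and Variance of grouped data
Population Mean: μ = (f_1 m_1 + f_2 m_2 + ... + f_K m_K) / N
Population Variance: σ^2 = [f_1 (m_1 - μ)^2 + ... + f_K (m_K - μ)^2] / N
Sample Mean: x̄ = (f_1 m_1 + f_2 m_2 + ... + f_K m_K) / n
Sample Variance: s^2 = [f_1 (m_1 - x̄)^2 + ... + f_K (m_K - x̄)^2] / (n - 1)
where K is the number of classes, N is the number of observations in the population, and n is the number of observations in the sample.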
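A short sketch of the grouped-data mean and variance, weighting each class value m_i by its frequency f_i. The class values and frequencies below are made up for illustration and are not taken from the slides; the function names are likewise hypothetical.

```python
def grouped_mean(m, f):
    """Mean of grouped data: sum of f_i * m_i divided by the total frequency."""
    n = sum(f)
    return sum(fi * mi for mi, fi in zip(m, f)) / n

def grouped_variance(m, f, sample=True):
    """Variance of grouped data; denominator n - 1 for a sample, N for a population."""
    n = sum(f)
    mu = grouped_mean(m, f)
    ss = sum(fi * (mi - mu) ** 2 for mi, fi in zip(m, f))
    return ss / (n - 1 if sample else n)

# Hypothetical classes: class values m_i with frequencies f_i
m = [0, 1, 2, 3]
f = [5, 3, 1, 1]

relative = [fi / sum(f) for fi in f]           # relative frequencies f_i / N
print(relative)
print(grouped_mean(m, f))                      # about 0.8
print(grouped_variance(m, f, sample=True))     # about 9.6 / 9
print(grouped_variance(m, f, sample=False))    # about 0.96
```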
21
Grouped Data and Histogram
Example 1.4
The number of errors in a textbook was counted. Column (m_i) gives the number of errors per page, and column (f_i) gives the number of pages with that many errors. The following table and charts show the histogram of the distribution of errors:
22
Example 1.4
23
Other Descriptive Statistics
Index numbers
Simple index numbers: An index number is a number that measures the relative change in a set of measurements over time.
Index number for period i = 100 * (value in period i / value in base period)
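A minimal sketch of a simple index number, using the beef prices for 1993 to 1995 from the price table later in this section, with 1993 chosen as the base period. The function name `simple_index` is hypothetical.

```python
def simple_index(values, base_period):
    """Index for each period = 100 * value in period / value in base period."""
    base = values[base_period]
    return {period: 100 * v / base for period, v in values.items()}

beef_price = {1993: 238, 1994: 240, 1995: 233}
idx = simple_index(beef_price, base_period=1993)

for year, value in idx.items():
    print(year, round(value, 2))   # 1993 is 100.0 by construction
```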
24
Other Descriptive Statistics
25
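Other Descriptive Statistics
Consumer Price Index - Laspeyres Index
The Laspeyres index measures the change in the cost of a fixed basket of items: prices in each period are weighted by the quantities of the base period.
Laspeyres index for period t = 100 * (sum of price in period t x base quantity) / (sum of base price x base quantity)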
26
Other Descriptive Statistics

Items       1993             1994             1995
            Price  Quantity  Price  Quantity  Price  Quantity
Beef        238    50        240    52        233    54
Pork        140    26        162    24        162    20
Eggs         85    15        102    12         80    10
Milk        105    85        112    91        113    92
Bread        51    30         54    28         55    28
Potatoes    180    10        191    12        160    11
Tomatoes     46     5         50     6         53     4
Oranges      42     7         53     7         52     8
27
Other Descriptive Statistics
Compute the Laspeyres Index:
- Select year 1993 as the base year. Each sum below uses the 1993 quantities with that year's prices.
  For 1993: Sum of quantity x price = 29594
  For 1994: Sum of quantity x price = 31413
  For 1995: Sum of quantity x price = 30546
- Laspeyres Index:
  For 1993: 100
  For 1994: 106.15
  For 1995: 103.22
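The computation above can be reproduced with a short sketch. The prices and the 1993 quantities are read from the items table (in the order Beef, Pork, Eggs, Milk, Bread, Potatoes, Tomatoes, Oranges); the function names are hypothetical.

```python
# Prices per year, in the item order listed above
prices = {
    1993: [238, 140, 85, 105, 51, 180, 46, 42],
    1994: [240, 162, 102, 112, 54, 191, 50, 53],
    1995: [233, 162, 80, 113, 55, 160, 53, 52],
}
base_year = 1993
base_qty = [50, 26, 15, 85, 30, 10, 5, 7]   # 1993 quantities

def basket_cost(p, q):
    """Cost of the basket: sum of price x quantity over all items."""
    return sum(pi * qi for pi, qi in zip(p, q))

def laspeyres(year):
    """100 * cost of the base-year basket at this year's prices / at base prices."""
    return 100 * basket_cost(prices[year], base_qty) / basket_cost(prices[base_year], base_qty)

for year in prices:
    print(year, basket_cost(prices[year], base_qty), round(laspeyres(year), 2))
```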
28
Other Descriptive Statistics
Stem-and-Leaf Displays
A way of rearranging data that lets the data “speak for themselves”.
Example. Given the data set:
11, 12, 12, 13, 14, 15, 15, 16, 20, 21, 21, 21, 21, 22, 25, 25, 26, 27, 28, 29, 29, 31, 32, 34, 35, 36, 38, 41, 42, 45, 47, 50, 52, 55, 60, 62
29
Other Descriptive Statistics
The stem-and-leaf display (stem = tens digit, leaves = units digits):
1 | 12234556
2 | 0111125567899
3 | 124568
4 | 1257
5 | 025
6 | 02
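A minimal sketch that builds the stem-and-leaf display for the data set of this example, taking the tens digit as the stem and the units digit as the leaf. It assumes non-negative two-digit integers; the function name and the `|` separator are illustrative choices.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return one text line per stem: tens digit, then the sorted units digits."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)      # stem = tens, leaf = units
    return [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(stems.items())]

data = [11, 12, 12, 13, 14, 15, 15, 16, 20, 21, 21, 21, 21, 22, 25, 25, 26,
        27, 28, 29, 29, 31, 32, 34, 35, 36, 38, 41, 42, 45, 47, 50, 52, 55,
        60, 62]

lines = stem_and_leaf(data)
print("\n".join(lines))
```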
30
Other Descriptive Statistics
Box-Whisker plot
31
Examples for Box-Whiskers plot
32
Box-Whisker plots (or box plots) are useful for the following purposes:
- To identify the spread of a data set.
- To identify the location of a data set, based on its median.
- To identify possible skewness of the distribution.
- To identify suspected outliers and outliers.
- To quickly compare data sets.
Look at the example in SPSS.
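The numbers behind a box plot can be sketched as follows. The slide text does not state the outlier rule, so this sketch assumes a common convention: values beyond 1.5 x IQR from the quartiles (inner fences) are suspected outliers, and values beyond 3 x IQR (outer fences) are outliers. The data are the stem-and-leaf data set from earlier in this section; the extra value 95 is hypothetical and added only to trigger the rule.

```python
from statistics import quantiles

def box_summary(data):
    """Quartiles, IQR and fence-based outlier lists for a box plot.

    Assumption: inner fences at 1.5 * IQR, outer fences at 3 * IQR.
    """
    q1, med, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    outliers = [x for x in data if x < outer[0] or x > outer[1]]
    suspected = [x for x in data
                 if (x < inner[0] or x > inner[1]) and x not in outliers]
    return {"min": min(data), "q1": q1, "median": med, "q3": q3,
            "max": max(data), "iqr": iqr,
            "suspected": suspected, "outliers": outliers}

data = [11, 12, 12, 13, 14, 15, 15, 16, 20, 21, 21, 21, 21, 22, 25, 25, 26,
        27, 28, 29, 29, 31, 32, 34, 35, 36, 38, 41, 42, 45, 47, 50, 52, 55,
        60, 62]

summary = box_summary(data)
print(summary)                       # q1 = 20.25, median = 27.5, q3 = 40.25

with_extra = box_summary(data + [95])   # 95 is a hypothetical extra value
print(with_extra["suspected"])          # [95]
```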