Statistics The systematic and scientific treatment of quantitative measurements is known as statistics. Statistics may be called the science of counting. It is concerned with the collection, classification (or organization), presentation, analysis and interpretation of data that are measurable in numerical terms.
Stages of Statistical Investigation 1. Collection of data 2. Organization of data 3. Presentation of data 4. Analysis 5. Interpretation of results
Statistics Statistics is divided into two major parts: descriptive and inferential statistics. Descriptive statistics is a set of methods for describing, i.e. summarizing, the data we have collected. Inferential statistics is a set of methods for making a generalization, estimate, prediction or decision; it is used when we want to draw conclusions about a distribution.
Functions & Uses of Statistics – It simplifies complex data – It provides techniques for comparison – It studies relationships – It helps in formulating policies – It helps in forecasting – It is helpful for the common man – Statistical methods combined with the speed of computers can work wonders: SPSS, STATA, MATLAB, MINITAB etc.
Scope of Statistics – In Business Decision Making – In Medical Sciences – In Actuarial Science – In Economic Planning – In Agricultural Sciences – In Banking & Insurance – In Politics & Social Science
Distrust & Misuse of Statistics – Statistics is like clay, of which one can make a God or a Devil. – Statistics are liars of the first order. – Statistics can prove or disprove anything.
Measure of Central Tendency It is a single value that represents the entire mass of data. Generally, such a value lies in the central part of the distribution. It facilitates comparison & decision-making. There are mainly three types of measure: 1. Arithmetic mean 2. Median 3. Mode
Arithmetic Mean This single representative value can be determined by: A.M. = Sum of observations / No. of observations Properties: 1. The sum of the deviations from the AM is always zero. 2. If every value of the variable is increased or decreased by a constant, the AM increases or decreases by the same constant.
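Arithmetic Mean (contd..) 3. If every value of the variable is multiplied or divided by a constant, the AM is multiplied or divided by the same constant. 4. The sum of squares of deviations is minimum when the deviations are taken from the AM. 5. The combined AM of two related groups is defined as $\bar{x}_{12} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$, where $n_1, n_2$ are the group sizes and $\bar{x}_1, \bar{x}_2$ the group means; the formula extends in the same way to more than two groups.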
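The five properties of the AM can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the data values and the helper names `arithmetic_mean`, `combined_mean` and `sum_sq_dev` are hypothetical and chosen only for illustration.

```python
def arithmetic_mean(values):
    """A.M. = sum of observations / number of observations."""
    return sum(values) / len(values)


def combined_mean(n1, mean1, n2, mean2):
    """Combined AM of two groups with sizes n1, n2 and means mean1, mean2."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def sum_sq_dev(values, point):
    """Sum of squared deviations of the values from an arbitrary point."""
    return sum((x - point) ** 2 for x in values)


values = [4, 8, 15, 16, 23, 42]      # hypothetical observations
am = arithmetic_mean(values)         # 18.0

# Property 1: deviations from the AM sum to zero.
deviation_sum = sum(x - am for x in values)

# Property 2: adding a constant shifts the AM by that constant.
shifted_am = arithmetic_mean([x + 10 for x in values])   # 28.0

# Property 3: multiplying by a constant scales the AM by that constant.
scaled_am = arithmetic_mean([3 * x for x in values])     # 54.0

# Property 4: the sum of squared deviations is smallest about the AM.
sse_at_am = sum_sq_dev(values, am)
sse_elsewhere = [sum_sq_dev(values, am + d) for d in (-2, -0.5, 0.5, 2)]

# Property 5: the combined AM of two groups equals the AM of the pooled data.
group_a, group_b = values[:3], values[3:]
combined = combined_mean(len(group_a), arithmetic_mean(group_a),
                         len(group_b), arithmetic_mean(group_b))  # 18.0
```

Property 4 is only spot-checked here at a few nearby points; it is not a proof.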
Median
Mode Mode is the value that occurs most often in the series. It is the value around which the items tend to be most heavily concentrated. It is an important average when we talk about the “most common size of shoe or shirt”.
Relationship among Mean, Median & Mode For a symmetric distribution: Mode = Median = Mean. For a moderately asymmetric distribution, the empirical relationship between mean, median and mode is: Mode = 3 Median – 2 Mean
Advantages and disadvantages Mean: More sensitive than the median, because it makes use of all the values of the data. It can be misrepresentative if there is an extreme value. Median: It is not affected by extreme scores, so it can give a representative value. It is less sensitive than the mean, as it does not take into account all of the values. Mode: It is useful when the data are in categories, such as the number of babies who are securely attached. It is not a useful way of describing data when there are several modes.
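The effect of an extreme value on each average can be seen in a small experiment. This is a minimal Python sketch with hypothetical scores (not data from the slides), using the standard-library `statistics` module.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 4, 5]             # hypothetical data
with_outlier = scores + [50]         # same data plus one extreme value

# The mean uses every value, so a single outlier pulls it strongly.
print(mean(scores), median(scores))              # 3.4 3
print(mean(with_outlier), median(with_outlier))  # mean rises above 11, median is 3.5

# The mode is unambiguous for one peak, but less useful with several modes.
print(multimode(scores))                         # [3]
print(multimode([1, 1, 2, 2, 3]))                # [1, 2]
```

One extreme score moves the mean from 3.4 to more than 11, while the median moves only from 3 to 3.5.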
Same center, different variation
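Range Range = largest value - smallest value. It ignores the way in which the data are distributed: two data sets with quite different distributions can have the same range (here, Range = 5 for both).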
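To illustrate why the range alone can mislead, the sketch below compares two data sets that share the same range but differ in spread. The values are hypothetical (the original figures were not recoverable), and the population standard deviation `pstdev` is used as the comparison measure.

```python
from statistics import pstdev

# Hypothetical data: both sets run from 7 to 12, so both have range 5.
clustered = [7, 9, 9, 10, 10, 10, 11, 11, 12]   # most values near the centre
spread = [7, 7, 7, 8, 10, 11, 12, 12, 12]       # most values near the extremes


def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


print(value_range(clustered), value_range(spread))   # 5 5
print(pstdev(clustered) < pstdev(spread))            # True
```

The range reports identical variation for both sets, while the standard deviation distinguishes them.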
When the value of the arithmetic mean is a fraction (not an integer), we compute the variance with the formula: $\sigma^2 = \dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2$
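The computational shortcut, variance = (sum of x squared)/n minus the square of (sum of x)/n, avoids subtracting a fractional mean from every observation. The sketch below checks it against the definition on hypothetical values whose mean is not an integer; the function names are illustrative, and the population variance (division by n) is assumed.

```python
from math import sqrt


def variance_by_definition(values):
    """Mean of squared deviations from the arithmetic mean (divides by n)."""
    n = len(values)
    am = sum(values) / n
    return sum((x - am) ** 2 for x in values) / n


def variance_by_shortcut(values):
    """Computational form: sum(x^2)/n - (sum(x)/n)^2."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


values = [3, 7, 8, 10, 12, 15]       # hypothetical data, mean = 55/6
var_def = variance_by_definition(values)
var_short = variance_by_shortcut(values)
sd = sqrt(var_short)                 # standard deviation

print(round(var_def, 4), round(var_short, 4))   # both about 14.4722
```

Both forms agree up to floating-point rounding; the shortcut only needs the running totals of x and x squared.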
Calculate the S.D. of x:
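Formulae for Frequency Distribution By definition: $\sigma^2 = \dfrac{\sum f (x - \bar{x})^2}{N}$ For computation: $\sigma^2 = \dfrac{\sum f x^2}{N} - \left(\dfrac{\sum f x}{N}\right)^2$, where $N = \sum f$ is the total frequency.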
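For a frequency distribution, each value x is weighted by its frequency f. The following sketch applies both forms of the variance formula; the frequency table is hypothetical (it is not the data of any example in these slides), and `freq_mean_variance` is an illustrative name.

```python
def freq_mean_variance(xs, fs):
    """Mean and variance of a frequency distribution.

    xs: distinct values, fs: matching frequencies; N = sum of fs.
    Returns (mean, variance by definition, variance by computation).
    """
    n_total = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n_total
    var_definition = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n_total
    var_computation = sum(f * x * x for x, f in zip(xs, fs)) / n_total - mean ** 2
    return mean, var_definition, var_computation


# Hypothetical table: value x and number of times it was observed.
xs = [0, 1, 2, 3, 4]
fs = [2, 5, 8, 4, 1]

mean, var_def, var_comp = freq_mean_variance(xs, fs)
print(round(mean, 4), round(var_def, 4), round(var_comp, 4))   # 1.85 1.0275 1.0275
```

Here N = 20, the sum of fx is 37 and the sum of fx squared is 89, so the variance is 89/20 - (37/20)^2 = 1.0275.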
Example An analysis of production rejects resulted in the following figures. Calculate the mean and variance of the number of rejects per operator. Table columns: No. of rejects per operator | No. of operators
Example Calculate the variance from the following data. (Sale is given in thousand Rs.) Table columns: Sale | No. of days
Example An analysis of production rejects resulted in the following observations. Calculate the mean and standard deviation. Table columns: No. of rejects/operator | No. of operators
Coefficient of Variation – Measures relative variation – Always in percentage (%) – Shows variation relative to the mean: $CV = \dfrac{\sigma}{\bar{x}} \times 100\%$ – Is used to compare two or more sets of data measured in different units
Comparing Coefficient of Variation Stock A: – Average price last year = $50 – Standard deviation = $5 Stock B: – Average price last year = $100 – Standard deviation = $5
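Coefficient of variation: Stock A: CV = (5 / 50) × 100% = 10% Stock B: CV = (5 / 100) × 100% = 5% Both stocks have the same standard deviation, but stock B is less variable relative to its price.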
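The stock comparison can be reproduced in a few lines. This is a minimal sketch using the figures given for stocks A and B; the function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (standard deviation / mean) x 100, expressed in percent."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B

print(cv_a, cv_b)   # 10.0 5.0
```

The standard deviations are equal, yet stock A varies twice as much relative to its average price.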
An investment ‘A’ has an expected return of Rs. 1,000 and a standard deviation of Rs. Another investment ‘B’ has a standard deviation of returns of 400 but an expected return of 4,000. Determine which investment is more risky.
Example A quality control laboratory received samples of electric bulbs from two companies for testing their lives. The results were as follows: Table columns: Length of life (in hrs.) | Company A | Company B (a) Which company’s bulbs have the greater length of life? (b) Which company’s bulbs are more uniform with respect to their lives?
The share prices of a company in the Mumbai and Kolkata markets during the last 10 months are recorded below: Table columns: Month | Mumbai | Kolkata Rows: Jan, Feb, March, April, May, June, July, Aug, Sep, Oct (112, 135) Determine the mean and standard deviation of the share prices. In which market are the share prices more stable?
Shape of a Distribution Describes how data is distributed. Measures of shape: symmetric or skewed. Left-skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-skewed: Mode < Median < Mean.
Skewness For a positively skewed distribution: Mean > Median > Mode. For a negatively skewed distribution: Mean < Median < Mode.
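Measure of Skewness Karl Pearson's coefficient of skewness: $S_k = \dfrac{\text{Mean} - \text{Mode}}{\sigma}$ or, using the empirical relation Mode = 3 Median – 2 Mean, $S_k = \dfrac{3(\text{Mean} - \text{Median})}{\sigma}$, where $-3 \le S_k \le 3$.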
Calculate the Karl Pearson coefficient of skewness for the given data and comment on the result: 7, 9, 15, 16, 17, 22, 25, 27, 33, 39.
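One way to work this exercise is sketched below. Every value occurs only once, so the data have no usable mode; the sketch therefore assumes the median-based form of the coefficient, 3 × (Mean − Median) / S.D., and the population standard deviation (division by n). Both choices are assumptions rather than something the exercise states.

```python
from statistics import mean, median, pstdev

data = [7, 9, 15, 16, 17, 22, 25, 27, 33, 39]

am = mean(data)        # 21
med = median(data)     # 19.5
sd = pstdev(data)      # square root of 93.8, about 9.685

# Median-based Karl Pearson coefficient of skewness.
skewness = 3 * (am - med) / sd

print(round(skewness, 2))   # 0.46
```

The coefficient is positive (about 0.46), which suggests a moderately positively skewed distribution, in line with Mean > Median.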
Advantages and disadvantages Range: Advantages: Quick and easy to calculate. Disadvantages: Affected by extreme values (outliers); does not take into account all the values. Standard deviation: Advantages: More precise measure of dispersion because all values are taken into account. Disadvantages: Much harder to calculate than the range.