Eng.Mosab I. Tabash Applied Statistics
Session 1: Lesson 1
Introduction to Statistics
Objectives
- To define statistics
- To discuss the wide range of applications of statistics in business
- To understand the branches of statistics
- To describe the levels of measurement of data
What is Statistics?
The science of collecting, organizing, presenting, analyzing, and interpreting data for the purpose of assisting in making more effective decisions.
The word is also used to mean:
- A branch of mathematics
- Facts and figures
- A subject or discipline
- Collections of data
- A way to get information from data
What is Statistics?
"Statistics is a way to get information from data."
Data -> Statistics -> Information
Definitions (Oxford English Dictionary):
- Data: facts, especially numerical facts, collected together for reference or information.
- Information: knowledge communicated concerning some particular fact.
Statistics is a tool for creating new understanding from a set of numbers.
Example: Stats Anxiety
A business school student is anxious about their statistics course, since they have heard the course is difficult. The professor provides last term's final exam marks to the student. What can be discerned from this list of numbers?
Data -> Statistics -> Information
- Data: the list of last term's marks.
- Information: new information about the statistics class, e.g. the class average, the proportion of the class receiving A's, the most frequent mark, the distribution of marks.
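The step from data to information in this example can be sketched in a few lines of Python. The marks and the cutoff for an A below are hypothetical, since the slide does not list the actual marks or the grading scale.

```python
from collections import Counter
from statistics import mean

# Hypothetical final exam marks (0-100); not taken from the slides.
marks = [67, 74, 71, 83, 93, 55, 48, 74, 88, 74]

# Assumed cutoff: a mark of 80 or above counts as an A.
A_CUTOFF = 80

# Data -> Statistics -> Information
class_average = mean(marks)
proportion_a = sum(m >= A_CUTOFF for m in marks) / len(marks)
most_frequent_mark, frequency = Counter(marks).most_common(1)[0]

print(f"Class average: {class_average:.1f}")
print(f"Proportion of A's: {proportion_a:.0%}")
print(f"Most frequent mark: {most_frequent_mark} (appears {frequency} times)")
```

Each printed line is a piece of information that the raw list of marks does not show at a glance.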
Applications of Statistics in Business
- Accounting: auditing and cost estimation
- Finance: investments and portfolio management
- Human resources: compensation, job satisfaction, performance measurement
- Operations: quality management, forecasting, MIS, capacity planning, materials control
- Marketing: market analysis, consumer research, pricing
- Economics: regional, national, and international economic performance
- International business: market and demographic analysis
Key Statistical Concepts
Population
- A population is the group of all items of interest to a statistics practitioner.
- It is frequently very large and sometimes infinite.
- E.g. all blue-collar workers in Malaysia.
Sample
- A sample is a set of data drawn from the population.
- It is potentially very large, but smaller than the population.
- E.g. a sample of 765 blue-collar workers.
Key Statistical Concepts
- Parameter: a descriptive measure of a population.
- Statistic: a descriptive measure of a sample.
Key Statistical Concepts
Populations have parameters. Samples have statistics.
Diagram: the sample is a subset of the population; a parameter describes the population and a statistic describes the sample.
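The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The population here is simulated with made-up wage figures (an assumption for illustration only); the sample size of 765 follows the slide's example.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: monthly wages of 10,000 workers (made-up numbers).
population = [random.gauss(2500, 400) for _ in range(10_000)]

# A sample is a subset of the population, drawn without replacement.
sample = random.sample(population, 765)

parameter = mean(population)  # descriptive measure of the population
statistic = mean(sample)      # descriptive measure of the sample

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

The two means will usually be close but not identical. In practice the parameter is unknown, and only the statistic can be computed.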
Branches of Statistics
Statistics
- Descriptive Statistics
- Inferential Statistics
Parametric Statistics and Non-Parametric Statistics
Descriptive Statistics
Descriptive statistics are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include:
- Graphical techniques
- Numerical techniques
The actual method used depends on what information we would like to extract. Are we interested in:
- measure(s) of central location?
- measure(s) of variability (dispersion)?
Descriptive statistics help to answer these questions.
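As a rough illustration of the numerical techniques, the sketch below computes two measures of central location and two measures of variability with Python's standard `statistics` module. The data set is hypothetical.

```python
from statistics import mean, median, pstdev

# Hypothetical data set (not from the slides).
data = [12, 15, 15, 18, 20, 22, 38]

# Measures of central location
central_mean = mean(data)
central_median = median(data)

# Measures of variability (dispersion)
data_range = max(data) - min(data)
std_dev = pstdev(data)  # treats the data as a whole population

print("mean:   ", central_mean)
print("median: ", central_median)
print("range:  ", data_range)
print("std dev:", round(std_dev, 2))
```

The mean and the median differ here because one large value (38) pulls the mean upward, which is the kind of information the choice of measure can reveal or hide.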
Inferential Statistics
Descriptive statistics describe the data set being analyzed, but they do not allow us to draw conclusions or make inferences beyond that data set. Hence we need another branch of statistics: inferential statistics.
Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.
Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.
Diagram: Population (parameter) -> Sample (statistic) -> Inference back to the population.
What can we infer about a population's parameters based on a sample's statistics?
Statistical Inference
We use statistics to make inferences about parameters. Therefore, we can make an estimate, prediction, or decision about a population based on sample data.
Thus, we can apply what we know about a sample to the larger population from which it was drawn.
Statistical Inference
Rationale:
- Large populations make investigating each member impractical and expensive.
- It is easier and cheaper to take a sample and make estimates about the population from the sample.
However:
- Such conclusions and estimates are not always going to be correct.
- For this reason, we build "measures of reliability" into statistical inference, namely the confidence level and the significance level.
Confidence & Significance Levels
The confidence level is the proportion of times that an estimating procedure will be correct.
E.g. a confidence level of 95% means that estimates based on this form of statistical inference will be correct 95% of the time.
When the purpose of the statistical inference is to draw a conclusion about a population, the significance level measures how frequently the conclusion will be wrong in the long run.
E.g. a 5% significance level means that, in the long run, this type of conclusion will be wrong 5% of the time.
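A small simulation can make the "correct 95% of the time" reading concrete. The sketch below assumes a normal population with a known standard deviation and uses the standard z-interval for a mean; the population values, sample size, and number of trials are all made up for illustration and do not come from the slides.

```python
import random
from math import sqrt
from statistics import mean

random.seed(42)  # fixed seed so the run is repeatable

# Assumed setup (hypothetical): the true mean is known to us, so we can
# check whether each interval estimate is "correct".
TRUE_MEAN, SIGMA = 50.0, 10.0
SAMPLE_SIZE, TRIALS = 30, 2000
Z_95 = 1.96  # standard normal critical value for a 95% confidence level

# Half-width of the interval; constant because SIGMA is assumed known.
margin = Z_95 * SIGMA / sqrt(SAMPLE_SIZE)

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(SAMPLE_SIZE)]
    sample_mean = mean(sample)
    # The estimate is "correct" when the interval contains the true mean.
    if sample_mean - margin <= TRUE_MEAN <= sample_mean + margin:
        hits += 1

coverage = hits / TRIALS
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The printed proportion should land near 95%. The remaining intervals, roughly 5%, miss the true mean, which is the long-run error rate the slide describes.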
Process of Inferential Statistics (figure not reproduced)
Types of Data and Information
Definitions:
- A variable is some characteristic of a population or sample, e.g. student grades or workers' salaries. A variable is typically denoted with a capital letter such as X; letter grades such as A, A-, B+, B, B-… are values a grade variable can take.
- The values of a variable are the possible observations it can take, e.g. student marks (0..100).
- Data are the observed values of a variable, e.g. student marks: {67, 74, 71, 83, 93, 55, 48}.
Types of Data & Information
Data (at least for the purposes of statistics) fall into three main groups:
- Interval data
- Nominal data
- Ordinal data
Interval Data
- Interval data are real numbers, e.g. heights, weights, prices.
- They are also referred to as quantitative or numerical data.
- Arithmetic operations can be performed on interval data, so it is meaningful to talk about 2*Height, or Price + $1, and so on.
Nominal Data
- The values of nominal data are categories.
- E.g. responses to questions about marital status, coded as: Single = 1, Married = 2, Divorced = 3, Widowed = 4.
- Because the numbers are arbitrary, arithmetic operations do not make any sense (e.g. does Widowed ÷ 2 = Married?).
- Nominal data are also called qualitative or categorical.
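Counting is the one operation that stays meaningful for nominal data. The sketch below uses the marital-status coding from the slide; the survey responses themselves are hypothetical.

```python
from collections import Counter

# Coding from the slide: Single = 1, Married = 2, Divorced = 3, Widowed = 4
LABELS = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}

# Hypothetical survey responses, stored as codes.
responses = [1, 2, 2, 4, 1, 3, 2, 1, 2, 3]

# Valid for nominal data: count the observations in each category.
counts = Counter(LABELS[code] for code in responses)
for label, n in counts.most_common():
    print(f"{label}: {n}")

# Computable but not meaningful: the "average" code would change if the
# categories were numbered differently.
meaningless_average = sum(responses) / len(responses)
```

Python will happily compute `meaningless_average`, which is why it is up to the analyst to know the data level before choosing a calculation.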
Ordinal Data
Ordinal data appear to be categorical in nature, but their values have an order, a ranking, to them.
E.g. a college course rating system: poor = 1, fair = 2, good = 3, very good = 4, excellent = 5.
While it is still not meaningful to do arithmetic on this data (e.g. does 2*fair = very good?), we can say things like:
- excellent > poor
- fair < very good
That is, order is maintained no matter what numeric values are assigned to each category, as long as the codes increase with the ranking.
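The claim that order survives any order-preserving recoding, while arithmetic does not, can be checked with a short sketch. The first coding is the one from the slide; the second coding and the list of ratings are hypothetical.

```python
# Coding from the slide, plus a hypothetical alternative coding that keeps
# the same order but uses different numbers.
slide_codes = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}
other_codes = {"poor": 3, "fair": 7, "good": 10, "very good": 40, "excellent": 99}

# Hypothetical ratings collected for one course.
ratings = ["good", "excellent", "poor", "very good", "fair", "good"]

# Ranking: the sorted order is the same under both codings.
ranked_slide = sorted(ratings, key=slide_codes.get)
ranked_other = sorted(ratings, key=other_codes.get)
print("Same ranking:", ranked_slide == ranked_other)

# Arithmetic: the mean depends on the arbitrary codes, so it carries no meaning.
mean_slide = sum(slide_codes[r] for r in ratings) / len(ratings)
mean_other = sum(other_codes[r] for r in ratings) / len(ratings)
print("Mean under slide coding:", mean_slide)
print("Mean under other coding:", round(mean_other, 2))
```

Any calculation that uses only the ordering gives the same answer under both codings, which is what makes ranking-based methods suitable for ordinal data.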
Types of Data & Information
Decision chart:
- Is the data categorical? No: interval data.
- Is the data categorical? Yes: categorical data. Is it ordered?
  - Yes: ordinal data.
  - No: nominal data.
E.g. Representing Student Grades
- Is the data categorical? No: interval data, e.g. {0..100}.
- Is the data categorical? Yes: categorical data. Is it ordered?
  - Yes (rank order to data): ordinal data, e.g. {F, D, C, B, A}.
  - No (no rank order to data): nominal data, e.g. {Pass | Fail}.
Calculations for Types of Data
As mentioned above:
- All calculations are permitted on interval data.
- Only calculations involving a ranking process are allowed for ordinal data.
- No calculations are allowed for nominal data other than counting the number of observations in each category.
This lends itself to the following "hierarchy of data".
Hierarchy of Data
Interval
- Values are real numbers.
- All calculations are valid.
- Data may be treated as ordinal or nominal.
Ordinal
- Values must represent the ranked order of the data.
- Calculations based on an ordering process are valid.
- Data may be treated as nominal but not as interval.
Nominal
- Values are arbitrary numbers that represent categories.
- Only calculations based on the frequencies of occurrence are valid.
- Data may not be treated as ordinal or interval.
Data Level, Operations, and Statistical Methods

Data Level   Meaningful Operations                           Statistical Methods
Nominal      Classifying and counting                        Nonparametric
Ordinal      All of the above plus ranking                   Nonparametric
Interval     All of the above plus addition, subtraction,    Parametric
             multiplication, and division
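The cumulative structure of the hierarchy can be written down as a small lookup. This is only a sketch of the rules stated in the slides; the function names `meaningful_operations` and `may_be_treated_as` are hypothetical helpers introduced for illustration.

```python
# Levels ordered from lowest to highest, following the hierarchy in the slides.
LEVELS = ["nominal", "ordinal", "interval"]

# Operations that become meaningful at each level (cumulative going up).
NEW_OPERATIONS = {
    "nominal": ["classifying", "counting"],
    "ordinal": ["ranking"],
    "interval": ["addition", "subtraction", "multiplication", "division"],
}


def meaningful_operations(level: str) -> list:
    """Return every operation that is meaningful for the given data level."""
    ops = []
    for name in LEVELS[: LEVELS.index(level) + 1]:
        ops.extend(NEW_OPERATIONS[name])
    return ops


def may_be_treated_as(level: str, target: str) -> bool:
    """Data can be treated as its own level or a lower one, never a higher one."""
    return LEVELS.index(target) <= LEVELS.index(level)


for level in LEVELS:
    print(level, "->", ", ".join(meaningful_operations(level)))
```

For example, `may_be_treated_as("interval", "nominal")` is true, since marks out of 100 can be collapsed into Pass or Fail, while `may_be_treated_as("nominal", "ordinal")` is false.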