Chapter 2 STATISTICAL CONCEPTS AND LANGUAGE 2.1 THE DIFFERENCE BETWEEN THE POPULATION AND A SAMPLE 2.2 THE DIFFERENCE BETWEEN THE PARAMETER AND A STATISTIC


1 Chapter 2 STATISTICAL CONCEPTS AND LANGUAGE
2.1 THE DIFFERENCE BETWEEN THE POPULATION AND A SAMPLE
2.2 THE DIFFERENCE BETWEEN THE PARAMETER AND A STATISTIC
2.3 MEASUREMENT LEVELS
2.4 SAMPLING METHODS (SIMPLE RANDOM SAMPLING, STRATIFIED RANDOM SAMPLING, CLUSTER SAMPLING, SYSTEMATIC SAMPLING, AND CONVENIENCE SAMPLING)

2 2.0 Statistical Concepts and Language
- Data Set: Measurements of items
  - e.g., Yearly sales volume for your 23 salespeople
  - e.g., Cost and number produced, daily, for the past month
- Elementary Units: The items being measured
  - e.g., Salespeople, Days, Companies, Catalogs, …
- A Variable: The type of measurement being done
  - e.g., Sales volume, Cost, Productivity, Number of defects, …

3 How many variables?
- Univariate data set: One variable measured for each elementary unit
  - e.g., Sales for the top 30 computer companies
  - Can do: Typical summary, diversity, special features
- Bivariate data set: Two variables
  - e.g., Sales and # Employees for top 30 computer firms
  - Can also do: relationship, prediction
- Multivariate data set: Three or more variables
  - e.g., Sales, # Employees, Inventories, Profits, …
  - Can also do: predict one from all other variables

4 2.1 The Difference Between the Population and a Sample
- Population: the whole
  - a collection of all persons, objects, or items under study
- Census: gathering data from the entire population
- Sample: gathering data on a subset of the population
  - Use information about the sample to infer about the population

5 Population

6 Population and Census Data
Identifier  Color  MPG
RD1         Red    12
RD2         Red    10
RD3         Red    13
RD4         Red    10
RD5         Red    13
BL1         Blue   27
BL2         Blue   24
GR1         Green  35
GR2         Green  35
GY1         Gray   15
GY2         Gray   18
GY3         Gray   17

7 Sample and Sample Data
Identifier  Color  MPG
RD2         Red    10
RD5         Red    13
GR1         Green  35
GY2         Gray   18

8 2.2 The Difference Between the Parameter and a Statistic
- Parameter: descriptive measure of the population
  - Usually represented by Greek letters
- Statistic: descriptive measure of a sample
  - Usually represented by Roman letters
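
The distinction can be made concrete with the MPG data from the population and sample slides. The sketch below is illustrative only: it assumes the mean is the descriptive measure of interest, and the names `mu` and `x_bar` are the usual textbook symbols rather than anything defined in the slides.

```python
from statistics import mean

# MPG for every unit in the population (the census data slide)
population_mpg = {
    "RD1": 12, "RD2": 10, "RD3": 13, "RD4": 10, "RD5": 13,
    "BL1": 27, "BL2": 24, "GR1": 35, "GR2": 35,
    "GY1": 15, "GY2": 18, "GY3": 17,
}
# Identifiers of the four units in the sample (the sample data slide)
sample_ids = ["RD2", "RD5", "GR1", "GY2"]

# Parameter: computed from the whole population (Greek letter mu)
mu = mean(population_mpg.values())
# Statistic: computed from the sample only (Roman letter x-bar)
x_bar = mean(population_mpg[i] for i in sample_ids)

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")  # mu = 19.08, x_bar = 19.00
```

Here the statistic (19.00) lands close to the parameter (19.08), but a different sample of four cars could give a noticeably different value.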

9 Process of Inferential Statistics

10 Statistics in Business
- Probability is used in statistics
  - To estimate the level of confidence in a confidence interval
  - To calculate the p-value in hypothesis testing

11 2.3 Measurement Levels
- Nominal: in nominal measurement the values just "name" the attribute uniquely.
  - No ordering of the cases is implied.
  - For example, a person's gender is nominal. It doesn’t matter whether you call them boys vs. girls or males vs. females or XY vs. XX chromosomes.
  - Another example is religion: Catholic, Protestant, Muslim, etc.

12
- Ordinal: a variable is measured at the ordinal level if its values can be ranked.
  - For example, a gold medal reflects superior performance to a silver or bronze medal in the Olympics. You can’t say a gold and a bronze medal average out to a silver medal, though.
  - Preference scales are typically ordinal: how much do you like this cereal? Like it a lot, somewhat like it, neutral, somewhat dislike it, dislike it a lot.

13
- Interval: in interval measurement the distance between attributes does have meaning.
  - Numerical data typically fall into this category.
  - For example, when measuring temperature (in Fahrenheit), the distance from 30 to 40 is the same as the distance from 70 to 80. The interval between values is interpretable.

14
- Ratio: in ratio measurement there is always a meaningful reference point (either 0 for rates or 1 for ratios).
  - This means that you can construct a meaningful fraction (or ratio) with a ratio variable.
  - In applied social research most "count" variables are ratio, for example, the number of clients in the past six months.

15 Nominal Level Data
- Numbers are used to classify or categorize
Example: Employment Classification
- 1 for Educator
- 2 for Construction Worker
- 3 for Manufacturing Worker

16 Ordinal Level Data
- Numbers are used to indicate rank or order
- Relative magnitude of numbers is meaningful
- Differences between numbers are not comparable
Example: Ranking productivity of employees
Example: Position within an organization
- 1 for President
- 2 for Vice President
- 3 for Plant Manager
- 4 for Department Supervisor
- 5 for Employee

17 Ordinal Data
Statement: "Faculty and staff should receive preferential treatment for parking space."
Response scale: 1 (Strongly Agree), 2, 3 (Neutral), 4, 5 (Strongly Disagree)

18 Interval Level Data
- Distances between consecutive integers are equal
- Relative magnitude of numbers is meaningful
- Differences between numbers are comparable
- Location of origin, zero, is arbitrary
- Vertical intercept of unit of measure transform function is not zero
Example: Fahrenheit Temperature
Example: Monetary Utility

19 Ratio Level Data
- Highest level of measurement
- Relative magnitude of numbers is meaningful
- Differences between numbers are comparable
- Location of origin, zero, is absolute (natural)
- Vertical intercept of unit of measure transform function is zero
Examples: Height, Weight, and Volume
Example: Monetary Variables, such as Profit and Loss, Revenues, Expenses, and Financial ratios such as P/E Ratio, Inventory Turnover, and Quick Ratio
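
The "vertical intercept of the unit of measure transform function" bullets on the interval and ratio slides can be checked numerically. The sketch below is one way to illustrate them, using two standard unit conversions chosen for this example: Fahrenheit to Celsius (intercept not zero) and pounds to kilograms (intercept zero). The specific values compared are arbitrary.

```python
def fahrenheit_to_celsius(f):
    # Transform with a nonzero intercept: C = (F - 32) * 5/9
    return (f - 32) * 5 / 9

def pounds_to_kilograms(lb):
    # Transform with a zero intercept: kg = 0.45359237 * lb
    return lb * 0.45359237

# Interval data: the ratio of two readings depends on the unit,
# so "80 degrees is twice as hot as 40 degrees" is not meaningful.
ratio_f = 80 / 40                                                 # 2.0
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)   # about 6.0

# Interval data: differences remain comparable in either unit.
diff_ratio_f = (40 - 30) / (80 - 70)                              # 1.0
diff_ratio_c = (
    (fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30))
    / (fahrenheit_to_celsius(80) - fahrenheit_to_celsius(70))
)                                                                 # about 1.0

# Ratio data: the ratio of two weights is the same in any unit.
ratio_lb = 200 / 100                                              # 2.0
ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)    # about 2.0

print(ratio_f, round(ratio_c, 6), round(diff_ratio_c, 6), round(ratio_kg, 6))
```

Because the temperature conversion shifts the zero point, ratios of temperatures change with the unit, while ratios of weights do not.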

20 Ratio Level Data
- Parametric statistics: require that the data be interval or ratio
- Nonparametric statistics: used if data are nominal or ordinal
- Nonparametric statistics can also be used to analyze interval or ratio data

21 Data Level, Operations, and Statistical Methods
Data Level  Meaningful Operations
Nominal     Classifying and Counting
Ordinal     All of the above plus Ranking
Interval    All of the above plus Addition, Subtraction, Multiplication, and Division (including means, standard deviations, etc.)
Ratio       All of the above
Copyright 2011 John Wiley & Sons, Inc.
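
A small sketch of which summary suits which data level, following the operations listed for each level. All three data lists are hypothetical values made up for illustration; the employment codes reuse the coding from the nominal slide (1 = Educator, 2 = Construction Worker, 3 = Manufacturing Worker).

```python
from collections import Counter
from statistics import mean, median_low

# Hypothetical data, invented for this example only
employment_codes = [1, 3, 2, 3, 3, 1]   # nominal: codes are just labels
likert_responses = [1, 2, 2, 4, 5]      # ordinal: 1 = Strongly Agree ... 5 = Strongly Disagree
temperatures_f = [30, 40, 70, 80]       # interval: Fahrenheit readings

# Nominal: classifying and counting are the meaningful operations
counts = Counter(employment_codes)
most_common_code = counts.most_common(1)[0][0]

# Ordinal: ranking is meaningful, so a median (middle rank) makes sense
middle_response = median_low(likert_responses)

# Interval (and ratio): arithmetic such as the mean is meaningful
average_temperature = mean(temperatures_f)

print(most_common_code, middle_response, average_temperature)  # 3 2 55
```

Averaging the employment codes would run without error, but the result would not describe anything, since the codes carry no order or distance.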

22 2.4 Sampling Methods
Reasons for Sampling
- Sampling: a means for gathering information about a population without conducting a census
  - Information is gathered from the sample, and inference is made about the population
- Sampling has advantages over a census
  - Sampling can save money.
  - Sampling can save time.

23 Random Versus Nonrandom Sampling
- Nonrandom sampling: not every unit of the population has the same probability of being included in the sample
- Random sampling: every unit of the population has the same probability of being included in the sample

24 Random Sampling Techniques
- Simple Random Sample: basis for other random sampling techniques
  - Each unit is numbered from 1 to N (the size of the population)
  - A random number generator can be used to select n items that form the sample

25 Random Sampling Techniques
- Stratified Random Sample
  - The population is broken down into strata with like characteristics (e.g., men and women, or old, young, and middle-aged people)
  - Efficient when differences between strata exist
  - Proportionate (% of the sample from each stratum equals % that each stratum is within the whole population)
- Systematic Random Sample
  - Define k = N/n. Choose one random unit from the first k units, and then select every kth unit from there.
- Cluster (or Area) Sampling
  - The population is in pre-determined clusters (students in classes, apples on trees, etc.)
  - A random sample of clusters is chosen, and all or some of the units within each chosen cluster are used as the sample
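
Proportionate stratified sampling can be sketched as below. This is a minimal illustration, not a prescribed procedure: the function name, the `stratum_of` callback, and the population of 60 young, 30 middle-aged, and 10 old people are all assumptions made up for the example.

```python
import random
from collections import Counter, defaultdict

def proportionate_stratified_sample(units, stratum_of, n, seed=None):
    """Draw about n units so each stratum keeps its population share."""
    rng = random.Random(seed)
    # Group the population into strata
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Stratum share of the population, applied to the sample size.
        # Rounding means the total can differ slightly from n in general.
        quota = round(n * len(members) / len(units))
        # Simple random sample within the stratum
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: (identifier, age group) pairs
population = (
    [(f"Y{i}", "young") for i in range(60)]
    + [(f"M{i}", "middle-aged") for i in range(30)]
    + [(f"O{i}", "old") for i in range(10)]
)
sample = proportionate_stratified_sample(
    population, lambda unit: unit[1], n=10, seed=1
)
group_counts = Counter(group for _, group in sample)
print(group_counts)
```

With these shares (60%, 30%, 10%) a sample of 10 contains 6 young, 3 middle-aged, and 1 old person, whichever individuals happen to be drawn.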

26 Simple Random Sample: Population Members
- Population size of N = 30
- Desired sample size of n = 6
01 Alaska Airlines
02 Alcoa
03 Ashland
04 Bank of America
05 BellSouth
06 Chevron
07 Citigroup
08 Clorox
09 Delta Air Lines
10 Disney
11 DuPont
12 Exxon Mobil
13 General Dynamics
14 General Electric
15 General Mills
16 Halliburton
17 IBM
18 Kellogg
19 KMart
20 Lowe’s
21 Lucent
22 Mattel
23 Mead
24 Microsoft
25 Occidental Petroleum
26 JCPenney
27 Procter & Gamble
28 Ryder
29 Sears
30 Time Warner

27 Simple Random Sampling: Random Number Table Select 6 values from 1 to 30 (ignore repeats) and get

28 Simple Random Sample: Sample Members
01 Alaska Airlines
02 Alcoa
03 Ashland
04 Bank of America
05 BellSouth
06 Chevron
07 Citigroup
08 Clorox
09 Delta Air Lines
10 Disney
11 DuPont
12 Exxon Mobil
13 General Dynamics
14 General Electric
15 General Mills
16 Halliburton
17 IBM
18 Kellogg
19 KMart
20 Lowe’s
21 Lucent
22 Mattel
23 Mead
24 Microsoft
25 Occidental Petroleum
26 JCPenney
27 Procter & Gamble
28 Ryder
29 Sears
30 Time Warner
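
The selection step for N = 30 and n = 6 can be imitated in code. The sketch below swaps the printed random number table for Python's pseudo-random generator, which is an assumption of this example; the function name is hypothetical. Repeated draws are ignored, as in the table procedure.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Pick n distinct unit numbers from 1..N, ignoring repeated draws."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < n:
        draw = rng.randint(1, N)   # each unit number is equally likely
        if draw not in chosen:     # ignore repeats
            chosen.append(draw)
    return chosen

selected = simple_random_sample(N=30, n=6, seed=0)
print(sorted(selected))  # six distinct numbers between 1 and 30
```

Each selected number identifies one company in the numbered population list. The standard library call `random.sample(range(1, 31), 6)` does the same job in a single step.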

29 Systematic Sampling: Example
- Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N = 10,000).
- A sample of fifty (n = 50) purchase orders is needed for an audit.
- k = 10,000/50 = 200

30 Systematic Sampling: Example
- The first sample element is randomly selected from the first 200 purchase orders. Assume the 45th purchase order was selected.
- Subsequent sample elements: 45, 245, 445, 645, ...
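
The purchase order example can be reproduced with a short sketch. The function name and its `start` and `seed` parameters are assumptions for illustration; it also assumes N is an exact multiple of n, as it is here.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every k-th unit from 1..N after a random start in the first k."""
    k = N // n  # sampling interval, k = N/n
    if start is None:
        # Random starting unit among the first k units
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

# N = 10,000 purchase orders, n = 50, and the 45th order chosen as the start
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```

Only the starting point is random. Once it is fixed, the remaining 49 elements are determined by the interval k = 200.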

31 Convenience (Nonrandom) Sampling
- Nonrandom sampling: sampling techniques used to select elements from the population by any mechanism that does not involve a random selection process
- These techniques are not desirable for making statistical inferences
- Examples: choosing members of this class as an accurate representation of all students at our university, or selecting the first five people who walk into a store and asking them about their shopping preferences

