STATISTICS is about how to COLLECT, ORGANIZE,



Presentation on theme: "STATISTICS is about how to COLLECT, ORGANIZE,"— Presentation transcript:

1 STATISTICS is about how to COLLECT, ORGANIZE, SUMMARIZE, ANALYZE, and INTERPRET collected data. Example: Suppose we are conducting a study about the people who have climbed Mt. Apo. Q: Who are the individuals of interest here? A: If possible, all the people who have actually made it to the summit of Mt. Apo. This is the population of interest. Q: What characteristics of these individuals, possibly relevant to our study, would we like to observe or measure? A: Height, Age, Weight, Gender, Nationality, Income, etc. These are the variables in our study.

2 BASIC IDEAS In a statistical study, we concentrate on the individuals or objects that are pertinent to its goals. The POPULATION is the complete collection of all individuals or objects which are of interest to the study. Since it is not always possible to use the entire population for a study (due to expense, time, size, etc.), we select only certain portions of the population which possess the same or similar characteristics of interest to the study. A SAMPLE is a group of individuals or objects selected from a population (to represent the larger population).

3 A characteristic of individuals or objects being measured in the study is called a VARIABLE. The values of these variables are called DATA or DATA VALUES. Example: In a study about the people who have climbed Mt. Apo:
Variable: Data Values
Height: 5.5 ft, 5.7 ft, 5.9 ft, 6 ft, …
Weight: 75 kg, 62 kg, 68 kg, 70.3 kg, …
Age: 22 yrs, 28 yrs, 29 yrs, …
Gender: M (Male), F (Female)
Nationality: Filipino, Chinese, …

4 As we can see from the example, we have two kinds of variables, depending on the kind of data they yield: QUALITATIVE VARIABLES yield data which describe, label, or categorize an element of the population. They are also called ATTRIBUTE or CATEGORICAL VARIABLES. QUANTITATIVE VARIABLES yield data which numerically measure an element of the population. They are also called NUMERICAL VARIABLES. Example: A nationwide survey of adults asks, “How many times per week do you eat in a fast-food restaurant?” A. What is the implied population of this survey? B. Identify the variable. C. Is the variable qualitative or quantitative?

5 Quantitative variables are more interesting since they yield data which can meaningfully undergo arithmetic operations. We have two types of quantitative variables: The DISCRETE type yields data values which are countable. Examples: A. The number of dependents an employee has B. The number of topics in a three-unit algebra course C. The number of times per week a student eats in McDonald’s The CONTINUOUS type yields data values which can lie anywhere in an interval (and are, therefore, not countable). Examples: A. The height of a mountain climber B. The weight of a power lifter

6 VARIABLE / DATA is either QUALITATIVE or QUANTITATIVE; QUANTITATIVE is either DISCRETE or CONTINUOUS.

7 Besides this classification of variables (according to the type of data they yield), we can also distinguish the scale, level, or depth of measuring variables (that is, the depth of arithmetic we can perform on their data). We have four levels of measurement: Data on the NOMINAL LEVEL consist of names, labels, or categories, where no ranking or ordering can be applied. Examples: Gender, Eye color, Religion, Nationality, Zip code. Data on the ORDINAL LEVEL also consist of names, labels, or categories, but ranking or ordering can be applied. Examples: Grade (A, B+, B, …, F), Performance rating (Poor, Fair, Good, Very Good, Excellent), T-shirt size (XS, S, M, L, XL). Qualitative variables belong to either the nominal or the ordinal level of measurement.

8 Data on the INTERVAL LEVEL can be ranked or ordered, and precise differences between these values are meaningful. Examples: A. IQ scores (50 – 200). IQ scores can be ranked (obviously) from lowest to highest. A difference of 20 in the IQ scores of two students (say, Student 1 has an IQ score of 122 and Student 2 has 142) means: Student 2 is able to achieve more academically. B. Temperature (ºC). Temperature can be ranked from lowest to highest. A difference of 23ºC in the temperatures of two objects (say, Object 1 has a temperature of 54ºC and Object 2 has 31ºC) means: Object 1 is warmer. Ratios, however, are not meaningful at this level because the zero point is arbitrary: 54ºC is not "1.74 times as warm" as 31ºC.

9 Data on the RATIO LEVEL can be ranked or ordered, and both differences and ratios (quotients) between these values are meaningful. Examples: A. Height. Height can be ranked from shortest to tallest. Suppose Object 1 is 3ft tall and Object 2 is 12ft tall. The difference, 12 – 3 = 9ft, means Object 2 is taller by 9ft. We can also divide the values. The quotient, 12 / 3 = 4, means Object 2 is four times as tall as Object 1. B. Other examples are: Weight, Age, Salary. Quantitative variables belong to either the interval or the ratio level of measurement.
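One way to see the difference between the interval and ratio levels is to change the unit of measurement and check what survives. The sketch below reuses the temperatures (54ºC, 31ºC) and heights (3ft, 12ft) from the slides; the conversions to Fahrenheit and to metres are assumptions added purely for illustration and are not part of the original presentation.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

FT_TO_M = 0.3048  # metres per foot

# Interval level: temperatures from slide 8
t1_c, t2_c = 54.0, 31.0
diff_c = t1_c - t2_c                       # 23 Celsius degrees
diff_f = c_to_f(t1_c) - c_to_f(t2_c)       # the same gap, rescaled by 9/5
ratio_c = t1_c / t2_c                      # changes when the unit changes...
ratio_f = c_to_f(t1_c) / c_to_f(t2_c)      # ...so it carries no meaning

# Ratio level: heights from slide 9
h1_ft, h2_ft = 3.0, 12.0
ratio_ft = h2_ft / h1_ft                             # 4 in feet
ratio_m = (h2_ft * FT_TO_M) / (h1_ft * FT_TO_M)      # still 4 in metres

print("temperature ratios:", round(ratio_c, 3), round(ratio_f, 3))
print("height ratios:", ratio_ft, round(ratio_m, 3))
```

The temperature difference is only rescaled by the unit change, while the temperature ratio changes outright. The height ratio stays the same in any unit, which is what makes height a ratio-level variable.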

10 Quantitative variables: RATIO LEVEL, INTERVAL LEVEL. Qualitative variables: ORDINAL LEVEL, NOMINAL LEVEL.

11 THE “AVERAGE” What we call the “average” is the most popular statistical measurement we know of. Example: The results of a 50-item diagnostic exam in Math 1 given to a class of 30 students are as follows: 43 23 34 40 36 47 45 25 32 35 31 22 41 33 24 37 20 39

12 A. The population is: the class of 30 students. B. The population average is: 33.4 C. Randomly select a sample of size 10 and find its average. 43 23 34 40 36 47 45 25 32 35 31 22 41 33 24 37 20 39 Sample ave. = 34.5 D. Again. 43 23 34 40 36 47 45 25 32 35 31 22 41 33 24 37 20 39 Sample ave. = 30.3

13 E. And again. 43 23 34 40 36 47 45 25 32 35 31 22 41 33 24 37 20 39 Sample ave. = 36.4 Take note: The population average is fixed at 33.4. The sample averages vary, but they stay close to the population average. In other words, the sample averages give a good estimate of the population average. This is why, in general, when the population of interest to the study is very large, we collect and study adequately selected, smaller samples instead.
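The repeated-sampling exercise above can be sketched in a few lines. Note the assumptions: the list holds only the 18 scores visible in this transcript, not the full class of 30, so its average will not equal the 33.4 quoted on the slide; the seed value and the use of Python's standard `random` and `statistics` modules are likewise choices made for this illustration.

```python
import random
from statistics import mean

# Only the 18 scores that appear in this transcript (the full class has 30),
# so this "population" average differs from the 33.4 on the slide.
scores = [43, 23, 34, 40, 36, 47, 45, 25, 32,
          35, 31, 22, 41, 33, 24, 37, 20, 39]

population_mean = mean(scores)  # fixed for a given population

rng = random.Random(0)  # fixed seed so the run is repeatable
# Draw three samples of size 10 (without replacement) and average each one
sample_means = [mean(rng.sample(scores, 10)) for _ in range(3)]

print("population average:", round(population_mean, 2))
print("sample averages:", sample_means)
```

Each run with a different seed gives different sample averages, while the population average never changes.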

14 The “average” is one of the central ideas in Statistics. We have other statistical measures of comparable importance, such as the median, mode, variance, standard deviation, etc. For a given population, a numerical measurement obtained from all its data (such as the “average”) is called a PARAMETER. The parameters of a population are fixed quantities, being the actual values of such measurements. For a sample, a numerical measurement obtained from all data in the sample is called a SAMPLE STATISTIC. Sample statistics are variable (they vary from sample to sample), yet may give good estimates of the corresponding parameter.

15 HOW DO WE RANDOMIZE SAMPLE SUBJECTS? We use samples to represent a large population. To do this, the selection of sample subjects must be unbiased; that is, each subject has an equally likely chance of being selected. We have four basic methods of unbiased sampling: In RANDOM SAMPLING, each member of the population is labeled with a number, then a computer (or calculator) is tasked to generate as many random numbers from the set of all these numerical labels as the sample size requires. In SYSTEMATIC SAMPLING, each member is also labeled with a number, and every kth member after a randomly chosen first subject is selected. For example, suppose there are 2000 subjects in the population, and a sample of size 50 is desired; then k = 2000/50 = 40. The first subject is selected randomly (say member #23); and the next ones would be #63, #103, #143, and so on.
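The two methods on this slide can be sketched as follows, using the slide's own numbers (2000 subjects, sample size 50, first subject #23). The function names, the seed, and the choice of Python's `random` module are assumptions made for this illustration.

```python
import random

def simple_random_sample(population_size, sample_size, rng):
    """Pick sample_size distinct labels from 1..population_size at random."""
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

def systematic_sample(population_size, sample_size, start):
    """Pick every k-th label beginning at start, where k = N // n."""
    k = population_size // sample_size
    return [start + i * k for i in range(sample_size)]

rng = random.Random(42)  # fixed seed so the run is repeatable

random_labels = simple_random_sample(2000, 50, rng)

# The slide's example: k = 2000 / 50 = 40, first subject is member #23.
# In practice the start would itself be drawn at random, for instance
# with rng.randint(1, 40).
systematic_labels = systematic_sample(2000, 50, start=23)

print(systematic_labels[:4])  # [23, 63, 103, 143]
```

With a start of 23 and k = 40, the fiftieth and last label is 23 + 49 × 40 = 1983, which still falls inside the population of 2000.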

16 In STRATIFIED SAMPLING, the population is divided into groups (called strata), and samples (of any desired size) are randomly selected from within each stratum. In CLUSTER SAMPLING, the population is also divided into groups (called clusters), then some of these clusters are selected and all the members of the selected clusters are used as sample subjects. In all of these methods, a random element is involved. RANDOM does not mean a haphazard personal selection. This is why computers are employed in implementing random selections.
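The difference between the two methods is easiest to see side by side. In the sketch below, the population of 12 labeled members, its three groups, the equal sample size per stratum, and the function names are all hypothetical choices made for illustration; the slides allow samples of any desired size within each stratum.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Randomly draw per_stratum members from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick n_clusters whole clusters and keep all their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population of 12 labeled members split into 3 groups
groups = {
    "A": [1, 2, 3, 4],
    "B": [5, 6, 7, 8],
    "C": [9, 10, 11, 12],
}

rng = random.Random(7)  # fixed seed so the run is repeatable

strat = stratified_sample(groups, 2, rng)  # some members from every group
clust = cluster_sample(groups, 2, rng)     # every member from some groups

print("stratified:", strat)
print("cluster:", clust)
```

Stratified sampling touches every group but takes only part of each, while cluster sampling skips some groups entirely and takes the chosen ones whole.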

