Statistics: Basic Concepts
Overview
Survey objective:
– Collect data from a smaller part of a larger group to learn something about the larger group.
What are data? How do we describe them?
– Observations (such as measurements, genders, survey responses) that have been collected.
Statistical Inference
Statistics
Statistics: The science that describes, or makes inferences about, the universe from sample information.
Descriptive Statistics: Numerical and graphical procedures for summarizing a collection of data in a clear and understandable way.
Inferential Statistics: Procedures for drawing inferences about a population from a sample.
In sum, statistics is a set of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
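The distinction between describing a sample and inferring about a population can be illustrated with a minimal sketch. The data are made up (hypothetical daily milk yields in litres), and the interval uses a rough normal approximation (multiplier 1.96), which is an assumption of this example rather than anything prescribed in the slides.

```python
import math
import statistics

# Hypothetical sample: daily milk yield (litres) of 10 cows. Made-up data.
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 10.9, 11.7, 9.5, 12.4, 10.6]

# Descriptive statistics: summarize the data we actually have.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: say something about the population mean.
# Rough 95% interval using a normal approximation (an assumption here;
# for a sample this small a t-based multiplier would be wider).
std_error = stdev / math.sqrt(len(sample))
low, high = mean - 1.96 * std_error, mean + 1.96 * std_error

print(f"descriptive: mean={mean:.2f}, median={median:.2f}, sd={stdev:.2f}")
print(f"inferential: population mean roughly in [{low:.2f}, {high:.2f}]")
```

The first print line only summarizes the ten observations; the second makes a hedged claim about the larger group from which they were drawn.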
Definitions
Population: The complete set of elements (scores, people, measurements, and so on) under study.
Census: A collection of data from every member of the population.
Sample: A sub-collection of members drawn from a population.
Key Concepts
Sample data must be collected in a scientific manner, for example through a process of random selection. If they are not, the collected information will be useless, and no amount of statistical gymnastics will salvage it.
Types of Data
Parameter: A numerical measurement describing some characteristic of a population.
Statistic: A numerical measurement describing some characteristic of a sample.
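A short sketch may help separate the two terms. The population here is simulated (hypothetical exam scores), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: exam scores of 1,000 students (made-up data).
rng = random.Random(0)
population = [rng.randint(40, 100) for _ in range(1000)]

# Parameter: computed from every member of the population (a census).
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, because a census is rarely feasible; the statistic is what we can actually compute.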
Definitions
Quantitative data: Numbers representing counts or measurements.
Qualitative (categorical/attribute) data: Data described by some non-numeric characteristic (for example, gender of participants).
Quantitative Data
Discrete: The number of possible values is finite or countable (0, 1, 2, 3, …). Example: Number of cars parked outside the campus.
Continuous: The possible values are infinitely many and lie on a continuous scale without gaps. Example: Milk yield of a cow.
Levels of Measurement
Nominal: Data consisting of names, labels, or categories that cannot be ordered. Example: Survey responses of Yes, No, Undecided.
Ordinal: Data that can be ordered, but whose differences either cannot be determined or are meaningless. Example: Course grades A, B, C, D, or F.
Levels of Measurement
Interval: Like ordinal, with the additional property that the difference between any two values is meaningful, but there is no natural zero starting point (a value at which none of the quantity is present). Example: Years 1900, 1910, …
Levels of Measurement
Ratio: The interval level extended to include a natural zero starting point, so that both differences and ratios are meaningful. Example: Prices of chocolates.
Levels of Measurement
Nominal - categories only
Ordinal - categories with some order
Interval - differences but no natural starting point
Ratio - differences and a natural starting point
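The four levels are cumulative: each one allows everything the previous level allows, plus one more kind of comparison. A minimal sketch of that idea follows; the names `LEVELS`, `NEW_OPERATIONS`, and `permitted_operations` are hypothetical and chosen only for this illustration.

```python
# Levels in increasing order of structure.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The operation each level adds on top of the levels before it.
NEW_OPERATIONS = {
    "nominal": "classify into categories",
    "ordinal": "rank or order",
    "interval": "take meaningful differences",
    "ratio": "take meaningful ratios (natural zero)",
}

def permitted_operations(level):
    """Return every operation that is meaningful at the given level."""
    idx = LEVELS.index(level)
    return [NEW_OPERATIONS[name] for name in LEVELS[: idx + 1]]

for level in LEVELS:
    print(f"{level:8s}: {', '.join(permitted_operations(level))}")
```

For example, the years 1900 and 1950 (interval level) differ by a meaningful 50 years, but the ratio 1950/1900 has no interpretation, whereas a chocolate priced at 4 really does cost twice as much as one priced at 2 (ratio level).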
Methods of sampling
Random Sampling: Members of a population are selected in such a way that every member has an equal chance of being selected.
Simple Random Sample: Sample units are selected in such a way that every possible sample of the same size n has the same chance of selection.
Systematic Sampling: Select some starting point and then select every k-th member of the population.
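Simple random and systematic sampling can be sketched as follows. The population of 100 member IDs, the sample size n = 10, and the random choice of starting point are assumptions made for this example.

```python
import random

population = list(range(1, 101))  # hypothetical member IDs 1..100
rng = random.Random(42)

# Simple random sample: every subset of size n is equally likely.
n = 10
simple_random = rng.sample(population, n)

# Systematic sample: pick a random start among the first k members,
# then take every k-th member after it.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

print("simple random:", sorted(simple_random))
print("systematic:   ", systematic)
```

With a random start, every member has the same chance of selection under systematic sampling, but not every possible sample of size n can occur, so it is not a simple random sample.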
Methods of sampling
Convenience Sampling: Use results that are easy to obtain.
Stratified Sampling: Subdivide the population into at least two different groups (strata) whose members share similar characteristics, and draw a sample from each group.
Cluster Sampling: Divide the population into clusters, randomly select some of the clusters, and include all members of the chosen clusters.
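The difference between stratified and cluster sampling is easiest to see side by side. In this sketch the population records, the stratum labels, and the cluster names are all hypothetical, as are the helper names `stratified_sample` and `cluster_sample`.

```python
import random
from collections import defaultdict

rng = random.Random(7)

# Hypothetical population: (member_id, stratum, cluster) records.
population = [
    (i, "undergraduate" if i % 3 else "graduate", f"dorm-{i % 5}")
    for i in range(60)
]

def stratified_sample(records, per_stratum):
    """Draw a random sample of fixed size from every stratum."""
    strata = defaultdict(list)
    for rec in records:
        strata[rec[1]].append(rec)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(records, n_clusters):
    """Randomly pick whole clusters and keep all of their members."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec[2]].append(rec)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [rec for name in chosen for rec in clusters[name]]

strat = stratified_sample(population, per_stratum=5)
clust = cluster_sample(population, n_clusters=2)
print(len(strat), "members from every stratum;", len(clust), "members from 2 clusters")
```

Stratified sampling takes some members from every group; cluster sampling takes every member from some groups.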
Relevant Definitions
Sampling error: The difference between a sample estimate and the true population value; it arises from chance fluctuations from sample to sample.
Non-sampling error: Errors due to mistakes in collection, recording, or analysis (biased sample, defective instrument, mistakes in copying data).
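Sampling error can be observed directly in a simulation, where the true population value is known. The sketch below assumes a made-up population (10,000 values drawn from a normal distribution) and compares the average size of the sampling error for small and large samples; the helper name `mean_abs_sampling_error` is hypothetical.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population with a known mean (simulated, not real data).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def mean_abs_sampling_error(n, repeats=200):
    """Average |sample mean - population mean| over repeated random samples."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

small = mean_abs_sampling_error(10)
large = mean_abs_sampling_error(400)
print(f"average sampling error, n=10:  {small:.2f}")
print(f"average sampling error, n=400: {large:.2f}")
```

Sampling error shrinks as the sample grows. Non-sampling error does not: a biased sample or a defective instrument stays wrong no matter how many observations are taken.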
Relevant Definitions
Reliability: An estimate is reliable when it is consistent across repeated experiments.
Validity: An estimate is valid when it measures what it is supposed to measure.
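The two ideas are independent, which a small numeric sketch can make concrete. The readings below are made up: two hypothetical scales weigh the same 70 kg object five times. Spread across repeats is used as a simple stand-in for reliability, and average deviation from the true value as a simple stand-in for validity; both proxies are assumptions of this example.

```python
import statistics

true_weight = 70.0  # kg, assumed known for this illustration

# Hypothetical repeated readings from two scales (made-up data).
scale_a = [72.1, 71.9, 72.0, 72.1, 71.9]  # consistent, but reads about 2 kg heavy
scale_b = [66.0, 74.5, 69.0, 73.5, 67.0]  # centred on the truth, but erratic

def spread(readings):
    """Standard deviation across repeats: small means consistent (reliable)."""
    return statistics.stdev(readings)

def bias(readings):
    """Mean reading minus true value: near zero means on target."""
    return statistics.fmean(readings) - true_weight

for name, readings in [("scale A", scale_a), ("scale B", scale_b)]:
    print(f"{name}: spread={spread(readings):.2f} kg, bias={bias(readings):+.2f} kg")
```

Scale A is reliable but not valid, because it consistently measures the wrong value. Scale B is right on average but not reliable, so any single reading from it is hard to trust.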