Statistics Definitions

Slides:



Advertisements
Similar presentations
Population vs. Sample Population: A large group of people to which we are interested in generalizing. parameter Sample: A smaller group drawn from a population.
Advertisements

Chapter 3 Properties of Random Variables
Richard M. Jacobs, OSA, Ph.D.
Population Population
Agricultural and Biological Statistics
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Chapter 1 & 3.
QUANTITATIVE DATA ANALYSIS
Statistical Methods in Computer Science Data 2: Central Tendency & Variability Ido Dagan.
Introduction to Educational Statistics
Data observation and Descriptive Statistics
Measures of Central Tendency 3.1. ● Analyzing populations versus analyzing samples ● For populations  We know all of the data  Descriptive measures.
 Deviation is a measure of difference for interval and ratio variables between the observed value and the mean.  The sign of deviation (positive or.
Measures of Central Tendency
Today: Central Tendency & Dispersion
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
Inferential Statistics
Study of Measures of Dispersion and Position. DATA QUALITATIVEQUANTITATIVE DISCRETE CONTINUOUS.
Chapter 3 Statistical Concepts.
Psychometrics.
Chapter 3: Central Tendency. Central Tendency In general terms, central tendency is a statistical measure that determines a single value that accurately.
Measures of Central Tendency or Measures of Location or Measures of Averages.
Chapter 1 The Role of Statistics. Three Reasons to Study Statistics 1.Being an informed “Information Consumer” Extract information from charts and graphs.
1 Concepts of Variables Greg C Elvers, Ph.D.. 2 Levels of Measurement When we observe and record a variable, it has characteristics that influence the.
1 PUAF 610 TA Session 2. 2 Today Class Review- summary statistics STATA Introduction Reminder: HW this week.
Chapter 2 Describing Data.
1  Specific number numerical measurement determined by a set of data Example: Twenty-three percent of people polled believed that there are too many polls.
Descriptive Statistics The goal of descriptive statistics is to summarize a collection of data in a clear and understandable way.
Statistics Without Fear! AP Ψ. An Introduction Statistics-means of organizing/analyzing data Descriptive-organize to communicate Inferential-Determine.
Unit 2 (F): Statistics in Psychological Research: Measures of Central Tendency Mr. Debes A.P. Psychology.
PCB 3043L - General Ecology Data Analysis. PCB 3043L - General Ecology Data Analysis.
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
PROBABILITY AND STATISTICS WEEK 1 Onur Doğan. What is Statistics? Onur Doğan.
BASIC STATISTICAL CONCEPTS Chapter Three. CHAPTER OBJECTIVES Scales of Measurement Measures of central tendency (mean, median, mode) Frequency distribution.
Overview and Types of Data
Educational Research: Data analysis and interpretation – 1 Descriptive statistics EDU 8603 Educational Research Richard M. Jacobs, OSA, Ph.D.
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 3 Section 1 – Slide 1 of 27 Chapter 3 Section 1 Measures of Central Tendency.
Measurements Statistics WEEK 6. Lesson Objectives Review Descriptive / Survey Level of measurements Descriptive Statistics.
Descriptive Statistics Printing information at: Class website:
Unit 1 - Graphs and Distributions. Statistics 4 the science of collecting, analyzing, and drawing conclusions from data.
PCB 3043L - General Ecology Data Analysis Organizing an ecological study What is the aim of the study? What is the main question being asked? What are.
Outline Sampling Measurement Descriptive Statistics:
Measurements Statistics
Chapter 6 Introductory Statistics and Data
Different Types of Data
Populations.
Data Analysis.
Chapter 1 & 3.
Chapter 5 STATISTICS (PART 1).
Central Tendency and Variability
Distributions and Graphical Representations
Unit 1 - Graphs and Distributions
Basics of Statistics.
PROBABILITY AND STATISTICS
statistics Specific number
Descriptive Statistics
Introduction to Statistics
Basic Statistical Terms
statistics Specific number
Political Science 30 Political Inquiry
Measure of Central Tendency
Chapter 3: Central Tendency
6A Types of Data, 6E Measuring the Centre of Data
Population Population
15.1 The Role of Statistics in the Research Process
Population Population
Advanced Algebra Unit 1 Vocabulary
Chapter 6 Introductory Statistics and Data
Biostatistics Lecture (2).
Presentation transcript:

Statistics Definitions ID1050– Quantitative & Qualitative Reasoning

Population vs. Sample We can use statistics when we wish to characterize some particular aspect of a group, merging each individual’s data into a corporate value. The definitions on this and the following slides are useful in describing statistical analyses. Population This is the entire group that we wish to characterize using statistics. Sample It is often not practical or possible to collect data from every individual in the population. In this case, we collect data from a sample of the population. The sample is usually a small subset of the population. In the case that the sample is the entire population, we call this a census. The hope is that the statistics of the sample accurately represent the statistics of the population, had we collected data from every member.

Parameter vs. Variable The aspect we wish to characterize may have a numerical value, or it may be just the number of members that have some property. Parameter The aspect we wish to characterize from the population is called the parameter. Variable When we actually collect data from a sample of the population, we call that data the variable. If we do our job well, the variable from the sample will be similar (though it is not expected to be identical) to that parameter of the population.

Discrete & Continuous Data The data we collect is either a mathematical measurement or the count of members with some property. In either case, we have a number that has one of two properties. Discrete variable This type of variable can only take on discrete values. Often, these are integer values. There are no instances of members with a value between the discrete values. Examples are number of children in a family, the state you were born in, the number of faulty computer chips from a factory, or a ranking of high/medium/low probability of some risk. Continuous variable With this type of variable, a measurement from the sample can take on any real value. Typically, the number falls within some reasonable range. Examples are heights of people, battery voltages from a manufacturer, or annual income.

Scales (Data Type) There are four types of classes or groups we can make from a population’s parameter: Nominal scale – This classification is based on name only. There is no numerical value to the data. Examples include hair color, state of birth, and gender. Ordinal scale – We use this scale when we rank or order a group based on how much of the characteristic they possess. Often the classes are high/medium/low or the like. Examples: dominance in monkeys or attractiveness of a painting. Interval scale – If we have a numerical value, but what we define as ‘zero’ for that value is arbitrarily chosen, then the only meaningful comparison between values is the difference between them, or intervals. The best example is time. We commonly count from a convenient starting point, but time existed before that point. So the interval between start and stop is what is relevant. Ratio scale – If the numerical value is a measurement of some continuous value that cannot be negative, then the ‘zero’ is not arbitrary, and meaningful comparisons between data can be made, such as ‘how many times larger is a value’ or ‘what is half of this value’. Some examples: height or age of a person, corporate income, battery voltage, etc.

Distributions For a large sample, we may see a particular distribution of members among the classes or categories. Some often-encountered distributions have names: Uniform distribution – The chances of a member being in any of the categories is equal. With a large sample, all classes should have approximately equal numbers. The roll of a single six-sided die is expected to be uniformly distributed. The chances are equal that a roll will be any number between 1 and 6. Normal distribution – The values measured tend to be near a central value. There are more members near the middle, and fewer members at values farther from the middle. The height of a population is expected to be normally distributed. There is an average height, and most people are near this value. There are few exceptionally tall or exceptionally short individuals. This is due to physiological constraints of the human body. Uniform and normal distributions are idealized distributions; real data may or may not approximate one of these distributions (or one of several other named distributions.) A single die should be uniform. Two dice tend to give more values near 7 and fewer near 2 and 12. The more dice that are rolled, the more the distribution begins to look like a normal distribution.

Measures of Central Tendency When data tends toward the middle, we have to ask “what does ‘middle’ mean?” There is no one definition of central tendency; instead, what the middle means may depend on context or desired use. What is the middle of the title ‘One Fish, Two Fish, Red Fish, Blue Fish’? Is it the middle letter (R), the middle word (there are two), or the word ‘Fish’, which appears the most? There are three common measures of central tendency: Mean – Sometimes called ‘average’, this is the sum of the data divided by the number of data values. Median – This is the middle element of an ordered list. Mode – This is the class that is most represented.

Measures of Spread and Measure of Symmetry Once we have identified the middle of some data, we may then ask “how far is the data spread about the middle?”. There are two common measures of the spread: Variance and standard deviation. If these values are large, the data is spread wide, and if they are small, the data occupies a narrow range. The two differ from each other only slightly; one is the square of the other. After considering the spread, we may ask “how symmetric is the spread of the data about the middle?”. The value we will used for this measure is called skewness. A value near ‘zero’ indicates high symmetry; the data tails off similarly to the left and to the right. An absolute value near ‘one’ or higher indicates that one tail is distinctly longer than the other. The sign of the skewness indicates the direction of the tail (- = left, + = right).

Conclusion When performing statistical analysis, some vocabulary is helpful Population vs. Sample – What is the group we are analyzing? Parameter vs. Variable – The name for the data we collect from a population or sample. Discrete & Continuous – The data can be the number of members, it could consist of integers, or it could be any real numerical value. Nominal, Ordinal, Interval, and Ratio Scales – How do we categorize the collected data? Uniform & Normal Distributions – How is the data spread among the groups? Statistical Variables – The numerical results from analyzing the data Mean, Median, & Mode – Measures of the middle Standard Deviation & Variance – Measures of the spread Skewness – Measure of symmetry