Chapter 2: Characterizing Your Data Set. Allan Edwards: “Before you analyze your data, graph your data.”

Presentation transcript:

Chapter 2: Characterizing Your Data Set. Allan Edwards: “Before you analyze your data, graph your data.” Francis Galton, Father of Intelligence Testing: “Whenever you can, count!”

Frequency Table: Variable is Continuous

Grouped Frequency Table & Distribution: Continuous variable, data from the same 100 subjects. Constant interval width (“Class Interval”).
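A minimal sketch of how a grouped frequency table with a constant class interval could be tallied. The scores, the interval width of 10, and the helper name `grouped_frequency` are illustrative assumptions, not the data behind the slide.

```python
from collections import Counter

def grouped_frequency(scores, width, start):
    """Tally scores into constant-width class intervals [low, low + width)."""
    counts = Counter((s - start) // width for s in scores)
    table = []
    for k in sorted(counts):
        low = start + k * width
        midpoint = low + width / 2  # the value that represents the interval
        table.append((low, low + width, midpoint, counts[k]))
    return table

# Made-up scores for illustration only.
scores = [52, 55, 61, 64, 65, 68, 70, 71, 73, 77, 79, 84, 88, 93]

for low, high, midpoint, freq in grouped_frequency(scores, width=10, start=50):
    print(f"{low}-{high}: midpoint {midpoint}, frequency {freq}")
```

Intervals with no scores are simply omitted in this sketch; a full table would usually list them with a frequency of zero.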

Grouped Frequency Histogram for a Continuous Variable: Bars “touch” because the end of one interval is the beginning of the next. The value plotted is the middle value (midpoint) of the interval. Spatz says the bars don’t touch – Whaaaaaa?????

Bar Chart for a Categorical Variable: Bars are separated – “a lot of Biology” is not “almost English.”

Standard Normal Distribution: The more extreme your score, the more unusual and improbable you are. Remember this relationship -- it’s the basis of 90% of statistics. Typical of many characteristics -- e.g., height, intelligence, speed.

Rectangular Distribution: Never seen one. Extreme scores are NOT less usual/frequent/probable.

Non-Normal Distribution. Example: Income -- Where is the mean? How would you characterize these data?
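To make the "where is the mean?" question concrete, here is a small sketch with made-up incomes (in thousands). The numbers are assumptions chosen only to show how one extreme value pulls the mean away from the typical case.

```python
import statistics

# Hypothetical incomes in thousands; one very high earner creates positive skew.
incomes = [20, 25, 30, 35, 40, 45, 500]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
below_mean = sum(1 for x in incomes if x < mean_income)

print(f"mean = {mean_income:.1f}, median = {median_income}")
print(f"{below_mean} of {len(incomes)} incomes fall below the mean")
```

In this example nearly everyone earns less than the mean, which is why the median is usually preferred for skewed data.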

Negative Skew

Bimodal Distribution: Is the mean appropriate/representative? E.g., mean age of onset for anorexia is 17 yrs. One peak is at 14 yrs -- onset of puberty. One peak is at 18 yrs -- going away to college.

Bimodal Distribution, cont.

Characterizing Your Data: Measures of Central Tendency
Characterizing your data: Shorthand notation for all of your values
Central Tendency: A representative value; where your scores tend to “hang out”; where you go to find your data
1. Mean -- What is the definition & why do you use it?
2. Median -- Middle value. What if you have an even # of values?
3. Mode -- Most frequent value
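The three measures can be sketched with Python's standard `statistics` module. The scores are made up, and the list deliberately has an even number of values: under the common convention used by `statistics.median`, the median is then the average of the two middle values.

```python
import statistics

# Made-up scores with an even number of values.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores divided by N
median = statistics.median(scores)  # even N: average of the two middle values (3 and 5)
mode = statistics.mode(scores)      # most frequent value

print(f"mean = {mean}, median = {median}, mode = {mode}")
```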

Which Central Tendency is Best?
Mean: Ratio data (people allow interval data); symmetrical distributions
Median: Skewed distributions; ordinal (ranked) data -- a mean cannot be computed
Mode: Nominal (qualitative) data; bimodal data

If You Had to Guess the Value of Each (Quantitative) Data Point
Mode: Highest # of correct guesses
Median: Errors would be symmetrical; overestimations would balance out underestimations
Mean: Errors of estimation will be smallest overall (in the squared-error sense)
Two Unique Properties of the Mean:
1. Squared deviations are smaller in total from the mean than from any other value
2. Deviation scores sum to zero
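The guessing argument can be checked numerically. This sketch uses made-up scores and three hypothetical helper functions: it guesses the same value for every data point and scores that guess by exact hits, total absolute error, and total squared error.

```python
import statistics

data = [1, 1, 2, 4, 7]  # made-up scores: mode 1, median 2, mean 3

guesses = {
    "mode": statistics.mode(data),
    "median": statistics.median(data),
    "mean": statistics.mean(data),
}

def exact_hits(guess):
    """Number of data points the guess gets exactly right."""
    return sum(1 for x in data if x == guess)

def total_abs_error(guess):
    """Sum of absolute errors when the same guess is used for every point."""
    return sum(abs(x - guess) for x in data)

def total_sq_error(guess):
    """Sum of squared errors when the same guess is used for every point."""
    return sum((x - guess) ** 2 for x in data)

for name, g in guesses.items():
    print(f"{name:>6} = {g}: hits {exact_hits(g)}, "
          f"abs error {total_abs_error(g)}, squared error {total_sq_error(g)}")

# Second property of the mean: the signed deviations cancel out.
print("sum of deviations from the mean:", sum(x - guesses["mean"] for x in data))
```

With these scores the mode gets the most exact hits, the median has the smallest total absolute error (and as many overestimates as underestimates), and the mean has the smallest total squared error.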

How Strong Is Your Tendency? Measures of Heterogeneity (Chapter 3). Two data sets with nearly identical Ns, means, medians, and modes: are these two data sets similar?

Are They The Same?
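A quick sketch of the point, using two made-up data sets (not the ones pictured on the slide) that agree on N, mean, median, and mode yet differ clearly in spread.

```python
import statistics

# Two hypothetical data sets with the same N, mean, median, and mode.
a = [4, 5, 5, 5, 6]
b = [1, 5, 5, 5, 9]

for name, data in (("a", a), ("b", b)):
    print(
        f"{name}: N={len(data)}, mean={statistics.mean(data)}, "
        f"median={statistics.median(data)}, mode={statistics.mode(data)}, "
        f"range={max(data) - min(data)}"
    )
```

Every measure of central tendency matches, but the scores in `b` sit much farther from the center than those in `a`.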

Some Data Sets are More Heterogeneous
Jockeys: Very low average height; very homogeneous
Presbyterians: Medium average height; very heterogeneous
NBA Players: Very high average height; very homogeneous
How do you characterize a data set’s heterogeneity?
The greater the heterogeneity, the weaker the central tendency

Quantifying Heterogeneity
Range: Highest score minus lowest score. Very sensitive to a single extreme score.
Interquartile Range: 75th percentile minus 25th percentile. Captures 50% of the scores.
How wide do you have to go to capture 50% of values? The wider you have to go, the more heterogeneity.
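A minimal sketch comparing the two measures on made-up scores. It assumes Python 3.8+ for `statistics.quantiles` and uses the "inclusive" percentile method; other percentile conventions give slightly different quartiles.

```python
import statistics

def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """75th percentile minus 25th percentile (inclusive method)."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1

# Made-up data: identical except for one extreme score.
typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

for name, data in (("typical", typical), ("with_outlier", with_outlier)):
    print(f"{name}: range = {score_range(data)}, IQR = {interquartile_range(data)}")
```

The single extreme score inflates the range but leaves the interquartile range unchanged.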

Heterogeneity, cont. The more heterogeneity, the more the scores will deviate from the mean.

Heterogeneity, cont.
Two unique properties of the mean:
1. All deviation scores sum to zero
2. Raw scores deviate less from the mean than from any other value (measured by the sum of squared deviations)
This makes the mean the best representative of the data set if the distribution is symmetrical

Heterogeneity, cont.
Problem: All deviation scores sum to zero no matter how heterogeneous the raw scores. You cannot average deviation scores to quantify heterogeneity.
Solution: Make all deviation scores positive

Heterogeneity, cont.
Two ways to make all deviation scores positive:
1. Take the absolute value of the deviation scores. Average of absolute values = Average Deviation (AD). Mean +/- AD captures roughly 50% of raw scores (closer to 57% when scores are normally distributed).
2. Take the square of the deviation scores. Average of squared deviation scores = Variance: σ² for population, S² for sample, S²-“hat” for estimating population from sample.
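Both fixes can be worked through by hand on a small made-up data set. This sketch divides by N throughout, so the variance here describes the data at hand and is not an estimate of a population.

```python
# Made-up scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n
deviations = [x - mean for x in scores]

# Signed deviations cancel out, so their average says nothing about spread.
sum_of_deviations = sum(deviations)

# Fix 1: absolute values -> average deviation.
average_deviation = sum(abs(d) for d in deviations) / n

# Fix 2: squares -> variance (average squared deviation).
variance = sum(d ** 2 for d in deviations) / n

print(f"sum of deviations = {sum_of_deviations}")
print(f"average deviation = {average_deviation}")
print(f"variance          = {variance}")
```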

Variance
Population variance; estimate of population variance from a sample; to describe the sample itself, use N: S² = sample variance.
Problem: The magnitude of the variance is large relative to individual deviation scores -- it quantifies heterogeneity but is not very descriptive.
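The formulas themselves are not in the transcript. The standard definitions that match the slide's three labels would be as follows, with the symbols $\mu$ (population mean) and $\bar{X}$ (sample mean) assumed here rather than taken from the slide:

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}
\qquad
S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}
\qquad
\hat{S}^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}
```

Dividing by $N - 1$ instead of $N$ makes the estimate slightly larger, which compensates for a sample's tendency to understate the spread of the population it came from.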

Standard Deviation: Population, Sample, Population Estimate. Mean +/- SD captures about 68% of data points in a normal distribution.
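As a rough illustration, the standard `statistics` module exposes both denominators: `pstdev` divides by N and `stdev` divides by N - 1. The scores are made up.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data; sum of squared deviations = 32

sd_describing_scores = statistics.pstdev(scores)   # sqrt(32 / 8), N in the denominator
sd_population_estimate = statistics.stdev(scores)  # sqrt(32 / 7), N - 1 in the denominator

print(f"SD using N     = {sd_describing_scores:.3f}")
print(f"SD using N - 1 = {sd_population_estimate:.3f}")
```

Taking the square root puts the result back in the units of the raw scores, which addresses the complaint that the variance is large relative to individual deviations.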

Standard Deviation, cont.

The Concept: Standard Deviation
Standard deviation from the mean = “average” deviation from the mean = expected deviation from the mean
Expect 68% of your data to be within 1 SD of the mean
Expect 95% of your data to be within 2 SD of the mean
If your score is beyond 2 SDs of the mean: you are very infrequent, very unusual, very improbable
Associate: Infrequent with Improbable
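The 68% and 95% figures hold for a normal distribution and can be checked with `statistics.NormalDist` (Python 3.8+). This is a sketch under that normality assumption; data with a different shape will not follow these percentages.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
beyond_2_sd = 1 - within_2_sd

print(f"within 1 SD: {within_1_sd:.1%}")
print(f"within 2 SD: {within_2_sd:.1%}")
print(f"beyond 2 SD: {beyond_2_sd:.1%}")
```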

Interpreting a Value
Transforming a score to make it more interpretable:
Comparing two scores: Two tests of equal difficulty but of different length. Pretend both tests were 100 items long. How many would you have gotten right? Percent correct is a transformed score.
Comparing one score to everybody else: Pretend there were 100 people; where would you rank? Percentile is a transformed score.
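Both transformations can be sketched in a few lines. The function names and all numbers are illustrative assumptions, and the percentile rank here counts only scores strictly below yours, which is one of several conventions in use.

```python
def percent_correct(n_correct, n_items):
    """Rescale a raw score as if the test had 100 items."""
    return 100 * n_correct / n_items

def percentile_rank(score, all_scores):
    """Percent of scores strictly below `score` (one of several conventions)."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)

# Two hypothetical tests of different length.
short_test = percent_correct(18, 20)
long_test = percent_correct(40, 50)
print(f"short test: {short_test}%, long test: {long_test}%")

# A hypothetical class of ten scores.
class_scores = [55, 60, 62, 68, 70, 71, 75, 80, 88, 95]
rank = percentile_rank(80, class_scores)
print(f"a score of 80 is at percentile {rank}")
```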

Z-scores & Z-transformations: Take each score (Xi) and convert it to Zi. Mean of z-scores = 0; standard deviation = 1. Units of z-scores are standard deviations. A z-score compares your deviation (numerator) to the “average deviation” (denominator).
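A minimal sketch of the z-transformation, assuming the usual formula z = (X - mean) / SD with N in the denominator of the SD. The scores and the helper name `z_scores` are made up for illustration.

```python
import statistics

def z_scores(scores):
    """Convert each raw score to (score - mean) / SD."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # SD with N in the denominator
    return [(x - mean) / sd for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, SD 2
z = z_scores(scores)

print("z-scores:", z)
print("mean of z-scores:", statistics.mean(z))
print("SD of z-scores:", statistics.pstdev(z))
```

A raw score of 9 becomes z = 2: it sits two standard deviations above the mean.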

Where you are relative to the population: think percentile.
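If the population is assumed to be normally distributed, a z-score maps directly to a percentile. The helper name `z_to_percentile` is hypothetical, and the sketch relies on `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

def z_to_percentile(z):
    """Percent of a normal population scoring below the given z-score."""
    return 100 * NormalDist().cdf(z)

for z in (-2, -1, 0, 1, 2):
    print(f"z = {z:+d}: percentile = {z_to_percentile(z):.1f}")
```

A z-score of 0 is the 50th percentile, and a z-score of +1 is roughly the 84th.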

Interpreting Your Z-Score

Interpreting Your Z-Score, cont.