DESCRIPTIVE STATISTICS

Presentation transcript:

DESCRIPTIVE STATISTICS © LOUIS COHEN, LAWRENCE MANION AND KEITH MORRISON © 2018 Louis Cohen, Lawrence Manion and Keith Morrison; individual chapters, the contributors

STRUCTURE OF THE CHAPTER
A cautionary note about missing data
Frequencies, percentages and crosstabulations
Measures of central tendency and dispersal
Taking stock
Correlations and measures of association
Partial correlations
Reliability

MISSING DATA
Data may be Missing Completely At Random (MCAR), i.e. there is no pattern to the missing data for any variables.
Data may be Missing At Random (MAR), where there is a pattern to the missing data, but not for the main dependent variable.
Data may be Missing Not At Random (MNAR), where there is a pattern in the missing data that affects the main dependent variable (e.g. low-income families may not respond to a survey item).

ADDRESSING MISSING DATA
If the missing data are randomly scattered, and the number of missing cases is so small that it cannot seriously distort the overall findings, then the researcher might simply exclude those cases.
If the missing data are not randomly scattered but are systematically missing, i.e. there is a pattern in the non-response, then this is a major problem for the researcher, who may decide not to pursue that part of the analysis or may use imputation methods.
Conduct sensitivity analysis: calculate the number of different responses/cases required to overturn or seriously change the findings of the analysis. If the number of missing cases is too low to upset the findings, then the researcher might proceed, reporting the number of missing cases.

ADDRESSING MISSING DATA
Adopt a deletion method for missing data: exclude any cases whose data are incomplete on any variable, or only use those cases which are complete on all the variables.
Adopt the imputation method: a general term given to the methods of trying to calculate what the missing values might be so that they can be included in the analysis, i.e. substituting missing values with plausible, calculated values.
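The two approaches on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` data, the helper names `listwise_delete` and `mean_impute`, and the choice of mean substitution as the imputation rule are all hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical survey records; None marks a missing value.
records = [
    {"age": 25, "score": 60},
    {"age": 31, "score": None},
    {"age": None, "score": 75},
    {"age": 40, "score": 90},
]

def listwise_delete(rows):
    """Deletion: keep only the cases that are complete on every variable."""
    return [row for row in rows if all(v is not None for v in row.values())]

def mean_impute(rows, variable):
    """Imputation (one simple option): fill gaps with the observed mean."""
    observed = [row[variable] for row in rows if row[variable] is not None]
    fill = mean(observed)
    return [
        {**row, variable: fill if row[variable] is None else row[variable]}
        for row in rows
    ]

complete_cases = listwise_delete(records)
imputed = mean_impute(records, "score")

print(len(complete_cases))                 # cases that survive deletion
print([row["score"] for row in imputed])   # scores after mean imputation
```

Mean substitution is used here only because it is short; it shrinks the variability of the imputed variable, so other imputation methods may be preferable in practice.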

FREQUENCIES AND PERCENTAGES
Graphical forms of data presentation:
Frequency and percentage tables
Bar charts (for nominal and ordinal data)
Histograms (for continuous – interval and ratio – data)
Line graphs
Pie charts
High and low charts
Scatterplots
Stem and leaf displays
Boxplots (box and whisker plots)

FREQUENCIES AND PERCENTAGES
Bar charts present categorical and discrete data, highest and lowest.
Avoid using a third dimension (e.g. depth) in a graph when it is unnecessary; a third dimension to a graph must provide additional information.
Histograms present continuous data.
Line graphs show trends, particularly in continuous data, for one or more variables at a time.
Multiple line graphs show trends in continuous data on several variables in the same graph.

FREQUENCIES AND PERCENTAGES
Pie charts and bar charts show proportions.
Crosstabulations show interdependence.
Boxplots show the distribution of values for several variables in a single chart, together with their range and medians.
Stacked bar charts show the frequencies of different groups within a specific variable for two or more variables in the same chart.
Scatterplots show the relationship between two variables or several sets of two or more variables on the same chart.
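A frequency and percentage table, the starting point for most of the charts listed in this section, might be built as follows. The `responses` list and the `frequency_table` helper are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

# Hypothetical answers to a single nominal survey item.
responses = ["agree", "disagree", "agree", "neutral", "agree", "disagree"]

def frequency_table(values):
    """Map each category to (frequency, percentage of all responses)."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    }

table = frequency_table(responses)
for category, (count, pct) in table.items():
    print(f"{category:<10}{count:>5}{pct:>8}%")
```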

CROSSTABULATIONS
A crosstabulation is a presentational device.
Rows for nominal data, columns for ordinal data.
Independent variables as row data, dependent variables as column data.

BIVARIATE CROSSTABULATION

TRIVARIATE CROSSTABULATION
Acceptability of formal, written public examinations

                              Traditionalist                 Progressivist/child-centred
Formal, written public exams  Socially      Socially         Socially      Socially
                              advantaged    disadvantaged    advantaged    disadvantaged
In favour                     65%           70%              35%           20%
Against                       35%           30%              65%           80%
Total per cent                100%          100%             100%          100%
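A crosstabulation of this kind can be sketched by counting pairs of values and converting the counts to percentages within each category of the independent variable. The `respondents` data and the `crosstab_percentages` helper below are hypothetical and are not the figures from the slide.

```python
from collections import Counter

# Hypothetical respondents: (teaching style, view on formal public exams).
respondents = [
    ("traditionalist", "in favour"),
    ("traditionalist", "in favour"),
    ("traditionalist", "in favour"),
    ("traditionalist", "against"),
    ("progressivist", "in favour"),
    ("progressivist", "against"),
    ("progressivist", "against"),
    ("progressivist", "against"),
]

def crosstab_percentages(pairs):
    """Percentage of each response within each independent-variable group."""
    cell_counts = Counter(pairs)
    group_totals = Counter(group for group, _ in pairs)
    return {
        (group, response): 100 * n / group_totals[group]
        for (group, response), n in cell_counts.items()
    }

cells = crosstab_percentages(respondents)
for (group, response), pct in sorted(cells.items()):
    print(f"{group:<16}{response:<12}{pct:5.1f}%")
```

Adding a third variable (for example social advantage) would only mean counting triples instead of pairs and computing the percentages within each pair of group values.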

MEASURES OF CENTRAL TENDENCY AND DISPERSAL
The mode (the score obtained by the greatest number of people)
  For categorical (nominal) and ordinal data
The mean (the average score)
  For continuous data
  Used if the data are not skewed
  Used if there are no outliers

MEASURES OF CENTRAL TENDENCY AND DISPERSAL
The median (the score obtained by the middle person in a ranked group of people, i.e. it has an equal number of scores above it and below it)
  For continuous data
  Used if the data are skewed
  Used if there are outliers
  Used if the standard deviation is high
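The guidance on when to prefer the median over the mean can be seen with a small example. The scores and category labels below are hypothetical; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical test scores; 20 is an outlier that skews the distribution.
scores = [1, 2, 3, 4, 20]

# Hypothetical nominal data, for which only the mode is meaningful.
favourite_subject = ["maths", "art", "maths", "history", "maths"]

average = mean(scores)      # pulled upwards by the outlier
middle = median(scores)     # unaffected by how extreme the outlier is
most_common = mode(favourite_subject)

print(average, middle, most_common)
```

Here the mean (6) is larger than four of the five scores, while the median (3) still describes a typical case, which is why the median is suggested for skewed data or data with outliers.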

MEASURES OF CENTRAL TENDENCY AND DISPERSAL
Standard deviation: the average distance of each score from the mean, i.e. how much the scores, as a group, deviate from the mean.
  A standardized measure of dispersal
  For interval and ratio data

STANDARD DEVIATION
The standard deviation is calculated, in its most simplified form, as:
$SD = \sqrt{\frac{\sum d^2}{N}}$ or $SD = \sqrt{\frac{\sum d^2}{N - 1}}$
where
$d^2$ = the deviation of the score from the mean (average), squared
$\sum$ = the sum of
$N$ = the number of cases
A low standard deviation indicates that the scores cluster together, whilst a high standard deviation indicates that the scores are widely dispersed.

High standard deviation
[Figure: dot plot of five scores on a scale from 1 to 20, with the mean marked. Scores: 1, 2, 3, 4, 20. Mean = 6.]

Moderately high standard deviation
[Figure: dot plot of five scores on a scale from 1 to 20, with the mean marked. Scores: 1, 2, 6, 10, 11. Mean = 6.]

Low standard deviation
[Figure: dot plot of five scores on a scale from 1 to 20, with the mean marked. Scores: 5, 6, 6, 6, 7. Mean = 6.]
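The three distributions can be compared numerically with a short sketch. It assumes the two common forms of the formula (dividing the sum of squared deviations by N, or by N - 1 for a sample), and it assumes the high-spread set is 1, 2, 3, 4, 20, which is consistent with the stated mean of 6. The function name `standard_deviation` is hypothetical.

```python
from math import sqrt

def standard_deviation(scores, sample=False):
    """Square root of the mean squared deviation from the mean.

    Divides by N by default, or by N - 1 when sample=True.
    """
    n = len(scores)
    mean = sum(scores) / n
    sum_d2 = sum((score - mean) ** 2 for score in scores)
    return sqrt(sum_d2 / (n - 1 if sample else n))

high = [1, 2, 3, 4, 20]       # assumed scores for the high-spread chart
moderate = [1, 2, 6, 10, 11]
low = [5, 6, 6, 6, 7]

for label, scores in [("high", high), ("moderate", moderate), ("low", low)]:
    print(label, round(standard_deviation(scores), 3),
          round(standard_deviation(scores, sample=True), 3))
```

All three sets share the same mean, so the standard deviation is the only one of the two statistics that tells them apart.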

THE RANGE AND INTERQUARTILE RANGE
The range
  The difference between the minimum and maximum score.
  A measure of dispersal.
  Outliers exert a disproportionate effect.
The interquartile range
  The difference between the first and the third quartile (the 25th and the 75th percentile), i.e. the spread of the middle 50 per cent of scores (the second and third quartiles).
  Overcomes problems of outliers/extreme scores.
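The difference in how the two measures react to an outlier can be sketched as follows. The data sets and helper names are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`; other quartile conventions give slightly different values.

```python
from statistics import quantiles

def value_range(scores):
    """Difference between the maximum and minimum score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """Difference between the third and first quartiles."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]        # hypothetical scores
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # same, but one extreme score

print(value_range(typical), interquartile_range(typical))
print(value_range(with_outlier), interquartile_range(with_outlier))
```

In this example the single extreme score stretches the range from 8 to 89, while the interquartile range stays at 4.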

CORRELATION
Measure of association between two variables
Note the direction of the correlation
  Positive: As one variable increases, the other variable increases
  Negative: As one variable increases, the other variable decreases
The strongest positive correlation coefficient is +1.
The strongest negative correlation coefficient is -1.

CORRELATION
Foot size  Hand size
1          1
2          2
3          3
4          4
5          5
Perfect positive correlation: +1

CORRELATION
Foot size  Hand size
1          5
2          4
3          3
4          2
5          1
Perfect negative correlation: -1

CORRELATION
Hand size  Foot size
1          2
2          1
3          4
4          3
5          5
Positive correlation: <+1
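The coefficients for the three hand and foot size tables can be checked with a small Pearson correlation function. The function name `pearson_r` is hypothetical, and the formula used is the standard product-moment one, which the slides do not spell out.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

sizes = [1, 2, 3, 4, 5]

perfect_positive = pearson_r(sizes, [1, 2, 3, 4, 5])
perfect_negative = pearson_r(sizes, [5, 4, 3, 2, 1])
imperfect_positive = pearson_r(sizes, [2, 1, 4, 3, 5])

print(perfect_positive, perfect_negative, round(imperfect_positive, 2))
```

The third table gives a coefficient of about 0.8: the two variables still rise together overall, but not in lockstep.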

PERFECT POSITIVE CORRELATION

PERFECT NEGATIVE CORRELATION

MIXED CORRELATION

CORRELATIONS
Spearman rank-order correlation for ordinal (ranked) data
Pearson correlation for interval and ratio data
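One way to see the difference between the two coefficients is that Spearman's coefficient can be computed as a Pearson correlation on the ranks of the scores. The sketch below assumes there are no tied values (ties need averaged ranks, which this simple `ranks` helper does not handle); all names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

def ranks(values):
    """Rank from 1 (smallest) upwards. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 3))
```

Because only the ordering matters to Spearman's coefficient, it reaches 1 for any strictly increasing relationship, whereas Pearson's coefficient falls below 1 when the relationship is not linear.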

CORRELATIONS
Begin with a null hypothesis (e.g. there is no relationship between the size of hands and the size of feet).
The task is not to support the null hypothesis; the burden of responsibility is to see whether it can be rejected.
If the null hypothesis is not supported at the 95 per cent, 99 per cent or 99.9 per cent level of confidence, then there is a statistically significant relationship between the size of hands and the size of feet at the 0.05, 0.01 and 0.001 levels of significance respectively.
These levels of significance – the 0.05, 0.01 and 0.001 levels – are the levels at which statistical significance is frequently taken to be demonstrated.
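The logic of testing a null hypothesis of no relationship can be illustrated with a permutation test. This is a swapped-in technique chosen because it needs no statistical tables, not the procedure the slides describe: it shuffles one variable in every possible order and asks how often a correlation at least as strong as the observed one arises by chance. The function names and the hand and foot size data are hypothetical.

```python
from itertools import permutations
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

def permutation_p_value(x, y):
    """Two-sided exact permutation p-value for the correlation of x and y.

    Enumerates every ordering of y, so it is only practical for small samples.
    """
    observed = abs(pearson_r(x, y))
    orderings = list(permutations(y))
    as_extreme = sum(
        1 for shuffled in orderings
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12
    )
    return as_extreme / len(orderings)

hand = [1, 2, 3, 4, 5]
p_perfect = permutation_p_value(hand, [1, 2, 3, 4, 5])
p_mixed = permutation_p_value(hand, [2, 1, 4, 3, 5])

# The null hypothesis is rejected at the 0.05 level only if p < 0.05.
print(round(p_perfect, 4), p_perfect < 0.05)
print(round(p_mixed, 4), p_mixed < 0.05)
```

With only five cases even a perfect correlation yields a p-value of 2/120, which clears the 0.05 level but not the 0.01 level, a reminder that significance depends on sample size as well as on the size of the coefficient.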

CORRELATION
Note the magnitude of the correlation coefficient:
  0.20 to 0.35: slight association
  0.35 to 0.65: sufficient for crude prediction
  0.65 to 0.85: sufficient for accurate prediction
  >0.85: strong correlation
Note the direction of the correlation (positive/negative)
Ensure that the relationships are linear and not curvilinear (i.e. the line reaches an inflection point)

CURVILINEAR RELATIONSHIP

MULTIPLE AND PARTIAL CORRELATIONS
Multiple correlation
  The degree of association between three or more variables simultaneously.
Partial correlation
  The degree of association between two variables after the influence of a third has been controlled or partialled out.
  Controlling for the effects of a third variable means holding it constant whilst manipulating the other two variables.
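A first-order partial correlation can be computed directly from the three pairwise correlation coefficients. The sketch below uses the standard textbook formula, which does not appear on the slide; the function name `partial_correlation` and the example coefficients are hypothetical.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z partialled out.

    Uses (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2)).
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical coefficients: x and y correlate at 0.5, and each also
# correlates at 0.5 with a third variable z.
controlled = partial_correlation(r_xy=0.5, r_xz=0.5, r_yz=0.5)
print(round(controlled, 3))
```

In this example part of the 0.5 association between x and y is shared with z, so the partial correlation drops to about 0.333 once z is held constant.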

RELIABILITY
Split-half reliability (correlation between one half of a test and the other matched half)
The alpha coefficient

SPLIT-HALF RELIABILITY (Spearman-Brown)
$\text{Reliability} = \frac{2r}{1 + r}$
where $r$ = the actual correlation between the two halves of the instrument (e.g. 0.85)
$\text{Reliability} = \frac{2 \times 0.85}{1 + 0.85} = \frac{1.70}{1.85} = 0.919$ (very high)
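The worked example on this slide can be reproduced in code. The sketch assumes the usual Spearman-Brown form for two halves, reliability = 2r / (1 + r); the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r):
    """Estimated reliability of the full instrument from the correlation r
    between its two halves: 2r / (1 + r)."""
    return (2 * r) / (1 + r)

reliability = spearman_brown(0.85)
print(round(reliability, 3))
```

The correction raises the coefficient because each half is only half as long as the full instrument, and shorter tests tend to be less reliable.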

CRONBACH’S ALPHA
Reliability as internal consistency: Cronbach’s alpha (the alpha coefficient of reliability).
A coefficient of inter-item correlations.
It calculates the average of all possible split-half reliability coefficients.
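Cronbach's alpha is usually computed from item variances rather than by averaging split halves explicitly. The sketch below uses that common variance-based formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), which is not given on the slide; the item scores and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one list of scores per item, each covering the same respondents
    in the same order. Uses sample variances throughout.
    """
    k = len(items)
    totals = [sum(person_scores) for person_scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of five people to a three-item scale.
item_scores = [
    [2, 3, 4, 5, 4],
    [3, 3, 4, 5, 5],
    [2, 4, 4, 4, 5],
]

alpha = cronbach_alpha(item_scores)
print(round(alpha, 3))
```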

INTERPRETING THE RELIABILITY COEFFICIENT
Maximum is +1
  >.90      very highly reliable
  .80–.90   highly reliable
  .70–.79   reliable
  .60–.69   marginally/minimally reliable
  <.60      unacceptably low reliability
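These bands can be turned into a small lookup for labelling coefficients in a report. The function name `interpret_reliability` is hypothetical, and the sketch assumes that values falling between the printed bands (for example 0.795) belong to the band below.

```python
def interpret_reliability(coefficient):
    """Return the slide's verbal label for a reliability coefficient."""
    if coefficient > 0.90:
        return "very highly reliable"
    if coefficient >= 0.80:
        return "highly reliable"
    if coefficient >= 0.70:
        return "reliable"
    if coefficient >= 0.60:
        return "marginally/minimally reliable"
    return "unacceptably low reliability"

for value in (0.919, 0.85, 0.72, 0.65, 0.50):
    print(value, interpret_reliability(value))
```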