UNIT I INTRODUCTION TO MEASUREMENT THEORY


UNIT I INTRODUCTION TO MEASUREMENT THEORY CHAP 1: WHAT IS TEST THEORY CHAP 2: STATISTICAL CONCEPTS FOR TEST THEORY CHAP 3: INTRODUCTION TO SCALING CHAP 4: PROCESS OF TEST CONSTRUCTION CHAP 5: TEST SCORES AS COMPOSITES

UNIT II RELIABILITY CHAP 6: RELIABILITY AND THE CLASSICAL TRUE SCORE MODEL CHAP 7: PROCEDURES FOR ESTIMATING RELIABILITY CHAP 8: INTRODUCTION TO GENERALIZABILITY THEORY CHAP 9: RELIABILITY COEFFICIENTS FOR CRITERION-REFERENCED TESTS

UNIT III VALIDITY CHAP 10: INTRODUCTION TO VALIDITY CHAP 11: STATISTICAL PROCEDURES FOR PREDICTION AND CLASSIFICATION CHAP 12: BIAS IN SELECTION CHAP 13: FACTOR ANALYSIS

UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT CHAP 14: ITEM ANALYSIS CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY CHAP 16: DETECTING ITEM BIAS

UNIT V TEST SCORING AND INTERPRETATION CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS CHAP 18: SETTING STANDARDS CHAP 19: NORMS AND STANDARD SCORES CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

Introduction to Classical and Modern Test Theory Chapter 1

*What is Test Theory? The study of measurement problems, the influence of these measurement problems on tests or inventories, and how to create methods to minimize these problems.

Historic Origins Pioneer countries in test theory are: Germany, England, France, and the United States

Germany Wilhelm Wundt, Ernst Weber, and Gustav Fechner used procedures for collecting observations in a standard way for all subjects, such as reading the instructions at the top of the test page (see next slide).

Germany Multiple Choice Identify the choice that best completes the statement or answers the question.
1. The type of sensation you experience depends on which area of the brain is activated. This is known as a. sensory localization. b. transduction. c. sensory adaptation. d. cerebralization.
2. A hypnic jerk usually occurs during a. light sleep. b. deep sleep. c. episodes of hypersomnia. d. episodes of sleep apnea.
See p.14 Exercise 4-b

Germany p.14 Exercise 4-b 4. Consider the following testing practices and indicate which nineteenth-century psychological researcher probably should be credited with the origin of each. b. A teacher about to give a test reads aloud from the test manual: “Please read the instructions at the top of the page silently while I read them aloud...” (see previous slide)

England Karl Pearson: Pearson Correlation. Charles Spearman: Spearman Correlation; used Factor Analysis in his “Theory of Intelligence.” Francis Galton: Categorizing; half cousin to Darwin.

France Alfred Binet & Theodore Simon (1905) developed the first IQ test. IQ = (MA/CA) × 100, where MA = Mental Age and CA = Chronological Age. *The Difference between Ratio IQ and Deviation IQ or Normative IQ
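The ratio IQ formula on this slide can be sketched in a few lines of Python. The function name `ratio_iq` and the example ages are illustrative assumptions, not values taken from the slides.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ as written on the slide: IQ = (MA / CA) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical child: mental age 10, chronological age 8.
print(ratio_iq(10, 8))  # 125.0
# When mental age equals chronological age, the ratio IQ is 100.
print(ratio_iq(8, 8))   # 100.0
```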

United States *James McKeen Cattell -> “Mental Testing” Thorndike -- An Introduction to the Theory of Mental and Social Measurement Trial and Error -> A Theory of Learning

Key Terms Test Optimal Performance Typical Performance Observable Performance Constructs Measurement

Key Terms Test: A test is a procedure for obtaining a sample of an individual’s performance. Optimal Performance: Refers to performance on aptitude tests (GRE, SAT, ACT) or achievement tests (WRAT, WIAT).

Key Terms Typical Performance: Refers to performance on questionnaires and inventories used to report one’s feelings, attitudes, interests, or reactions to a situation. Observable Performance: Refers to performance in an observable behavior (e.g., watching children interacting with each other; naturalistic observation).

Key Terms Measurement: Quantifying an observable behavior, that is, assigning a quantitative value to a behavior. See Exercises 1 & 2 on p.14

Confounding Variables Confounding variables are variables that the researcher failed to control or eliminate, damaging the internal validity of an experiment. Also known as a third variable, a confounding variable can distort the apparent relation between the independent variable and the dependent variable. Ex. Next

Heavy drinkers die at a younger age

Ex. A research group might design a study to determine if heavy drinkers die at a younger age. Heavy drinkers may be more likely to smoke, or eat junk food, all of which could be factors in reducing longevity. A third variable may have adversely influenced the results.

Intervening Variables A variable that explains a relation or provides a causal link between other variables. Also called “Mediating Variable” or “intermediary variable.” Ex. Next slide

Intervening Variables Ex: The statistical association between income and longevity needs to be explained because just having money does not make one live longer. Other variables intervene between money and long life. People with high incomes tend to have better medical care than those with low incomes. Medical care is an intervening variable. It mediates the relation between income and longevity.

Extraneous Variables These variables are undesirable because they add error to an experiment. A major goal in research design is to decrease or control the influence of extraneous variables as much as possible. Ex: In a study examining the effect of post-secondary education on lifetime earnings, some extraneous variables might be gender, ethnicity, social class, genetics, intelligence, age, and so forth.

Key Terms Constructs: Constructs are hypothetical concepts or psychological attributes/traits, such as personality, anxiety, and depression. They are difficult to measure. Constructs are not physical attributes such as height and weight.

*Why do we have Measurement Problems in Psychology?
1. There is no single universal way of defining a psychological construct.
2. Psychological measurements are based on samples of behavior.
3. Sampling of behavior results in errors in measurement.
4. The units (scales) of measurement are not well defined.
5. The measurements must have a demonstrated relationship to other variables to have meaning.

Role of Test Theory in Research & Evaluation Selecting a Problem Operational Definitions of Variables Instruments Accuracy of the Instruments Data Collection Use of Statistics Optometrists and Ophthalmologists

Merriam Webster Dictionary and Thesaurus Definition of Short-Sighted 1. Nearsighted or myopic 2. Lacking foresight 3. Lacking the power of foreseeing 4. Inability to look forward My Operational Definition: 5. A person who is able to see near things more clearly than distant ones and needs to wear corrective eyeglasses prescribed (measured) by an ophthalmologist.

The American Heritage Dictionary Definition of Intelligent 1. Having or indicating a high or satisfactory degree of intelligence and mental capacity My Operational Definition of Intelligent: 2. Revealing or reflecting good judgment or sound thought: skillful, and measured by the IQ score from the Stanford-Binet V IQ Test (in the Method section of the research paper we write about the reliability and validity of this instrument). Alternatively, select the WAIS or WISC.

Statistical Concepts for Test Theory Chapter 2

Population Sample

Population and Sample Population: Population is the set of all individuals of interest for a particular study. Measurements related to Population are PARAMETERS. Sample: Sample is a set of individuals selected from a population. Measurements related to sample are STATISTICS.

Sample The people chosen for a study are its subjects or participants, collectively called a sample. The sample must be representative.

Statistics Descriptive: Describes the distribution of scores with values such as mean, median, and mode. Inferential: Infers or draws a conclusion from a sample.

Key Terms Constant: e.g., temperature in a study of learning and hunger. Variable: IV -> manipulate; DV -> measure. Discrete Numbers: 1, 2, 3, 14. Continuous Numbers: 1.3, 3.6.

CONTINUOUS VERSUS DISCRETE VARIABLES Discrete variables (categorical): Values are defined by category boundaries, e.g., gender. Continuous variables: Values can range along a continuum, e.g., height.

Statistics Scales of Measurement Frequency Distributions and Graphs Measures of Central Tendency Standard Deviations and Variances Z Score 1- Pearson Correlations 2- Spearman

Scales of Measurement (NOIR) Nominal Scale
Qualities: Assignment of labels
Example: Gender (male or female); Preference (like or dislike); Voting record (for or against)
What You Can Say: Each observation belongs in its own category
What You Can’t Say: An observation represents “more” or “less” than another observation

ORDINAL SCALE
Qualities: Assignment of values along some underlying dimension (order)
Example: Rank in college; Order of finishing a race
What You Can Say: One observation is ranked above or below another
What You Can’t Say: The amount that one variable is more or less than another

INTERVAL SCALE
Qualities: Equal distances between points; arbitrary zero
Example: Number of words spelled correctly on ...; Intelligence test scores; Temperature
What You Can Say: One score differs from another on some measure that has equally appearing intervals
What You Can’t Say: The amount of difference is an exact representation of differences of the variable being studied

RATIO SCALE
Qualities: Meaningful and non-arbitrary zero (absolute zero)
Example: Age; Weight; Time?
What You Can Say: One value is twice as much as another, or no quantity of that variable can exist
What You Can’t Say: Not much!

LEVELS OF MEASUREMENT
Level of Measurement   For Example                               Quality of Level
Ratio                  Rachael is 5’ 10” and Gregory is 5’ 5”    Absolute zero
Interval               Rachael is 5” taller than Gregory         An inch is an inch is an inch
Ordinal                Rachael is taller than Gregory            Greater than
Nominal                Rachael is tall and Gregory is short      Different from
Variables are measured at one of these four levels. Qualities of one level are characteristic of the next level up. The more precise (higher) the level of measurement, the more accurate is the measurement process.

WHAT IS ALL THE FUSS? Measurement should be as precise as possible. In psychology, most variables are probably measured at the nominal or ordinal level. But how a variable is measured can determine the level of precision.

Frequency Distributions and Graphs

histogram

*Histogram for Test Scores

Quiz 1. Frequency distributions of test scores are frequently illustrated by which kind of graph? a. a histogram b. a scatterplot c. a pie chart d. a bar graph

Quiz 14. Frequency distributions of test scores are frequently illustrated by which kind of graph? *a. a histogram b. a scatterplot c. a pie chart d. a bar graph

Polygon

Frequency Distributions and Graphs

PERCENTILES When the results of a test for a specific person are presented in terms of Percentiles, we have direct information about that person’s performance relative to a group.

Quartiles and Z-Score

Platykurtic, Mesokurtic, Leptokurtic

Frequency Distributions Scores: 2, 4, 3, 2, 5, 3, 6, 1, 1, 3, 5, 2, 4, 2. Σf = N = 14. P = f/N, where P = Proportion. % = P × 100.

Frequency Distributions
X   f   fX   P = f/N      % = P × 100   Cum %
6   1   6    1/14 = .07   7%
5   2
4   2
3   3
2   4
1   2

Frequency Distribution Table
X   f   fX   P = f/N      % = P × 100   Cumulative %
6   1   6    1/14 = .07   7%            7%
5   2   10   2/14 = .14   14%           21%
4   2   8    2/14 = .14   14%           35%

How do you Calculate Cumulative Percent? Add each new individual percent to the running tally of the percentages that came before it. For example, if your dataset consisted of the four numbers 100, 200, 150, 50, then their individual values, expressed as a percent of the total (in this case 500), are 20%, 40%, 30% and 10%. Each percent takes two steps: 1. Proportion: 100/500 = 0.2. 2. Percentage: 0.2 × 100 = 20%. The cumulative percent would be:
100: 20%
200: (i.e. 20% from the step before + 40%) = 60%
150: (i.e. 60% from the step before + 30%) = 90%
50: (i.e. 90% from the step before + 10%) = 100%
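The running-tally procedure for the four numbers 100, 200, 150, 50 could be sketched as follows. The variable names are illustrative; `itertools.accumulate` is simply one convenient way to keep the running total.

```python
from itertools import accumulate

values = [100, 200, 150, 50]
total = sum(values)  # 500

# Step 1 and 2: proportion of the total, then percentage.
percents = [100 * v / total for v in values]

# Running tally: each entry adds the new percent to everything before it.
cumulative = list(accumulate(percents))

for v, p, c in zip(values, percents, cumulative):
    print(f"{v}: {p:.0f}% -> cumulative {c:.0f}%")
```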

Frequency Distributions For X = 2: f = 4, N = 14, P = f/N = 4/14 = .29, % = P × 100 = 29%. For X = 3: f = 3, N = 14, P = 3/14 = .21, % = 21%. Mean from a frequency table: μ = ΣfX/Σf
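As a rough sketch, the frequency table for the 14 scores used in these slides can be built with `collections.Counter`. The tuple layout (X, f, fX, P, %) mirrors the table columns; the variable names are assumptions made for illustration.

```python
from collections import Counter

scores = [2, 4, 3, 2, 5, 3, 6, 1, 1, 3, 5, 2, 4, 2]
n = len(scores)          # sum of f = N = 14
freq = Counter(scores)   # f for each distinct X

rows = []
for x in sorted(freq, reverse=True):   # highest score first, as in the table
    f = freq[x]
    p = f / n                          # P = f / N
    rows.append((x, f, f * x, round(p, 2), round(100 * p)))

for row in rows:
    print(row)

# Mean computed from the table: sum of fX divided by sum of f.
mean_from_table = sum(f * x for x, f in freq.items()) / n
print(round(mean_from_table, 2))
```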

Measures of Central Tendency Mean (Interval or Ratio scale): The sum of the values divided by the number of values, often called the "average." μ = ΣX/N. Add all of the values together. Divide by the total number of values to obtain the mean. Example: X = 7, 12, 24, 20, 19. Mean = ????

Mean The Mean is: μ = ΣX/N = (7 + 12 + 24 + 20 + 19)/5 = 82/5 = 16.4

Median Measures of Central Tendency Median or Middle (Ordinal scale): Divides the values into two equal halves, with half of the values being lower than the median and half higher than the median. Sort the values into ascending order. If you have an odd number of values, the median is the middle value. If you have an even number of values, the median is the arithmetic mean (see above) of the two middle values. Ex: The median of the same five numbers (7, 12, 24, 20, 19) is ???.

Mode The median is 19. Mode (Nominal scale): The most frequently occurring value (or values). Calculate the frequencies for all of the values in the data. The mode is the value (or values) with the highest frequency. Example: For individuals having the following ages (18, 18, 19, 20, 20, 20, 21, and 23), the mode is ????

CHARACTERISTICS OF MODE Nominal Scale Discrete Variable Describing Shape

The Range The mode is 20. The Range: For discrete numbers (e.g., 2, 4, 7, 8, and 10), the range is the highest number − the lowest number + 1. For continuous numbers (e.g., 2, 4.6, 7.3, 8.4, and 10), the range is the difference between the upper real limit of the highest number and the lower real limit of the lowest number.
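The worked examples for the mean, median, mode, and whole-number range can be checked with Python's standard `statistics` module. This is only a sketch; the helper name `whole_number_range` is an assumption, and it covers just the "highest − lowest + 1" rule for whole numbers, not the real-limits rule for continuous scores.

```python
import statistics

values = [7, 12, 24, 20, 19]              # example used for mean and median
ages = [18, 18, 19, 20, 20, 20, 21, 23]   # example used for the mode

mean = statistics.mean(values)      # 82 / 5
median = statistics.median(values)  # middle value after sorting
mode = statistics.mode(ages)        # most frequent value


def whole_number_range(data):
    """Range for whole-number scores: highest - lowest + 1."""
    return max(data) - min(data) + 1


print(mean, median, mode, whole_number_range([2, 4, 7, 8, 10]))
```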

Variability

Variability Range, Interquartile Range, Semi-Interquartile Range, Standard Deviation, and Variance are the Measures of Variability. Variability is a measure of the dispersion or spread of scores around the mean, and has 2 purposes: 1. Describes the distribution. Next slide

Variability 2. Describes how well an individual score or group of scores represents the entire distribution (i.e. in Z Score). Ex. In inferential statistics we collect information from a small sample, then generalize the results obtained from the sample to the entire population. Next slide

Variability SS, Standard Deviations and Variances
X: 1, 2, 4, 5
Population: variance σ² = SS/N; standard deviation σ = √(SS/N)
Sample: variance s² = SS/(n − 1) = SS/df; standard deviation s = √(SS/df)
SS = Σ(X − μ)² (Definition)
SS = ΣX² − (ΣX)²/N (Computation)
SS is the Sum of Squared Deviations from the Mean. Variance (σ²) is the Mean of Squared Deviations = MS.
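A short sketch of these formulas, applied to the scores X = 1, 2, 4, 5 from the slide, shows that the definitional and computational forms of SS agree. The variable names are illustrative.

```python
import math

x = [1, 2, 4, 5]
n = len(x)
mu = sum(x) / n  # mean = 3.0

# Definitional formula: sum of squared deviations from the mean.
ss_definitional = sum((xi - mu) ** 2 for xi in x)

# Computational formula: sum of X squared minus (sum of X) squared over N.
ss_computational = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

pop_var = ss_definitional / n           # sigma squared = SS / N
pop_sd = math.sqrt(pop_var)             # sigma
sample_var = ss_definitional / (n - 1)  # s squared = SS / df
sample_sd = math.sqrt(sample_var)       # s

print(ss_definitional, ss_computational)
print(pop_var, round(pop_sd, 3), round(sample_var, 3), round(sample_sd, 3))
```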

MEASURES OF VARIABILITY Variability is the degree of dispersion/spreading of scores in a set of scores (data). Standard Deviation: roughly the average distance of each score from the mean. Variance: the mean of the squared deviations of the scores from the mean.

Suppose you earned a score of X = 54 on an exam. Which set of parameters would give you the highest grade? a. μ = 50 and σ = 2 (σ² = 4) b. μ = 50 and σ = 4 (σ² = 16) c. μ = 54 and σ = 2 (σ² = 4) d. μ = 54 and σ = 4 (σ² = 16)

Suppose you earned a score of X = 46 on an exam. Which set of parameters would give you the highest grade? a. μ = 50 and σ = 2 (σ² = 4) b. μ = 50 and σ = 4 (σ² = 16) c. μ = 54 and σ = 2 (σ² = 4) d. μ = 54 and σ = 4 (σ² = 16)
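One way to reason about these two questions is to express the score as a distance from the mean in standard-deviation units, (X − μ)/σ, for each option and pick the largest. The sketch below assumes that "highest grade" means highest relative standing; the names `options`, `relative_standing`, and `best_option` are illustrative.

```python
# Each option is a (mu, sigma) pair taken from the quiz slides.
options = {"a": (50, 2), "b": (50, 4), "c": (54, 2), "d": (54, 4)}


def relative_standing(x, mu, sigma):
    """Distance of x from the mean in standard-deviation units."""
    return (x - mu) / sigma


def best_option(x):
    """Option letter that gives score x the highest relative standing."""
    return max(options, key=lambda k: relative_standing(x, *options[k]))


for score in (54, 46):
    standings = {k: relative_standing(score, *v) for k, v in options.items()}
    print(score, standings, "->", best_option(score))
```

For a score above the mean a small σ helps, while for a score below the mean a large σ softens the distance.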

Covariance

Covariance Correlation is based on a statistic called Covariance (COVxy or Sxy): COVxy = SP/(N − 1). Correlation: r = SP/√(SSx · SSy). Covariance is a number that reflects the degree to which 2 variables vary together.
Original Data
X   Y
8   1
1   0
3   6
0   1
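A minimal sketch of the covariance formula, assuming the slide's data are read row by row as (X, Y) pairs: SP is the sum of products of deviations, and the covariance divides it by N − 1. Variable names are illustrative.

```python
x = [8, 1, 3, 0]
y = [1, 0, 6, 1]
n = len(x)

mean_x = sum(x) / n  # 3.0
mean_y = sum(y) / n  # 2.0

# SP: sum of products of deviations; it needs both sets of scores.
sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Covariance = SP / (N - 1)
cov_xy = sp / (n - 1)

print(sp, round(cov_xy, 3))
```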

Spearman Correlation Rank-order the data, then proceed.
X   Y
1   1
2   3
3   2
4   4

Ranking/Monotonic Transformation
Score   Rank position   Final Rank
3       1               1.5
3       2               1.5
5       3               3
6       4               5
6       5               5
6       6               5
12      7               7
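The ranking rule in this table, where tied scores share the mean of the rank positions they occupy, could be sketched like this. The function name `average_ranks` is an assumption made for illustration.

```python
def average_ranks(scores):
    """Rank scores from lowest to highest; tied scores share the mean
    of the rank positions they occupy."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to the last position holding the same score as position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


print(average_ranks([3, 3, 5, 6, 6, 6, 12]))
# [1.5, 1.5, 3.0, 5.0, 5.0, 5.0, 7.0]
```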

Z Scores Single score: Z = (X − μ)/σ. Sample mean (for research): Z = (M − μ)/σM, where σM = σ/√n. We use the Z score when σ is known.

Z-Scores X = σZ + μ; μ = X − σZ; σ = (X − μ)/Z. If X = 60, μ = 50, σ = 5, then Z = ?
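The slide's question (X = 60, μ = 50, σ = 5) can be worked through with a small sketch. The function names are illustrative; `raw_score` simply inverts the z formula, X = σZ + μ.

```python
def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma for a single score."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Inverse transformation: X = sigma * Z + mu."""
    return sigma * z + mu


z = z_score(60, 50, 5)
print(z)                    # 2.0
print(raw_score(z, 50, 5))  # 60.0
```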

Computations/Calculations: Collect Data and Compute Test Statistics. Z Score for a Sample: M = 115, n = 25

Z Score for Research Standard Error (σm )
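For a sample mean, the z score divides by the standard error σM = σ/√n rather than by σ. The sketch below uses M = 115 and n = 25 from the previous slide; the population values μ = 110 and σ = 20 are hypothetical, chosen only so the example runs, because the transcript does not give them.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma_M = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def z_for_sample_mean(m, mu, sigma, n):
    """Z = (M - mu) / sigma_M, used when sigma is known."""
    return (m - mu) / standard_error(sigma, n)


# M and n come from the slide; mu and sigma are HYPOTHETICAL placeholders.
m, n = 115, 25
mu, sigma = 110, 20

print(standard_error(sigma, n))           # 4.0
print(z_for_sample_mean(m, mu, sigma, n)) # 1.25
```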

*Stanines Stanines are used to compare an individual student’s achievement with the results obtained by a national reference sample chosen to represent a certain year level (e.g., 2nd level, 3rd level). A stanine is a nine-point scale used for normalized test scores, with 1-3 below average, 4-6 average, and 7-9 above average. It is a nine-point standard-score scale with a mean of 5 and an SD of 2.

The Correlational Method Correlational data can be graphed and a “line of best fit” can be drawn 1- Pearson Correlations 2-Spearman

The Correlational Method Correlation is the degree to which events or characteristics vary with each other. It measures the strength of a relationship. It does not imply cause and effect.

The Correlational Method Correlation has 3 characteristics: 1. The Form of the Relationship 2. The Direction of the Relationship 3. The strength or Consistency of the Relationship

1. The Form of the Relationship The most common use of correlation is to measure straight-line (linear form) relationship. However, other forms of relationships do exist and there are special correlations used to measure them.

2. The Direction of the Relationship Correlational data can be graphed and a “line of best fit” can be drawn

Positive correlation = variables change in the same direction

Positive Correlation

Negative correlation = variables change in the opposite direction

Negative Correlation

No Correlation Unrelated = No consistent relationship

No Correlation

The Correlational Method The magnitude (strength) of a correlation is also important. High magnitude = variables which vary closely together; points fall close to the line of best fit. Low magnitude = variables which do not vary as closely together; points are loosely scattered around the line of best fit.

3. The Strength or Consistency of the Relationship Direction and magnitude of a correlation are often calculated statistically. The result is called the “Correlation Coefficient,” symbolized by the letter “r”. Sign (+ or -) indicates direction. Number (from 0.00 to 1.00) indicates magnitude. 0.00 = no consistent relationship; +1.00 = perfect positive correlation; -1.00 = perfect negative correlation. Most correlations found in psychological research fall far short of “perfect”.

The Correlational Method Correlations can be trusted based on statistical probability. “Statistical significance” means that the finding is unlikely to have occurred by chance. By convention/agreement, if there is less than a 5% probability that findings are due to chance (p < 0.05), results are considered “significant” and thought to reflect the larger population. Generally, confidence increases with the size of the sample (n) and the magnitude of the correlation (r).

The Correlational Method Advantages of correlational studies: Have high external validity; can generalize findings; can repeat (replicate) studies on other samples. Difficulties with correlational studies: Lack internal validity; results describe but do not explain a relationship.

External & Internal Validity *External Validity: External validity addresses the ability to generalize your study to other people and other situations. *Internal Validity: Internal validity addresses the "true" causes of the outcomes that you observed in your study. Strong internal validity means that you have not only reliable measures of your independent and dependent variables but also a strong justification that causally links your independent variables (IV) to your dependent variables (DV).

The Correlational Method Pearson r = SP/√(SSx · SSy)
Original Data
X   Y
1   3
2   6
4   4
5   7
SP requires 2 sets of data. SS requires only one set of data.

The Correlational Method Spearman r = SP/√(SSx · SSy)
Original Data -> Ranks
X   Y       X   Y
1   3       1   1
2   6       2   3
4   4       3   2
5   7       4   4
SP requires 2 sets of data. SS requires only one set of data.
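Both coefficients use the same formula, r = SP/√(SSx · SSy); Spearman simply applies it to the ranks. The sketch below uses the four data pairs from these slides and the ranks as given on the slide. The function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """r = SP / sqrt(SSx * SSy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # two sets of data
    ss_x = sum((a - mean_x) ** 2 for a in x)                     # one set of data
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


# Original data from the slide.
x = [1, 2, 4, 5]
y = [3, 6, 4, 7]

# Ranks of the same data, as listed on the slide.
x_ranks = [1, 2, 3, 4]
y_ranks = [1, 3, 2, 4]

r_pearson = pearson_r(x, y)               # Pearson on the raw scores
r_spearman = pearson_r(x_ranks, y_ranks)  # Spearman: same formula on ranks

print(round(r_pearson, 2), round(r_spearman, 2))
```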

Regression and Prediction Y = bX + a (Regression Line)
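The slide gives only the form of the line, Y = bX + a. As a sketch, the usual least-squares estimates b = SP/SSx and a = My − b·Mx (standard formulas, not stated in the slides) can be applied to the four data pairs from the Pearson example; that reuse of data is an assumption made for illustration.

```python
# Data borrowed from the Pearson correlation slide (an assumption for this sketch).
x = [1, 2, 4, 5]
y = [3, 6, 4, 7]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)

slope = sp / ss_x                     # b in Y = bX + a
intercept = mean_y - slope * mean_x   # a in Y = bX + a


def predict(x_new):
    """Predicted Y on the regression line for a given X."""
    return slope * x_new + intercept


print(round(slope, 2), round(intercept, 2), round(predict(3), 2))
```

The line passes through the point of means, so predicting at X = 3 (the mean of X) returns the mean of Y.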

Three Levels of Analysis for Prediction/Validity INPUTS PROCESSES OUTCOMES Ex. Stress (INPUT) is an unpleasant psychological process (PROCESS) that occurs in response to environmental pressures (e.g., a job) and can lead to withdrawal or quitting the job (OUTCOME).

prognosis

Please read chapters 3 and 4 for next week.