The Theory of Sampling and Measurement

Sampling The first step in implementing any research design is to create a sample. We cannot study the theoretical population of all conceivable events (e.g., events that have not occurred), nor can we usually study all instances of actual events. We select some instances to study and not others; those we include are our sample. How the sample is selected is critical for external validity, or generalizability.

Groups in Sampling
Who do you want to generalize to? The theoretical population.
What population can you get access to? The study population.
How can you get access to them? The sampling frame.
Who is in your study? The sample.

Types of Samples
Probability sampling: simple random, stratified random, cluster or area random.
Non-probability sampling: accidental, modal instance, expert, snowball, case study (intentional selection).
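The two probability schemes above can be sketched in a few lines of Python. This is an illustrative sketch only: the sampling frame, the strata, and the sample sizes are invented for the example.

```python
# Sketch of simple random vs. stratified random sampling (invented data).
import random

random.seed(42)

# A hypothetical sampling frame of 1,000 people, 70% urban and 30% rural.
frame = [{"id": i, "stratum": "urban" if i < 700 else "rural"} for i in range(1000)]

# Simple random sample: every unit has an equal chance of selection.
srs = random.sample(frame, 100)

# Stratified random sample: sample within each stratum proportionally,
# so the sample mirrors the strata composition exactly.
urban = [p for p in frame if p["stratum"] == "urban"]
rural = [p for p in frame if p["stratum"] == "rural"]
stratified = random.sample(urban, 70) + random.sample(rural, 30)

print(len(srs), len(stratified))                          # 100 100
print(sum(p["stratum"] == "urban" for p in stratified))   # exactly 70
```

The simple random sample will only be urban/rural 70/30 on average; the stratified sample guarantees it by construction.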

The Sampling Distribution
The sampling distribution is the distribution of a statistic (e.g., the average) across an infinite number of samples drawn from the same population.
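A quick simulation makes the idea concrete: draw many samples from one population and look at the distribution of the sample means. The population parameters here are invented for illustration.

```python
# Simulate a sampling distribution of the mean (invented population).
import random
import statistics

random.seed(0)
population = [random.gauss(50, 10) for _ in range(100_000)]

# One statistic (the mean) per sample, across many samples.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(2_000)
]

# The mean of the sampling distribution is close to the population mean,
# and its spread (the standard error) is close to sigma / sqrt(n) = 10 / 10 = 1.
print(round(statistics.mean(sample_means), 1))
print(round(statistics.stdev(sample_means), 1))
```

With only 2,000 samples this is an approximation of the theoretical (infinite-sample) distribution, but the pattern is already clear.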

Population Parameter
[Plot: frequency distribution of self-esteem in the population.] The population has a mean of 3.75 and a standard deviation of .25. This means about 68% of cases fall between 3.5 and 4.0, about 95% of cases fall between 3.25 and 4.25, and about 99% of cases fall between 3.0 and 4.5.

Sampling Distribution
[Plot: frequency distribution of sample means of self-esteem.] The sampling distribution has a mean of 3.75 and a standard error of .25.

Inferring Population from Sample
[Plot: frequency distribution of self-esteem in the sample.] The sample has a mean of 3.75 and a standard error of .25. This means there is a 68% chance the true population mean falls between 3.5 and 4.0, a 95% chance it falls between 3.25 and 4.25, and a 99% chance it falls between 3.0 and 4.5.
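The interval arithmetic on this slide is just mean ± z × standard error, with z of roughly 1, 2, and 3 for the three confidence levels:

```python
# Normal-theory intervals from the slide: mean ± z * SE.
mean, se = 3.75, 0.25

intervals = {
    "68%": (mean - 1 * se, mean + 1 * se),   # ± 1 standard error
    "95%": (mean - 2 * se, mean + 2 * se),   # ± 2 standard errors (more precisely 1.96)
    "99%": (mean - 3 * se, mean + 3 * se),   # ± 3 standard errors (more precisely 2.58)
}
for level, (lo, hi) in intervals.items():
    print(level, lo, hi)
# 68% 3.5 4.0
# 95% 3.25 4.25
# 99% 3.0 4.5
```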

[Figure 3.4: Labor Repression and Growth in the Asian Cases]

[Figure 3.5: Labor Repression and Growth in the Full Universe of Developing Countries]

Measurement Operationalization is the process of translating theoretical constructs into observable indicators. Construct validity and reliability are the criteria we use to evaluate how well we have operationalized our concepts. Both matter regardless of the level of measurement and whether we are using qualitative or quantitative indicators.

The Hierarchy of Levels
Nominal: attributes are only named (weakest).
Ordinal: attributes can be ordered.
Interval: distance between attributes is meaningful.
Ratio: has an absolute zero.

Nominal Measurement The values "name" the attribute uniquely. The name does not imply any ordering of the cases.

Ordinal Measurement Attributes can be rank-ordered, but the distances between attributes do not have any meaning.

Interval Measurement The distance between attributes has meaning: for temperature (in Fahrenheit), the distance from 30-40°F is the same as the distance from 70-80°F. Note that ratios don't make any sense -- 80°F is not twice as hot as 40°F.

Ratio Measurement Has an absolute zero that is meaningful, so we can construct a meaningful ratio (fraction), for example, number of clients in the past six months.

Construct Validity The key problem is that we have abstract theoretical constructs – power, democracy, development, corruption, etc. – that we can never observe directly. Yet testing propositions requires that we have some indicator for the construct, or at least proxies that we can argue capture some of its attributes. Our indicator is an analogy (to an analogy).

Assessing Construct Validity
Translation validity
  Face validity: plausible on its "face"
  Content validity: matches lists of attributes
Criterion-related validity
  Predictive validity: predicts accurately
  Concurrent validity: distinguishes appropriately between groups
  Convergent validity
  Discriminant validity

The Convergent Principle Alternative measures of a construct should be strongly correlated.

How It Works: Theory
[Diagram: a self-esteem construct with Items 1-4.] You theorize that the items all reflect self-esteem.

How It Works: Observation
[Diagram: strong correlations among the four items.] The correlations provide evidence that the items all converge on the same construct.

Convergent Validity in Measures of "Democracy"
[Table: 1985 correlation matrix among polity2, pollib, civlib, and reg.]

Convergent Validity in Measures of "Education"
[Table: 1985 correlations among education spending, illiteracy (%), cohort reaching grade 4, % grade school, % secondary school, and % college.]

The Discriminant Principle Measures of different constructs should not correlate highly with each other.

How It Works: Theory
[Diagram: a self-esteem construct with items SE1 and SE2, and a locus-of-control construct with items LOC1 and LOC2.] You theorize that you have two distinguishable constructs.

How It Works: Observation
[Diagram: the two constructs and their items.] The cross-construct correlations are low: r(SE1, LOC1) = .12, r(SE1, LOC2) = .09, r(SE2, LOC1) = .04, r(SE2, LOC2) = .11. The correlations provide evidence that the items on the two tests discriminate.

Theory
[Diagram: a self-esteem construct with items SE1-SE3 and a locus-of-control construct with items LOC1-LOC3.] We have two constructs we want to measure: self-esteem and locus of control. For each construct, we develop three scale items; our theory is that items within a construct will converge and items across constructs will discriminate.

Theory and Observation
[Diagram: the full 6×6 correlation matrix of SE1-SE3 and LOC1-LOC3. The within-construct (convergent) correlations are highlighted in green and red; the cross-construct (discriminant) correlations in yellow.]

Theory and Observation
[Diagram: the observed correlation matrix.] The correlations support both convergence and discrimination, and therefore construct validity.
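The convergent/discriminant pattern can be reproduced with simulated data: two independent latent traits, each measured by noisy items. All the data-generating numbers here are invented; the point is the pattern of correlations, not the particular values.

```python
# Simulated convergent and discriminant correlations (invented parameters).
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

random.seed(1)
n = 500
self_esteem = [random.gauss(0, 1) for _ in range(n)]
locus = [random.gauss(0, 1) for _ in range(n)]  # independent of self-esteem

def items(trait):
    # Each item = latent trait + its own measurement error.
    return [[t + random.gauss(0, 0.5) for t in trait] for _ in range(3)]

se_items, loc_items = items(self_esteem), items(locus)

convergent = pearson(se_items[0], se_items[1])     # same construct: high
discriminant = pearson(se_items[0], loc_items[0])  # different constructs: near 0
print(round(convergent, 2), round(discriminant, 2))
```

Items sharing a latent trait correlate strongly (here the theoretical value is 1/1.25 = .8), while items from independent traits hover near zero, mirroring the matrix on the slide.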

What Is Reliability? The "repeatability," "consistency," or "dependability" of a measure.

True Score Theory
Consider a rating sheet with five items:
1. Manage time effectively.
2. Manage resources effectively.
3. Scan a multitude of information and decide what is important.
4. Decide how to manage multiple tasks.
5. Organize the work when directions are not specific.
True score theory holds that the observed score is the sum of true ability and random error: X = T + e.

The Error Component
In X = T + e, the error has two components: random error (e_r) and systematic error (e_s).

The Revised True Score Model
X = T + e_r + e_s
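The revised model X = T + e_r + e_s can be simulated to show what the next two slides illustrate: random error spreads scores without shifting the mean, while systematic error shifts the mean (a bias). The numeric parameters are invented for the example.

```python
# Simulate X = T + e_r + e_s with invented parameters.
import random
import statistics

random.seed(7)
n = 10_000
true_scores = [random.gauss(100, 15) for _ in range(n)]

random_error = [random.gauss(0, 5) for _ in range(n)]  # mean-zero noise
systematic_error = 4.0                                 # constant bias

observed = [t + e + systematic_error for t, e in zip(true_scores, random_error)]

print(round(statistics.mean(true_scores)))  # close to 100
print(round(statistics.mean(observed)))     # close to 104: the bias shifts the average
print(round(statistics.stdev(observed)))    # above 15: random error adds spread
```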

Random Error
[Plot: the distribution of X with and without random error.] Notice that random error doesn't affect the average, only the variability around the average.

Systematic Error
[Plot: the distribution of X with and without systematic error.] Notice that systematic error does affect the average; we call this a bias.

If a Measure Is Reliable... We should see that a person's scores on the same test given twice (X1 and X2) are similar, assuming the trait being measured isn't changing.

If a Measure Is Reliable... Recall from true score theory that X1 = T + e1 and X2 = T + e2. But if the scores are similar, why are they similar?

If a Measure Is Reliable... The only thing common to the two measures is the true score, T. Therefore, the true score must determine the reliability.

Reliability Is... a Ratio
reliability = var(T) / var(X), the variance of the true scores over the variance of the measure.

Reliability Is... a Ratio
We can measure the variance of the observed score, var(X), but never the variance of the true scores, var(T). The greater the share of that observed variance due to error, the less reliable the measure.
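In a simulation we can do what real measurement never allows: observe the true scores directly and compute the ratio var(T)/var(X). The variances chosen here are invented to make the theoretical reliability come out to .8.

```python
# Reliability as var(T) / var(X), computed in a simulation where the
# true scores are visible (invented parameters).
import random
import statistics

random.seed(3)
n = 20_000
T = [random.gauss(0, 2) for _ in range(n)]   # true-score variance = 4
X = [t + random.gauss(0, 1) for t in T]      # plus error variance = 1

reliability = statistics.variance(T) / statistics.variance(X)
print(round(reliability, 2))  # close to 4 / (4 + 1) = 0.8
```

With real data, T is unobservable, which is exactly why the next slide says reliability can only be estimated.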

This Leads Us to... We cannot calculate reliability exactly; we can only estimate it. Each estimate attempts to capture the consequences of the true score in a different way.

We want both Reliability and Validity

Reliability and Validity
[Target diagrams:] A measure can be reliable but not valid, valid but not reliable, neither reliable nor valid, or both reliable and valid.

Assignment #1 Assess the validity and reliability of the IRIS-3 International Country Risk Guide. You can examine a single instance, compare instances, analyze the full variation in the dataset, compare with additional measures, or use any other form of assessment. You may use outside sources of data, history, or analysis (but document them). The only restriction is that the paper must be empirical and examine issues of validity and reliability. 3-5 pages; be concise. Due Monday 10/24 at the beginning of class.