Measurement, Data Collection, Validity & Reliability: Data is your friend

Agenda
Measurement
Measures (aka, ways to collect data)
Validity/reliability, up close and personal

Educational Measurement
Measurement: assignment of numbers to differentiate values of a variable
GOOD RESEARCH MUST HAVE SOUND MEASUREMENT!!
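As a concrete illustration (a minimal sketch; the coding rule and category labels are hypothetical, not from the slides), assigning numbers to the values of a categorical variable according to an explicit rule might look like:

```python
# Hypothetical rule: assign a number to each value of the variable
# "class_standing" so the categories can be tallied and compared.
CODING_RULE = {"freshman": 1, "sophomore": 2, "junior": 3, "senior": 4}

responses = ["junior", "freshman", "senior", "junior", "sophomore"]
coded = [CODING_RULE[r] for r in responses]
print(coded)  # [3, 1, 4, 3, 2]
```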

Thought Question
Consider the following scores on a test:
Marco 90, Adriane 85, Linda 75, Christy 99, Chantelle 88, Jay 45, Remi 68, Marcus 97, Chi Bo 92, Donnie 85
Which measure of central tendency would Adriane use when telling her parents about her performance?

Descriptive Statistics
Statistics: procedures that summarize and analyze quantitative data
Descriptive statistics: statistical procedures that summarize a set of numbers in terms of central tendency or variation
Important for understanding what the data tells the researcher

Descriptive Statistics: A Caution Statistics can provide us with useful information, but they can be interpreted in different ways to say different things

Thought Question If Jay scored an 85 instead of a 45, what changes? Highly deviant scores (called "outliers") have no more effect on the median than those scores very close to the middle. However, outliers can greatly affect the mean.
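As a quick check (a small Python sketch using the scores from the earlier slide), recomputing the mean and median with Jay's 45 swapped for an 85 shows the mean shifting while the median stays put:

```python
from statistics import mean, median

scores = [90, 85, 75, 99, 88, 45, 68, 97, 92, 85]    # Jay scores 45
adjusted = [s if s != 45 else 85 for s in scores]    # Jay scores 85 instead

print(mean(scores), median(scores))        # 82.4  86.5
print(mean(adjusted), median(adjusted))    # 86.4  86.5  (median unchanged)
```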

Descriptive Statistics
Frequency distributions (see Figure 6.2):
Normal - scores equally distributed around the middle
Positively skewed - a large number of low scores and a small number of high scores; the mean is pulled toward the high (positive) end
Negatively skewed - a large number of high scores and a small number of low scores; the mean is pulled toward the low (negative) end

Normal Distribution

An Extreme Example
Consider the salaries of 10 people.
Group A – All are teachers.
Salaries: $45,000, $45,000, $45,000, $50,000, $50,000, $50,000, $50,000, $55,000, $55,000, $55,000

An Extreme Example
Consider the salaries of 10 people.
Group B – Nine are teachers; 1 is Donovan McNabb.
Salaries: $45,000, $45,000, $45,000, $50,000, $50,000, $50,000, $50,000, $55,000, $55,000, $6,300,000

An Extreme Example What happens to the mean and median in these 2 examples? Does it change? What happens to the normal distribution?
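A hedged sketch of the same comparison in Python (the salary lists are taken from the two slides above):

```python
from statistics import mean, median

group_a = [45_000] * 3 + [50_000] * 4 + [55_000] * 3   # ten teachers
group_b = group_a[:-1] + [6_300_000]                   # nine teachers plus one NFL quarterback

print(mean(group_a), median(group_a))   # mean 50,000   median 50,000
print(mean(group_b), median(group_b))   # mean 674,500  median 50,000 (only the mean balloons)
```

The one extreme salary drags the mean far above anything a typical person in the group earns, while the median still describes the typical teacher; this is why the choice of central tendency matters for skewed distributions.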

Positive Skew

Negative Skew

Case in Point: Teacher Salary Compare Radnor to Philadelphia Is the salary distribution for Philadelphia going to be positively or negatively skewed? (Hint: Look at the # years of experience)

Descriptive Statistics
Variability - how different are the scores?
Types:
Range: the difference between the highest and lowest scores
Standard deviation: the average distance of the scores from the mean
The relationship to the normal distribution:
±1 SD = 68% of all scores in a distribution
±2 SD = 95% of all scores in a distribution
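A short Python sketch of range and standard deviation, again using the hypothetical test scores from the earlier slide (statistics.stdev gives the sample standard deviation; the 68%/95% figures strictly describe a normal distribution, so a ten-score sample will only roughly approximate them):

```python
from statistics import mean, stdev

scores = [90, 85, 75, 99, 88, 45, 68, 97, 92, 85]

score_range = max(scores) - min(scores)   # 99 - 45 = 54
sd = stdev(scores)                        # sample standard deviation, about 16.1

print(score_range, round(mean(scores), 1), round(sd, 1))   # 54 82.4 16.1
```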

Variability

Standard Deviation

Variability Why does variability matter?

Descriptive Statistics
Relationship - how two sets of scores relate to one another
Correlation (positive) can be described as low, moderate, or high (> .70)
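To make the idea concrete, a minimal sketch computing Pearson's r for two made-up sets of scores (statistics.correlation requires Python 3.10+; the study-time and exam-score numbers are hypothetical):

```python
from statistics import correlation  # Python 3.10+

hours_studied = [2, 4, 5, 7, 8, 10]        # hypothetical data
exam_scores   = [60, 65, 70, 78, 85, 90]   # hypothetical data

r = correlation(hours_studied, exam_scores)
print(round(r, 2))   # close to 1.0 -> a high positive correlation (> .70)
```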

Example of Correlation

Measures of Data Collection: tests, questionnaires, observations, interviews

Measures (Means of Data Collection) You must match the instrument to the research question!

Questionnaires
Thoughts on those you responded to: Approaches to Happiness, Optimism, Grit

Examples to Critique: Measures
Questionnaire – Psychological School Membership Survey, used with middle school students
Interview protocol – for teachers & counselors regarding professional development issues
Observation instrument – PDE 430 for student teachers
What are 2 benefits and 2 limitations of each measure?

Questionnaires
Used to obtain a subject’s perceptions, attitudes, beliefs, values, opinions, or other non-cognitive traits
Scales - a continuum that describes the subject’s responses to a statement: Likert, checklists, ranked items

Questionnaires: Likert Scales
Response options require the subject to determine the extent to which they agree with a statement
Debate over an odd vs. even number of response options
Statements must reflect extreme positive or extreme negative positions
Example – CATS evaluations

Questionnaires
Checklists - the respondent chooses from a set of options
Ranked items - the respondent places options in sequential order; this avoids marking everything high or low

Questionnaires
Problems with measuring non-cognitive traits:
Difficulty clearly defining what is being measured (e.g., self-concept vs. self-esteem)
Response set - responding the same way to every item (ex - all 4’s on CATS)
Social desirability - the “PC filter”
Faking - agreeing with statements because of the negative consequences associated with disagreeing

Questionnaires
Controlling these problems:
Equal numbers of positively and negatively worded statements
Alternating positive and negative statements
Providing confidentiality or anonymity to respondents
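One practical consequence of mixing positively and negatively worded statements is that the negative items must be reverse-scored before totaling. A minimal sketch, assuming a 5-point Likert scale and hypothetical item wording:

```python
SCALE_MAX = 5  # hypothetical 5-point Likert scale

# Each response is (rating 1-5, negatively_worded?).
responses = [
    (4, False),  # "I feel I belong at this school."      -> keep as 4
    (2, True),   # "I feel out of place at this school."  -> reverse to 4
    (5, False),
    (1, True),   # reverse to 5
]

def score(rating, negatively_worded):
    """Reverse-score negatively worded items so a high score always means
    agreement with the positively worded version of the statement."""
    return (SCALE_MAX + 1 - rating) if negatively_worded else rating

total = sum(score(r, neg) for r, neg in responses)
print(total)  # 4 + 4 + 5 + 5 = 18
```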

Designing Questionnaires
Online resources

Observations
Direct observations of behaviors
Provide a firsthand account (ameliorates issues of self-reporting in questionnaires)
Natural or controlled settings (ex – classroom vs. lab in child attachment studies)
Structured or unstructured observations (ex – frequency counts vs. narrative record)
Detached or involved observers

Observations: Inference
Low inference - involves little if any inference on the observer’s part (ex – on-task/off-task behavior instrument)
High inference - involves high levels of inference on the observer’s part (ex – teacher effectiveness, PDE form 430)

Observations
Controlling observer effects:
Observer bias - controlled through training, inter-rater reliability (Cronbach’s alpha), and multiple observers
Contamination - knowledge of the study influences the observation; controlled through training, targeting specific behaviors, and keeping observers “blind” to the expected outcomes and to which group is which

Observations: Observer Effects
Halo effect - initial ratings influence subsequent ratings
Hawthorne effect - increased performance results from awareness of being part of a study
Leniency - wanting everyone to do well
Central tendency - measuring in the middle
Observer drift - failing to record pertinent information

Interviews What are some challenges to doing this kind of interviewing?

Interviews: Advantages
Establish rapport & enhance motivation
Clarify responses through additional questioning
Capture the depth and richness of responses
Allow for flexibility
Reduce “no response” and/or “neutral” responses

Interviews: Disadvantages
Time consuming
Expensive
Small samples
Subjective – interviewer characteristics, contamination, bias

Validity and Reliability What’s all the fuss about?

Validity/Reliability and Trustworthiness Why do we need validity and reliability in quantitative studies and “trustworthiness” in qualitative studies? We can’t trust the results if we can’t trust the methods!

Reader’s Digest version…
Reliability: the extent to which scores are free from error; error is measured by consistency
Validity: the extent to which inferences are appropriate, meaningful, and useful - “Does the instrument measure what it is supposed to measure?”

Thought Question On the ACT and SAT assessments, there is a definitive script that test administrators are required to follow exactly. What measurement issue are the test makers addressing?

Reliability of Measurement Reliability - The extent to which measures are free from error Error is measured by consistency

Reliability of Measurement
0.00 indicates no reliability or consistency; 1.00 indicates total reliability or consistency
< .60 = weak reliability
> .80 = sufficient reliability

Reliability of Measurement
Types of reliability evidence:
Stability (i.e., test-retest) - testing the same subject using the same test on two occasions; limitation - carryover effects from the first to the second administration of the test
Equivalence (i.e., parallel forms) - testing the same subject with two parallel (i.e., equal) forms of the same test taken at the same time; limitation - difficulty in creating parallel forms
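A stability (test-retest) coefficient is typically just the correlation between the two administrations. A minimal sketch with hypothetical scores for five students tested twice (statistics.correlation requires Python 3.10+):

```python
from statistics import correlation  # Python 3.10+

time_1 = [82, 75, 91, 68, 88]   # hypothetical scores, first administration
time_2 = [80, 78, 93, 65, 90]   # same students, second administration

r = correlation(time_1, time_2)
print(round(r, 2))   # a high r (here about .98) suggests stable, consistent scores
```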

Reliability of Measurement
Equivalence and stability - testing the same subject with two forms of the same test taken at different times; limitation - difficulty in creating parallel forms

Reliability of Measurement
Internal consistency - testing the same subject with one test and “artificially” splitting the test into two halves; limitation - must have a minimum of ten (10) questions
Often reported as “Cronbach’s alpha” (ex – learning styles)
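A sketch of how Cronbach's alpha could be computed from an item-by-respondent matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the responses below are made up:

```python
from statistics import variance

# Hypothetical data: rows are respondents, columns are four Likert items (1-5).
data = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(data[0])                                    # number of items
item_vars = [variance(col) for col in zip(*data)]   # variance of each item
total_var = variance([sum(row) for row in data])    # variance of the total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))   # about 0.93 for this made-up data
```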

Reliability of Measurement
Agreement / inter-rater reliability - used with observational measures; multiple observers coding similarly
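A simple sketch of an agreement check: the proportion of observation intervals two observers coded the same way (plain percent agreement; indices such as Cohen's kappa additionally correct for chance agreement). The codes are hypothetical:

```python
# Hypothetical on-task/off-task codes from two observers over ten intervals.
observer_1 = ["on", "on", "off", "on", "off", "on", "on", "off", "on", "on"]
observer_2 = ["on", "on", "off", "on", "on",  "on", "on", "off", "on", "off"]

agreements = sum(a == b for a, b in zip(observer_1, observer_2))
print(agreements / len(observer_1))   # 0.8 -> agreement on 8 of 10 intervals
```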

Reliability of Measurement Enhancing reliability Standardized administration procedures (e.g. directions, conditions, etc.) Appropriate reading level Reasonable length of the testing period Counterbalancing the order of testing if several tests are being given

Validity of Measurement Validity: the extent to which inferences are appropriate, meaningful, and useful Current example – content tests and teacher licensure

Validity of Measurement
For research results to have any value, validity of the measurement of a variable must exist
Use of established and “new” instruments and the implications for establishing validity
Importance of establishing validity prior to data collection (e.g., pilot tests)

Validity
Content
Predictive (criterion-related)
Concurrent
Construct

Thought Question Criticisms of standardized tests like the SAT claim that they discriminate against particular groups of students (especially minorities) and do not represent a broad enough domain of knowledge to adequately assess a student’s academic potential. What issue of validity is operating in these arguments?

Thought Question Other arguments against the SAT state that the tests do not adequately estimate an individual’s ability to succeed in college. What issue of validity is operating here?

Reliability & Validity of Measurement
What is the relationship of reliability to validity?
If a watch consistently gives the time as 1:10 when actually it is 1:00, it is ____ but not ____.
______ is a necessary but not sufficient condition for _______.
To be _____, an instrument must be ______, but a ____ instrument is not necessarily _____.

Reliability & Validity of Measurement
What is the relationship of reliability to validity?
If a watch consistently gives the time as 1:10 when actually it is 1:00, it is reliable but not valid.
Reliability is a necessary but not sufficient condition for validity.
To be valid, an instrument must be reliable, but a reliable instrument is not necessarily valid.

Midterm
Multiple Choice: 50 pts
Short Answer: 25 pts
Article Critique: 25 pts
Bring the article with you to class. It’s ok to have notes on it.