1 Reliability and Validity checks S-005

2 Checking on reliability of the data we collect
• Compare over time (test-retest)
• Item analysis
• Internal consistency
• Inter-rater agreement

3 Compare over time: Test-Retest reliability
• One sample at two (or more) times
• Very convincing in theory
• Often hard to do in practice
• Time interval? Memory effects? Special sample?
• Correlation of time-1 answers with time-2 answers
• Other approaches are often approximations of this idea
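The correlation of time-1 answers with time-2 answers can be sketched roughly as follows. The scores are hypothetical, made up for illustration, and `pearson_r` is a helper name introduced here, not something from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scale scores for the same five respondents at two times.
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]

# A value near 1 suggests respondents kept their relative positions.
test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 3))
```

The same respondents must appear in the same order in both lists; otherwise the correlation is meaningless.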

4 Split-half reliability
• Easier than test-retest checks
• Requires only one time point
• Works when there is a scale or set of questions on a single topic
• Divide the items into two sets (two halves)
• Correlate the scores on the two halves
• Often adjusted by the Spearman-Brown correction
• Gives us an estimate of the test-retest reliability
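A minimal sketch of the split-half steps, assuming an odd/even split of the items (one common choice; the slides do not say how to split) and hypothetical 1-5 ratings. The Spearman-Brown correction for two halves is 2r / (1 + r).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Step the half-test correlation up to a full-length estimate."""
    return 2 * r_half / (1 + r_half)

# Hypothetical data: one row per respondent, one column per item.
responses = [
    [4, 5, 4, 5, 3, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 1],
]

# Score each half: items 1, 3, 5 versus items 2, 4, 6.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

r_half = pearson_r(odd_half, even_half)
reliability = spearman_brown(r_half)
print(round(r_half, 3), round(reliability, 3))
```

Different ways of splitting the items can give somewhat different estimates, which is one reason coefficient alpha is often reported instead.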

5 Item-analysis approaches
When there is a set of questions about a single topic
• Examine the answers to each item
• How many answered correctly? Percent correct? Item difficulty.
• Or, if there is no “correct” answer, look at how the answers were distributed
• Agree / neutral / disagree
• Examine the “wrong” answers that are chosen
• Find items that are too hard or too easy
• Or those that have little variability (too boring? too trivial?)
• Do you really need these?
• Sometimes these are very important
• Test publishers tend to delete the “easy” and “hard” items
• Correlate the “item responses” with the “total responses”
• High correlations indicate consistency
• Low correlations indicate “different” or “weak” items
• Negative correlations indicate “something interesting”
• Confusing wording? The item doesn’t belong?
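The two main checks on this slide, item difficulty and item-total correlation, might look like this in a rough sketch. The 0/1 answers are hypothetical, and the variable names are chosen here for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; None if either list has no variability."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # e.g. an item everyone answered the same way
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(var_x * var_y)

# Hypothetical scored answers: 1 = correct, 0 = incorrect.
# One row per respondent, one column per item.
answers = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]

totals = [sum(row) for row in answers]
difficulty = []    # proportion answering each item correctly
item_total_r = []  # correlation of each item with the total score

for j in range(len(answers[0])):
    item = [row[j] for row in answers]
    difficulty.append(sum(item) / len(item))
    item_total_r.append(pearson_r(item, totals))

for j, (p, r) in enumerate(zip(difficulty, item_total_r), start=1):
    print(f"item {j}: proportion correct = {p:.2f}, item-total r = {r}")
```

In this made-up data the first item has no variability (everyone got it right), so its correlation is undefined, and the last item correlates negatively with the total, which is the kind of item worth a second look. Note that the total here includes the item itself; a stricter variant correlates each item with the total of the remaining items.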

6 Internal consistency reliability
When there is a “scale” or set of questions on a single topic
• Cronbach’s coefficient alpha
• a measure of “internal consistency”
• Look at all of the items
• Check the “average correlation”
• Then adjust for the number of items
• Find items that do not correlate with others
• Check the item-total correlations
• If low, delete these or move them elsewhere
• Assess the overall internal consistency
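As a sketch under stated assumptions, coefficient alpha can be computed two ways. The "average correlation, adjusted for the number of items" description on this slide corresponds to the standardized form, k·r̄ / (1 + (k − 1)·r̄); the more commonly reported form uses item and total-score variances. The ratings are hypothetical and the function names are introduced here for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def cronbach_alpha(rows):
    """Variance form: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose to one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def standardized_alpha(rows):
    """Average inter-item correlation, adjusted for the number of items."""
    k = len(rows[0])
    items = list(zip(*rows))
    pairs = list(combinations(items, 2))
    mean_r = sum(pearson_r(a, b) for a, b in pairs) / len(pairs)
    return k * mean_r / (1 + (k - 1) * mean_r)

# Hypothetical 1-5 ratings: one row per respondent, one column per item.
responses = [
    [4, 5, 4, 5, 3, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 1],
]

alpha_raw = cronbach_alpha(responses)
alpha_std = standardized_alpha(responses)
print(round(alpha_raw, 3), round(alpha_std, 3))
```

The two versions usually agree closely when the items have similar variances. Both assume every item has some variability; an item that everyone answers identically would need to be dropped before computing the standardized form.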

7 Internal consistency reliability
Comparing answers from different sources
• Compare similar questions that appear in different parts of the questionnaire
• Compare answers from different places during an interview
• Compare interview responses with questionnaire responses
• Compare questionnaires with actual observations

8 Inter-rater agreement
• Useful in checking on coding open-ended answers, observations, etc.
• Try this on a sample or pilot study
• Check the overall percent agreement
• Sometimes we adjust for “chance agreement” (Cohen’s kappa)
• A very important step in lots of studies
• If agreement is high, then okay to rely on one primary coder or rater
• If not high, then perhaps we need more than one rater
• Or perhaps we need to revise or clarify the coding rules
• Then check on things again
• There are often several iterations here
• Keep going until the agreement is acceptable
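Percent agreement and Cohen's kappa for two raters can be sketched as below. The codes are hypothetical, and the function names are chosen here for illustration. Kappa is (observed agreement − chance agreement) / (1 − chance agreement), where chance agreement comes from how often each rater uses each category.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of cases where the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, adjusted for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: product of each rater's category proportions.
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    if p_chance == 1:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes assigned by two raters to the same ten answers.
rater_a = ["A", "A", "B", "B", "A", "C", "C", "B", "A", "C"]
rater_b = ["A", "A", "B", "A", "A", "C", "B", "B", "A", "C"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(round(agreement, 2), round(kappa, 2))
```

Kappa is always at or below the raw percent agreement, because some of the observed agreement would be expected by chance alone.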

9 Check out some examples
• Bayley Scales of Infant Development
• Inter-rater agreement example
• Internal consistency example
• Then try some clicker questions!

10 Observing students and teachers in classrooms. What type of reliability check is most important?
1. Inter-observer agreement (have more than one observer)
2. Time 1 - Time 2 (observe at two or more times)
3. Consistency within the classroom sessions
4. Other

11 Coding transcripts from individual interviews. What type of reliability check is most helpful?
1. Have multiple transcribers
2. Inter-rater agreement
3. Internal consistency checks
4. Other

12 Using answers from questionnaires. What type of reliability check is most important?
1. Inter-rater agreement
2. Internal consistency checks
3. Item-analysis checks
4. Other

13 Using a mix of open-ended and closed-ended questions on a questionnaire. Why is this a good idea?
1. Internal consistency checks
2. Makes replying less boring
3. Terry has said this about 50 times, so it must be a good idea
4. Other
5. All of the above
