TEST SCORES INTERPRETATION

- is the process of assigning meaning and usefulness to the scores obtained from classroom tests.
- This is necessary because a raw score from a test, standing on its own, rarely has meaning.

CRITERION-REFERENCED INTERPRETATION

- is the interpretation of a test raw score based on converting the raw score into a description of the specific tasks that the learner can perform.
- a score is given meaning by comparing it with a standard of performance that was set before the test is given.
- It permits the description of a learner's test performance without referring to the performance of others. This is essentially done in terms of some universally understood measure of proficiency, such as speed, precision or the percentage-correct score in some clearly defined domain of learning tasks.

Examples of criterion-referenced interpretation (a worked sketch follows the list):
· Types 60 words per minute without error (speed).
· Measures the room temperature to within ±0.1 degree of accuracy (precision).
· Defines 75% of the elementary concepts-of-electricity items correctly (percentage-correct score).
· Passes a driving test, where learner drivers are measured against a range of explicit criteria (e.g. not endangering other road users).
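
The percentage-correct example can be made concrete. Here is a minimal Python sketch; the 75% mastery standard and the item counts are invented for illustration, and the point is that the judgement depends only on the preset standard, never on other learners' scores.

```python
# Minimal sketch of a criterion-referenced judgement. The standard and the
# item counts are invented; the standard is fixed before the test is given,
# and no other learner's score is consulted.
CRITERION = 75.0  # mastery standard (percentage correct), set before the test

def percentage_correct(items_correct: int, items_total: int) -> float:
    """Percentage-correct score over a clearly defined domain of tasks."""
    return 100.0 * items_correct / items_total

score = percentage_correct(18, 24)  # 18 of 24 electricity items -> 75.0%
verdict = "mastered" if score >= CRITERION else "not yet mastered"
print(f"{score:.1f}% correct -> {verdict}")
```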

NORM-REFERENCED INTERPRETATION

- is the interpretation of a raw score based on converting the raw score into some type of derived score that indicates the learner's relative position in a clearly defined reference group.
- This type of interpretation reveals how a learner compares with other learners who have taken the same test.
- Example: an IQ score, which places a person relative to a norm group (e.g. the label "genius" for scores far above the mean).
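
A common pair of derived scores is the z score and the percentile rank. The sketch below, using an invented norm group, shows how one raw score is re-expressed as a relative position.

```python
# Hypothetical sketch: converting a raw score into derived scores (z score,
# percentile rank) that locate a learner within a norm group. The norm-group
# scores are invented for illustration.
from statistics import mean, pstdev

norm_group = [52, 47, 61, 55, 49, 58, 44, 63, 50, 56]
raw = 58

z = (raw - mean(norm_group)) / pstdev(norm_group)
percentile = 100 * sum(s <= raw for s in norm_group) / len(norm_group)

print(f"raw = {raw}, z = {z:+.2f}, percentile rank = {percentile:.0f}")
```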

Reliability of a test - the degree to which a test is consistent, stable, dependable or trustworthy in measuring whatever it is measuring.
- How far can we rely on the results from a test? How dependable are the scores from the test? How consistent are the items of the test in measuring whatever it is measuring?
- Reliability seeks to answer: if the ability of a set of testees is determined by testing them at two different times with the same test, by using two parallel forms of the same test, or by having the same test scored by two different examiners, will the relative standing of the testees on each pair of scores remain the same?
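
That question can be checked empirically: administer the test twice (or use parallel forms, or two markers) and correlate the two sets of scores. A sketch with invented scores:

```python
# Sketch of a test-retest reliability check: the same (invented) testees sit
# the same test twice, and the reliability coefficient is the correlation
# between the two administrations. A value near +1.00 means the relative
# standing of the testees was preserved.
from scipy.stats import pearsonr

first_sitting  = [35, 42, 28, 50, 44, 39, 31, 47]
second_sitting = [37, 40, 30, 49, 45, 38, 33, 46]

r, _ = pearsonr(first_sitting, second_sitting)
print(f"test-retest reliability r = {r:.2f}")
```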

 Validity - the most important quality you have to consider when constructing or selecting a test.
 Refers to the meaningfulness or appropriateness of the interpretations to be made from test scores and other evaluation results.
 Validity is therefore a measure of the degree to which a test measures what it is intended to measure.
 While reliability is necessary, it alone is not sufficient: for a test to be valid it must also be reliable, but a reliable test is not necessarily valid.

Validation involves:
· Determining the extent to which performance on a test represents the level of knowledge of the subject-matter content the test was designed to measure (content validity).
· Determining the extent to which performance on a test represents the amount of the trait being measured that is possessed by the examinee (construct validity).
· Determining the extent to which performance on a test predicts an examinee's probable performance on a criterion task (criterion validity).

Face Validation is a quick method of establishing the content validity of a test after its preparation. This is done by presenting the test to subject experts in the field for their expert opinion on how well the test "looks like" it measures what it is supposed to measure. This process is referred to as face validity. It is a subjective evaluation, based on a superficial examination of the items, of the extent to which a test measures what it was intended to measure.

 A correlation coefficient expresses the degree of relationship between two sets of scores as a number ranging from +1.00 to -1.00.
 A perfect positive correlation is indicated by a coefficient of +1.00 and a perfect negative correlation by a coefficient of -1.00.
 The larger the coefficient (positive or negative), the higher the degree of relationship expressed.

There are two common methods of computing a correlation coefficient:
1. Spearman Rank-Difference Correlation - used when the number of scores to be correlated is small (less than 30).
2. Pearson Product-Moment Correlation - used when the number of scores is large.
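
Both coefficients are available in SciPy. The sketch below computes each on the same pair of invented score lists:

```python
# Sketch computing the two coefficients named above on invented paired scores.
from scipy.stats import pearsonr, spearmanr

scores_x = [12, 19, 25, 30, 33, 41, 47, 52]
scores_y = [14, 18, 27, 25, 35, 39, 50, 46]

r_pearson, _ = pearsonr(scores_x, scores_y)      # product-moment (larger samples)
rho_spearman, _ = spearmanr(scores_x, scores_y)  # rank-difference (small samples)

print(f"Pearson r = {r_pearson:.2f}, Spearman rho = {rho_spearman:.2f}")
```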

 One way to measure the internal consistency (reliability) of test items is Cronbach's Alpha.
 The higher the correlation among the items, the greater the alpha. High correlations imply that high (or low) scores on one question are associated with high (or low) scores on the other questions.
 Alpha can vary from 0 to 1, with 1 indicating that the test is perfectly reliable.
 Recomputing Cronbach's Alpha with a particular item removed is a good measure of that item's contribution to the performance of the test as a whole.

Cronbach's α interpretation (George and Mallery, 2003):
α > .9 – Excellent
α > .8 – Good
α > .7 – Acceptable
α > .6 – Questionable
α > .5 – Poor
α < .5 – Unacceptable
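
Alpha can be computed directly from the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below uses invented 5-point ratings and also prints alpha-if-item-deleted, the item-contribution measure described above.

```python
# Minimal sketch of Cronbach's alpha and alpha-if-item-deleted. Rows are
# respondents, columns are items; the 5-point ratings are invented.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

ratings = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
])

print(f"alpha = {cronbach_alpha(ratings):.2f}")
# Dropping each item in turn shows that item's contribution to the scale:
for i in range(ratings.shape[1]):
    reduced = np.delete(ratings, i, axis=1)
    print(f"alpha if item {i + 1} deleted = {cronbach_alpha(reduced):.2f}")
```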

Table 1. Multi-item statements measuring students' satisfaction with their graduate program

Use Excel and SPSS to compute Cronbach's α and interpret your results.

1. Open Excel and encode the answers, then save the file. (Note: when the data are taken into SPSS, in Data View the columns represent variables and the rows represent cases (observations); in Variable View, each row is a variable and each column is an attribute associated with that variable.)
2. Open SPSS and open the saved Excel data file.
3. Click Analyze → Scale → Reliability Analysis; under Statistics, tick the inter-item correlations, then run the analysis to obtain Cronbach's α.
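
As a cross-check on the SPSS output, the same spreadsheet can be read and analysed in Python. The filename below is a placeholder, and the alpha function repeats the formula from the earlier sketch.

```python
# Hedged alternative to the SPSS steps: read the same Excel sheet (columns =
# items, rows = respondents, mirroring SPSS Data View) and compute alpha.
# "responses.xlsx" is a placeholder filename; reading .xlsx requires openpyxl.
import numpy as np
import pandas as pd

def cronbach_alpha(items: np.ndarray) -> float:
    """Same formula as the earlier sketch: k/(k-1) * (1 - sum var_i / var_total)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

data = pd.read_excel("responses.xlsx")
print(data.corr())  # inter-item correlation matrix, as requested in SPSS
print(f"alpha = {cronbach_alpha(data.to_numpy()):.2f}")
```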