MISCONCEPTION #2: How one performs on tests is an accurate measure of one’s learning.
- Tests designed to compare student performance are being misused to assess learning.
- All test scores contain error.
- Tests vary in their validity, or how well they measure what they are supposed to.
- No big decision should ever be made on the basis of a single test score.
From: Research Brief #1: “Testing Today in Context”, CReATE, February 2012
Standards-referenced assessments should be measuring the performance of a student in relation to curriculum standards, not in relation to the performance of other students. Using norm-referenced scores for these purposes would render it mathematically impossible for all students to achieve success, and a large proportion of our students would be guaranteed to be held back at each testing threshold. After all, 50% of students will always score below the 50th percentile, even if all students achieve unprecedented mastery of the material.
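The percentile arithmetic in that passage can be checked with a small sketch. The ten scores, the mastery cutoff of 80, and the function names below are hypothetical and chosen only for illustration; they do not come from the brief.

```python
def percentile_rank(score, all_scores):
    """Percent of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def share_meeting_standard(all_scores, cutoff):
    """Standards-referenced view: fraction of students at or above a fixed cutoff."""
    return sum(1 for s in all_scores if s >= cutoff) / len(all_scores)


def share_below_percentile(all_scores, percentile):
    """Norm-referenced view: fraction of students ranked below a given percentile."""
    ranked_below = sum(
        1 for s in all_scores if percentile_rank(s, all_scores) < percentile
    )
    return ranked_below / len(all_scores)


# Hypothetical class in which every student scores 90 or higher out of 100.
class_scores = [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

# Assumed mastery cutoff of 80: every student meets the standard.
print(share_meeting_standard(class_scores, cutoff=80))

# Yet half of the same students still rank below the 50th percentile.
print(share_below_percentile(class_scores, percentile=50))
```

Under these assumptions every student clears the fixed standard, while a rule tied to the 50th percentile would still flag half of them.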
…the fact that the test was developed to be norm referenced means that each item is tailored for half correct/half incorrect responses, and countless items that better assessed the standards were rejected. Some scholars posit that the variance of responses in norm-referenced test items is only achieved by requiring knowledge that is not part of the standards, information that students gain through their out-of-school experience and is not learned in the classroom—no matter how hard the teacher and pupils work.
…Another fundamental principle in psychometrics is reliability, which is most simply defined as “the consistency with which a test measures whatever it’s measuring.” Test developers work very hard to ensure that their products are reliable. However, it is an undisputed fact in the measurement field that no test is completely reliable. In other words, “some degree of inconsistency is present in all measurement procedures.” Psychometricians conceptualize this as the difference between a test taker’s true score (a perfectly accurate reflection of ability, necessarily hypothetical given the limitations of measurement) and their attained score on a test. The difference is called “error of measurement,” and it is a component of every test score.
Because measurement experts understand the limitations of tests that have just been described, there is universal agreement in the field on a caveat for their use: No big decision should ever be made based on a single test score. This principle is espoused by test developers, academic bodies and professional associations alike, including the National Research Council and the National Council on Measurement in Education. Attaching high stakes to any single test breaks one of the most fundamental rules of psychometrics.
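To see why measurement error matters for a single high-stakes decision, here is a rough simulation of the idea that an attained score is a true score plus error. Everything specific in it is an assumption made for illustration and is not taken from the brief: the normally distributed error, the true score of 72, the error spread of 3 points, the passing cutoff of 70, and the function names.

```python
import random


def simulate_attained_scores(true_score, error_sd, n_sittings, seed=0):
    """Attained score = true score + random error, one value per sitting.

    Assumes the error is normally distributed around zero, which is a
    simplification used only for this illustration.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_sittings)]


def share_failing(attained_scores, cutoff):
    """Fraction of sittings in which the attained score falls below the cutoff."""
    return sum(1 for s in attained_scores if s < cutoff) / len(attained_scores)


# Hypothetical student whose true score (72) is above the passing cutoff (70).
attained = simulate_attained_scores(true_score=72.0, error_sd=3.0, n_sittings=10_000)

fail_share = share_failing(attained, cutoff=70.0)
print(f"Share of sittings below the cutoff: {fail_share:.0%}")
```

Under these assumed numbers, a student whose true score is above the cutoff still falls below it in a noticeable share of sittings, which is the risk of tying a big decision to one score.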
IN FACT, a recent National Bureau of Economic Research study, “The Effect of High School Exit Exams on Graduation, Employment, Wages and Incarceration,” found a link between high school exit exams and an increase in incarceration.