Do Test Scores Misrepresent Test Results? An Item-by-Item Analysis. Discussion Paper #2025.13

Bibliographic Details
Title: Do Test Scores Misrepresent Test Results? An Item-by-Item Analysis. Discussion Paper #2025.13
Language: English
Authors: Jesse Bruhn, Michael Gilraine, Jens Ludwig, Sendhil Mullainathan, Massachusetts Institute of Technology (MIT), Blueprint Labs, National Bureau of Economic Research (NBER)
Source: Blueprint Labs. 2025.
Availability: Blueprint Labs. 30 Wadsworth Street, Cambridge, MA 02142. e-mail: contact@mitblueprintlabs.org; Web site: https://blueprintlabs.mit.edu/
Peer Reviewed: N
Page Count: 85
Publication Date: 2025
Document Type: Reports - Research
Descriptors: Testing, Tests, Scores, Test Results, Response Style (Tests), Data Use, Testing Problems, Item Analysis, Item Response Theory, Data, Data Interpretation
Geographic Terms: Texas
Abstract: Much of the data collected in education is effectively thrown away. Students answer individual test questions, but administrators and researchers only see aggregate performance. All the item-level data are lost. Ex ante it is not clear this destroys much useful information, since the aggregate might be a sufficient statistic. Using data from Texas for 5 million students and 1.31 billion student-item responses, the researchers show that in fact aggregation does destroy a great deal of valuable information in education: (1) Even conditional on a summary test measure, there is additional information in the item-level data; (2) This additional information is relevant for the student outcomes that education decisions seek to optimize; and (3) This information can be made practically useful for schools.
Abstractor: As Provided
Entry Date: 2026
Access URL: https://blueprintlabs.mit.edu/research/do-test-scores-misrepresent-test-results-an-item-by-item-analysis/
Accession Number: ED678493
Database: ERIC