Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems

Bibliographic Details
Title: Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems
Language: English
Authors: Thompson, W. Jake (ORCID 0000-0001-7339-0300), Nash, Brooke (ORCID 0000-0001-9858-7062), Clark, Amy K. (ORCID 0000-0002-5804-8336), Hoover, Jeffrey C. (ORCID 0000-0002-0276-0308)
Source: Journal of Educational Measurement. Fall 2023 60(3):455-475.
Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Peer Reviewed: Y
Page Count: 21
Publication Date: 2023
Sponsoring Agency: Office of Special Education Programs (OSEP) (ED/OSERS)
Contract Number: 84373X100001
Document Type: Journal Articles; Reports - Research
Descriptors: Diagnostic Tests, Simulation, Test Reliability, Accuracy, Language Proficiency, English, Evaluation Methods
DOI: 10.1111/jedm.12359
ISSN: 0022-0655; 1745-3984
Abstract: As diagnostic classification models become more widely used in large-scale operational assessments, we must give consideration to the methods for estimating and reporting reliability. Researchers must explore alternatives to traditional reliability methods that are consistent with the design, scoring, and reporting levels of diagnostic assessment systems. In this article, we describe and evaluate a method for simulating retests to summarize reliability evidence at multiple reporting levels. We evaluate how the performance of reliability estimates from simulated retests compares to other measures of classification consistency and accuracy for diagnostic assessments that have previously been described in the literature, but which limit the level at which reliability can be reported. Overall, the findings show that reliability estimates from simulated retests are an accurate measure of reliability and are consistent with other measures of reliability for diagnostic assessments. We then apply this method to real data from the Examination for the Certificate of Proficiency in English to demonstrate the method in practice and compare reliability estimates from observed data. Finally, we discuss implications for the field and possible next directions.
Abstractor: As Provided
Entry Date: 2023
Accession Number: EJ1391123
Database: ERIC