Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems
| Title: | Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems |
|---|---|
| Language: | English |
| Authors: | Thompson, W. Jake (ORCID |
| Source: | Journal of Educational Measurement. Fall 2023 60(3):455-475. |
| Availability: | Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us |
| Peer Reviewed: | Y |
| Page Count: | 21 |
| Publication Date: | 2023 |
| Sponsoring Agency: | Office of Special Education Programs (OSEP) (ED/OSERS) |
| Contract Number: | 84373X100001 |
| Document Type: | Journal Articles; Reports - Research |
| Descriptors: | Diagnostic Tests, Simulation, Test Reliability, Accuracy, Language Proficiency, English, Evaluation Methods |
| DOI: | 10.1111/jedm.12359 |
| ISSN: | 0022-0655; 1745-3984 |
| Abstract: | As diagnostic classification models become more widely used in large-scale operational assessments, we must give consideration to the methods for estimating and reporting reliability. Researchers must explore alternatives to traditional reliability methods that are consistent with the design, scoring, and reporting levels of diagnostic assessment systems. In this article, we describe and evaluate a method for simulating retests to summarize reliability evidence at multiple reporting levels. We evaluate how the performance of reliability estimates from simulated retests compares to other measures of classification consistency and accuracy for diagnostic assessments that have previously been described in the literature, but which limit the level at which reliability can be reported. Overall, the findings show that reliability estimates from simulated retests are an accurate measure of reliability and are consistent with other measures of reliability for diagnostic assessments. We then apply this method to real data from the Examination for the Certificate of Proficiency in English to demonstrate the method in practice and compare reliability estimates from observed data. Finally, we discuss implications for the field and possible next directions. |
| Abstractor: | As Provided |
| Entry Date: | 2023 |
| Accession Number: | EJ1391123 |
| Database: | ERIC |