Sensitivity of the RMSD for Detecting Item-Level Misfit in Low-Performing Countries
| Title: | Sensitivity of the RMSD for Detecting Item-Level Misfit in Low-Performing Countries |
|---|---|
| Language: | English |
| Authors: | Tijmstra, Jesper (ORCID |
| Source: | Journal of Educational Measurement. Win 2020 57(4):566-583. |
| Availability: | Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us |
| Peer Reviewed: | Y |
| Page Count: | 18 |
| Publication Date: | 2020 |
| Document Type: | Journal Articles; Reports - Descriptive |
| Education Level: | Secondary Education |
| Descriptors: | Test Items, Goodness of Fit, Probability, Accuracy, International Assessment, Item Response Theory, Error of Measurement, Item Analysis, Simulation, Achievement Tests, Foreign Countries, Secondary School Students |
| Assessment and Survey Identifiers: | Program for International Student Assessment |
| DOI: | 10.1111/jedm.12263 |
| ISSN: | 0022-0655 |
| Abstract: | Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered countries. Specifically, items for which most respondents in a country have a very low (or high) probability of providing a correct answer will rarely be flagged by the RMSD as showing misfit, even if very strong DIF is present. With many international large-scale assessment initiatives moving toward covering a more heterogeneous group of countries, this raises issues for the ability of the RMSD to detect item-level misfit, especially in low-performing countries that are not well-aligned with the overall difficulty level of the test. This may put one at risk of incorrectly assuming measurement invariance to hold, and may also inflate estimated between-country difference in proficiency. The degree to which the RMSD is able to detect DIF in low-performing countries is studied using both an empirical example from PISA 2015 and a simulation study. |
| Abstractor: | As Provided |
| Entry Date: | 2020 |
| Accession Number: | EJ1277428 |
| Database: | ERIC |
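
The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a 2PL item response function, a normal proficiency distribution within each country, DIF modeled as a shift in item difficulty, and the RMSD computed as the density-weighted root-mean squared difference between the country-specific and international item characteristic curves. All function names, parameter values, and the 0.1 flagging cutoff are hypothetical choices made for illustration.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution, used as the country's proficiency density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def rmsd(mu, sigma, a, b_international, b_country, n_points=601, width=6.0):
    """Density-weighted RMSD between the country and international ICCs.

    The integral over proficiency is approximated on an equally spaced grid
    covering mu +/- width * sigma, with weights normalized to sum to one.
    """
    lo = mu - width * sigma
    step = 2.0 * width * sigma / (n_points - 1)
    num = 0.0
    den = 0.0
    for i in range(n_points):
        theta = lo + i * step
        w = normal_pdf(theta, mu, sigma)
        diff = icc_2pl(theta, a, b_country) - icc_2pl(theta, a, b_international)
        num += w * diff * diff
        den += w
    return math.sqrt(num / den)


# Same amount of DIF in both scenarios: the item is one logit harder
# in the country than in the international calibration (hypothetical value).
dif_shift = 1.0

# Country whose proficiency distribution is centered on the item difficulty.
aligned = rmsd(mu=0.0, sigma=1.0, a=1.0,
               b_international=0.0, b_country=0.0 + dif_shift)

# Low-performing country facing an item far above its proficiency level.
low_performing = rmsd(mu=-2.0, sigma=1.0, a=1.0,
                      b_international=2.0, b_country=2.0 + dif_shift)

print(f"RMSD, well-aligned country:   {aligned:.3f}")
print(f"RMSD, low-performing country: {low_performing:.3f}")
```

Under these assumptions the same difficulty shift gives a much smaller RMSD in the low-performing country, because both curves are close to zero over the range where most of its respondents lie. With the illustrative cutoff of 0.1, the item would be flagged in the well-aligned country but not in the low-performing one.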