Using Automated Procedures to Score Educational Essays Written in Three Languages

Bibliographic Details
Title: Using Automated Procedures to Score Educational Essays Written in Three Languages
Language: English
Authors: Tahereh Firoozi (ORCID 0000-0002-6947-0516), Hamid Mohammadi, Mark J. Gierl
Source: Journal of Educational Measurement. 2025 62(1):33-56.
Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Peer Reviewed: Y
Page Count: 24
Publication Date: 2025
Document Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Descriptors: College Students, Slavic Languages, German, Italian, Multilingual Materials, Computer Assisted Testing, Test Validity, Test Reliability, Interrater Reliability, Comparative Testing, Writing Assignments, Evaluation Methods
DOI: 10.1111/jedm.12406
ISSN: 0022-0655; 1745-3984
Abstract: The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system: multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were holistically scored using the Common European Framework of Reference for Languages. The AES system with mBERT produced results that were consistent with human raters overall across all three language groups. The system also produced accurate predictions for some but not all of the score levels within each language. The AES system with LaBSE produced results that were even more consistent with the human raters overall across all three language groups compared to mBERT. In addition, the system produced accurate predictions for the majority of the score levels within each language. The performance differences between mBERT and LaBSE can be explained by considering how each language embedding model is implemented. Implications of this study for educational testing are also discussed.
Abstractor: As Provided
Entry Date: 2025
Accession Number: EJ1463684
Database: ERIC