Classification of Open-Ended Responses to a Research-Based Assessment Using Natural Language Processing

Bibliographic Details
Title: Classification of Open-Ended Responses to a Research-Based Assessment Using Natural Language Processing
Language: English
Authors: Wilson, Joseph (ORCID 0000-0003-2111-507X); Pollard, Benjamin (ORCID 0000-0002-5109-6415); Aiken, John M. (ORCID 0000-0003-0717-4583); Lewandowski, H. J.
Source: Physical Review Physics Education Research. Jan-Jun 2022 18(1).
Availability: American Physical Society. One Physics Ellipse 4th Floor, College Park, MD 20740-3844. Tel: 301-209-3200; Fax: 301-209-0865; e-mail: assocpub@aps.org; Web site: http://prst-per.aps.org
Peer Reviewed: Y
Page Count: 16
Publication Date: 2022
Sponsoring Agency: National Science Foundation (NSF)
Contract Number: PHY1734006
Document Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Descriptors: Natural Language Processing, Science Education, Physics, Artificial Intelligence, Models, Data Analysis, Classification, Student Reaction, Test Format, College Students
Geographic Terms: Colorado (Boulder)
DOI: 10.1103/PhysRevPhysEducRes.18.010141
ISSN: 2469-9896
Abstract: Surveys have long been used in physics education research to understand student reasoning and inform course improvements. However, to make analysis of large sets of responses practical, most surveys use a closed-response format with a small set of potential responses. Open-ended formats, such as written free response, can provide deeper insights into student thinking, but take much longer to analyze, especially with a large number of responses. Here, we explore natural language processing as a computational solution to this problem. We create a machine learning model that can take student responses from the Physics Measurement Questionnaire as input, and output a categorization of student reasoning based on different reasoning paradigms. Our model yields classifications with the same level of agreement as that between two humans categorizing the data, but can be done by a computer, and thus can be scaled for large datasets. In this work, we describe the algorithms and methodologies used to create, train, and test our natural language processing system. We also present the results of the analysis and discuss the utility of these approaches for analyzing open-response data in education research.
Abstractor: As Provided
Entry Date: 2022
Accession Number: EJ1355094
Database: ERIC