CorreGram: Using Corpus Data to Develop Student-Adaptable Automated Corrective Feedback for the L2 Spanish Language Classroom

Bibliographic Details
Title: CorreGram: Using Corpus Data to Develop Student-Adaptable Automated Corrective Feedback for the L2 Spanish Language Classroom
Language: English
Authors: Samuel S. Davidson
Source: ProQuest LLC. 2024. Ph.D. Dissertation, University of California, Davis.
Availability: ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Peer Reviewed: N
Page Count: 162
Publication Date: 2024
Document Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Higher Education, Postsecondary Education
Descriptors: Computer Assisted Testing, Automation, Student Evaluation, Feedback (Response), Second Language Learning, Spanish, Error Correction, College Students, Computer Software, Artificial Intelligence, Grammar
Geographic Terms: California
ISBN: 979-83-8448-361-8
Abstract: Automated corrective feedback (ACF), in which a computer system helps language learners identify and correct errors in their writing or speech, is considered an important tool for language instruction by many researchers. Such systems allow learners to correct their own mistakes, thereby reducing teacher workload and potentially preventing issues related to grammatical error fossilization. Research in this area has led to the development and widespread adoption of tools such as Grammarly for English learners. However, research in grammatical error correction (GEC) and other forms of ACF in languages other than English has been much more limited. This dearth of research is in part due to the large demand for English instruction, but is also driven by the limited training data available for non-English languages. However, a new corpus of learner Spanish collected at UC Davis, COWS-L2H, provided me with an opportunity to explore development of ACF for students studying Spanish. In my dissertation work, I explore the error patterns present in writing by students of Spanish in COWS-L2H, and use this information to inform a novel data augmentation technique to generate synthetic data for training language models capable of correcting learner errors in Spanish text. I then use this synthetic data, along with learner data from COWS-L2H, to train an AI-based GEC model for Spanish learners that is adaptable to learner L1 and proficiency level. Finally, I explore how this automatically corrected writing can be used to present feedback to learners in a pedagogically motivated way. To that end, I combine the GEC model trained using data from COWS-L2H with hand-written templates and feedback produced by generative LLMs to craft appropriate feedback for learners using the system. 
The end goal is a grammar checker that can not only explain why something a student wrote is potentially incorrect but also guide the student to make the correction themselves. I demonstrate this novel system, CorreGram, and further discuss details of its implementation and proposals for how the system may be effectively utilized in the language classroom. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Abstractor: As Provided
Entry Date: 2024
Access URL: https://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:31561972
Accession Number: ED663221
Database: ERIC