
srcML-DKT: Enhancing Deep Knowledge Tracing with Robust Code Representations from srcML


Bibliographic Details
Title:	srcML-DKT: Enhancing Deep Knowledge Tracing with Robust Code Representations from srcML
Language:	English
Authors:	Maciej Pankiewicz, Yang Shi, Ryan S. Baker
Source:	International Educational Data Mining Society. 2025.
Availability:	International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Peer Reviewed:	Y
Page Count:	8
Publication Date:	2025
Document Type:	Speeches/Meeting Papers; Reports - Research
Education Level:	Higher Education; Postsecondary Education
Descriptors:	Algorithms, Artificial Intelligence, Models, Intelligent Tutoring Systems, Prediction, Programming, Computer Science Education, College Students, Foreign Countries, Educational Technology
Geographic Terms:	Europe
Abstract:	Knowledge Tracing (KT) models predicting student performance in intelligent tutoring systems have been successfully deployed in several educational domains. However, their usage in open-ended programming problems poses multiple challenges due to the complexity of the programming code and a complex interplay between syntax and logic requirements embedded in code development. As a result, traditional Bayesian Knowledge Tracing (BKT) and more advanced Deep Knowledge Tracing (DKT) approaches that use binary correctness data find limited use. Code-DKT [26] is a knowledge tracing approach that uses recurrent neural networks to model learning progress, leveraging information extracted from the student-generated code and incorporating abstract syntax tree (AST)-based code features, but its reliance on parsable code limits its effectiveness; unparsable submissions may constitute a substantial part of code submitted for evaluation within platforms for automated assessment of programming assignments. To overcome the limitations of ASTs, we propose srcML-DKT, an extension of Code-DKT that utilizes srcML-based code representations, enabling feature extraction from both parsable and unparsable code. By capturing syntactic and structural details directly from the code text, srcML-DKT enables including all student code submissions, regardless of syntax errors. Empirical evaluations on a dataset of 610 students and six programming tasks focused on conditional statements demonstrate that srcML-DKT consistently outperforms both Code-DKT and traditional DKT models, achieving higher AUC and F1-scores across first and all attempts. These results highlight the model's ability to track student knowledge progression more accurately in environments where trial-and-error learning is common. [For the complete proceedings, see ED675583.]
Abstractor:	As Provided
Entry Date:	2025
Accession Number:	ED675671
Database:	ERIC

