srcML-DKT: Enhancing Deep Knowledge Tracing with Robust Code Representations from srcML

Bibliographic Details
Title: srcML-DKT: Enhancing Deep Knowledge Tracing with Robust Code Representations from srcML
Language: English
Authors: Maciej Pankiewicz, Yang Shi, Ryan S. Baker
Source: International Educational Data Mining Society. 2025.
Availability: International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Peer Reviewed: Y
Page Count: 8
Publication Date: 2025
Document Type: Speeches/Meeting Papers; Reports - Research
Education Level: Higher Education; Postsecondary Education
Descriptors: Algorithms, Artificial Intelligence, Models, Intelligent Tutoring Systems, Prediction, Programming, Computer Science Education, College Students, Foreign Countries, Educational Technology
Geographic Terms: Europe
Abstract: Knowledge Tracing (KT) models predicting student performance in intelligent tutoring systems have been successfully deployed in several educational domains. However, their usage in open-ended programming problems poses multiple challenges due to the complexity of the programming code and a complex interplay between syntax and logic requirements embedded in code development. As a result, traditional Bayesian Knowledge Tracing (BKT) and more advanced Deep Knowledge Tracing (DKT) approaches that use binary correctness data find limited use. Code-DKT [26] is a knowledge tracing approach that uses recurrent neural networks to model learning progress, leveraging information extracted from the student-generated code and incorporating abstract syntax tree (AST)-based code features, but its reliance on parsable code limits its effectiveness; unparsable submissions may constitute a substantial part of code submitted for evaluation within platforms for automated assessment of programming assignments. To overcome the limitations of ASTs, we propose srcML-DKT, an extension of Code-DKT that utilizes srcML-based code representations, enabling feature extraction from both parsable and unparsable code. By capturing syntactic and structural details directly from the code text, srcML-DKT enables including all student code submissions, regardless of syntax errors. Empirical evaluations on a dataset of 610 students and six programming tasks focused on conditional statements demonstrate that srcML-DKT consistently outperforms both Code-DKT and traditional DKT models, achieving higher AUC and F1-scores across first and all attempts. These results highlight the model's ability to track student knowledge progression more accurately in environments where trial-and-error learning is common. [For the complete proceedings, see ED675583.]
Abstractor: As Provided
Entry Date: 2025
Accession Number: ED675671
Database: ERIC