The Knowledge Component Attribution Problem for Programming: Methods and Tradeoffs with Limited Labeled Data
| Title: | The Knowledge Component Attribution Problem for Programming: Methods and Tradeoffs with Limited Labeled Data |
|---|---|
| Language: | English |
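| Authors: | Yang Shi (ORCID 0000-0001-6486-4340); Robin Schmucker; Keith Tran; John Bacher; Kenneth Koedinger; Thomas Price (ORCID 0000-0001-9375-2292); Min Chi; Tiffany Barnes |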
| Source: | Journal of Educational Data Mining. 2024 16(1):1-33. |
| Availability: | International Educational Data Mining. e-mail: jedm.editor@gmail.com; Web site: https://jedm.educationaldatamining.org/index.php/JEDM |
| Peer Reviewed: | Y |
| Page Count: | 33 |
| Publication Date: | 2024 |
| Sponsoring Agency: | National Science Foundation (NSF) |
| Contract Number: | 2013502; 2112635 |
| Document Type: | Journal Articles; Reports - Research |
| Education Level: | Higher Education; Postsecondary Education |
| Descriptors: | Programming Languages, Undergraduate Students, Learning Processes, Teaching Models, Information Transfer, Data Collection, Data Use, Program Design, Cognitive Structures, Cognitive Processes |
| Geographic Terms: | Virginia |
| ISSN: | 2157-2100 |
| Abstract: | Understanding students' learning of knowledge components (KCs) is an important educational data mining task and enables many educational applications. However, in the domain of computing education, where program exercises require students to practice many KCs simultaneously, it is a challenge to attribute their errors to specific KCs and, therefore, to model student knowledge of these KCs. In this paper, we define this task as the KC attribution problem. We first demonstrate a novel approach to addressing this task using deep neural networks and explore its performance in identifying expert-defined KCs (RQ1). Because the labeling process takes costly expert resources, we further evaluate the effectiveness of transfer learning for KC attribution, using more easily acquired labels, such as problem correctness (RQ2). Finally, because prior research indicates the incorporation of educational theory in deep learning models could potentially enhance model performance, we investigated how to incorporate learning curves in the model design and evaluated their performance (RQ3). Our results show that in a supervised learning scenario, we can use a deep learning model, code2vec, to attribute KCs with a relatively high performance (AUC > 75% in two of the three examined KCs). Further using transfer learning, we achieve reasonable performance on the task without any costly expert labeling. However, the incorporation of learning curves shows limited effectiveness in this task. Our research lays important groundwork for personalized feedback for students based on which KCs they applied correctly, as well as more interpretable and accurate student models. |
| Abstractor: | As Provided |
| Entry Date: | 2024 |
| Accession Number: | EJ1430503 |
| Database: | ERIC |
| Full Text: | ERIC Full Text: https://eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=EJ1430503 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1430503 |
| Contributors: | Yang Shi (ORCID 0000-0001-6486-4340); Robin Schmucker; Keith Tran; John Bacher; Kenneth Koedinger; Thomas Price (ORCID 0000-0001-9375-2292); Min Chi; Tiffany Barnes |