Generating In-Context, Personalized Feedback for Intelligent Tutors with Large Language Models

Bibliographic Details
Title: Generating In-Context, Personalized Feedback for Intelligent Tutors with Large Language Models
Language: English
Authors: Jennifer M. Reddig, Arav Arora, Christopher J. MacLellan
Source: International Journal of Artificial Intelligence in Education. 2025 35(6):3459-3500.
Availability: Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
Peer Reviewed: Y
Page Count: 42
Publication Date: 2025
Sponsoring Agency: National Science Foundation (NSF)
Contract Number: 2112532
Document Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Descriptors: Intelligent Tutoring Systems, Artificial Intelligence, Feedback (Response), Error Correction, Accuracy, Evaluation, College Mathematics, Algebra, Models
DOI: 10.1007/s40593-025-00505-6
ISSN: 1560-4292 (print); 1560-4306 (electronic)
Abstract: This study explores how large language models (LLMs), specifically GPT-4, could be used to generate personalized feedback within an Intelligent Tutoring System (ITS). The research focuses on evaluating the model's ability to (1) diagnose student errors, (2) generate personalized corrective feedback, and (3) assess the accuracy of diagnoses and helpfulness of the feedback. We analyze student errors from the Apprentice Tutor College Algebra ITS and prompt GPT-4 to give targeted feedback on those errors. The findings suggest that while this model can effectively diagnose a range of student errors, its feedback varies in effectiveness based on the complexity of the problem and the type of error. While GPT-4 generates relevant, specific feedback a majority of the time, 35% of the hints were too general, incorrect, or gave away the correct answer. The study also explores methods for using an LLM to automatically evaluate the validity of generated feedback, and finds that only 35% of feedback passes automated helpfulness evaluations.
Abstractor: As Provided
Entry Date: 2026
Accession Number: EJ1500144
Database: ERIC
Permalink: https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1500144