AI-Powered Scoring for Creative Thinking: Methods and Challenges in PISA Assessment

Bibliographic Details
Title: AI-Powered Scoring for Creative Thinking: Methods and Challenges in PISA Assessment
Language: English
Authors: Ricardo Primi (ORCID 0000-0003-4227-6745), Roger E. Beaty (ORCID 0000-0001-6114-5973), Mathias Benedek (ORCID 0000-0001-6258-4476), Denis Dumas (ORCID 0000-0002-8446-4720), Peter Organisciak (ORCID 0000-0002-9058-2280), John D. Patterson (ORCID 0000-0002-7455-3535), Tiago Calico (ORCID 0000-0003-3080-343X), Mario Piacentini (ORCID 0000-0001-8624-2833)
Source: Journal of Creative Behavior. 2026 60(1).
Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Peer Reviewed: Y
Page Count: 12
Publication Date: 2026
Document Type: Journal Articles; Reports - Research
Education Level: Secondary Education
Descriptors: Artificial Intelligence, Computer Assisted Testing, Scoring, Creativity Tests, Creative Thinking, Natural Language Processing, Automation, Semantics, Prompting, Inferences, Psychometrics, Item Response Theory, Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Assessment and Survey Identifiers: Program for International Student Assessment
DOI: 10.1002/jocb.70082
ISSN: 0022-0175 (print); 2162-6057 (electronic)
Abstract: The introduction of the PISA 2022 Creative Thinking assessment underscores the growing need for scalable, valid, and reliable methods to evaluate creativity in international large-scale assessments. Traditional human scoring, while nuanced, is time-consuming, costly, and subject to inconsistencies. This paper explores recent advances in artificial intelligence (AI) and natural language processing (NLP)--particularly transformer-based large language models (LLMs)--as promising alternatives for automated scoring. We review three methodological approaches: (1) unsupervised methods using semantic distance, (2) supervised fine-tuning with labeled data, and (3) few-/zero-shot learning using prompt-based inference. Empirical findings from verbal and visual creative tasks show that AI-based scoring systems can approximate human ratings with substantial accuracy (r = 0.70-0.85), even across different languages and task formats. A case study using the PISA Book Covers task demonstrates convergence between AI and human scores, with reliability levels comparable to traditional scoring. However, key challenges remain, particularly regarding cross-cultural comparability, bias mitigation, and interpretability. We discuss psychometric strategies (e.g., Many-Facet Rasch Models) to model these issues and propose future directions, including scoring of distinct creativity dimensions and developing transparent, open-source platforms. If rigorously validated, AI-based scoring offers a feasible and equitable path forward for assessing creativity globally.
Abstractor: As Provided
Entry Date: 2026
Accession Number: EJ1500530
Database: ERIC
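
Illustrative note on approach (1) in the abstract: semantic distance is commonly computed as one minus the cosine similarity between an embedding of the task prompt and an embedding of the response. The sketch below is not taken from the article. It assumes that definition, and the three-dimensional vectors and function names are hypothetical stand-ins for real model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_u * norm_v)


def semantic_distance(prompt_vec, response_vec):
    """Assumed definition: 1 - cosine similarity (higher means more remote)."""
    return 1.0 - cosine_similarity(prompt_vec, response_vec)


# Hypothetical embeddings, invented for illustration only.
prompt = [1.0, 0.0, 0.0]
common_response = [0.9, 0.1, 0.0]   # close to the prompt in meaning
remote_response = [0.1, 0.9, 0.4]   # far from the prompt in meaning

print(semantic_distance(prompt, common_response))
print(semantic_distance(prompt, remote_response))
```

Under this assumed definition, the remote response receives the larger distance, which is the property unsupervised scoring relies on when it treats semantic remoteness as a proxy for originality.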
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1500530
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1002/jocb.70082
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 12
    Subjects:
      – SubjectFull: Artificial Intelligence
        Type: general
      – SubjectFull: Computer Assisted Testing
        Type: general
      – SubjectFull: Scoring
        Type: general
      – SubjectFull: Creativity Tests
        Type: general
      – SubjectFull: Creative Thinking
        Type: general
      – SubjectFull: Natural Language Processing
        Type: general
      – SubjectFull: Automation
        Type: general
      – SubjectFull: Semantics
        Type: general
      – SubjectFull: Prompting
        Type: general
      – SubjectFull: Inferences
        Type: general
      – SubjectFull: Psychometrics
        Type: general
      – SubjectFull: Item Response Theory
        Type: general
      – SubjectFull: Achievement Tests
        Type: general
      – SubjectFull: Foreign Countries
        Type: general
      – SubjectFull: International Assessment
        Type: general
      – SubjectFull: Secondary School Students
        Type: general
      – SubjectFull: Program for International Student Assessment
        Type: general
    Titles:
      – TitleFull: AI-Powered Scoring for Creative Thinking: Methods and Challenges in PISA Assessment
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Ricardo Primi
      – PersonEntity:
          Name:
            NameFull: Roger E. Beaty
      – PersonEntity:
          Name:
            NameFull: Mathias Benedek
      – PersonEntity:
          Name:
            NameFull: Denis Dumas
      – PersonEntity:
          Name:
            NameFull: Peter Organisciak
      – PersonEntity:
          Name:
            NameFull: John D. Patterson
      – PersonEntity:
          Name:
            NameFull: Tiago Calico
      – PersonEntity:
          Name:
            NameFull: Mario Piacentini
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 03
              Type: published
              Y: 2026
          Identifiers:
            – Type: issn-print
              Value: 0022-0175
            – Type: issn-electronic
              Value: 2162-6057
          Numbering:
            – Type: volume
              Value: 60
            – Type: issue
              Value: 1
          Titles:
            – TitleFull: Journal of Creative Behavior
              Type: main