Human vs. Machine Marking: A Comparative Study of Chemistry Assessments.

Bibliographic Details
Title: Human vs. Machine Marking: A Comparative Study of Chemistry Assessments.
Authors: Ade-Ibijola, Abejide (1) (AUTHOR) abejide@jbs.ac.za; Chikezie, Ijeoma Joy (2) (AUTHOR) drijeomajchikezie@gmail.com; Oyelere, Solomon Sunday (1,3) (AUTHOR) s.oyelere@exeter.ac.uk
Source: Journal of Science Education & Technology. Dec2025, Vol. 34 Issue 6, p1430-1440. 11p.
Subject Terms: *Artificial intelligence, *Comparative studies, *Evaluation methodology, *Students, *Educational evaluation, Chemical testing
Geographic Terms: Nigeria
Abstract: Artificial intelligence (AI) has transformed educational assessment through automated marking, which improves efficiency and objectivity, provides immediate feedback, and helps identify patterns in students' responses. This paper compared human expert marking with machine marking in a chemistry class, using a comparative research design. The participants comprised 30 Senior Secondary Two (SS2) students and two chemistry experts from the National Institute for Nigerian Languages (NDSS), Abia State, Nigeria; the students were randomly drawn from 98 students offering chemistry. A set of three chemistry short answer questions (SAQs) adopted from NECOSSCE past examination papers was used for data collection. Students' responses were marked by the two human chemistry experts and by ChatGPT, all using the same marking guide. Pearson product moment correlation (PPMC) was employed to evaluate the relationship between the scores assigned by the human experts and those assigned by ChatGPT. The results revealed a substantial correlation between the two human experts (r = 0.75), while the correlations between the human experts and ChatGPT were lower (r = 0.56 and 0.57, respectively). Most score differences between the human experts and ChatGPT were within one point; larger discrepancies occurred less frequently. Item-by-item analyses indicated that ChatGPT's scores were within an acceptable range of the human expert scores, although ChatGPT's marking exhibited some inconsistencies, particularly in assessing more complex SAQs. Among other recommendations, the study suggests combining human and machine marking to enhance assessment practices in secondary school chemistry, leveraging the strengths of both methods. [ABSTRACT FROM AUTHOR]
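Note: The abstract names two agreement measures, the Pearson product moment correlation between markers and the share of scores that differ by at most one point. The following is a minimal sketch of how such measures could be computed, not the authors' procedure. The function names `pearson_r` and `share_within` are hypothetical, and the score lists are invented for illustration; they are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each marker's mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a marker gives constant scores")
    return cov / sqrt(var_x * var_y)


def share_within(x, y, tolerance=1):
    """Fraction of responses where the two markers differ by at most `tolerance` points."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)


# Hypothetical scores for six responses; NOT data from the study.
expert_a = [3, 5, 2, 4, 1, 5]
machine = [2, 5, 3, 4, 3, 4]

r = pearson_r(expert_a, machine)
agreement = share_within(expert_a, machine)
print(f"r = {r:.2f}, within one point = {agreement:.0%}")
```

Reporting both figures is useful because a correlation can be moderate even when most individual scores are close, which matches the pattern the abstract describes.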
Copyright of Journal of Science Education & Technology is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Education Research Complete
FullText Links:
  – Type: pdflink
Text:
  Availability: 1
Header DbId: ehh
DbLabel: Education Research Complete
An: 189796759
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 0
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Human vs. Machine Marking: A Comparative Study of Chemistry Assessments.
– Name: Author
  Label: Authors
  Group: Au
  Data: Ade-Ibijola, Abejide (1) (AUTHOR) abejide@jbs.ac.za; Chikezie, Ijeoma Joy (2) (AUTHOR) drijeomajchikezie@gmail.com; Oyelere, Solomon Sunday (1,3) (AUTHOR) s.oyelere@exeter.ac.uk
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Journal of Science Education & Technology. Dec2025, Vol. 34 Issue 6, p1430-1440. 11p.
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: *Artificial intelligence, *Comparative studies, *Evaluation methodology, *Students, *Educational evaluation, Chemical testing
– Name: SubjectGeographic
  Label: Geographic Terms
  Group: Su
  Data: Nigeria
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=ehh&AN=189796759
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1007/s10956-025-10223-2
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 11
        StartPage: 1430
    Subjects:
      – SubjectFull: Artificial intelligence
        Type: general
      – SubjectFull: Comparative studies
        Type: general
      – SubjectFull: Evaluation methodology
        Type: general
      – SubjectFull: Students
        Type: general
      – SubjectFull: Educational evaluation
        Type: general
      – SubjectFull: Chemical testing
        Type: general
      – SubjectFull: Nigeria
        Type: general
    Titles:
      – TitleFull: Human vs. Machine Marking: A Comparative Study of Chemistry Assessments.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Ade-Ibijola, Abejide
      – PersonEntity:
          Name:
            NameFull: Chikezie, Ijeoma Joy
      – PersonEntity:
          Name:
            NameFull: Oyelere, Solomon Sunday
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 12
              Text: Dec2025
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 10590145
          Numbering:
            – Type: volume
              Value: 34
            – Type: issue
              Value: 6
          Titles:
            – TitleFull: Journal of Science Education & Technology
              Type: main
ResultId 1