How Raters Differ: A Study of Structured Oral Mathematics Assessment
| Title: | How Raters Differ: A Study of Structured Oral Mathematics Assessment |
|---|---|
| Language: | English |
| Authors: | Samuel Sollerman (ORCID 0000-0002-9676-9521) |
| Source: | Practical Assessment, Research & Evaluation. 2026 31(1). |
| Availability: | University of Massachusetts Amherst Libraries. 154 Hicks Way, Amherst, MA 01003. e-mail: pare@umass.edu; Web site: https://openpublishing.library.umass.edu/pare/ |
| Peer Reviewed: | Y |
| Page Count: | 16 |
| Publication Date: | 2026 |
| Document Type: | Journal Articles; Reports - Research |
| Education Level: | Secondary Education |
| Descriptors: | Mathematics Achievement, Student Evaluation, Foreign Countries, Verbal Tests, Mathematics Tests, Evaluation Methods, Experienced Teachers, Test Format, Secondary School Mathematics, Scoring Rubrics, Secondary School Students, Scoring, National Competency Tests, Performance Based Assessment |
| Geographic Terms: | Sweden |
| ISSN: | 1531-7714 |
| Abstract: | This study examines the nature and extent of interpretive variability in structured oral mathematics assessments. Using Swedish national test data from 74 students across three oral formats, six experienced teachers independently rated reasoning, communication, and method using shared rubrics. Multiple reliability indicators and Svensson's method were employed to distinguish systematic and unsystematic interpretive variation. Exact agreement was low across formats, with higher but still modest adjacent agreement. Relative Position effects were frequent, indicating systematic differences in rater thresholds. In contrast, the most dialogic format showed greater Relative Rank Variance, suggesting more random inconsistency. Raters reported high confidence even when statistical agreement was low, revealing a gap between perceived certainty and interpretive alignment. The analysis indicates that assessment structure and interactional demands shape both what students display and how raters apply criteria, making variability a feature of professional judgment rather than merely error. Implications include the use of calibrated exemplars, targeted calibration activities, and collaborative scoring practices to enhance reliability without sacrificing the diagnostic value of oral assessment in competency-based systems. |
| Abstractor: | As Provided |
| Entry Date: | 2026 |
| Accession Number: | EJ1495825 |
| Database: | ERIC |
| Full Text: | https://eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=EJ1495825 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1495825 |