Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes

Bibliographic Details
Title: Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes
Language: English
Authors: Joshua B. Gilbert, Zachary Himmelsbach, Luke W. Miratrix, Andrew D. Ho, Benjamin W. Domingue
Source: Grantee Submission. 2026.
Peer Reviewed: Y
Page Count: 63
Publication Date: 2026
Sponsoring Agency: Institute of Education Sciences (ED)
Contract Number: R305D240025
Document Type: Reports - Research
Education Level: Elementary Secondary Education
Descriptors: Value Added Models, Reliability, Comparative Analysis, Effect Size, Generalizability Theory, Educational Policy, Accountability, Equations (Mathematics), Simulation
DOI: 10.3102/10769986251393339
Abstract: Value added models (VAMs) attempt to estimate the causal effects of teachers and schools on student test scores. We apply Generalizability Theory to show how estimated VA effects depend upon the selection of test items. Standard VAMs estimate causal effects on the items that are included on the test. Generalizability demands consideration of how estimates would differ had the test included alternative items. We introduce a model that estimates the magnitude of item-by-teacher/school variance accurately, revealing that standard VAMs can overstate reliability and overestimate differences between units. Using 16 academic outcomes from 8 studies with item-level data, we show how standard VAMs overstate reliability by a median of 0.04 on the 0-1 reliability scale (mean = 0.09, SD = 0.10) and provide standard deviations of teacher/school effects that are a median of 3% too large (mean = 12%, SD = 23% points). We discuss how imprecision due to heterogeneous VA effects across items attenuates effect sizes, complicates comparisons across studies, and contributes to temporal instability, though these effects are reduced when the number of items is high. Our results suggest that accurate estimation and interpretation of VAMs may be improved using item-level data, including qualitative data about how items represent the content domain. [This paper was published in "Journal of Educational and Behavioral Statistics" 2025.]
Abstractor: As Provided
Notes: https://doi.org/10.7910/DVN/89YITQ
IES Funded: Yes
Entry Date: 2026
Accession Number: ED679453
Database: ERIC
Permalink: https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=ED679453
Published: 2026-04-02 (Grantee Submission)