Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes

Bibliographic Details
Title:	Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes
Language:	English
Authors:	Joshua B. Gilbert, Zachary Himmelsbach, Luke W. Miratrix, Andrew D. Ho, Benjamin W. Domingue
Source:	Grantee Submission. 2026.
Peer Reviewed:	Y
Page Count:	63
Publication Date:	2026
Sponsoring Agency:	Institute of Education Sciences (ED)
Contract Number:	R305D240025
Document Type:	Reports - Research
Education Level:	Elementary Secondary Education
Descriptors:	Value Added Models, Reliability, Comparative Analysis, Effect Size, Generalizability Theory, Educational Policy, Accountability, Equations (Mathematics), Simulation
DOI:	10.3102/10769986251393339
Abstract:	Value added models (VAMs) attempt to estimate the causal effects of teachers and schools on student test scores. We apply Generalizability Theory to show how estimated VA effects depend upon the selection of test items. Standard VAMs estimate causal effects on the items that are included on the test. Generalizability demands consideration of how estimates would differ had the test included alternative items. We introduce a model that estimates the magnitude of item-by-teacher/school variance accurately, revealing that standard VAMs can overstate reliability and overestimate differences between units. Using 16 academic outcomes from 8 studies with item-level data, we show how standard VAMs overstate reliability by a median of 0.04 on the 0-1 reliability scale (mean = 0.09, SD = 0.10) and provide standard deviations of teacher/school effects that are a median of 3% too large (mean = 12%, SD = 23% points). We discuss how imprecision due to heterogeneous VA effects across items attenuates effect sizes, complicates comparisons across studies, and contributes to temporal instability, though these effects are reduced when the number of items is high. Our results suggest that accurate estimation and interpretation of VAMs may be improved using item-level data, including qualitative data about how items represent the content domain. [This paper was published in "Journal of Educational and Behavioral Statistics" 2025.]
Abstractor:	As Provided
Notes:	https://doi.org/10.7910/DVN/89YITQ
IES Funded:	Yes
Entry Date:	2026
Accession Number:	ED679453
Database:	ERIC

Permalink:	https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=ED679453
Date Published:	2 April 2026