How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116

Bibliographic Details
Title: How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116
Language: English
Authors: Paul T. von Hippel (ORCID 0000-0003-4498-4374), Brendan A. Schuetze (ORCID 0000-0002-5210-6785), Annenberg Institute for School Reform at Brown University
Source: Annenberg Institute for School Reform at Brown University. 2025.
Availability: Annenberg Institute for School Reform at Brown University. Brown University Box 1985, Providence, RI 02912. Tel: 401-863-7990; Fax: 401-863-1290; e-mail: annenberg@brown.edu; Web site: https://annenberg.brown.edu/
Peer Reviewed: N
Page Count: 49
Publication Date: 2025
Document Type: Reports - Research
Descriptors: Educational Research, Replication (Evaluation), Generalizability Theory, Inferences, Error of Measurement, Predictor Variables, Context Effect, Individual Differences, Aptitude Treatment Interaction, Learning Processes, Testing Problems, Test Validity, Test Reliability
Abstract: Researchers across many fields have called for greater attention to heterogeneity of treatment effects--shifting focus from the average effect to variation in effects between different treatments, studies, or subgroups. True heterogeneity is important, but many reports of heterogeneity have proved to be false, non-replicable, or exaggerated. In this review, we catalog ways that past researchers fooled themselves about heterogeneity, and recommend ways that we can stop fooling ourselves about heterogeneity in the future. We make 18 specific recommendations and illustrate them with examples from education research. The most common themes are to (1) seek heterogeneity only when the mechanism offers clear motivation and the data offer adequate power, (2) shy away from seeking "no-but" heterogeneity when there is no main effect, (3) separate the noise of estimation error from the signal of true heterogeneity, (4) shrink variation in estimates toward zero, (5) increase p values and widen confidence intervals when conducting multiple tests, (6) estimate interactions rather than subgroup effects, and (7) check whether findings of heterogeneity are sensitive to changes in model or measurement. We also resolve longstanding debates about centering interactions in linear models and estimating interactions in nonlinear models such as logistic, ordinal, and interval regression. If researchers follow these recommendations, the search for heterogeneity should yield more trustworthy results in the future.
Abstractor: As Provided
Entry Date: 2025
Accession Number: ED671075
Database: ERIC
Full Text: https://eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=ED671075 (Full Text from ERIC)
Persistent Link: https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=ED671075
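
Illustrative sketch (not from the paper): several themes listed in the abstract can be made concrete with a few lines of arithmetic. The Python below is a minimal, simplified sketch under stated assumptions, and it is not the authors' own procedure. It assumes independent effect estimates with known standard errors greater than zero, uses an unweighted method-of-moments estimate for theme (3), a simple empirical Bayes shrinkage factor for theme (4), a Bonferroni adjustment as one common choice for theme (5), and a normal-approximation test of a difference between two subgroup effects for theme (6). All function names and numbers are hypothetical.

```python
import math
from statistics import mean, variance


def heterogeneity_variance(estimates, std_errors):
    """Theme (3): observed variance of estimates minus average error variance.

    Unweighted method-of-moments sketch; truncated at zero because a
    variance cannot be negative.
    """
    observed_var = variance(estimates)               # sample variance of the estimates
    noise_var = mean(se ** 2 for se in std_errors)   # average estimation-error variance
    return max(0.0, observed_var - noise_var)


def shrink(estimates, std_errors):
    """Theme (4): pull each estimate toward the mean in proportion to its noise."""
    tau2 = heterogeneity_variance(estimates, std_errors)
    grand_mean = mean(estimates)
    shrunk = []
    for est, se in zip(estimates, std_errors):
        reliability = tau2 / (tau2 + se ** 2)        # share of variance that is signal
        shrunk.append(grand_mean + reliability * (est - grand_mean))
    return shrunk


def bonferroni(p_values):
    """Theme (5): multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def interaction_test(effect_a, se_a, effect_b, se_b):
    """Theme (6): test the difference between two independent subgroup effects."""
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal p value
    return diff, se_diff, p_value


if __name__ == "__main__":
    # Hypothetical site-level effect estimates (in SD units) and standard errors.
    effects = [0.30, 0.10, -0.05, 0.25]
    errors = [0.10, 0.10, 0.10, 0.10]
    print("estimated true variance:", heterogeneity_variance(effects, errors))
    print("shrunken estimates:", shrink(effects, errors))
    print("adjusted p values:", bonferroni([0.01, 0.04, 0.50]))
    print("interaction (diff, se, p):", interaction_test(0.25, 0.10, 0.10, 0.10))
```

In the last call, one hypothetical subgroup effect is individually significant (0.25 with standard error 0.10) and the other is not (0.10 with standard error 0.10), yet the test of their difference is far from significant. That contrast is the usual argument for estimating the interaction directly rather than comparing subgroup significance levels.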