A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Bibliographic Details
Title: A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement
Language: English
Authors: Jordan M. Wheeler, Allan S. Cohen, Shiyu Wang
Source: Journal of Educational and Behavioral Statistics. 2024 49(5):848-874.
Availability: SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://sagepub.com
Peer Reviewed: Y
Page Count: 27
Publication Date: 2024
Document Type: Journal Articles; Reports - Research
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability, Responses, Mathematical Models, Correlation, Language Usage, Item Analysis, Test Items, Measurement Techniques, Algorithms, Scoring, Thinking Skills, Simulation, Comparative Analysis
DOI: 10.3102/10769986231209446
ISSN: 1076-9986 (print); 1935-1054 (electronic)
Abstract: Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming more common in educational measurement research as a method for analyzing students' responses to constructed-response items. Two popular topic models are latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). LSA uses linear algebra techniques, whereas LDA uses an assumed statistical model and generative process. In educational measurement, LSA is often used in algorithmic scoring of essays due to its high reliability and agreement with human raters. LDA is often used as a supplemental analysis to gain additional information about students, such as their thinking and reasoning. This article reviews and compares the LSA and LDA topic models. This article also introduces a methodology for comparing the semantic spaces obtained by the two models and uses a simulation study to investigate their similarities.
Abstractor: As Provided
Entry Date: 2024
Accession Number: EJ1442196
Database: ERIC
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1442196
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.3102/10769986231209446
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 27
        StartPage: 848
  BibRelationships:
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 10
              Type: published
              Y: 2024
          Identifiers:
            – Type: issn-print
              Value: 1076-9986
            – Type: issn-electronic
              Value: 1935-1054
          Numbering:
            – Type: volume
              Value: 49
            – Type: issue
              Value: 5
          Titles:
            – TitleFull: Journal of Educational and Behavioral Statistics
              Type: main