Data Imbalances in Coincidence Analysis: A Simulation Study

Bibliographic Details
Title: Data Imbalances in Coincidence Analysis: A Simulation Study
Language: English
Authors: Martyna Daria Swiatczak (ORCID 0000-0002-7537-1813), Michael Baumgartner (ORCID 0000-0003-1536-2816)
Source: Sociological Methods & Research. 2025 54(2):739-771.
Availability: SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://sagepub.com
Peer Reviewed: Y
Page Count: 33
Publication Date: 2025
Document Type: Journal Articles; Reports - Research
Descriptors: Causal Models, Comparative Analysis, Data Analysis, Statistical Distributions, Statistical Data
DOI: 10.1177/00491241241227039
ISSN: 0049-1241 (print); 1552-8294 (electronic)
Abstract: In this paper, we investigate the conditions under which data imbalances, a common data characteristic that occurs when factor values are unevenly distributed, are problematic for the performance of Coincidence Analysis (CNA). We further examine how such imbalances relate to fragmentation and noise in data. We show that even extreme data imbalances, when not combined with fragmentation or noise, do not negatively affect CNA's performance. However, an extended series of simulation experiments on fuzzy-set data reveals that, when mixed with fragmentation or noise, data imbalances may substantially impair CNA's performance. Furthermore, we find that the performance impairment is higher when endogenous factors are imbalanced than when exogenous factors are concerned. Our results allow us to quantify these impacts and demarcate degrees at which data imbalances should be considered as problematic. Thus, applied researchers can use our demarcation guidelines to enhance the validity of their studies.
Abstractor: As Provided
Entry Date: 2025
Accession Number: EJ1473620
Database: ERIC
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1473620
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1177/00491241241227039
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 33
        StartPage: 739
    Subjects:
      – SubjectFull: Causal Models
        Type: general
      – SubjectFull: Comparative Analysis
        Type: general
      – SubjectFull: Data Analysis
        Type: general
      – SubjectFull: Statistical Distributions
        Type: general
      – SubjectFull: Statistical Data
        Type: general
    Titles:
      – TitleFull: Data Imbalances in Coincidence Analysis: A Simulation Study
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Martyna Daria Swiatczak
      – PersonEntity:
          Name:
            NameFull: Michael Baumgartner
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 05
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 0049-1241
            – Type: issn-electronic
              Value: 1552-8294
          Numbering:
            – Type: volume
              Value: 54
            – Type: issue
              Value: 2
          Titles:
            – TitleFull: Sociological Methods & Research
              Type: main