Discrete-Time Survival Analysis Incorporating Time Structure in Developmental Research

Bibliographic Details
Title: Discrete-Time Survival Analysis Incorporating Time Structure in Developmental Research
Language: English
Authors: Sooyong Lee (ORCID 0000-0002-7964-4508), Kahyun Lee (ORCID 0000-0002-5497-8114), Kejin Lee (ORCID 0000-0001-6775-282X)
Source: Structural Equation Modeling: A Multidisciplinary Journal. 2025 32(5):929-940.
Availability: Routledge. Available from: Taylor & Francis, Ltd. 530 Walnut Street Suite 850, Philadelphia, PA 19106. Tel: 800-354-1420; Tel: 215-625-8900; Fax: 215-207-0050; Web site: http://www.tandf.co.uk/journals
Peer Reviewed: Y
Page Count: 12
Publication Date: 2025
Document Type: Journal Articles; Reports - Research
Descriptors: Statistical Analysis, Time, Smoking, Longitudinal Studies, National Surveys, Age, Social Science Research, Behavioral Science Research
Assessment and Survey Identifiers: National Longitudinal Survey of Youth
DOI: 10.1080/10705511.2024.2432598
ISSN: 1070-5511; 1532-8007
Abstract: Discrete-time survival analysis (DTSA) is a method widely used by social and behavioral researchers as it aids in the exploration of patterns in time-to-event measures. However, the traditional DTSA models often fail to adequately represent the structured dynamics of hazard processes. This study introduces structural DTSA, an alternative approach that extends traditional DTSA by incorporating functional forms of hazard changes. With structural DTSA, a reparameterization of the functional forms is also possible for more meaningful interpretations of the results of time-to-event data analyses. This study aims to provide a detailed tutorial on structured DTSA, demonstrating its applicability in social and behavioral research. Accordingly, we demonstrate the application of structured DTSA using data on smoking initiation from the National Longitudinal Survey of Youth 1997 (NLSY97). These findings highlight the potential of structured DTSA for various developmental studies.
Abstractor: As Provided
Entry Date: 2026
Accession Number: EJ1501485
Database: ERIC
Text:
  Availability: 1
  Value: <anid>AN0187593736;7mz01sep.25;2025Sep01.03:17;v2.2.500</anid> <title id="AN0187593736-1">Discrete-Time Survival Analysis Incorporating Time Structure in Developmental Research </title> <p>Discrete-time survival analysis (DTSA) is a method widely used by social and behavioral researchers as it aids in the exploration of patterns in time-to-event measures. However, the traditional DTSA models often fail to adequately represent the structured dynamics of hazard processes. This study introduces structural DTSA, an alternative approach that extends traditional DTSA by incorporating functional forms of hazard changes. With structural DTSA, a reparameterization of the functional forms is also possible for more meaningful interpretations of the results of time-to-event data analyses. This study aims to provide a detailed tutorial on structured DTSA, demonstrating its applicability in social and behavioral research. Accordingly, we demonstrate the application of structured DTSA using data on smoking initiation from the National Longitudinal Survey of Youth 1997 (NLSY97). These findings highlight the potential of structured DTSA for various developmental studies.</p> <p>Keywords: Discrete-time survival analysis; latent growth modeling; longitudinal data analysis; National Longitudinal Survey of Youth; Structural Discrete-Time Survival Analysis</p> <hd id="AN0187593736-2">1. Introduction</hd> <p>In social and behavioral science research, individuals' life events (e.g., marriage and divorce) are typically treated as time-to-event measures and have been widely analyzed using survival analysis or discrete-time survival analysis (DTSA). DTSA is a statistical technique initially designed to predict and explain the occurrence and timing of events (e.g., death, marriage; Singer & Willett, [<reflink idref="bib32" id="ref1">32</reflink>]; Vermunt & Moors, [<reflink idref="bib37" id="ref2">37</reflink>]). 
Unlike traditional survival analysis models, DTSA deals with event timing in discrete intervals (e.g., semesters), where respondents retrospectively report the timing of event onsets (Keiley et al., [<reflink idref="bib12" id="ref3">12</reflink>]). In DTSA, a hazard function plays a key role in representing the chronological pattern of event probabilities, such as the proportion of at-risk individuals who experience an event within a given time period.</p> <p>Time-to-event measures often follow specific functional forms over time, such as linear or quadratic hazard probability patterns. The need to handle quadratic patterns of change over time using DTSA has been increasing among social science researchers (e.g., Graham et al., [<reflink idref="bib9" id="ref4">9</reflink>]; Singer & Willett, [<reflink idref="bib33" id="ref5">33</reflink>]). To meet this need, traditional DTSA models have been extended to allow the modeling of various functional forms over time (e.g., linear or quadratic patterns of hazard probability). An application of structured DTSA is discussed in Edelen et al. ([<reflink idref="bib6" id="ref6">6</reflink>]), who investigated the timing of smoking initiation among adolescents as a function of age, ranging from 5 to 23 years. The study included a sample of 7th graders (<emph>N</emph> = 6,255) from 30 schools in California and Oregon, with data initially collected in 1985. Edelen et al. ([<reflink idref="bib6" id="ref7">6</reflink>]) used self-reported data from this sample on the age of first cigarette use to model the pattern of smoking onset among adolescents. The study found that the timing of cigarette smoking onset was well fitted by the quadratic functional form, with the adolescents' risk of smoking cigarettes increasing to its highest point at approximately 10 years of age and decreasing afterward. The structured DTSA approach introduced by Edelen et al. 
([<reflink idref="bib6" id="ref8">6</reflink>]) not only enhanced the descriptive explanation, but also provided theoretically more meaningful interpretations of the hazard patterns underlying observed hazard probabilities.</p> <p>Despite these advantages, comprehensive guidelines for structured DTSA are still limited, and the approach remains underutilized in social and behavioral research. Social science researchers frequently employ traditional DTSA models, focusing on estimating event probabilities without considering the potential shapes of the hazard probability curves. While the traditional DTSA approach is relatively straightforward, in that it does not require assumptions about the shape of the hazard probability, it is limited in its ability to employ functional forms that parsimoniously capture hazard probabilities over time (Willett & Singer, [<reflink idref="bib38" id="ref9">38</reflink>]). Recognizing this gap, this study aims to provide researchers with a detailed step-by-step tutorial on structured discrete-time survival analysis (S-DTSA). We illustrate how traditional survival models (e.g., DTSA) can be effectively modified to S-DTSA for a more meaningful interpretation.</p> <p>Hence, the goal of this study is to demonstrate the applicability of structured DTSA in social and behavioral research and to help researchers interpret the functional forms of time-to-event measures in a more meaningful way. This study is organized to enhance our understanding of the DTSA framework and the use of structured DTSA. The foundational background of DTSA and structured DTSA is presented in this section. The next section offers a foundational understanding that is critical for comprehending the more complex aspects of DTSA. The following sections introduce the empirical data analysis and demonstrate the application of DTSA and structured DTSA.</p> <hd id="AN0187593736-3">1.1. 
A Brief History of Survival Analysis</hd> <p>Before delving into structured discrete-time survival analysis, it is worth briefly reviewing the history of survival analysis. Survival analysis was developed to model event timing, with a long history in medical research where the actual "survival" or "death" is of interest (Kleinbaum & Klein, [<reflink idref="bib14" id="ref10">14</reflink>]). This traditional approach treats time as continuous, requiring the measurement of the exact timing of events such as reactions to medicine. It assumes that no two events occur at exactly the same time; events that do coincide are known as "tied" data, and when many tied events are present, the results can be biased (Allison, [<reflink idref="bib1" id="ref11">1</reflink>]).</p> <p>While powerful in medical research, traditional continuous-time survival analysis does not perfectly apply to the social and behavioral sciences, where the timing of events is often measured in broader intervals (e.g., months or years) and the exact timing is less critical. For example, when studying the age of marriage, it is more meaningful to know the year of marriage than the exact hours and minutes. Applying continuous-time models to such data often results in a large amount of tied data, which is undesirable (Masyn, [<reflink idref="bib18" id="ref12">18</reflink>]).</p> <p>To address this issue, Cox ([<reflink idref="bib3" id="ref13">3</reflink>]) suggested discrete-time survival analysis (or event-history analysis), which was widely advocated by Singer and Willett ([<reflink idref="bib32" id="ref14">32</reflink>]) and Willett and Singer ([<reflink idref="bib38" id="ref15">38</reflink>]),[<reflink idref="bib1" id="ref16">1</reflink>] who used logistic regression to analyze survival data by converting them into person-period data (long format). This approach is particularly useful when event timing is recorded in coarse units, such as years, or where the occurrence of events is naturally discrete. 
Specifically, for cases with few time intervals (e.g., less than five years), few event occurrences, or small sample sizes, discrete-time models are recommended (Kim, [<reflink idref="bib13" id="ref17">13</reflink>]).</p> <p>However, the distinction between the discrete- and continuous-time survival analyses can be arbitrary. In some cases, timing is truly continuous; however, when the number of discrete time points is large, discrete-time models can yield results that are close to those of continuous-time models. For instance, when using fine metrics (e.g., 48 time points), either discrete or continuous models can be appropriately applied (Kim, [<reflink idref="bib13" id="ref18">13</reflink>]). Although continuous-time models can be used in social and behavioral sciences depending on how time is treated, these fields typically deal with fewer than 20 or 30 time points. Consequently, discrete-time survival analysis is more practical and preferable in these contexts.</p> <p>Following the introduction of discrete-time survival methods, Masyn ([<reflink idref="bib18" id="ref19">18</reflink>]) demonstrated how to model event-history data within a latent variable modeling framework. This advanced the methodology to handle more complex situations, such as unobserved heterogeneity (Muthén & Masyn, [<reflink idref="bib23" id="ref20">23</reflink>]), recurrent events (Masyn, [<reflink idref="bib19" id="ref21">19</reflink>]), and competing events (Schmid & Berger, [<reflink idref="bib31" id="ref22">31</reflink>]). Recent developments include incorporating DTSA into mediation models (Fairchild et al., [<reflink idref="bib7" id="ref23">7</reflink>]), evaluating lagged effects on hazard probabilities (Raykov et al., [<reflink idref="bib30" id="ref24">30</reflink>]), and applying machine learning classification algorithms to predict event-history outcomes (Suresh et al., [<reflink idref="bib34" id="ref25">34</reflink>]). 
Despite these advances, few studies have introduced DTSA, which is the focus of this study. This study aims to contribute to this developing research field by addressing the gaps for applied researchers, emphasizing discrete-time survival analysis, and exploring a more parsimonious model known as structural DTSA.</p> <hd id="AN0187593736-4">1.2. Unstructured Discrete-Time Survival Analysis Model</hd> <p>Survival analysis requires two types of information: <emph>whether</emph> an event happened and <emph>when</emph> it happened. To facilitate DTSA, the variables recording event occurrence and event timing should be translated into event history outcome measures. Let the event occurrence and the event time be denoted by <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>y</mi></mrow><mi>i</mi></msub></mrow></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>t</mi></mrow><mi>i</mi></msub></mrow><mo>,</mo></math> </ephtml> where <emph>i</emph> indexes individuals. 
Let <emph>T</emph> denote the total time length, or study duration.</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>t</mi></mrow><mi>i</mi></msub></mrow></math> </ephtml> is always lower than <emph>T</emph> regardless of the event occurrence.</p> <p>With these two variables, we can construct a set of <emph>J</emph> binary event history indicators <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mrow><mtext mathvariant="italic">ij</mtext></mrow></msub></mrow><mo>,</mo></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mrow><mtext mathvariant="italic">ij</mtext></mrow></msub><mo>=</mo><mn>1</mn></mrow></math> </ephtml> if person <emph>i</emph> experiences the target event in time period <emph>j</emph>, whereas <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mrow><mtext mathvariant="italic">ij</mtext></mrow></msub><mo>=</mo><mn>0</mn></mrow></math> </ephtml> when the person has not yet experienced the event by the end of that period. Once the event occurs, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mrow><mtext mathvariant="italic">ij</mtext></mrow></msub></mrow></math> </ephtml> is coded as missing afterward. When an individual drops out of the survey at period <emph>j</emph> without having experienced the event, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mrow><mtext mathvariant="italic">ij</mtext></mrow></msub></mrow></math> </ephtml> is coded as 0 through <emph>j</emph> and as missing afterward. 
In the survival analysis framework, this kind of missingness is called "censoring," which means that an individual is no longer in the risk set (Masyn, [<reflink idref="bib20" id="ref26">20</reflink>]).</p> <p>To conduct discrete-time survival analysis, it is necessary to understand the hazard probability. The hazard probability is expressed as follows:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub><mo>=</mo><mtext mathvariant="italic">Pr</mtext><mrow><mo stretchy="true">(</mo><mrow><mi>T</mi><mo>=</mo><mi>j</mi><mo>|</mo><mi>T</mi><mo>≥</mo><mi>j</mi></mrow><mo stretchy="true">)</mo></mrow><mo>.</mo></mrow></math> </ephtml> (<reflink idref="bib1" id="ref27">1</reflink>)</p> <p>which is defined as the conditional probability that the event occurs at <emph>j</emph> given that it did not occur prior to <emph>j</emph>. The hazard probability <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> [<reflink idref="bib2" id="ref28">2</reflink>] is estimated as the number of events that occurred at <emph>j</emph> divided by the number of individuals in the risk set at <emph>j</emph>. It thus gives researchers a period-specific risk of event occurrence among those still eligible to experience the event, addressing both whether and when events occur.</p> <p>The hazard probability remains estimable when some individuals are censored because it is a conditional probability computed using only individuals eligible to experience the event, and it can be computed for every time period in which event occurrence is recorded. 
Under the assumption of noninformative censoring, the estimated hazard function can be taken to apply to the entire population, because all non-censored individuals at each time period are representative of all individuals who would have remained in the study if censoring had not occurred (Masyn, [<reflink idref="bib20" id="ref29">20</reflink>]).</p> <p>To model the probability distribution, and later to add covariates and examine their influence, the unstructured hazard probability at time <emph>j</emph>, unconditional on covariates, can be expressed as a logistic function:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub><mo>=</mo><mtext mathvariant="italic">Pr</mtext><mrow><mo stretchy="true">(</mo><mrow><msub><mrow><mi>u</mi></mrow><mi>j</mi></msub><mo>=</mo><mn>1</mn></mrow><mo stretchy="true">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mtext mathvariant="normal">exp</mtext><mrow><mo stretchy="true">(</mo><mrow><mo>−</mo><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub></mrow><mo stretchy="true">)</mo></mrow><mo /></mrow></mfrac><mo>,</mo></mrow></math> </ephtml> (<reflink idref="bib2" id="ref30">2</reflink>)</p> <p>[<reflink idref="bib3" id="ref31">3</reflink>]</p> <p>where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> is the intercept (or threshold) parameter corresponding to the hazard probability at time <emph>j</emph>. 
This model represents the log-odds of event occurrence as a function of the time period only. <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> captures the risk of the event occurring in time period <emph>j</emph> given that the event has not occurred in any earlier period.</p> <p>When both time-invariant and time-varying predictors are added to the model, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> is modeled as follows:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub><mo>=</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mtext mathvariant="normal">exp </mtext><mrow><mo stretchy="true">(</mo><mrow><mo>−</mo><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub><mo>+</mo><mi>β</mi><mi>x</mi><mo>+</mo><msub><mrow><mi>γ</mi></mrow><mi>j</mi></msub><msub><mrow><mi>z</mi></mrow><mi>j</mi></msub></mrow><mo stretchy="true">)</mo></mrow><mo /></mrow></mfrac><mo>,</mo></mrow></math> </ephtml> (<reflink idref="bib3" id="ref32">3</reflink>)</p> <p>where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi></mrow></math> </ephtml> indicates the effect of a time-invariant predictor <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi></mrow><mo>,</mo></math> </ephtml> such as gender or ethnicity, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>γ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> represents the effect of a time-varying predictor <ephtml> <math display="inline" 
xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>z</mi></mrow><mi>j</mi></msub></mrow><mo>,</mo></math> </ephtml> such as marital status across event time. It is important to note that hazard models typically assume proportional hazards, meaning that hazard probabilities are expected to be parallel across all predictor values, with effects determined by</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi></mrow></math> </ephtml> or</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>γ</mi></mrow><mo>.</mo></math> </ephtml> When this assumption holds,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi></mrow></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>γ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> should be consistent across time points, resulting in the same effects of both <emph>X</emph> and <emph>Z</emph> on each event-history indicator over time.</p> <p>However, there are cases where the proportional hazards assumption doesn't hold. In such cases, this assumption needs to be relaxed (Singer & Willett, [<reflink idref="bib32" id="ref33">32</reflink>]). When this happens,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>β</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>γ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> can be estimated separately for each time point, allowing the effects of both time-invariant and time-varying predictors to fluctuate over time. 
To evaluate whether the proportional hazards assumption holds, researchers can compare models with and without equality constraints on <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>β</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>γ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> using relative fit indices (e.g., the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), where a lower value indicates a better fit). It is worth noting that the effects of predictors may not always follow a simple pattern. They might remain consistent across all time points, or they could vary across some or all of them. This flexibility allows for a better understanding of how different factors influence the hazard probability over time.</p> <p>In sum, survival analysis makes it possible to determine the time periods in which the event of interest is most likely to occur, as well as to identify why some individuals experience the event earlier than others and why some do not experience the event at all during the study period. By utilizing discrete-time survival analysis and modeling the hazard probability, researchers can gain valuable insights into the timing and determinants of events of interest.</p> <hd id="AN0187593736-5">1.3. Structured Discrete-Time Survival Analysis</hd> <p>DTSA can be extended to integrate a time structure into the hazard process, adopting a specific functional form for the logit-hazard profile shape (Masyn, [<reflink idref="bib18" id="ref34">18</reflink>]; Graham et al., [<reflink idref="bib9" id="ref35">9</reflink>]). Baseline hazard probabilities might be represented as a function of time, such as a linear increase. 
With the discrete-time logit model, it is the logit of the hazard probability that is modeled. DTSA with time structures can be implemented using the latent growth curve modeling (LGCM) framework. LGCM is a specialized form of structural equation modeling (SEM) for longitudinal repeated measures data (Kline, [<reflink idref="bib15" id="ref36">15</reflink>]). By treating event history data as longitudinal data, researchers can fully leverage the capabilities of the LGCM approach within the SEM framework.</p> <p>To estimate the structured hazard model via the LGCM framework, the logit hazard probabilities are expressed as a function of growth curve factors, represented by latent variables, with factor loadings constraining the structure. Suppose that there are <emph>j</emph> event history variables, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>u</mi></mrow><mn>1</mn></msub><mo>,</mo><mo>...</mo><msub><mrow><mi>u</mi></mrow><mi>j</mi></msub></mrow><mo>,</mo></math> </ephtml> that serve as the indicators in the latent growth curve model, where <emph>j</emph> is the number of time points during which a target event could possibly happen. While the hazard probabilities in each <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>u</mi></mrow></math> </ephtml> indicator are reflected by <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> in the unstructured DTSA, they are modeled with growth curve factors and time loadings in the structured DTSA. Growth curve factors reflect the average change, and time loadings capture the nature of the dependence of the hazard function on time. 
The baseline logit hazard is specified by including some functions of time as explanatory variables.</p> <p>For example, when the hazard probability seems to steadily increase over time, the unstructured DTSA would estimate every hazard probability across all time-points. Instead, in the S-DTSA, a linear growth trajectory can be applied to account for the change in hazard probabilities with two growth curve factors, such as an intercept (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math> </ephtml> ); a linear slope (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math> </ephtml> ), and linear factor loadings.</p> <p>Graph</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mtable><mtr><mtd><mrow><mtext mathvariant="italic">Pr</mtext><mrow><mo stretchy="true">(</mo><mrow><msub><mrow><mi>u</mi></mrow><mi>j</mi></msub><mo>=</mo><mn>1</mn></mrow><mo stretchy="true">)</mo></mrow><mo>=</mo><msub><mrow><mi>h</mi></mrow><mi>j</mi></msub><mo>=</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mtext mathvariant="normal">exp </mtext><mrow><mo stretchy="true">(</mo><mrow><mo>−</mo><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub><mo>+</mo><msub><mrow><mi>Λ</mi></mrow><mi>j</mi></msub><mi>η</mi></mrow><mo stretchy="true">)</mo></mrow><mo /></mrow></mfrac></mrow></mtd></mtr><mtr><mtd><mrow><mi>η</mi><mo>=</mo><mi>α</mi><mo>+</mo><mi>ζ</mi></mrow></mtd></mtr></mtable></mrow></math> </ephtml> (<reflink idref="bib4" id="ref37">4</reflink>)</p> <p>where</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>Λ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> is a <emph>j × m</emph> time-loading matrix to reflect the functional form of growth curve models;</p> 
<p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>η</mi></mrow></math> </ephtml> is a <emph>m × 1</emph> vector for latent growth factors;</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>α</mi></mrow></math> </ephtml> is a <emph>m × 1</emph> vector for the means of latent growth factors;</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>τ</mi></mrow><mi>j</mi></msub></mrow></math> </ephtml> is a <emph>j × 1</emph> vector, constrained to be zero for identification. With the linear functional form with 5 timepoints,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>Λ</mi><mo>=</mo><msup><mrow><mrow><mrow><mo stretchy="true">[</mo><mrow><mtable><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd><mtd><mn>3</mn></mtd><mtd><mn>4</mn></mtd></mtr></mtable></mrow></mtd></mtr></mtable></mrow><mo stretchy="true">]</mo></mrow></mrow></mrow><mo>′</mo></msup></mrow></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>η</mi><mo>=</mo><msup><mrow><mrow><mrow><mo stretchy="true">[</mo><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub><mo /><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow><mo stretchy="true">]</mo></mrow></mrow></mrow><mo>′</mo></msup></mrow><mo>.</mo></math> </ephtml> It is important to note that in contrast to a typical LGCM specification, the growth variances (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>ζ</mi></mrow></math> </ephtml> ) in structured discrete-time survival analysis should be constrained to zero, indicating the absence of 
heterogeneity in hazard profiles (Masyn, [<reflink idref="bib18" id="ref38">18</reflink>]). The presence of non-zero growth variances would suggest that individuals have different hazard probabilities, even after accounting for the effects of covariates. However, if unobserved heterogeneity in hazard probabilities is assumed, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>ζ</mi></mrow></math> </ephtml> follows a multivariate normal distribution with an <emph>m × m</emph> covariance matrix. Alternatively, researchers can employ finite mixture modeling, which allows for the identification of distinct subgroups or classes of individuals who share similar hazard profiles (see Muthén & Masyn, [<reflink idref="bib23" id="ref39">23</reflink>] for more details).</p> <p>Covariates can be integrated into the structured DTSA framework. This integration allows covariate effects on the event history indicators to be modeled directly, in the same way as in Equation (<reflink idref="bib3" id="ref40">3</reflink>). 
Additionally, the growth factors that define the hazard profile can be made conditional on these covariates as follows:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>η</mi><mo>=</mo><mi>ν</mi><mo>+</mo><mtext mathvariant="italic">BX</mtext></mrow></math> </ephtml> (<reflink idref="bib5" id="ref41">5</reflink>)</p> <p>where <emph>X</emph> is a <emph>q × 1</emph> vector of covariates, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>B</mi></mrow></math> </ephtml> is an <emph>m × q</emph> matrix containing the coefficients for the regression of the latent growth factors on the covariates, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>ν</mi></mrow></math> </ephtml> is an <emph>m × 1</emph> vector representing the intercepts of the growth factors.</p> <p>LGCM also handles nonlinear change trajectories through polynomial growth factors (Meredith & Tisak, [<reflink idref="bib21" id="ref42">21</reflink>]), with the shape imposed through the factor loadings (e.g., quadratic or piecewise). In general, linear and quadratic patterns are most commonly used in educational research. 
However, the other types of structure, such as exponential function, can be utilized to represent the underlying trajectory of hazard probabilities over time.</p> <p>Graph</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mtable><mtr><mtd><mrow><mtext>Linear</mtext></mrow></mtd></mtr><mtr><mtd><mrow><mi mathvariant="bold">Λ</mi><mo>=</mo><mrow><mo stretchy="true">[</mo><mrow><mtable><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>3</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>4</mn></mtd></mtr></mtable></mrow></mtd></mtr></mtable></mrow><mo stretchy="true">]</mo></mrow></mrow></mtd></mtr></mtable><mtable><mtr><mtd><mrow><mtext>Quadratic</mtext></mrow></mtd></mtr><mtr><mtd><mrow><mi mathvariant="bold">Λ</mi><mo>=</mo><mrow><mo stretchy="true">[</mo><mrow><mtable><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd><mtd><mrow><mo /><mn>0</mn></mrow></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd><mtd><mrow><mo /><mn>1</mn></mrow></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd><mtd><mrow><mo /><mn>4</mn></mrow></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>3</mn></mtd><mtd><mrow><mo /><mn>9</mn></mrow></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>4</mn></mtd><mtd><mrow><mn>16</mn></mrow></mtd></mtr></mtable></mrow></mtd></mtr></mtable></mrow><mo 
stretchy="true">]</mo></mrow></mrow></mtd></mtr></mtable><mtable><mtr><mtd><mrow><mtext>Piecewise</mtext></mrow></mtd></mtr><mtr><mtd><mrow><mi mathvariant="bold">Λ</mi><mo>=</mo><mrow><mo stretchy="true">[</mo><mrow><mtable><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>1</mn></mtd><mtd><mn>0</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd><mtd><mn>0</mn></mtd></mtr></mtable></mrow></mtd></mtr><mtr><mtd><mrow><mtable><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd><mtd><mn>1</mn></mtd></mtr><mtr><mtd><mn>1</mn></mtd><mtd><mn>2</mn></mtd><mtd><mn>2</mn></mtd></mtr></mtable></mrow></mtd></mtr></mtable></mrow><mo stretchy="true">]</mo></mrow></mrow></mtd></mtr></mtable></mrow></math> </ephtml> </p> <p>It is crucial to identify a change function that best represents the overall trajectory in the data, similar to what is typically done in latent growth curve modeling. This process often involves comparing fit indices such as the AIC and the BIC. Starting from a baseline model where only the intercept is estimated, one can test higher-order models (e.g., linear, quadratic, or cubic). If the trajectory indicates a change-point at a specific time, a piecewise model should be considered. Smaller AIC and BIC values can be used as evidence to identify the preferred model (Lee et al., [<reflink idref="bib16" id="ref43">16</reflink>]). Although fit indices are important criteria for choosing the functional form of the hazard trajectory, the substantive meaning of the model should also be considered (Preacher, [<reflink idref="bib26" id="ref44">26</reflink>]). 
Other growth functions can also be considered (e.g., Flora, [<reflink idref="bib8" id="ref45">8</reflink>]; Marcoulides, [<reflink idref="bib17" id="ref46">17</reflink>]; Sterba, [<reflink idref="bib35" id="ref47">35</reflink>]).</p> <hd id="AN0187593736-6">1.4. Strengths of S-DTSA</hd> <p>Structured Discrete-Time Survival Analysis (S-DTSA) offers several advantages over traditional DTSA, specifically in terms of analytical efficiency and interpretive depth. A primary advantage of S-DTSA lies in its parsimony, which is particularly beneficial when the data span many time periods. Traditional DTSA can be inefficient and less interpretable with a large number of time points because it estimates a separate hazard probability for each time point, leading to redundancy and over-parameterization. In contrast, S-DTSA imposes a structure on the hazard function, such as a polynomial or piecewise constant form, so that a limited number of parameters defines the functional form of hazard change. This substantially reduces the number of estimated parameters and yields a parsimonious representation of the hazard process (e.g., Clark Goings et al., [<reflink idref="bib2" id="ref48">2</reflink>]). The structured approach captures the overall trend of the hazard and provides a clearer interpretation. Moreover, while traditional DTSA typically provides descriptive statistics without probing into the underlying dynamics of hazard processes, S-DTSA makes it possible to test hypotheses about these dynamics (e.g., Graham et al., [<reflink idref="bib9" id="ref49">9</reflink>]). S-DTSA is particularly valuable for testing theoretical models or assumptions about how hazard probabilities change over time. For example, researchers could hypothesize specific patterns in life events, such as a quadratic trajectory in the onset of marriage, with a peak at a certain age range. 
S-DTSA allows for the testing of such hypotheses against empirical data, facilitating a deeper understanding of event patterns. Additionally, it enables comparisons with previous research or theoretical models, enriching the study with contextually relevant insights.</p> <p>Further, S-DTSA stands out in its ability to incorporate covariates and use hazard growth functions as predictors, offering a clear interpretation of the impact of various factors on hazard probabilities. By summarizing overall hazard probabilities through growth factors, S-DTSA allows researchers to focus on the overarching trends rather than getting lost in the details of individual coefficients for each covariate-to-event indicator. This approach not only simplifies the interpretation process but also provides a more comprehensive understanding of how different covariates influence the hazard process. Researchers can thus draw more meaningful conclusions about the effects of specific variables on the likelihood and timing of events, making S-DTSA a powerful tool for understanding complex survival data.</p> <p>In summary, S-DTSA enhances the analytical process by offering a more efficient, hypothesis-testing capable, and interpretatively rich framework compared to the usage of the traditional DTSA. The S-DTSA approach allows for a deeper exploration of survival data, making it a valuable tool for researchers seeking to uncover the nuanced patterns and influences within their datasets.</p> <hd id="AN0187593736-7">1.5. Re-parameterization of S-DTSA for Interpretable Parameters</hd> <p>While a variety of functional forms (e.g., quadratic and cubic) can be applied with DTSA, the interpretation using DTSA parameters is oftentimes challenging. For instance, in the study by Edelen et al. ([<reflink idref="bib6" id="ref50">6</reflink>]), the quadratic function employed revealed that the risk peaked at age 14 and then the risk declined. 
However, interpreting the implications of the linear and quadratic components in such models was not straightforward. Although the intercept of a quadratic model still reflects the model-implied value at the initial assessment, the linear component indicates only the instantaneous rate of change at that initial point (akin to the slope of the curve's tangent line at time zero). Furthermore, the quadratic component reflects the acceleration or deceleration of this rate over time. Interpretation becomes even more complicated when predictors are introduced into these non-linear DTSA models. Consider a scenario where gender differences exist in the linear term. A finding such as boys having a steeper tangent-line slope at time zero in the hazard curve for cigarette initiation is intricate and not intuitive.</p> <p>To address the interpretational challenges associated with non-linear model parameters, some studies have proposed re-parameterization strategies that yield parameters with more substantive meaning (Cudeck & Du Toit, [<reflink idref="bib4" id="ref51">4</reflink>]; Preacher & Hancock, [<reflink idref="bib27" id="ref52">27</reflink>], [<reflink idref="bib28" id="ref53">28</reflink>]). A common approach, particularly with quadratic terms, involves transforming the conventional LGCM into re-parameterized equations. 
For instance, a standard quadratic LGCM can be represented as:</p> <p>Graph</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>f</mi></mrow><mn>1</mn></msub><mrow><mo>(</mo><mi>t</mi><mo>)</mo></mrow><mo>=</mo><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub><mo>+</mo><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub><mi>t</mi><mo>+</mo><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub><msup><mrow><mi>t</mi></mrow><mn>2</mn></msup><mo>.</mo></mrow></math> </ephtml> (<reflink idref="bib6" id="ref54">6</reflink>)</p> <p>From this, the model with a non-linear growth trend can be re-parameterized to:</p> <p>Graph</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mtable><mtr><mtd><mrow><msub><mrow><mi>α</mi></mrow><mi>t</mi></msub><mo>=</mo><mo>−</mo><mfrac><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow><mrow><mn>2</mn><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></mfrac></mrow></mtd></mtr><mtr><mtd><mrow><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub><mo>=</mo><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub><mo>−</mo><mfrac><mrow><msubsup><mrow><mi>η</mi></mrow><mn>1</mn><mn>2</mn></msubsup></mrow><mrow><mn>4</mn><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></mfrac><mo>=</mo><mo /><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub><mo>−</mo><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub><msubsup><mrow><mi>α</mi></mrow><mi>t</mi><mn>2</mn></msubsup></mrow></mtd></mtr><mtr><mtd><mrow><msub><mrow><mi>f</mi></mrow><mn>2</mn></msub><mrow><mo>(</mo><mi>t</mi><mo>)</mo></mrow><mo>=</mo><mo /><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub><mo>−</mo><mrow><mo stretchy="true">(</mo><mrow><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub><mo>−</mo><msub><mrow><mi>α</mi></mrow><mn>0</mn></msub></mrow><mo stretchy="true">)</mo></mrow><msup><mrow><mrow><mrow><mo 
stretchy="true">(</mo><mrow><mfrac><mi>t</mi><mrow><msub><mrow><mi>α</mi></mrow><mi>t</mi></msub></mrow></mfrac><mo>−</mo><mn>1</mn></mrow><mo stretchy="true">)</mo></mrow></mrow></mrow><mn>2</mn></msup></mrow></mtd></mtr></mtable></mrow></math> </ephtml> (<reflink idref="bib7" id="ref55">7</reflink>)</p> <p>Though appearing distinct,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>f</mi></mrow><mn>2</mn></msub><mrow><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo></mrow></mrow></math> </ephtml> is mathematically equivalent to</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>f</mi></mrow><mn>1</mn></msub><mrow><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo></mrow></mrow><mo>,</mo></math> </ephtml> derived for a more intuitive interpretation. In this re-parameterized form, <emph>α<subs>0</subs></emph> represents the initial value corresponding to</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow><mo>,</mo></math> </ephtml> </p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub></mrow></math> </ephtml> the maximum (or minimum) value of the curve, and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>α</mi></mrow><mi>t</mi></msub></mrow></math> </ephtml> the time at which that extremum is reached, such that</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>f</mi></mrow><mn>2</mn></msub><mrow><mo stretchy="true">(</mo><mrow><mi>t</mi><mo>=</mo><msub><mrow><mi>α</mi></mrow><mi>t</mi></msub></mrow><mo stretchy="true">)</mo></mrow><mo>=</mo><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub></mrow><mo>.</mo></math> </ephtml> 
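</p> <p>The equivalence of the two forms can be checked numerically. The following minimal Python sketch is an illustration only and is not part of the M<emph>plus</emph> analysis. It writes the re-parameterized curve as α_y − (α_y − α_0)(t/α_t − 1)^2, which follows from completing the square in the quadratic, and it borrows the unconditional estimates reported in Table 3 (η0 = −2.769, η1 = 0.423, η2 = −0.103). The function names, and the reading of <emph>t</emph> as age minus 11 (inferred from the interpretation of the intercept), are assumptions made for this example.</p>

```python
import math


def quadratic(t, eta0, eta1, eta2):
    """Standard quadratic growth curve: eta0 + eta1*t + eta2*t^2."""
    return eta0 + eta1 * t + eta2 * t ** 2


def reparameterize(eta0, eta1, eta2):
    """Map (eta0, eta1, eta2) to (alpha_0, alpha_t, alpha_y).

    alpha_0: value of the curve at t = 0
    alpha_t: time at which the curve reaches its extremum
    alpha_y: value of the curve at that extremum
    """
    if eta2 == 0:
        raise ValueError("eta2 must be non-zero for a quadratic curve")
    alpha_0 = eta0
    alpha_t = -eta1 / (2 * eta2)
    alpha_y = eta0 - eta1 ** 2 / (4 * eta2)
    return alpha_0, alpha_t, alpha_y


def vertex_form(t, alpha_0, alpha_t, alpha_y):
    """Re-parameterized curve: alpha_y - (alpha_y - alpha_0) * (t/alpha_t - 1)^2."""
    return alpha_y - (alpha_y - alpha_0) * (t / alpha_t - 1) ** 2


def inv_logit(x):
    """Convert a hazard logit into a hazard probability."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Unconditional estimates as reported in Table 3 of the article.
    eta0, eta1, eta2 = -2.769, 0.423, -0.103
    alpha_0, alpha_t, alpha_y = reparameterize(eta0, eta1, eta2)

    # Assumption for illustration: t = 0 corresponds to age 11.
    print("peak age:", round(11 + alpha_t, 2))
    print("peak hazard probability:", round(inv_logit(alpha_y), 3))

    # Both forms should agree at every time point.
    for t in range(-4, 6):
        f1 = quadratic(t, eta0, eta1, eta2)
        f2 = vertex_form(t, alpha_0, alpha_t, alpha_y)
        assert abs(f1 - f2) < 1e-9
```

<p>Under these assumptions the maximizer falls roughly two time units after the origin, in line with the peak near age 13 seen in the descriptive hazard probabilities.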
</p> <p>This re-parameterization not only simplifies the interpretation of the growth model parameters but also allows for more meaningful use of the information carried by each latent variable (e.g., the maximizer or minimizer term). For example, knowing the maximizer and maximum parameter estimates, researchers can analyze variations in the timing and magnitude of the peak hazard probability, such as in the onset of cigarette use. Furthermore, this approach facilitates the effective incorporation of predictors or independent variables, enabling the analysis of conditions under which adolescents might face earlier, later, higher, or lower risk probabilities.</p> <p>In this study, we apply re-parameterization to DTSA models, thereby achieving more interpretable and substantively meaningful parameters. The analytic approach suggested by the current study not only clarifies the complexity inherent in non-linear DTSA model parameters but also enhances the applicability of the analysis in understanding individuals' dynamic behavioral patterns.</p> <hd id="AN0187593736-8">2. Empirical Data Analysis</hd> <p>We used the dataset obtained from the National Longitudinal Survey of Youth 1997 (NLSY97). NLSY97 surveyed 8,984 U.S. adolescents born between 1980 and 1984 and was designed to longitudinally track various dimensions of the participants' lives, including their social, economic, psychological, and physical well-being. At the time of the initial interview and survey, their ages ranged from 14 to 18 years. The data collection for the age of first smoking spanned until 2005.</p> <p>For the current data illustration, we selected specific variables from the questionnaire: the age of first smoking a cigarette (YSAQ-360), gender, race, and poverty ratio. Table 1 illustrates the distribution of these variables, showing proportions for the age of first smoking and the three demographic covariates. 
The age of first smoking was reported by participants, ranging from 1 to 17 years. This information was gathered over five years (1997, 1998, 1999, 2004, and 2005). In cases where participants reported their first smoking age multiple times, the earliest reported age was chosen to represent their initial smoking experience.</p> <p>Table 1. Descriptive statistics for event history indicators.</p> <p> <ephtml> <table><thead><tr><td>Age</td><td>Number initiating smoking</td><td>Number at risk</td><td>Probability</td></tr></thead><tbody valign="top"><tr><td>7</td><td char=".">53</td><td char=".">8,818</td><td char=".">0.005</td></tr><tr><td>8</td><td char=".">93</td><td char=".">8,725</td><td char=".">0.010</td></tr><tr><td>9</td><td char=".">113</td><td char=".">8,612</td><td char=".">0.013</td></tr><tr><td>10</td><td char=".">258</td><td char=".">8,354</td><td char=".">0.029</td></tr><tr><td>11</td><td char=".">373</td><td char=".">7,981</td><td char=".">0.044</td></tr><tr><td>12</td><td char=".">705</td><td char=".">7,276</td><td char=".">0.088</td></tr><tr><td>13</td><td char=".">718</td><td char=".">6,558</td><td char=".">0.098</td></tr><tr><td>14</td><td char=".">593</td><td char=".">5,965</td><td char=".">0.090</td></tr><tr><td>15</td><td char=".">366</td><td char=".">5,599</td><td char=".">0.061</td></tr><tr><td>16</td><td char=".">153</td><td char=".">5,446</td><td char=".">0.027</td></tr></tbody></table> </ephtml> </p> <p>1 <emph>Note</emph>: Number initiating smoking = the count of individuals who started smoking at each specific age, number at risk = all individuals who had not yet started smoking by the beginning of each age interval, probability = the ratio of the number of individuals initiating smoking to the number at risk.</p> <p>In our study, the final sample size is <emph>N</emph> = 8,871 after excluding instances of reported smoking onset below the age of 6. 
This decision was based on concerns regarding the reliability and accuracy of such early-age smoking onset reports. Additionally, we combined the very few cases (14 in total) at age 17 with those at age 16. These adjustments resulted in a consolidated age range of 7 to 16 years for our analysis. Table 1 reports the descriptive statistics for all event-history indicators of smoking initiation.</p> <hd id="AN0187593736-9">2.1. Data Illustration Background</hd> <p>In the United States, smoking remains a prevalent issue, with the focus shifting from "if" adolescents start smoking to "when" they begin. The age at which individuals first start smoking is a critical predictor of long-term health outcomes and the likelihood of using other drugs (e.g., Grant & Dawson, [<reflink idref="bib10" id="ref56">10</reflink>]; Hingson et al., [<reflink idref="bib11" id="ref57">11</reflink>]). Early initiation of smoking is also linked to more immediate adolescent and young adult problems. These include impaired brain development, engagement in risky sexual behaviors, poor academic performance, and increased likelihood of illicit drug use (Morean et al., [<reflink idref="bib22" id="ref58">22</reflink>]). Understanding and modeling the timing of smoking onset is therefore crucial. It allows prevention scientists to assess and intervene in the risks associated with early smoking initiation, thereby potentially reducing smoking prevalence over an individual's lifespan. Our analysis uses Structured Discrete-Time Survival Analysis (S-DTSA) to estimate the hazard probabilities of smoking onset. This approach will help us gain a deeper understanding of the patterns and implications of smoking initiation among adolescents.</p> <hd id="AN0187593736-10">2.2. Measures</hd> <p>In our study, we aim to pinpoint the age at which individuals first start smoking, focusing on the age range of 7 to 16 years. 
To facilitate this analysis through Discrete-Time Survival Analysis (DTSA), we structured the event-history data as a vector of indicators, one for each age. For example, consider a participant who began smoking at age 10. Their event-history vector is encoded as (0 0 0 1 −9 ... −9). In this vector representation, "1" denotes the year smoking commenced, "0" indicates years without smoking, and "−9" is used for years after event onset, marking them as periods beyond the scope of observation or as censored data.</p> <p>For participants who did not initiate smoking within the study's timeframe, their event-history vector consists solely of zeros (0 0 0 ... 0), signifying the non-occurrence of the smoking event. In instances where a participant withdrew from the study prematurely, their data vector might be represented as (0 0 0 −9 −9 ... −9), indicating no smoking up to the point of dropout, with no subsequent data available after dropout.</p> <p>Additionally, the study incorporates demographic variables: gender (male = 0; female = 1), with males constituting 50.9% of the participants, and race (non-black/non-Hispanic = 0; black/Hispanic = 1), with 51.2% of the participants identified as neither black nor Hispanic. Another critical measure in our analysis is the poverty ratio, which reflects the ratio of household income to the poverty level in the previous year. The poverty ratio was standardized for our analysis. A higher value on this scale indicates a higher level of poverty, providing an understanding of socioeconomic factors potentially influencing the age of smoking onset. These covariates allow for a detailed examination of the onset age of smoking and its correlation with various demographic and socio-economic factors.</p> <hd id="AN0187593736-11">2.3. Data Analytic Approach</hd> <p>In this study, we analyzed smoking onset (YSAQ-360) data using both conventional and structured DTSA. 
Both models were fitted in M<emph>plus</emph> (version 8.1; Muthén & Muthén, 1998–[<reflink idref="bib24" id="ref59">24</reflink>]; see Appendices A1 and A2) using the robust maximum likelihood estimator (MLR). Example code with data used for the present study is available at [https://osf.io/jsywc/]. Based on the results of the unstructured DTSA, we chose to implement a more parsimonious parametric form, namely one that is quadratic in age. The growth parameters of the structured DTSA help clarify the complex dynamics of smoking onset. After fitting the two models without any covariates, we next added covariates to examine the effects of gender, race, and poverty ratio on the hazard patterns of smoking initiation. Subsequently, we demonstrated how to interpret the growth parameters from the structured DTSA, using reparameterization for clearer and more meaningful insights.</p> <hd id="AN0187593736-12">3. Empirical Data Analysis Results</hd> <p></p> <hd id="AN0187593736-13">3.1. Unstructured Discrete-Time Survival Analysis</hd> <p>First, we fitted an unconditional discrete-time survival model (DTSA) to assess the initiation of smoking among adolescents. This model allowed us to estimate hazard logits, which we then converted into hazard probabilities. Hazard probabilities represent the likelihood of an individual starting smoking within a specific age interval, conditional upon not having started in previous intervals. Figure 1 represents this trend: it depicts a gradual rise in the hazard-logit curve for smoking onset, peaking at age 13, followed by a steady decline thereafter.</p> <p>Graph: Figure 1. Unstructured DTSA in hazard logits (left) and probabilities (right).</p> <p>Because no structure is imposed on the hazard, the estimates from the unstructured DTSA closely match the descriptive statistics previously provided in Table 1. 
A critical aspect of any event history analysis is understanding how the hazard function varies over time. Our traditional DTSA analysis indicates that the hazard trajectory for smoking onset among adolescents closely follows a quadratic function. Given this pattern, it is reasonable to explore functional forms that represent the timing of smoking initiation in adolescents more parsimoniously. To achieve this, we then apply a quadratic function to the baseline discrete-time survival model within the latent growth curve modeling framework.</p> <p>Table 2. Estimated hazard logit and probabilities of unstructured DTSA.</p> <p> <ephtml> <table><thead><tr><td>Age</td><td>Hazard logit</td><td>Hazard probability</td></tr></thead><tbody valign="top"><tr><td>7</td><td char=".">5.114</td><td char=".">0.006</td></tr><tr><td>8</td><td char=".">4.541</td><td char=".">0.011</td></tr><tr><td>9</td><td char=".">4.334</td><td char=".">0.013</td></tr><tr><td>10</td><td char=".">3.478</td><td char=".">0.030</td></tr><tr><td>11</td><td char=".">3.063</td><td char=".">0.045</td></tr><tr><td>12</td><td char=".">2.334</td><td char=".">0.088</td></tr><tr><td>13</td><td char=".">2.212</td><td char=".">0.099</td></tr><tr><td>14</td><td char=".">2.308</td><td char=".">0.090</td></tr><tr><td>15</td><td char=".">2.728</td><td char=".">0.061</td></tr><tr><td>16</td><td char=".">3.572</td><td char=".">0.027</td></tr></tbody></table> </ephtml> </p> <p>2 <emph>Note.</emph> DTSA = discrete-time survival analysis, hazard logit = the log-odds of initiating smoking at each age, given that smoking has not started earlier, hazard probability = the probability that individuals initiate smoking within each age interval, given that they have not yet started smoking.</p> <hd id="AN0187593736-14">3.2. Structured Discrete-Time Survival Analysis</hd> <p></p> <hd id="AN0187593736-15">3.2.1. 
Identifying the Structure of Time</hd> <p>We first investigated the trajectory shape of the DTSA for smoking initiation and the adequacy of the proposed quadratic representation. The quadratic functional hazard assumption was evaluated to assess whether the hazard rate followed the quadratic form during adolescence.</p> <p>In determining the most accurate representation of the hazard trajectory with our sample dataset, we compared the unstructured DTSA model with the structured DTSA model, using comparative fit indices such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The baseline unstructured model exhibited lower AIC and BIC values, suggesting that it fits the data somewhat better than the structured DTSA with a quadratic function (AIC: 26,003, BIC: 26,024 for S-DTSA; AIC: 25,865, BIC: 25,936 for DTSA).</p> <p>Although the fit indices favored the baseline DTSA over the S-DTSA, the difference in BIC between the two models was only 88. As such, we chose the S-DTSA as the preferred model without further model fit tests, as it offers a more parsimonious model for the purposes of our analysis and demonstrates the benefits of structured approaches. Consequently, we moved away from the unstructured baseline hazard assumption, opting for a structured model that captures the hazard trajectory in our data more parsimoniously, even though traditional model fit indices slightly prefer the baseline DTSA.</p> <hd id="AN0187593736-16">3.2.2. Unconditional S-DTSA</hd> <p>Table 3 presents the model results for S-DTSA with a quadratic function. We investigated the growth parameters of the structured DTSA. The mean of the growth intercept significantly differed from zero (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math> </ephtml> = −2.769, <emph>p</emph> <.05). 
This indicates that, on average, approximately 5.9% (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mfrac><mrow><mtext mathvariant="italic">exp</mtext><mrow><mo stretchy="true">(</mo><mrow><mo>−</mo><mn>2.769</mn></mrow><mo stretchy="true">)</mo></mrow></mrow><mrow><mn>1</mn><mo>+</mo><mtext mathvariant="italic">exp</mtext><mrow><mo stretchy="true">(</mo><mrow><mo>−</mo><mn>2.769</mn></mrow><mo stretchy="true">)</mo></mrow></mrow></mfrac></mrow></math> </ephtml> ) of adolescents who had not yet smoked began smoking at the age of 11. The linear (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math> </ephtml> = 0.423) and quadratic (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math> </ephtml> = −0.103) terms were both statistically significant (<emph>p</emph> <.05 for both).</p> <p>Table 3. 
Estimates of unconditional structured DTSA.</p> <p> <ephtml> <table><thead><tr><td /><td>Estimate</td><td><italic>SE</italic></td><td><italic>p</italic>-Value</td></tr></thead><tbody valign="top"><tr><td><p><graphic href="hsem_a_2432598_ilm0067.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math></p></td><td char=".">−2.769</td><td char=".">0.023</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0068.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math></p></td><td char=".">0.423</td><td char=".">0.015</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0069.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math></p></td><td char=".">−0.103</td><td char=".">0.004</td><td char=".">0.000</td></tr></tbody></table> </ephtml> </p> <p>3 <emph>Note.</emph> DTSA = discrete-time survival analysis, <emph>SE =</emph> standard error,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow><mo>,</mo></math> </ephtml> </p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow><mo>,</mo></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math> </ephtml> represent the intercept, linear term, and quadratic term, respectively.</p> <p>Interpreting these terms in the context of a quadratic function requires careful consideration. 
The linear term (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math> </ephtml> ) represents the instantaneous rate of change at the time origin, while the quadratic term (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math> </ephtml> ) shapes how this rate changes over time. Specifically, a positive linear term indicates that the hazard logit is increasing at the time origin. The quadratic term affects the curvature of this pattern: a positive quadratic term results in a concave-upward pattern, while a negative quadratic term yields a concave-downward pattern. Taken together, the estimates of the quadratic S-DTSA model suggest that the pattern of smoking initiation hazard is nonlinear. The hazard logit is −2.769 at the time origin and initially increases over time, but along a concave-downward trajectory, indicating that the increase in the hazard of smoking initiation decelerates as age increases.</p> <p>The unconditional hazard function in Figure 2 illustrates an inverted U-shaped curve (quadratic function) in risk rates of cigarette onset during adolescence, with a gradual incline from 7 to 13 and decline from 13 to 16 years.</p> <p>Graph: Figure 2. Comparison of the quadratic S-DTSA for hazard logits and probabilities of smoking initiation against those of unstructured DTSA.</p> <hd id="AN0187593736-17">3.2.3. Conditional S-DTSA</hd> <p>The key time-invariant predictors, including gender, race, and poverty ratio, were incorporated to analyze the change in hazard logits for cigarette initiation. By introducing three covariates into S-DTSA, we aimed to understand how they influenced the different growth curve parameters. The relationships between each covariate and each component of the growth curve (namely, the intercept, linear slope, and quadratic component) were quantified using regression coefficients. 
Table 4 shows the results of the conditional S-DTSA, highlighting the significant influence of each predictor on the smoking onset trajectory.</p> <p>Table 4. Estimates of conditional structured DTSA.</p> <p> <ephtml> <table><thead><tr><td /><td>Estimates</td><td><italic>SE</italic></td><td><italic>p-</italic>Value</td></tr></thead><tbody valign="top"><tr><td><p><graphic href="hsem_a_2432598_ilm0073.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math></p> on Gen (<p><graphic href="hsem_a_2432598_ilm0074.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>01</mn></mrow></msub></mrow></math></p>)</td><td char=".">−0.246</td><td char=".">0.046</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0075.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math></p> on Race (<p><graphic href="hsem_a_2432598_ilm0076.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>02</mn></mrow></msub></mrow></math></p>)</td><td char=".">−0.633</td><td char=".">0.050</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0077.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow></math></p> on Poverty (<p><graphic href="hsem_a_2432598_ilm0078.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>03</mn></mrow></msub></mrow></math></p>)</td><td char=".">0.164</td><td char=".">0.028</td><td char=".">0.000</td></tr><tr><td><p><graphic 
href="hsem_a_2432598_ilm0079.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math></p> on Gen (<p><graphic href="hsem_a_2432598_ilm0080.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>11</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">0.134</td><td char=".">0.032</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0081.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math></p> on Race (<p><graphic href="hsem_a_2432598_ilm0082.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>12</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">0.159</td><td char=".">0.034</td><td char=".">0.004</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0083.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow></math></p> on Poverty (<p><graphic href="hsem_a_2432598_ilm0084.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>13</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">−0.104</td><td char=".">0.021</td><td char=".">0.000</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0085.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math></p> on Gen (<p><graphic href="hsem_a_2432598_ilm0086.gif" content-type="Graph" /><math 
display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>21</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">−0.005</td><td char=".">0.008</td><td char=".">0.522</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0087.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math></p> on Race (<p><graphic href="hsem_a_2432598_ilm0088.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>22</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">−0.010</td><td char=".">0.009</td><td char=".">0.264</td></tr><tr><td><p><graphic href="hsem_a_2432598_ilm0089.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math></p> on Poverty (<p><graphic href="hsem_a_2432598_ilm0090.gif" content-type="Graph" /><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow xmlns=""><msub><mrow><mi>β</mi></mrow><mrow><mn>23</mn></mrow></msub><mo stretchy="false">)</mo></mrow></math></p></td><td char=".">0.013</td><td char=".">0.005</td><td char=".">0.009</td></tr></tbody></table> </ephtml> </p> <p>4 <emph>Note.</emph> DTSA = discrete-time survival analysis, <emph>SE =</emph> standard error,</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow><mo>,</mo></math> </ephtml> </p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow><mo>,</mo></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" 
xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math> </ephtml> represent the intercept, linear term, and quadratic term, respectively.</p> <p>The analysis revealed notable gender differences in smoking initiation. Boys exhibited a higher propensity to start smoking at around age 11, with the gender variable negatively influencing the initial likelihood of smoking onset; specifically, boys began smoking at a rate 0.246 logits higher than girls (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>01</mn></mrow></math> </ephtml> = −0.246). In terms of racial demographics, the non-black/Hispanic group showed a lower likelihood of initiating smoking at age 11, with a difference of 0.633 logits compared to their black/Hispanic counterparts (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>02</mn></mrow></math> </ephtml> = −0.633). Additionally, a higher poverty ratio correlated with a decreased likelihood of smoking initiation, dropping by 0.164 logits (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>03</mn></mrow></math> </ephtml> = 0.164).</p> <p>For the linear component, boys had a higher rate of change in smoking initiation, higher by 0.134 logits than girls (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>11</mn></mrow></math> </ephtml> = 0.134). For racial differences in the linear component, the rate of change for non-black/Hispanic individuals was 0.159 logits higher than for black/Hispanic individuals (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>12</mn></mrow></math> </ephtml> = 0.159). 
The poverty ratio also played a significant role; an increase in the poverty ratio corresponded with an increased rate of change in smoking initiation by 0.104 logits (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>13</mn></mrow></math> </ephtml> = −0.104; the sign of the coefficient reflects the reversed coding of the poverty ratio in the model syntax, see Appendix A2). For the quadratic component, the poverty ratio was the only significant predictor, changing the curvature of the smoking initiation trajectory by 0.013 logits (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>β</mi><mn>23</mn></mrow></math> </ephtml> = 0.013).</p> <hd id="AN0187593736-18">3.3. Reparameterization</hd> <p>To enhance the interpretability of the event change over time using growth parameters (e.g., linear and quadratic terms), we reparameterized the growth parameters using the LGCM framework. Although S-DTSA offers a parsimonious way to understand hazard trajectories, interpreting its results can be challenging. For example, a gender difference in the quadratic growth factor can only be stated as "boys have a higher quadratic term than girls," which is not an intuitive interpretation.</p> <p>The growth factor reparameterization via the LGCM framework offers a more meaningful interpretation for researchers. 
With the S-DTSA model, the original growth parameters with three covariates can be represented as follows:</p> <p>Graph</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mtable><mtr><mtd><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub><mo>=</mo><msub><mrow><mi>ν</mi></mrow><mn>0</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>01</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>1</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>02</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>2</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>03</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>3</mn></msub><mo>,</mo></mrow></mtd></mtr><mtr><mtd><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub><mo>=</mo><msub><mrow><mi>ν</mi></mrow><mn>1</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>11</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>1</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>12</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>2</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>13</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>3</mn></msub><mo>,</mo><mtext> and</mtext></mrow></mtd></mtr><mtr><mtd><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub><mo>=</mo><msub><mrow><mi>ν</mi></mrow><mn>2</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>21</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>1</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>22</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>2</mn></msub><mo>+</mo><msub><mrow><mi>β</mi></mrow><mrow><mn>23</mn></mrow></msub><msub><mrow><mi>x</mi></mrow><mn>3</mn></msub><mo>,</mo></mrow></mtd></mtr></mtable></mrow></math> </ephtml> (8)</p> <p>where</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>0</mn></msub></mrow><mo>,</mo></math> </ephtml> 
</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>1</mn></msub></mrow><mo>,</mo></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>η</mi></mrow><mn>2</mn></msub></mrow></math> </ephtml> reflect the intercept, slope, and quadratic term in S-DTSA, respectively;</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>x</mi></mrow><mn>1</mn></msub></mrow><mo>,</mo></math> </ephtml> </p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>x</mi></mrow><mn>2</mn></msub></mrow><mo>,</mo></math> </ephtml> and</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>x</mi></mrow><mn>3</mn></msub></mrow></math> </ephtml> indicate gender, race, and poverty ratio, respectively. With the transformation discussed above (see Equation 7), the maximizer (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>α</mi></mrow><mi>x</mi></msub></mrow></math> </ephtml> ) and its maximum value (</p> <p>Graph</p> <p> <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>α</mi></mrow><mi>y</mi></msub></mrow></math> </ephtml> ) can be defined, offering an interpretation of how gender, race, and poverty ratio combine to influence the timing and hazard of smoking initiation among adolescents.</p> <p>According to the model reparameterization, the hazard of smoking initiation peaks at different ages and at different levels for boys and girls from different racial backgrounds. 
For example, the hazard for boys who are not Black or Hispanic peaks at age 12.55 with a hazard logit of −2.14 (10.5%), whereas the hazard for Black or Hispanic boys peaks slightly later, at age 13.12, but with a lower hazard logit of −2.52 (7.4%). Similarly, the hazard for non-black/Hispanic girls peaks at age 13.10 with a hazard logit of −2.16 (10.3%), in contrast to black/Hispanic girls, whose hazard peaks at age 13.60 with a hazard logit of −2.47 (7.8%). Additionally, the model identifies a "maximizer" (the age at which the hazard of smoking initiation peaks) ranging from 12.03 to 13.59 years. The maximum hazard logit, indicating the highest risk of initiation, varies from −2.32 to −2.22 (9.0% to 9.8%) across levels of poverty (± 3 SD). This approach simplifies interpretation and provides deeper insights into the demographic factors shaping the likelihood and timing of smoking initiation in youth.</p> <hd id="AN0187593736-19">4. Discussion</hd> <p>This study explored the application of time structure in discrete-time survival analysis (DTSA). While DTSA has been extensively studied, its use for structuring the survival processes of events (e.g., school drop-out, smoking onset) in the social sciences has been less explored. Our study demonstrates how latent growth curve modeling (LGCM) can be employed to give the survival process a structured form, or structured DTSA (S-DTSA), providing a more integrated approach to time structure than traditional DTSA. The S-DTSA method, together with the LGCM reparameterization, offers a more succinct explanation of hazard profiles and facilitates hypothesis testing related to these profiles.</p> <p>One of the key advantages of S-DTSA is that it clearly presents trends in hazard probabilities. Furthermore, it allows for a direct comparison of the functional forms of hazard trajectories across different studies. 
For instance, our analysis paralleled the study of Edelen et al. ([<reflink idref="bib6" id="ref61">6</reflink>]) regarding the onset of cigarette smoking. Although the peak hazard differs, our findings support the notion that the onset of adolescent smoking typically follows a quadratic pattern. This suggests the importance of targeted educational and monitoring efforts for middle school students to prevent the initiation of smoking.</p> <p>Additionally, the present study places special emphasis on re-parameterizing functional forms, particularly the quadratic function. This re-parameterization using the LGCM framework aims to make these forms easier for researchers to interpret. While our focus was on the quadratic function, the re-parameterization techniques we discussed are applicable to various other functional forms, as highlighted in Preacher and Hancock ([<reflink idref="bib28" id="ref62">28</reflink>]). This approach not only enriches our understanding of discrete-time survival data but also expands the potential applications of these analytical techniques.</p> <p>The findings from the current study are subject to several limitations. First, although hazard trajectories may exhibit unobserved heterogeneity, we did not explicitly address this aspect, for simplicity and clarity (the growth factor variances were fixed to zero; see the Appendix). Integrating latent class analysis, which blends survival analysis with mixture modeling, could provide a way to incorporate this heterogeneity. Accounting for such heterogeneity is crucial in applied research contexts. For instance, Dean et al. ([<reflink idref="bib5" id="ref63">5</reflink>]) explored drug use initiation over time using a survival mixture model. In their study, latent classes were instrumental in capturing the unobserved heterogeneity in drug use initiation patterns. Similarly, incorporating heterogeneity in S-DTSA could significantly enhance our understanding of event occurrences. 
An additional aspect not addressed in our study is the significance of time-centering in S-DTSA. This factor is especially critical in S-DTSA, in contrast with conventional LGCM, which uses continuous indicators.</p> <p>Survival analysis often involves datasets with a high prevalence of zeros, reflecting that few events occur over time, and extensive missingness due to censoring. In this context, the selection of the intercept, or centering time point, becomes critical. Selecting an intercept at a time point where zeros or missingness predominate can lead to logit values that are disproportionately large in magnitude, a consequence of converting very small probabilities into logits. Therefore, choosing a suitable time point for centering is crucial not only for enhancing interpretability in the context of the research question but also for maintaining the model's statistical robustness.</p> <p>Additionally, our investigation was limited to the use of a quadratic functional form of change over time. However, in practice, hazard trajectories can assume a variety of shapes, including linear or piecewise patterns. Thus, candidate functional forms should be carefully evaluated to identify the one that best represents the hazard process in S-DTSA. Applying S-DTSA to event-history measures with different forms of hazard processes is another avenue for future research. Furthermore, our analysis here focused on time-invariant variables and assumed proportional hazards. However, S-DTSA is capable of handling time-varying covariates and time-varying effects. For example, time-varying effects can be modeled through piecewise constant functions. For more information, interested readers can refer to Puth et al. ([<reflink idref="bib29" id="ref64">29</reflink>]) and Tutz and Schmid ([<reflink idref="bib36" id="ref65">36</reflink>]). Lastly, our study assumed single, non-repeating events. 
However, S-DTSA can be extended to accommodate repeating events, such as birth, employment, or recidivism (e.g., Masyn, [<reflink idref="bib19" id="ref66">19</reflink>]). Modeling repeating hazard trajectories with S-DTSA presents another interesting avenue for future research. These represent potential extensions of S-DTSA that merit further exploration.</p> <p>To summarize, while the structured approach to DTSA and the reparameterization of functional forms provide substantial improvements in the interpretation and presentation of survival data, they introduce specific analytical challenges. Key among these are considerations around time-centering and the interpretation of reparameterized coefficients. These challenges not only underscore the complexity inherent in such analyses but also highlight opportunities for future research and methodological development. Addressing these challenges will further enhance the utility of DTSA in various research settings.</p> <hd id="AN0187593736-20">Appendix</hd> <p></p> <hd id="AN0187593736-21">A1. Mplus Syntax for Unconditional Structured Discrete-Time Survival Analysis</hd>
<p>data: file is smoke_cov.csv;</p>
<p>variable: names are id v1-v10 sex citi race dgrade mgrade povrat;</p>
<p>usevariables are v1-v10;</p>
<p>categorical are v1-v10;</p>
<p>missing = all (-99);</p>
<p>analysis: estimator = MLR;</p>
<p>model:</p>
<p>! Thresholds are constrained to zero for identification</p>
<p>[v1$1-v10$1@0];</p>
<p>! Time structure: quadratic</p>
<p>! Age: 7 8 9 10 11 12 13 14 15 16</p>
<p>I S Q| v1@-4 v2@-3 v3@-2 v4@-1 v5@0 v6@1 v7@2 v8@3 v9@4 v10@5;</p>
<p>! Growth factor means</p>
<p>[I*-2 S*0.5 Q*-0.5] (eta0 eta1 eta2);</p>
<p>! Growth variances and covariances</p>
<p>! Fixed to zero assuming no heterogeneity</p>
<p>I@0 S@0 Q@0;</p>
<p>I with S@0;</p>
<p>I with Q@0;</p>
<p>S with Q@0;</p>
<p>MODEL CONSTRAINT:</p>
<p>! Calculate the maximizer (ax) and the maximum values (ay)</p>
<p>new(ax ay);</p>
<p>ax = - eta1/(2*eta2);</p>
<p>ay = eta0 - (eta1^2/(4*eta2));</p>
<hd id="AN0187593736-22">A2. Mplus Syntax for Conditional Structured Discrete-Time Survival Analysis</hd>
<p>data: file is smoke_cov.csv;</p>
<p>variable: names are id v1-v10 sex citi race dgrade mgrade povrat;</p>
<p>usevariables are v1-v10 sex race povrat;</p>
<p>categorical are v1-v10;</p>
<p>missing = all (-99);</p>
<p>! Reverse the sign of the poverty ratio</p>
<p>define: povrat = -1 * povrat;</p>
<p>analysis: estimator = MLR;</p>
<p>model:</p>
<p>! Thresholds are constrained to zero for identification</p>
<p>[v1$1-v10$1@0];</p>
<p>! Time structure: quadratic</p>
<p>! Age: 7 8 9 10 11 12 13 14 15 16</p>
<p>I S Q| v1@-4 v2@-3 v3@-2 v4@-1 v5@0 v6@1 v7@2 v8@3 v9@4 v10@5;</p>
<p>! Growth factor means</p>
<p>[I*-2 S*0.5 Q*-0.5] (eta0 eta1 eta2);</p>
<p>! Growth variances and covariances</p>
<p>! Fixed to zero assuming no heterogeneity</p>
<p>I@0 S@0 Q@0;</p>
<p>I with S@0;</p>
<p>I with Q@0;</p>
<p>S with Q@0;</p>
<p>! Covariate effects on growth factors</p>
<p>I on sex*-0.3 race*-0.5 povrat*-0.2 (b0_1-b0_3);</p>
<p>S on sex*0.1 race*0.1 povrat*0.1 (b1_1-b1_3);</p>
<p>Q on sex*0 race*0 povrat*0 (b2_1-b2_3);</p>
<p>MODEL CONSTRAINT:</p>
<p>! Calculate the maximizer (ax) and the maximum values (ay)</p>
<p>! The difference in the maximizer and the maximum values</p>
<p>! when covariates change by 1 unit (0 to 1).</p>
<p>! This may offer a statistical significance test for covariates' effects on</p>
<p>! the maximizer and the maximum values</p>
<p>new(eta0_x1-eta0_x3 eta1_x1-eta1_x3 eta2_x1-eta2_x3</p>
<p>ax0 ay0 ax_x1-ax_x3 ay_x1-ay_x3 ax_d1-ax_d3 ay_d1-ay_d3);</p>
<p>ax0 = - eta1/(2*eta2);</p>
<p>ay0 = eta0 - (eta1^2/(4*eta2));</p>
<p>DO ($,1,3) DO (%,0,2) eta%_x$ = eta% + b%_$;</p>
<p>DO ($,1,3) ax_x$ = - eta1_x$/(2*eta2_x$);</p>
<p>DO ($,1,3) ay_x$ = eta0_x$ - (eta1_x$^2/(4*eta2_x$));</p>
<p>DO ($,1,3) ax_d$ = ax_x$ - ax0;</p>
<p>DO ($,1,3) ay_d$ = ay_x$ - ay0;</p>
<ref id="AN0187593736-23"> <title> References </title> <blist> <bibl id="bib1" idref="ref11" type="bt">1</bibl> <bibtext> Allison, P. D. (2010). Survival analysis using SAS: A practical guide. SAS Institute.</bibtext> </blist> <blist> <bibl id="bib2" idref="ref28" type="bt">2</bibl> <bibtext> Clark Goings, T., Martinez, A., Joseph, P. L., Goode, R., & Bauer, D. (2023). Parenting, peers, and alcohol use initiation among black, white, and black-white adolescents: Evidence using discrete-time survival analysis. Journal of Psychoactive Drugs, 1 – 8. https://doi.org/10.1080/02791072.2023.2297193</bibtext> </blist> <blist> <bibl id="bib3" idref="ref13" type="bt">3</bibl> <bibtext> Cox, D. R. (1972). Regression models and life‐tables. Journal of the Royal Statistical Society: Series B (Methodological), 34, 187 – 202. https://doi.org/10.1111/j.2517-6161.1972.tb00899.x</bibtext> </blist> <blist> <bibl id="bib4" idref="ref37" type="bt">4</bibl> <bibtext> Cudeck, R., & Du Toit, S. H. (2002). A version of quadratic regression with interpretable parameters. Multivariate Behavioral Research, 37, 501 – 519. https://doi.org/10.1207/S15327906MBR3704_04</bibtext> </blist> <blist> <bibl id="bib5" idref="ref41" type="bt">5</bibl> <bibtext> Dean, D. O., Cole, V., & Bauer, D. J. (2015). Delineating prototypical patterns of substance use initiations over time. Addiction (Abingdon, England), 110, 585 – 594. https://doi.org/10.1111/add.12816</bibtext> </blist> <blist> <bibl id="bib6" idref="ref6" type="bt">6</bibl> <bibtext> Edelen, M. O., Tucker, J. 
S., & Ellickson, P. L. (2007). A discrete time hazards model of smoking initiation among West Coast youth from age 5 to 23. Preventive Medicine, 44, 52 – 54. https://doi.org/10.1016/j.ypmed.2006.09.004</bibtext> </blist> <blist> <bibl id="bib7" idref="ref23" type="bt">7</bibl> <bibtext> Fairchild, A. L., Bayer, R., & Lee, J. S. (2019). The e-cigarette debate: What counts as evidence? American Journal of Public Health, 109, 1000 – 1006.</bibtext> </blist> <blist> <bibl id="bib8" idref="ref45" type="bt">8</bibl> <bibtext> Flora, D. B. (2008). Specifying piecewise latent trajectory models for longitudinal data. Structural Equation Modeling: A Multidisciplinary Journal, 15, 513 – 533. https://doi.org/10.1080/10705510802154349</bibtext> </blist> <blist> <bibl id="bib9" idref="ref4" type="bt">9</bibl> <bibtext> Graham, S. E., Willett, J. B., & Singer, J. D. (2013). Using discrete-time survival analysis to study event occurrence. In Longitudinal data analysis (pp. 329 – 371). Routledge.</bibtext> </blist> <blist> <bibtext> Grant, B. F., & Dawson, D. A. (1997). Age at onset of alcohol use and its association with DSM-IV alcohol abuse and dependence: Results from the national longitudinal alcohol epidemiologic survey. Journal of Substance Abuse, 9, 103 – 110. https://doi.org/10.1016/s0899-3289(97)90009-2</bibtext> </blist> <blist> <bibtext> Hingson, R. W., Heeren, T., & Winter, M. R. (2006). Age at drinking onset and alcohol dependence: Age at onset, duration, and severity. Archives of Pediatrics & Adolescent Medicine, 160, 739 – 746. https://doi.org/10.1001/archpedi.160.7.739</bibtext> </blist> <blist> <bibtext> Keiley, M. K., Martin, N. C., Canino, J., Singer, J., & Willett, J. (2007). Discrete-time survival analysis: Predicting whether, and if so when, an event occurs. Handbook of Longitudinal Research: Design, Measurement, and Analysis, 441 – 463.</bibtext> </blist> <blist> <bibtext> Kim, S. (2014). 
The comparison of discrete and continuous survival analysis [Doctoral dissertation]. Virginia Polytechnic Institute and State University.</bibtext> </blist> <blist> <bibtext> Kleinbaum, D. G., & Klein, M. (1996). Survival analysis: A self-learning text. Springer.</bibtext> </blist> <blist> <bibtext> Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). The Guilford Press.</bibtext> </blist> <blist> <bibtext> Lee, T. K., Wickrama, K. K. A. S., & O'Neal, C. W. (2018). Application of latent growth curve analysis with categorical responses in social behavioral research. Structural Equation Modeling: A Multidisciplinary Journal, 25, 294 – 306. https://doi.org/10.1080/10705511.2017.1375858</bibtext> </blist> <blist> <bibtext> Marcoulides, K. M. (2018). Automated latent growth curve model fitting: A segmentation and knot selection approach. Structural Equation Modeling: A Multidisciplinary Journal, 25, 687 – 699. https://doi.org/10.1080/10705511.2018.1424548</bibtext> </blist> <blist> <bibtext> Masyn, K. E. (2003). Discrete-time survival mixture analysis for single and recurrent events using latent variables. University of California.</bibtext> </blist> <blist> <bibtext> Masyn, K. E. (2009). Discrete-time survival factor mixture analysis for low-frequency recurrent event histories. Research in Human Development, 6, 165 – 194. https://doi.org/10.1080/15427600902911270</bibtext> </blist> <blist> <bibtext> Masyn, K. E. (2014). Discrete-time survival analysis in prevention science. In Z. Sloboda & H. Petras (Eds.), Defining prevention science. Advances in Prevention Science. Springer. https://doi.org/10.1007/978-1-4899-7424-2_22</bibtext> </blist> <blist> <bibtext> Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, 107 – 122. https://doi.org/10.1007/BF02294746</bibtext> </blist> <blist> <bibtext> Morean, M. E., Corbin, W. R., & Fromme, K. (2012). 
Age of first use and delay to first intoxication in relation to trajectories of heavy drinking and alcohol‐related problems during emerging adulthood. Alcoholism, Clinical and Experimental Research, 36, 1991 – 1999. https://doi.org/10.1111/j.1530-0277.2012.01812.x</bibtext> </blist> <blist> <bibtext> Muthén, B., & Masyn, K. (2005). Discrete-time survival mixture analysis. Journal of Educational and Behavioral Statistics, 30, 27 – 58. https://doi.org/10.3102/10769986030001027</bibtext> </blist> <blist> <bibtext> Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user's guide (8th ed.). Muthén & Muthén.</bibtext> </blist> <blist> <bibtext> Pratschke, J., Haase, T., Comber, H., Sharp, L., de Camargo Cancela, M., & Johnson, H. (2016). Mechanisms and mediation in survival analysis: Towards an integrated analytical framework. BMC Medical Research Methodology, 16, 27. https://doi.org/10.1186/s12874-016-0130-6</bibtext> </blist> <blist> <bibtext> Preacher, K. J. (2018). Latent growth curve models. In The reviewer's guide to quantitative methods in the social sciences (pp. 178 – 192). Routledge.</bibtext> </blist> <blist> <bibtext> Preacher, K. J., & Hancock, G. R. (2012). On interpretable reparameterizations of linear and nonlinear latent growth curve models. In J. R. Harring & G. R. Hancock (Eds.), Advances in longitudinal methods in the social and behavioral sciences (pp. 25 – 58). IAP Information Age Publishing.</bibtext> </blist> <blist> <bibtext> Preacher, K. J., & Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20, 84 – 101. https://doi.org/10.1037/met0000028</bibtext> </blist> <blist> <bibtext> Puth, M.-T., Tutz, G., Heim, N., Münster, E., Schmid, M., & Berger, M. (2020). Tree-based modeling of time-varying coefficients in discrete time-to-event models. Lifetime Data Analysis, 26, 545 – 572. 
https://doi.org/10.1007/s10985-019-09489-7</bibtext> </blist> <blist> <bibtext> Raykov, T., Gorelick, P. B., Zajacova, A., & Marcoulides, G. A. (2018). Discrete time survival analysis via latent variable modeling: A note on lagged depression links to stroke in middle and late life. Structural Equation Modeling: A Multidisciplinary Journal, 25, 115 – 120. https://doi.org/10.1080/10705511.2017.1327817</bibtext> </blist> <blist> <bibtext> Schmid, M., & Berger, M. (2021). Competing risks analysis for discrete time‐to‐event data. Wiley Interdisciplinary Reviews: Computational Statistics, 13, e1529.</bibtext> </blist> <blist> <bibtext> Singer, J. D., & Willett, J. B. (1993). It's about time: Using discrete-time survival analysis to study duration and the timing of events. Journal of Educational Statistics, 18, 155 – 195. https://doi.org/10.2307/1165085</bibtext> </blist> <blist> <bibtext> Singer, J. D., & Willett, J. B. (2003). Extending the discrete-time hazard model. In Applied longitudinal data analysis: Modeling change and event occurrence (pp. 407 – 467). Oxford University Press.</bibtext> </blist> <blist> <bibtext> Suresh, K., Severn, C., & Ghosh, D. (2022). Survival prediction models: An introduction to discrete-time modeling. BMC Medical Research Methodology, 22, 207.</bibtext> </blist> <blist> <bibtext> Sterba, S. K. (2014). Fitting nonlinear latent growth curve models with individually varying time points. Structural Equation Modeling: A Multidisciplinary Journal, 21, 630 – 647. https://doi.org/10.1080/10705511.2014.919828</bibtext> </blist> <blist> <bibtext> Tutz, G., & Schmid, M. (2016). Modeling discrete time-to-event data. Springer.</bibtext> </blist> <blist> <bibtext> Vermunt, J. K., & Moors, G. (2009). Event history analysis. Handbook of Quantitative Methods in Psychology, 658 – 674.</bibtext> </blist> <blist> <bibtext> Willett, J. B., & Singer, J. D. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. 
Oxford University Press.</bibtext> </blist> </ref> <ref id="AN0187593736-24"> <title> Footnotes </title> <blist> <bibtext> According to Google Scholar, the first has been cited over 1,200 times and the second over 14,000 times.</bibtext> </blist> <blist> <bibtext> In continuous-time survival analysis, the function <emph>h<subs>j</subs></emph> is known as the "hazard function". This function represents the hazard rate, or instantaneous event rate. Specifically, <emph>h<subs>j</subs>Δj</emph> approximates the probability that an individual who has not experienced the event by time j will experience it in the next instant, assuming time is continuous. It's crucial to understand that in continuous-time survival analysis, the hazard rate is not a probability in the traditional sense (for a detailed explanation, see Masyn, [18], p. 30). This contrasts with discrete-time survival analysis, where the hazard function is equivalent to the hazard probability.</bibtext> </blist> <blist> <bibtext> This specification is based on a latent variable modeling framework, which is equivalent to the logistic regression survival model (Cox, [3]; Singer & Willett, [32]) in its most basic form (Muthén & Masyn, [23]). One key difference between these approaches is the data format: the logistic approach uses long-format data, while the latent variable modeling approach utilizes wide-format data. 
This study adopts the latent variable modeling approach because it allows for the easier modeling of complex survival models (e.g., Pratschke et al., [25]).</bibtext> </blist> </ref> <aug> <p>By Sooyong Lee; Kahyun Lee and Kejin Lee</p> </aug>
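<p>As a supplementary illustration, the reparameterization used in the MODEL CONSTRAINT sections of the Appendix (ax = −η1/(2η2), ay = η0 − η1²/(4η2)) and the conversion from hazard logits to hazard probabilities can be sketched in a few lines of Python. This is only a minimal sketch and not part of the authors' Mplus workflow: the function names are hypothetical, the growth factor means passed to the function are made-up values rather than the article's estimates, and the mapping of time score 0 to age 11 is an assumption read off the time scores in the Appendix syntax (v5@0).</p>

```python
import math
from typing import Tuple


def logit_to_probability(logit: float) -> float:
    """Convert a hazard logit to a hazard probability (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-logit))


def quadratic_peak(eta0: float, eta1: float, eta2: float) -> Tuple[float, float]:
    """Return (ax, ay) for the curve eta0 + eta1*t + eta2*t**2.

    ax = -eta1 / (2 * eta2) is the time score at the vertex and
    ay = eta0 - eta1**2 / (4 * eta2) is the logit at the vertex.
    The vertex is a maximum only when eta2 < 0.
    """
    if eta2 == 0:
        raise ValueError("eta2 must be nonzero for a quadratic vertex")
    ax = -eta1 / (2.0 * eta2)
    ay = eta0 - eta1 ** 2 / (4.0 * eta2)
    return ax, ay


# Hypothetical growth factor means, NOT the article's estimates.
ax, ay = quadratic_peak(eta0=-2.5, eta1=0.4, eta2=-0.1)

# Assumption: time score 0 corresponds to age 11 (v5@0 in the Appendix syntax).
CENTER_AGE = 11.0
peak_age = CENTER_AGE + ax

print(f"peak age = {peak_age:.2f}, peak logit = {ay:.2f}, "
      f"peak hazard = {logit_to_probability(ay):.3f}")

# Hazard logits reported in the text, converted to percentages.
for logit in (-2.14, -2.52, -2.16, -2.47):
    print(f"logit {logit:.2f} -> {100 * logit_to_probability(logit):.1f}%")
```

<p>The last loop converts the hazard logits reported in Section 3.3 to percentages, which gives a quick check on the values quoted in parentheses there.</p>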
Header DbId: eric
DbLabel: ERIC
An: EJ1501485
PubType: Academic Journal
PubTypeId: academicJournal
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Discrete-Time Survival Analysis Incorporating Time Structure in Developmental Research
– Name: Language
  Label: Language
  Group: Lang
  Data: English
– Name: Author
  Label: Authors
  Group: Au
  Data: Sooyong Lee (ORCID 0000-0002-7964-4508), Kahyun Lee (ORCID 0000-0002-5497-8114), Kejin Lee (ORCID 0000-0001-6775-282X)
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Structural Equation Modeling: A Multidisciplinary Journal. 2025 32(5):929-940.
– Name: Avail
  Label: Availability
  Group: Avail
  Data: Routledge. Available from: Taylor & Francis, Ltd. 530 Walnut Street Suite 850, Philadelphia, PA 19106. Tel: 800-354-1420; Tel: 215-625-8900; Fax: 215-207-0050; Web site: http://www.tandf.co.uk/journals
– Name: PeerReviewed
  Label: Peer Reviewed
  Group: SrcInfo
  Data: Y
– Name: Pages
  Label: Page Count
  Group: Src
  Data: 12
– Name: DatePubCY
  Label: Publication Date
  Group: Date
  Data: 2025
– Name: TypeDocument
  Label: Document Type
  Group: TypDoc
  Data: Journal Articles; Reports - Research
– Name: Subject
  Label: Descriptors
  Group: Su
  Data: Statistical Analysis, Time, Smoking, Longitudinal Studies, National Surveys, Age, Social Science Research, Behavioral Science Research
– Name: SubjectThesaurus
  Label: Assessment and Survey Identifiers
  Group: Su
  Data: National Longitudinal Survey of Youth
– Name: DOI
  Label: DOI
  Group: ID
  Data: 10.1080/10705511.2024.2432598
– Name: ISSN
  Label: ISSN
  Group: ISSN
  Data: 1070-5511; 1532-8007
– Name: Abstract
  Label: Abstract
  Group: Ab
  Data: Discrete-time survival analysis (DTSA) is a method widely used by social and behavioral researchers as it aids in the exploration of patterns in time-to-event measures. However, the traditional DTSA models often fail to adequately represent the structured dynamics of hazardous processes. This study introduces structural DTSA, an alternative approach that extends traditional DTSA by incorporating functional forms of hazard changes. With structural DTSA, a reparameterization of the functional forms is also possible for more meaningful interpretations of the results of time-to-event data analyses. This study aims to provide a detailed tutorial on structured DTSA, demonstrating its applicability in social and behavioral research. Henceforth, we demonstrate the application of structured DTSA using data on smoking initiation from the National Longitudinal Study of Youth 1997 (NLSY97). These findings highlight the potential of structured DTSA for various developmental studies.
– Name: AbstractInfo
  Label: Abstractor
  Group: Ab
  Data: As Provided
– Name: DateEntry
  Label: Entry Date
  Group: Date
  Data: 2026
– Name: AN
  Label: Accession Number
  Group: ID
  Data: EJ1501485
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1501485
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1080/10705511.2024.2432598
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 12
        StartPage: 929
    Subjects:
      – SubjectFull: Statistical Analysis
        Type: general
      – SubjectFull: Time
        Type: general
      – SubjectFull: Smoking
        Type: general
      – SubjectFull: Longitudinal Studies
        Type: general
      – SubjectFull: National Surveys
        Type: general
      – SubjectFull: Age
        Type: general
      – SubjectFull: Social Science Research
        Type: general
      – SubjectFull: Behavioral Science Research
        Type: general
      – SubjectFull: National Longitudinal Survey of Youth
        Type: general
    Titles:
      – TitleFull: Discrete-Time Survival Analysis Incorporating Time Structure in Developmental Research
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Sooyong Lee
      – PersonEntity:
          Name:
            NameFull: Kahyun Lee
      – PersonEntity:
          Name:
            NameFull: Kejin Lee
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 1070-5511
            – Type: issn-electronic
              Value: 1532-8007
          Numbering:
            – Type: volume
              Value: 32
            – Type: issue
              Value: 5
          Titles:
            – TitleFull: Structural Equation Modeling: A Multidisciplinary Journal
              Type: main