Sequential Reservoir Computing for Log File-Based Behavior Process Data Analyses

Bibliographic Details
Title: Sequential Reservoir Computing for Log File-Based Behavior Process Data Analyses
Language: English
Authors: Jiawei Xiong (ORCID 0000-0002-2069-8720), Shiyu Wang, Cheng Tang, Qidi Liu (ORCID 0000-0002-6797-4163), Rufei Sheng, Bowen Wang, Huan Kuang (ORCID 0000-0003-2651-2867), Allan S. Cohen, Xinhui Xiong
Source: Journal of Educational Measurement. 2026 63(1).
Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Peer Reviewed: Y
Page Count: 41
Publication Date: 2026
Document Type: Journal Articles
Reports - Research
Descriptors: Data Use, Data Analysis, Computation, Computer Assisted Testing, Response Style (Tests), Algorithms, Sequential Approach, Data Collection
DOI: 10.1111/jedm.12413
ISSN: 0022-0655
1745-3984
Abstract: The use of process data in assessment has gained attention in recent years as more assessments are administered by computers. Process data, recorded in computer log files, capture the sequence of examinees' response activities, for example, timestamped keystrokes, during the assessment. Traditional measurement methods are often inadequate for handling this type of data. In this paper, we proposed a sequential reservoir method (SRM) based on a reservoir computing model using the echo state network, with the particle swarm optimization and singular value decomposition as optimization. Designed to regularize features from process data through a computational self-learning algorithm, this method has been evaluated using both simulated and empirical data. Simulation results suggested that, on one hand, the model effectively transforms action sequences into standardized and meaningful features, and on the other hand, these features are instrumental in categorizing latent behavioral groups and predicting latent information. Empirical results further indicate that SRM can predict assessment efficiency. The features extracted by SRM have been verified as related to action sequence lengths through the correlation analysis. This proposed method enhances the extraction and accessibility of meaningful information from process data, presenting an alternative to existing process data technologies.
Abstractor: As Provided
Entry Date: 2026
Accession Number: EJ1501512
Database: ERIC
FullText Links:
  – Type: pdflink
    Url: https://content.ebscohost.com/cds/retrieve?content=AQICAHj0k_4E0hTGH8RJwT4gCJyBsGNe_WN95AvKlDbXJGqwxwFnZbcc-zl38Sq0MHfkMc_1AAAA4zCB4AYJKoZIhvcNAQcGoIHSMIHPAgEAMIHJBgkqhkiG9w0BBwEwHgYJYIZIAWUDBAEuMBEEDD_PhyeYPemgYCreqAIBEICBmxqtZ3Xf7_g8rJz-Oq8H8cqfrOHSQ26HWPvQ0v_MDQxTtMLrK7XofSO6grhfu4NqaNp1hSO9Xg2EFRT2_XyICHpp1DrDpZb7hQx0tmgCZXwksKTt5CHlOSBwXFYMudFIJxOEbI4ZDgWKTAAIfGwT-_j48XqWAZ20trBVN0jJEFWD4jrbAlIzszyxYxJ-vZFkWJRXvWmHr2izWjDw
Text:
  Availability: 1
  Value: <anid>AN0192629994;mea01mar.26;2026Apr01.06:22;v2.2.500</anid> <title id="AN0192629994-1">Sequential Reservoir Computing for Log File‐Based Behavior Process Data Analyses</title> <sbt id="AN0192629994-2">Introduction</sbt> <p>Process data capture examinees' activities during the assessment. These activities offer a valuable source of information for improving assessment design, quality, and validity; serving as evidence of construct validity; and examining group differences and fairness (Ercikan et al., [<reflink idref="bib6" id="ref1">6</reflink>]). 
Computer‐based assessments facilitate the collection of these data because such environments typically provide a user‐friendly interface, making it easy to capture and store the activities involved in the response process in computer log files. The data generated during the assessment process typically consist of sequences of actions with corresponding time points. These sequential time‐stamped data are usually referred to as process data (Tang et al., [<reflink idref="bib26" id="ref2">26</reflink>]). Figure 1 illustrates a general format of process data, showing examinees' action sequences over time. Examinees may employ diverse answering paths when responding to the same item over time, leading to variations in their behaviors and time allocation for that particular item. Figure 1 also illustrates the varied lengths of response data and the diverse array of activities captured within the log files generated by examinees' responses. For example, an examinee could select response "<emph>a</emph>" at one time point and then delete that response with the action "<emph>del</emph>" at the subsequent time point.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0001.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0001.jpg" title="1 An illustration of the time‐wrapped process data. It showcases five different examinees each employing different behavioral sequences over different time line." /> </p> <p>A major objective of assessments is to provide information about the status of examinees in relation to the construct of interest. 
By analyzing process data, we can potentially make use of additional information that reveals patterns that could otherwise be missed, helping us identify behavioral patterns and understand problem‐solving skills (Feng & Cai, [<reflink idref="bib8" id="ref3">8</reflink>]; He et al., [<reflink idref="bib12" id="ref4">12</reflink>]; Kuang & Sahin, [<reflink idref="bib17" id="ref5">17</reflink>]; Wang et al., [<reflink idref="bib29" id="ref6">29</reflink>]). This type of information is associated with examinees' cognitive activities and may be useful for understanding the latent abilities that are the focus of the assessment; it is therefore potentially reflective of examinees' reasoning processes (Han & Wilson, [<reflink idref="bib11" id="ref7">11</reflink>]; Xiong et al., [<reflink idref="bib32" id="ref8">32</reflink>]). Recent research has directed attention toward this overarching theme of response process data (Zhang et al., [<reflink idref="bib34" id="ref9">34</reflink>]). The underlying assumption in these studies is that process data provide valuable insights into the thinking and reasoning that drive examinees' responses, thereby enriching the information obtained from assessment scores alone (Ercikan & Pellegrino, [<reflink idref="bib7" id="ref10">7</reflink>]).</p> <p>Analyzing complex and diverse process data, as depicted in Figure 1, poses a challenge when employing data‐driven methods due to the variable length of categorical response sequences on each item. Unlike item response data, the length of these sequences depends on the number of actions taken by the examinee. Several scholars have proposed innovative methodologies to analyze and model process data effectively. Mislevy ([<reflink idref="bib21" id="ref11">21</reflink>]) introduced an analytical approach focusing on characterizing evidence in process data, that is, extracting features that contain meaningful information from the process data. 
Although feature extraction may suffer from a lack of meaningful interpretation, regardless of what machine learning algorithms are used, features can still be extracted using techniques like data mining, knowledge engineering, and computational linguistics (Bejar et al., [<reflink idref="bib2" id="ref12">2</reflink>]). Similarly, Xu et al. ([<reflink idref="bib33" id="ref13">33</reflink>]) presented a latent topic model with a Markov transition process, leveraging topic transition probabilities and response times to capture examinees' learning strategies. Tang et al. ([<reflink idref="bib27" id="ref14">27</reflink>]) proposed a sequence‐to‐sequence autoencoder utilizing recurrent neural networks (RNNs; Medsker & Jain, [<reflink idref="bib20" id="ref15">20</reflink>]) to automatically extract numerical features from log files, thereby eliminating the need for manual feature engineering. RNNs are a class of neural networks with hidden layers that allow previous outputs to be used as inputs at the next step, which makes them suitable for processing sequential data (Medsker & Jain, [<reflink idref="bib20" id="ref16">20</reflink>]). Among these methods, RNNs offer clear advantages over other statistical approaches. They obviate the need for explicit encoding of domain knowledge, rendering them more adaptable across diverse domains. Unlike traditional methods reliant on prior knowledge about items in assessments, RNNs operate independently of such information, thereby enhancing their versatility. Moreover, RNNs excel in capturing intricate representations of examinees' response actions, facilitating a deeper understanding of underlying patterns and behaviors (Goodfellow et al., [<reflink idref="bib9" id="ref17">9</reflink>]).</p> <p>RNNs are versatile in handling sequential data, but their training is challenging and computationally expensive due to complex parameter updates (Pascanu et al., [<reflink idref="bib23" id="ref18">23</reflink>]). 
To address this, reservoir computing (RC), derived from RNNs, has emerged. It learns data representations through a fixed, nonlinear reservoir system (Jaeger, [<reflink idref="bib14" id="ref19">14</reflink>]). Although both RNNs and RC process sequential data, they differ in structure, training, and efficiency. RNNs consist of neuron‐like units organized into layers with cyclic connections, enabling them to retain memory of past inputs. However, training RNNs involves optimizing recurrent connections, which is computationally intensive and prone to issues like vanishing gradients (Pascanu et al., [<reflink idref="bib23" id="ref20">23</reflink>]). In contrast, RC comprises an input layer, a randomly connected reservoir, and an output layer. Only the weights connecting the reservoir to the output are trained, simplifying the process and reducing computational costs (Lukoševičius, [<reflink idref="bib19" id="ref21">19</reflink>]). RC's lower resource requirements make it more accessible and efficient, and it still performs comparably to traditional RNNs in some situations (Bompas et al., [<reflink idref="bib4" id="ref22">4</reflink>]). 
Table 1 summarizes the comparison between RNNs and RC in terms of methodology and other aspects.</p> <p>Table 1. Comparison between Recurrent Neural Network and Reservoir Computing</p> <p> <ephtml> <table><thead><tr valign="bottom"><th /><th align="center">Recurrent Neural Network (RNN)</th><th align="center">Reservoir Computing (RC)</th></tr></thead><tbody><tr><td>Methodology</td><td>Networks with loops in them, allowing information to persist</td><td>Uses a fixed, random reservoir to project input into a higher‐dimensional space</td></tr><tr><td>Parameter training</td><td>Requires training through backpropagation through time, which can be complex and computationally expensive</td><td>Only the output weights are trained, significantly simplifying the training process</td></tr><tr><td>Applicability</td><td>Broad applicability in sequence prediction, natural language processing, and more</td><td>Often used in time‐series prediction, signal processing, and situations where training data is limited</td></tr><tr><td>Pros</td><td>Flexibility in learning and adapting to various sequential data patterns</td><td>Lower computational cost, easier to implement, and less prone to overfitting</td></tr><tr><td>Cons</td><td>Computationally intensive due to the complexities of training and parameter updates, prone to overfitting, and can suffer from gradient issues</td><td>Less flexible in learning transformations, dependent on reservoir quality, and might require more samples to perform well</td></tr></tbody></table> </ephtml> </p> <p>RC has found application in various contexts for extracting features from sequential data (Bianchi et al., [<reflink idref="bib3" id="ref23">3</reflink>]; Wyffels & Schrauwen, [<reflink idref="bib30" id="ref24">30</reflink>]; Xiong, [<reflink idref="bib31" id="ref25">31</reflink>]). Typically, these applications aim to predict outcomes or detect anomalies, prioritizing efficiency, error detection, anomaly detection, or trend identification. 
However, when applied to educational process data, RC faces unique challenges. Traditional RC methods often employ a target variable to guide the training process within a supervised framework. This approach is grounded in the assumption that the data's informative content is primarily determined by this target variable. Educational process data, however, differ from other forms of sequential data due to the complex cognitive processes, learning dynamics, and educational objectives involved. Understanding learning behaviors, cognitive challenges, and instructional effectiveness becomes paramount in this context. Consequently, patterns extracted from educational process data may be subtler and influenced by pedagogical factors, individual differences, and learning progressions. Furthermore, as outlined in Table 1, while the performance of RC is contingent upon the quality of the reservoir, there remains a notable gap in the literature regarding the optimization of reservoirs in RC applications. To address these challenges, this study proposes a novel method, the Sequence Reservoir Method (SRM), grounded in RC theory and optimization algorithms. SRM overcomes the traditional reliance on a target variable in RC by treating the educational process data as a vector space. This approach allows SRM to discover lower‐dimensional representations (i.e., features) that encapsulate the inherent structure and interconnections among actions. In this way, SRM can detect the subtle trajectories of assessment behaviors that are characteristic of educational process data. These trajectories are shaped by cognitive challenges and pedagogical strategies, and they evolve as learning progresses. 
The multidimensional features extracted by SRM reflect a richer understanding of these dynamics, offering a more nuanced view than could be gleaned from the outcome‐focused perspectives of conventional RC.</p> <p>Through simulations and empirical studies, we demonstrate SRM's versatility across various assessment environments and subject areas where process data are collected. Its potential applications include detailed analyses of problem‐solving assessments, interpretation of student performance and engagement patterns in Massive Open Online Courses (MOOCs), and implementation in adaptive learning scenarios. SRM is expected to enrich the existing body of literature on process data and facilitate a deeper understanding of examinees' behaviors, significant user interactions, and comprehension within educational settings. By leveraging SRM, researchers and educators can gain richer insights into the multifaceted nature of learning and interaction, thereby enhancing educational practices and learner outcomes.</p> <hd id="AN0192629994-4">The Structure of the Sequence Reservoir Method</hd> <p>In analyzing nonuniform, varying‐length action sequences from log files, the extraction of a feature matrix (where rows represent examinees and columns represent distinct features) is a crucial step. In this section, we introduce our proposed Sequence Reservoir Method (SRM) for extracting this feature matrix from examinees' process data. The SRM operates on sequential process data through a series of steps. Initially, categorical action sequences are transformed into numerical representations, referred to as embedded sequences. These embedded sequences are then input into a fundamental component of SRM: an Echo State Network (ESN; Jaeger, [<reflink idref="bib14" id="ref26">14</reflink>]). Our major contribution is to reform the ESN framework by proposing an optimization algorithm that learns the reservoir weight matrix through a self‐learning framework. 
As summarized in Table 1, this reservoir weight matrix is a critical element in ESN, and it can learn information from the input embedded sequences. This optimization algorithm combines particle swarm optimization (PSO; Kennedy & Eberhart, [<reflink idref="bib15" id="ref27">15</reflink>]) with singular value decomposition (SVD; Wall et al., [<reflink idref="bib28" id="ref28">28</reflink>]) to construct the optimal reservoir weight matrix. Once this matrix is constructed, SRM can extract the feature matrix, facilitating subsequent statistical analysis on the final output layer. In the subsequent sections, we provide a comprehensive explanation of each component of SRM, with a specific emphasis on the ESN and its integration with PSO and SVD techniques.</p> <hd id="AN0192629994-5">Action Embedding</hd> <p>Categorical action sequences are converted into numerical representations, known as embedded action sequences, as direct modeling of categorical data is not feasible. For the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>z</mi><annotation encoding="application/x-tex">$z$</annotation></semantics></math> </ephtml> th item, suppose there exist a total of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> possible unique actions, which are denoted as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">S</mi><mspace width="0.33em" /></mrow><mo>=</mo><mo>(</mo><mrow><msubsup><mi>s</mi><mn>1</mn><mi>z</mi></msubsup><mo>,</mo><mtext>...</mtext><mo>,</mo><msubsup><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub><mi>z</mi></msubsup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">${\bm{S\ }} = ({s_1^z,\ldots,s_{{{n}_z}}^z})$</annotation></semantics></math> </ephtml> . 
The objective of embedding is to first map each of the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> unique categorical actions to an <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>u</mi></msub><annotation encoding="application/x-tex">${{N}_u}$</annotation></semantics></math> </ephtml> dimensional vector <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi mathvariant="bold-italic">a</mi><annotation encoding="application/x-tex">${\bm{a}}$</annotation></semantics></math> </ephtml> . <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>u</mi></msub><annotation encoding="application/x-tex">${{N}_u}$</annotation></semantics></math> </ephtml> is called the embedding size and is learned by the model from the data. Within each item, the embedded action sequences therefore consist of different arrangements of these <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> vectors, reflecting the sequence of actions taken by each examinee. Examinees' actions in an assessment are not independent; rather, they are part of a behavioral sequence that can reflect the examinee's thought process, strategy, and even areas of struggle. Therefore, to account for the context of examinees' actions, the embedding process uses embeddings learned from the data, specifically through model‐based learning. 
That is, before the SRM optimization process, these embeddings are learned independently through a model‐based learning layer, which uses a model that captures the relationships between different actions based on their context within the data (Kim et al., [<reflink idref="bib16" id="ref29">16</reflink>]).</p> <hd id="AN0192629994-6">Echo State Networks</hd> <p>ESNs comprise an RNN‐based reservoir, making them suitable for processing sequential data. ESNs are commonly used in RC and require less learning time to converge while achieving good model accuracy (Chouikhi et al., [<reflink idref="bib5" id="ref30">5</reflink>]). Figure 2 illustrates the structure of an ESN; the color‐filled bubbles represent neurons within the neural network. An ESN consists of three components: an input layer for reading in sequential data; a random sparse recurrent hidden layer, called a reservoir, which consists of an untrained RNN that functions as a temporal kernel by mapping the input into a high‐dimensional feature space; and an output layer trained on the high‐dimensional features produced by the reservoir. The primary roles of the reservoir in ESNs are to nonlinearly transform the sequential inputs to a high‐dimensional space and then store information in the features. 
There are three weight matrices in ESN: an input weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>i</mi><mi>n</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{in}}$</annotation></semantics></math> </ephtml> that weights the input data to the reservoir, a reservoir weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> that weights the data from the input layer, and an output weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>o</mi><mi>u</mi><mi>t</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{out}}$</annotation></semantics></math> </ephtml> that is learned under the guidance of the output layer variables.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0002.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0002.jpg" title="2 The structure of an echo state network (ESN). An ESN has three components: an input layer, a random sparse recurrent hidden layer called a reservoir, and an output layer." /> </p> <p>Among these key model parameters, the input weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>i</mi><mi>n</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{in}}$</annotation></semantics></math> </ephtml> is initialized but not trained. 
The weight matrix of the recurrent connections within the reservoir <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> is initialized and then optimized in a later stage through the use of PSO and SVD. In the context of ESN, obtaining the weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> is usually regarded as optimization rather than training, due to the difference between the conventional definition of training in neural networks and the unique approach taken by ESN (Lukoševičius, [<reflink idref="bib19" id="ref31">19</reflink>]). In most neural networks, the training process involves adjusting all the weights of the network (including input, hidden, and output layers) through a process like backpropagation, based on minimizing the error between the network output and the desired output. However, ESN only adjusts the parameters of the reservoir through a fitness function to ensure that the reservoir's dynamics are suitable for calculating the features; it does not directly adjust the input weight matrix or the output weight matrix through backpropagation on the training data's input‐output relationship. 
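</p> <p>To make the roles of the three weight matrices concrete, the following Python sketch runs a small ESN on toy data. It is an illustration under stated assumptions, not the implementation used in this study: it assumes the commonly used tanh state update without leakage, takes the final reservoir state of each sequence as its feature vector, and fits the output weights by ridge regression. The function names <emph>reservoir_features</emph> and <emph>fit_readout</emph>, the sizes, the random reservoir, and the toy target are all hypothetical.</p>

```python
import numpy as np

def reservoir_features(sequences, w_in, w_res):
    """Run each embedded sequence through a fixed reservoir; keep the final state."""
    features = []
    for seq in sequences:                      # seq has shape (T, N_u); T may differ
        x = np.zeros(w_res.shape[0])           # reservoir state starts at zero
        for u in seq:
            x = np.tanh(w_in @ u + w_res @ x)  # assumed standard ESN update
        features.append(x)
    return np.vstack(features)                 # shape (n_sequences, N_x)

def fit_readout(features, targets, ridge=1e-2):
    """Train only W_out by ridge regression; W_in and W_res stay untouched."""
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)

rng = np.random.default_rng(0)
n_u, n_x = 4, 30                               # hypothetical embedding and reservoir sizes
w_in = rng.uniform(-0.5, 0.5, size=(n_x, n_u))
w_res = rng.standard_normal((n_x, n_x))
w_res *= 0.9 / np.linalg.norm(w_res, 2)        # rescale: largest singular value = 0.9

# Toy embedded sequences of different lengths and a toy numeric target.
lengths = [5, 12, 8, 20, 3, 9]
sequences = [rng.standard_normal((t, n_u)) for t in lengths]
targets = np.array(lengths, dtype=float)

features = reservoir_features(sequences, w_in, w_res)
w_out = fit_readout(features, targets)
predictions = features @ w_out
```

<p>In this sketch, sequences of different lengths all map to feature vectors of the same size, and only the readout is fitted to the target; no gradient is propagated through the input or reservoir weights.</p> <p>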
The only requirement of ESN is the echo state property, which requires that the maximal absolute eigenvalue of the reservoir matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> be less than 1 (Lukoševičius, [<reflink idref="bib19" id="ref32">19</reflink>]). Once the reservoir weight matrix has been optimized, the embedded action sequences can be processed through the reservoir and the final feature matrix can be obtained. The output layer can be any target variable defined by the task at hand, such as students' scores, latent traits, or group labels. Given the final feature matrix, the output weight matrix, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>o</mi><mi>u</mi><mi>t</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{out}}$</annotation></semantics></math> </ephtml> , can be trained with a simple learning algorithm such as regression or classification. These optimization and training characteristics make the overall modeling process with ESN much faster and less computationally demanding than with common deep neural networks, while maintaining good performance on sequential data processing.</p> <hd id="AN0192629994-8">Particle Swarm Optimization and Singular Value Decomposition</hd> <p>In this section, we introduce our proposed optimization algorithm that combines both PSO and SVD to optimize SRM. 
As mentioned earlier, in the ESN, the initialization and optimization of the reservoir weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> occur in a subsequent phase. Of particular importance here is the optimization of the singular value spectrum of this matrix, as it significantly impacts the performance of the ESN (Strauss et al., [<reflink idref="bib25" id="ref33">25</reflink>]). Therefore, the proposed optimization algorithm is designed to construct an optimal set of singular values. This, in turn, shapes a reservoir weight matrix capable of most effectively capturing and learning latent information from the input data.</p> <p>PSO is an evolutionary learning method for optimizing parameters. Chouikhi et al. ([<reflink idref="bib5" id="ref34">5</reflink>]) used PSO in the ESN to pretrain the fixed reservoir matrix weights and then applied the weights in the ESN to process time series. Their results suggested that using PSO can enhance the learning results for ESN‐based time series forecasting with fast parameter convergence. PSO is initialized with a group of random particles and searches for optimal solutions by iteratively updating values. In this system, each particle is a single solution (i.e., a set of singular values) that moves through the search space toward a global optimum. After obtaining the best singular values, we use SVD in reverse to construct a reservoir weight matrix. SVD factorizes a matrix and generalizes the eigenvalue decomposition of a square normal matrix with an orthonormal eigenbasis to a matrix of any dimensions. 
It has been used for training neural networks such as growing ESN (Li & Li, [<reflink idref="bib18" id="ref35">18</reflink>]) due to its significantly better prediction accuracy and higher estimation performance with fewer tunable parameters and less time.</p> <hd id="AN0192629994-9">The Optimization of Model Parameters</hd> <p>In summary, the SRM establishes a framework using ESN as its data processing mechanism, while developing an optimization algorithm tailored specifically to refine ESN for the input educational process data. Figure 3 provides an illustration of the structure of SRM. For a selected reservoir size <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>x</mi></msub><annotation encoding="application/x-tex">${{N}_x}$</annotation></semantics></math> </ephtml> , in SRM, the embedded sequences are directly inputted as the input layer, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>i</mi><mi>n</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{in}}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> are initialized. The optimization containing PSO and SVD is then initialized to generate a series of particles, where each particle stands for a set of singular values (see PSO details in Appendix). 
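</p> <p>The particle search described above can be sketched in Python as follows. This is a generic global-best PSO, offered only as an illustration: the fitness function used by SRM is given in the Appendix and is not reproduced here, so <emph>toy_fitness</emph> is a placeholder that simply measures the distance to an arbitrary target spectrum. The function name <emph>pso_minimize</emph>, the inertia and acceleration constants, the swarm size, and the upper bound of 0.99 on each singular value are assumptions made for the example.</p>

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, upper=0.99, seed=0):
    """Generic global-best PSO; each particle is one candidate set of singular values."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                              # personal best positions
    best_val = np.array([fitness(p) for p in pos])     # personal best fitness values
    g_idx = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = (inertia * vel
               + c1 * r1 * (best_pos - pos)            # pull toward personal best
               + c2 * r2 * (g_pos - pos))              # pull toward global best
        pos = np.clip(pos + vel, 0.0, upper)           # keep every singular value in [0, upper]
        vals = np.array([fitness(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g_idx = int(np.argmin(best_val))
        if best_val[g_idx] < g_val:
            g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])
    return g_pos, g_val

# Placeholder fitness (hypothetical): squared distance to an arbitrary target spectrum.
target = np.linspace(0.9, 0.1, 5)

def toy_fitness(sigma):
    return float(np.sum((sigma - target) ** 2))

best_sigma, best_fitness = pso_minimize(toy_fitness, dim=5)
```

<p>Clipping the positions is one simple way to keep every candidate spectrum below 1 during the search; other constraint-handling schemes would serve equally well in a sketch like this.</p> <p>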
Given the property of SVD, the reservoir weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><msub><mi>N</mi><mi>x</mi></msub><mo>×</mo><msub><mi>N</mi><mi>x</mi></msub></mrow></msup></mrow><annotation encoding="application/x-tex">${{W}_{res}} \in {{\mathbb{R}}^{{{N}_x} \times {{N}_x}}}$</annotation></semantics></math> </ephtml> can be obtained: 1 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>U</mi><mi>S</mi><mspace width="0.33em" /><msup><mi>V</mi><mi>T</mi></msup><mo linebreak="badbreak">=</mo><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}US\ {{V}^T} = {{W}_{res}},\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>U</mi><mi>T</mi></msup><mi>U</mi><mspace width="0.33em" /><mo>=</mo><msup><mi>V</mi><mi>T</mi></msup><mspace width="0.33em" /><mi>V</mi><mspace width="0.33em" /><mo>=</mo><mspace width="0.33em" /><mi>I</mi></mrow><annotation encoding="application/x-tex">${{U}^T}U\ = {{V}^T}\ V\ = \ I$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>I</mi><annotation encoding="application/x-tex">$I$</annotation></semantics></math> </ephtml> is an identity matrix, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>U</mi><annotation encoding="application/x-tex">$U$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">$V$</annotation></semantics></math> </ephtml> are two orthogonal 
matrices. <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>S</mi><mspace width="0.33em" /><mo>=</mo><mspace width="0.33em" /><mi>d</mi><mi>i</mi><mi>a</mi><mi>g</mi><mo>(</mo><mrow><msub><mi>σ</mi><mn>1</mn></msub><mo>,</mo><mtext>...</mtext><mo>,</mo><msub><mi>σ</mi><msub><mi>N</mi><mi>x</mi></msub></msub></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$S\ = \ diag({{{\sigma }_1},\ldots,{{\sigma }_{{{N}_x}}}})$</annotation></semantics></math> </ephtml> , where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>σ</mi><annotation encoding="application/x-tex">$\sigma $</annotation></semantics></math> </ephtml> 's are singular values optimized by PSO. According to SVD, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>S</mi><annotation encoding="application/x-tex">$S$</annotation></semantics></math> </ephtml> have the same singular values. Therefore, once the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>σ</mi><annotation encoding="application/x-tex">$\sigma $</annotation></semantics></math> </ephtml> 's are optimized, the reservoir weight matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> </ephtml> constructed by Equation 1 yields the optimized reservoir. 
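</p> <p>As an illustration of Equation 1, the following minimal Python/NumPy sketch builds a reservoir matrix from a chosen set of singular values. The random orthogonal matrices drawn through a QR decomposition, the helper names random_orthogonal and build_reservoir, the toy reservoir size, and the uniformly drawn singular values standing in for one PSO particle are all assumptions made for illustration; none of them is taken from the paper.</p>

```python
import numpy as np


def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix via QR of a Gaussian matrix (assumed choice)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def build_reservoir(sigmas, U, V):
    """Equation 1: W_res = U S V^T with S = diag(sigma_1, ..., sigma_Nx)."""
    return U @ np.diag(sigmas) @ V.T


rng = np.random.default_rng(0)
n_x = 5                                    # reservoir size N_x (toy value)
U = random_orthogonal(n_x, rng)
V = random_orthogonal(n_x, rng)
sigmas = rng.uniform(0.1, 0.9, size=n_x)   # hypothetical stand-in for one PSO particle

W_res = build_reservoir(sigmas, U, V)
# Singular values of W_res, returned in descending order.
recovered = np.linalg.svd(W_res, compute_uv=False)
```

<p>Because U and V are orthogonal, the singular values recovered from the constructed matrix coincide with the values placed on the diagonal of S, so keeping every value below 1 bounds the largest singular value of the reservoir.</p> <p>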
Recall that <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>max</mi><mo>(</mo><mi>σ</mi><mo>)</mo><mo><</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">$\max (\sigma) < 1$</annotation></semantics></math> </ephtml> can maintain the echo state property.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0003.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0003.jpg" title="3 The structure of the sequence reservoir method (SRM). Categorical sequences are embedded into numerical sequences. The embedded sequences are then processed in the ESN where the optimization algorithm helps to yield the final features based on the target vector space." /> </p> <p></p> <p>For the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> th examinee, given the embedded vector <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi mathvariant="bold-italic">a</mi><annotation encoding="application/x-tex">${\bm{a}}$</annotation></semantics></math> </ephtml> which has <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>u</mi></msub><annotation encoding="application/x-tex">${{N}_u}$</annotation></semantics></math> </ephtml> ‐dimension, the reservoir size <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>x</mi></msub><annotation encoding="application/x-tex">${{N}_x}$</annotation></semantics></math> </ephtml> (i.e., the number of neurons), ESN updates the hidden state of each input as in Equations 2 and 3: 2 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mover accent="true"><mi>x</mi><mo>∼</mo></mover><mi>l</mi></msup><mrow><mspace 
width="0.33em" /></mrow><mfenced separators="" open="(" close=")"><mrow><mi>c</mi><mo>+</mo><mn>1</mn></mrow></mfenced><mo linebreak="badbreak">=</mo><mspace width="0.33em" /><mi>f</mi><mfenced separators="" open="(" close=")"><mrow><msub><mi>W</mi><mrow><mi>i</mi><mi>n</mi></mrow></msub><msup><mfenced separators="" open="(" close=")"><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mfenced separators="" open="(" close=")"><mrow><mi>c</mi><mo>+</mo><mn>1</mn></mrow></mfenced></mrow></mfenced><mi>T</mi></msup><mo>+</mo><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><msup><mi>x</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>c</mi></mfenced></mrow></mfenced><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{\tilde{x}}^l}{\bm{\ }}\left({c + 1} \right) = \ f\left({{{W}_{in}}{{{\left({{{{\bm{a}}}^l}\left({c + 1} \right)} \right)}}^T} + {{W}_{res}}{{x}^l}\left(c \right)} \right),\end{equation}$$</annotation></semantics></math> </ephtml> 3 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>x</mi><mi>l</mi></msup><mspace width="0.33em" /><mfenced separators="" open="(" close=")"><mrow><mi>c</mi><mo>+</mo><mn>1</mn></mrow></mfenced><mo linebreak="badbreak">=</mo><mfenced separators="" open="(" close=")"><mrow><mn>1</mn><mo>−</mo><mi>ρ</mi></mrow></mfenced><mspace width="0.33em" /><msup><mi>x</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>c</mi></mfenced><mo linebreak="goodbreak">+</mo><mi>ρ</mi><msup><mover accent="true"><mi>x</mi><mo>∼</mo></mover><mi>l</mi></msup><mfenced separators="" open="(" close=")"><mrow><mi>c</mi><mo>+</mo><mn>1</mn></mrow></mfenced><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{x}^l}\ \left({c + 1} \right) = \left({1 - \rho } \right)\ {{x}^l}\left(c \right) + \rho {{\tilde{x}}^l}\left({c + 1} \right),\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>c</mi><mo>)</mo></mrow><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><msub><mi>N</mi><mi>u</mi></msub></msup></mrow><annotation encoding="application/x-tex">${{{\bm{a}}}^l}(c) \in {{\mathbb{R}}^{{{N}_u}}}$</annotation></semantics></math> </ephtml> is the input embedded vector at time point <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>c</mi><annotation encoding="application/x-tex">$c$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>x</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>c</mi><mo>)</mo></mrow><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><msub><mi>N</mi><mi>x</mi></msub><mo>×</mo><msub><mi>N</mi><mi>u</mi></msub></mrow></msup></mrow><annotation encoding="application/x-tex">${{x}^l}(c) \in {{\mathbb{R}}^{{{N}_x} \times {{N}_u}}}$</annotation></semantics></math> </ephtml> is a matrix of reservoir neuron activation, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mover accent="true"><mi>x</mi><mo>∼</mo></mover><mi>l</mi></msup><mrow><mo>(</mo><mrow><mi>c</mi><mo>+</mo><mn>1</mn></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{\tilde{x}}^l}({c + 1})$</annotation></semantics></math> </ephtml> is its update, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo>(</mo><mtext>...</mtext><mo>)</mo></mrow><annotation encoding="application/x-tex">$f(\ldots)$</annotation></semantics></math> </ephtml> is the activation function, which is usually a hyperbolic function <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo form="prefix">tanh</mo><mo>(</mo><mtext>...</mtext><mo>)</mo></mrow><annotation 
encoding="application/x-tex">$\tanh (\ldots)$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>i</mi><mi>n</mi></mrow></msub><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><msub><mi>N</mi><mi>x</mi></msub><mo>×</mo><mn>1</mn></mrow></msup></mrow><annotation encoding="application/x-tex">${{W}_{in}} \in {{\mathbb{R}}^{{{N}_x} \times 1}}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><msub><mi>N</mi><mi>x</mi></msub><mo>×</mo><msub><mi>N</mi><mi>x</mi></msub></mrow></msup></mrow><annotation encoding="application/x-tex">${{W}_{res}} \in {{\mathbb{R}}^{{{N}_x} \times {{N}_x}}}$</annotation></semantics></math> </ephtml> are the input and reservoir weight matrices, respectively, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ρ</mi><mo>∈</mo><mo>(</mo><mrow><mn>0</mn><mo>,</mo><mspace width="0.33em" /><mn>1</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$\rho \in ({0,\ 1})$</annotation></semantics></math> </ephtml> , called the leaking rate, is randomly selected between 0 and 1 and then fixed throughout the algorithm.</p> <p>In the ESN, for examinee <emph>l</emph>, we denote the vertical concatenation of the feature vector and the input vector at each time point as in Equation 4. 4 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>h</mi><mi>l</mi></msup><mspace width="0.33em" /><mfenced open="(" close=")"><mi>t</mi></mfenced><mo linebreak="badbreak">=</mo><mfenced separators="" open="[" close="]"><mrow><msup><mi>x</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>t</mi></mfenced><mo>:</mo><msup><mfenced separators="" open="("
close=")"><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>t</mi></mfenced></mrow></mfenced><mi>T</mi></msup></mrow></mfenced><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{h}^l}\ \left(t \right) = \left[ {{{x}^l}\left(t \right):{{{\left({{{{\bm{a}}}^l}\left(t \right)} \right)}}^T}} \right].\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>We keep the last output summary of information <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>h</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>t</mi><mo>)</mo></mrow><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><mrow><mo>(</mo><mn>1</mn><mo>+</mo><msub><mi>N</mi><mi>x</mi></msub><mo>)</mo></mrow><mo>×</mo><msub><mi>N</mi><mi>u</mi></msub></mrow></msup></mrow><annotation encoding="application/x-tex">${{h}^l}(t) \in {{\mathbb{R}}^{(1 + {{N}_x}) \times {{N}_u}}}$</annotation></semantics></math> </ephtml> and calculate the average of each column to get a vector <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><msub><mi>N</mi><mi>u</mi></msub></msup></mrow><annotation encoding="application/x-tex">${{{\bm{h}}}^l} \in {{\mathbb{R}}^{{{N}_u}}}$</annotation></semantics></math> </ephtml> , which is the raw feature vector for the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> th examinee. 
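</p> <p>The update in Equations 2 and 3, the concatenation in Equation 4, and the column averaging described above can be sketched in Python/NumPy as follows. This is a minimal sketch under stated assumptions: the function name esn_raw_feature, the zero initial reservoir state, the toy dimensions, the random weights, and the fixed leaking rate of 0.3 are illustrative choices that are not given in the paper.</p>

```python
import numpy as np


def esn_raw_feature(A, W_in, W_res, rho):
    """Raw feature vector for one examinee.

    A     : (t, N_u) array whose rows are the embedded vectors a(1), ..., a(t)
    W_in  : (N_x, 1) input weight matrix
    W_res : (N_x, N_x) reservoir weight matrix
    rho   : leaking rate in (0, 1)
    """
    t, n_u = A.shape
    n_x = W_res.shape[0]
    x = np.zeros((n_x, n_u))                 # initial state (assumed to be zero)
    for c in range(t):
        a_row = A[c][None, :]                # (a(c+1))^T with shape (1, N_u)
        x_tilde = np.tanh(W_in @ a_row + W_res @ x)   # Equation 2
        x = (1.0 - rho) * x + rho * x_tilde           # Equation 3
    h_last = np.vstack([x, A[-1][None, :]])  # Equation 4, shape (1 + N_x, N_u)
    return h_last.mean(axis=0)               # column averages, a vector in R^{N_u}


rng = np.random.default_rng(1)
n_x, n_u, t = 4, 3, 6                        # toy sizes (hypothetical)
A = rng.standard_normal((t, n_u))
W_in = rng.uniform(-1.0, 1.0, size=(n_x, 1))
M = rng.standard_normal((n_x, n_x))
W_res = M * (0.9 / np.linalg.norm(M, 2))     # rescale so the largest singular value is 0.9
rho = 0.3
h = esn_raw_feature(A, W_in, W_res, rho)
```

<p>With these shapes, the product of the input weights and the transposed input vector is an outer product, so the reservoir state has one column per embedding dimension, which matches the dimensions stated for Equations 2 to 4.</p> <p>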
To identify the best feature vectors, we seek an orthonormal basis of the vector space <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="script">L</mi><mi>l</mi></msup><mo>=</mo><mspace width="0.33em" /><mi>s</mi><mi>p</mi><mi>a</mi><mi>n</mi><mspace width="0.33em" /><mrow><mo>(</mo><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow><mo>,</mo><mtext>...</mtext><mo>,</mo><mspace width="0.33em" /><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>t</mi><mo>)</mo></mrow></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{\mathcal{L}}^l} = \ span\ ({{{{\bm{a}}}^l}(1),\ldots,\ {{{\bm{a}}}^l}(t)})$</annotation></semantics></math> </ephtml> , which is spanned by the original embedded sequence vectors. The Gram‐Schmidt process is therefore applied: 5 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfenced separators="" open="{" close=""><mtable><mtr><mtd><msubsup><mi mathvariant="bold-italic">ζ</mi><mi>i</mi><mi>l</mi></msubsup><mo>=</mo><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>i</mi></mfenced><mo>−</mo><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>i</mi><mo>−</mo><mn>1</mn></mrow></msubsup><mfenced separators="" open="〈" close="〉"><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mfenced open="(" close=")"><mi>i</mi></mfenced><mo>,</mo><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>j</mi><mi>l</mi></msubsup></mrow></mfenced><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>j</mi><mi>l</mi></msubsup><mo>,</mo><mi>i</mi><mo>∈</mo><mfenced separators="" open="[" close="]"><mrow><mn>1</mn><mo>,</mo><mi>t</mi></mrow></mfenced></mtd></mtr><mtr><mtd><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>i</mi><mi>l</mi></msubsup><mo>=</mo><mfrac><msubsup><mi mathvariant="bold-italic">ζ</mi><mi>i</mi><mi>l</mi></msubsup><mfenced separators="" open="|" close="|"><msubsup><mi mathvariant="bold-italic">ζ</mi><mi>i</mi><mi>l</mi></msubsup></mfenced></mfrac><mo>,</mo><mi>i</mi><mo>∈</mo><mfenced separators="" open="[" close="]"><mrow><mn>1</mn><mo>,</mo><mi>t</mi></mrow></mfenced></mtd></mtr></mtable></mfenced><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}\left\{ { \def\eqcellsep{&}\begin{array}{@{}*{1}{c}@{}} {\ {\bm{\zeta }}_i^l = {{{\bm{a}}}^l}\ \left(i \right) - \sum_{j\ = \ 1}^{i - 1} \left\langle {{{{\bm{a}}}^l}\left(i \right),{\bm{\xi }}_j^l} \right\rangle {\bm{\xi }}_j^l,i \in \left[ {1,t} \right]}\\ {\ {\bm{\xi }}_i^l = \frac{{{\bm{\zeta }}_i^l}}{{\left| {{\bm{\zeta }}_i^l} \right|}}\ ,i \in \left[ {1,t} \right]} \end{array} } \right.,\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>⟨</mo><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow><mo>,</mo><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>j</mi><mi>l</mi></msubsup></mrow><mo>⟩</mo></mrow><annotation encoding="application/x-tex">$\langle {{{{\bm{a}}}^l}(i),{\bm{\xi }}_j^l} \rangle $</annotation></semantics></math> </ephtml> denotes the inner product of the two vectors <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{{\bm{a}}}^l}(i)$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>j</mi><mi>l</mi></msubsup><annotation encoding="application/x-tex">${\bm{\xi }}_j^l$</annotation></semantics></math> </ephtml> . The orthonormal vectors <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>i</mi><mi>l</mi></msubsup><mo>=</mo><mrow><mo>(</mo><mrow><msubsup><mi mathvariant="bold-italic">ξ</mi><mn>1</mn><mi>l</mi></msubsup><mo>,</mo><mtext>...</mtext><mo>,</mo><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>t</mi><mi>l</mi></msubsup></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${\bm{\xi }}_i^l = ({{\bm{\xi }}_1^l,\ldots,{\bm{\xi }}_t^l})$</annotation></semantics></math> </ephtml> then form a basis of the space <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="script">L</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{\mathcal{L}}^l}$</annotation></semantics></math> </ephtml> . 
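</p> <p>The Gram‐Schmidt step in Equation 5 can be sketched in Python/NumPy as below. The function name gram_schmidt, the tolerance, and the toy input are hypothetical. One further assumption deserves emphasis: when an embedded vector is a linear combination of earlier ones, the normalization in Equation 5 would divide by zero, so this sketch skips such vectors. The paper does not state how that case is handled.</p>

```python
import numpy as np


def gram_schmidt(A, tol=1e-10):
    """Classical Gram-Schmidt on the rows of A (the embedded vectors a(1), ..., a(t)).

    Returns an array whose rows are orthonormal and span the same space as the rows of A.
    """
    basis = []
    for a in A:
        a = np.asarray(a, dtype=float)
        zeta = a.copy()
        for xi in basis:
            zeta -= np.dot(a, xi) * xi      # subtract the projection onto xi (Equation 5)
        norm = np.linalg.norm(zeta)
        if norm > tol:                      # assumption: drop linearly dependent vectors
            basis.append(zeta / norm)
    return np.array(basis)


# Toy sequence of three embedded vectors; the third is the sum of the first two.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 1.0]])
Xi = gram_schmidt(A)
```

<p>In this toy input the third vector adds no new direction, so only two orthonormal vectors are returned.</p> <p>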
Given the feature vector learned for <emph>l</emph>th examinee as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{{\bm{h}}}^l}$</annotation></semantics></math> </ephtml> , the distance between the representation vector and space <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="script">L</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{\mathcal{L}}^l}$</annotation></semantics></math> </ephtml> is defined in Equation 6. 6 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>d</mi><mfenced separators="" open="(" close=")"><mrow><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><mo>,</mo><msup><mi mathvariant="script">L</mi><mi>l</mi></msup></mrow></mfenced><mo linebreak="badbreak">=</mo><msqrt><mrow><mfenced separators="" open="|" close="|"><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup></mfenced><mo linebreak="badbreak">−</mo><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>t</mi></msubsup><mfenced separators="" open="〈" close="〉"><mrow><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><mo>,</mo><msubsup><mi mathvariant="bold-italic">ξ</mi><mi>i</mi><mi>l</mi></msubsup></mrow></mfenced></mrow></msqrt><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}d\left({{{{\bm{h}}}^l},{{\mathcal{L}}^l}} \right) = \sqrt {\left| {{{{\bm{h}}}^l}} \right| - \sum_{i = 1}^t \left\langle {{{{\bm{h}}}^l},{\bm{\xi }}_i^l} \right\rangle }.\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>The distance <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>d</mi><mo>(</mo><mrow><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><mo>,</mo><msup><mi 
mathvariant="script">L</mi><mi>l</mi></msup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$d({{{{\bm{h}}}^l},{{\mathcal{L}}^l}})$</annotation></semantics></math> </ephtml> measures how far the extracted features lie from the desired vector space. A smaller distance indicates that a better feature was extracted. For all the examinees, the fitness function can be defined as in Equation 7. 7 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>i</mi><mi>t</mi><mi>n</mi><mi>e</mi><mi>s</mi><mi>s</mi><mfenced open="(" close=")"><mi>k</mi></mfenced><mo linebreak="badbreak">=</mo><msqrt><mrow><msubsup><mo>∑</mo><mrow><mi>l</mi><mo>=</mo><mn>1</mn></mrow><mi>L</mi></msubsup><msup><mfenced separators="" open="(" close=")"><mrow><mi>d</mi><mfenced separators="" open="(" close=")"><mrow><msup><mi mathvariant="bold-italic">h</mi><mi>l</mi></msup><mo>,</mo><msup><mi mathvariant="script">L</mi><mi>l</mi></msup></mrow></mfenced></mrow></mfenced><mn>2</mn></msup></mrow></msqrt><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}fitness\left(k \right) = \sqrt {\sum_{l = 1}^L {{{\left({d\left({{{{\bm{h}}}^l},{{\mathcal{L}}^l}} \right)} \right)}}^2}}.\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>As noted above, the last output matrix is retained as a summary of information to predict a target variable, such as latent trait values or human raters' scores, using Equation 8. 8 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>y</mi><mi>T</mi></msup><mo linebreak="badbreak">=</mo><msub><mi>W</mi><mrow><mi>o</mi><mi>u</mi><mi>t</mi></mrow></msub><mspace width="0.33em" /><mi>H</mi><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{y}^T} = {{W}_{out}}\ H,\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi
mathvariant="bold-italic">y</mi><annotation encoding="application/x-tex">${\bm{y}}$</annotation></semantics></math> </ephtml> is the target variable, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>o</mi><mi>u</mi><mi>t</mi></mrow></msub><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mrow><mn>1</mn><mo>×</mo><mo>(</mo><mrow><mn>1</mn><mo>+</mo><msub><mi>N</mi><mi>x</mi></msub></mrow><mo>)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">${{W}_{out}} \in {{\mathbb{R}}^{1 \times ({1 + {{N}_x}})}}$</annotation></semantics></math> </ephtml> is an output weight matrix that can be trained by a regression or classification model, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>H</mi><annotation encoding="application/x-tex">$H$</annotation></semantics></math> </ephtml> is the feature matrix for all examinees. The whole SRM process is summarized as pseudo‐code in Table 2.</p> <p>Table 2. The Pseudo‐Code Showing the Whole Process of SRM Applied on Process Data</p> <p> <ephtml> <table><thead><tr><th>Algorithm 1: Sequential Reservoir Method</th></tr></thead><tbody>
<tr><td>PROCEDURE SRM_Optimization(N_x, Equations)</td></tr>
<tr><td>&#160;&#160;FOR each <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>x</mi></msub><annotation encoding="application/x-tex">${{N}_x}$</annotation></semantics></math> DO</td></tr>
<tr><td>&#160;&#160;&#160;&#160;InitializeParticles()</td></tr>
<tr><td>&#160;&#160;&#160;&#160;WHILE NOT Converged DO</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;FOR each particle DO</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math> ← ConstructReservoirWeightMatrix(Equation 1)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FOR <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math>th examinee DO</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>h</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{h}^l}$</annotation></semantics></math> ← CalculateFeatureVector(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res}}$</annotation></semantics></math>, Equations <ext-link href="jedm12413-disp-0002 jedm12413-disp-0003 jedm12413-disp-0004" title="2, 3, 4" />)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="script">L</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{\mathcal{L}}^l}$</annotation></semantics></math> ← GramSchmidtProcess(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{{\bm{a}}}^l}$</annotation></semantics></math>, Equation 5)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>d</mi><mo>(</mo><mrow><msup><mi>h</mi><mi>l</mi></msup><mo>,</mo><msup><mi mathvariant="script">L</mi><mi>l</mi></msup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$d({{{h}^l},{{\mathcal{L}}^l}})$</annotation></semantics></math> ← CalculateDistance(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>h</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{h}^l}$</annotation></semantics></math>, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="script">L</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{\mathcal{L}}^l}$</annotation></semantics></math>, Equation 6)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;END FOR</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;fitnessValue ← CalculateFitnessValue(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>d</mi><mo>(</mo><mrow><msup><mi>h</mi><mi>l</mi></msup><mo>,</mo><msup><mi mathvariant="script">L</mi><mi>l</mi></msup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$d({{{h}^l},{{\mathcal{L}}^l}})$</annotation></semantics></math>, Equation 7)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;END FOR</td></tr>
<tr><td>&#160;&#160;&#160;&#160;END WHILE</td></tr>
<tr><td>&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi><mo>,</mo><mspace width="0.33em" /><mi>o</mi><mi>p</mi><mi>t</mi><mi>i</mi><mi>m</mi><mi>a</mi><mi>l</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res,\ optimal}}$</annotation></semantics></math> ← RecordOptimalSolution()</td></tr>
<tr><td>&#160;&#160;&#160;&#160;FOR <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math>th examinee DO</td></tr>
<tr><td>&#160;&#160;&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>h</mi><mrow><mi>f</mi><mi>i</mi><mi>n</mi><mi>a</mi><mi>l</mi></mrow><mi>l</mi></msubsup><annotation encoding="application/x-tex">$h_{final}^l$</annotation></semantics></math> ← CalculateFeatureVector(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>r</mi><mi>e</mi><mi>s</mi><mo>,</mo><mspace width="0.33em" /><mi>o</mi><mi>p</mi><mi>t</mi><mi>i</mi><mi>m</mi><mi>a</mi><mi>l</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{res,\ optimal}}$</annotation></semantics></math>, Equations <ext-link href="jedm12413-disp-0002 jedm12413-disp-0003 jedm12413-disp-0004" title="2, 3, 4" />)</td></tr>
<tr><td>&#160;&#160;&#160;&#160;END FOR</td></tr>
<tr><td>&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>H</mi><mrow><mi>f</mi><mi>i</mi><mi>n</mi><mi>a</mi><mi>l</mi></mrow><mi>L</mi></msubsup><annotation encoding="application/x-tex">$H_{final}^L$</annotation></semantics></math> ← RecordAllExaminees()</td></tr>
<tr><td>&#160;&#160;&#160;&#160;<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>o</mi><mi>u</mi><mi>t</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{out}}$</annotation></semantics></math> ← TrainOutputWeightMatrix(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>H</mi><mrow><mi>f</mi><mi>i</mi><mi>n</mi><mi>a</mi><mi>l</mi></mrow><mi>L</mi></msubsup><annotation encoding="application/x-tex">$H_{final}^L$</annotation></semantics></math>, Equation 8)</td></tr>
<tr><td>&#160;&#160;END FOR</td></tr>
<tr><td>END PROCEDURE</td></tr>
</tbody></table> </ephtml> </p> <hd id="AN0192629994-11">Simulation Study</hd> <p></p> <hd id="AN0192629994-12">Design of Study</hd> <p>In this section, we present results for two sets of simulation studies, each depicting sequences of actions within distinct scenarios. These studies aimed to showcase the application of the SRM in feature extraction. The resultant feature vectors, extracted from the sequences by the SRM, served as representations of individual behaviors. These representations were then used for tasks such as behavioral group classification, prediction of latent information, and evaluation of model fit.</p> <p>For both simulation studies, process sequences were generated by Markov chains (Athreya et al., [<reflink idref="bib1" id="ref36">1</reflink>]), following the method used to generate log actions and associated timestamps in previous research studies (Tang et al., [<reflink idref="bib26" id="ref37">26</reflink>]). These specific conditions were selected for two reasons: to keep our simulations relevant and comparable to the simulation methods and conditions established in previous studies, and to align them with real‐world educational assessment scenarios. The conditions were also chosen for their potential to pose challenges, thus allowing us to verify both the strengths and limitations of SRM. 
Recall that there are a total of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> possible unique actions, denoted as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">S</mi><mspace width="0.33em" /></mrow><mo>=</mo><mo>(</mo><mrow><msubsup><mi>s</mi><mn>1</mn><mi>z</mi></msubsup><mo>,</mo><mtext>...</mtext><mo>,</mo><msubsup><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub><mi>z</mi></msubsup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">${\bm{S\ }} = ({s_1^z,\ldots,s_{{{n}_z}}^z})$</annotation></semantics></math> </ephtml> , and assume that <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{s}_1}$</annotation></semantics></math> </ephtml> indicates the start action and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub></msub><annotation encoding="application/x-tex">${{s}_{{{n}_z}}}$</annotation></semantics></math> </ephtml> indicates the end action. All examinees' action sequences therefore start from <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{s}_1}$</annotation></semantics></math> </ephtml> and end at <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub></msub><annotation encoding="application/x-tex">${{s}_{{{n}_z}}}$</annotation></semantics></math> </ephtml> . The Markov chain transitions from one action to another. 
Given an action <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>a</mi><mi>t</mi></msub><annotation encoding="application/x-tex">${{a}_t}$</annotation></semantics></math> </ephtml> at time <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>t</mi><annotation encoding="application/x-tex">$t$</annotation></semantics></math> </ephtml> , the probability of the next action <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>a</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msub><annotation encoding="application/x-tex">${{a}_{t + 1}}$</annotation></semantics></math> </ephtml> depends only on the action at time <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>t</mi><annotation encoding="application/x-tex">$t$</annotation></semantics></math> </ephtml> (the Markov property). The next action <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>a</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msub><annotation encoding="application/x-tex">${{a}_{t + 1}}$</annotation></semantics></math> </ephtml> is one of the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> actions that the process can transition to. 
The probability of transitioning from one action to another is defined by a transition matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">$P$</annotation></semantics></math> </ephtml> : 9 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo linebreak="badbreak">=</mo><mfenced separators="" open="[" close="]"><mtable><mtr><mtd><mn>0</mn></mtd><mtd><msub><mi>p</mi><mrow><mn>1</mn><mo>,</mo><mn>2</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>p</mi><mrow><mn>1</mn><mo>,</mo><msub><mi>n</mi><mi>z</mi></msub></mrow></msub></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><msub><mi>p</mi><mrow><mn>2</mn><mo>,</mo><mn>2</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>p</mi><mrow><mn>2</mn><mo>,</mo><msub><mi>n</mi><mi>z</mi></msub></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><mn>0</mn></mtd><mtd><mn>0</mn></mtd><mtd><mo>⋯</mo></mtd><mtd><mn>1</mn></mtd></mtr></mtable></mfenced><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}P = \left[ {\begin{array}{cccc} 0&{{{p}_{1,2}}}& \cdots &{{{p}_{1,{{n}_z}}}}\\ 0&{{{p}_{2,2}}}& \cdots &{{{p}_{2,{{n}_z}}}}\\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots &1 \end{array}} \right].\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>Each action represents a specific examinee behavior during an assessment, such as "Enter Item," "Choose Answer," and "Exit Item." The matrix describes the probabilities of transitioning from one action to another, with rows representing the starting action and columns representing the subsequent action. 
Each element of this matrix, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mspace width="0.33em" /><mo>=</mo><mo>[</mo><msub><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow></msub><mo>]</mo></mrow><annotation encoding="application/x-tex">$P\ = [ {{{p}_{i,j}}} ]$</annotation></semantics></math> </ephtml> , gives the probability of transitioning from one action <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>a</mi><mi>i</mi></msub><annotation encoding="application/x-tex">${{a}_i}$</annotation></semantics></math> </ephtml> to another action <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>a</mi><mi>j</mi></msub><annotation encoding="application/x-tex">${{a}_j}$</annotation></semantics></math> </ephtml> and serves as the guiding rule for behavioral patterns. Note that in our assessment scenario, for a particular item, the transition probabilities from any action to <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{s}_1}$</annotation></semantics></math> </ephtml> are 0 because an examinee cannot return to the start action before completing the item. The first column, whose entries are all 0, therefore reflects the constraint that once an examinee has entered an item, they cannot "re‐enter" it without first completing or exiting the item. 
Furthermore, the transition probabilities from <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub></msub><annotation encoding="application/x-tex">${{s}_{{{n}_z}}}$</annotation></semantics></math> </ephtml> to any other action are also 0, whereas the probability of transitioning from <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>s</mi><msub><mi>n</mi><mi>z</mi></msub></msub><annotation encoding="application/x-tex">${{s}_{{{n}_z}}}$</annotation></semantics></math> </ephtml> to itself is 1. This structure arises because examinees cannot perform additional actions after completing an item unless they choose to re‐engage with the same item. Therefore, entries in the first column of matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">$P$</annotation></semantics></math> </ephtml> are all zeros, while entries in the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> th row of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">$P$</annotation></semantics></math> </ephtml> are all zeros except for the last element, which is set to one. 
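</p> <p>The structure of the transition matrix described above can be illustrated with a short sketch. The Python code below is not the authors' implementation; it assumes NumPy, and the function names <code>assemble_transition_matrix</code> and <code>simulate_sequence</code>, the example block <code>p_sub</code>, and the <code>max_len</code> cap are hypothetical choices made only for illustration. It embeds an (n_z − 1) × (n_z − 1) row-stochastic block into a matrix whose first column is all zeros and whose last row makes the final action absorbing, and then draws one action sequence beginning at the start action.</p>

```python
import numpy as np


def assemble_transition_matrix(p_sub):
    """Embed an (n_z-1) x (n_z-1) row-stochastic block into the full matrix P."""
    p_sub = np.asarray(p_sub, dtype=float)
    n_z = p_sub.shape[0] + 1
    P = np.zeros((n_z, n_z))
    P[:-1, 1:] = p_sub   # rows: actions 1..n_z-1, columns: actions 2..n_z
    P[-1, -1] = 1.0      # the final action only transitions to itself
    return P


def simulate_sequence(P, rng, max_len=10_000):
    """Walk the chain from the start action (index 0) until the final action.

    max_len is an illustrative safety cap, not part of the original design.
    """
    n_z = P.shape[0]
    state = 0
    seq = [state]
    while state != n_z - 1 and len(seq) != max_len:
        state = int(rng.choice(n_z, p=P[state]))
        seq.append(state)
    return seq


rng = np.random.default_rng(0)
# Hypothetical 3 x 3 upper-right block for n_z = 4 actions (each row sums to 1).
p_sub = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4]])
P = assemble_transition_matrix(p_sub)
sequence = simulate_sequence(P, rng)
```

<p>Because the first column is zero, the start action (index 0) can appear only at the beginning of a simulated sequence, and the walk stops once the final action is reached.</p> <p>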
The remaining upper‐right submatrix of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">$P$</annotation></semantics></math> </ephtml> is specified by the simulation setup, which is described separately for each of the two simulation studies.</p> <p>After the data were simulated (Table A3), the SRM was applied to extract features. The number of features was not predetermined but explored adaptively: we allowed the SRM to select the feature count from the set {25, 50, 75, 100, 125, 175, 200}, choosing the configuration that yielded the best model output (Table A4).</p> <hd id="AN0192629994-13">Simulation 1: Group Classification Based on Sequences</hd> <p>The goal of this simulation study is to demonstrate how accurately examinees' latent group membership can be classified using features extracted by the SRM from the generated action sequences. Additionally, these features are compared with baseline features from a conventional ESN that does not use the SRM's optimization algorithms.</p> <hd id="AN0192629994-14">Data generation and result evaluation criterion</hd> <p>The number of unique actions for an item can influence the behavioral patterns available to examinees, and the number of examinees determines the sample size. To capture this variability in our simulations, we manipulated these parameters by considering three values for <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math> </ephtml> : 10, 25, and 50, and three values for <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> : 150, 1,500, and 3,000. 
Our simulations covered the combinations of these values, exploring their effects on action sequence generation. In line with the general training principles for ESN (Lukoševičius, [<reflink idref="bib19" id="ref38">19</reflink>]), six different reservoir sizes <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>x</mi></msub><annotation encoding="application/x-tex">${{N}_x}$</annotation></semantics></math> </ephtml> , ranging from 500 to 5,000, were simulated to assess the effect of reservoir size. Each simulation condition was replicated 50 times to ensure statistical reliability, and each replication involved only one generic item in Simulation 1.</p> <p>The action sequences within Simulation 1 were generated from three Markov matrices following Equation 9. Each sequence of actions was generated by one of the three Markov matrices, and sequences stemming from the same matrix were regarded as belonging to one group. Denote the upper‐right submatrix with dimension <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow><mo>×</mo><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">$({{{n}_z} - 1}) \times ({{{n}_z} - 1})$</annotation></semantics></math> </ephtml> in Equation 9 as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>P</mi><mo>′</mo></msup><annotation encoding="application/x-tex">$P\mathrm{^{\prime}}$</annotation></semantics></math> </ephtml> , which was generated based on three matrices <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></mrow></msup><annotation 
encoding="application/x-tex">${{P}^{^{\prime}(1)}}$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></mrow></msup><annotation encoding="application/x-tex">${{P}^{^{\prime}(2)}}$</annotation></semantics></math> </ephtml> , and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></mrow></msup><annotation encoding="application/x-tex">${{P}^{^{\prime}(3)}}$</annotation></semantics></math> </ephtml> . Specifically, three uniform matrices <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(1)}}$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(2)}}$</annotation></semantics></math> </ephtml> , and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(3)}}$</annotation></semantics></math> </ephtml> , each with dimension <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow><mo>×</mo><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">$({{{n}_z} - 1}) \times ({{{n}_z} - 1})$</annotation></semantics></math> </ephtml> , were generated. 
The elements of the three uniform matrices are denoted by <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msubsup><annotation encoding="application/x-tex">$u_{i,j}^{(1)}$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msubsup><annotation encoding="application/x-tex">$u_{i,j}^{(2)}$</annotation></semantics></math> </ephtml> , and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msubsup><annotation encoding="application/x-tex">$u_{i,j}^{(3)}$</annotation></semantics></math> </ephtml> , respectively, and were simulated independently from a uniform distribution <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>U</mi><mo>(</mo><mrow><mo>−</mo><mn>15</mn><mo>,</mo><mn>15</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$U({ - 15,15})$</annotation></semantics></math> </ephtml> . Uniform matrices were used so that the generated values are of similar, reasonable magnitude, avoiding markedly higher transition probabilities between certain actions, which could cause those actions to dominate the simulated sequences disproportionately. 
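</p> <p>The step from uniform draws to transition probabilities (Equation 10) is a row‐wise softmax. As a minimal sketch under stated assumptions, the Python code below uses NumPy; the function name <code>row_softmax</code>, the random seed, and the choice of n_z = 10 are illustrative and are not taken from the authors' code. Subtracting each row's maximum before exponentiating is a standard numerical safeguard and does not change the resulting probabilities.</p>

```python
import numpy as np


def row_softmax(u):
    """Row-wise softmax; subtracting the row maximum keeps exp() well scaled."""
    u = np.asarray(u, dtype=float)
    shifted = u - u.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(1)
n_z = 10  # number of unique actions (one of the simulated conditions)

# One uniform matrix per latent group, entries drawn i.i.d. from U(-15, 15).
uniform_mats = [rng.uniform(-15, 15, size=(n_z - 1, n_z - 1)) for _ in range(3)]

# Each row of every block is a probability distribution over next actions.
p_subs = [row_softmax(u) for u in uniform_mats]
```

<p>Each element of <code>p_subs</code> plays the role of one upper‐right submatrix, with every row summing to one.</p> <p>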
Given these uniform matrices, three matrices <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></mrow></msup><mo>=</mo><mrow><mo>(</mo><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msubsup><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{P}^{^{\prime}(1)}} = ({p_{i,j}^{(1)}})$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></mrow></msup><mo>=</mo><mrow><mo>(</mo><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msubsup><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{P}^{^{\prime}(2)}} = ({p_{i,j}^{(2)}})$</annotation></semantics></math> </ephtml> , and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></mrow></msup><mo>=</mo><mrow><mo>(</mo><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msubsup><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{P}^{^{\prime}(3)}} = ({p_{i,j}^{(3)}})$</annotation></semantics></math> </ephtml> were generated using Equation 10. 
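10 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfenced separators="" open="{" close=""><mtable><mtr><mtd><mrow><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msubsup><mo>=</mo><mfrac><mrow><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msubsup></mfenced></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><msub><mi>n</mi><mi>k</mi></msub><mo>−</mo><mn>1</mn></mrow></msubsup><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msubsup></mfenced></mrow></mfrac></mrow></mtd></mtr><mtr><mtd><mrow><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msubsup><mo>=</mo><mfrac><mrow><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msubsup></mfenced></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><msub><mi>n</mi><mi>k</mi></msub><mo>−</mo><mn>1</mn></mrow></msubsup><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></msubsup></mfenced></mrow></mfrac></mrow></mtd></mtr><mtr><mtd><mrow><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msubsup><mo>=</mo><mfrac><mrow><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msubsup></mfenced></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><msub><mi>n</mi><mi>k</mi></msub><mo>−</mo><mn>1</mn></mrow></msubsup><mi>exp</mi><mfenced open="(" close=")"><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></msubsup></mfenced></mrow></mfrac></mrow></mtd></mtr></mtable></mfenced><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}\left\{ \begin{array}{l} p_{i,j}^{(1)} = \frac{\exp \left( u_{i,j}^{(1)} \right)}{\sum_{j = 1}^{n_k - 1} \exp \left( u_{i,j}^{(1)} \right)} \\ p_{i,j}^{(2)} = \frac{\exp \left( u_{i,j}^{(2)} \right)}{\sum_{j = 1}^{n_k - 1} \exp \left( u_{i,j}^{(2)} \right)} \\ p_{i,j}^{(3)} = \frac{\exp \left( u_{i,j}^{(3)} \right)}{\sum_{j = 1}^{n_k - 1} \exp \left( u_{i,j}^{(3)} \right)} \end{array} \right..\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>The SRM was applied to the complete set of generated sequences after the sequences were randomly shuffled. Under each simulation condition, the extracted SRM features and the baseline features were each used to predict group membership with a generalized logit model and 10‐fold cross‐validation. A classification accuracy index was used to compare the predictions based on the SRM features with those based on the baseline features. Table 3 shows an example confusion matrix, in which each entry gives the number of sequences whose actual group is given by the row and whose model‐predicted group is given by the column. Given these counts, the classification accuracy is defined in Equation 11. 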
11 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mtext>Classification Accuracy</mtext><mspace width="0.33em" /><mo linebreak="badbreak">=</mo><mfrac><mrow><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mn>3</mn></msubsup><msub><mi>N</mi><mrow><mi>i</mi><mi>i</mi></mrow></msub></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mn>3</mn></msubsup><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mn>3</mn></msubsup><msub><mi>N</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow></mfrac><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}\text{Classification Accuracy} = \frac{\sum_{i = 1}^3 N_{ii}}{\sum_{i = 1}^3 \sum_{j = 1}^3 N_{ij}}.\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>Table 3. Confusion Matrix for Classification with Three Groups</p> <p> <ephtml> <table><thead><tr><th /><th /><th colspan="3" align="center">Model Predicted Group</th></tr><tr><th /><th /><th>1</th><th>2</th><th>3</th></tr></thead><tbody><tr><td rowspan="3">Actual Markov Group</td><td>1</td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>11</mn></msub><annotation encoding="application/x-tex">${{N}_{11}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>12</mn></msub><annotation encoding="application/x-tex">${{N}_{12}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>13</mn></msub><annotation encoding="application/x-tex">${{N}_{13}}$</annotation></semantics></math></p></td></tr><tr><td>2</td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>21</mn></msub><annotation encoding="application/x-tex">${{N}_{21}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>22</mn></msub><annotation encoding="application/x-tex">${{N}_{22}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>23</mn></msub><annotation encoding="application/x-tex">${{N}_{23}}$</annotation></semantics></math></p></td></tr><tr><td>3</td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>31</mn></msub><annotation encoding="application/x-tex">${{N}_{31}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>32</mn></msub><annotation encoding="application/x-tex">${{N}_{32}}$</annotation></semantics></math></p></td><td><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mn>33</mn></msub><annotation encoding="application/x-tex">${{N}_{33}}$</annotation></semantics></math></p></td></tr></tbody></table> </ephtml> </p> <hd id="AN0192629994-15">Results</hd> <p>Applying the SRM to the simulated dataset yields a feature matrix with dimensions <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi><mo>×</mo><msub><mi>N</mi><mi>u</mi></msub></mrow><annotation encoding="application/x-tex">$l \times {{N}_u}$</annotation></semantics></math> </ephtml> , in which each row 
represents an examinee and each column represents one feature. To visualize the latent information in the features, we ran a principal component analysis (PCA) on them. The PCA made the group structure clearly visible, and the results were similar across conditions. As an example, we report the PCA result for the feature matrix extracted under the condition with 3,000 examinees and 10 unique actions. Figure 4 plots the first two principal components; each dot represents an examinee. The features extracted from the action sequences clearly separate three distinct groups. In this simulation, each of the three matrices represents a distinct behavioral group, so examinees within the same group display similar behavior patterns. The results indicate that the features learned by the SRM capture the differences in transition probabilities between actions well enough to identify and differentiate the groups.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0004.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0004.jpg" title="4 PCA of the features extracted from Simulation 1 with 3,000 examinees and 10 actions." /> </p> <p></p> <p>The average group classification accuracy changes in a similar and consistent way across simulation conditions. As the number of examinees increases, the average classification accuracy improves. Furthermore, larger reservoir sizes tend to improve classification accuracy. 
This gain is consistent with the notion that enlarging the reservoir increases the complexity of the model and thereby improves its predictive accuracy. Specifically, a reservoir size of 500 yielded a relatively modest average accuracy, and the highest accuracy was observed at a reservoir size of 5,000. This pattern suggests that, among the sizes tested, a reservoir size of 5,000 gives the best result. As an illustration, Figure 5 shows how the average group classification accuracy changes with reservoir size under the condition with 3,000 examinees and 10 actions.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0005.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0005.jpg" title="5 Average group classification accuracy with 3,000 examinees and 10 actions." /> </p> <p></p> <p>Table 4 reports the average classification accuracies obtained at the optimal reservoir size for the SRM features and the baseline features under each condition. The lowest and highest average classification accuracies for the SRM features are .825 and .902, respectively, while those for the baseline features are .634 and .839. Table 4 highlights two key observations. First, the average classification accuracy derived from the SRM features consistently surpasses that of the baseline features across all conditions. Second, the average classification accuracy achieved by the SRM features rises as the sample size increases. Note that classification accuracy increases slightly when moving from 10 to 25 unique actions. 
However, moving from 25 to 50 unique actions changes classification accuracy only minimally, suggesting a potential saturation point in the relationship between the number of actions and accuracy.</p> <p>Table 4. Average Best Group Classification Accuracy Comparison</p> <p> <ephtml> <table><thead><tr><th><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math></p></th><th><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>n</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{n}_z}$</annotation></semantics></math></p></th><th>SRM Features</th><th>Baseline Features</th></tr></thead><tbody><tr><td>150</td><td>10</td><td>.825 (.074)</td><td>.634 (.089)</td></tr><tr><td /><td>25</td><td>.836 (.102)</td><td>.678 (.076)</td></tr><tr><td /><td>50</td><td>.829 (.096)</td><td>.655 (.063)</td></tr><tr><td>1,500</td><td>10</td><td>.851 (.068)</td><td>.692 (.054)</td></tr><tr><td /><td>25</td><td>.862 (.057)</td><td>.723 (.062)</td></tr><tr><td /><td>50</td><td>.865 (.062)</td><td>.741 (.059)</td></tr><tr><td>3,000</td><td>10</td><td>.891 (.059)</td><td>.760 (.050)</td></tr><tr><td /><td>25</td><td>.902 (.047)</td><td>.804 (.042)</td></tr><tr><td /><td>50</td><td>.900 (.051)</td><td>.839 (.043)</td></tr></tbody></table> </ephtml> </p> <p><emph>Note</emph>. 
The values in the parentheses are standard deviations.</p> <hd id="AN0192629994-18">Simulation 2: Using Both Process and Response to Assess Latent Ability and Model Fit</hd> <p>The goal of Simulation 2 is to evaluate the accuracy of predicting latent ability values and to evaluate the model fit by utilizing both the responses and features extracted from the simulated process data.</p> <hd id="AN0192629994-19">Data generation and result evaluation criteria</hd> <p>Examinees' action sequences were simulated based on their latent abilities. Specifically, each of the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> action sequences in this simulation was generated from a unique Markov chain, and all <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> Markov chains were based on a common uniform matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>4</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(4)}}$</annotation></semantics></math> </ephtml> . 
First, a set of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> examinees' latent abilities, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mn>1</mn></msub><mo>,</mo><mtext>...</mtext><mo>,</mo><msub><mi>θ</mi><mi>l</mi></msub></mrow><annotation encoding="application/x-tex">${{\theta }_1},\ldots,{{\theta }_l}$</annotation></semantics></math> </ephtml> , were randomly generated from a normal distribution <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>N</mi><mo>(</mo><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$N({0,1})$</annotation></semantics></math> </ephtml> . Then the uniform matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>4</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(4)}}$</annotation></semantics></math> </ephtml> , with element <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>4</mn><mo>)</mo></mrow></msubsup><annotation encoding="application/x-tex">$u_{i,j}^{(4)}$</annotation></semantics></math> </ephtml> and dimension <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow><mo>×</mo><mrow><mo>(</mo><mrow><msub><mi>n</mi><mi>z</mi></msub><mo>−</mo><mn>1</mn></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">$({{{n}_z} - 1}) \times ({{{n}_z} - 1})$</annotation></semantics></math> </ephtml> , was generated from a uniform distribution <ephtml> <math display="inline" 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>U</mi><mo>(</mo><mrow><mo>−</mo><mn>15</mn><mo>,</mo><mn>15</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$U({ - 15,15})$</annotation></semantics></math> </ephtml> , the same as in Simulation 1. The upper right submatrices <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mi>l</mi><mo>)</mo></mrow></mrow></msup><annotation encoding="application/x-tex">${{P}^{^{\prime}(l)}}$</annotation></semantics></math> </ephtml> were generated by both the latent abilities <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>θ</mi><mi>l</mi></msub><annotation encoding="application/x-tex">${{\theta }_l}$</annotation></semantics></math> </ephtml> and the common uniform matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>U</mi><mrow><mo>(</mo><mn>4</mn><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{U}^{(4)}}$</annotation></semantics></math> </ephtml> . 
That is, for each examinee, a unique matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>P</mi><mrow><msup><mrow /><mo>′</mo></msup><mrow><mo>(</mo><mi>l</mi><mo>)</mo></mrow></mrow></msup><mo>=</mo><mrow><mo>(</mo><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mi>l</mi><mo>)</mo></mrow></msubsup><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{P}^{^{\prime}(l)}} = ({p_{i,j}^{(l)}})$</annotation></semantics></math> </ephtml> was generated using Equation 12: 12 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>p</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mfenced open="(" close=")"><mi>l</mi></mfenced></msubsup><mo linebreak="badbreak">=</mo><mfrac><mrow><mi>exp</mi><mfenced separators="" open="(" close=")"><mrow><msub><mi>θ</mi><mi>l</mi></msub><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mfenced open="(" close=")"><mn>4</mn></mfenced></msubsup></mrow></mfenced></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><msub><mi>n</mi><mi>k</mi></msub><mo>−</mo><mn>1</mn></mrow></msubsup><mi>exp</mi><mfenced separators="" open="(" close=")"><mrow><msub><mi>θ</mi><mi>l</mi></msub><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mfenced open="(" close=")"><mn>4</mn></mfenced></msubsup></mrow></mfenced></mrow></mfrac><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}p_{i,j}^{\left(l \right)} = \frac{{\exp \left({{{\theta }_l}u_{i,j}^{\left(4 \right)}} \right)}}{{\sum_{j = 1}^{{{n}_k} - 1} \exp \left({{{\theta }_l}u_{i,j}^{\left(4 \right)}} \right)}}.\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>Given these matrices, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> different 
Markov matrices were formed using the matrix structure in Equation 9, and the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> th sequence of actions was generated by its own Markov matrix. Multiplying <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>θ</mi><mi>l</mi></msub><annotation encoding="application/x-tex">${{\theta }_l}$</annotation></semantics></math> </ephtml> by <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msubsup><mi>u</mi><mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow><mrow><mo>(</mo><mn>4</mn><mo>)</mo></mrow></msubsup><annotation encoding="application/x-tex">$u_{i,j}^{(4)}$</annotation></semantics></math> </ephtml> serves two purposes. First, it incorporates the examinee's latent ability into the transition probabilities, merging behavioral data with cognitive assessments. Second, it ensures that each examinee is associated with a unique transition matrix, reflecting their individual behavioral patterns in the assessment. This integration allows the generated action sequences not only to mirror observed behaviors but also to be weighted by latent abilities, producing sequences that are varied and aligned with the examinee's performance.</p> <p>Note that we used the Rasch model (Rasch, [<reflink idref="bib24" id="ref39">24</reflink>]) to generate item responses. 
The Rasch model in Equation 13 has one ability parameter <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>θ</mi><mi>l</mi></msub><annotation encoding="application/x-tex">${{\theta }_l}$</annotation></semantics></math> </ephtml> for each examinee and one difficulty parameter <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>b</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{b}_z}$</annotation></semantics></math> </ephtml> for each item while assuming that all the items have identical item discriminations of 1. 13 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>P</mi><mi>z</mi></msub><mfenced separators="" open="(" close=")"><msub><mi>θ</mi><mi>l</mi></msub></mfenced><mo linebreak="badbreak">=</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><msup><mi>e</mi><mrow><mo>−</mo><mfenced separators="" open="(" close=")"><mrow><msub><mi>θ</mi><mi>l</mi></msub><mo>−</mo><msub><mi>b</mi><mi>z</mi></msub></mrow></mfenced></mrow></msup></mrow></mfrac><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{P}_z}\left({{{\theta }_l}} \right) = \frac{1}{{1 + {{e}^{ - \left({{{\theta }_l} - {{b}_z}} \right)}}}},\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>P</mi><mi>z</mi></msub><mrow><mo>(</mo><msub><mi>θ</mi><mi>l</mi></msub><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">${{P}_z}({{{\theta }_l}})$</annotation></semantics></math> </ephtml> is the probability that an examinee with trait level <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>θ</mi><mi>l</mi></msub><annotation encoding="application/x-tex">${{\theta }_l}$</annotation></semantics></math> </ephtml> can answer the item correctly. 
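</p> <p>Equations 12 and 13 can be illustrated together for a single item. The Python sketch below is an assumption‐laden illustration rather than the authors' code: it relies on NumPy, and the function names <code>ability_weighted_block</code> and <code>rasch_probability</code>, the random seed, and the choice of five items are hypothetical. It scales the common uniform matrix by one examinee's ability before the row‐wise normalization (Equation 12) and draws dichotomous responses from the Rasch probabilities (Equation 13).</p>

```python
import numpy as np


def ability_weighted_block(theta_l, u4):
    """Equation 12: row-wise softmax of theta_l * U^(4) for one examinee."""
    scaled = theta_l * np.asarray(u4, dtype=float)
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical safeguard
    e = np.exp(scaled)
    return e / e.sum(axis=1, keepdims=True)


def rasch_probability(theta, b):
    """Equation 13: probability of a correct response (examinees x items)."""
    theta = np.asarray(theta, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return 1.0 / (1.0 + np.exp(-(theta - b)))


rng = np.random.default_rng(2)
n_examinees, n_items, n_z = 3000, 5, 10

theta = rng.standard_normal(n_examinees)   # latent abilities drawn from N(0, 1)
b = rng.standard_normal(n_items)           # item difficulties drawn from N(0, 1)
u4 = rng.uniform(-15, 15, size=(n_z - 1, n_z - 1))  # uniform matrix for one item

# Upper-right transition block for the first examinee only (illustration).
p_block_first = ability_weighted_block(theta[0], u4)

# Dichotomous (0/1) responses sampled from the Rasch probabilities.
prob_correct = rasch_probability(theta, b)
responses = rng.binomial(1, prob_correct)
```

<p>In this sketch, an ability near zero flattens the transition probabilities toward equal values, whereas abilities of larger magnitude concentrate probability on fewer transitions, which is how Equation 12 ties the action sequences to the latent ability.</p> <p>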
By employing the Rasch model, the response patterns can be generated. Each examinee's response vector is represented as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mrow><mi mathvariant="bold-italic">π</mi></mrow><mi>l</mi></msup><annotation encoding="application/x-tex">${{{\bm{\pi }}}^l}$</annotation></semantics></math> </ephtml> . Then, for the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>z</mi><annotation encoding="application/x-tex">$z$</annotation></semantics></math> </ephtml> th item, each examinee's sequence was generated from a unique Markov matrix, and all 3,000 chains were associated with a common uniform matrix.</p> <p>Consider an assessment comprising solely multiple‐choice items, with each item offering four options. In this study, we manipulated six assessment lengths: <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi><mspace width="0.33em" /><mo>=</mo><mspace width="0.33em" /><mn>5</mn></mrow><annotation encoding="application/x-tex">$z\ = \ 5$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>15</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 15$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>25</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 25$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>35</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 35$</annotation></semantics></math> </ephtml> , <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>45</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 
45$</annotation></semantics></math> </ephtml> , and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>55</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 55$</annotation></semantics></math> </ephtml> items. Each item presents examinees with a choice among 10 unique actions, such as entering the item, exiting the item, selecting one of the four options A, B, C, or D, using the calculator or the navigator, and other interactions such as flagging or unflagging an item for review. We assume that an examinee follows a consistent behavioral rule (in this case, the same uniform distribution, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>U</mi><mo>(</mo><mrow><mo>−</mo><mn>15</mn><mo>,</mo><mn>15</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$U({ - 15,15})$</annotation></semantics></math> </ephtml> ) for all items. That is, while each item uses a unique uniform matrix for generating sequences, all matrices are derived from the same uniform distribution. This ensures that the sequence of actions or trajectories an examinee employs in responding to assessment items remains stable, irrespective of the specific item being addressed. For each assessment length condition, the item difficulty parameters <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>b</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{b}_z}$</annotation></semantics></math> </ephtml> were randomly generated from a standard normal distribution, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>N</mi><mo>(</mo><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">$N({0,1})$</annotation></semantics></math> </ephtml> . 
Following Simulation 1, in which 3,000 examinees and a reservoir size of 5,000 yielded the best results, this study generates 3,000 examinees and uses a reservoir size of 5,000. For each item, features are extracted by applying the SRM to its 3,000 sequences. The feature matrices of all items are then horizontally concatenated such that <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">h</mi><mspace width="0.33em" /></mrow><mo>=</mo><mo>(</mo><msub><mi mathvariant="bold-italic">h</mi><mn>1</mn></msub><mspace width="0.33em" /><mo>:</mo><mi>⋯</mi><mo>:</mo><msub><mi mathvariant="bold-italic">h</mi><mi>z</mi></msub><mo>)</mo></mrow><annotation encoding="application/x-tex">${\bm{h\ }} = ({{{\bm{h}}}_1}\ : \cdots :{{{\bm{h}}}_z})$</annotation></semantics></math> </ephtml> .</p> <p>After obtaining the feature matrices, we constructed two linear models. One linear model regresses the simulated <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="bold-italic">θ</mi></mrow><annotation encoding="application/x-tex">${\bm{\theta }}$</annotation></semantics></math> </ephtml> values on both features and responses, denoted as <emph>Rsp+ProcData</emph>, while the other model regresses the simulated <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="bold-italic">θ</mi></mrow><annotation encoding="application/x-tex">${\bm{\theta }}$</annotation></semantics></math> </ephtml> values on responses only, denoted as <emph>RspData</emph>. Both linear models can be written as in Equation 14. 
14 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">θ</mi></mrow><mo linebreak="badbreak">=</mo><mi>X</mi><mi>β</mi><mo linebreak="goodbreak">+</mo><mi>ε</mi><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{\bm{\theta }} = X\beta + \epsilon ,\end{equation}$$</annotation></semantics></math> </ephtml> where for the <emph>Rsp+ProcData</emph> model, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mspace width="0.33em" /><mo>=</mo><mo>[</mo><mrow><mn>1</mn><mo>:</mo><mrow><mi mathvariant="bold-italic">π</mi></mrow><mo>:</mo><mi mathvariant="bold-italic">h</mi></mrow><mo>]</mo></mrow><annotation encoding="application/x-tex">$X\ = [ {1:{\bm{\pi }}:{\bm{h}}} ]$</annotation></semantics></math> </ephtml> means the column concatenation of vector <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 1$</annotation></semantics></math> </ephtml> , response matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">π</mi></mrow><mo>,</mo></mrow><annotation encoding="application/x-tex">${\bm{\pi }},$</annotation></semantics></math> </ephtml> and feature matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi mathvariant="bold-italic">h</mi><annotation encoding="application/x-tex">${\bm{h}}$</annotation></semantics></math> </ephtml> , and for the <emph>RspData</emph> model, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mspace width="0.33em" /><mo>=</mo><mo>[</mo><mrow><mn>1</mn><mo>:</mo><mrow><mi mathvariant="bold-italic">π</mi></mrow></mrow><mo>]</mo></mrow><annotation encoding="application/x-tex">$X\ = [ {1:{\bm{\pi }}} 
]$</annotation></semantics></math> </ephtml> means the column concatenation of vector <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn></mrow><annotation encoding="application/x-tex">$\hskip.001pt 1$</annotation></semantics></math> </ephtml> and response matrix <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="bold-italic">π</mi></mrow><annotation encoding="application/x-tex">${\bm{\pi }}$</annotation></semantics></math> </ephtml> . The least absolute shrinkage and selection operator (LASSO) is used to select useful features in the linear model.</p> <p>We then evaluate the two models by the root mean square error (RMSE) of ability recovery. The RMSE over all examinees' latent abilities is calculated by Equation 15: 15 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi>RMSE</mi><mspace width="0.33em" /></mrow><mo linebreak="badbreak">=</mo><msqrt><mrow><mfrac><mn>1</mn><mi>L</mi></mfrac><msubsup><mo>∑</mo><mrow><mi>l</mi><mo>=</mo><mn>1</mn></mrow><mi>L</mi></msubsup><msup><mfenced separators="" open="(" close=")"><mrow><mover accent="true"><msub><mi>θ</mi><mi>l</mi></msub><mo>̂</mo></mover><mo>−</mo><msub><mi>θ</mi><mi>l</mi></msub></mrow></mfenced><mn>2</mn></msup></mrow></msqrt><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{\mathrm{RMSE\ }} = \sqrt {\frac{1}{L}\sum_{l = 1}^L {{{\left({\widehat {{{\theta }_l}} - {{\theta }_l}} \right)}}^2}} ,\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>θ</mi><mi>l</mi></msub><annotation encoding="application/x-tex">${{\theta }_l}$</annotation></semantics></math> </ephtml> stands for the true ability of the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation 
encoding="application/x-tex">$l$</annotation></semantics></math> </ephtml> th examinee, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover accent="true"><msub><mi>θ</mi><mi>l</mi></msub><mo>̂</mo></mover><annotation encoding="application/x-tex">$\widehat {{{\theta }_l}}$</annotation></semantics></math> </ephtml> is the estimated ability. The lower the RMSE, the better the model prediction. Model fit indices used in this study are residual standard error (RSE) and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>R</mi><mn>2</mn></msup><annotation encoding="application/x-tex">${{R}^2}$</annotation></semantics></math> </ephtml> . RSE measures the standard deviation of the residuals in a regression model by calculating 16 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi>RSE</mi><mspace width="0.33em" /></mrow><mo linebreak="badbreak">=</mo><msqrt><mfrac><mrow><msub><mo>∑</mo><mi>l</mi></msub><msup><mfenced separators="" open="(" close=")"><mrow><msub><mi>θ</mi><mi>l</mi></msub><mo>−</mo><mover accent="true"><msub><mi>θ</mi><mi>l</mi></msub><mo>̂</mo></mover></mrow></mfenced><mn>2</mn></msup></mrow><mrow><mi>d</mi><mi>f</mi></mrow></mfrac></msqrt><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{\mathrm{RSE\ }} = \sqrt {\frac{{\sum_l {{{\left({{{\theta }_l} - \widehat {{{\theta }_l}}} \right)}}^2}}}{{df}}} ,\end{equation}$$</annotation></semantics></math> </ephtml> where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>θ</mi><annotation encoding="application/x-tex">$\theta $</annotation></semantics></math> </ephtml> is the true ability, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover accent="true"><mi>θ</mi><mo>̂</mo></mover><annotation encoding="application/x-tex">$\hat{\theta }$</annotation></semantics></math> 
</ephtml> is the predicted ability, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>d</mi><mi>f</mi></mrow><annotation encoding="application/x-tex">$df$</annotation></semantics></math> </ephtml> is the degrees of freedom. The lower the RSE, the better the model fit. <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>R</mi><mn>2</mn></msup><annotation encoding="application/x-tex">${{R}^2}$</annotation></semantics></math> </ephtml> represents the proportion of the variance in the predicted variable that is explained by the predictors in the regression model, computed as 17 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>R</mi><mn>2</mn></msup><mo linebreak="badbreak">=</mo><mn>1</mn><mo linebreak="goodbreak">−</mo><mfrac><mrow><mi>R</mi><mi>S</mi><mi>S</mi></mrow><mrow><mi>T</mi><mi>S</mi><mi>S</mi></mrow></mfrac><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{R}^2} = 1 - \frac{{RSS}}{{TSS}},\end{equation}$$</annotation></semantics></math> </ephtml> where <emph>RSS</emph> is the residual sum of squares and <emph>TSS</emph> is the total sum of squares. The higher the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>R</mi><mn>2</mn></msup><annotation encoding="application/x-tex">${{R}^2}$</annotation></semantics></math> </ephtml> , the better the model fit.</p> <hd id="AN0192629994-20">Results</hd> <p>Applying PCA to the features extracted for each item reveals distinct patterns and relationships. The principal components derived from examinees with comparable latent abilities are similar across items. Figure 6 shows a plot of the first and second principal components for each examinee from one of the items. 
The legend on the right of the figure indicates the generated <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>θ</mi><annotation encoding="application/x-tex">$\theta $</annotation></semantics></math> </ephtml> for each examinee. The darker dots indicate a higher <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>θ</mi><annotation encoding="application/x-tex">$\theta $</annotation></semantics></math> </ephtml> and the lighter dots indicate a lower <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>θ</mi><annotation encoding="application/x-tex">$\theta $</annotation></semantics></math> </ephtml> . Examinees located close to each other have similar latent ability values, which suggests that their ability information is represented, in compressed form, in the extracted features.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0006.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0006.jpg" title="6 PCA of the features extracted from 3,000 examinees." /> </p> <p></p> <p>The RMSE values for the recovery of the latent ability by the two models are shown in Table 5. The <emph>Rsp+ProcData</emph> model recovers ability better than the <emph>RspData</emph> model at every assessment length (for example, an RMSE of .399 versus .711 with 5 items, and .263 versus .298 with 55 items). In addition, both linear models recovered ability more accurately as the number of items increased. 
This result suggests that incorporating process data can offer valuable insights into examinees' latent abilities.</p> <p>Table 5. RMSEs of the Latent Ability Recovery Given by the Rsp+ProcData Model and the RspData Model</p> <p> <ephtml> <table><thead><tr><th /><th align="center">Number of Items in the Assessment</th></tr><tr><th /><th>5</th><th>15</th><th>25</th><th>35</th><th>45</th><th>55</th></tr></thead><tbody><tr><td>Rsp+ProcData</td><td>.399</td><td>.351</td><td>.331</td><td>.303</td><td>.287</td><td>.263</td></tr><tr><td>RspData</td><td>.711</td><td>.513</td><td>.430</td><td>.387</td><td>.346</td><td>.298</td></tr></tbody></table> </ephtml> </p> <p>Figure 7 plots the two model fit indices of the <emph>Rsp+ProcData</emph> model versus the <emph>RspData</emph> model. In this figure, the <emph>x</emph>‐axis indicates the fit index values from the <emph>RspData</emph> model and the <emph>y</emph>‐axis indicates the fit index values from the <emph>Rsp+ProcData</emph> model. The number of items in the assessment is represented by the radius of each symbol, as shown in the legend. The diagonal line separates an upper triangle section from a lower triangle section. Symbols in the upper triangle indicate that the fit index value from the <emph>Rsp+ProcData</emph> model is higher than that from the <emph>RspData</emph> model, and symbols in the lower triangle indicate that it is lower. All <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>R</mi><mn>2</mn></msup><annotation encoding="application/x-tex">${{R}^2}$</annotation></semantics></math> </ephtml> values fall into the upper triangle section, while all RSE values fall into the lower triangle section. Because a higher <emph>R</emph>‐squared and a lower RSE both indicate better fit, the <emph>Rsp+ProcData</emph> model fits better than the <emph>RspData</emph> model on both indices. 
In essence, adding information from process data appears to improve the fit of the linear model relative to using response data alone. Moreover, as more item responses are added to the linear model, the symbols tend to approach the diagonal line. This trend hints that the disparities between the two models may diminish with the inclusion of more item responses.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0007.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0007.jpg" title="7 The model fit index values of Rsp+ProcData model versus the RspData model." /> </p> <p></p> <hd id="AN0192629994-23">Empirical Study</hd> <p></p> <hd id="AN0192629994-24">Data Description</hd> <p>An empirical data application is presented here to assess the practical utility of the features extracted from the process data using the proposed SRM. The data were obtained from a competition that uses data from the National Assessment of Educational Progress and Educational Testing Service.[<reflink idref="bib1" id="ref40">1</reflink>] The primary objective of the competition was to develop state‐of‐the‐art models that can leverage behavioral data to accurately predict examinees' test‐taking efficiency. The 8th‐grade mathematics assessment was administered in the 2016‐2017 academic year, and the data set contains a de‐identified compilation of action sequences made by examinees. Examinees were provided with multiple‐choice items, drag and drop items, or constructed‐response items. For some items, an on‐screen calculator and drawing tools were available. There was also a text‐to‐speech feature that allowed examinees to listen to the task materials. In this assessment, examinees responded to two "blocks." We refer to them here as Blocks A and B. Block A contained 19 items and Block B contained 15 items. Each examinee had a 30‐minute time limit to complete the items in a block. 
Once the 30‐minute limit was reached, the examinee was automatically cut off from further activities in the block, regardless of how many items they had completed.</p> <p>To assess the efficiency with which examinees completed Block B, human raters assigned a binary label to each examinee based on criteria set by the assessment organizers. Specifically, the assessment organizers defined efficient use of time as (1) being able to complete all problems in Block B, and (2) being able to allocate a reasonable amount of time to solve each item. The data organizers regarded this "reasonable amount of time" as the minimum possible time needed to solve each item. They chose the threshold based on the distribution of the total amount of time students spent on each problem in the dataset. Specifically, for each item in Block B, they ranked the total amount of time students took to complete the item and used the 5th percentile as the cut‐off for the "reasonable amount of time."</p> <p>The process data were partitioned by the data organizers into two distinct subsets. Subset 1 comprised action sequences from 1,232 examinees recorded throughout the entire 30‐minute duration of Block A. A binary indicator was used to denote each examinee's efficiency in completing Block B. Subset 2 was divided into three separate portions, each with its own set of binary efficiency labels. The first portion included the initial 10 minutes of process data for 411 examinees from the onset of Block A. The second portion captured the first 20 minutes of process data for a distinct set of 411 examinees. Finally, the third portion contained the complete 30 minutes of process data for the remaining 410 examinees. 
The structure of these subsets and blocks is presented in Figure 8.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0008.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0008.jpg" title="8 Empirical data sets and assessment block structure." /> </p> <p></p> <hd id="AN0192629994-26">Data Analysis Plan</hd> <p>This empirical study uses the SRM to process data from Block A, extract features, and predict the binary labels indicating each examinee's efficiency in Block B, with 10‐fold cross‐validation for robustness. The two subsets are analyzed separately. In Subset 1, the SRM analyzes the complete 30‐minute action sequences from 1,232 examinees to generate features for efficiency prediction. For Subset 2, a similar approach is taken, with one key distinction: the SRM is applied independently to each of the three portions of the data, predicting the efficiency labels for the corresponding portion.</p> <p>The empirical analysis addressed three questions. First, using Subset 1, we examined whether informative attributes can be identified from an examinee's complete sequence of actions within Block A, in order to predict the examinee's subsequent efficiency within Block B. Second, using Subset 2, we evaluated whether attributes derived from only a partial segment of the process capture sufficient useful information. It is important to note that this analysis predicts performance in Block B from Block A data. It therefore assumes that an examinee's behavioral tendencies, such as the allocation of reasoning time and response pace, are consistent across assessment blocks, thereby reflecting intra‐individual stability. 
Finally, for the results obtained from both subsets, we offer exploratory interpretations of the extracted attributes based on their operational relevance.</p> <p>Table 6 lists the descriptive statistics for the two subsets. Total Length is the total number of actions across all examinees in each data set; Unique Actions is the number of unique actions in the log data; and Average Length is the mean number of actions per examinee, with the standard deviation in parentheses. The numbers of examinees labeled efficient and inefficient in each subset are also listed in this table.</p> <p>Table 6. Descriptive Statistics for the NAEP Math Assessment Subset 1 and Subset 2 Process Data</p> <p> <ephtml> <table><thead><tr><th /><th>Total Length</th><th>Unique Actions</th><th>Average Length</th><th>Efficient</th><th>Inefficient</th></tr></thead><tbody><tr><td>Subset 1</td><td>438,291</td><td>42</td><td>356 (166)</td><td>744</td><td>488</td></tr><tr><td>Portion 1</td><td>47,563</td><td>39</td><td>116 (57)</td><td>248</td><td>163</td></tr><tr><td>Portion 2</td><td>110,481</td><td>41</td><td>269 (116)</td><td>248</td><td>163</td></tr><tr><td>Portion 3</td><td>143,880</td><td>42</td><td>351 (158)</td><td>248</td><td>162</td></tr></tbody></table> </ephtml> </p> <p>Efficiency labels are predicted from the extracted features with a support vector machine (SVM; Noble, [<reflink idref="bib22" id="ref43">22</reflink>]). SVM is a supervised learning algorithm that learns a linear or nonlinear decision boundary to classify labeled samples (see SVM details in the Appendix). 
Model prediction is then evaluated with the adjusted area under the curve, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> , defined in Equation 18, which indicates how well the model predicts the outcome. 18 <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub><mo linebreak="badbreak">=</mo><mn>2</mn><mo linebreak="goodbreak">×</mo><mfenced separators="" open="(" close=")"><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>o</mi><mi>r</mi><mi>i</mi></mrow></msub><mo>−</mo><mn>0.5</mn></mrow></mfenced><mo>.</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}AU{{C}_{adj}} = 2 \times \left({AU{{C}_{ori}} - 0.5} \right).\end{equation}$$</annotation></semantics></math> </ephtml></p> <p>Here <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>o</mi><mi>r</mi><mi>i</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{ori}}$</annotation></semantics></math> </ephtml> is the original area under the curve, and .5 is the value expected from random guessing in a classification task with binary labels. The adjusted AUC is therefore 0 for random guessing and 1 for perfect classification, which shows more directly how much better than chance the model performs. The higher the adjusted AUC, the better the model classification.</p> <hd id="AN0192629994-27">Results 1: Feature Extraction and Model Evaluation Based on Subset 1</hd> <p>We first reorganized the process data by both the personal ID and the item identification number. 
That is, the actions are first grouped by item and then, within each item, organized by examinee. The SRM is thus applied to the process information for each item for all 1,232 examinees in this subset. For the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>z</mi><annotation encoding="application/x-tex">$z$</annotation></semantics></math> </ephtml> th item, the extracted high‐dimensional feature matrix is denoted as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="bold-italic">h</mi><mrow><mo>(</mo><mi>z</mi><mo>)</mo></mrow></msup><annotation encoding="application/x-tex">${{{\bm{h}}}^{(z)}}$</annotation></semantics></math> </ephtml> . A series of SVM models is then constructed, each obtained by adding one more item's feature matrix, such that <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi mathvariant="bold-italic">H</mi><mspace width="0.33em" /></mrow><mo>=</mo><mspace width="0.33em" /><mo>(</mo><mrow><msup><mi mathvariant="bold-italic">h</mi><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></msup><mo>:</mo><mi>⋯</mi><mo>:</mo><msup><mi mathvariant="bold-italic">h</mi><mrow><mo>(</mo><mi>z</mi><mo>)</mo></mrow></msup></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">${\bm{H\ }} = \ ({{{{\bm{h}}}^{(1)}}: \cdots :{{{\bm{h}}}^{(z)}}})$</annotation></semantics></math> </ephtml> . The SVM‐Recursive Feature Elimination method (SVM‐RFE; Guyon et al., [<reflink idref="bib10" id="ref44">10</reflink>]) was used to select features. It ranks features by their relevance to the cost function through backward sequential selection.</p> <p>A reservoir size of 5,000 was used, given its optimal results in our simulation studies. 
The SRM tried each of the predefined numbers of features (25, 50, 75, 100, 125, 150, 175, and 200) and selected the one that produced the best result. Table 7 shows the total length of the action sequences for each item and the corresponding optimal number of features selected from the predefined values. As the table shows, the number of features selected by the SRM through its optimization process varies across items. The smallest number of features is 50, and the largest is 150.</p> <p>Table 7. Item Sequence Length and Number of Features</p> <p> <ephtml> <table><thead><tr><th>Item</th><th>Number of Examinees</th><th>Length of Sequence</th><th>Number of Features</th></tr></thead><tbody><tr><td>VH098519</td><td>1,232</td><td>16,768</td><td>100</td></tr><tr><td>VH098522</td><td>1,057</td><td>18,837</td><td>100</td></tr><tr><td>VH098556</td><td>1,097</td><td>6,808</td><td>50</td></tr><tr><td>VH098597</td><td>1,121</td><td>9,286</td><td>50</td></tr><tr><td>VH098740</td><td>1,229</td><td>12,235</td><td>75</td></tr><tr><td>VH098753</td><td>1,229</td><td>19,270</td><td>100</td></tr><tr><td>VH098759</td><td>1,229</td><td>20,905</td><td>100</td></tr><tr><td>VH098779</td><td>1,083</td><td>8,897</td><td>50</td></tr><tr><td>VH098783</td><td>1,219</td><td>18,584</td><td>100</td></tr><tr><td>VH098808</td><td>1,229</td><td>18,756</td><td>100</td></tr><tr><td>VH098810</td><td>1,232</td><td>9,178</td><td>50</td></tr><tr><td>VH098812</td><td>1,210</td><td>11,823</td><td>75</td></tr><tr><td>VH098834</td><td>1,070</td><td>8,351</td><td>50</td></tr><tr><td>VH098839</td><td>1,157</td><td>11,896</td><td>75</td></tr><tr><td>VH134366</td><td>1,230</td><td>71,979</td><td>150</td></tr><tr><td>VH134373</td><td>1,184</td><td>34,382</td><td>100</td></tr><tr><td>VH134387</td><td>1,226</td><td>34,118</td><td>100</td></tr><tr><td>VH139047</td><td>1,228</td><td>29,260</td><td>100</td></tr><tr><td>VH139196</td><td>1,201</td><td>61,587</td><td>150</td></tr></tbody></table> </ephtml> </p> 
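<p>To make the evaluation metric concrete, the following minimal Python sketch computes the adjusted AUC of Equation 18 from binary labels and classifier scores. The function names <emph>auc</emph> and <emph>adjusted_auc</emph>, the pairwise (rank‐based) way of computing the original AUC, and the toy labels and scores are illustrative assumptions; the sketch is not the authors' implementation and does not include the SVM itself.</p>

```python
def auc(labels, scores):
    """Original AUC: the share of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def adjusted_auc(labels, scores):
    """Equation 18: rescale so that chance is 0 and perfect separation is 1."""
    return 2.0 * (auc(labels, scores) - 0.5)


# Hypothetical toy data: 1 = efficient, 0 = inefficient.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
print(adjusted_auc(labels, scores))
```

<p>In this toy example, five of the six positive and negative pairs are ordered correctly, so the original AUC is 5/6 and the adjusted AUC is about .67. By the same formula, an adjusted AUC of .40 corresponds to an original AUC of .70.</p>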
<p>Figure 9 shows how <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> changes as each item's feature matrix is added, one at a time, to the SVM models. Each tick on the <emph>x</emph>‐axis indicates the addition of the corresponding item's features to the SVM classification model. In addition, each adjusted AUC value is annotated with the number of unique actions for the item just added. For example, there are a total of 29 unique actions in item VH098519. We can observe that, with the addition of item process information, the SVM model <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> gradually increases. Another finding is that the process information of some items appears to contribute more than that of other items. However, a small increase in <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> does not necessarily imply that an item lacks informative value; rather, it suggests that the item contributes limited additional information over and above what has already been accounted for by other items included earlier in the model. That is, different results might be obtained if the order of item inclusion were altered. 
The correlation between the number of unique actions and the corresponding increase in <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> was .221, which suggests that there is not a strong relationship between the number of unique actions and the increase in <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> . The specific adjusted AUC increase for each item can be found in the Appendix (Table A2). Figure 9 shows that initially, when only one item is in the model, the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> is poor. This is expected, as the model is just beginning to capture the underlying patterns with limited data. As more items are added and the model has access to a broader range of data and interactions, the <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> rises above .40. 
This improvement signals a high predictive performance, demonstrating the SRM's effectiveness in leveraging more comprehensive data to better predict the outcomes.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0009.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0009.jpg" title="9 Adjusted AUC value changes against the addition of new item's features in the SVM. Each adjusted AUC value resulted from adding each item is annotated with the number of unique actions for that item." /> </p> <p></p> <p>Because a substantial number of features are extracted automatically, their relative importance in the classification process is not known in advance. Therefore, a feature importance analysis was performed along with the SVM‐RFE. Importance is defined as the weight of each feature in the SVM (Huang et al., [<reflink idref="bib13" id="ref45">13</reflink>]), scaled to lie between 0 and 100. Furthermore, these automatically selected features are difficult to interpret directly. To facilitate their interpretation, we report the correlations between each feature vector and a set of manually created variables. These variables include the median action length of each examinee, the total action length of each examinee, the answer changes made by each examinee when responding to the item, the count of items to which each examinee provided multiple responses, and other relevant actions taken by examinees during the assessment. Correlations were computed between each feature vector and each manually defined variable.</p> <p>There were hundreds of features in this analysis. Table 8 presents the top 10 features based on the importance analysis and their correlations with the selected variables. 
We then interpret each feature by identifying the manually defined variable with which it correlates most strongly. For instance, V1 has a correlation of −.43 with the action length of item VH134387, that is, each examinee's sequence length on that item. This may indicate that an examinee with a longer action sequence on item VH134387 tends to have a lower value on V1, which in turn could shift the SVM's determination of the efficiency level. V5 can be interpreted as representing the number of eliminated choices: an examinee who eliminated more choices would tend to have a higher value on V5. Note that the interpretation of each feature is based solely on the variable that exhibits the highest correlation with that feature. It is important to acknowledge that multiple variables may be correlated with each feature. Each manually defined variable captures specific aspects of these behaviors, but given the potential for correlations between variables and the complexity of the data, these variables alone may not be sufficient for predictive modeling. 
Consequently, the features, representing a synthesis of these variables and potentially others, provide a comprehensive, multidimensional perspective of the data, enabling a more robust analysis and understanding of assessment behaviors.</p> <p>Table 8. Interpretation of the Top 10 Important Features</p> <p> <ephtml> <table><thead><tr><th>Feature</th><th align="center">Manually Defined Variable</th><th>Correlation</th></tr></thead><tbody><tr><td>V1</td><td>Action length of item VH134387</td><td>−.43<ext-link /><sup>*</sup></td></tr><tr><td>V2</td><td>Action length of item VH098834</td><td>−.54<ext-link /><sup>*</sup></td></tr><tr><td>V3</td><td>Action length of item VH098759</td><td>−.62<ext-link /><sup>*</sup></td></tr><tr><td>V4</td><td>Action length of item VH134366</td><td>−.67<ext-link /><sup>*</sup></td></tr><tr><td>V5</td><td>Number of eliminating choice</td><td>.38<ext-link /><sup>*</sup></td></tr><tr><td>V6</td><td>Action length of item VH139196</td><td>−.55<ext-link /><sup>*</sup></td></tr><tr><td>V7</td><td>Action length of item VH098597</td><td>−.61<ext-link /><sup>*</sup></td></tr><tr><td>V8</td><td>Number of clicking progress navigator</td><td>.40<ext-link /><sup>*</sup></td></tr><tr><td>V9</td><td>Action length of item VH139047</td><td>−.44<ext-link /><sup>*</sup></td></tr><tr><td>V10</td><td>Number of opening calculator</td><td>−.39<ext-link /><sup>*</sup></td></tr></tbody></table> </ephtml> </p> <p>* A significant correlation at <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mo>=</mo><mn>0.05</mn></mrow><annotation encoding="application/x-tex">$\alpha = 0.05$</annotation></semantics></math> </ephtml> .</p> <hd id="AN0192629994-29">Results 2: Feature Extraction and Model Evaluation Based on Subset 2</hd> <p>We similarly applied SRM directly on the three portions of Subset 2. 
Table 9 shows the sample size, sequence length, extracted feature numbers, and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> values for each portion. The <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> values for the three portions were .12, .34, and .39, respectively. These <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml> values indicate a range from fair to approaching good predictive performance. This implies that longer process information tends to result in higher classification accuracy. It also suggests that a portion of the entire process may contain valuable information, and that a 20‐minute subset of the data might be sufficient to predict examinees' efficiency levels. 
Additionally, combining processes from different items seems to be useful in predicting examinees' efficiency levels.</p> <p>Table 9. Feature Number and Adjusted AUC Value for Each Portion in Subset 2</p> <p> <ephtml> <table><thead><tr><th>Data</th><th>Sample Size</th><th>Length of Sequence</th><th>Feature Number</th><th><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math></p></th></tr></thead><tbody><tr><td>Portion 1</td><td>411</td><td>47,563</td><td>75</td><td>.12</td></tr><tr><td>Portion 2</td><td>411</td><td>110,481</td><td>100</td><td>.34</td></tr><tr><td>Portion 3</td><td>410</td><td>143,880</td><td>150</td><td>.39</td></tr></tbody></table> </ephtml> </p> <p>We followed the same procedure, conducting the importance analysis and calculating the correlations of the top 10 important features with a set of manually defined variables (Table 10) to provide feature interpretations. For instance, V2 has a correlation of .64 with the number of items the examinee responded to more than once. This may suggest that the more items an examinee responds to more than once, the higher the value of V2 is likely to be. One possible reason is that the examinee may have sufficient time to go back to items. The differences in possible interpretations between the features in Subset 2 and Subset 1 stem from their derivation from distinct data subsets, with each subset representing a unique organizational and temporal structure of the process data. Given the structural differences between Subset 1 and Subset 2, the features learned by SRM reflect the specific contexts and constraints of each subset. 
Subset 1, reorganized at the item level, emphasizes features related to item‐specific interactions, such as time spent on each item. In contrast, Subset 2 focuses on time‐controlled data, capturing the sequence of actions across several items within fixed time intervals. Compared with the features extracted from Subset 1, which mainly focus on the information from a single item, the features extracted from a combination of all item processes in Subset 2 tend to contain some global information, such as how many times the examinee used the math keypress or how many items the examinee responded to more than once.</p> <p>Table 10. Interpretation of the Top 10 Important Features</p> <p> <ephtml> <table><thead><tr><th>Feature</th><th align="center">Manually Defined Variable</th><th>Correlation</th></tr></thead><tbody><tr><td>V1</td><td>The total sequence length of all items in Block A</td><td>−.59<ext-link /><sup>*</sup></td></tr><tr><td>V2</td><td>The number of items the examinee entered more than once</td><td>.64<ext-link /><sup>*</sup></td></tr><tr><td>V3</td><td>Action length of item VH098834</td><td>−.58<ext-link /><sup>*</sup></td></tr><tr><td>V4</td><td>The median number of times the examinee entered each item</td><td>.47<ext-link /><sup>*</sup></td></tr><tr><td>V5</td><td>Action length of item VH134366</td><td>−.66<ext-link /><sup>*</sup></td></tr><tr><td>V6</td><td>The median number of changes made to each item</td><td>−.48<ext-link /><sup>*</sup></td></tr><tr><td>V7</td><td>Action length of item VH139196</td><td>−.71<ext-link /><sup>*</sup></td></tr><tr><td>V8</td><td>Action length of item VH098597</td><td>−.39<ext-link /><sup>*</sup></td></tr><tr><td>V9</td><td>The total number of times the examinee used the math keypress</td><td>−.44<ext-link /><sup>*</sup></td></tr><tr><td>V10</td><td>The median number of times the examinee lost focus</td><td>−.37<ext-link /><sup>*</sup></td></tr></tbody></table> </ephtml> </p> <p>* A significant 
correlation at <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mo>=</mo><mn>0.05</mn></mrow><annotation encoding="application/x-tex">$\alpha = 0.05$</annotation></semantics></math> </ephtml> .</p> <hd id="AN0192629994-30">Summary and Discussion</hd> <p>The present study explores the effectiveness of the proposed sequential reservoir method (SRM) for extracting features from a specific type of process data: log files. Given the variability in examinees' actions on each item within an assessment, both within and among examinees, the SRM successfully converts varying‐length action sequence data for each examinee into fixed‐length vectors, thereby facilitating the creation of a feature matrix. In essence, when log files are input into the SRM, categorical action sequences are transformed into embedded action sequences. These sequences are then processed by an Echo State Network with an optimized reservoir weight matrix to derive final features for each examinee. Subsequently, these features can be used for any statistical analyses of the desired target variable.</p> <p>Two simulation studies were conducted to demonstrate the efficacy of SRM features from different angles. In Simulation 1, we explored the potential of using features extracted from simulated action sequences, presumed to exhibit a discrete group structure, for examinee classification. Results revealed that the group classification accuracy using SRM features surpassed that of baseline features, indicating that SRM‐derived features effectively captured meaningful process information about various Markov groups. Moreover, larger sample sizes and reservoirs were found to enhance classification accuracy. In Simulation 2, we conducted a comparative analysis of examinees' latent abilities and model fit using two different models. One model integrated both process data and response data, while the other relied solely on response data. 
The findings demonstrated that incorporating SRM features alongside final responses led to reduced errors in estimating latent abilities compared to relying solely on responses. In terms of model fit, integrating SRM features within the linear model, alongside response patterns, resulted in superior model fit compared to the model based exclusively on responses.</p> <p>The application of SRM to the empirical data from the NAEP math assessment served multiple purposes. Results from Subset 1 demonstrate the efficacy of using comprehensive item‐specific process information to forecast examinees' efficiency levels. These efficiency levels are indicative of the temporal distribution of examinee behavior during the assessment. Notably, incorporating features from more items increased the predictive accuracy for efficiency levels. Nonetheless, certain item process features appeared to carry more weight than others, highlighting the usefulness of including process features to anticipate the temporal allocation efficiency in examinees' assessment behaviors. Interpretation of the important features was based on their correlations with defined variables, such as sequence length. These analyses further suggest that spending more time on certain actions is likely to influence examinees' overall efficiency level in the assessment. A comprehensive set of variables was defined to capture a wide array of behaviors within the process data. It is important to note that these variables may not exhaust all the information contained within the data. Our approach was guided by the goal of exploratory analysis, which is to identify and describe patterns that could offer insights into examinee behaviors during assessment. Consequently, these manually defined variables serve primarily explanatory purposes rather than predictive ones. 
The rationale behind this methodological choice stems from the complexity of the process data and the multifaceted nature of the behaviors the features captured. Looking ahead, there is ample opportunity for further research from a psychometric standpoint. Future work could focus on investigating the connections between SRM‐derived features and item‐level characteristics such as item discrimination or item difficulty. This could potentially improve understanding of the interplay between examinee behavior and assessment design.</p> <p>In Subset 2, the examinees' partial process information from the entire 30‐minute time limit was used. Results suggest that a portion of the whole process potentially contains pertinent information. Furthermore, the data covering a 20‐minute interval appear to be adequate for anticipating examinees' efficiency levels. It is noteworthy that longer sequences of process data led to better classification outcomes, despite the efficacy demonstrated by the 20‐minute interval. Finally, combining all items' processes resulted in features that contained information for predicting ability, including metrics such as the number of times the examinee used certain actions or entered an item multiple times. The examination of the top 10 important features, such as action lengths across various items, showcases SRM's capacity to autonomously discover latent patterns within process data without relying on training labels. In this empirical study, the SRM extracted features prior to the application of efficiency labels. The encouraging interpretability of these features underscores the independence of the feature extraction process, as the model was not exposed to efficiency labels during the learning phase. 
This exploration demonstrates SRM's capability to discern informative features from the data and also highlights its potential to deliver a comprehensive set of features that can later be correlated with outcome measures like efficiency.</p> <p>While SRM and the sequence‐to‐sequence autoencoder share some common characteristics in their feature extraction methods, there are at least two aspects in which SRM offers distinct advantages. First, the sequence‐to‐sequence autoencoder uses a sequence reconstruction algorithm to find features from the original action sequences, so the target variables must be the original sequences; its utility is therefore somewhat limited when it comes to directly predicting variables that are not sequences or sequence embeddings. In contrast, the optimization in SRM relies on a fitness function, not directly on the original action sequences. The target variable, or output layer, is flexible: SRM allows the output layer to be adapted to a wide variety of target variables that may be of interest in educational assessments. For instance, the output could be configured to predict binary outcomes (e.g., pass/fail), continuous variables (such as response time, engagement, or self‐efficacy measures), or even categorical data. This flexibility is afforded by the model's use of a simple linear learning algorithm to map learned features to the output layer, in addition to the features learned from the reservoir. Second, a sequence‐to‐sequence autoencoder requires the training of both encoder and decoder parts, involving a larger number of parameters and layers to be trained by backpropagation through time or similar algorithms. In contrast, SRM only requires optimizing a dynamic reservoir that captures temporal dependencies and patterns within sequences, without the need for extensive training of all the layers. 
This is particularly appealing for scenarios with limited computational resources, long sequences, or large datasets.</p> <p>One potential limitation of this study is its exclusive focus on log files. This narrow scope may limit the broader applicability of the model to other types of process data, such as eye‐tracking data. Future research could broaden the scope by incorporating a wider array of process data to evaluate the model's performance across diverse settings. Furthermore, while the SRM showed promise in extracting meaningful features and patterns, it is essential to acknowledge that the model relies on certain assumptions about the data, such as that each person's behavior is consistent throughout the assessment. Any discrepancies or anomalies in the input data may affect the model's performance, and regular updates and adaptations to the model would be necessary to ensure its relevance and accuracy. Expanding the scope of data sources, accounting for biases, and exploring practical applications are all vital areas for future research. Despite these limitations, the study lays a foundation for advancements in educational data mining and the potential for more effective personalized behavior analysis.</p> <hd id="AN0192629994-31">Acknowledgments</hd> <p>We express our sincere gratitude to the Editor and the reviewers for their invaluable and constructive feedback on this article. Dr. Jiawei Xiong expresses his sincere gratitude to Prof. George Engelhard, Prof. Seock‐Ho Kim, and Prof. 
Sheng Li for their great support and guidance on this study.</p> <hd id="AN0192629994-32">Appendix Particle Swarm Optimization</hd> <p>In PSO, each particle <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">$r$</annotation></semantics></math> </ephtml> is characterized by a pair of values called position and velocity, denoted as <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mrow><msub><mi>p</mi><mi>r</mi></msub><mo>,</mo><mspace width="0.33em" /><msub><mi>v</mi><mi>r</mi></msub></mrow><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">$({{{p}_r},\ {{v}_r}})$</annotation></semantics></math> </ephtml> . Two further terms are needed. The first is the personal best <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>b</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow></msub><annotation encoding="application/x-tex">${{p}_{r,best}}$</annotation></semantics></math> </ephtml> . This is the best solution found by particle <emph>r</emph> up to this point. The global best, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>g</mi><mrow><mi>b</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow></msub><mo>,</mo></mrow><annotation encoding="application/x-tex">${{g}_{best}},$</annotation></semantics></math> </ephtml> is the best value obtained by any particle in the whole system. When the algorithm stops running, the global best value is taken as the optimal value. Constants serve as controlling coefficients that determine how much weight these best values receive when calculating the movement of the particle, as described in Equations A1 and A2. 
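</p> <p>As a rough illustration of the procedure that Equations A1 and A2 formalize, the following Python sketch applies the velocity and position updates to a toy minimization problem. It is a minimal sketch under stated assumptions, not the authors' implementation: the fitness function (a sphere function to be minimized), the parameter values for the inertia weight and the two constants, the search bounds, and the drawing of a separate random number for each dimension are all choices made here for illustration.</p>

```python
import random


def pso_minimize(fitness, dim, n_particles=20, n_steps=100,
                 w_tia=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` with a basic particle swarm (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Each particle r carries a position p_r and a velocity v_r.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    p_best = [p[:] for p in pos]
    p_best_fit = [fitness(p) for p in pos]
    # Global best is the best personal best found so far.
    g_idx = min(range(n_particles), key=lambda r: p_best_fit[r])
    g_best, g_best_fit = p_best[g_idx][:], p_best_fit[g_idx]

    for _ in range(n_steps):
        for r in range(n_particles):
            for d in range(dim):
                # Equation A1: inertia term plus pulls toward the personal
                # best and the global best (one random draw per dimension,
                # which is an assumption of this sketch).
                vel[r][d] = (w_tia * vel[r][d]
                             + c1 * rng.random() * (p_best[r][d] - pos[r][d])
                             + c2 * rng.random() * (g_best[d] - pos[r][d]))
                # Equation A2: move the particle by its new velocity.
                pos[r][d] += vel[r][d]
            fit = fitness(pos[r])
            if fit < p_best_fit[r]:
                p_best[r], p_best_fit[r] = pos[r][:], fit
                if fit < g_best_fit:
                    g_best, g_best_fit = pos[r][:], fit
    return g_best, g_best_fit


def sphere(p):
    """Toy fitness: squared distance from the origin (minimum at 0)."""
    return sum(x * x for x in p)


best, best_fit = pso_minimize(sphere, dim=2)
```

<p>In the SRM setting the position would encode the quantities being optimized for the reservoir and the fitness function would score the resulting features; both are replaced by toy stand-ins here.</p> <p>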
The particle's velocity and the position of particle <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">$r$</annotation></semantics></math> </ephtml> at time <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">$k$</annotation></semantics></math> </ephtml> in a search space can be updated using Equations  A1 and A2:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable displaystyle="true"><mtr><mtd columnalign="right"><msub><mi>v</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi><mo>+</mo><mn>1</mn></mrow></msub></mtd><mtd><mo>=</mo></mtd><mtd columnalign="left"><mrow><msub><mi>W</mi><mrow><mi>t</mi><mi>i</mi><mi>a</mi></mrow></msub><msub><mi>v</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow></msub><mo linebreak="badbreak">+</mo><msub><mi>C</mi><mn>1</mn></msub><mo linebreak="goodbreak">×</mo><mi>r</mi><mi>a</mi><mi>n</mi><mi>d</mi><mfenced separators="" open="[" close="]"><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow></mfenced><mo linebreak="goodbreak">×</mo><mfenced separators="" open="(" close=")"><mrow><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>b</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow></msub><mo>−</mo><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow></msub></mrow></mfenced></mrow></mtd></mtr><mtr><mtd /><mtd /><mtd columnalign="left"><mrow><mo linebreak="badbreak">+</mo><mspace width="0.28em" /><msub><mi>C</mi><mn>2</mn></msub><mo linebreak="goodbreak">×</mo><mi>r</mi><mi>a</mi><mi>n</mi><mi>d</mi><mfenced separators="" open="[" close="]"><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow></mfenced><mo linebreak="goodbreak">×</mo><mfenced separators="" open="(" 
close=")"><mrow><msub><mi>g</mi><mrow><mi>b</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow></msub><mo>−</mo><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow></msub></mrow></mfenced></mrow></mtd></mtr></mtable><annotation encoding="application/x-tex">$$\begin{eqnarray}{{v}_{r,k + 1}} &=& {{W}_{tia}}{{v}_{r,k}} + {{C}_1} \times rand\left[ {0,1} \right] \times \left({{{p}_{r,best}} - {{p}_{r,k}}} \right)\nonumber\\ &&+\; {{C}_2} \times rand\left[ {0,1} \right] \times \left({{{g}_{best}} - {{p}_{r,k}}} \right)\end{eqnarray}$$</annotation></semantics></math> </ephtml> </p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi><mo>+</mo><mn>1</mn></mrow></msub><mo linebreak="badbreak">=</mo><msub><mi>p</mi><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow></msub><mo linebreak="goodbreak">+</mo><msub><mi>v</mi><mrow><mi>r</mi><mo>,</mo><mrow><mspace width="0.33em" /><mspace width="0.33em" /></mrow><mi>k</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>,</mo></mrow><annotation encoding="application/x-tex">$$\begin{equation}{{p}_{r,k + 1}} = {{p}_{r,k}} + {{v}_{r,{\mathrm{\ \ }}k + 1}},\end{equation}$$</annotation></semantics></math> </ephtml> </p> <p>where <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>W</mi><mrow><mi>t</mi><mi>i</mi><mi>a</mi></mrow></msub><annotation encoding="application/x-tex">${{W}_{tia}}$</annotation></semantics></math> </ephtml> represents the inertia weight applied to control the search, <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>C</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{C}_1}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>C</mi><mn>2</mn></msub><annotation encoding="application/x-tex">${{C}_2}$</annotation></semantics></math> </ephtml> are 
constants controlling the displacements of particles toward the local or the global optima, and the function <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mi>a</mi><mi>n</mi><mi>d</mi><mo stretchy="false">[</mo><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">$rand[ {0,1} ]$</annotation></semantics></math> </ephtml> gives a random value in the range of <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">[</mo><mrow><mn>0</mn><mo>,</mo><mn>1</mn></mrow><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">$[ {0,1} ]$</annotation></semantics></math> </ephtml> . The updated solutions are evaluated by a fitness function defined to indicate the quality and convergence of the optimization.</p> <hd id="AN0192629994-33">Markov Chains</hd> <p>A Markov chain is a stochastic model that describes a sequence of actions in which the probability of each action depends only on the previous action. In using Markov chains, we make the following assumptions:</p> <p></p> <ulist> <item> Markov property: The next action of the process depends only on the current action and not on the sequence of actions that preceded it.</item> <p></p> <item> Stationarity of transition probabilities: The transition probabilities between actions are constant over time.</item> <p></p> <item> Observation independence: Given the current action of the latent process, the observed data at any time point are assumed to be independent of the observations at other time points.</item> <p></p> <item> Finite action: The latent Markov chain is assumed to have a finite number of actions.</item> </ulist> <hd id="AN0192629994-34">Support Vector Machine</hd> <p>SVM is a supervised algorithm that learns either a linear or a nonlinear decision boundary to classify labeled samples. 
For example, for the input data <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="bold-italic">x</mi><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mi mathvariant="normal">O</mi></msup></mrow><annotation encoding="application/x-tex">${\bm{x}} \in {{\mathbb{R}}^{\mathrm{O}}}$</annotation></semantics></math> </ephtml> , the SVM can transform the input data into a newly created feature space, in order to make classifications by identifying a boundary between classes based on the transformed features <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="bold-italic">f</mi><mo>=</mo><mi>φ</mi><mrow><mo stretchy="false">(</mo><mi mathvariant="bold-italic">x</mi><mo stretchy="false">)</mo></mrow><mo>∈</mo><msup><mi mathvariant="double-struck">R</mi><mi mathvariant="normal">D</mi></msup></mrow><annotation encoding="application/x-tex">${\bm{f}} = \varphi ({\bm{x}}) \in {{\mathbb{R}}^{\mathrm{D}}}$</annotation></semantics></math> </ephtml> . Figure A1 describes this transformation process in which new features are created from the original data points. In this way, they provide the boundary to distinguish between classes. It is important to note that, sometimes the dimension of the created new feature space might be higher (i.e., <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="double-struck">R</mi><mi mathvariant="normal">O</mi></msup><mo>⊆</mo><msup><mi mathvariant="double-struck">R</mi><mi mathvariant="normal">D</mi></msup></mrow><annotation encoding="application/x-tex">${{\mathbb{R}}^{\mathrm{O}}} \subseteq {{\mathbb{R}}^{\mathrm{D}}}$</annotation></semantics></math> </ephtml> ) than the original space, as shown in Figure A1. 
In such a case, it is possible to map two‐dimensional data onto a three‐dimensional coordinate system to achieve clear separation between the two classes, Efficient and Inefficient.</p> <p>This decision boundary is a separator that divides data points into their respective classes, where the separator is referred to as a hyperplane. The right‐most coordinate plane in Figure A1 shows that the SVM uses the transformed features <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>φ</mi><mo stretchy="false">(</mo><mi mathvariant="bold-italic">x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">$\varphi ({\bm{x}})$</annotation></semantics></math> </ephtml> to decide the hyperplane <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="script">H</mi><mo>:</mo><mi>g</mi><mo stretchy="false">(</mo><mrow><mi>φ</mi><mo stretchy="false">(</mo><mi mathvariant="bold-italic">x</mi><mo stretchy="false">)</mo></mrow><mo stretchy="false">)</mo><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">$\mathcal{H}:g({\varphi ({\bm{x}})}) = 0$</annotation></semantics></math> </ephtml> and distinguish the classes, where the transformed feature data points are indicated above the upper dotted and below the lower dashed boundaries with distances <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>d</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{d}_1}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>d</mi><mn>2</mn></msub><annotation encoding="application/x-tex">${{d}_2}$</annotation></semantics></math> </ephtml> to the hyperplane. 
These are the support vectors, and the distances <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>d</mi><mn>1</mn></msub><annotation encoding="application/x-tex">${{d}_1}$</annotation></semantics></math> </ephtml> and <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>d</mi><mn>2</mn></msub><annotation encoding="application/x-tex">${{d}_2}$</annotation></semantics></math> </ephtml> from the hyperplane to the support vectors are called the margins. Although there are many candidate hyperplanes, the SVM selects the optimal hyperplane by maximizing, rather than minimizing, the margin. One thing to note here is that some classification models, such as logistic regression, predict the probability of each class as the outcome instead of predicting the labels directly. That is, a data point may be classified as positive if the predicted probability of the positive class is greater than or equal to a threshold such as .5. In contrast, the SVM gives each data point's class directly as the outcome, not the class probability.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/MEA/01mar26/jedm12413-fig-0010.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="jedm12413-fig-0010.jpg" title="A1 SVM transformation process. SVM transforms the original data into a new space and uses a hyperplane to distinguish new features by the two margins d1 and d2." 
/> </p> <p></p> <p>Table A1. Average Best Group Classification Accuracy Comparison</p> <p> <ephtml> <table><thead><tr><th><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><mi>l</mi><annotation encoding="application/x-tex">$l$</annotation></semantics></math></p></th><th><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><msub><mi>N</mi><mi>z</mi></msub><annotation encoding="application/x-tex">${{N}_z}$</annotation></semantics></math></p></th><th>SRM Features</th><th>Embedding Average</th></tr></thead><tbody><tr><td><p>150</p></td><td><p>10</p></td><td><p>.825 (.074)</p></td><td><p>.391 (.137)</p></td></tr><tr><td /><td><p>25</p></td><td><p>.836 (.102)</p></td><td><p>.400 (.098)</p></td></tr><tr><td /><td><p>50</p></td><td><p>.829 (.096)</p></td><td><p>.398 (.115)</p></td></tr><tr><td><p>1,500</p></td><td><p>10</p></td><td><p>.851 (.068)</p></td><td><p>.427 (.168)</p></td></tr><tr><td /><td><p>25</p></td><td><p>.862 (.057)</p></td><td><p>.414 (.151)</p></td></tr><tr><td /><td><p>50</p></td><td><p>.865 (.062)</p></td><td><p>.401 (.099)</p></td></tr><tr><td><p>3,000</p></td><td><p>10</p></td><td><p>.891 (.059)</p></td><td><p>.431 (.103)</p></td></tr><tr><td /><td><p>25</p></td><td><p>.902 (.047)</p></td><td><p>.454 (.077)</p></td></tr><tr><td /><td><p>50</p></td><td><p>.900 (.051)</p></td><td><p>.448 (.102)</p></td></tr></tbody></table> </ephtml> </p> <p><emph>Note</emph>. The values in the parentheses are standard deviations.</p> <p>The right column in Table A1, Embedding Average, is defined as the average of the input embedded action vectors <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi mathvariant="bold-italic">a</mi><mi>l</mi></msup><annotation encoding="application/x-tex">${{{\bm{a}}}^l}$</annotation></semantics></math> </ephtml> . 
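</p> <p>The Embedding Average baseline can be sketched in a few lines of Python. This is an illustrative sketch only: the function name <emph>embedding_average</emph>, the two-dimensional toy embedding, and the action labels are hypothetical and are not taken from the study. It shows how averaging the embedded action vectors turns sequences of different lengths into vectors of the same fixed length.</p>

```python
def embedding_average(sequence, embedding):
    """Average the embedded action vectors of one action sequence."""
    if not sequence:
        raise ValueError("sequence must contain at least one action")
    dim = len(next(iter(embedding.values())))
    total = [0.0] * dim
    for action in sequence:
        vec = embedding[action]
        for d in range(dim):
            total[d] += vec[d]
    # Dividing by the sequence length removes the dependence on length.
    return [t / len(sequence) for t in total]


# Hypothetical 2-dimensional embedding of three action types.
embedding = {
    "click": [1.0, 0.0],
    "type": [0.0, 1.0],
    "next": [1.0, 1.0],
}

# Two examinees with action sequences of different lengths.
short = embedding_average(["click", "next"], embedding)
long = embedding_average(["click", "type", "type", "next"], embedding)
```

<p>Both results have the same length as the embedding dimension, but the order of actions is discarded, which is one plausible reason such a baseline carries less information than features from a recurrent reservoir.</p> <p>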
Using the average of the original embedded sequence vectors as a comparison basis represents an alternative feature extraction method. In addition, when an item involves a greater variety of unique actions, examinees have a wider range of options during the response process, which leads to a sparser final action sequence. The average of the original embedded sequence vectors can therefore provide a succinct, simple overview of the behavioral patterns exhibited by these examinees.</p> <p>Table A2. Number of Unique Actions Found in Each Item and Corresponding Contribution to <ephtml> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math> </ephtml></p> <p> <ephtml> <table><thead><tr><th>Item</th><th>Number of Unique Actions</th><th>Corresponding Increase of <p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics xmlns=""><mrow><mi>A</mi><mi>U</mi><msub><mi>C</mi><mrow><mi>a</mi><mi>d</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">$AU{{C}_{adj}}$</annotation></semantics></math></p></th></tr></thead><tbody><tr><td><p>VH098519</p></td><td><p>29</p></td><td><p>–</p></td></tr><tr><td><p>VH098522</p></td><td><p>29</p></td><td><p>.006</p></td></tr><tr><td><p>VH098556</p></td><td><p>27</p></td><td><p>.014</p></td></tr><tr><td><p>VH098597</p></td><td><p>27</p></td><td><p>.051</p></td></tr><tr><td><p>VH098740</p></td><td><p>30</p></td><td><p>.011</p></td></tr><tr><td><p>VH098753</p></td><td><p>29</p></td><td><p>.007</p></td></tr><tr><td><p>VH098759</p></td><td><p>28</p></td><td><p>.023</p></td></tr><tr><td><p>VH098779</p></td><td><p>25</p></td><td><p>.006</p></td></tr><tr><td><p>VH098783</p></td><td><p>29</p></td><td><p>.007</p></td></tr><tr><td><p>VH098808</p></td><td><p>30</p></td><td><p>.012</p></td></tr><tr><td><p>VH098810</p></td><td><p>27</p></td><td><p>.007</p></td></tr><tr><td><p>VH098812</p></td><td><p>30</p></td><td><p>.010</p></td></tr><tr><td><p>VH098834</p></td><td><p>25</p></td><td><p>.071</p></td></tr><tr><td><p>VH098839</p></td><td><p>28</p></td><td><p>.008</p></t
d></tr><tr><td><p>VH134366</p></td><td><p>32</p></td><td><p>.107</p></td></tr><tr><td><p>VH134373</p></td><td><p>32</p></td><td><p>.011</p></td></tr><tr><td><p>VH134387</p></td><td><p>32</p></td><td><p>.046</p></td></tr><tr><td><p>VH139047</p></td><td><p>28</p></td><td><p>.029</p></td></tr><tr><td><p>VH139196</p></td><td><p>34</p></td><td><p>.049</p></td></tr></tbody></table> </ephtml> </p> <p>Table A3. Simulation Code</p> <p> <ephtml> <table><tbody><tr><td /></tr><tr><td /></tr><tr><td /></tr></tbody></table> </ephtml> </p> <p>Table A4. Algorithm 1 Code</p> <p> <ephtml> <table><tbody><tr><td /></tr><tr><td /></tr><tr><td /></tr><tr><td /></tr></tbody></table> </ephtml> </p> <ref id="AN0192629994-36"> <title> Footnotes </title> <blist> <bibl id="bib1" idref="ref36" type="bt">1</bibl> <bibtext> To accommodate researchers interested in a more detailed exploration of the action sequences, we provide the following link, where access to the data can be granted after agreeing to the data privacy policy: https://sites.google.com/view/dataminingcompetition2019/home</bibtext> </blist> </ref> <ref id="AN0192629994-37"> <title> References </title> <blist> <bibtext> Athreya, K. B., Doss, H., & Sethuraman, J. (1996). On the convergence of the Markov chain simulation method. The Annals of Statistics, 24(1), 69–100.</bibtext> </blist> <blist> <bibl id="bib2" idref="ref12" type="bt">2</bibl> <bibtext> Bejar, I. I., Mislevy, R. J., & Zhang, M. (2016). Automated scoring with validity in mind. In A. A. Rupp & J. P. Leighton (Eds.), The Wiley handbook of cognition and assessment: Frameworks, methodologies, and applications (pp. 226–246). Wiley.</bibtext> </blist> <blist> <bibl id="bib3" idref="ref23" type="bt">3</bibl> <bibtext> Bianchi, F. M., Scardapane, S., Løkse, S., & Jenssen, R. (2020). Reservoir computing approaches for representation and classification of multivariate time series. 
IEEE Transactions on Neural Networks and Learning Systems, 32(5), 2169–2179.</bibtext> </blist> <blist> <bibl id="bib4" idref="ref22" type="bt">4</bibl> <bibtext> Bompas, S., Georgeot, B., & Guéry‐Odelin, D. (2020). Accuracy of neural networks for the simulation of chaotic dynamics: Precision of training data vs precision of the algorithm. Chaos: An Interdisciplinary Journal of Nonlinear Science, 30(11), 113118.</bibtext> </blist> <blist> <bibl id="bib5" idref="ref30" type="bt">5</bibl> <bibtext> Chouikhi, N., Ammar, B., Rokbani, N., & Alimi, A. M. (2017). PSO‐based analysis of Echo State Network parameters for time series forecasting. Applied Soft Computing, 55, 211–225.</bibtext> </blist> <blist> <bibl id="bib6" idref="ref1" type="bt">6</bibl> <bibtext> Ercikan, K., Guo, H., & Por, H. H. (2023). Uses of process data in advancing the practice and science of technology‐rich assessments. In N. Foster & M. Piacentini (Eds.), Innovating assessments to measure and support complex skills. OECD. https://doi.org/10.1787/7b3123f1‐en</bibtext> </blist> <blist> <bibl id="bib7" idref="ref10" type="bt">7</bibl> <bibtext> Ercikan, K., & Pellegrino, J. W. (2017). Validation of score meaning for the next generation of assessments: The use of response processes. Taylor & Francis.</bibtext> </blist> <blist> <bibl id="bib8" idref="ref3" type="bt">8</bibl> <bibtext> Feng, T., & Cai, L. (2024). Sensemaking of process data from evaluation studies of educational games: An application of cross‐classified item response theory modeling. Journal of Educational Measurement. https://doi.org/10.1111/jedm.12396</bibtext> </blist> <blist> <bibl id="bib9" idref="ref17" type="bt">9</bibl> <bibtext> Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.</bibtext> </blist> <blist> <bibtext> Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46(1), 389–422. 
https://doi.org/10.1023/A:1012487302797</bibtext> </blist> <blist> <bibtext> Han, Y., & Wilson, M. (2022). Analyzing student response processes to evaluate success on a technology‐based problem‐solving task. Applied Measurement in Education, 35(1), 33–45.</bibtext> </blist> <blist> <bibtext> He, Q., Borgonovi, F., & Paccagnella, M. (2021). Leveraging process data to assess adults' problem‐solving skills: Using sequence mining to identify behavioral patterns across digital tasks. Computers & Education, 166, 104170. https://doi.org/10.1016/j.compedu.2021.104170</bibtext> </blist> <blist> <bibtext> Huang, M.‐L., Hung, Y.‐H., Lee, W. M., Li, R.‐K., & Jiang, B.‐R. (2014). SVM‐RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier. The Scientific World Journal, 2014, 795624.</bibtext> </blist> <blist> <bibtext> Jaeger, H. (2001). The "echo state" approach to analysing and training recurrent neural networks‐with an erratum note. Bonn, Germany: German National Research Center for Information Technology GMD Technical Report, 148(34), 13.</bibtext> </blist> <blist> <bibtext> Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN'95‐International Conference on Neural Networks, 4, 1942–1948.</bibtext> </blist> <blist> <bibtext> Kim, H. J., Hong, S. E., & Cha, K. J. (2020). seq2vec: Analyzing sequential data using multi‐rank embedding vectors. Electronic Commerce Research and Applications, 43, 101003.</bibtext> </blist> <blist> <bibtext> Kuang, H., & Sahin, F. (2023). Comparison of disengagement levels and the impact of disengagement on item parameters between PISA 2015 and PISA 2018 in the United States. Large‐Scale Assessments in Education, 11(1), 4. https://doi.org/10.1186/s40536‐023‐00152‐0</bibtext> </blist> <blist> <bibtext> Li, Y., & Li, F. (2019). PSO‐based growing echo state network. Applied Soft Computing, 85, 105774.</bibtext> </blist> <blist> <bibtext> Lukoševičius, M. (2012). 
A practical guide to applying echo state networks. In G. Montavon, G. B. Orr, & K.‐R. Müller (Eds.), Neural networks: Tricks of the trade (Vol. 7700, pp. 659–686). Springer Berlin Heidelberg. https://doi.org/10.1007/978‐3‐642‐35289‐8_36</bibtext> </blist> <blist> <bibtext> Medsker, L. R., & Jain, L. C. (2001). Recurrent neural networks. Design and Applications, 5, 64–67.</bibtext> </blist> <blist> <bibtext> Mislevy, R. J. (2019). Advances in measurement and cognition. The Annals of the American Academy of Political and Social Science, 683(1), 164–182.</bibtext> </blist> <blist> <bibtext> Noble, W. S. (2006). What is a support vector machine? Nature Biotechnology, 24(12), 1565–1567.</bibtext> </blist> <blist> <bibtext> Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the difficulty of training recurrent neural networks. In ICML'13: Proceedings of the 30th International Conference on Machine Learning (pp. 1310–1318). JMLR.</bibtext> </blist> <blist> <bibtext> Rasch, G. (1966). An individualistic approach to item analysis. Readings in Mathematical Social Science, 89–108.</bibtext> </blist> <blist> <bibtext> Strauss, T., Wustlich, W., & Labahn, R. (2012). Design strategies for weight matrices of echo state networks. Neural Computation, 24(12), 3246–3276.</bibtext> </blist> <blist> <bibtext> Tang, X., Wang, Z., He, Q., Liu, J., & Ying, Z. (2020). Latent feature extraction for process data via multidimensional scaling. Psychometrika, 85(2), 378–397.</bibtext> </blist> <blist> <bibtext> Tang, X., Wang, Z., Liu, J., & Ying, Z. (2021). An exploratory analysis of the latent structure of process data via action sequence autoencoders. British Journal of Mathematical and Statistical Psychology, 74(1), 1–33.</bibtext> </blist> <blist> <bibtext> Wall, M. E., Rechtsteiner, A., & Rocha, L. M. (2003). Singular value decomposition and principal component analysis. In D. P. Berrar, W. Dubitzky, & M. Granzow (Eds.), A practical approach to microarray data analysis (pp. 91–109). 
Springer.</bibtext> </blist> <blist> <bibtext> Wang, S., Wu, S., Chen, Y., Fang, L., Xiao, L., & Li, F. (2024). Exploring latent constructs through multimodal data analysis. Journal of Educational Measurement. https://doi.org/10.1111/jedm.12412</bibtext> </blist> <blist> <bibtext> Wyffels, F., & Schrauwen, B. (2010). A comparative study of reservoir computing strategies for monthly time series prediction. Neurocomputing, 73(10), 1958–1964. https://doi.org/10.1016/j.neucom.2010.01.016</bibtext> </blist> <blist> <bibtext> Xiong, J. (2022). Exploratory process data analysis in the mixed‐format assessment: Using reservoir computing and topic modeling. PhD Thesis, University of Georgia. https://esploro.libs.uga.edu/esploro/outputs/doctoral/Exploratory‐Process‐Data‐Analysis‐in‐the/9949467728402959</bibtext> </blist> <blist> <bibtext> Xiong, J., Engelhard, G., & Cohen, A. S. (2024). Analysis of mixed‐format assessments using measurement models and topic modeling. Measurement: Interdisciplinary Research and Perspectives, 1–15. https://doi.org/10.1080/15366367.2023.2298135</bibtext> </blist> <blist> <bibtext> Xu, H., Fang, G., & Ying, Z. (2020). A latent topic model with Markov transition for process data. British Journal of Mathematical and Statistical Psychology, 73(3), 474–505.</bibtext> </blist> <blist> <bibtext> Zhang, S., Wang, Z., Qi, J., Liu, J., & Ying, Z. (2023). Accurate assessment via process data. Psychometrika, 88(1), 76–97.</bibtext> </blist> </ref> <aug> <p>By Jiawei Xiong; Shiyu Wang; Cheng Tang; Qidi Liu; Rufei Sheng; Bowen Wang; Huan Kuang; Allan S. Cohen and Xinhui Xiong</p> <p>JIAWEI XIONG is an Associate Research Scientist at Curriculum Associates and a Research Affiliate at the University of Georgia, Aderhold Hall, 110 Carlton Street, Athens, GA 30602; jxiong@cainc.com, jiawei.xiong@uga.edu. 
His primary research interests include educational data mining and machine learning in education.</p> <p>SHIYU WANG is an Associate Professor at The University of Georgia, Aderhold Hall, 110 Carlton Street, Athens, GA 30602; swang44@uga.edu. Her primary research interests include computerized adaptive testing and innovations in latent variable modeling.</p> <p>CHENG TANG is a PhD Student at The University of Georgia, Aderhold Hall, 110 Carlton Street, Athens, GA 30602; cheng.tang@uga.edu. His primary research interests include artificial intelligence for educational measurement, algorithms, automated essay scoring, and computerized adaptive testing.</p> <p>QIDI LIU is a Principal Engineer at GlobalFoundries, 1490 Robinson Pkwy, Essex Junction, VT 05452; qidi.liu.phtotnics@gmail.com. His primary research interests include artificial intelligence, advanced computing, and deep learning for complex and high‐dimensional data.</p> <p>RUFEI SHENG is a Data Scientist II at Amazon, 7 W 34th St., New York, NY 10001; rufeisheng@gmail.com. Her primary research interests include psychological and behavioral data analysis and human‐computer interaction analysis.</p> <p>BOWEN WANG is a PhD Candidate at The University of Florida, 1221 SW 5th Ave, Gainesville, FL 32601; bowen.wang@ufl.edu. His primary research interests include educational data mining, measurement theory, and response behavior analysis.</p> <p>HUAN (HAILEY) KUANG is an Assistant Professor at Florida State University, 3204D Stone Building; hkuang2@fsu.edu. Her primary research interests include advanced computational psychometrics and machine learning for educational data.</p> <p>ALLAN S. COHEN is a Professor Emeritus at The University of Georgia, Aderhold Hall, 110 Carlton Street, Athens, GA 30602; acohen@uga.edu. 
His primary research interests include psychometrics and response behavior analysis.</p> <p>XINHUI (MAGGIE) XIONG is a Senior Psychometrician at Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08541; xxiong@ets.org. Her primary research interests include psychometrics and automated scoring.</p> </aug> <nolink nlid="nl1" bibid="bib26" firstref="ref2"></nolink> <nolink nlid="nl2" bibid="bib12" firstref="ref4"></nolink> <nolink nlid="nl3" bibid="bib17" firstref="ref5"></nolink> <nolink nlid="nl4" bibid="bib29" firstref="ref6"></nolink> <nolink nlid="nl5" bibid="bib11" firstref="ref7"></nolink> <nolink nlid="nl6" bibid="bib32" firstref="ref8"></nolink> <nolink nlid="nl7" bibid="bib34" firstref="ref9"></nolink> <nolink nlid="nl8" bibid="bib21" firstref="ref11"></nolink> <nolink nlid="nl9" bibid="bib33" firstref="ref13"></nolink> <nolink nlid="nl10" bibid="bib27" firstref="ref14"></nolink> <nolink nlid="nl11" bibid="bib20" firstref="ref15"></nolink> <nolink nlid="nl12" bibid="bib23" firstref="ref18"></nolink> <nolink nlid="nl13" bibid="bib14" firstref="ref19"></nolink> <nolink nlid="nl14" bibid="bib19" firstref="ref21"></nolink> <nolink nlid="nl15" bibid="bib30" firstref="ref24"></nolink> <nolink nlid="nl16" bibid="bib31" firstref="ref25"></nolink> <nolink nlid="nl17" bibid="bib15" firstref="ref27"></nolink> <nolink nlid="nl18" bibid="bib28" firstref="ref28"></nolink> <nolink nlid="nl19" bibid="bib16" firstref="ref29"></nolink> <nolink nlid="nl20" bibid="bib25" firstref="ref33"></nolink> <nolink nlid="nl21" bibid="bib18" firstref="ref35"></nolink> <nolink nlid="nl22" bibid="bib24" firstref="ref39"></nolink> <nolink nlid="nl23" bibid="bib22" firstref="ref43"></nolink> <nolink nlid="nl24" bibid="bib10" firstref="ref44"></nolink> <nolink nlid="nl25" bibid="bib13" firstref="ref45"></nolink>