Statewide Implementation of Automated Writing Evaluation: Analyzing Usage and Associations with State Test Performance in Grades 4-11

Bibliographic Details
Title: Statewide Implementation of Automated Writing Evaluation: Analyzing Usage and Associations with State Test Performance in Grades 4-11
Language: English
Authors: Potter, Andrew (ORCID 0000-0002-1012-2680), Wilson, Joshua
Source: Educational Technology Research and Development. Jun 2021 69(3):1557-1578.
Availability: Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
Peer Reviewed: Y
Page Count: 22
Publication Date: 2021
Sponsoring Agency: Institute of Education Sciences (ED)
Contract Number: R305H170046
Document Type: Journal Articles
Reports - Research
Education Level: Elementary Education
Secondary Education
Descriptors: Computer Assisted Testing, Writing Evaluation, Feedback (Response), Scoring, Revision (Written Composition), Computer Software, Elementary School Students, Secondary School Students, Writing Tests, Peer Evaluation, State Programs, Program Implementation
DOI: 10.1007/s11423-021-10004-9
ISSN: 1042-1629
Abstract: Automated Writing Evaluation (AWE) provides automatic writing feedback and scoring to support student writing and revising. The purpose of the present study was to analyze a statewide implementation of AWE software (n = 114,582) in grades 4-11. The goals of the study were to evaluate: (1) to what extent AWE features were used; (2) whether equity and access issues influenced AWE usage; and (3) whether AWE usage was associated with writing performance on a large-scale state writing assessment. Descriptive statistics and hierarchical linear modeling were used to answer the research questions. Results indicated that the main features of AWE (i.e., writing and revising essays) were used but some features (peer review and independent lessons) were underutilized. School- and student-level demographic variables explained little variance in AWE usage. AWE usage was statistically and positively associated with performance on a large-scale state writing assessment when controlling for prior performance and demographics. The study presents evidence that AWE can positively influence writing on a distal measure when implemented at scale. Implications for large-scale AWE implementation are discussed.
Abstractor: As Provided
IES Funded: Yes
Entry Date: 2021
Accession Number: EJ1302863
Database: ERIC
Text:
  Availability: 1
  Value: <anid>AN0151271343;etr01jun.21;2021Jul08.05:35;v2.2.500</anid> <title id="AN0151271343-1">Statewide implementation of automated writing evaluation: analyzing usage and associations with state test performance in grades 4-11 </title> <p>Automated Writing Evaluation (AWE) provides automatic writing feedback and scoring to support student writing and revising. The purpose of the present study was to analyze a statewide implementation of AWE software (n = 114,582) in grades 4-11. The goals of the study were to evaluate (a) to what extent AWE features were used, (b) whether equity and access issues influenced AWE usage, and (c) whether AWE usage was associated with writing performance on a large-scale state writing assessment. Descriptive statistics and hierarchical linear modeling were used to answer the research questions. Results indicated that the main features of AWE (i.e., writing and revising essays) were used but some features (peer review and independent lessons) were underutilized. School- and student-level demographic variables explained little variance in AWE usage. AWE usage was statistically and positively associated with performance on a large-scale state writing assessment when controlling for prior performance and demographics. The study presents evidence that AWE can positively influence writing on a distal measure when implemented at scale. 
Implications for large-scale AWE implementation are discussed.</p> <p>Keywords: Automated writing evaluation; Automated feedback; Writing technology; Hierarchical linear-modeling; Interactive learning environment; Educational technology implementation</p> <p>Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11423-021-10004-9.</p> <hd id="AN0151271343-2">Introduction</hd> <p>Writing is a critical skill, yet students in the United States consistently score low in standardized writing assessments across multiple grades (NCES, [<reflink idref="bib43" id="ref1">43</reflink>]) and employers report writing skills in the workplace are lacking (National Commission on Writing for America's Families, Schools, and Colleges [<reflink idref="bib44" id="ref2">44</reflink>]). Technology to support writing instruction may help to address these systematic performance issues. Technology-based approaches to improving writing skills are of increasing interest given the need to adopt effective and scalable approaches to writing instruction. Meta-analyses have shown that technology-based writing instruction is effective for improving the writing skills of both typically-developing students (Little et al. [<reflink idref="bib37" id="ref3">37</reflink>]; Strobl et al. [<reflink idref="bib60" id="ref4">60</reflink>]; Williams & Beam, [<reflink idref="bib64" id="ref5">64</reflink>]) and students with or at-risk of disabilities (Morphy & Graham, [<reflink idref="bib42" id="ref6">42</reflink>]).</p> <p>One promising writing technology is automated writing evaluation (AWE). AWE software systems are intended to support the teaching and learning of writing via the provision of automated feedback often in conjunction with automated scoring and different learning management and user functions (Allen et al. 
[<reflink idref="bib1" id="ref7">1</reflink>]; Grimes & Warschauer, [<reflink idref="bib22" id="ref8">22</reflink>]; Stevenson, [<reflink idref="bib58" id="ref9">58</reflink>]; Strobl et al. [<reflink idref="bib60" id="ref10">60</reflink>]). However, the promise of AWE is contingent on its implementation and teachers often experience barriers to implementing technology-based writing instruction (Williams & Beam, [<reflink idref="bib64" id="ref11">64</reflink>]).</p> <p>This study contributes novel research in two ways. First, we examine a large-scale statewide implementation of AWE technology with students in grades 4-11 (<emph>n</emph> = 114,582). Though small scale studies provide evidence of AWE's effectiveness (see Stevenson & Phakiti, [<reflink idref="bib59" id="ref12">59</reflink>]), no prior research has examined a statewide implementation of AWE. Second, we use multilevel models to examine associations between (a) students' demographics and AWE usage to identify potential equity issues when AWE is used at scale, and (b) AWE usage and students' performance on a statewide writing assessment. 
We discuss the advantages and limitations for implementing AWE technology at scale.</p> <hd id="AN0151271343-3">Literature review</hd> <p></p> <hd id="AN0151271343-4">Automated writing evaluation</hd> <p>Small-scale studies have documented positive associations between AWE usage and various student writing outcomes, such as improvements in the amount and quality of revisions (Liu et al., [<reflink idref="bib39" id="ref13">39</reflink>]; Roscoe et al., [<reflink idref="bib53" id="ref14">53</reflink>]; Wilson and Andrada, [<reflink idref="bib66" id="ref15">66</reflink>]; Wilson et al., [<reflink idref="bib70" id="ref16">70</reflink>]; Zhu et al., [<reflink idref="bib73" id="ref17">73</reflink>]), increases in writing attitudes (Roscoe et al., [<reflink idref="bib53" id="ref18">53</reflink>]), writing motivation (Grimes & Warschauer, [<reflink idref="bib22" id="ref19">22</reflink>]; Williams & Beam, [<reflink idref="bib64" id="ref20">64</reflink>]; Wilson & Czik, [<reflink idref="bib68" id="ref21">68</reflink>];), writing self-efficacy (Wilson & Roscoe, [<reflink idref="bib71" id="ref22">71</reflink>]), and writing quality (Liu et al., [<reflink idref="bib39" id="ref23">39</reflink>]; Palermo & Thomson, [<reflink idref="bib47" id="ref24">47</reflink>]; Palermo & Wilson, [<reflink idref="bib48" id="ref25">48</reflink>]; Stevenson & Phakiti, [<reflink idref="bib59" id="ref26">59</reflink>]; Williams & Beam, [<reflink idref="bib64" id="ref27">64</reflink>]; Zhu et al. [<reflink idref="bib73" id="ref28">73</reflink>]). 
However, the associated effects of AWE usage on students' performance on large-scale state writing and English language arts assessments are inconsistent (c.f., Shermis et al., [<reflink idref="bib55" id="ref29">55</reflink>]; Wilson & Roscoe, [<reflink idref="bib71" id="ref30">71</reflink>]).</p> <p>Nevertheless, concerns regarding AWE have been raised about (a) positioning automated systems insensitive to the content and meaning of student writing, (b) utilizing automated feedback that may result in stilted, formulaic writing, and (c) measuring a limited writing construct in ways that are susceptible to gaming (Allen et al., [<reflink idref="bib1" id="ref31">1</reflink>]; Bejar et al., [<reflink idref="bib6" id="ref32">6</reflink>]; Conference on College Composition & Communication, [<reflink idref="bib16" id="ref33">16</reflink>]; Ericsson & Haswell, [<reflink idref="bib19" id="ref34">19</reflink>]; Higgins & Heilman, [<reflink idref="bib28" id="ref35">28</reflink>]; National Council of Teachers of English, [<reflink idref="bib45" id="ref36">45</reflink>]). Additional concerns have been raised over AWE providing too much feedback that may overwhelm students (Ranalli, [<reflink idref="bib50" id="ref37">50</reflink>]), and the appropriateness of using AWE with English learners (ELs; Bai & Hu, [<reflink idref="bib4" id="ref38">4</reflink>]; Dikli, [<reflink idref="bib18" id="ref39">18</reflink>]; Liu & Kunnan, [<reflink idref="bib38" id="ref40">38</reflink>]; Ranalli et al., [<reflink idref="bib51" id="ref41">51</reflink>]).</p> <hd id="AN0151271343-5">Large-scale technology implementation</hd> <p>Another concern is that results from small-scale, researcher-controlled studies often do not transfer outside that context (Bauer et al., [<reflink idref="bib5" id="ref42">5</reflink>]). 
Seminal mixed-method studies of naturalistic implementations of AWE (Grimes & Warschauer, [<reflink idref="bib22" id="ref43">22</reflink>]; Warschauer & Grimes, [<reflink idref="bib62" id="ref44">62</reflink>]) and a recent literature review on the topic (Stevenson, [<reflink idref="bib58" id="ref45">58</reflink>]) reported that even when teachers and students found AWE to be helpful and desirable, AWE was underutilized. Thus, research is needed on the associated effects of AWE usage on achievement in a naturalistic statewide implementation.</p> <p>Moreover, demographic factors may affect integration of technology in writing instruction and potentially create issues of access and equity. Differential educational technology usage can be associated with race, class, language status, and other demographic factors of students and schools (Lu & Overbaugh, [<reflink idref="bib40" id="ref46">40</reflink>]; U.S. Department of Education, Office of Educational Technology, [<reflink idref="bib61" id="ref47">61</reflink>]; Warschauer et al., [<reflink idref="bib63" id="ref48">63</reflink>]). If AWE, when implemented at scale, is inaccessible or underutilized by those most at-risk, its implementation may exacerbate existing gaps in writing achievement associated with race (NCES, [<reflink idref="bib43" id="ref49">43</reflink>]; Hoffman & Liagas, [<reflink idref="bib29" id="ref50">29</reflink>]) or socio-economic-status (Grissmer & Berends, [<reflink idref="bib23" id="ref51">23</reflink>]).</p> <hd id="AN0151271343-6">The present study</hd> <p>Given the need for additional research on naturalistic, statewide implementations of AWE, the present study examines AWE usage and its associated effects on state test writing performance as part of a two-year statewide implementation of the <emph>Utah Compose</emph> AWE system in Utah between 2014 and 2016. 
Given concerns over equitable access and usage of educational technologies and achievement gaps in writing outcomes, we also studied student- and school-level demographic factors in relation to the following research questions:</p> <p></p> <ulist> <item> Which aspects of Utah Compose were implemented and to what degree?</item> <p></p> <item> Is there statistically significant variability in students' Utah Compose usage based on student and school-level variables? What student- and school-level factors explain this variability in usage?</item> <p></p> <item> Is Utah Compose usage statistically significantly associated with gains in state test writing performance?</item> </ulist> <p>By examining these research questions, the present study can inform stakeholders of the potential benefits and limitations of implementing AWE at scale. In addition, given the limited and mixed findings regarding the associated effects of AWE usage on state test performance (c.f., Shermis et al. [<reflink idref="bib55" id="ref52">55</reflink>]; Wilson & Roscoe, [<reflink idref="bib71" id="ref53">71</reflink>]), the present study extends research on a key outcome of interest for states and districts considering educational technology adoption.</p> <hd id="AN0151271343-7">Methods</hd> <p></p> <hd id="AN0151271343-8">Data source and sample</hd> <p>A secondary dataset was created by (a) identifying all students who used the Utah Compose AWE system for both Year 1 and Year 2 of the statewide Utah Compose implementation, and then (b) merging usage data in Year 2 of implementation with students' demographic data and their scores on the Spring 2015 and Spring 2016 writing portion of Utah's SAGE (Student Assessment of Growth and Excellence) state English language arts (ELA) exam, which was administered annually to grades 3-11. 
AWE usage data was provided by Measurement Incorporated, the developer of Utah Compose, and students' demographic and achievement data was provided by the Utah State Board of Education (USBOE). The database consisted of 136,229 students in grades 3-12 who used Utah Compose in both years of implementation and whose SAGE writing data from each year was available.</p> <p>We elected to examine data of students who used Utah Compose for both the first and second years of implementation for two reasons. First, we wanted to examine usage after the novelty effect (see Jeno et al. [<reflink idref="bib32" id="ref54">32</reflink>]; Keller & Suzuki, [<reflink idref="bib33" id="ref55">33</reflink>]) of introducing Utah Compose wore off. We considered two years of Utah Compose usage would minimize potential usage novelty effects. Second, educational technology implementations may not show benefits for students' learning in the short-term (Campuzano et al [<reflink idref="bib11" id="ref56">11</reflink>]; Hull & Dutch, [<reflink idref="bib31" id="ref57">31</reflink>]); this reinforced our decision to examine associations between usage and SAGE performance in Year 2 of the statewide AWE implementation.</p> <hd id="AN0151271343-9">Database preparation</hd> <p>Prior to data analysis we trimmed the dataset in two ways. First, we removed students in grades 3 (<emph>n</emph> = 37) and 12 (<emph>n</emph> = 108). SAGE is only assessed in grades 3-11, so third graders with Year 2 SAGE data were retained (i.e., held back) and twelfth graders were those who needed to re-take the test in Year 2. Second, we identified and removed univariate outliers based on Utah Compose usage statistics. Outliers were defined as students whose usage was above the 95th percentile for three key variables: number of essays completed, number of drafts, and number of lesson minutes completed. 
We removed these outliers because we wanted to model typical usage and thus distrusted extreme values for each variable.</p> <p>For <emph>essays completed</emph>, we intended to measure the number of unique writing tasks that a student completed across the school year. However, if students accidentally restart an essay that was in progress, Utah Compose counts the restart as a new essay even though it is not. The observed range of essays was 1–836 (median = 6). Thus, students with extreme scores for essays (> 19, equivalent to the 95th percentile) were treated as outliers.</p> <p>For <emph>drafts</emph>, we were interested in examining the total number of revision attempts (i.e., drafts) that a student completed throughout the school year. However, because Utah Compose records a unique draft attempt every time a student resubmits their writing for scoring and feedback, even if they only change a single character, some students had extreme values for this variable. The observed range of drafts was 1–900 (median = 15). Although the grain size of our analysis prohibited making distinctions between substantive revisions and surface-level edits, we felt confident that students whose number of drafts exceeded the 95th percentile (> 74 drafts per school year) could be treated as outliers who exhibited atypical system usage. Prior research examining response time effects has used this process to identify examinees who answer items too quickly, and thus whose responses may be unrepresentative of the broader sample/population (see Wise & Kong, [<reflink idref="bib72" id="ref58">72</reflink>]).</p> <p>For <emph>lesson minutes</emph>, we used log data to identify the total number of minutes across the school year that students utilized Utah Compose's skill-building lessons. At the time of implementation, the Utah Compose system did not have an automatic logout feature. 
Thus, if students did not log out while engaged in a lesson, their lesson time continued to accrue. The observed range of this variable was 0–7,039 min (median = 0 min). We considered students whose total lesson minutes exceeded the 95th percentile of the observed range (> 45 min) as outliers whose lesson usage data was atypical.</p> <p>Trimming the dataset of students in grades 3 and 12 and of outliers for three usage variables resulted in removing 15.89% of students from the original dataset, leaving a final database consisting of records from 114,582 students. This sample included students from 110 school districts (38 public districts and 72 charter districts) and 691 schools.</p> <p>Demographic data are presented in Table 1. Based on data obtained from the National Center for Education Statistics, sample demographics mirrored those of the state during the 2015–16 school year in grades 4-11 with respect to gender, race, and free or reduced-price lunch qualification. It was not possible to locate exact statistics of the percent of students in grades 4-11 in 2015–16 who were classified as ELs and students with disabilities; however, according to the USBOE, ELs and students with disabilities accounted for 7% and 13%, respectively, of students in grades PK-12 by 2017–18. Thus, our sample was broadly representative of the state population, but it included a lower percentage of ELs and students with disabilities, likely because the state sample included grades PK-12 while our sample included grades 4-11.</p> <p>Table 1 Demographic statistics for Utah Compose users</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left"><p>Variable</p></th><th align="left"><p>Number</p></th><th align="left"><p>Percentage</p></th></tr></thead><tbody><tr><td align="left" colspan="3"><p>Schools (<italic>n</italic> = 691)</p></td></tr><tr><td align="left"><p> Elementary</p></td><td char="," align="char"><p>38,915</p></td><td char="." 
align="char"><p>33.96</p></td></tr><tr><td align="left"><p> Middle</p></td><td char="," align="char"><p>57,706</p></td><td char="." align="char"><p>50.36</p></td></tr><tr><td align="left"><p> High</p></td><td char="," align="char"><p>17,961</p></td><td char="." align="char"><p>15.68</p></td></tr><tr><td align="left" colspan="3"><p>Students (<italic>n</italic> = 114,582)</p></td></tr><tr><td align="left"><p> Charter</p></td><td char="," align="char"><p>28,560</p></td><td char="." align="char"><p>24.93</p></td></tr><tr><td align="left"><p> Public</p></td><td char="," align="char"><p>86,022</p></td><td char="." align="char"><p>75.07</p></td></tr><tr><td align="left" colspan="3"><p>Grade</p></td></tr><tr><td align="left"><p> 4</p></td><td char="," align="char"><p>12,581</p></td><td char="." align="char"><p>10.98</p></td></tr><tr><td align="left"><p> 5</p></td><td char="," align="char"><p>17,212</p></td><td char="." align="char"><p>15.02</p></td></tr><tr><td align="left"><p> 6</p></td><td char="," align="char"><p>21,093</p></td><td char="." align="char"><p>18.41</p></td></tr><tr><td align="left"><p> 7</p></td><td char="," align="char"><p>18,765</p></td><td char="." align="char"><p>16.38</p></td></tr><tr><td align="left"><p> 8</p></td><td char="," align="char"><p>17,986</p></td><td char="." align="char"><p>15.70</p></td></tr><tr><td align="left"><p> 9</p></td><td char="," align="char"><p>13,771</p></td><td char="." align="char"><p>12.02</p></td></tr><tr><td align="left"><p> 10</p></td><td char="," align="char"><p>7,784</p></td><td char="." align="char"><p>6.79</p></td></tr><tr><td align="left"><p> 11</p></td><td char="," align="char"><p>5,390</p></td><td char="." align="char"><p>4.70</p></td></tr><tr><td align="left" colspan="3"><p>Gender</p></td></tr><tr><td align="left"><p> Female</p></td><td char="," align="char"><p>56,459</p></td><td char="." 
align="char"><p>49.27</p></td></tr><tr><td align="left"><p> Male</p></td><td char="," align="char"><p>58,123</p></td><td char="." align="char"><p>50.73</p></td></tr><tr><td align="left" colspan="3"><p>Race/Ethnicity</p></td></tr><tr><td align="left"><p> White</p></td><td char="," align="char"><p>90,224</p></td><td char="." align="char"><p>78.74</p></td></tr><tr><td align="left"><p> Hispanic</p></td><td char="," align="char"><p>16,124</p></td><td char="." align="char"><p>14.07</p></td></tr><tr><td align="left"><p> Multiple races</p></td><td char="," align="char"><p>2,647</p></td><td char="." align="char"><p>2.31</p></td></tr><tr><td align="left"><p> Asian</p></td><td char="," align="char"><p>1,627</p></td><td char="." align="char"><p>1.42</p></td></tr><tr><td align="left"><p> Pacific islander</p></td><td char="," align="char"><p>1,504</p></td><td char="." align="char"><p>1.31</p></td></tr><tr><td align="left"><p> American Indian/Alaska native</p></td><td char="," align="char"><p>1,279</p></td><td char="." align="char"><p>1.12</p></td></tr><tr><td align="left"><p> African American</p></td><td char="," align="char"><p>1,177</p></td><td char="." align="char"><p>1.03</p></td></tr><tr><td align="left"><p>Special education</p></td><td char="," align="char"><p>12,094</p></td><td char="." align="char"><p>10.55</p></td></tr><tr><td align="left"><p>English learner</p></td><td char="," align="char"><p>3,249</p></td><td char="." align="char"><p>2.84</p></td></tr><tr><td align="left"><p>Free/Reduced lunch</p></td><td char="," align="char"><p>39,112</p></td><td char="." align="char"><p>34.13</p></td></tr></tbody></table> </ephtml> </p> <hd id="AN0151271343-10">Utah Compose</hd> <p> <emph>Utah Compose</emph> is the state-branded version of an AWE system called <emph>MI Write</emph> (<ulink href="http://www.miwrite.net">www.miwrite.net</ulink>) developed and managed by Measurement Incorporated. 
The USBOE purchased licenses to Utah Compose for any district or school who voluntarily elected to use it for formative assessment and writing instruction; Utah Compose was not mandated. The state worked with Measurement Incorporated to provide professional development to teachers throughout the state.</p> <hd id="AN0151271343-11">Features</hd> <p>Utah Compose relies on the <emph>Project Essay Grade</emph> (PEG; Page, [<reflink idref="bib46" id="ref59">46</reflink>]) automated scoring engine to provide automated essay ratings and qualitative feedback statements for six traits of writing (see Coe et al., [<reflink idref="bib15" id="ref60">15</reflink>]): development of ideas, organization, style, sentence fluency, word choice, and conventions. To assess these traits, PEG utilizes natural language processing (NLP) techniques to measure 500 + text features including those that are defined a priori based on known relationships with writing quality and a posteriori via machine learning techniques. Once defined and measured, NLP text features are combined via support vector regression (see Smola & Scholkopf, [<reflink idref="bib57" id="ref61">57</reflink>]) to estimate the trait scores (see Bunch et al., [<reflink idref="bib10" id="ref62">10</reflink>]). Prior research demonstrates PEG's automated scores are highly reliable (Shermis, [<reflink idref="bib54" id="ref63">54</reflink>]; Wilson et al., [<reflink idref="bib67" id="ref64">67</reflink>]) and predictive of writing performance on Common Core-aligned state English language arts tests (Wilson, [<reflink idref="bib65" id="ref65">65</reflink>]; Wilson et al., [<reflink idref="bib69" id="ref66">69</reflink>]).</p> <p>Based on PEG's trait and NLP text features scores, Utah Compose generates qualitative feedback and evaluation suggestions to students for revising. The feedback is mainly task-level (i.e., how well the task was completed; see Hattie & Timperley, [<reflink idref="bib24" id="ref67">24</reflink>]). 
Evaluation suggestions are mainly given as questions and represent either task- or process-level feedback (i.e., the processes students should perform to understand and complete the task; see Hattie & Timperley, [<reflink idref="bib24" id="ref68">24</reflink>]). For example, an essay that scores a 4 out of 5 on Development of Ideas might receive the following feedback: "You have included an introduction and a conclusion that are clearly connected to the rest of the essay". Likewise, an essay on the same topic that scores a 1 out of 5 on that same trait might receive feedback that states: "Make sure you've written enough to clearly explain your ideas" and an evaluation suggestion that asks: "Have you stated a clear opinion?" Additionally, spelling and grammar feedback is provided. Students receive new scores, trait-specific feedback, and spelling and grammar feedback each time they submit an essay.</p> <p>The automated scoring and feedback may be applied to pre-packaged or teacher-created writing prompts in multiple genres (narrative, argumentative, informative, constructed response). Teachers may assign, or students may choose, various electronic graphic organizers. Teachers may view essay drafts and corresponding score reports before providing feedback via embedded or summary comments. Individual and classroom progress monitoring functions are also available.</p> <p>Additional features include multimedia, interactive skill-building lessons and a peer review function. Interactive lessons are designed as brief tutorials in which students read and listen to information about a topic and perform various actions (e.g., clicking, dragging, typing) to practice the lesson content. Lessons are available at three levels (beginner, intermediate, advanced) and cover topics aligned to the six traits and genres (e.g., "Elaboration in Essays"). 
Lessons can be assigned directly to students by teachers or students can access them on their own; links to recommended lessons are also embedded in feedback comments based on AWE scores. At the time of this study, Utah Compose's peer review feature allowed teachers to create peer review groups of 2–5 students within the system; peer review groups could be anonymous or identifiable. To complete a peer review, students were directed to write three comments to their classmates in the form of two positive feedback comments and one suggestion to improve the essay. Additional information and screenshots of these features are available in the supplementary materials.</p> <hd id="AN0151271343-12">Measures</hd> <p></p> <hd id="AN0151271343-13">Utah Compose usage measures</hd> <p>We examined four key Utah Compose usage variables: the number of essays completed across the year, the average number of drafts per essay, the total number of lesson minutes, and the number of peer reviews given and received.</p> <hd id="AN0151271343-14">Essays completed</hd> <p>Models of writing development emphasize the importance of sustained, repeated practice for improving writing skills (Kellogg & Whiteford, [<reflink idref="bib34" id="ref69">34</reflink>]). Practice enables students to develop writing skills and metacognitive control over cognitive writing processes (Flower & Hayes, [<reflink idref="bib20" id="ref70">20</reflink>]; Hayes, [<reflink idref="bib25" id="ref71">25</reflink>], [<reflink idref="bib26" id="ref72">26</reflink>]). As a measure of practice, we used the number of unique essays (i.e., assignments) students completed within Utah Compose across the school year. This variable was automatically calculated for each student by Utah Compose.</p> <hd id="AN0151271343-15">Drafts per essay</hd> <p>Developmental writing models also emphasize the importance of students repeatedly revising their writing with the aid of feedback. 
Feedback assists the writer in identifying gaps between current and expected performance and developing a plan for closing those gaps (Black & Wiliam, [<reflink idref="bib7" id="ref73">7</reflink>]; Hattie & Timperley, [<reflink idref="bib24" id="ref74">24</reflink>]). In Utah Compose, students receive automated feedback every time they submit an essay draft. We theorized that an increased number of drafts per essay suggests that students are relying on Utah Compose's feedback to support their revising. The average drafts per essay was calculated by the researchers as follows using two variables automatically calculated by Utah Compose: total number of drafts per year/total number of essays per year.</p> <hd id="AN0151271343-16">Lesson minutes</hd> <p>To measure students' degree of lesson usage within Utah Compose, we obtained the total number of lesson minutes completed across the school year via Utah Compose logs.</p> <hd id="AN0151271343-17">Peer feedback</hd> <p>Peer feedback is a widely used formative assessment method that supports writing improvement (Huisman et al., [<reflink idref="bib30" id="ref75">30</reflink>]). To measure the usage of the peer review function, we used two variables: the <emph>number of peer reviews given</emph> and the <emph>number of peer reviews received</emph> across the school year. Both variables were automatically calculated by Utah Compose.</p> <hd id="AN0151271343-18">Writing achievement</hd> <p>Writing achievement was measured by a vertical scale score for the writing portion of Utah's SAGE ELA state test. The test required students to write two source-based responses to prompts in the opinion/argumentative and informational/explanatory genres (American Institutes for Research [AIR] [<reflink idref="bib3" id="ref76">3</reflink>]). 
Compositions were scored by both automated and human scorers using a rubric that evaluated writing quality on three traits: (a) statement of purpose, focus and organization, (b) evidence and elaboration, and (c) conventions and editing. The SAGE ELA tests for 2014–15 and 2015–16 were subject to rigorous evaluation for reliability and validity and for fairness/bias (AIR 2015, 2016).</p> <hd id="AN0151271343-19">Student-level predictors</hd> <p>To examine potential student demographic factors in AWE usage and writing achievement variance, predictor variables included dummy codes for gender, race/ethnicity, special education status, EL status, and free or reduced-price lunch (FRL) status. Grade level was included as a student-level predictor to control for differences in age and curriculum. Prior writing achievement was controlled for by using students' Year 1 SAGE scores.</p> <hd id="AN0151271343-20">School-level predictors</hd> <p>To examine potential school-level factors in usage and achievement variance, the following school-level predictors were included: (a) school size (the total number of students enrolled in a school), (b) percentage of minority students, (c) percentage of students classified as EL, (d) percentage of students with disabilities, and (e) percentage of students receiving FRL. All percentage variables were calculated by dividing the number of students in the relevant demographic group by the total number of students in the school. Dummy codes were added for middle and high school with elementary school serving as the reference category. A dummy code for charter schools was added with public schools serving as the reference category. 
Table 2 includes descriptive statistics for school-level predictors.</p> <p>Table 2 Descriptive statistics of school-level demographic variables for Utah Compose users in year 2</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left"><p>School-level variables</p></th><th align="left"><p>Mean</p></th><th align="left"><p>SD</p></th></tr></thead><tbody><tr><td align="left"><p>School size</p></td><td char="." align="char"><p>359.47</p></td><td char="." align="char"><p>270.09</p></td></tr><tr><td align="left"><p>% School minority status</p></td><td char="." align="char"><p>21.26%</p></td><td char="." align="char"><p>0.16</p></td></tr><tr><td align="left"><p>% School EL</p></td><td char="." align="char"><p>2.84%</p></td><td char="." align="char"><p>0.05</p></td></tr><tr><td align="left"><p>% School FRL</p></td><td char="." align="char"><p>34.13%</p></td><td char="." align="char"><p>0.20</p></td></tr></tbody></table> </ephtml> </p> <p>691 schools, <emph>% School FRL</emph> the percentage of students with FRL per school, <emph>% School EL</emph> the percentage of students classified as EL per school, <emph>% School Minority Status</emph> the percentage of students classified as minorities per school, <emph>School Size</emph> the number of students enrolled in a school</p> <hd id="AN0151271343-21">Data analysis</hd> <p>We employed descriptive statistics to answer RQ1 and hierarchical linear modeling (HLM) to answer RQ2 and RQ3 to account for the clustered nature of our data (Raudenbush & Bryk, [<reflink idref="bib52" id="ref77">52</reflink>]): students were clustered within schools that were clustered within districts. A series of random-intercept fixed-slopes HLM models were used for both RQs. In each model, student-level predictors were group mean-centered. School-level predictors were grand mean-centered. 
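</p> <p>The two centering schemes can be illustrated with a minimal sketch in Python. The toy records, school labels, and function names below are hypothetical; they are not drawn from the study data or from the authors' R code.</p>

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (school label, student-level FRL dummy).
students = [("A", 1), ("A", 0), ("B", 1), ("B", 1)]
# Hypothetical school-level predictor: enrollment per school.
school_size = {"A": 300.0, "B": 500.0}


def group_mean_center(rows):
    """Subtract each school's own mean from its students' values."""
    by_school = defaultdict(list)
    for school, value in rows:
        by_school[school].append(value)
    school_means = {s: mean(v) for s, v in by_school.items()}
    return [value - school_means[school] for school, value in rows]


def grand_mean_center(values_by_school):
    """Subtract the mean across all schools from each school's value."""
    grand = mean(values_by_school.values())
    return {s: v - grand for s, v in values_by_school.items()}


frl_centered = group_mean_center(students)
size_centered = grand_mean_center(school_size)
```

<p>In this toy example, the group mean-centered FRL values are deviations from each school's own mean, so they carry only within-school variation, whereas the grand mean-centered school sizes are deviations from the mean across all schools.</p> <p>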
Models were estimated by maximum likelihood using the R package nlme (version 3.1-140; Pinheiro et al., [<reflink idref="bib49" id="ref78">49</reflink>]).</p> <p>The relative strength of the predictors was described using <emph>R</emph><sups>2</sups>, which was calculated as the percentage of variance explained by conditional models relative to the unconditional model (Raudenbush & Bryk, [<reflink idref="bib52" id="ref79">52</reflink>]).</p> <hd id="AN0151271343-22">Intraclass correlation</hd> <p>Based on the intraclass correlation coefficient (ICC), district-level variance was less than 2.5% of the total variance. Thus, we estimated a two-level HLM model with fixed effects dummy variables (Allison, [<reflink idref="bib2" id="ref80">2</reflink>]; see Lee, [<reflink idref="bib36" id="ref81">36</reflink>]) for all districts with two or more schools (<emph>n</emph> = 45) to control for district-level variance. Of the 110 districts in the dataset, 65 consisted of one school (62 of these were charter districts). The ICC for two-level models with fixed effects for districts indicated that 63.9% of variance in essays completed was at the student level and 36.1% at the school level. For average drafts per essay, 84% was at the student level and 16% was at the school level. For SAGE scores, 78.1% was at the student level and 21.8% at the school level.</p> <hd id="AN0151271343-23">Mixed-model predicting Utah Compose usage</hd> <p>To answer RQ2, a series of three mixed-model regressions was conducted in which the Utah Compose usage variables (i.e., the number of essays completed and the number of drafts per essay) were the outcome variables and student- and school-level variables were predictors. 
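</p> <p>The ICC and percentage-of-variance-explained figures can be reproduced approximately from the variance components reported in Tables 4 and 6. The sketch below is a minimal Python illustration; the function names are hypothetical, and the formulas are the standard two-level definitions, assumed here to match the authors' calculations.</p>

```python
def icc(school_var, student_var):
    """Share of total variance that lies between schools."""
    return school_var / (school_var + student_var)


def variance_explained(null_var, conditional_var):
    """Proportional reduction in a variance component relative to the null model."""
    return (null_var - conditional_var) / null_var


# Variance components as reported in Table 6 (SAGE scores).
sage_null_school, sage_null_student = 1696.65, 6074.97  # Model 1
sage_m2_school, sage_m2_student = 595.48, 3585.85       # Model 2

# Variance components as reported in Table 4 (essays completed), Model 1.
essays_null_school, essays_null_student = 6.50, 11.52

sage_icc = icc(sage_null_school, sage_null_student)
essays_icc = icc(essays_null_school, essays_null_student)
r2_student_m2 = variance_explained(sage_null_student, sage_m2_student)
r2_school_m2 = variance_explained(sage_null_school, sage_m2_school)

print(f"SAGE ICC: {sage_icc:.3f}; essays ICC: {essays_icc:.3f}")
print(f"Model 2 variance explained: student {r2_student_m2:.2%}, school {r2_school_m2:.2%}")
```

<p>With the Table 6 components, the school-level share of variance in SAGE scores is about 21.8%, and Model 2 reduces the student-level and school-level variance by about 40.97% and 64.90%, in line with the reported values.</p> <p>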
The student-level HLM equation modeling the association between student-level predictors and Utah Compose usage was:</p> <p> <ephtml> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mi>Y</mi><mrow><mi mathvariant="italic">ij</mi></mrow></msub><mfenced close=")" open="("><mrow><mi>U</mi><mi>T</mi><mspace width="0.277778em" /><mi>C</mi><mi>o</mi><mi>m</mi><mi>p</mi><mi>o</mi><mi>s</mi><mi>e</mi><mspace width="0.277778em" /><mi>U</mi><mi>s</mi><mi>a</mi><mi>g</mi><mi>e</mi></mrow></mfenced><mspace width="0.277778em" /><mo>=</mo><mspace width="0.277778em" /><msub><mi>β</mi><mrow><mn>0</mn><mi>j</mi></mrow></msub><mspace width="0.277778em" /><mo>+</mo><mspace width="0.277778em" /><msub><mi>β</mi><mrow><mn>1</mn><mo>...</mo><mn>10</mn></mrow></msub><mfenced close=")" open="("><mrow><mi>S</mi><mi>t</mi><mi>u</mi><mi>d</mi><mi>e</mi><mi>n</mi><mi>t</mi><mspace width="0.277778em" /><mi>V</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>a</mi><mi>b</mi><mi>l</mi><mi>e</mi><mo>-</mo><mover><mrow><mi>S</mi><mi>t</mi><mi>u</mi><mi>d</mi><mi>e</mi><mi>n</mi><mi>t</mi><mspace width="0.277778em" /><mi>V</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>a</mi><mi>b</mi><mi>l</mi><msub><mi>e</mi><mrow><mi mathvariant="italic">ij</mi></mrow></msub></mrow><mo>¯</mo></mover></mrow></mfenced><mspace width="0.277778em" /><mo>+</mo><mspace width="0.277778em" /><msub><mi>β</mi><mrow><mn>10</mn><mi>d</mi></mrow></msub><mo>∑</mo><mfenced close=")" open="("><mrow><mi>G</mi><mi>R</mi><mi>A</mi><mi>D</mi><mi>E</mi><msub><mi>S</mi><mn>1</mn></msub><mo>...</mo><mi>G</mi><mi>R</mi><mi>A</mi><mi>D</mi><mi>E</mi><msub><mi>S</mi><mn>8</mn></msub></mrow></mfenced><mspace width="0.277778em" /><mo>+</mo><mspace width="0.277778em" /><msub><mi>ε</mi><mrow><mi mathvariant="italic">ij</mi></mrow></msub></mrow></math> </ephtml> </p> <p>The school-level portion of the HLM equation was:</p> <p> <ephtml> <math display="block" 
xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mi>β</mi><mrow><mn>0</mn><mi>j</mi></mrow></msub><mo>=</mo><msub><mi>γ</mi><mn>00</mn></msub><mspace width="0.277778em" /><mo>+</mo><mspace width="0.277778em" /><msub><mi>γ</mi><mrow><mn>01</mn><mo>⋯</mo><mn>08</mn></mrow></msub><mfenced close=")" open="("><mrow><mi>S</mi><mi>c</mi><mi>h</mi><mi>o</mi><mi>o</mi><mi>l</mi><mspace width="0.277778em" /><mi>V</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>a</mi><mi>b</mi><mi>l</mi><msub><mi>e</mi><mi>j</mi></msub><mo>-</mo><mover><mrow><mi>S</mi><mi>c</mi><mi>h</mi><mi>o</mi><mi>o</mi><mi>l</mi><mspace width="0.277778em" /><mi>V</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>a</mi><mi>b</mi><mi>l</mi><mi>e</mi></mrow><mo>¯</mo></mover></mrow></mfenced><mspace width="0.277778em" /><mo>+</mo><mspace width="0.277778em" /><msub><mi>γ</mi><mrow><mn>09</mn><mi>d</mi></mrow></msub><mo>∑</mo><mrow><mfenced close=")" open="("><mrow><mi>D</mi><mi>i</mi><mi>s</mi><mi>t</mi><mi>r</mi><mi>i</mi><mi>c</mi><msub><mi>t</mi><mn>1</mn></msub><mo>...</mo><mi>D</mi><mi>i</mi><mi>s</mi><mi>t</mi><mi>r</mi><mi>i</mi><mi>c</mi><msub><mi>t</mi><mn>44</mn></msub></mrow></mfenced><mspace width="0.277778em" /><mo>+</mo></mrow><mspace width="0.277778em" /><msub><mi>γ</mi><mrow><mn>0</mn><mi>j</mi></mrow></msub></mrow></math> </ephtml> </p> <hd id="AN0151271343-24">Mixed-model predicting writing outcomes</hd> <p>An HLM model was also used to answer RQ3. Year 2 SAGE scores served as the outcome variable; the same student- and school-level variables used in the prior equations were entered as predictors. Year 1 SAGE scores were also entered to control for prior performance. Furthermore, we entered linear and quadratic terms for student-level Utah Compose usage variables as predictors. A linear effect would mean that the effect of the predictor on the outcome remains constant throughout the range of that predictor. 
A quadratic effect would mean that the effect of the predictor on the outcome gradually decreases and plateaus as the predictor increases. School-level usage variables were entered as linear effects only.</p> <hd id="AN0151271343-25">Results</hd> <hd id="AN0151271343-26">RQ1: describing Utah Compose usage</hd> <p>Descriptive statistics for Utah Compose usage and SAGE writing scaled scores are presented in Table 3. These results indicate that some aspects of the system were utilized well, such as the number of essays completed across the year (<emph>M</emph> = 6.09, <emph>SD</emph> = 4.20) and the average number of drafts per essay (<emph>M</emph> = 2.84, <emph>SD</emph> = 1.74). Other aspects of the system were underutilized; even at the 75th percentile, the values for peer reviews and lesson minutes were zero. The descriptive statistics indicate a pattern of moderate use of the main functionality of the AWE system (i.e., writing and revising essays) but not of the supportive pedagogical functions (i.e., peer review and lessons).</p> <p>Table 3 Descriptive statistics for Utah Compose usage in year 2</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left" /><th align="left"><p>Mean</p></th><th align="left"><p>Median</p></th><th align="left"><p>SD</p></th><th align="left"><p>Min</p></th><th align="left"><p>Max</p></th><th align="left"><p>25th Percentile</p></th><th align="left"><p>75th Percentile</p></th></tr></thead><tbody><tr><td align="left" colspan="8"><p>Level-1 variables (students; <italic>n</italic> = 114,582)</p></td></tr><tr><td align="left" colspan="8"><p>Utah Compose usage</p></td></tr><tr><td align="left"><p> Essays completed</p></td><td char="." align="char"><p>6.09</p></td><td align="left"><p>5.00</p></td><td char="." 
align="char"><p>4.20</p></td><td align="left"><p>1.00</p></td><td align="left"><p>19.00</p></td><td align="left"><p>3.00</p></td><td align="left"><p>9.00</p></td></tr><tr><td align="left"><p> Essays completed (quadratic)</p></td><td char="." align="char"><p>54.70</p></td><td align="left"><p>25.00</p></td><td char="." align="char"><p>69.85</p></td><td align="left"><p>1.00</p></td><td align="left"><p>361.00</p></td><td align="left"><p>9.00</p></td><td align="left"><p>81.00</p></td></tr><tr><td align="left"><p> Drafts/Essay</p></td><td char="." align="char"><p>2.84</p></td><td align="left"><p>2.38</p></td><td char="." align="char"><p>1.74</p></td><td align="left"><p>1.00</p></td><td align="left"><p>8.50</p></td><td align="left"><p>1.43</p></td><td align="left"><p>3.80</p></td></tr><tr><td align="left"><p> Drafts/essay (quadratic)</p></td><td char="." align="char"><p>11.06</p></td><td align="left"><p>5.66</p></td><td char="." align="char"><p>13.51</p></td><td align="left"><p>1.00</p></td><td align="left"><p>72.25</p></td><td align="left"><p>2.04</p></td><td align="left"><p>14.44</p></td></tr><tr><td align="left"><p> Reviews given</p></td><td char="." align="char"><p>0.11</p></td><td align="left"><p>0.00</p></td><td char="." align="char"><p>0.80</p></td><td align="left"><p>0.00</p></td><td align="left"><p>25.00</p></td><td align="left"><p>0.00</p></td><td align="left"><p>0.00</p></td></tr><tr><td align="left"><p> Reviews received</p></td><td char="." align="char"><p>0.11</p></td><td align="left"><p>0.00</p></td><td char="." align="char"><p>0.78</p></td><td align="left"><p>0.00</p></td><td align="left"><p>26.00</p></td><td align="left"><p>0.00</p></td><td align="left"><p>0.00</p></td></tr><tr><td align="left"><p> Lesson minutes</p></td><td char="." align="char"><p>3.25</p></td><td align="left"><p>0.00</p></td><td char="." 
align="char"><p>8.21</p></td><td align="left"><p>0.00</p></td><td align="left"><p>45.00</p></td><td align="left"><p>0.00</p></td><td align="left"><p>0.00</p></td></tr><tr><td align="left" colspan="8"><p>Writing performance</p></td></tr><tr><td align="left"><p> SAGE score Y1</p></td><td char="." align="char"><p>409.55</p></td><td align="left"><p>412</p></td><td char="." align="char"><p>101.24</p></td><td align="left"><p>100</p></td><td align="left"><p>845</p></td><td align="left"><p>350</p></td><td align="left"><p>473</p></td></tr><tr><td align="left"><p> SAGE score Y2</p></td><td char="." align="char"><p>442.27</p></td><td align="left"><p>441</p></td><td char="." align="char"><p>89.12</p></td><td align="left"><p>135</p></td><td align="left"><p>770</p></td><td align="left"><p>390</p></td><td align="left"><p>490</p></td></tr><tr><td align="left" colspan="8"><p>Level-2 variables (schools; <italic>n</italic> = 691)</p></td></tr><tr><td align="left"><p> Essays completed</p></td><td char="." align="char"><p>6.09</p></td><td align="left"><p>5.72</p></td><td char="." align="char"><p>2.50</p></td><td align="left"><p>1.04</p></td><td align="left"><p>15.60</p></td><td align="left"><p>3.98</p></td><td align="left"><p>7.93</p></td></tr><tr><td align="left"><p> Drafts/Essay</p></td><td char="." align="char"><p>2.81</p></td><td align="left"><p>2.78</p></td><td char="." 
align="char"><p>0.67</p></td><td align="left"><p>1.04</p></td><td align="left"><p>5.12</p></td><td align="left"><p>2.35</p></td><td align="left"><p>3.27</p></td></tr></tbody></table> </ephtml> </p> <p>SAGE Score Y1 and Y2 refer to the writing scale scores for the Utah Student Assessment of Growth and Excellence (SAGE) English language arts state assessment in consecutive years. <emph>Y1</emph> Year 1 (Spring 2015), <emph>Y2</emph> Year 2 (Spring 2016)</p> <hd id="AN0151271343-27">RQ2: explaining variability in Utah Compose usage</hd> <p>We assessed whether there was statistically significant variability in student Utah Compose usage based on student and school-level variables. Because the peer review and lesson functions of Utah Compose were effectively unused, we examined only two usage variables: the number of essays completed and the average number of drafts per essay.</p> <hd id="AN0151271343-28">Number of essays completed</hd> <p>Three mixed models were estimated. Model 1 was an unconditional model with the number of essays completed as an outcome and fixed-effects dummy variables for districts. In Model 2, student-level demographic variables were added. Model 3 added school-level factors. Results from these HLM models are presented in Table 4.</p> <p>Table 4 Results of two-level hierarchical models predicting essays completed</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left" /><th align="left"><p>Model 1</p></th><th align="left"><p>Model 2</p></th><th align="left"><p>Model 3</p></th></tr><tr><th align="left"><p>Fixed effects</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. 
E.)</p></th></tr></thead><tbody><tr><td align="left"><p>Intercept</p><p><italic>Degrees of freedom (df)</italic></p></td><td align="left"><p>6.50*** (0.10)</p><p><italic>df </italic>= 113,891</p></td><td align="left"><p>6.50*** (0.10)</p><p><italic>df</italic> = 113,880</p></td><td align="left"><p>6.09*** (0.13)</p><p><italic>df</italic> = 113,880</p></td></tr><tr><td align="left" colspan="4"><p>Student demographics</p></td></tr><tr><td align="left"><p> Male</p></td><td align="left" /><td align="left"><p>− 0.06** (0.02)</p></td><td align="left"><p>− 0.06** (0.02)</p></td></tr><tr><td align="left"><p> Hispanic</p></td><td align="left" /><td align="left"><p>− 0.01 (0.03)</p></td><td align="left"><p>− 0.01 (0.03)</p></td></tr><tr><td align="left"><p> Multiple races</p></td><td align="left" /><td align="left"><p>0.04 (0.07)</p></td><td align="left"><p>0.04 (0.07)</p></td></tr><tr><td align="left"><p> Asian</p></td><td align="left" /><td align="left"><p>− 0.15 (0.09)</p></td><td align="left"><p>− 0.15 (0.09)</p></td></tr><tr><td align="left"><p> Pacific Islander</p></td><td align="left" /><td align="left"><p>0.10 (0.09)</p></td><td align="left"><p>0.10 (0.09)</p></td></tr><tr><td align="left"><p> American Indian</p></td><td align="left" /><td align="left"><p>− 0.43*** (0.09)</p></td><td align="left"><p>− 0.43*** (0.11)</p></td></tr><tr><td align="left"><p> African American</p></td><td align="left" /><td align="left"><p>0.27** (0.10)</p></td><td align="left"><p>0.27** (0.10)</p></td></tr><tr><td align="left"><p> Special education</p></td><td align="left" /><td align="left"><p>0.01 (0.03)</p></td><td align="left"><p>0.01 (0.03)</p></td></tr><tr><td align="left"><p> EL</p></td><td align="left" /><td align="left"><p>0.08 (0.07)</p></td><td align="left"><p>0.08 (0.07)</p></td></tr><tr><td align="left"><p> FRL</p></td><td align="left" /><td align="left"><p>− 0.02 (0.02)</p></td><td align="left"><p>− 0.02 (0.02)</p></td></tr><tr><td align="left"><p> Grade</p></td><td 
align="left" /><td align="left"><p>− 0.28*** (0.01)</p></td><td align="left"><p>− 0.28*** (0.01)</p></td></tr><tr><td align="left" /><td align="left" /><td align="left"><p><italic>df</italic> = 113,880</p></td><td align="left"><p><italic>df</italic> = 113,880</p></td></tr><tr><td align="left" colspan="4"><p>School predictors</p></td></tr><tr><td align="left"><p> Size</p></td><td align="left" /><td align="left" /><td align="left"><p>0.001 (0.001)</p></td></tr><tr><td align="left"><p> Minority%</p></td><td align="left" /><td align="left" /><td align="left"><p>− 1.90 (1.12)</p></td></tr><tr><td align="left"><p> EL%</p></td><td align="left" /><td align="left" /><td align="left"><p>4.76** (2.46)</p></td></tr><tr><td align="left"><p> SpEd%</p></td><td align="left" /><td align="left" /><td align="left"><p>3.17* (1.27)</p></td></tr><tr><td align="left"><p> FRL%</p></td><td align="left" /><td align="left" /><td align="left"><p>1.39 (0.82)</p></td></tr><tr><td align="left"><p> Middle</p></td><td align="left" /><td align="left" /><td align="left"><p>− 1.40*** (0.26)</p></td></tr><tr><td align="left"><p> High</p></td><td align="left" /><td align="left" /><td align="left"><p>− 3.72*** (0.30)</p></td></tr><tr><td align="left"><p> Charter district</p></td><td align="left" /><td align="left" /><td align="left"><p>− 3.23 (1.65)</p></td></tr><tr><td align="left" /><td align="left" /><td align="left" /><td align="left"><p><italic>df</italic> = 638</p></td></tr><tr><td align="left" colspan="4"><p>Variance components</p></td></tr><tr><td align="left"><p> Intercept (SD)</p></td><td align="left"><p>6.50 (2.55)</p></td><td align="left"><p>6.50 (2.55)</p></td><td align="left"><p>4.92 (2.22)</p></td></tr><tr><td align="left"><p> Residual (SD)</p></td><td align="left"><p>11.52 (3.39)</p></td><td align="left"><p>11.47 (3.39)</p></td><td align="left"><p>11.47 (3.39)</p></td></tr><tr><td align="left" colspan="4"><p>Percent of variance explained</p></td></tr><tr><td align="left"><p> Level-1 
(students)</p></td><td align="left"><p>–</p></td><td align="left"><p>0.44%</p></td><td align="left"><p>0.44%</p></td></tr><tr><td align="left"><p> Level-2 (schools)</p></td><td align="left"><p>–</p></td><td align="left"><p>–</p></td><td align="left"><p>24.33%</p></td></tr><tr><td align="left" colspan="4"><p>Model fit</p></td></tr><tr><td align="left"><p> BIC</p></td><td align="left"><p>608,599.00</p></td><td align="left"><p>608,219.90</p></td><td align="left"><p>608,126.74</p></td></tr><tr><td align="left"><p> L ratio</p></td><td align="left" /><td align="left"><p>507.26***</p></td><td align="left"><p>186.33***</p></td></tr></tbody></table> </ephtml> </p> <p>Grade level and district dummy variables are omitted for readability. <emph>FRL</emph> Free or Reduced Lunch, <emph>EL</emph> English Learner, <emph>FRL%</emph> the percentage of students with FRL per school, <emph>EL%</emph> the percentage of students classified as EL per school, <emph>SpEd%</emph> the percentage of students with special education services per school, <emph>Minority%</emph> the percentage of students classified as minorities per school, <emph>Size</emph> the number of students enrolled in a school. <emph>*p</emph> <.05, **<emph>p</emph> <.01, ***<emph>p</emph> <.001</p> <p>Results indicated that, within schools, the number of essays students completed was generally not associated with student demographics. ELs, students with disabilities, and students receiving FRL did not differ significantly from their counterparts in the number of essays completed. Relative to White students, only two racial/ethnic groups showed statistically significant differences in usage: on average, American Indian students completed 0.43 fewer essays and African American students completed an additional 0.27 essays. Males completed significantly fewer essays than females across the year, but the practical significance was minimal: on average, male students completed 0.06 fewer essays than their female peers. 
Finally, a one-grade increase in grade level was associated with 0.28 fewer essays completed. These student-level factors explained only 0.44% of variance in the number of essays completed.</p> <p>Demographic factors did explain a substantial share of the between-school variance in number of essays completed. However, these results were encouraging with respect to equity and access issues. Schools with greater percentages of EL students completed, on average, 4.76 more essays per year than comparison schools. Similarly, schools with greater percentages of students with disabilities completed, on average, 3.17 more essays per year than comparison schools. School size, the percentage of low-income or minority students, and charter or public status were not statistically significant predictors. Elementary schools had a higher average number of essays completed than middle schools, which in turn had a higher average than high schools. These school-level factors explained 24.33% of the variance in number of essays completed.</p> <hd id="AN0151271343-29">Number of drafts per essay</hd> <p>Results from the HLM models with number of drafts per essay are presented in Table 5. The results indicated that, within schools, the degree to which students revised their essays in response to automated feedback was associated with student demographics in ways that parallel equity concerns. Students who were male (− 0.02), Hispanic (− 0.16), Pacific Islander (− 0.20), American Indian (− 0.31), EL (− 0.23), or received special education services (− 0.43) or FRL (− 0.10) completed fewer drafts per essay than their peers when controlling for student and school-level factors. Students in higher grade levels tended to revise less than lower grade-level peers (− 0.05). Nevertheless, these factors explained only 1.23% of the variance in the average number of drafts completed per essay. 
With respect to between-school variance, high schools (− 0.56) and schools with greater FRL% (− 0.64) had lower average drafts per essay when controlling for student and school-level variables. These school-level factors explained 8.24% of the variance in the average number of drafts per essay.</p> <p>Table 5 Results of two-level hierarchical models predicting drafts per essay</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left" /><th align="left"><p>Model 1</p></th><th align="left"><p>Model 2</p></th><th align="left"><p>Model 3</p></th></tr><tr><th align="left"><p>Fixed effects</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. E.)</p></th></tr></thead><tbody><tr><td align="left"><p>Intercept</p><p><italic>Degrees of freedom (df)</italic></p></td><td align="left"><p>2.84*** (0.03)</p><p><italic>df</italic> = 113,891</p></td><td align="left"><p>2.84*** (0.03)</p><p><italic>df</italic> = 113,880</p></td><td align="left"><p>2.84*** (0.04)</p><p><italic>df</italic> = 113,880</p></td></tr><tr><td align="left" colspan="4"><p>Student demographics</p></td></tr><tr><td align="left"><p> Male</p></td><td align="left" /><td align="left"><p>− 0.02* (0.01)</p></td><td align="left"><p>− 0.02* (0.01)</p></td></tr><tr><td align="left"><p> Hispanic</p></td><td align="left" /><td align="left"><p>− 0.16*** (0.02)</p></td><td align="left"><p>− 0.16*** (0.02)</p></td></tr><tr><td align="left"><p> Multiple races</p></td><td align="left" /><td align="left"><p>− 0.06* (0.03)</p></td><td align="left"><p>− 0.06* (0.03)</p></td></tr><tr><td align="left"><p> Asian</p></td><td align="left" /><td align="left"><p>0.03 (0.03)</p></td><td align="left"><p>0.03 (0.03)</p></td></tr><tr><td align="left"><p> Pacific Islander</p></td><td align="left" /><td align="left"><p>− 0.20*** (0.04)</p></td><td align="left"><p>− 0.20*** (0.04)</p></td></tr><tr><td align="left"><p> American 
Indian</p></td><td align="left" /><td align="left"><p>− 0.31*** (0.05)</p></td><td align="left"><p>− 0.31*** (0.05)</p></td></tr><tr><td align="left"><p> African American</p></td><td align="left" /><td align="left"><p>− 0.04 (0.05)</p></td><td align="left"><p>− 0.04 (0.05)</p></td></tr><tr><td align="left"><p> Special education</p></td><td align="left" /><td align="left"><p>− 0.43*** (0.02)</p></td><td align="left"><p>− 0.43*** (0.02)</p></td></tr><tr><td align="left"><p> EL</p></td><td align="left" /><td align="left"><p>− 0.23*** (0.03)</p></td><td align="left"><p>− 0.23*** (0.03)</p></td></tr><tr><td align="left"><p> FRL</p></td><td align="left" /><td align="left"><p>− 0.10*** (0.01)</p></td><td align="left"><p>− 0.10*** (0.01)</p></td></tr><tr><td align="left"><p> Grade</p></td><td align="left" /><td align="left"><p>− 0.05*** (0.01)</p></td><td align="left"><p>− 0.05*** (0.01)</p></td></tr><tr><td align="left" /><td align="left" /><td align="left"><p><italic>df</italic> = 113,880</p></td><td align="left"><p><italic>df</italic> = 113,880</p></td></tr><tr><td align="left" colspan="4"><p>School predictors</p></td></tr><tr><td align="left"><p> Size</p></td><td align="left" /><td align="left" /><td align="left"><p>0.00 (0.0002)</p></td></tr><tr><td align="left"><p> Minority%</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.08 (0.35)</p></td></tr><tr><td align="left"><p> EL%</p></td><td align="left" /><td align="left" /><td align="left"><p>0.02 (0.76)</p></td></tr><tr><td align="left"><p> SpEd%</p></td><td align="left" /><td align="left" /><td align="left"><p>0.47 (0.40)</p></td></tr><tr><td align="left"><p> FRL%</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.64* (0.25)</p></td></tr><tr><td align="left"><p> Middle</p></td><td align="left" /><td align="left" /><td 
align="left"><p>− 0.05 (0.08)</p></td></tr><tr><td align="left"><p> High</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.56*** (0.09)</p></td></tr><tr><td align="left"><p> Charter district</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.50 (0.51)</p></td></tr><tr><td align="left" /><td align="left" /><td align="left" /><td align="left"><p><italic>df</italic> = 638</p></td></tr><tr><td align="left" colspan="4"><p>Variance components</p></td></tr><tr><td align="left"><p> Intercept (SD)</p></td><td align="left"><p>0.49 (0.68)</p></td><td align="left"><p>0.49 (0.70)</p></td><td align="left"><p>0.45 (0.69)</p></td></tr><tr><td align="left"><p> Residual (SD)</p></td><td align="left"><p>2.55 (1.60)</p></td><td align="left"><p>2.51 (1.59)</p></td><td align="left"><p>2.51 (1.59)</p></td></tr><tr><td align="left" colspan="4"><p>Percent of variance explained</p></td></tr><tr><td align="left"><p> Level-1 (students)</p></td><td align="left"><p>–</p></td><td align="left"><p>1.23%</p></td><td align="left"><p>1.23%</p></td></tr><tr><td align="left"><p> Level-2 (schools)</p></td><td align="left"><p>–</p></td><td align="left"><p>–</p></td><td align="left"><p>8.24%</p></td></tr><tr><td align="left" colspan="4"><p>Model fit</p></td></tr><tr><td align="left"><p> BIC</p></td><td align="left"><p>434,891.00</p></td><td align="left"><p>433,612.57</p></td><td align="left"><p>433,650.50</p></td></tr><tr><td align="left"><p> L ratio</p></td><td align="left" /><td align="left"><p>1406.57***</p></td><td align="left"><p>55.26***</p></td></tr></tbody></table> </ephtml> </p> <p>Grade level and district dummy variables are omitted for readability <emph>FRL</emph> Free or Reduced Lunch, <emph>EL</emph> English Learner, <emph>FRL%</emph> the percentage of students with FRL per school, <emph>EL%</emph> the percentage of students classified as EL per school, <emph>SpEd%</emph> the percentage of students with special education services per school, 
<emph>Minority%</emph> the percentage of students classified as minorities per school, <emph>Size</emph> the number of students enrolled in a school. <emph>*p</emph> <.05, **<emph>p</emph> <.01, ***<emph>p</emph> <.001</p> <hd id="AN0151271343-30">RQ3: association between usage and gains in writing performance</hd> <p>Results from the HLM models for RQ3 are presented in Table 6. Model 1 was an unconditional model with Year 2 SAGE scores as an outcome and fixed effects for districts. Model 2 added student and school-level demographic variables and Year 1 SAGE scores. Model 3 added the linear and quadratic terms of Utah Compose usage variables for number of essays completed and number of drafts per essay. School-level measures of the number of essays and drafts per essay per school were also added. Model 2 explained 41% of the variance at the student level and 65% at the school level when compared to the null model; adding the Utah Compose usage variables in Model 3 explained an additional 0.95% of variance at the student level and 0.37% at the school level.</p> <p>Table 6 Results of two-level hierarchical models predicting year 2 state test writing performance</p> <p> <ephtml> <table frame="hsides" rules="groups"><thead><tr><th align="left" /><th align="left"><p>Model 1</p></th><th align="left"><p>Model 2</p></th><th align="left"><p>Model 3</p></th></tr><tr><th align="left"><p>Fixed effects</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. E.)</p></th><th align="left"><p>Coefficient (S. 
E.)</p></th></tr></thead><tbody><tr><td align="left"><p>Intercept</p><p><italic>Degrees of freedom (df)</italic></p></td><td align="left"><p>421.87*** (1.68)</p><p><italic>df</italic> = 113,891</p></td><td align="left"><p>442.27*** (1.38)</p><p><italic>df</italic> = 113,879</p></td><td align="left"><p>442.27*** (1.37)</p><p><italic>df</italic> = 113,875</p></td></tr><tr><td align="left" colspan="4"><p>Student demographics</p></td></tr><tr><td align="left"><p> Male</p></td><td align="left" /><td align="left"><p>− 16.26*** (0.36)</p></td><td align="left"><p>− 16.10*** (0.36)</p></td></tr><tr><td align="left"><p> Hispanic</p></td><td align="left" /><td align="left"><p>− 3.05*** (0.60)</p></td><td align="left"><p>− 3.05*** (0.60)</p></td></tr><tr><td align="left"><p> Multiple races</p></td><td align="left" /><td align="left"><p>1.56 (1.19)</p></td><td align="left"><p>1.66 (1.18)</p></td></tr><tr><td align="left"><p> Asian</p></td><td align="left" /><td align="left"><p>17.30*** (1.52)</p></td><td align="left"><p>17.40*** (1.51)</p></td></tr><tr><td align="left"><p> Pacific Islander</p></td><td align="left" /><td align="left"><p>− 0.57 (1.61)</p></td><td align="left"><p>− 0.003 (1.60)</p></td></tr><tr><td align="left"><p> American Indian</p></td><td align="left" /><td align="left"><p>− 6.47** (1.96)</p></td><td align="left"><p>− 5.20** (1.94)</p></td></tr><tr><td align="left"><p> African American</p></td><td align="left" /><td align="left"><p>− 12.41*** (1.78)</p></td><td align="left"><p>− 12.62*** (1.76)</p></td></tr><tr><td align="left"><p> Special education</p></td><td align="left" /><td align="left"><p>− 31.65*** (0.63)</p></td><td align="left"><p>− 31.65*** (0.63)</p></td></tr><tr><td align="left"><p> EL</p></td><td align="left" /><td align="left"><p>− 17.82*** (1.18)</p></td><td align="left"><p>− 17.09*** (1.17)</p></td></tr><tr><td align="left"><p> FRL</p></td><td align="left" /><td align="left"><p>− 10.41*** (0.43)</p></td><td align="left"><p>− 10.06*** 
(0.43)</p></td></tr><tr><td align="left"><p> Grade</p></td><td align="left" /><td align="left"><p>13.26*** (0.23)</p></td><td align="left"><p>13.26*** (0.23)</p></td></tr><tr><td align="left"><p> Prior achievement</p></td><td align="left" /><td align="left"><p>0.44*** (0.002)</p></td><td align="left"><p>0.44*** (0.002)</p></td></tr><tr><td align="left" colspan="4"><p>Student Utah Compose usage</p></td></tr><tr><td align="left"><p> Essays completed</p></td><td align="left" /><td align="left" /><td align="left"><p>3.00*** (0.16)</p></td></tr><tr><td align="left"><p> Essays completed (quadratic)</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.11*** (0.01)</p></td></tr><tr><td align="left"><p> Drafts/essay</p></td><td align="left" /><td align="left" /><td align="left"><p>9.23*** (0.42)</p></td></tr><tr><td align="left"><p> Drafts/essay (quadratic)</p></td><td align="left" /><td align="left" /><td align="left"><p>− 0.80*** (0.05)</p></td></tr><tr><td align="left" /><td align="left" /><td align="left"><p><italic>df</italic> = 638</p></td><td align="left"><p><italic>df</italic> = 636</p></td></tr><tr><td align="left" colspan="4"><p>School demographics</p></td></tr><tr><td align="left"><p> Size</p></td><td align="left" /><td align="left"><p>0.04*** (0.01)</p></td><td align="left"><p>0.04*** (0.01)</p></td></tr><tr><td align="left"><p> Minority%</p></td><td align="left" /><td align="left"><p>24.04 (12.75)</p></td><td align="left"><p>25.93* (12.71)</p></td></tr><tr><td align="left"><p> EL%</p></td><td align="left" /><td align="left"><p>− 141.74*** (27.95)</p></td><td align="left"><p>− 145.56*** (27.87)</p></td></tr><tr><td align="left"><p> SpEd%</p></td><td align="left" /><td align="left"><p>− 120.89*** (14.60)</p></td><td align="left"><p>− 124.73*** (14.59)</p></td></tr><tr><td align="left"><p> FRL%</p></td><td align="left" /><td align="left"><p>− 51.16*** (9.34)</p></td><td align="left"><p>− 51.17*** (9.35)</p></td></tr><tr><td align="left"><p> 
Middle</p></td><td align="left" /><td align="left"><p>29.33*** (2.95)</p></td><td align="left"><p>30.73*** (3.00)</p></td></tr><tr><td align="left"><p> High</p></td><td align="left" /><td align="left"><p>66.82*** (3.41)</p></td><td align="left"><p>71.45*** (3.79)</p></td></tr><tr><td align="left"><p> Charter district</p></td><td align="left" /><td align="left"><p>6.72 (18.63)</p></td><td align="left"><p>10.67 (18.59)</p></td></tr><tr><td align="left" colspan="4"><p>School Utah Compose usage</p></td></tr><tr><td align="left"><p> Essays completed per school</p></td><td align="left" /><td align="left" /><td align="left"><p>0.89* (0.43)</p></td></tr><tr><td align="left"><p> Drafts/essay per school</p></td><td align="left" /><td align="left" /><td align="left"><p>2.37 (1.46)</p></td></tr><tr><td align="left" colspan="4"><p>Variance components</p></td></tr><tr><td align="left"><p> Intercept (SD)</p></td><td align="left"><p>1696.65 (41.19)</p></td><td align="left"><p>595.48 (24.40)</p></td><td align="left"><p>589.24 (24.27)</p></td></tr><tr><td align="left"><p> Residual (SD)</p></td><td align="left"><p>6074.97 (77.94)</p></td><td align="left"><p>3585.85 (59.88)</p></td><td align="left"><p>3527.90 (59.40)</p></td></tr><tr><td align="left" colspan="4"><p>Percent of variance explained</p></td></tr><tr><td align="left"><p> Level-1 (students)</p></td><td align="left"><p>–</p></td><td align="left"><p>40.97%</p></td><td align="left"><p>41.93%</p></td></tr><tr><td align="left"><p> Level-2 (schools)</p></td><td align="left"><p>–</p></td><td align="left"><p>64.90%</p></td><td align="left"><p>65.27%</p></td></tr><tr><td align="left" colspan="4"><p>Model fit</p></td></tr><tr><td align="left"><p> BIC</p></td><td align="left"><p>1,326,327</p></td><td align="left"><p>1,265,816</p></td><td align="left"><p>1,264,023</p></td></tr><tr><td align="left"><p> L ratio</p></td><td align="left" /><td align="left"><p>60,743.87***</p></td><td align="left"><p>1863.22***</p></td></tr></tbody></table> 
</ephtml> </p> <p>Grade level and district dummy variables are omitted for readability. <emph>FRL</emph> Free or Reduced Lunch, <emph>EL</emph> English Language Learner, <emph>FRL%</emph> the percentage of students with FRL per school, <emph>EL%</emph> the percentage of students classified as EL per school, <emph>SpEd%</emph> the percentage of students with special education services per school, <emph>Minority%</emph> the percentage of students classified as minorities per school, <emph>Size</emph> the number of students enrolled in a school. *<emph>p</emph> < .05, **<emph>p</emph> < .01, ***<emph>p</emph> < .001</p> <p>Results indicated that students' gains in writing performance were predicted by prior performance, demographics, and usage. Within schools, students who completed a greater number of essays and a greater number of drafts per essay demonstrated greater writing performance than their peers. The associated effect of these usage variables was not linear, as evidenced by the statistically significant quadratic terms associated with each variable: greater Utah Compose usage was associated with greater state test writing performance up to a point, after which there were diminishing returns. Saturation points were calculated using the following equation (Singer & Willett, [<reflink idref="bib56" id="ref82">56</reflink>]): [(− 1 × coefficient of the linear term)/(2 × coefficient of the squared term)]. The saturation points for number of essays and drafts per essay were 13.6 and 5.8, respectively; using the usage coefficients reported in the table, (− 1 × 3.00)/(2 × − 0.11) ≈ 13.6 and (− 1 × 9.23)/(2 × − 0.80) ≈ 5.8. Supplemental figures illustrate these effects. All student-level factors explained 41.93% of the variance in Year 2 SAGE scores.</p> <p>Between-school variability in Year 2 state test writing performance was also predicted by demographic factors and Utah Compose usage. With respect to the focal Utah Compose usage predictors, only the mean number of essays completed per school was statistically significantly associated with greater SAGE writing performance at the school level. 
Higher rates of revision at the school level, indicated by the drafts/essay variable, were not statistically significantly associated with greater SAGE writing performance. In all, these school-level factors explained 65.27% of the between-school variance in Year 2 SAGE writing performance.</p> <hd id="AN0151271343-31">Discussion</hd> <p>The purpose of the present study was to extend the literature on large-scale implementation of AWE by examining which aspects of AWE are utilized at scale, whether issues of equity and access influence AWE usage, and whether AWE usage is associated with gains in state test writing performance. Results indicated that the main functions of AWE (i.e., writing and revising essays in response to automated feedback) were used by students, but supplemental pedagogical functions (e.g., peer review and skill-building lessons) were rarely used. With respect to equity and access, some student- and school-level demographic factors were associated with lower usage rates for essays completed and drafts per essay. However, these differences may not be practically meaningful given that student variables explained only 0.44% of the variance in number of essays and 1.23% of the variance in drafts per essay. After accounting for student- and school-level demographic factors and prior writing performance, Utah Compose usage was associated with statistically significant gains in state test writing performance. Results are discussed with respect to implications and limitations of implementing AWE at scale to support the teaching and learning of writing.</p> <hd id="AN0151271343-32">Describing Utah Compose usage</hd> <p>The current study updates results of prior studies of other naturalistic AWE implementations with respect to AWE usage. 
Smaller-scale studies have found that despite teachers reporting positive attitudes towards AWE and its ability to facilitate greater revision, AWE has been used mainly for first-draft, only-draft writing; students do not frequently revise with feedback to submit multiple drafts (see Grimes & Warschauer, [<reflink idref="bib22" id="ref83">22</reflink>]; Warschauer & Grimes, [<reflink idref="bib62" id="ref84">62</reflink>]). The present study, however, found that students completed, on average, 6 essays per year and revised each essay, on average, 3 times. These are encouraging statistics, suggesting that, perhaps over time, AWE technologies have been gaining traction in supporting the teaching and learning of writing.</p> <p>Nevertheless, it is interesting that students in higher grade levels and middle and high schools completed fewer essays and fewer drafts/essay than students in lower grade levels and elementary school. Given that one of the intentions of AWE is to remove barriers to increasing writing practice erected by the time costs of teachers scoring and evaluating writing, we might expect that secondary school teachers would want to use AWE to an equal or greater degree than elementary school teachers, especially because secondary school teachers instruct upwards of 100 students. Although the present study is not positioned to definitively explain this incongruence, national surveys of writing instruction in the U.S. indicate that English language arts curricula tend to de-emphasize writing, particularly in the upper grades (Brindle et al., [<reflink idref="bib9" id="ref85">9</reflink>]; Graham et al., [<reflink idref="bib21" id="ref86">21</reflink>]; Kiuhara et al., [<reflink idref="bib35" id="ref87">35</reflink>]). 
If this is the case, then AWE removes only one barrier to increasing writing practice, while a larger, more systemic barrier, that of the ELA curriculum, is unaffected by the affordances offered by AWE.</p> <p>Furthermore, descriptive statistics indicated that peer review and skill-building lessons were rarely used. As an interactive learning environment, the Utah Compose AWE system facilitates a number of pedagogically sound interactions, one of which is peer review. Limited peer review usage was somewhat surprising because peer feedback is a common and effective instructional practice (Huisman et al., [<reflink idref="bib30" id="ref88">30</reflink>]). Possible reasons for underutilization may include lack of training, the use of peer review outside of the system, peer review not being included in the curriculum, or peer review not being valued by teachers. Nevertheless, our findings illustrate an important point: simply providing <emph>access</emph> to AWE systems that facilitate desirable pedagogical interactions is insufficient to ensure <emph>implementation</emph> of those interactions.</p> <hd id="AN0151271343-33">Explaining variability in Utah Compose usage</hd> <p>Student-level demographic factors were minimally associated with Utah Compose usage but, in the case of drafts/essay, paralleled existing performance gaps in writing. Still, the majority of variance in student-level AWE usage was unexplained by demographic factors. While the present study is not positioned to say definitively, one set of factors that may be more salient than student demographics is teacher characteristics and instructional methods. 
Prior research indicates that teachers across grade levels hold different attitudes and beliefs about technology implementation and possess varying levels of technical pedagogical content knowledge (Carver, [<reflink idref="bib13" id="ref89">13</reflink>]; Hew & Brush, [<reflink idref="bib27" id="ref90">27</reflink>]; Williams & Beam, [<reflink idref="bib64" id="ref91">64</reflink>]). Thus, it is reasonable that teacher differences might explain variance in Utah Compose usage.</p> <hd id="AN0151271343-34">Association between usage and gains in writing performance</hd> <p>Results extend limited and conflicting research on relationships between AWE usage and state test writing performance (see Shermis et al., [<reflink idref="bib55" id="ref92">55</reflink>]; Wilson et al., [<reflink idref="bib69" id="ref93">69</reflink>]; Wilson & Roscoe, [<reflink idref="bib71" id="ref94">71</reflink>]). On average, after controlling for school district, demographics, and prior performance, completing an additional essay within Utah Compose was associated with a SAGE score that was 3.00 points higher, and completing an additional draft per essay was associated with a SAGE score that was 9.23 points higher. However, these positive associations were not linear. Gains in SAGE writing performance diminished as students approached the saturation point for number of essays completed (approximately 13.6 essays) and average drafts per essay (approximately 5.8 drafts/essay). Thus, greater use of AWE was associated with better performance, but only up to a point. Furthermore, usage variables explained less than 1% of variance in writing scores.</p> <p>Interestingly, the associated effect of drafts/essay on SAGE writing performance at the school level was not statistically significant. At the student level, one of the hallmarks of a strong writer is frequent and effective revising (Deane, [<reflink idref="bib17" id="ref95">17</reflink>]), but prior research is less clear on what characterizes effective systematic school-wide writing practices. 
Is it sufficient for a handful of teachers to implement AWE creatively and effectively? Or is it sufficient for a school to have the majority of teachers use AWE moderately? Although the present study is not positioned to answer these questions, the findings raise them as important questions for future research.</p> <hd id="AN0151271343-35">Limitations and directions for future research</hd> <p>The present study provides a bird's-eye view of typical usage of a statewide implementation of an AWE program when districts, schools, and teachers were free to implement AWE as little or as much as they chose. To understand whether AWE implementation is causally related to performance gains, future research should utilize experimental designs to compare gains in state test performance among students who did and did not use AWE.</p> <p>Granular data to analyze how classroom-level factors impacted results were not available. These factors might have explained variance in our outcome measures. Thus, future research should incorporate measures of teacher-level instructional factors, such as attitudes towards technology and AWE or measures of instructional quality.</p> <p>Moreover, given our data, we do not know the extent to which students used the AWE feedback to guide their revisions. There is evidence from prior research that students use automated feedback to make productive revisions (Moore & MacArthur, [<reflink idref="bib41" id="ref96">41</reflink>]; Roscoe et al., [<reflink idref="bib53" id="ref97">53</reflink>]); however, students also may rely on internal schemas in addition to (or instead of) the AWE feedback (Chapelle et al., [<reflink idref="bib14" id="ref98">14</reflink>]). Feedback uptake is complex (Carless & Boud, [<reflink idref="bib12" id="ref99">12</reflink>]); future AWE research should attend to these nuances.</p> <p>Next, the present study did not allow for a nuanced analysis of technology access issues. 
An evaluation of math technology implementation in Utah during this same timeframe found that one-third of teachers claimed technology access was a barrier in implementing math technologies (Brasiel et al., [<reflink idref="bib8" id="ref100">8</reflink>]). In the absence of technology access data, all HLM analyses were conducted by nesting students within schools and using fixed-effects dummy variables for school districts. This analytic approach enabled schools to be their own controls and to examine within- and between-school variance in AWE usage and gains in state test writing performance.</p> <p>Finally, results might not generalize to contexts outside of Utah. Follow-up studies with racially and ethnically diverse populations should be conducted.</p> <hd id="AN0151271343-36">Implications for usage and implementation</hd> <p>Study findings have important implications for states and districts considering adopting AWE to support the teaching and learning of writing. First, stakeholders must understand that providing access to AWE does not ensure full or effective implementation. A coordinated plan for curricular integration with clear usage expectations and significant professional development are likely necessary. Second, despite our study finding minimal evidence of issues of equity and access, states and districts must not ignore these issues. States and districts considering implementing AWE at scale should ensure teachers and students have sufficient access to technology to productively utilize and benefit from AWE. Otherwise, AWE implementation may result in widening, not closing, achievement gaps. Third, states and districts should recognize that, when implemented in a manner similar to that of the present study (i.e., non-compulsory and provided as a supplemental curricular resource), AWE will not dramatically transform writing outcomes, but neither will it diminish those outcomes; at worst, it appears that implementation of AWE is innocuous. 
Overall, those considering implementing AWE at scale should understand that it is a tool that can support instruction by affording opportunities for writing practice and immediate feedback, but like any tool, its effectiveness depends on how it is implemented, as well as on systematic support and user expertise.</p> <hd id="AN0151271343-37">Acknowledgements</hd> <p>This research was supported in part by Delegated Authority contract EDUC432914160001 from Measurement Incorporated<sups>®</sups> and by Grant R305H170046 from the Institute of Education Sciences, U.S. Department of Education, to the University of Delaware. The opinions expressed are those of the authors and do not represent the views of Measurement Incorporated, the Institute, or the U.S. Department of Education, and no official endorsement by these agencies should be inferred. Thank you to Drs. Christina Barbieri and Henry May for feedback on prior drafts.</p> <hd id="AN0151271343-38">Declarations</hd> <p></p> <hd id="AN0151271343-39">Conflict of interest</hd> <p>The authors declare that they have no conflict of interest.</p> <hd id="AN0151271343-40">Supplementary Information</hd> <p>Below is the link to the electronic supplementary material.</p> <p>Graph: Supplementary file1 (DOCX 1325 kb)</p> <hd id="AN0151271343-41">Publisher's Note</hd> <p>Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.</p> <ref id="AN0151271343-42"> <title> References </title> <blist> <bibl id="bib1" idref="ref7" type="bt">1</bibl> <bibtext> Allen LK, Jacovina ME, McNamara DS; MacArthur CA, Graham S, Fitzgerald J (Eds.). Computer-based writing instruction. Handbook of writing research. 2016: New York; The Guilford Press: 316-329</bibtext> </blist> <blist> <bibl id="bib2" idref="ref80" type="bt">2</bibl> <bibtext> Allison PD. Fixed effects regression models. 2009: Thousand Oaks, CA; SAGE. 
10.4135/9781412993869</bibtext> </blist> <blist> <bibl id="bib3" idref="ref76" type="bt">3</bibl> <bibtext> American Institutes for Research. (2018). Utah State Assessments 2017–2018 technical report: Volume 1 Technical report. https://schools.utah.gov/file/97391cfd-9251-4ad1-9266-47b2ebe88e84</bibtext> </blist> <blist> <bibl id="bib4" idref="ref38" type="bt">4</bibl> <bibtext> Bai L, Hu G. In the face of fallible AWE feedback: How do students respond? Educational Psychology. 2017; 37: 67-81. 10.1080/01443410.2016.1223275</bibtext> </blist> <blist> <bibl id="bib5" idref="ref42" type="bt">5</bibl> <bibtext> Bauer MS, Damschroder L, Hagedorn H, Smith J, Kilbourne AM. An introduction to implementation science for the non-specialist. BMC Psychology. 2015; 3; 32: 1-12</bibtext> </blist> <blist> <bibl id="bib6" idref="ref32" type="bt">6</bibl> <bibtext> Bejar II, Flor M, Futagi Y, Ramineni C. On the vulnerability of automated scoring to construct-irrelevant response strategies (CIRS): An illustration. Assessing Writing. 2014; 22: 48-59. 10.1016/j.asw.2014.06.001</bibtext> </blist> <blist> <bibl id="bib7" idref="ref73" type="bt">7</bibl> <bibtext> Black P, Wiliam D. Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability. 2009; 21: 5-31. 10.1007/s11092-008-9068-5</bibtext> </blist> <blist> <bibl id="bib8" idref="ref100" type="bt">8</bibl> <bibtext> Brasiel S, Jeong S, Ames C, Lawanto K, Yuan M, Martin T. Effects of educational technology on mathematics achievement for K-12 students in Utah. Journal of Online Learning Research. 2016; 2: 205-226</bibtext> </blist> <blist> <bibl id="bib9" idref="ref85" type="bt">9</bibl> <bibtext> Brindle M, Graham S, Harris KR, Hebert M. Third and fourth grade teachers' classroom practices in writing: A national survey. Reading and Writing. 2016; 29: 929-954. 10.1007/s11145-015-9604-x</bibtext> </blist> <blist> <bibtext> Bunch MB, Vaughn D, Miel S; Rosen Y, Ferrara S, Mosharraf M (Eds.). 
Automated scoring in assessment systems. Handbook of research on technology tools for real-world skill development. 2016: Hershey, PA; IGI Global: 611-626. 10.4018/978-1-4666-9441-5.ch023</bibtext> </blist> <blist> <bibtext> Campuzano L, Dynarski M, Agodini R, Rall K. Effectiveness of reading and mathematics software products: Findings from two student cohorts (NCEE 2009–4042). 2009: Washington, DC; National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education</bibtext> </blist> <blist> <bibtext> Carless D, Boud D. The development of student feedback literacy: Enabling uptake of feedback. Assessment & Evaluation in Higher Education. 2018; 43; 8: 1315-1325. 10.1080/02602938.2018.1463354</bibtext> </blist> <blist> <bibtext> Carver, L. B. (2016). Teacher perception of barriers and benefits in K-12 technology usage. Turkish Online Journal of Educational Technology, 15, 110–116. Retrieved from <ulink href="http://www.tojet.net/articles/v15i1/15111.pdf">http://www.tojet.net/articles/v15i1/15111.pdf</ulink></bibtext> </blist> <blist> <bibtext> Chapelle CA, Cotos E, Lee J. Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing. 2015; 32; 3: 385-405. 10.1177/0265532214565386</bibtext> </blist> <blist> <bibtext> Coe M, Hanita M, Nishioka V, Smiley R. An investigation of the impact of the 6 + 1 trait writing model on grade 5 student writing achievement (Final Report NCEE 2012–4010). 2011: Washington, DC; National Center for Education Evaluation and Regional Assistance</bibtext> </blist> <blist> <bibtext> Conference on College Composition and Communication. (2014). CCCC position statement on teaching, learning and assessing writing in digital environments. Retrieved April 14, 2021, from https://cccc.ncte.org/cccc/resources/positions/writingassessment</bibtext> </blist> <blist> <bibtext> Deane P. 
The challenge of writing in school: Conceptualizing writing development within a sociocognitive framework. Educational Psychologist. 2018; 53: 280-300. 10.1080/00461520.2018.1513844</bibtext> </blist> <blist> <bibtext> Dikli S. The nature of automated essay scoring feedback. CALICO Journal. 2010; 28: 99-134. 10.11139/cj.28.1.99-134</bibtext> </blist> <blist> <bibtext> Ericsson PF, Haswell RJ. Machine scoring of student essays: Truth and consequences. 2006; Utah State University Press</bibtext> </blist> <blist> <bibtext> Flower L, Hayes JR; Gregg L, Steinberg E (Eds.). The dynamics of composing: Making plans and juggling constraints. Cognitive processes in writing. 1980: Hillsdale, NJ; Erlbaum: 31-50</bibtext> </blist> <blist> <bibtext> Graham S, Capizzi A, Harris KR, Hebert M, Morphy P. Teaching writing to middle school students: A national survey. Reading and Writing. 2014; 27: 1015-1042. 10.1007/s11145-013-9495-7</bibtext> </blist> <blist> <bibtext> Grimes, D, & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. The Journal of Technology, Learning and Assessment, 8(6), 1–44. Retrieved from <ulink href="http://www.jtla.org">http://www.jtla.org</ulink></bibtext> </blist> <blist> <bibtext> Grissmer DW, Berends M. Student achievement and the changing American family. 1994: Santa Monica, CA; RAND Corporation</bibtext> </blist> <blist> <bibtext> Hattie J, Timperley H. The power of feedback. Review of Educational Research. 2007; 77: 81-112. 10.3102/003465430298487</bibtext> </blist> <blist> <bibtext> Hayes JR; Levy CM, Ransdell S (Eds.). A new framework for understanding cognition and affect in writing. The science of writing: Theories, methods, individual differences, and applications. 1996: Mahwah, NJ; Erlbaum: 1-27</bibtext> </blist> <blist> <bibtext> Hayes JR. Modeling and remodeling writing. Written Communication. 2012; 29; 3: 369-388. 10.1177/0741088312451260</bibtext> </blist> <blist> <bibtext> Hew KF, Brush T. 
Integrating technology into K-12 teaching and learning: Current knowledge gaps and recommendations for future research. Educational Technology Research and Development. 2007; 55: 223-252. 10.1007/s11423-006-9022-5</bibtext> </blist> <blist> <bibtext> Higgins D, Heilman M. Managing what we can measure: Quantifying the susceptibility of automated scoring systems to gaming behavior. Educational Measurement: Issues and Practice. 2014; 33; 3: 36-46. 10.1111/emip.12036</bibtext> </blist> <blist> <bibtext> Hoffman K, Llagas C. Status and trends in the education of blacks (NCES Publication No. 2003–034). 2003: Washington, DC; U.S. Department of Education</bibtext> </blist> <blist> <bibtext> Huisman B, Saab N, van den Broek P, van Driel J. The impact of formative peer feedback on higher education students' academic writing: A meta-analysis. Assessment & Evaluation in Higher Education. 2019; 44: 863-880. 10.1080/02602938.2018.1545896</bibtext> </blist> <blist> <bibtext> Hull M, Dutch K. One-to-one technology and student outcomes: Evidence from Mooresville's digital conversion initiative. Educational Evaluation and Policy Analysis. 2019; 41: 79-97. 10.3102/0162373718799969</bibtext> </blist> <blist> <bibtext> Jeno LM, Vandvik V, Eliassen S, Grytnes J. Testing the novelty effect of an m-learning tool on internalization and achievement: A self-determination theory approach. Computers & Education. 2019; 128: 398-413. 10.1016/j.compedu.2018.10.008</bibtext> </blist> <blist> <bibtext> Keller J, Suzuki K. Learner motivation and e-learning design: A multinationally validated process. Journal of Educational Media. 2004; 29: 229-239. 10.1080/1358165042000283084</bibtext> </blist> <blist> <bibtext> Kellogg RT, Whiteford AP. Training advanced writing skills: The case for deliberate practice. Educational Psychologist. 2009; 44; 4: 250-266. 10.1080/00461520903213600</bibtext> </blist> <blist> <bibtext> Kiuhara SA, Graham S, Hawken LS. 
Teaching writing to high school students: A national survey. Journal of Educational Psychology. 2009; 101: 136-160. 10.1037/a0013097</bibtext> </blist> <blist> <bibtext> Lee V. Using hierarchical linear modeling to study social Contexts: The case of school effects. Educational Psychologist. 2000; 35: 125-141. 10.1207/S15326985EP3502_6</bibtext> </blist> <blist> <bibtext> Little CW, Clark JC, Tani NE, Connor CM. Improving writing skills through technology-based instruction: A meta-analysis. Review of Education. 2018; 6: 183-201. 10.1002/rev3.3114</bibtext> </blist> <blist> <bibtext> Liu S, Kunnan AJ. Investigating the application of automated writing evaluation to Chinese undergraduate english majors: A case study of WriteToLearn. CALICO Journal. 2016; 33: 71-91. 10.1558/cj.v33i1.26380</bibtext> </blist> <blist> <bibtext> Liu M, Li Y, Xu W, Liu L. Automated essay feedback generation and its impact on revision. IEEE Transactions on Learning Technologies. 2017; 10: 502-513. 10.1109/TLT.2016.2612659</bibtext> </blist> <blist> <bibtext> Lu R, Overbaugh RC. School environment and technology implementation in K–12 classrooms. Computers in the Schools. 2009; 26: 89-106. 10.1080/07380560902906096</bibtext> </blist> <blist> <bibtext> Moore NS, MacArthur CA. Student use of automated essay evaluation technology during revision. Journal of Writing Research. 2016; 8: 149-175. 10.17239/jowr-2016.08.01.05</bibtext> </blist> <blist> <bibtext> Morphy P, Graham S. Word processing programs and weaker writers/readers: A meta-analysis of research findings. Reading and Writing. 2012; 25: 641-678. 10.1007/s11145-010-9292-5</bibtext> </blist> <blist> <bibtext> National Center for Education Statistics. The Nation's report card: Writing 2011 (NCES 2012–470). 2012: Washington, D.C; Institute of Education Sciences, U.S. Department of Education</bibtext> </blist> <blist> <bibtext> National Commission on Writing for America's Families, Schools, and Colleges. Writing: A ticket to work ... 
or a ticket out. A survey of business leaders. 2004: New York; College Entrance Examination Board</bibtext> </blist> <blist> <bibtext> National Council of Teachers of English. (2013). NCTE position statement on machine scoring. Retrieved from: <ulink href="http://www.ncte.org/positions/statements/machine%5fscoring">http://www.ncte.org/positions/statements/machine%5fscoring</ulink>.</bibtext> </blist> <blist> <bibtext> Page EB; Shermis MD, Burstein J (Eds.). Project essay grade: PEG. Automated essay scoring: A cross-disciplinary perspective. 2003: Mahwah, NJ; Lawrence Erlbaum Associates Publishers: 43-54</bibtext> </blist> <blist> <bibtext> Palermo C, Thomson MM. Teacher implementation of self-regulated strategy development with an automated writing evaluation system: Effects on the argumentative writing performance of middle school students. Contemporary Educational Psychology. 2018; 54: 255-270. 10.1016/j.cedpsych.2018.07.002</bibtext> </blist> <blist> <bibtext> Palermo C, Wilson J. Implementing automated writing evaluation in different instructional contexts: A mixed-methods study. Journal of Writing Research. 2020; 12; 1: 63-108. 10.17239/jowr-2020.12.01.04</bibtext> </blist> <blist> <bibtext> Pinheiro, J, Bates, D, DebRoy, S, Sarkar, D. & R Core Team (2019). Nlme: Linear and nonlinear mixed effects models. R package version 3.1-140.</bibtext> </blist> <blist> <bibtext> Ranalli J. Automated written corrective feedback: How well can students make use of it? Computer Assisted Language Learning. 2018; 31: 653-674. 10.1080/09588221.2018.1428994</bibtext> </blist> <blist> <bibtext> Ranalli J, Link S, Chukharev-Hudilainen E. Automated writing evaluation for formative assessment of second language writing: Investigating the accuracy and usefulness of feedback as part of argument-based validation. Educational Psychology. 2017; 37: 8-25. 10.1080/01443410.2015.1136407</bibtext> </blist> <blist> <bibtext> Raudenbush SW, Bryk AS. 
Hierarchical linear models: Applications and data analysis methods. 2002: Thousand Oaks; Sage</bibtext> </blist> <blist> <bibtext> Roscoe, R. D, Allen, L. K, Johnson, A. C, & McNamara, D. S. (2018). Automated writing instruction and feedback: Instructional mode, attitudes, and revising. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2089–2093. Retrieved from https://journals.sagepub.com/doi/https://doi.org/10.1177/1541931218621471</bibtext> </blist> <blist> <bibtext> Shermis MD. State-of-the-art automated essay scoring: Competition, results, and future directions from a United States demonstration. Assessing Writing. 2014; 20: 53-76. 10.1016/j.asw.2013.04.001</bibtext> </blist> <blist> <bibtext> Shermis, M. D, Burstein, J. C, & Bliss, L. (2004, April). The impact of automated essay scoring on high stakes writing assessments. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego</bibtext> </blist> <blist> <bibtext> Singer JD, Willett JB. Applied longitudinal data analysis: Modeling change and event occurrence. 2003: New York; Oxford. 10.1093/acprof:oso/9780195152968.001.0001</bibtext> </blist> <blist> <bibtext> Smola AJ, Scholkopf B. A tutorial on support vector regression. Statistics and Computing. 2004; 14: 199-222. 10.1023/B:STCO.0000035301.49549.88</bibtext> </blist> <blist> <bibtext> Stevenson M. A critical interpretative synthesis: The integration of automated writing evaluation into classroom writing instruction. Computers and Composition. 2016; 42: 1-16. 10.1016/j.compcom.2016.05.001</bibtext> </blist> <blist> <bibtext> Stevenson M, Phakiti A. The effects of computer-generated feedback on the quality of writing. Assessing Writing. 2014; 19: 51-65. 10.1016/j.asw.2013.11.007</bibtext> </blist> <blist> <bibtext> Strobl C, Ailhaud E, Benetos K, Devitt A, Kruse O, Proske A, Rapp C. Digital support for academic writing: A review of technologies and pedagogies. Computers & Education. 
2019; 131: 33-48. 10.1016/j.compedu.2018.12.005</bibtext> </blist> <blist> <bibtext> U. S. Department of Education, Office of Educational Technology. (2017). Reimagining the role of Technology in Education: 2017 National Educational Technology Plan Update. Washington, DC: Author. Retrieved from https://tech.ed.gov/</bibtext> </blist> <blist> <bibtext> Warschauer, M, & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies An International Journal, 3, 22–36. https://doi.org/10.1080/15544800701771580</bibtext> </blist> <blist> <bibtext> Warschauer M, Knobel M, Stone L. Technology and equity in schooling: Deconstructing the digital divide. Educational Policy. 2004; 18: 562-588. 10.1177/0895904804266469</bibtext> </blist> <blist> <bibtext> Williams C, Beam S. Technology and writing: Review of research. Computers & Education. 2019; 128: 227-242. 10.1016/j.compedu.2018.09.024</bibtext> </blist> <blist> <bibtext> Wilson J. Universal screening with automated essay scoring: Evaluating classification accuracy in grades 3 and 4. Journal of School Psychology. 2018; 68: 19-37. 10.1016/j.jsp.2017.12.005</bibtext> </blist> <blist> <bibtext> Wilson, J, & Andrada, G. N. (2016). Using automated feedback to improve writing quality: Opportunities and challenges. In Y. Rosen, S. Ferrara & M. Mosharraf (Eds.), Handbook of research on technology tools for real-world skill development (pp.678–703). Hershey, PA: IGI Global.</bibtext> </blist> <blist> <bibtext> Wilson J, Chen D, Sandbank MP, Hebert M. Generalizability of automated scores of writing quality in grades 3-5. Journal of Educational Psychology. 2019; 111; 4: 619-640. 10.1037/edu0000311</bibtext> </blist> <blist> <bibtext> Wilson J, Czik A. Automated essay evaluation software in English language arts classrooms: Effects on teacher feedback, student motivation, and writing quality. Computers & Education. 2016; 100: 94-109. 
10.1016/j.compedu.2016.05.004</bibtext> </blist> <blist> <bibtext> Wilson J, Huang Y, Palermo C, Beard G, MacArthur CA. Automated feedback and automated scoring in the elementary grades: Usage, attitudes, and associations with writing outcomes in a districtwide implementation of MI Write. International Journal of Artificial Intelligence in Education. 2021. 10.1007/s40593-020-00236-w</bibtext> </blist> <blist> <bibtext> Wilson J, Olinghouse NG, Andrada GN. Does automated feedback improve writing quality? Learning Disabilities: A Contemporary Journal. 2014; 12: 93-118</bibtext> </blist> <blist> <bibtext> Wilson J, Roscoe RD. Automated writing evaluation and feedback: Multiple metrics of efficacy. Journal of Educational Computing Research. 2020; 58: 87-125. 10.1177/0735633119830764</bibtext> </blist> <blist> <bibtext> Wise SL, Kong X. Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education. 2010; 18: 22-36</bibtext> </blist> <blist> <bibtext> Zhu, M, Liu, O. L, & Lee, H. (2020). The effect of automated feedback on revision behavior and learning gains in formative assessment of scientific argument writing. Computers & Education, 143. Advance online publication. https://doi.org/10.1016/j.compedu.2019.103668</bibtext> </blist> </ref> <aug> <p>By Andrew Potter and Joshua Wilson</p> <p>Andrew Potter is a Ph.D. student in education at the University of Delaware. His research interests include integrated reading and writing instruction and development for students who are at risk. He previously worked as a special education teacher.</p> <p>Joshua Wilson is an Associate Professor of Education at the University of Delaware. His research focuses on ways that automated scoring and automated feedback may be used to support the teaching and learning of writing. A former special education teacher, Dr. 
Wilson is particularly concerned with supporting the writing outcomes of those most at risk of learning difficulties.</p> </aug>
Header DbId: eric
DbLabel: ERIC
An: EJ1302863
AccessLevel: 3
PubType: Academic Journal
PubTypeId: academicJournal
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Statewide Implementation of Automated Writing Evaluation: Analyzing Usage and Associations with State Test Performance in Grades 4-11
– Name: Language
  Label: Language
  Group: Lang
  Data: English
– Name: Author
  Label: Authors
  Group: Au
  Data: <searchLink fieldCode="AR" term="%22Potter%2C+Andrew%22">Potter, Andrew</searchLink> (ORCID <externalLink term="http://orcid.org/0000-0002-1012-2680">0000-0002-1012-2680</externalLink>)<br /><searchLink fieldCode="AR" term="%22Wilson%2C+Joshua%22">Wilson, Joshua</searchLink>
– Name: TitleSource
  Label: Source
  Group: Src
  Data: <searchLink fieldCode="SO" term="%22Educational+Technology+Research+and+Development%22"><i>Educational Technology Research and Development</i></searchLink>. Jun 2021 69(3):1557-1578.
– Name: Avail
  Label: Availability
  Group: Avail
  Data: Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
– Name: PeerReviewed
  Label: Peer Reviewed
  Group: SrcInfo
  Data: Y
– Name: Pages
  Label: Page Count
  Group: Src
  Data: 22
– Name: DatePubCY
  Label: Publication Date
  Group: Date
  Data: 2021
– Name: SourceSuprt
  Label: Sponsoring Agency
  Group: SrcSuprt
  Data: Institute of Education Sciences (ED)
– Name: NumberContract
  Label: Contract Number
  Group: NumCntrct
  Data: R305H170046
– Name: TypeDocument
  Label: Document Type
  Group: TypDoc
  Data: Journal Articles<br />Reports - Research
– Name: Audience
  Label: Education Level
  Group: Audnce
  Data: <searchLink fieldCode="EL" term="%22Elementary+Education%22">Elementary Education</searchLink><br /><searchLink fieldCode="EL" term="%22Secondary+Education%22">Secondary Education</searchLink>
– Name: Subject
  Label: Descriptors
  Group: Su
  Data: <searchLink fieldCode="DE" term="%22Computer+Assisted+Testing%22">Computer Assisted Testing</searchLink><br /><searchLink fieldCode="DE" term="%22Writing+Evaluation%22">Writing Evaluation</searchLink><br /><searchLink fieldCode="DE" term="%22Feedback+%28Response%29%22">Feedback (Response)</searchLink><br /><searchLink fieldCode="DE" term="%22Scoring%22">Scoring</searchLink><br /><searchLink fieldCode="DE" term="%22Revision+%28Written+Composition%29%22">Revision (Written Composition)</searchLink><br /><searchLink fieldCode="DE" term="%22Computer+Software%22">Computer Software</searchLink><br /><searchLink fieldCode="DE" term="%22Elementary+School+Students%22">Elementary School Students</searchLink><br /><searchLink fieldCode="DE" term="%22Secondary+School+Students%22">Secondary School Students</searchLink><br /><searchLink fieldCode="DE" term="%22Writing+Tests%22">Writing Tests</searchLink><br /><searchLink fieldCode="DE" term="%22Peer+Evaluation%22">Peer Evaluation</searchLink><br /><searchLink fieldCode="DE" term="%22State+Programs%22">State Programs</searchLink><br /><searchLink fieldCode="DE" term="%22Program+Implementation%22">Program Implementation</searchLink>
– Name: DOI
  Label: DOI
  Group: ID
  Data: 10.1007/s11423-021-10004-9
– Name: ISSN
  Label: ISSN
  Group: ISSN
  Data: 1042-1629
– Name: Abstract
  Label: Abstract
  Group: Ab
  Data: Automated Writing Evaluation (AWE) provides automatic writing feedback and scoring to support student writing and revising. The purpose of the present study was to analyze a statewide implementation of an AWE software (n = 114,582) in grades 4-11. The goals of the study were to evaluate: (1) to what extent AWE features were used; (2) if equity and access issues influenced AWE usage; and (3) if AWE usage was associated with writing performance on a large-scale state writing assessment. Descriptive statistics and hierarchical linear modeling were used to answer the research questions. Results indicated that the main feature of AWE (i.e., writing and revising essays) was used but some features (peer review and independent lessons) were underutilized. School- and student-level demographic variables explained little variance in AWE usage. AWE usage was statistically and positively associated with performance on a large-scale state writing assessment when controlling for prior performance and demographics. The study presents evidence that AWE can positively influence writing on a distal measure when implemented at scale. Implications for large-scale AWE implementation are discussed.
– Name: AbstractInfo
  Label: Abstractor
  Group: Ab
  Data: As Provided
– Name: CodeSource
  Label: IES Funded
  Group: SrcInfo
  Data: Yes
– Name: DateEntry
  Label: Entry Date
  Group: Date
  Data: 2021
– Name: AN
  Label: Accession Number
  Group: ID
  Data: EJ1302863
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1302863
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1007/s11423-021-10004-9
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 22
        StartPage: 1557
    Subjects:
      – SubjectFull: Computer Assisted Testing
        Type: general
      – SubjectFull: Writing Evaluation
        Type: general
      – SubjectFull: Feedback (Response)
        Type: general
      – SubjectFull: Scoring
        Type: general
      – SubjectFull: Revision (Written Composition)
        Type: general
      – SubjectFull: Computer Software
        Type: general
      – SubjectFull: Elementary School Students
        Type: general
      – SubjectFull: Secondary School Students
        Type: general
      – SubjectFull: Writing Tests
        Type: general
      – SubjectFull: Peer Evaluation
        Type: general
      – SubjectFull: State Programs
        Type: general
      – SubjectFull: Program Implementation
        Type: general
    Titles:
      – TitleFull: Statewide Implementation of Automated Writing Evaluation: Analyzing Usage and Associations with State Test Performance in Grades 4-11
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Potter, Andrew
      – PersonEntity:
          Name:
            NameFull: Wilson, Joshua
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 06
              Type: published
              Y: 2021
          Identifiers:
            – Type: issn-print
              Value: 1042-1629
          Numbering:
            – Type: volume
              Value: 69
            – Type: issue
              Value: 3
          Titles:
            – TitleFull: Educational Technology Research and Development
              Type: main