Mind over Model? Students' Evaluation and Use of ChatGPT-Generated versus Human-Generated Texts
| Title: | Mind over Model? Students' Evaluation and Use of ChatGPT-Generated versus Human-Generated Texts |
|---|---|
| Language: | English |
| Authors: | Natalia Latini, Ivar Bråten (ORCID |
| Source: | Reading Research Quarterly. 2026 61(1). |
| Availability: | Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us |
| Peer Reviewed: | Y |
| Page Count: | 21 |
| Publication Date: | 2026 |
| Document Type: | Journal Articles Reports - Research |
| Education Level: | High Schools Secondary Education |
| Descriptors: | Student Attitudes, Artificial Intelligence, Technology Uses in Education, High School Students, Credibility, Value Judgment, Trust (Psychology) |
| DOI: | 10.1002/rrq.70087 |
| ISSN: | 0034-0553 1936-2722 |
| Abstract: | This experimental study investigated how high-school students judged the credibility of ChatGPT-generated versus human-generated texts on two different topics, as well as how they justified their text credibility judgments and used the texts in a post-reading integrative writing task. Results showed that, across both topics, students judged the texts to be less credible when they were presented as generated by ChatGPT than when they were presented as generated by a human, and they also justified their credibility judgments more by referring to how the texts were generated when they were presented as generated by ChatGPT. However, on the post-reading writing task, students reading ChatGPT-generated texts displayed a more integrated understanding of the texts on the two topics than did students reading human-generated texts. These findings may have not only theoretical but also practical implications, suggesting that although students may put less trust in model-generated texts due to the way they are created, it may be possible to harness their critical stance toward such texts in the service of deeper and more integrated understanding of the issues discussed in the texts. |
| Abstractor: | As Provided |
| Entry Date: | 2026 |
| Accession Number: | EJ1494624 |
| Database: | ERIC |
<title id="AN0191105826-1">Mind Over Model? Students' Evaluation and Use of ChatGPT‐Generated Versus Human‐Generated Texts</title> <p>This experimental study investigated how high‐school students judged the credibility of ChatGPT‐generated versus human‐generated texts on two different topics, as well as how they justified their text credibility judgments and used the texts in a post‐reading integrative writing task. Results showed that, across both topics, students judged the texts to be less credible when they were presented as generated by ChatGPT than when they were presented as generated by a human, and they also justified their credibility judgments more by referring to how the texts were generated when they were presented as generated by ChatGPT. However, on the post‐reading writing task, students reading ChatGPT‐generated texts displayed a more integrated understanding of the texts on the two topics than did students reading human‐generated texts. 
These findings may have not only theoretical but also practical implications, suggesting that although students may put less trust in model‐generated texts due to the way they are created, it may be possible to harness their critical stance toward such texts in the service of deeper and more integrated understanding of the issues discussed in the texts.</p> <p>Keywords: ChatGPT‐generated vs. human‐generated texts; credibility justifications; integrated text understanding; multiple‐document comprehension; text credibility judgments</p> <hd id="AN0191105826-2">Introduction</hd> <p>Although trust ratings seem to have decreased somewhat in recent years, Norway is a country characterized by a high level of trust, including trust in public institutions (OECD [<reflink idref="bib78" id="ref1">78</reflink>]). Until quite recently, trust essentially has been a matter of trust in human‐generated information. In the age of large language models, however, people quite often encounter information generated by artificial intelligence rather than humans, which potentially may have serious consequences for their trust in that information. 
As an illustration, a large municipality in the north of Norway recently presented a report including a knowledge base and a proposal for a new kindergarten and school structure (Tromsø kommune [<reflink idref="bib94" id="ref2">94</reflink>]), with a section of this report including a review of research with 18 references. Before the report was discussed in the municipality council, the report was made available to involved parties. When researchers at the local university discovered that seven of the 18 references included in the report actually did not exist, the municipality council's discussion was postponed and an investigation by an external agency was initiated. This investigation revealed that part of the report had not been written by humans. Instead, the alleged authors had prompted a large language model to write a summary and to include relevant references (PricewaterhouseCoopers [<reflink idref="bib79" id="ref3">79</reflink>]), which raised a number of critical questions regarding the credibility of the administration as well as the content of the report.</p> <p>Although this illustration from the sphere of public administration did not involve any direct comparison of people's trust in human‐generated and model‐generated information, it may suggest a preference for the former. Such comparisons seem highly pertinent in the current information landscape, however, given the co‐existence of human‐generated and model‐generated information on almost any issue. Focusing on an educational context, this experimental study uniquely contributes to building a knowledge base in this area by comparing how students judge the credibility of textual information generated by a human versus by a large language model, as well as how they justify their text credibility judgments and subsequently use the information in an integrative writing task. 
Such knowledge is important not only because it can inform about students' trust in different texts generated and used within an educational context, but also because it may signal how the influx of model‐generated textual information might influence the level of trust in society at large. In the following, we describe the theoretical and empirical background of our study, before we specify the research questions and hypotheses that guided our work.</p> <hd id="AN0191105826-3">Theoretical and Empirical Background</hd> <p></p> <hd id="AN0191105826-4">The Nature of Model‐Generated Text</hd> <p>Since the Chat Generative Pretrained Transformer (ChatGPT) was introduced by OpenAI in November 2022, research on the application of this large language model (LLM) has emerged as a major topic within education (Obreja et al. [<reflink idref="bib77" id="ref4">77</reflink>]). Essentially, LLMs such as ChatGPT are advanced text‐prediction machines programmed to generate human‐like responses to prompts provided by humans (Dennis [<reflink idref="bib31" id="ref5">31</reflink>]; Mittelstadt et al. [<reflink idref="bib72" id="ref6">72</reflink>]; Shanahan [<reflink idref="bib88" id="ref7">88</reflink>]). To achieve this, an LLM draws on an enormous amount of publicly available text that typically resides on the internet. From this textual corpus, it creates a massive statistical model that is used for probability calculation in predicting the next word or phrase from already generated text initiated by a human prompt. Beyond feeding the model with an enormous textual dataset, designers can train it by providing positive and negative feedback on its attempts to predict the most likely string of text to follow given a particular context (Cao et al. [<reflink idref="bib24" id="ref8">24</reflink>]). 
Such pretraining is typically followed by fine‐tuning to enhance model performance on particular tasks, and interactions with users through cycles of queries and responses can further refine the model and help it acquire statistical information about language structures and patterns that can be used for subsequent text prediction in response to a prompt. Of note is that users need to interact with an LLM via clear, goal‐focused, and context‐rich prompts in order to obtain task‐relevant output from the model (Eager and Brunton [<reflink idref="bib35" id="ref9">35</reflink>]; Federiakin et al. [<reflink idref="bib37" id="ref10">37</reflink>]).</p> <p>However, high‐quality prompting of LLMs does not necessarily result in high‐quality textual outputs. To the contrary, although such models have the capacity to generate well‐formulated, well‐organized, and informative outputs that can be mistaken for human‐made texts, their use also involves a number of epistemic risks. These risks include the danger of receiving directly erroneous information as well as more subtle inaccuracies, oversimplifications, and biased responses presented as truth in a confident way (Mittelstadt et al. [<reflink idref="bib72" id="ref11">72</reflink>]; Mitchell [<reflink idref="bib71" id="ref12">71</reflink>]). Because LLMs such as ChatGPT essentially are indifferent to the truth of the textual information that they generate in response to a prompt, Hicks et al. ([<reflink idref="bib46" id="ref13">46</reflink>]) have argued that they, following Frankfurt ([<reflink idref="bib39" id="ref14">39</reflink>]), should be characterized as "bullshitters." 
That is, although people may anthropomorphize LLMs and sometimes treat them as human truth‐tellers (Mitchell [<reflink idref="bib70" id="ref15">70</reflink>], [<reflink idref="bib71" id="ref16">71</reflink>]), such models are not designed to provide truthful information about the natural and social world but, rather, to generate human‐like linguistic responses by operating on already available linguistic information. In contrast, humans are embodied, biological creatures that also have extra‐linguistic contact with reality and, in general, are concerned with representing this reality in an accurate way when producing linguistic output (Bayne and Williams [<reflink idref="bib7" id="ref17">7</reflink>]; Chemero [<reflink idref="bib26" id="ref18">26</reflink>]; Glenberg and Jones [<reflink idref="bib41" id="ref19">41</reflink>]; Hicks et al. [<reflink idref="bib46" id="ref20">46</reflink>]). Needless to say, the nature and epistemic risks of LLMs call for epistemic vigilance and knowledge in critically evaluating textual information generated by LLMs, for example, by corroborating this information by checking other sources for consistency or inconsistency (Caulfield and Wineburg [<reflink idref="bib25" id="ref21">25</reflink>]).</p> <hd id="AN0191105826-5">LLMs at School</hd> <p>Within education, LLMs have been identified as a game changer with the potential to transform literacy and learning (Kalantzis and Cope [<reflink idref="bib50" id="ref22">50</reflink>]; McCarthy and Yan [<reflink idref="bib66" id="ref23">66</reflink>]; Robinson and Hollett [<reflink idref="bib83" id="ref24">83</reflink>]; Xing et al. [<reflink idref="bib97" id="ref25">97</reflink>]). For example, in a much‐cited paper, Kasneci et al. ([<reflink idref="bib52" id="ref26">52</reflink>]) argued that these models may create new opportunities for students by supporting their development of language, reading, writing, critical thinking, and problem‐solving skills. 
In addition, these authors argued that LLMs may assist teachers with a number of instructional tasks ranging from lesson planning to assessment. However, Kasneci et al. ([<reflink idref="bib52" id="ref27">52</reflink>]) also warned that students may come to rely too much on LLMs and sometimes even present model‐generated textual output as their own. Accordingly, Stadler et al. ([<reflink idref="bib89" id="ref28">89</reflink>]) recently showed that students' reliance on LLMs during learning actually can be counterproductive because it may reduce cognitive engagement and processing and, in turn, lead to lower‐quality learning outcomes. In the same vein, Kosmyna et al. ([<reflink idref="bib59" id="ref29">59</reflink>]) found that using ChatGPT in the educational context of writing an essay may have cognitive costs that negatively impact students' performance, with students using ChatGPT for essay writing paying a price in terms of less engagement indicated by brain activity and poorer performance indicated by behavioral data when later writing an essay without ChatGPT available.</p> <p>Although overreliance on LLMs undoubtedly may be a key educational challenge, recent meta‐analyses (e.g., Deng et al. [<reflink idref="bib30" id="ref30">30</reflink>]; Laun and Wolff [<reflink idref="bib61" id="ref31">61</reflink>]) have suggested that the use of ChatGPT still may have positive effects on student learning. A caveat is, however, that these meta‐analyses typically are based on a relatively small number of primary studies that compare learning with ChatGPT with control conditions. As argued by Weidlich et al. ([<reflink idref="bib96" id="ref32">96</reflink>]), these meta‐analyses, more often than not, are hampered by serious methodological limitations. Such limitations concern the nature of the experimental treatment, the control conditions, and the validity of the learning outcome measures (Weidlich et al. [<reflink idref="bib96" id="ref33">96</reflink>]). 
Interestingly, Bauer et al. ([<reflink idref="bib6" id="ref34">6</reflink>]) recently suggested that students' AI literacy may be an underexplored yet important moderator for the effects of using ChatGPT on students' learning.</p> <hd id="AN0191105826-6">The Need for AI Literacy</hd> <p>In the intriguing yet challenging literacy and learning landscape introduced by LLMs, many researchers and educators have become concerned with how students can be taught AI literacy (e.g., Allen and Kendeou [<reflink idref="bib1" id="ref35">1</reflink>]; Almatrafi et al. [<reflink idref="bib3" id="ref36">3</reflink>]; Chiu et al. [<reflink idref="bib27" id="ref37">27</reflink>]; Long and Magerko [<reflink idref="bib64" id="ref38">64</reflink>]; McCarthy and Yan [<reflink idref="bib66" id="ref39">66</reflink>]; Zhong and Liu [<reflink idref="bib99" id="ref40">99</reflink>]). In an influential paper, Long and Magerko ([<reflink idref="bib64" id="ref41">64</reflink>], p.2) defined AI literacy as "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." In educational contexts, different conceptualizations have highlighted that AI literacy includes a foundational understanding of the potentials and limitations of LLMs such as ChatGPT, as a basis for critically evaluating and using model‐generated information (e.g., Allen and Kendeou [<reflink idref="bib1" id="ref42">1</reflink>]; Almatrafi et al. [<reflink idref="bib3" id="ref43">3</reflink>]). Presumably, more critical evaluation and use of model‐generated information also may serve deeper learning rather than the opposite (i.e., a lack of engagement that may undermine deeper learning; Bauer et al. 
[<reflink idref="bib6" id="ref44">6</reflink>]).</p> <p>Of note is that AI literacy can be considered an important form of critical literacy within postdigital practices, that is, practices in which the use of digital technologies has become interwoven into everyday human life (Rowsell [<reflink idref="bib84" id="ref45">84</reflink>]). Within such practices, AI literacy can involve a critical stance toward AI technologies or platforms such as LLMs that may help students raise issues not only about the outcomes of such models but also about their blackboxed underlying architecture (e.g., algorithmic processes) and political‐economic dimension (e.g., business interests) (Nichols et al. [<reflink idref="bib76" id="ref46">76</reflink>]).</p> <p>It could be argued, however, that a better understanding of how students actually evaluate and use LLM‐generated compared to human‐generated information is needed to ground viable instructional programs targeting AI literacy. By comparing students' credibility judgments, credibility justifications (i.e., how they justify their credibility judgments), and use of identical ChatGPT‐generated and human‐generated textual information, this experimental study has the potential to contribute uniquely in this regard.</p> <hd id="AN0191105826-7">Students' Perceptions and Evaluations of Generative AI Tools</hd> <p>Prior research on how students perceive and evaluate generative AI tools, especially LLMs such as ChatGPT, has mainly used quantitative survey methods (e.g., Dietrich and Grassini [<reflink idref="bib32" id="ref47">32</reflink>]; Gruenhagen et al. [<reflink idref="bib42" id="ref48">42</reflink>]; Kamoun et al. [<reflink idref="bib51" id="ref49">51</reflink>]; Moghavvemi and Jam [<reflink idref="bib73" id="ref50">73</reflink>]; Morrell‐Mengual et al. [<reflink idref="bib74" id="ref51">74</reflink>]; Nazaretsky et al. [<reflink idref="bib75" id="ref52">75</reflink>]; Ravselj et al. 
[<reflink idref="bib80" id="ref53">80</reflink>]; Stojanov et al. [<reflink idref="bib90" id="ref54">90</reflink>]; Suriano et al. [<reflink idref="bib91" id="ref55">91</reflink>]; Yu et al. [<reflink idref="bib98" id="ref56">98</reflink>]; Zhong and Liu [<reflink idref="bib99" id="ref57">99</reflink>]). For example, Stojanov et al. ([<reflink idref="bib90" id="ref58">90</reflink>]) identified five profile groups based on university students' self‐reports of their reliance on ChatGPT for different learning tasks. Across these five groups, students scored relatively high on a self‐report measure focusing on critical use of ChatGPT, including critical evaluation of ChatGPT‐generated information and cross‐checking of ChatGPT‐generated information against other sources. Interestingly, the 23% of the students who belonged to a group that relied heavily on ChatGPT for completing written assignments also scored lowest with respect to self‐reported critical use of the model. Gruenhagen et al. ([<reflink idref="bib42" id="ref59">42</reflink>]) also found that a substantial proportion (36%) of their participating university students reportedly had used a chatbot for completing assessment tasks. Although these students did not necessarily perceive the information provided by the chatbot to be credible, they evaluated the information somewhat more positively than did students who had not used a chatbot to complete assessment tasks. As a final example, when Nazaretsky et al. ([<reflink idref="bib75" id="ref60">75</reflink>]) asked university students about their trust in AI‐based tools such as ChatGPT, students' trust ratings were generally moderate, with very few wanting to rely more on the recommendations of such tools than on the recommendations of fellow learners or teachers, in particular. 
Several other quantitative survey studies with university‐level participants also have found that students may report considerable mistrust in content generated by AI tools, in particular, by ChatGPT (e.g., Dietrich and Grassini [<reflink idref="bib32" id="ref61">32</reflink>]; Kamoun et al. [<reflink idref="bib51" id="ref62">51</reflink>]; Morrell‐Mengual et al. [<reflink idref="bib74" id="ref63">74</reflink>]; Yu et al. [<reflink idref="bib98" id="ref64">98</reflink>]).</p> <p>Some relevant qualitative work in this area also highlights that students may display a sound skepticism toward ChatGPT‐generated output. For example, Sedlbauer et al. ([<reflink idref="bib87" id="ref65">87</reflink>]) had college students work on an essay assignment with the help of ChatGPT and reflect on their experience of using ChatGPT in writing. Analysis of students' written responses showed that they considered it challenging to identify and evaluate the sources that ChatGPT drew on in generating its output. However, students also believed that the need to critically evaluate ChatGPT's imperfect output might be beneficial because it could promote their critical thinking. In another qualitative study, which combined questionnaires with open‐ended questions and focus group interviews, Higgs and Stornaiuolo ([<reflink idref="bib47" id="ref66">47</reflink>]) found that high‐school students were concerned that ChatGPT might be a threat to human creativity and that model‐generated texts might lack authenticity and reproduce biases present in the dataset used for model training.</p> <p>Although prior research has provided much valuable information about students' perceptions and evaluations of information generated by LLMs such as ChatGPT, there is a conspicuous lack of experimental work comparing students' evaluations of model‐ and human‐generated information while controlling for other factors. 
However, in an experiment conducted before ChatGPT was launched, Lermann Henestrosa and Kimmerle ([<reflink idref="bib62" id="ref67">62</reflink>]) presented evidence suggesting that a popular science text may be perceived as less credible when presented as AI‐generated than when presented as authored by a journalist. Based on prior research on human‐generated texts, it seems important that further experimental work in this area include comparisons of not only text credibility but also of credibility justifications and how texts are used in post‐reading tasks, as we did in the current study.</p> <hd id="AN0191105826-8">Focus on Credibility Judgments, Credibility Justifications, and Text Integration</hd> <p>Within literacy research, readers' judgments of text credibility and their justifications for their judgments are considered aspects of source evaluation (Bråten et al. [<reflink idref="bib18" id="ref68">18</reflink>], [<reflink idref="bib15" id="ref69">15</reflink>]). In the last decades, source evaluation has come to be considered a hallmark of advanced literacy skills because the textual information readers encounter on a daily basis can vary vastly with respect to its credibility, for example, dependent on who generated the texts and for which purpose they were generated. While students' credibility judgments concern both the confirmation of more credible texts and the questioning of less credible texts (Kiili et al. [<reflink idref="bib55" id="ref70">55</reflink>], [<reflink idref="bib56" id="ref71">56</reflink>]), justifications for text credibility judgments may be based on criteria ranging from personal opinion to who authored the text and the quality of the arguments (e.g., Braasch et al. [<reflink idref="bib11" id="ref72">11</reflink>]; Bråten et al. [<reflink idref="bib21" id="ref73">21</reflink>]; Britt and Aglinskas [<reflink idref="bib22" id="ref74">22</reflink>]; Kiili et al. 
[<reflink idref="bib54" id="ref75">54</reflink>], [<reflink idref="bib57" id="ref76">57</reflink>]). As shown by Thomm and Bromme ([<reflink idref="bib93" id="ref77">93</reflink>]), including references in texts also may increase the perceived credibility of the texts because it makes texts seem more scientific. Accordingly, Bråten et al. ([<reflink idref="bib19" id="ref78">19</reflink>]) highlighted that the attribution of textual information to the sources it actually draws on is regarded as essential in scholarly work across domains and disciplines.</p> <p>The extent to which students trust the information they read and their reasons for trusting (or mistrusting) it may not only have consequences for their decision‐making and behavior but also for the way they use that information in subsequent tasks. In particular, many studies have indicated that both students' text credibility judgments and their justifications for those judgments may influence their integration of information within and across texts (for reviews, see Bråten et al. [<reflink idref="bib15" id="ref79">15</reflink>]; McCrudden et al. [<reflink idref="bib67" id="ref80">67</reflink>]). For example, Bråten et al. ([<reflink idref="bib20" id="ref81">20</reflink>]) showed that students' trust in textual information and the criteria they use in justifying their trust may be independent predictors of integrated understanding within and across multiple texts. Because the line of research described in this section has been limited to human‐generated texts, however, it seems essential that extant work on credibility judgment, credibility justification, and text integration is extended to include texts generated by LLMs, which is needed to better understand the potential implications of reading such texts both in and out of school. 
Of note is that this study is among the very first to broaden the research agenda on multiple‐document comprehension with a focus on source evaluation and content integration (e.g., Barzilai and Chinn [<reflink idref="bib5" id="ref82">5</reflink>]; Bråten et al. [<reflink idref="bib16" id="ref83">16</reflink>]; Christhilf et al. [<reflink idref="bib28" id="ref84">28</reflink>]; Macedo‐Rouet et al. [<reflink idref="bib65" id="ref85">65</reflink>]; Savvidou et al. [<reflink idref="bib85" id="ref86">85</reflink>])—which is a particularly vibrant field within digital reading (Coiro [<reflink idref="bib29" id="ref87">29</reflink>])—to involve both ChatGPT‐ and human‐generated texts.</p> <hd id="AN0191105826-9">The Present Study</hd> <p>Given the preceding background analysis, the main purpose of this experimental study was to gain a better understanding of whether and how students' reading of ChatGPT‐generated texts might affect their evaluation of the texts' credibility as well as their ability to integrate information across texts, as compared with the reading of human‐generated texts. We therefore had Norwegian high‐school students read expository texts on two different topics, with participants randomly assigned to read the texts as either generated by ChatGPT or by a newspaper journalist. Because we also were interested in whether any effects of the texts' origin, that is, as ChatGPT‐generated versus human‐generated, might be moderated by the inclusion of a reference list at the end of the texts, the texts that participants read were presented either with or without a list of relevant references at the end of the texts. The reason that we focused on ChatGPT rather than on other LLMs or generative AI tools more generally was the extreme popularity of this tool (Duarte [<reflink idref="bib34" id="ref88">34</reflink>]), as well as the fact that ChatGPT very clearly stands out among the AI tools reportedly used by Norwegian students (Bjaaland et al. 
[<reflink idref="bib8" id="ref89">8</reflink>]).</p> <p>Specifically, the following questions guided our research:</p> <p></p> <ulist> <item> Do students' judgments of the texts' credibility differ when they read ChatGPT‐generated versus human‐generated texts, and, if so, is this difference moderated by the inclusion of a list of relevant references at the end of the texts?</item> <p></p> <item> Do students' justifications for their credibility judgments differ when they read ChatGPT‐generated versus human‐generated texts and when they read texts with or without a reference list?</item> <p></p> <item> Does students' integrated understanding of different texts differ when they read ChatGPT‐generated versus human‐generated texts?</item> </ulist> <p>Regarding the first research question, we expected that participants reading ChatGPT‐generated texts would trust the content of the texts less than would participants reading exactly the same texts presented as generated by a human. Given the epistemic risks that have been associated with texts generated by LLMs such as ChatGPT (e.g., Mittelstadt et al. [<reflink idref="bib72" id="ref90">72</reflink>]; Mitchell [<reflink idref="bib71" id="ref91">71</reflink>]) and the description of such risks in popular media (e.g., Heikkilä [<reflink idref="bib45" id="ref92">45</reflink>]; Milmo [<reflink idref="bib69" id="ref93">69</reflink>]; Zilber [<reflink idref="bib100" id="ref94">100</reflink>]), as well as empirical research indicating that many students report considerable mistrust in content generated by LLMs (e.g., Gruenhagen et al. [<reflink idref="bib42" id="ref95">42</reflink>]; Morrell‐Mengual et al. [<reflink idref="bib74" id="ref96">74</reflink>]; Nazaretsky et al. [<reflink idref="bib75" id="ref97">75</reflink>]), it seems likely that students will trust ChatGPT‐generated texts less than texts generated by a journalist in a well‐known mainstream newspaper. 
However, because one of the main issues with model‐generated output has been the lack of transparency with respect to the underlying sources it draws on (Bråten et al. [<reflink idref="bib15" id="ref98">15</reflink>]), and because the inclusion of references can be regarded as a hallmark of academic prose that signals credibility (Bråten et al. [<reflink idref="bib19" id="ref99">19</reflink>]; Thomm and Bromme [<reflink idref="bib93" id="ref100">93</reflink>]), we considered it likely that being presented with a list of references would increase participants' trust in the model‐generated textual content, in particular, and thus reduce (if not eliminate) the expected difference in text credibility judgments.</p> <p>Regarding the second research question, we expected that participants would justify their credibility judgments more by referring to the sources of the texts, or to how the texts were generated, when the texts were presented as generated by ChatGPT than when they were generated by a human. Crucial to this assumption is the possibility that the texts' origin might be more salient for the students when the texts were generated by a large language model than when they were generated by a journalist, simply because the latter source might be more conventional and also more familiar to the students. Further, because the presence (or absence) of references might be more important for judging the credibility of ChatGPT‐generated texts than for judging the credibility of texts written by a serious journalist, who by default can be expected to build on and check relevant references as part of the writing and publishing process, we considered it likely that participants who were presented with ChatGPT‐generated texts also would justify their text credibility judgments more in terms of the presence (or absence) of references than would participants presented with human‐generated texts. 
Also, we expected that justifying text credibility judgments in terms of (existing) references would be more likely when participants actually were presented with texts that included lists of references than when the texts they read did not include any reference lists, with increased attention to references cued by the reference lists themselves in the former condition.</p> <p>Regarding the third research question, we adopted two alternative working hypotheses. On the one hand, we entertained the possibility that participants reading human‐generated texts would outperform participants reading ChatGPT‐generated texts with respect to integrated (cross‐text) understanding, as assessed with a post‐reading integrative writing task. This alternative is consistent with the idea that when readers consider the texts they read to be more credible, they would also be more likely to draw on and integrate the content of those texts in post‐reading tasks (Bråten et al. [<reflink idref="bib20" id="ref101">20</reflink>]; Richter and Maier [<reflink idref="bib81" id="ref102">81</reflink>]). On the other hand, it is possible that participants reading ChatGPT‐generated texts would perform better with respect to integrated text understanding because they rely more on relevant justification criteria (e.g., how the texts were generated) in grounding their text credibility judgments. This possibility is consistent with the idea that what readers find credible and which criteria they use to justify credibility may be independent predictors of integrated text understanding, and that readers may take the two aspects of source information separately into account in comprehension tasks (Bråten et al. [<reflink idref="bib20" id="ref103">20</reflink>]). 
Essentially, this assumption implies that justification by relevant criteria represents a higher level of sourcing skills that may contribute to integrative processing and understanding of different textual resources in a way that overrides the potential contribution of readers' credibility judgments (Braasch and Bråten [<reflink idref="bib10" id="ref104">10</reflink>]; Bråten et al. [<reflink idref="bib20" id="ref105">20</reflink>]; Incognito and Tarchi [<reflink idref="bib48" id="ref106">48</reflink>]).</p> <p>Because our focus in this study was on ChatGPT‐generated and human‐generated texts with and without reference lists rather than on individual differences, we wanted to control for the potential impact of participants' reading comprehension skills, prior beliefs about the topics discussed in the texts, and prior knowledge about these topics. These variables were regarded as relevant because they may be associated with readers' credibility judgments and justifications as well as with their integrated text understanding (e.g., Bråten et al. [<reflink idref="bib20" id="ref107">20</reflink>], [<reflink idref="bib21" id="ref108">21</reflink>]; Kiili et al. [<reflink idref="bib54" id="ref109">54</reflink>], [<reflink idref="bib57" id="ref110">57</reflink>]; Richter et al. [<reflink idref="bib82" id="ref111">82</reflink>]; Richter and Maier [<reflink idref="bib81" id="ref112">81</reflink>]). In addition, we controlled for participants' algorithmic literacy, that is, their general knowledge and awareness of algorithms as embedded in online environments, because this construct has been considered important for adaptive use of and critical reflection on models and systems that rely on algorithms (Dogruel et al. 
[<reflink idref="bib33" id="ref113">33</reflink>]), which include large language models.</p> <hd id="AN0191105826-10">Method</hd> <p></p> <hd id="AN0191105826-11">Participants and Context</hd> <p>Participants were 267 second‐year high‐school students from 13 classes at two schools in a large city in southeast Norway. All participants completed college‐preparatory courses. They had an overall mean age of 16.79 years (SD = 0.51) and 52.8% identified as female, 46.4% as male, and 0.7% as other. Most participants (60.7%) had Norwegian as their sole language background. Although 30.0% had a mixed language background (i.e., Norwegian and another language) and 9.4% had a non‐Norwegian language background, all participants were proficient in Norwegian at the time of data collection.</p> <p>The municipality in which this study was conducted relatively quickly developed an adaptation of ChatGPT for schools that adhered to the General Data Protection Regulation of the European Union in protecting students' privacy and personal data (Elstad and Eriksen [<reflink idref="bib36" id="ref114">36</reflink>]). This model initially was based on GPT‐3.5 Turbo but was later updated to GPT‐4o mini by Azure OpenAI. At the time of data collection, the municipality‐specific ChatGPT version had been available to the participants for approximately 1 year, while the commercial version had been available for about 2 years. At the participating schools, some of the teachers had not initiated any pedagogical use of an LLM at the time of data collection, whereas others had started to implement ChatGPT in their class instruction, also informing students about potential limitations and biases of model responses. However, the vast majority of the participants (94%) reported that they had used ChatGPT for schoolwork, with most of them using it daily (15.1%) or several times a week (42.6%). Only 6.4% reportedly used ChatGPT for schoolwork less often than monthly. 
With respect to leisure time use, the majority (80.5%) reportedly used ChatGPT in their leisure time, with most of them using it weekly (25.6%) or monthly (48.4%).</p> <p>Participation in the study was voluntary and anonymous and all participants signed an informed consent form. For their participation, each class received gift cards worth approximately USD 200 that they could use for some class activity. The procedures for collecting and handling the data were reviewed and approved by the Norwegian Social Science Data Services.</p> <hd id="AN0191105826-12">Materials</hd> <p></p> <hd id="AN0191105826-13">Texts and Experimental Manipulations</hd> <p>All participants read two separate expository texts, one discussing emotions in animals and the other discussing the use of animals in medical research. Each text consisted of 500 words, and based on Björnsson's ([<reflink idref="bib9" id="ref115">9</reflink>]) formula, the readability estimates (viz., 49 and 51) indicated that their difficulty level was comparable to the difficulty level of information texts from the Norwegian government (Latini et al. [<reflink idref="bib60" id="ref116">60</reflink>]). The texts were slightly adapted to ensure similar length and difficulty level across the two texts.</p> <p>The text on animal emotions consisted of an introduction to the topic followed by four paragraphs. The first paragraph presented scientific evidence for animal emotions by referring to behavioral observations and brain research. The second paragraph described the difference between primary and secondary emotions, especially discussing to what extent animals might experience the latter, and the third paragraph discussed ethical implications based on the understanding that animals have emotions, for example, related to animal welfare and rights. 
In the final paragraph, it was concluded that increasing scientific support for the existence of complex emotions in animals requires a reconsideration of how animals are treated.</p> <p>Similarly, the text on animals in medical research consisted of an introduction to the topic followed by four paragraphs. The first paragraph of this text described how research on animals has contributed to the treatment of different diseases, and the second paragraph explained how such research has been important for testing new medicines and treatments before approving them for humans. The third paragraph addressed the lack of adequate alternatives to animal research, yet presented some new, promising approaches such as data modeling and organoids (i.e., mini‐organs derived from stem cells). Finally, the fourth paragraph discussed ethical and future perspectives, concluding that, despite the desire to reduce and replace the use of animals in medical research, it is still necessary to save human lives.</p> <p>Source information was manipulated between participants, such that for half of the participants (randomly assigned) the two texts were presented as generated by ChatGPT based on a specific prompt, whereas for the other half the same two texts were presented as generated by a journalist in a well‐known, mainstream Norwegian newspaper (Aftenposten). When presenting the texts as ChatGPT‐generated, we used a mock‐up simulating the ChatGPT interface, displaying all the symbols included in an original ChatGPT interface (e.g., a "share" symbol) in addition to the specific prompt given to the model. The date of text generation was said to be September 2024 for both texts. When presenting the texts as human‐generated, we used a mock‐up simulating the interface of the online version of the newspaper, as displayed when a particular article has been selected. 
Thus, all symbols included in an original newspaper interface were displayed (e.g., a drop‐down menu symbol), and the name of the newspaper was shown with its original logo. After the title of the text, the name of the journalist appeared together with the publication date. The journalists said to have authored the two texts both had common Norwegian names and both texts were said to have been published in September 2024. Of note is that the mock‐ups of the two source conditions (i.e., ChatGPT vs. human‐generated texts) did not include any advertisements or illustrations. Further, both mock‐ups were static visual representations (i.e., non‐clickable). English translations of both texts are included in Appendix A in the Supporting Information, with the animal emotions text presented as ChatGPT‐generated and the animal research text presented as human‐generated (as described above, both texts were presented as either ChatGPT‐generated or human‐generated to the participants).</p> <p>In addition to source information, we manipulated whether the two texts included lists of references, such that for half of the participants (randomly assigned) a list of six relevant references was included at the end of each text (i.e., after the fourth paragraph), whereas for the other half no list of references was included. The lists included real authors, books, and publishers, with the book titles clearly signaling the books' relevance to the two topics. For example, the reference list for the animal emotions text included book titles such as "The emotional lives of animals: A leading scientist explores animal joy, sorrow, and empathy—and why they matter" and "When elephants weep: The emotional lives of animals." The reference list for the animal research topic included titles such as "The ethics of animal research" and "The principles of humane experimental technique." 
The reference lists used for both texts in the references condition are provided in Appendix B in the Supporting Information.</p> <hd id="AN0191105826-14">Outcome Measures</hd> <p>In the following sections, we present the outcome measures of text credibility, credibility justifications (both open‐ and close‐ended), and integrated text understanding. The measures of text credibility and credibility justifications were first completed for the text participants read first and then for the text they read second.</p> <hd id="AN0191105826-15">Text Credibility</hd> <p>We measured participants' judgments of text credibility by asking them to what extent they trusted the content of each of the two texts they had read. Participants responded to this question by using a 10‐point Likert‐type scale ranging from 1 (to a very low degree) to 10 (to a very high degree). Before responding to this question for the first text, participants were presented with the title of this text, as well as source information and a one‐sentence summary of the text content (in the ChatGPT‐generated condition, the prompt was also included). Similarly, before responding to this question for the second text, participants were presented with the title of the text, as well as source information and a one‐sentence summary of the text content (the order of the texts was counterbalanced across participants; see the Procedure section). The repetition of this information about each text was meant to help participants distinguish between the two texts when making their credibility judgments. The use of a single item to measure text credibility for each text was based on the assumption that one item would be sufficient to capture this simple and unitary construct (cf. Allen et al. 
[<reflink idref="bib2" id="ref117">2</reflink>]).</p> <hd id="AN0191105826-16">Credibility Justifications—Open‐Ended</hd> <p>To assess participants' justifications for their credibility judgments, we first asked them to justify their credibility judgment of each text. Thus, after having rated to what extent they trusted the first text, they were asked to briefly justify their rating of this text in a dedicated textbox, and after having rated to what extent they trusted the second text, they were asked to briefly justify their rating of the text in another dedicated textbox.</p> <p>The coding of participants' written responses to the open‐ended justification questions was informed by a coding system developed and validated by Braasch et al. ([<reflink idref="bib11" id="ref118">11</reflink>]) and Bråten et al. ([<reflink idref="bib13" id="ref119">13</reflink>]), yet grounded in the particular data set that was analyzed. This means that although codes suggested by prior research formed an interpretative backdrop, the coders were open to the data in a way that brought nuances to these codes. The coding process resulted in four different codes that were common for both texts and all experimental conditions.</p> <p>The category of personal justification referred to personal opinions, experiences, prior knowledge, or interests as a basis for judging the credibility of the text content, such as references to something previously heard or observations of emotions in their own animals. The category of content justification referred to the information provided in the text and the way it was presented, for example, to the textual arguments and the text structure. The category of reference justification mentioned the presence (or absence) of references as the basis for judging text credibility. 
Finally, source justification explicitly mentioned that the text was generated by ChatGPT or Aftenposten as a basis for the credibility judgment, or unambiguously just referred to "the source" of the text. This category also referred to the sources by mentioning "the journalist," "the newspaper," "AI," "language models," and that the text was "not written by a human." The four categories of credibility justification resulting from the open‐ended questions are further described and exemplified in Appendix C in the Supporting Information.</p> <p>First, blind to experimental conditions, the three authors coded the written responses of 28 participants to the two open‐ended questions (one for each text) collaboratively in developing the coding system. Next, a random selection of 54 participants' written responses to both questions was coded independently by two coders to establish intercoder reliability. The independent coding resulted in a Cohen's <emph>κ</emph> of 0.86 for the topic of animal emotions; for the topic of animal research, it was 0.90. All disagreements were resolved through discussion between the coders and the remaining participants' responses were divided between the three authors and coded separately.</p> <hd id="AN0191105826-17">Credibility Justifications—Close‐Ended</hd> <p>After having justified their credibility judgments for both texts by responding to the open‐ended questions, participants were asked to rate to what extent they based their credibility judgments on each of nine different justification criteria: when the text was generated, how the text was generated, the quality of the arguments, the way the text was written (writing style), how objective (unbiased) the text was, own opinion about the text content, if someone has checked the quality of the content, what other sources say about the issue, and what I already know about the topic. 
These justification criteria were also based on prior research on how adolescent and young adult readers justify their judgments of text credibility (e.g., Braasch et al. [<reflink idref="bib11" id="ref120">11</reflink>]; Bråten et al. [<reflink idref="bib21" id="ref121">21</reflink>]; Britt and Aglinskas [<reflink idref="bib22" id="ref122">22</reflink>]; Kiili et al. [<reflink idref="bib54" id="ref123">54</reflink>], [<reflink idref="bib57" id="ref124">57</reflink>]).</p> <p>Before participants rated the nine justification criteria for the first text, they were again presented with the title of this text, as well as source information and a one‐sentence summary of the text content (in the ChatGPT‐generated condition, the prompt was also included). Similarly, before rating the nine justification criteria for the second text, they were presented with the title of the text, as well as source information and a one‐sentence summary of the text content. For each text, participants rated to what extent they had based their judgment of text credibility on each of the nine justification criteria by using a 10‐point Likert‐type scale ranging from 1 (to a very low degree) to 10 (to a very high degree). The internal consistency reliability (Cronbach's <emph>α</emph>) for participants' scores on the nine items concerning the animal emotions text was 0.70; for the nine items concerning the animal research text, it was 0.67.</p> <hd id="AN0191105826-18">Integrated Text Understanding</hd> <p>We assessed participants' integrated understanding of the two texts on animal emotions and the use of animals in medical research by asking them to discuss the claim that knowledge about animal emotions should reduce the use of animals in medical research (see the Procedure section for the complete writing prompt). For six main ideas that were presented in the two texts, three in each text, participants were awarded 0–2 points. 
For one additional main idea that was presented in the animal research text, participants were awarded 0–1 point. A score of 0 was given if a main idea (e.g., that animal emotions can be divided into different types) was not represented in the written report, a score of 1 was given if the main idea was represented but not elaborated, and a score of 2 was given if the main idea was both represented and elaborated (e.g., by describing primary and (potential) secondary emotions in animals). The reason that a score of 0–1 was awarded for one main idea presented in the animal research text is that no elaboration of that main idea was identified in the text (this idea involved an ethical perspective on reducing and replacing the use of animals in medical research). The potential maximum score for representing and elaborating the main ideas in the two texts was 13. Further description and exemplification of the coding of main idea representation and elaboration are included in Appendix D in the Supporting Information.</p> <p>Because we also wanted to assess to what extent participants integrated ideas across the two texts, we counted the number of switches between main ideas (with or without elaborations) from the two texts in participants' written responses as an indication of their cross‐text integration. For example, if a written response included five main ideas and all came from one of the texts, this would give a score of 0, but if the first two ideas came from one of the texts and the last three ideas came from the other text, this would give a score of 1. If a written response included five ideas and the first two ideas came from text 1, the third one from text 2, and the last two from text 1, this would count as two switches and give a score of 2. 
Of note is that counting the number of switches between ideas from different texts in students' written products has been used and validated as an indication of cross‐text integration in a range of multiple‐text comprehension studies (e.g., Bråten et al. [<reflink idref="bib12" id="ref125">12</reflink>], [<reflink idref="bib13" id="ref126">13</reflink>]; Britt and Sommer [<reflink idref="bib23" id="ref127">23</reflink>]; Gil et al. [<reflink idref="bib40" id="ref128">40</reflink>]).</p> <p>We computed each participant's total text integration score by multiplying the score the participant obtained from representing and elaborating main ideas from the texts by the number of switches in the participant's written response. Following Andresen et al. ([<reflink idref="bib4" id="ref129">4</reflink>]), we thus considered both the coverage of the ideas included in the texts and the integration of those ideas when measuring text integration, putting particular emphasis on participants' efforts to integrate information across texts. Higher total scores on this measure can be considered to represent an elaborated and integrated understanding of the content of the two texts. To prevent any participant from obtaining a total score of 0 despite representing main ideas (i.e., because they had no switches), we added a constant of 1 to the number of switches before computing the total score as described above. Only participants' total scores were used in subsequent statistical analyses.</p> <p>First, blind to experimental conditions, the three authors scored the responses of 27 participants to the writing prompt collaboratively to develop the scoring system, identifying the main ideas and elaborations in participants' responses as well as the number of switches between ideas from different texts. Next, a random selection of 54 participants' written responses was coded independently by two coders to establish intercoder reliability. 
The independent coding resulted in a high intercoder consistency with respect to the total scores, with Pearson's <emph>r</emph> = 0.88. All disagreements were resolved through discussion between the coders, and the remaining participants' responses were divided between the three authors who coded them separately.</p> <hd id="AN0191105826-19">Covariates</hd> <p>The following sections describe the measures included as potential covariates in this study. These measures focused on participants' reading comprehension skills, prior beliefs about the topics discussed in the two texts, algorithmic literacy, and prior knowledge about the topics discussed in the two texts.</p> <hd id="AN0191105826-20">Reading Comprehension</hd> <p>To assess reading comprehension skills, we used a Norwegian adaptation of the deep cloze comprehension measure developed by Jensen and Elbro ([<reflink idref="bib49" id="ref130">49</reflink>]). This measure consists of 34 short (2–4 sentences) narrative passages, with one gap (i.e., missing word) in each passage that participants are asked to fill by choosing among four alternative words. Importantly, correct filling of each gap requires that participants draw inferences about the global situation described in the respective passages, that is, situation model construction (Kintsch [<reflink idref="bib58" id="ref131">58</reflink>]). A sample passage translated into English reads:</p> <p>They had agreed to meet right after work. The woman in the suit pulled out one tray after the other while they held hands and looked on intensely. They found it hard to decide. They were not even sure that it should be [yearly, <emph>gold</emph>, large, chequered] (Jensen and Elbro [<reflink idref="bib49" id="ref132">49</reflink>], p. 1235).</p> <p>Participants were given 10 min to read the passages and fill as many gaps as possible. Scoring involves counting the number of correctly filled gaps (maximum score = 34). 
The Norwegian adaptation of this measure has been validated in several recent studies including undergraduate readers (Bråten et al. [<reflink idref="bib14" id="ref133">14</reflink>], [<reflink idref="bib17" id="ref134">17</reflink>]; Haverkamp et al. [<reflink idref="bib43" id="ref135">43</reflink>]). For example, Bråten et al. ([<reflink idref="bib17" id="ref136">17</reflink>]) found that students' scores on this measure were predicted by their word recognition skills and book reading experiences and, in turn, predicted their verbal abilities as measured by the Wechsler Adult Intelligence Scale (Wechsler [<reflink idref="bib95" id="ref137">95</reflink>]). The internal consistency reliability (Cronbach's <emph>α</emph>) of participants' scores was 0.88.</p> <hd id="AN0191105826-21">Prior Topic Beliefs</hd> <p>We assessed participants' prior beliefs about each of the two topics that were discussed in the texts, that is, animal emotions and the use of animals in medical research. For each topic, we administered a 2‐item measure using a 10‐point Likert‐type scale (1 = not at all true, 10 = very true). For the topic of animal emotions, the items were "I believe animals have complex emotional lives very similar to humans," and "I believe animals have the same emotions as humans." For the topic of animal research, the items were "I believe animal experimentation is important in medical research," and "I believe animal experimentation can provide important knowledge about different diseases and treatments." For each topic, scores on the measure were divided by the number of items such that they ranged from 1 to 10. 
Cronbach's <emph>α</emph> was 0.65 for the topic of animal emotions; for the topic of animal research, it was 0.82.</p> <hd id="AN0191105826-22">Algorithm Literacy</hd> <p>We measured participants' general knowledge and awareness of algorithms as embedded in online environments by means of a Norwegian adaptation of the Algorithm Literacy Scale, which originally was developed and validated by Dogruel et al. ([<reflink idref="bib33" id="ref138">33</reflink>]). The measure consisted of 23 multiple‐choice and true/false items focusing on the meaning of algorithms (e.g., What is an algorithm?), how algorithms work on the Internet (e.g., I can influence algorithms by the way I use the Internet), the data sources used to develop and apply algorithms (e.g., Wearable devices measuring body functions, such as heart rate monitors and activity trackers), and functions often performed by algorithms (e.g., Making product recommendations). The scoring was done by counting the number of correct responses (maximum score = 23). The internal consistency reliability (Cronbach's <emph>α</emph>) for participants' scores on this measure was 0.72.</p> <hd id="AN0191105826-23">Prior Topic Knowledge</hd> <p>We assessed participants' prior knowledge about each of the two topics that were discussed in the texts by asking them to write down what they knew about the emotions of animals and the use of animal experimentation in research, respectively, in two separate textboxes. 
We developed scoring systems for the two topics that were based on the content of the texts that they subsequently read, yet also took other relevant ideas and experiences represented in participants' written responses into consideration.</p> <p>For the topic of animal emotions, the scoring system yielded scores ranging from 0 (answering "I don't know anything about this topic" or providing irrelevant information) to 5 (describing one or more animal emotions and providing relevant examples or evidence for at least one of them, also including qualifications related to type of animals or animal species that were based on subject matter knowledge or reflection). Blind to experimental conditions, the second and third authors scored 27 participants' responses concerning animal emotions in developing the scoring system. Next, a random selection of 54 participants' responses concerning this topic was coded independently by two coders to establish intercoder reliability. The independent coding resulted in a high intercoder reliability coefficient (Pearson's <emph>r</emph>) of 0.84 for participants' scores on the prior topic knowledge measure concerning animal emotions. All disagreements were resolved through discussion between the coders, and the remaining participants' responses were divided between the two coders who scored them separately. Further description and exemplification of the scoring system used for this topic knowledge measure are included in Appendix E in the Supporting Information.</p> <p>For the topic of animal research, the scoring system yielded scores ranging from 0 to 6 depending on the number of relevant ideas identified across participants' written responses that were included in a particular response. 
Across the written responses, the following six relevant ideas were identified: use of animals in developing various products (e.g., makeup), use of animals to learn about human functioning, use of particular animals in product development and/or medical research, ways in which animals are used in product development/medical research, use of animals in other research areas than product development and medicine (e.g., psychology), and ethical considerations in animal research. Participants were awarded one point for including each of these ideas in their written responses. Blind to experimental conditions, the second and third authors scored 27 participants' responses concerning animal research in developing the scoring system. Next, a random selection of 54 participants' responses concerning this topic was coded independently by two coders to establish intercoder reliability. The independent coding also resulted in a high intercoder reliability coefficient (Pearson's <emph>r</emph>) of 0.84 for participants' scores on the prior topic knowledge measure concerning animal research. All disagreements were resolved through discussion between the coders, and the remaining participants' responses were divided between the two coders who scored them separately. Further description and exemplification of the scoring system used for this topic knowledge measure are included in Appendix F in the Supporting Information.</p> <hd id="AN0191105826-24">Procedure</hd> <p>The data were collected in participants' regular classrooms during a 90‐min session by the three authors. Participants accessed a web‐based questionnaire by means of their school‐provided laptop (a 14″ Lenovo ThinkPad L14 Gen 2). 
Based on distributed web addresses, participants were randomly assigned to one of four versions of the questionnaire that corresponded to the four experimental conditions (i.e., ChatGPT‐generated/references [<emph>n</emph> = 70], ChatGPT‐generated/no references [<emph>n</emph> = 60], human‐generated/references [<emph>n</emph> = 66], and human‐generated/no references [<emph>n</emph> = 71]). The web‐based questionnaire started with the reading comprehension measure, with the 10‐min time frame administered orally by one of the authors. Participants worked on the remaining questionnaire tasks at their own pace. First, they completed the topic belief measures, the algorithm literacy scale, the topic knowledge measures, and a demographic survey in this order. In addition to questions about age, gender identification, and language background, the demographic survey included questions about their use of ChatGPT (or similar AI tools), both for schoolwork and during their leisure time. After these tasks, which were the same for all participants, the two texts were introduced in the following way (the difference between the ChatGPT‐ and human‐generated conditions is indicated by a slash):</p> <p>You are now going to read two different expository texts generated by ChatGPT/published in Aftenposten. One is about the use of animals in medical research; the other about the emotional lives of animals. When you have finished reading, you are going to evaluate each text and afterwards write your own text based on what you have read. You cannot look back to the two texts after you have read them. 
Imagine that these tasks are assignments in language arts.</p> <p>In the ChatGPT‐generated condition, an additional sentence in this instruction read: Before the text you can see the prompt/question that was used to ask ChatGPT to make a summary.</p> <p>After the reading task instruction, the two texts were presented on separate, successive pages, with the second text accessed by clicking a "Next" button. The order of the two texts was counterbalanced across participants. In the references condition, both texts were presented with a list of six relevant references at the end, whereas in the no references condition, no reference lists were presented.</p> <p>After having read the two texts, all participants completed the measures of text credibility, credibility justifications—open‐ended, and credibility justifications—close‐ended in this order for both texts without the texts available. Finally, all participants read the following writing prompt:</p> <p>Knowledge about the emotional lives of animals should reduce the use of animals in medical research. Discuss this claim by using information from the two texts that you just read. It is important that you express yourself as completely and elaborately as you can and that you use your own words. You cannot look back to the two texts while writing.</p> <p>Participants wrote their text in a dedicated textbox with no word limit and submitted the entire questionnaire to a server.</p> <hd id="AN0191105826-25">Results</hd> <p>Because we addressed our first two research questions, concerning potential effects of the experimental conditions on participants' text credibility judgments and their justifications for those judgments, by analyzing the results for the two topics that were discussed in the two texts (i.e., animal emotions and animal research) separately, we also computed descriptives and zero‐order correlations for each of those topics. 
Tables S1–S4 in the Supporting Information include this information for all measured variables for the entire sample. Performing separate analyses for the two topics allowed us to include covariates that targeted each of those topics (i.e., prior beliefs and prior knowledge about the respective topics), as well as to test the generalizability of our findings across two different topics. However, because our third research question concerned potential effects of the experimental conditions on participants' integration of information across the two topics, as assessed by the integrated text understanding measure, we addressed this question by analyzing the results for the two topics together, also creating prior beliefs and prior knowledge composites that measured these covariates across the two topics. In all analyses, we included covariates that could remove variance in the outcome measures because they correlated with those measures or remove differences between the experimental conditions with respect to the covariates (Field [<reflink idref="bib38" id="ref139">38</reflink>]; Tabachnick and Fidell [<reflink idref="bib92" id="ref140">92</reflink>]). Accordingly, covariates were not considered relevant if they neither correlated with an outcome, nor differed between the four subgroups varying with respect to source information (i.e., ChatGPT‐generated vs. human‐generated) and references (i.e., references vs. no references). 
Table 1 shows descriptive information about the covariates for these four subgroups.</p> <p>1 TABLE Descriptive information (means and standard deviations) about the covariates for subgroups differing with respect to source information and references.</p> <p> <ephtml> &lt;table&gt;&lt;thead valign="bottom"&gt;&lt;tr&gt;&lt;th align="left" /&gt;&lt;th align="center"&gt;ChatGPT&amp;#8208;generated&lt;/th&gt;&lt;th align="center"&gt;Human&amp;#8208;generated&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center"&gt;References (&lt;italic&gt;n&lt;/italic&gt;&amp;#8201;=&amp;#8201;70)&lt;/th&gt;&lt;th align="center"&gt;No references (&lt;italic&gt;n&lt;/italic&gt;&amp;#8201;=&amp;#8201;60)&lt;/th&gt;&lt;th align="center"&gt;References (&lt;italic&gt;n&lt;/italic&gt;&amp;#8201;=&amp;#8201;66)&lt;/th&gt;&lt;th align="center"&gt;No references (&lt;italic&gt;n&lt;/italic&gt;&amp;#8201;=&amp;#8201;71)&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody valign="top"&gt;&lt;tr&gt;&lt;td align="left"&gt;Reading comprehension&lt;/td&gt;&lt;td align="center"&gt;20.84 (5.45)&lt;/td&gt;&lt;td align="center"&gt;19.68 (6.16)&lt;/td&gt;&lt;td align="center"&gt;19.74 (5.69)&lt;/td&gt;&lt;td align="center"&gt;21.14 (5.67)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;14.59 (3.61)&lt;/td&gt;&lt;td align="center"&gt;14.20 (4.10)&lt;/td&gt;&lt;td align="center"&gt;14.80 (3.39)&lt;/td&gt;&lt;td align="center"&gt;14.30 (3.51)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Beliefs about animal emotions&lt;/td&gt;&lt;td align="center"&gt;5.08 (1.88)&lt;/td&gt;&lt;td align="center"&gt;5.64 (1.86)&lt;/td&gt;&lt;td align="center"&gt;5.74 (1.66)&lt;/td&gt;&lt;td align="center"&gt;5.47 (1.92)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Beliefs about animal research&lt;/td&gt;&lt;td align="center"&gt;6.98 (2.35)&lt;/td&gt;&lt;td align="center"&gt;6.68 (2.28)&lt;/td&gt;&lt;td align="center"&gt;6.78 (2.22)&lt;/td&gt;&lt;td 
align="center"&gt;6.31 (2.31)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Knowledge about animal emotions&lt;/td&gt;&lt;td align="center"&gt;1.54 (1.10)&lt;/td&gt;&lt;td align="center"&gt;1.62 (0.96)&lt;/td&gt;&lt;td align="center"&gt;2.00 (1.11)&lt;/td&gt;&lt;td align="center"&gt;1.48 (1.04)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Knowledge about animal research&lt;/td&gt;&lt;td align="center"&gt;1.66 (1.05)&lt;/td&gt;&lt;td align="center"&gt;1.80 (0.99)&lt;/td&gt;&lt;td align="center"&gt;1.71 (0.92)&lt;/td&gt;&lt;td align="center"&gt;1.92 (1.23)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>1 <emph>Note:</emph> One‐way analyses of variance using the four subgroups as the independent variable and the covariates as the dependent variables showed a statistically significant difference between the subgroups for knowledge about animal emotions, <emph>F</emph>(<reflink idref="bib3" id="ref141">3</reflink>, 263) = 3.34, <emph>p</emph> = 0.020, <emph>η</emph><sups>2</sups> = 0.037, but no statistically significant differences between them for the other variables, <emph>F</emph>s &lt; 1.74, <emph>p</emph>s &gt; 0.16.</p> <p>We performed one‐way analyses of variance (ANOVAs) with the four subgroups as the independent variable and covariates as the dependent variables. 
There were no statistically significant differences between these groups with respect to reading comprehension skills, <emph>F</emph>(<reflink idref="bib3" id="ref142">3</reflink>, 263) = 1.14, <emph>p</emph> = 0.335, <emph>η</emph><sups>2</sups> = 0.013; algorithmic literacy, <emph>F</emph>(<reflink idref="bib3" id="ref143">3</reflink>, 263) = 0.37, <emph>p</emph> = 0.775, <emph>η</emph><sups>2</sups> = 0.004; prior beliefs about animal emotions, <emph>F</emph>(<reflink idref="bib3" id="ref144">3</reflink>, 263) = 1.73, <emph>p</emph> = 0.162, <emph>η</emph><sups>2</sups> = 0.019; prior beliefs about animal research, <emph>F</emph>(<reflink idref="bib3" id="ref145">3</reflink>, 263) = 1.05, <emph>p</emph> = 0.369, <emph>η</emph><sups>2</sups> = 0.012; or prior knowledge about animal research, <emph>F</emph>(<reflink idref="bib3" id="ref146">3</reflink>, 263) = 0.80, <emph>p</emph> = 0.50, <emph>η</emph><sups>2</sups> = 0.009. However, the subgroups differed statistically significantly with respect to prior knowledge about animal emotions, <emph>F</emph>(<reflink idref="bib3" id="ref147">3</reflink>, 263) = 3.34, <emph>p</emph> = 0.020, <emph>η</emph><sups>2</sups> = 0.037, which was due to a statistically significant difference (<emph>p</emph> = 0.024) between the two subgroups reading human‐generated texts with or without a reference list.</p> <hd id="AN0191105826-26">Effects on Text Credibility</hd> <p>To address our first research question, concerning potential effects of the experimental conditions on participants' text credibility judgments, we first performed a 2 × 2 between‐subjects analysis of covariance (ANCOVA) for the topic of animal emotions, using source information (ChatGPT‐generated vs. human‐generated) and references (references vs. no references) as the independent variables, credibility judgments about the animal emotions text as the dependent variable, and prior knowledge about the topic of animal emotions as a covariate. 
Results of the evaluation of the assumptions for performing this ANCOVA were satisfactory.</p> <p>The results showed that there was a statistically significant main effect of source information (ChatGPT‐generated: <emph>M</emph> = 6.72, SE = 0.15; human‐generated: <emph>M</emph> = 7.71, SE = 0.15; <emph>F</emph>(<reflink idref="bib1" id="ref148">1</reflink>, 262) = 22.70, <emph>p</emph> &lt; 0.001, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.080) but no statistically significant main effect of references (references: <emph>M</emph> = 7.20, SE = 0.15; no references: <emph>M</emph> = 7.22, SE = 0.15; <emph>F</emph>(<reflink idref="bib1" id="ref149">1</reflink>, 262) = 0.01, <emph>p</emph> = 0.930, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.000) on the text credibility judgments. Also, there was no statistically significant interaction between the two independent variables, with <emph>F</emph>(<reflink idref="bib1" id="ref150">1</reflink>, 262) = 1.87, <emph>p</emph> = 0.173, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.007. The covariate of prior topic knowledge did not adjust participants' credibility judgments, with <emph>F</emph>(<reflink idref="bib1" id="ref151">1</reflink>, 262) = 0.37, <emph>p</emph> = 0.541, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.001.</p> <p>Next, we performed an ANCOVA with the same independent variables but this time using participants' credibility judgments for the animal research text as the dependent variable and their prior beliefs about the topic of animal research as a covariate. Results of evaluation of the assumptions for performing this ANCOVA were satisfactory. 
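</p> <p>The partial eta squared values reported above for the animal emotions ANCOVA can be recovered from each <emph>F</emph> ratio and its degrees of freedom through the standard identity <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = (<emph>F</emph> × df<subs>effect</subs>) / (<emph>F</emph> × df<subs>effect</subs> + df<subs>error</subs>). The following is a minimal illustrative sketch in Python, not the authors' analysis code. The function name and the dictionary labels are hypothetical, and the inputs are the rounded values reported above, so the results can be expected to agree with the reported effect sizes only to within rounding.</p>

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Rounded F ratios reported for the animal emotions ANCOVA, each with df = (1, 262).
# The labels are hypothetical names chosen for this illustration.
reported_f = {
    "source information": 22.70,
    "references": 0.01,
    "source information x references": 1.87,
    "prior topic knowledge (covariate)": 0.37,
}

for effect, f_value in reported_f.items():
    print(f"{effect}: {partial_eta_squared(f_value, 1, 262):.3f}")
```

<p>The same function can be applied to the <emph>F</emph> ratios of the ANCOVA for the animal research text, which has the same degrees of freedom.</p> <p>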
Again, there was a statistically significant main effect of source information (ChatGPT‐generated: <emph>M</emph> = 6.67, SE = 0.13; human‐generated: <emph>M</emph> = 7.79, SE = 0.13; <emph>F</emph>(<reflink idref="bib1" id="ref152">1</reflink>, 262) = 37.36, <emph>p</emph> &lt; 0.001, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.125) but no statistically significant main effect of references (references: <emph>M</emph> = 7.33, SE = 0.13; no references: <emph>M</emph> = 7.13, SE = 0.13; <emph>F</emph>(<reflink idref="bib1" id="ref153">1</reflink>, 262) = 1.20, <emph>p</emph> = 0.276, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.005) or the interaction between source information and references, with <emph>F</emph>(<reflink idref="bib1" id="ref154">1</reflink>, 262) = 1.65, <emph>p</emph> = 0.200, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.006. The covariate of topic beliefs uniquely adjusted participants' credibility judgment scores, with <emph>F</emph>(<reflink idref="bib1" id="ref155">1</reflink>, 262) = 9.62, <emph>p</emph> = 0.002, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.035. In sum, the results pertaining to our first research question were similar across the two topics, with participants reading the ChatGPT‐generated texts trusting the content of the texts less than did participants reading the human‐generated texts. Figure 1 shows the means for participants' text credibility judgments for both topics according to experimental condition.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/NRNU/01jan26/rrq70087-fig-0001.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="rrq70087-fig-0001.jpg" title="1 Estimated marginal means of text credibility judgments for human‐generated and ChatGPT‐generated texts by references condition (references vs. no references) for the topics of animal emotions and animal research. Error bars represent standard errors." 
/> </p> <p></p> <hd id="AN0191105826-28">Effects on Credibility Justifications</hd> <p>To address our second research question, concerning potential effects of the experimental conditions on participants' justifications for their credibility judgments, we first performed binary logistic regression analyses with participants' responses to the open‐ended justification questions as the dependent variables. Next, we performed multivariate and univariate analyses of variance using participants' responses to the close‐ended justification questions (i.e., ratings) as the dependent variables. Both types of analyses were first performed for the topic of animal emotions and then for the topic of animal research.</p> <hd id="AN0191105826-29">Open‐Ended Justifications</hd> <p>To examine potential effects of the experimental conditions on participants' credibility justifications as measured by the open‐ended questions, we performed a set of four binary logistic regression analyses for the topic of animal emotions, followed by another set of four binary logistic regression analyses for the topic of animal research. In each analysis, we specified three models. In the first model, we entered only relevant covariates as continuous variables. In the second model, we added source information, with ChatGPT‐generated texts coded as 1 and human‐generated texts coded as 0, and references, with references included coded as 1 and no references included coded as 0. In the third model, we added the interaction term between source information and references. For both topics, the dependent variables were whether participants used personal justification, content justification, reference justification, and source justification for their credibility judgments, with the use of a particular type of justification coded as 1 and the nonuse of this type of justification coded as 0. 
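</p> <p>The coding scheme and the three nested models described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names, the condition labels ("chatgpt", "human"), and the predictor names are hypothetical, and the covariates passed in are placeholders for whichever covariates were relevant to a given dependent variable.</p>

```python
from typing import Dict, List


def code_condition(source: str, has_references: bool) -> Dict[str, int]:
    """Dummy-code one participant's experimental condition.

    ChatGPT-generated = 1, human-generated = 0; references = 1, no references = 0.
    The condition labels are hypothetical names used only in this sketch.
    """
    if source not in ("chatgpt", "human"):
        raise ValueError(f"unknown source label: {source!r}")
    source_info = 1 if source == "chatgpt" else 0
    references = 1 if has_references else 0
    return {
        "source_info": source_info,
        "references": references,
        # The interaction term is the product of the two dummy codes.
        "source_x_references": source_info * references,
    }


def predictors_for_model(model: int, covariates: List[str]) -> List[str]:
    """List the predictors entered at each of the three hierarchical steps."""
    if model not in (1, 2, 3):
        raise ValueError("model must be 1, 2, or 3")
    terms = list(covariates)  # Model 1: relevant covariates only
    if model >= 2:
        terms += ["source_info", "references"]  # Model 2: add both main effects
    if model == 3:
        terms.append("source_x_references")  # Model 3: add the interaction
    return terms


for source in ("chatgpt", "human"):
    for has_references in (True, False):
        print(source, has_references, code_condition(source, has_references))

print(predictors_for_model(3, ["algorithm_literacy", "prior_knowledge"]))
```

<p>With this coding, the interaction term equals 1 only for participants who read ChatGPT‐generated texts with reference lists, so in the third model the coefficients for source information and references describe each effect when the other variable is coded 0.</p> <p>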
Results for animal emotions and animal research, respectively, are summarized in Tables 2 and 3.</p> <p>2 TABLE Results of binary regression analyses for the topic of animal emotions.</p> <p> <ephtml> &lt;table&gt;&lt;thead valign="bottom"&gt;&lt;tr&gt;&lt;th align="left"&gt;Predictors&lt;/th&gt;&lt;th align="center"&gt;Dependent variables&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center"&gt;Source justification&lt;/th&gt;&lt;th align="center"&gt;Content justification&lt;/th&gt;&lt;th align="center"&gt;Personal justification&lt;/th&gt;&lt;th align="center"&gt;Reference justification&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody valign="top"&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 1&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.11&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;/td&gt;&lt;td align="center"&gt;0.90&lt;/td&gt;&lt;td align="center"&gt;0.28&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;/td&gt;&lt;td align="center"&gt;1.32&lt;/td&gt;&lt;td align="center"&gt;0.22&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.24&lt;/td&gt;&lt;td align="center"&gt;0.24&lt;/td&gt;&lt;td align="center"&gt;0.13&lt;/td&gt;&lt;td 
align="center"&gt;1.27&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;1.07&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.10&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;0.91&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.003&lt;/td&gt;&lt;td align="center"&gt;0.035&lt;/td&gt;&lt;td align="center"&gt;0.029&lt;/td&gt;&lt;td align="center"&gt;0.012&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.11&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;/td&gt;&lt;td align="center"&gt;0.90&lt;/td&gt;&lt;td align="center"&gt;0.27&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;/td&gt;&lt;td align="center"&gt;1.31&lt;/td&gt;&lt;td align="center"&gt;0.24&lt;/td&gt;&lt;td align="center"&gt;0.13&lt;/td&gt;&lt;td align="center"&gt;1.27&lt;/td&gt;&lt;td align="center"&gt;0.27&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.31&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;1.07&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.09&lt;xref ref-type="fn" rid="tfn3" 
/&gt;&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;0.91&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source information&lt;/td&gt;&lt;td align="center"&gt;0.24&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td align="center"&gt;1.26&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.47&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;/td&gt;&lt;td align="center"&gt;0.63&lt;/td&gt;&lt;td align="center"&gt;0.45&lt;/td&gt;&lt;td align="center"&gt;0.30&lt;/td&gt;&lt;td align="center"&gt;1.57&lt;/td&gt;&lt;td align="center"&gt;1.06&lt;xref ref-type="fn" rid="tfn5" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.32&lt;/td&gt;&lt;td align="center"&gt;2.88&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;References&lt;/td&gt;&lt;td align="center"&gt;0.09&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td align="center"&gt;1.09&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.17&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;/td&gt;&lt;td align="center"&gt;0.85&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.17&lt;/td&gt;&lt;td align="center"&gt;0.30&lt;/td&gt;&lt;td align="center"&gt;0.84&lt;/td&gt;&lt;td align="center"&gt;0.92&lt;xref ref-type="fn" rid="tfn4" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.32&lt;/td&gt;&lt;td align="center"&gt;2.51&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.007&lt;/td&gt;&lt;td align="center"&gt;0.049&lt;/td&gt;&lt;td align="center"&gt;0.038&lt;/td&gt;&lt;td align="center"&gt;0.086&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 3&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.08&lt;/td&gt;&lt;td align="center"&gt;1.12&lt;/td&gt;&lt;td align="center"&gt;0.922&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;xref 
ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;/td&gt;&lt;td align="center"&gt;1.30&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.29&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.29&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;1.07&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.10&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.04&lt;/td&gt;&lt;td align="center"&gt;0.91&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source information&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.15&lt;/td&gt;&lt;td align="center"&gt;0.35&lt;/td&gt;&lt;td align="center"&gt;0.86&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.25&lt;/td&gt;&lt;td align="center"&gt;0.36&lt;/td&gt;&lt;td align="center"&gt;0.78&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;/td&gt;&lt;td align="center"&gt;0.42&lt;/td&gt;&lt;td align="center"&gt;1.30&lt;/td&gt;&lt;td align="center"&gt;1.37&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.56&lt;/td&gt;&lt;td align="center"&gt;3.94&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;References&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.29&lt;/td&gt;&lt;td align="center"&gt;0.35&lt;/td&gt;&lt;td align="center"&gt;0.75&lt;/td&gt;&lt;td align="center"&gt;0.03&lt;/td&gt;&lt;td align="center"&gt;0.36&lt;/td&gt;&lt;td align="center"&gt;1.03&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.40&lt;/td&gt;&lt;td 
align="center"&gt;0.45&lt;/td&gt;&lt;td align="center"&gt;0.67&lt;/td&gt;&lt;td align="center"&gt;1.23&lt;xref ref-type="fn" rid="tfn3" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.56&lt;/td&gt;&lt;td align="center"&gt;3.43&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source info&amp;#8201;&amp;#215;&amp;#8201;refer&lt;/td&gt;&lt;td align="center"&gt;0.77&lt;/td&gt;&lt;td align="center"&gt;0.50&lt;/td&gt;&lt;td align="center"&gt;2.15&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.43&lt;/td&gt;&lt;td align="center"&gt;0.52&lt;/td&gt;&lt;td align="center"&gt;0.65&lt;/td&gt;&lt;td align="center"&gt;0.40&lt;/td&gt;&lt;td align="center"&gt;0.61&lt;/td&gt;&lt;td align="center"&gt;1.50&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.48&lt;/td&gt;&lt;td align="center"&gt;0.68&lt;/td&gt;&lt;td align="center"&gt;0.62&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.016&lt;/td&gt;&lt;td align="center"&gt;0.051&lt;/td&gt;&lt;td align="center"&gt;0.040&lt;/td&gt;&lt;td align="center"&gt;0.088&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <ulist> <item>2 <emph>Note:</emph> — = predictor not included in the analysis. ChatGPT‐generated = 1, human‐generated = 0, references = 1, no references = 0. 
<emph>R</emph><sups>2</sups> is based on the Cox–Snell statistics.</item> <item>3 * <emph>p</emph> &lt; 0.05.</item> <item>4 ** <emph>p</emph> &lt; 0.01.</item> <item>5 *** <emph>p</emph> &lt; 0.001.</item> </ulist> <p>3 TABLE Results of binary regression analyses for the topic of animal research.</p> <p> <ephtml> &lt;table&gt;&lt;thead valign="bottom"&gt;&lt;tr&gt;&lt;th align="left"&gt;Predictors&lt;/th&gt;&lt;th align="center"&gt;Dependent variables&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center"&gt;Source justification&lt;/th&gt;&lt;th align="center"&gt;Content justification&lt;/th&gt;&lt;th align="center"&gt;Personal justification&lt;/th&gt;&lt;th align="center"&gt;Reference justification&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;th align="center"&gt;&lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;SE &lt;italic&gt;B&lt;/italic&gt;&lt;/th&gt;&lt;th align="center"&gt;OR&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody valign="top"&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 1&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Reading comprehension&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td 
align="center"&gt;0.03&lt;/td&gt;&lt;td align="center"&gt;1.06&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior beliefs&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;1.13&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.11&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;0.89&lt;/td&gt;&lt;td align="center"&gt;0.17&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;1.19&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td 
align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.27&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.31&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.018&lt;/td&gt;&lt;td align="center"&gt;0.016&lt;/td&gt;&lt;td align="center"&gt;0.044&lt;/td&gt;&lt;td align="center"&gt;0.015&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Reading comprehension&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.03&lt;/td&gt;&lt;td align="center"&gt;1.06&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior beliefs&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;xref ref-type="fn" rid="tfn7" 
/&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;1.13&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.11&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;0.89&lt;/td&gt;&lt;td align="center"&gt;0.17&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;1.18&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.35&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.14&lt;/td&gt;&lt;td align="center"&gt;1.42&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source information&lt;/td&gt;&lt;td align="center"&gt;0.65&lt;xref ref-type="fn" rid="tfn8" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td align="center"&gt;1.91&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.17&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td align="center"&gt;0.84&lt;/td&gt;&lt;td align="center"&gt;0.53&lt;/td&gt;&lt;td align="center"&gt;0.30&lt;/td&gt;&lt;td align="center"&gt;1.70&lt;/td&gt;&lt;td align="center"&gt;0.71&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.30&lt;/td&gt;&lt;td align="center"&gt;2.04&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;References&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.29&lt;/td&gt;&lt;td align="center"&gt;0.25&lt;/td&gt;&lt;td 
align="center"&gt;0.75&lt;/td&gt;&lt;td align="center"&gt;0.15&lt;/td&gt;&lt;td align="center"&gt;0.26&lt;/td&gt;&lt;td align="center"&gt;1.17&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.01&lt;/td&gt;&lt;td align="center"&gt;0.30&lt;/td&gt;&lt;td align="center"&gt;1.00&lt;/td&gt;&lt;td align="center"&gt;0.99&lt;xref ref-type="fn" rid="tfn9" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.31&lt;/td&gt;&lt;td align="center"&gt;2.69&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.046&lt;/td&gt;&lt;td align="center"&gt;0.019&lt;/td&gt;&lt;td align="center"&gt;0.055&lt;/td&gt;&lt;td align="center"&gt;0.076&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Model 3&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Reading comprehension&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.03&lt;/td&gt;&lt;td align="center"&gt;1.07&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Algorithm literacy&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td 
align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior beliefs&lt;/td&gt;&lt;td align="center"&gt;0.12&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;1.13&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.11&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.06&lt;/td&gt;&lt;td align="center"&gt;0.89&lt;/td&gt;&lt;td align="center"&gt;0.17&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.07&lt;/td&gt;&lt;td align="center"&gt;1.18&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Prior knowledge&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;&amp;#8212;&lt;/td&gt;&lt;td align="center"&gt;0.36&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.15&lt;/td&gt;&lt;td align="center"&gt;1.43&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source information&lt;/td&gt;&lt;td align="center"&gt;0.40&lt;/td&gt;&lt;td align="center"&gt;0.36&lt;/td&gt;&lt;td align="center"&gt;1.49&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.34&lt;/td&gt;&lt;td align="center"&gt;0.37&lt;/td&gt;&lt;td align="center"&gt;0.71&lt;/td&gt;&lt;td align="center"&gt;0.93&lt;xref ref-type="fn" rid="tfn7" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.45&lt;/td&gt;&lt;td align="center"&gt;2.54&lt;/td&gt;&lt;td align="center"&gt;1.11&lt;xref ref-type="fn" rid="tfn7" 
/&gt;&lt;/td&gt;&lt;td align="center"&gt;0.51&lt;/td&gt;&lt;td align="center"&gt;3.04&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;References&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.53&lt;/td&gt;&lt;td align="center"&gt;0.35&lt;/td&gt;&lt;td align="center"&gt;0.59&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.01&lt;/td&gt;&lt;td align="center"&gt;0.35&lt;/td&gt;&lt;td align="center"&gt;1.00&lt;/td&gt;&lt;td align="center"&gt;0.43&lt;/td&gt;&lt;td align="center"&gt;0.46&lt;/td&gt;&lt;td align="center"&gt;1.53&lt;/td&gt;&lt;td align="center"&gt;1.36&lt;xref ref-type="fn" rid="tfn8" /&gt;&lt;/td&gt;&lt;td align="center"&gt;0.50&lt;/td&gt;&lt;td align="center"&gt;3.89&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;Source information&amp;#8201;&amp;#215;&amp;#8201;references&lt;/td&gt;&lt;td align="center"&gt;0.50&lt;/td&gt;&lt;td align="center"&gt;0.50&lt;/td&gt;&lt;td align="center"&gt;1.64&lt;/td&gt;&lt;td align="center"&gt;0.32&lt;/td&gt;&lt;td align="center"&gt;0.51&lt;/td&gt;&lt;td align="center"&gt;1.39&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.77&lt;/td&gt;&lt;td align="center"&gt;0.61&lt;/td&gt;&lt;td align="center"&gt;0.46&lt;/td&gt;&lt;td align="center"&gt;&amp;#8722;0.63&lt;/td&gt;&lt;td align="center"&gt;0.63&lt;/td&gt;&lt;td align="center"&gt;0.53&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="left"&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center"&gt;0.050&lt;/td&gt;&lt;td align="center"&gt;0.020&lt;/td&gt;&lt;td align="center"&gt;0.061&lt;/td&gt;&lt;td align="center"&gt;0.079&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <ulist> <item>6 <emph>Note:</emph> — = predictor not included in the analysis. ChatGPT‐generated = 1, human‐generated = 0, references = 1, no references = 0. 
<emph>R</emph><sups>2</sups> is based on the Cox–Snell statistics.</item> <item>7 * <emph>p</emph> &lt; 0.05.</item> <item>8 ** <emph>p</emph> &lt; 0.01.</item> <item>9 *** <emph>p</emph> &lt; 0.001.</item> </ulist> <p>For the topic of animal emotions, the analysis using personal justification as the dependent variable showed that the first model, which included algorithmic literacy and prior knowledge about animal emotions as covariates, was statistically significant, with <emph>R</emph><sups>2</sups> = 0.029 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref156">2</reflink>) = 7.86, <emph>p</emph> = 0.020. In this model, algorithmic literacy was a negative predictor of participants' use of personal justification, <emph>B</emph> = −0.10, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref157">1</reflink>) = 6.16, <emph>p</emph> = 0.013. The odds ratio indicated that with each point increase in algorithmic literacy, the odds of using personal justification decreased by 9%. Neither model 2, which also included source information and references, <emph>R</emph><sups>2</sups> = 0.038 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref158">2</reflink>) = 2.50, <emph>p</emph> = 0.286, nor model 3, which also included the interaction between source information and references, <emph>R</emph><sups>2</sups> = 0.040 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref159">1</reflink>) = 0.44, <emph>p</emph> = 0.509, improved the prediction of the use of personal justification (the difference between models 1 and 3 was also not significant, Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib3" id="ref160">3</reflink>) = 2.94, <emph>p</emph> = 0.401). 
With content justification as the dependent variable, the first model including algorithmic literacy and prior knowledge about animal emotions as covariates was also statistically significant, with <emph>R</emph><sups>2</sups> = 0.035 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref161">2</reflink>) = 9.50, <emph>p</emph> = 0.009. In this model, topic knowledge was a positive predictor of participants' use of content justification, <emph>B</emph> = 0.28, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref162">1</reflink>) = 5.16, <emph>p</emph> = 0.023. The odds ratio indicated that with each point increase in topic knowledge, the odds of using content justification increased by 32%. Neither model 2, which also included source information and references, <emph>R</emph><sups>2</sups> = 0.049 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref163">2</reflink>) = 3.86, <emph>p</emph> = 0.145, nor model 3, which also included the interaction between source information and references, <emph>R</emph><sups>2</sups> = 0.051 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref164">1</reflink>) = 0.67, <emph>p</emph> = 0.412, improved the prediction of the use of content justification (the difference between models 1 and 3 was also not significant, Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib3" id="ref165">3</reflink>) = 4.53, <emph>p</emph> = 0.209). 
The analysis using source justification as the dependent variable, which included only prior knowledge about animal emotions as a covariate, was not statistically significant, with <emph>R</emph><sups>2</sups> = 0.003 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref166">1</reflink>) = 0.89, <emph>p</emph> = 0.345, and neither model 2, which also included source information and references, <emph>R</emph><sups>2</sups> = 0.007 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref167">2</reflink>) = 1.07, <emph>p</emph> = 0.585, nor model 3, which also included the interaction between source information and references, <emph>R</emph><sups>2</sups> = 0.016 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref168">1</reflink>) = 2.37, <emph>p</emph> = 0.124, improved the prediction of the use of source justification (the difference between models 1 and 3 was also not significant, Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib3" id="ref169">3</reflink>) = 3.44, <emph>p</emph> = 0.329). In the final analysis for the topic of animal emotions, using reference justification as the dependent variable and prior knowledge about animal emotions as a covariate, the first model was not statistically significant, with <emph>R</emph><sups>2</sups> = 0.012 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref170">1</reflink>) = 3.12, <emph>p</emph> = 0.077. However, model 2, which also included source information and references, improved the prediction of the use of reference justification, <emph>R</emph><sups>2</sups> = 0.086 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref171">2</reflink>) = 21.03, <emph>p</emph> &lt; 0.001. 
In this model, both source information, <emph>B</emph> = 1.06, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref172">1</reflink>) = 10.95, <emph>p</emph> &lt; 0.001, and references, <emph>B</emph> = 0.92, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref173">1</reflink>) = 8.39, <emph>p</emph> = 0.004, positively predicted participants' use of reference justification, with the odds ratios indicating that reading GPT‐generated texts increased the odds of using this type of justification by 188% and that being presented with a reference list increased it by 151%. Model 3, which also included the interaction between source information and references, did not improve the prediction of the use of reference justification compared to model 2, <emph>R</emph><sups>2</sups> = 0.088 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref174">1</reflink>) = 0.51, <emph>p</emph> = 0.476.</p> <p>For the topic of animal research, the analysis using personal justification as the dependent variable showed that the first model, which included reading comprehension skills and prior beliefs about animal research as covariates, was statistically significant, with <emph>R</emph><sups>2</sups> = 0.044 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref175">2</reflink>) = 12.01, <emph>p</emph> = 0.002. In this model, both reading comprehension, <emph>B</emph> = 0.06, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref176">1</reflink>) = 4.74, <emph>p</emph> = 0.030, and topic beliefs, <emph>B</emph> = 0.17, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref177">1</reflink>) = 6.22, <emph>p</emph> = 0.013, positively predicted participants' use of personal justification. The odds ratios indicated that with each point increase in reading comprehension skills, the odds of using personal justification increased by 6%, and with each point increase in topic beliefs, these odds increased by 19%. 
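</p> <p>The percentage changes in odds reported above follow from the logistic regression coefficients: the odds ratio equals exp(<emph>B</emph>), and the percentage change in the odds per one‐point increase in a predictor equals (exp(<emph>B</emph>) − 1) × 100. The following minimal sketch illustrates this conversion for some of the rounded coefficients reported here. The function name is hypothetical and not part of the original analysis, and because the published <emph>B</emph> values are rounded to two decimals, recomputed percentages may differ from the reported ones by about one point.</p>

```python
import math


def odds_percent_change(b):
    """Percentage change in the odds per one-unit increase in a predictor."""
    return (math.exp(b) - 1.0) * 100.0


# Rounded coefficients reported in the text (used here for illustration only).
examples = [
    ("references (animal emotions)", 0.92),
    ("reading comprehension (animal research)", 0.06),
    ("topic beliefs (animal research)", 0.17),
]
for label, b in examples:
    print(f"{label}: B = {b:.2f}, odds change = {odds_percent_change(b):.0f}%")
```

<p>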
Neither model 2, which also included source information and references, <emph>R</emph><sups>2</sups> = 0.055 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref178">2</reflink>) = 3.11, <emph>p</emph> = 0.211, nor model 3, which also included the interaction between source information and references, <emph>R</emph><sups>2</sups> = 0.061 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref179">1</reflink>) = 1.58, <emph>p</emph> = 0.209, improved the prediction of the use of personal justification (the difference between models 1 and 3 was also not significant, Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib3" id="ref180">3</reflink>) = 4.69, <emph>p</emph> = 0.196). With content justification as the dependent variable, the first model including only prior beliefs about animal research as a covariate was also statistically significant, with <emph>R</emph><sups>2</sups> = 0.016 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref181">1</reflink>) = 4.20, <emph>p</emph> = 0.040. Prior topic beliefs were a negative predictor of participants' use of content justification, <emph>B</emph> = −0.11, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref182">1</reflink>) = 4.15, <emph>p</emph> = 0.042, with the odds ratio indicating that with each point increase in topic beliefs, the odds of using content justification decreased by 11%. 
Neither model 2, which also included source information and references, <emph>R</emph><sups>2</sups> = 0.019 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref183">2</reflink>) = 0.79, <emph>p</emph> = 0.674, nor model 3, which also included the interaction between source information and references, <emph>R</emph><sups>2</sups> = 0.020 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref184">1</reflink>) = 0.42, <emph>p</emph> = 0.519, improved the prediction of the use of content justification (the difference between models 1 and 3 was also not significant, Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib3" id="ref185">3</reflink>) = 1.21, <emph>p</emph> = 0.751). The analysis using source justification as the dependent variable, which included only prior beliefs about animal research as a covariate, was statistically significant, with <emph>R</emph><sups>2</sups> = 0.018 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref186">1</reflink>) = 4.94, <emph>p</emph> = 0.026. Prior topic beliefs positively predicted the use of source justification, <emph>B</emph> = 0.12, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref187">1</reflink>) = 4.84, <emph>p</emph> = 0.028, and the odds ratio indicated that with each point increase in topic beliefs, the odds of using source justification increased by 13%. Further, model 2, which also included source information and references, improved the prediction of the use of source justification, <emph>R</emph><sups>2</sups> = 0.046 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref188">2</reflink>) = 7.76, <emph>p</emph> = 0.021. 
In this model, source information positively predicted participants' use of source justification, <emph>B</emph> = 0.65, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref189">1</reflink>) = 6.64, <emph>p</emph> = 0.010, with the odds ratio indicating that reading GPT‐generated texts increased the odds of using source justification by 91%. Model 3, which also included the interaction between source information and references, did not improve the prediction of the use of source justification compared to model 2, <emph>R</emph><sups>2</sups> = 0.050 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref190">1</reflink>) = 0.97, <emph>p</emph> = 0.325. In the final analysis for the topic of animal research, using reference justification as the dependent variable and prior knowledge about animal research as a covariate, the first model was statistically significant, with <emph>R</emph><sups>2</sups> = 0.015 (Cox–Snell), <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref191">1</reflink>) = 4.00, <emph>p</emph> = 0.046. In this model, topic knowledge positively predicted the use of reference justification, <emph>B</emph> = 0.27, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref192">1</reflink>) = 3.97, <emph>p</emph> = 0.046, and the odds ratio indicated that with each point increase in topic knowledge, the odds of using reference justification increased by 31%. Model 2, which also included source information and references, improved the prediction of the use of reference justification, <emph>R</emph><sups>2</sups> = 0.076 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib2" id="ref193">2</reflink>) = 17.04, <emph>p</emph> &lt; 0.001. 
In this model, both source information, <emph>B</emph> = 0.71, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref194">1</reflink>) = 5.56, <emph>p</emph> = 0.018, and references, <emph>B</emph> = 0.99, Wald <emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref195">1</reflink>) = 10.13, <emph>p</emph> = 0.001, positively predicted participants' use of reference justification, with the odds ratios indicating that reading GPT‐generated texts increased the odds of using this type of justification by 104% and that being presented with a reference list increased it by 169%. Model 3, which also included the interaction between source information and references, did not improve the prediction of the use of reference justification compared to model 2, <emph>R</emph><sups>2</sups> = 0.079 (Cox–Snell), Δ<emph>χ</emph><sups>2</sups>(<reflink idref="bib1" id="ref196">1</reflink>) = 1.01, <emph>p</emph> = 0.316.</p> <hd id="AN0191105826-30">Close‐Ended Justifications</hd> <p>To examine potential effects of the experimental conditions on participants' credibility justifications as measured with the close‐ended questions, we first performed a 2 × 2 between‐subjects multivariate analysis of covariance (MANCOVA) for the animal emotions topic. The independent variables were source information (ChatGPT‐generated vs. human‐generated) and references (references vs. no references), and the dependent variables were participants' scores on nine criteria used to indicate their justifications for their text credibility judgments (when the text was generated, how the text was generated, the quality of the arguments, the way the text was written, how objective the text was, own opinion about the text content, if someone has checked the quality of the content, what other sources say about the issue, and what I already know about the topic). 
Adjustments were made for the four covariates of reading comprehension skills, prior beliefs about animal emotions, algorithmic literacy, and prior knowledge about animal emotions. Results of evaluation of the assumptions for performing this MANCOVA were satisfactory.</p> <p>Based on Pillai's criterion, the combined dependent variables were statistically significantly affected by source information, Pillai's <emph>V</emph> = 0.12, <emph>F</emph>(<reflink idref="bib9" id="ref197">9</reflink>, 250) = 3.66, <emph>p</emph> &lt; 0.001, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.117, but not by references, Pillai's <emph>V</emph> = 0.02, <emph>F</emph>(<reflink idref="bib9" id="ref198">9</reflink>, 250) = 0.61, <emph>p</emph> = 0.788, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.021, or the interaction between these two independent variables, Pillai's <emph>V</emph> = 0.04, <emph>F</emph>(<reflink idref="bib9" id="ref199">9</reflink>, 250) = 1.07, <emph>p</emph> = 0.388, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.037. Follow‐up ANCOVAs to investigate effects of source information on each dependent variable showed a statistically significant effect of source information on justification by how the text was generated, <emph>F</emph>(<reflink idref="bib1" id="ref200">1</reflink>, 258) = 8.62, <emph>p</emph> = 0.004, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.032, with participants reading the ChatGPT‐generated texts relying more on this justification criterion (<emph>M</emph> = 6.53, SE = 0.26) than did participants who read the human‐generated texts (<emph>M</emph> = 5.48, SE = 0.25). 
The only covariate uniquely adjusting participants' scores on this justification criterion was prior beliefs about animal emotions, with <emph>F</emph>(<reflink idref="bib1" id="ref201">1</reflink>, 258) = 9.84, <emph>p</emph> = 0.002, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.037 (for the other covariates, <emph>F</emph>s &lt; 2.72, <emph>p</emph>s &gt; 0.10). In addition, there was a statistically significant effect of source information on justification by "what I already know about the topic," <emph>F</emph>(<reflink idref="bib1" id="ref202">1</reflink>, 258) = 4.93, <emph>p</emph> = 0.027, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.019, with participants reading the ChatGPT‐generated texts relying more on this justification criterion (<emph>M</emph> = 6.65, SE = 0.23) than did participants who read the human‐generated texts (<emph>M</emph> = 5.92, SE = 0.23). Participants' scores on this justification criterion were uniquely adjusted by both prior beliefs, <emph>F</emph>(<reflink idref="bib1" id="ref203">1</reflink>, 258) = 8.56, <emph>p</emph> = 0.004, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.032, and prior knowledge, <emph>F</emph>(<reflink idref="bib1" id="ref204">1</reflink>, 258) = 7.00, <emph>p</emph> = 0.009, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.026, about animal emotions, but not by the other covariates (<emph>F</emph>s &lt; 1.48, <emph>p</emph>s &gt; 0.23). Finally, none of the other seven justification criteria were statistically significantly affected by source information, that is, by whether participants read ChatGPT‐generated or human‐generated texts, <emph>F</emph>s &lt; 3.08, <emph>p</emph>s &gt; 0.08. 
Figure 2 shows the means for participants' credibility justifications concerning how the text was generated and prior knowledge about the topic for animal emotions according to experimental condition.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/NRNU/01jan26/rrq70087-fig-0002.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="rrq70087-fig-0002.jpg" title="2 Estimated marginal means of use of the justification criteria of how the text was generated and prior knowledge about the topic for human‐generated and ChatGPT‐generated texts by references condition (references vs. no references) for the topic of animal emotions. Error bars represent standard errors." /> </p> <p>Next, we performed a 2 × 2 between‐subjects multivariate analysis of covariance (MANCOVA) for the animal research topic, using the same independent and dependent variables but including only reading comprehension skills and prior beliefs about the use of animal research as covariates. Results of evaluation of the assumptions for performing this MANCOVA were also satisfactory. Based on Pillai's criterion, the combined dependent variables were statistically significantly affected by source information, Pillai's <emph>V</emph> = 0.07, <emph>F</emph>(<reflink idref="bib9" id="ref205">9</reflink>, 253) = 2.13, <emph>p</emph> = 0.027, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.071, but not by references, Pillai's <emph>V</emph> = 0.03, <emph>F</emph>(<reflink idref="bib9" id="ref206">9</reflink>, 253) = 0.86, <emph>p</emph> = 0.559, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.030, or the interaction between these two independent variables, Pillai's <emph>V</emph> = 0.04, <emph>F</emph>(<reflink idref="bib9" id="ref207">9</reflink>, 253) = 1.26, <emph>p</emph> = 0.258, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.043. 
Follow‐up ANCOVAs to investigate effects of source information on each dependent variable again showed a statistically significant effect of source information on justification by how the text was generated, <emph>F</emph>(<reflink idref="bib1" id="ref208">1</reflink>, 261) = 6.46, <emph>p</emph> = 0.012, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.024, with participants reading the ChatGPT‐generated texts relying more on this justification criterion (<emph>M</emph> = 6.49, SE = 0.26) than did participants who read the human‐generated texts (<emph>M</emph> = 5.58, SE = 0.25). None of the covariates uniquely adjusted participants' scores on this justification criterion, <emph>F</emph>s &lt; 0.09, <emph>p</emph>s &gt; 0.75. In addition, there was a statistically significant effect of source information on justification by how objective (unbiased) the text was, <emph>F</emph>(<reflink idref="bib1" id="ref209">1</reflink>, 261) = 4.03, <emph>p</emph> = 0.046, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.015, with participants reading the ChatGPT‐generated texts relying more on this justification criterion (<emph>M</emph> = 7.08, SE = 0.23) than did participants who read the human‐generated texts (<emph>M</emph> = 6.43, SE = 0.22). Participants' scores on this justification criterion were uniquely adjusted by their prior beliefs about animal research, <emph>F</emph>(<reflink idref="bib1" id="ref210">1</reflink>, 261) = 4.20, <emph>p</emph> = 0.041, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.016, but not by their reading comprehension skills, <emph>F</emph>(<reflink idref="bib1" id="ref211">1</reflink>, 261) = 0.00, <emph>p</emph> = 0.999, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.000. 
Finally, none of the other seven justification criteria were statistically significantly affected by source information, that is, by whether participants read ChatGPT‐generated or human‐generated texts, <emph>F</emph>s &lt; 1.50, <emph>p</emph>s &gt; 0.22. Figure 3 shows the means for participants' credibility justifications concerning how the text was generated and text objectivity for the topic of animal research according to experimental condition.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/NRNU/01jan26/rrq70087-fig-0003.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="rrq70087-fig-0003.jpg" title="3 Estimated marginal means of use of the justification criteria of how the text was generated and how objective the text was for human‐generated and ChatGPT‐generated texts by references condition (references vs. no references) for the topic of animal research. Error bars represent standard errors." /> </p> <hd id="AN0191105826-33">Effects on Integrated Text Understanding</hd> <p>To examine our third research question, concerning potential effects of the experimental conditions on participants' integrated text understanding, we performed a 2 × 2 between‐subjects analysis of covariance (ANCOVA) using source information (ChatGPT‐generated vs. human‐generated) and references (references vs. no references) as the independent variables, scores on the integrated text understanding measure as the dependent variable, and reading comprehension skills and prior knowledge as covariates. Because the dependent variable targeted an integrated understanding of the animal emotions text and the animal research text, we used a measure that assessed prior knowledge across the two topics as a covariate in this analysis. 
This measure was created by transforming both the scores on the prior knowledge measure concerning animal emotions and the scores on the measure concerning animal research into <emph>z</emph>‐scores and then averaging the means of these <emph>z</emph>‐scores to form a composite (i.e., cross‐topic) prior knowledge score. Results of the evaluation of the assumptions for performing this ANCOVA were satisfactory.</p> <p>The results showed that there was a statistically significant main effect of source information (ChatGPT‐generated: <emph>M</emph> = 7.88, SE = 0.72; human‐generated: <emph>M</emph> = 5.74, SE = 0.70; <emph>F</emph>[<reflink idref="bib1" id="ref212">1</reflink>, 261] = 4.62, <emph>p</emph> = 0.033, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.017) but no statistically significant main effect of references (references: <emph>M</emph> = 7.06, SE = 0.70; no references: <emph>M</emph> = 6.56, SE = 0.71; <emph>F</emph>[<reflink idref="bib1" id="ref213">1</reflink>, 261] = 0.26, <emph>p</emph> = 0.613, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.001) on integrated text understanding. Also, there was no statistically significant interaction between the two independent variables, with <emph>F</emph>(<reflink idref="bib1" id="ref214">1</reflink>, 261) = 0.02, <emph>p</emph> = 0.877, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.000. Both reading comprehension skills, <emph>F</emph>(<reflink idref="bib1" id="ref215">1</reflink>, 261) = 6.82, <emph>p</emph> = 0.010, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.025, and prior knowledge, <emph>F</emph>(<reflink idref="bib1" id="ref216">1</reflink>, 261) = 18.83, <emph>p</emph> &lt; 0.001, <emph>η</emph><subs><emph>p</emph></subs><sups>2</sups> = 0.067, uniquely adjusted the integrated text understanding scores, with both covariates being positively associated with the dependent variable. 
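</p> <p>The partial eta squared values reported for this ANCOVA can be recovered from each <emph>F</emph> statistic and its degrees of freedom as (<emph>F</emph> × df effect) / (<emph>F</emph> × df effect + df error). The following minimal sketch, with a hypothetical function name that is not part of the original analysis, reproduces the reported effect sizes from the rounded <emph>F</emph> values given above.</p>

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported for the ANCOVA on integrated text understanding,
# each with degrees of freedom (1, 261).
reported_f = {
    "source information": 4.62,
    "references": 0.26,
    "reading comprehension": 6.82,
    "prior knowledge": 18.83,
}
for effect, f_value in reported_f.items():
    eta = partial_eta_squared(f_value, 1, 261)
    print(f"{effect}: partial eta squared = {eta:.3f}")
```

<p>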
Figure 4 shows the means for participants' integrated text understanding according to experimental condition.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/NRNU/01jan26/rrq70087-fig-0004.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="rrq70087-fig-0004.jpg" title="4 Estimated marginal means of integrated text understanding for human‐generated and ChatGPT‐generated texts by references condition (references vs. no references). Error bars represent standard errors." /> </p> <p>As shown in the previous section, source information had an effect on justification by how the text was generated for both topics, with participants reading the ChatGPT‐generated texts relying more on this criterion when justifying their text credibility judgments than did participants who read the human‐generated texts. Because use of this justification criterion for both topics also correlated positively with integrated text understanding, we performed a mediation analysis to explore whether the effect of source information (ChatGPT‐generated vs. human‐generated) on integrated text understanding might be mediated by participants' reliance on how the text was generated in justifying their text credibility judgments. Because participants' credibility justifications were assessed after text reading but before completing the writing task used to assess integrated text understanding (see the Procedure section), the temporal assumption needed for performing this mediation analysis also was met. In this analysis, we used a measure that assessed the use of the justification criterion across the two topics as a mediator. This measure was created by averaging the means on the justification criterion for the two topics to form a composite (i.e., cross‐topic) score for justification by how the text was generated. Participants' reading comprehension skills and cross‐topic prior knowledge were included as covariates. 
ChatGPT‐generated was coded as 1 and human‐generated was coded as 0. The resulting mediation model is displayed in Figure 5.</p> <p> <img src="https://imageserver.ebscohost.com/img/embimages/rdk/NRNU/01jan26/rrq70087-fig-0005.jpg?ephost1=dGJyMMvl7ESepq84yOvsOLCmsE6epq5Srqa4SK6WxWXS" alt="rrq70087-fig-0005.jpg" title="5 Mediation model of the effect of source information (ChatGPT‐generated vs. human‐generated) on integrated text understanding with justification by how the text was generated as a mediator (standardized coefficients). *p &lt; 0.05, **p &lt; 0.01." /> </p> <p>Using a bootstrap mediation approach with 5000 samples (Hayes [<reflink idref="bib44" id="ref217">44</reflink>]), the mediation model explained a statistically significant portion of the variance, <emph>R</emph><sups>2</sups> = 0.16, <emph>F</emph>(<reflink idref="bib4" id="ref218">4</reflink>, 262) = 12.38, <emph>p</emph> &lt; 0.001. There was a statistically significant positive effect of source information on the justification criterion (<emph>b</emph> = 0.34, SE = 0.12, <emph>p</emph> = 0.006), as well as a statistically significant positive effect of the justification criterion on integrated text understanding (<emph>b</emph> = 0.15, SE = 0.06, <emph>p</emph> = 0.011). The indirect effect of source information on integrated text understanding via the justification criterion of how the text was generated also was statistically significant, with an estimate of 0.05 (95% CI: 0.01 to 0.11). This means that reading ChatGPT‐generated texts increased the reliance on this justification criterion, which in turn improved participants' integrated text understanding scores. 
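</p> <p>To illustrate the logic of a bootstrapped indirect effect, the following minimal sketch estimates the product of the two paths (source information to mediator, and mediator to outcome with source information held constant) and a percentile confidence interval. It is not the procedure used in this study: the data are simulated, the covariates are omitted, 1000 rather than 5000 resamples are drawn, and all function names and parameter values are hypothetical.</p>

```python
import random
from statistics import mean


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for a binary predictor x (0/1)."""
    groups = (0, 1)
    m_bar = {g: mean(mi for xi, mi in zip(x, m) if xi == g) for g in groups}
    y_bar = {g: mean(yi for xi, yi in zip(x, y) if xi == g) for g in groups}
    a = m_bar[1] - m_bar[0]  # slope of the mediator on the binary predictor
    # Slope of y on the mediator with x held constant: regress group-centered y
    # on group-centered m (Frisch-Waugh-Lovell theorem).
    num = sum((mi - m_bar[xi]) * (yi - y_bar[xi]) for xi, mi, yi in zip(x, m, y))
    den = sum((mi - m_bar[xi]) ** 2 for xi, mi in zip(x, m))
    return a * (num / den)


def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Simulated data (assumed values): x = condition, m = mediator, y = outcome.
sim = random.Random(0)
x = [i % 2 for i in range(300)]
m = [1.0 * xi + sim.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.2 * xi + sim.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.2f}, 95% CI: {lower:.2f} to {upper:.2f}")
```

<p>An indirect effect is typically regarded as statistically significant when the bootstrap confidence interval does not include zero, as was the case for the interval of 0.01 to 0.11 reported above.</p> <p>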
Given that the <emph>c</emph> path, representing the effect of source information on integrated text understanding adjusted for both covariates, was statistically significant (<emph>b</emph> = 0.25, SE = 0.12, <emph>p</emph> = 0.029), whereas the c' path, representing this effect also adjusted for the inclusion of the mediating variable, was not statistically significant (<emph>b</emph> = 0.20, SE = 0.12, <emph>p</emph> = 0.081), the effect of source information on integrated text understanding can be considered fully mediated by the justification criterion. Both the covariate of reading comprehension (<emph>b</emph> = 0.16, SE = 0.06, <emph>p</emph> = 0.007) and the covariate of prior knowledge (<emph>b</emph> = 0.26, SE = 0.06, <emph>p</emph> &lt; 0.001) uniquely adjusted integrated text understanding when the mediating variable was taken into consideration.</p> <hd id="AN0191105826-36">Discussion</hd> <p>LLMs such as ChatGPT may be considered wonderful tools for gaining access to relevant information about a topic and performing challenging synthesis writing tasks. However, these models also raise a number of pertinent issues for education, requiring a rethinking of what it means to be literate in the age of AI (Kalantzis and Cope [<reflink idref="bib50" id="ref219">50</reflink>]; Robinson and Hollett [<reflink idref="bib83" id="ref220">83</reflink>]). In the new literacy landscape of LLMs, AI literacy has emerged as a potentially viable pedagogical approach to helping students use such models in more critical and constructive ways (e.g., Allen and Kendeou [<reflink idref="bib1" id="ref221">1</reflink>]; Zhong and Liu [<reflink idref="bib99" id="ref222">99</reflink>]). However, limited knowledge is thus far available about how students evaluate and use texts generated by LLMs, as compared to how they evaluate and use human‐generated texts. 
In this experimental study, we addressed these issues by investigating not only how students would judge the credibility of the two types of texts (i.e., ChatGPT‐generated vs. human‐generated) and justify their text credibility judgments, but also how they would use the two types of texts in a post‐reading integrative writing task.</p> <p>As expected, we found that when the texts were presented as generated by ChatGPT, students judged them to be less credible than when they were presented as generated by a human, with similar results obtained across the two topics (i.e., animal emotions and the use of animals in medical research). Apparently, students who read ChatGPT‐generated texts were aware of some of the epistemic risks associated with such texts, with this awareness potentially developed through popular media accounts, discussions at school, or personal experiences using the model for schoolwork or in their leisure time. Although developing a more critical stance toward ChatGPT‐generated text is consistent with the epistemic risks noted by several authors (e.g., Bråten et al. [<reflink idref="bib15" id="ref223">15</reflink>]; Hicks et al. [<reflink idref="bib46" id="ref224">46</reflink>]; Mitchell [<reflink idref="bib70" id="ref225">70</reflink>], [<reflink idref="bib71" id="ref226">71</reflink>]; Mittelstadt et al. [<reflink idref="bib72" id="ref227">72</reflink>]), it, of course, does not guarantee that students have the skill or will to perform highly needed critical evaluations of ChatGPT‐generated text when using the model for schoolwork or during their leisure time (e.g., Allen and Kendeou [<reflink idref="bib1" id="ref228">1</reflink>]; Bråten et al. [<reflink idref="bib15" id="ref229">15</reflink>]; Caulfield and Wineburg [<reflink idref="bib25" id="ref230">25</reflink>]; Federiakin et al. 
[<reflink idref="bib37" id="ref231">37</reflink>]).</p> <p>Although we also expected that students would take the presence or absence of a reference list into consideration when judging text credibility, such that including a list of relevant references after the ChatGPT‐generated texts would increase the perceived credibility of the texts and, thus, reduce the difference between ChatGPT‐ and human‐generated texts in this regard, no main or interaction effect of references was found in this study. One possibility is that students did not pay sufficient attention to the reference lists in the references condition.[<reflink idref="bib1" id="ref232">1</reflink>] Another possibility is, however, that participants did not fully realize the importance of transparency concerning the sources in the context of ChatGPT‐generated texts, either because they were not familiar with this particular issue with such models or because they did not consider the inclusion of references an essential credibility indicator more generally. Yet another possibility is that because the references included in the lists may have been unknown to the participants, they were not able to judge their relevance or appropriateness in the current task context. Further research using other methods, such as post‐reading interviews, may clarify these possibilities. Moreover, experimental work that, for example, manipulates the salience or elaborateness of the source information (i.e., references) or tries to promote students' source knowledge and corresponding source evaluation strategies (Kiili et al. 
[<reflink idref="bib53" id="ref233">53</reflink>]) when processing and using ChatGPT‐generated texts may shed further light on the potential role of references in this context.</p> <p>As we expected, the close‐ended justifications showed that when the texts were presented as generated by ChatGPT, students justified their credibility judgments more by referring to how the texts were generated than when they were presented as generated by a human, with this finding also similar across the two topics. This finding is consistent with the assumption that a text's origin or source is more salient when it is generated by an LLM than when it is generated by a human, which may be due to the fact that students are (still) much more used to reading human‐generated than model‐generated texts both in and out of school. As we also expected, across the two topics, the open‐ended justifications suggested that students considered references to be more important for judging the credibility of ChatGPT‐generated text than for judging the credibility of human‐generated text, with students apparently also putting more emphasis on references when reference lists were included in the texts that they read. It is uncertain, however, whether students actually took references into consideration when performing their text credibility judgments, as no main effect of references (i.e., references vs. no references) or an interaction of references with source information (i.e., ChatGPT‐generated vs. human‐generated texts) on the text credibility judgments was observed (see above). This raises the question of whether students reflected on the potential importance of references in hindsight only, prompted by the open‐ended questions about their justifications for their text credibility judgments, rather than relying on this criterion spontaneously when performing their credibility judgments. 
Again, only further research using other methods, such as think‐alouds during text credibility judgments, can clarify this issue.</p> <p>With respect to integrated text understanding as assessed with a post‐reading writing task, the alternative hypothesis that students reading ChatGPT‐generated texts might outperform those reading human‐generated texts was supported. Considered together with the findings that, across the two topics, students trusted the ChatGPT‐generated texts less than the human‐generated texts and also justified their credibility judgments of the ChatGPT‐generated texts more in terms of how the texts were generated, this may suggest that students engaged more critically and deeply with the ChatGPT‐generated texts during reading. This suggestion seems consistent with Sedlbauer et al. ([<reflink idref="bib87" id="ref234">87</reflink>]), who in a qualitative study found that students considered ChatGPT‐generated output problematic and in need of critical evaluation, which the students actually considered beneficial in promoting their critical thinking about the topic in question. Our finding that students' reliance on the source‐oriented justification criterion of how the text was generated seemed to mediate the positive effect of ChatGPT‐generated texts on integrated text understanding also may suggest that students' increased attention to a relevant source feature contributed to cross‐text integration, which is consistent with much prior research on the integrated understanding of multiple human‐generated texts (Bråten et al. [<reflink idref="bib18" id="ref235">18</reflink>], [<reflink idref="bib15" id="ref236">15</reflink>]; McCrudden et al. 
[<reflink idref="bib67" id="ref237">67</reflink>]).</p> <p>Of course, the findings of this study are limited by the participants that we recruited, the particular texts and topics that we used, the particular source of the human‐generated text, and the way we operationalized and measured our outcome variables. First, because our study included a particular group of high‐school students in a particular context at a particular point of time, further research is needed to probe the generalizability of our findings across populations, contexts, and time. In particular, the frequency of use of ChatGPT among the Norwegian participants, as well as the level of integration of this LLM in their educational context (see the Participants and Context section), may limit generalizability to other participants being less (or more) familiar with ChatGPT. Replications across educational contexts and cultures are therefore highly needed. Second, to achieve experimental control, ecological validity was limited by having students work with only two preselected texts instead of completing more authentic literacy tasks involving ChatGPT‐generated and human‐generated texts. Because the human‐generated texts were presented as written by a journalist in a well‐known mainstream newspaper, it is also an open question to what extent texts written by another author and published in another outlet would have altered the findings. We chose this newspaper because we considered it a source that the students would regularly access for information and regard as fairly neutral in terms of political ideology, with journalists in this newspaper also expected to be working according to recognized journalistic principles. 
As such, the human‐generated texts that we used in the current study presumably included several source cues that may have influenced students' credibility judgments, with many other human‐generated texts from other sources (e.g., with a strong ideological bias) possibly faring less well in comparison with ChatGPT‐generated texts. Third, the writing task that we used as an outcome measure focused on representation and integration of ideas included in the two texts rather than critique of the textual information, such as by questioning the accuracy and completeness of the information included in the two texts (List [<reflink idref="bib63" id="ref238">63</reflink>]). Finally, although all effects of experimental manipulations were obtained after controlling for relevant individual differences, other individual differences could be included in future research, for example, motivational factors such as topic interest and cognitive factors such as reading comprehension strategies.</p> <p>Despite such limitations and all the future work needed to overcome them, we sincerely believe that the insights we gained by means of our participants and methodology may have not only theoretical but also practical implications. By contributing to a better understanding of how students actually may evaluate and work with texts generated by LLMs, as opposed to human‐generated texts, our findings may inform researchers and educators aiming to promote adaptive use of LLMs in educational contexts about the potential strengths and weaknesses of students' current approaches. Whereas students may put less trust in model‐generated texts due to the way they are created, it may be possible to harness and elaborate their critical stance toward such texts in the service of deeper and more integrated understanding of the issues discussed in the texts. 
For example, this might be done by means of a contrasting cases method (Schwartz and Bransford [<reflink idref="bib86" id="ref239">86</reflink>]) in which students are presented with and asked to compare and contrast two cases that illustrate more critical versus more naïve approaches to LLMs and their output. In classroom contexts, independent study of such contrasting cases could be combined with group and whole‐class discussions in preparation for integrative writing tasks that draw on model‐generated texts. Another possibility is to use a contrasting cases approach to compare and contrast the use of LLMs and traditional web search to promote understanding of when the latter actually may be a more adaptive strategy (e.g., when the goal is to build deeper knowledge of a complex topic involving different or even conflicting perspectives; Melumad and Yun [<reflink idref="bib68" id="ref240">68</reflink>]). In such ways, educators could try to foster critical and flexible use of LLMs with the aim of striking a synergistic balance between mind and model.</p> <hd id="AN0191105826-37">Funding</hd> <p>The authors have nothing to report.</p> <hd id="AN0191105826-38">Ethics Statement</hd> <p>The collection and handling of the data were approved by the Norwegian Social Science Data Services.</p> <hd id="AN0191105826-39">Consent</hd> <p>Informed consent was obtained from all participants.</p> <hd id="AN0191105826-40">Conflicts of Interest</hd> <p>The authors declare no conflicts of interest.</p> <hd id="AN0191105826-41">Data Availability Statement</hd> <p>The data that support the findings of this study are available on request from the corresponding author. 
The data are not publicly available due to privacy or ethical restrictions.</p> <p>GRAPH: Data S1: rrq70087‐sup‐0001‐DataS1.docx.</p> <ref id="AN0191105826-42"> <title> Footnotes </title> <blist> <bibl id="bib1" idref="ref35" type="bt">1</bibl> <bibtext> We conducted a manipulation check by asking participants in the references condition to rate how carefully they had checked the reference lists following the two texts on a 10‐point scale, obtaining a mean score of 4.99 (SD = 2.95) and no statistically significant difference between ChatGPT‐ and human‐generated texts, <emph>t</emph>(134) = 0.81, <emph>p</emph> = 0.418.</bibtext> </blist> </ref> <ref id="AN0191105826-43"> <title> References </title> <blist> <bibtext> Allen, L. K., and P. Kendeou. 2024. " ED‐AI Lit: An Interdisciplinary Framework for AI Literacy in Education." Policy Insights From the Behavioral and Brain Sciences 11, no. 1 : 3 – 10. https://doi.org/10.1177/23727322231220339.</bibtext> </blist> <blist> <bibl id="bib2" idref="ref117" type="bt">2</bibl> <bibtext> Allen, M. S., D. Iliescu, and S. Greiff. 2022. " Single Item Measures in Psychological Science." European Journal of Psychological Assessment 38, no. 1 : 1 – 5. https://doi.org/10.1027/1015‐5759/a000699.</bibtext> </blist> <blist> <bibl id="bib3" idref="ref36" type="bt">3</bibl> <bibtext> Almatrafi, O., A. Johri, and H. Lee. 2024. " A Systematic Review of AI Literacy Conceptualization, Constructs, and Implementation and Assessment Efforts (2019‐2023)." Computers and Education Open 6 : 100173. https://doi.org/10.1016/j.caeo.2024.100173.</bibtext> </blist> <blist> <bibl id="bib4" idref="ref129" type="bt">4</bibl> <bibtext> Andresen, A., Ø. Anmarkrud, and I. Bråten. 2019. " Investigating Multiple Source Use Among Students with and Without Dyslexia." Reading and Writing 32, no. 5 : 1149 – 1174. https://doi.org/10.1007/s11145‐018‐9904‐z.</bibtext> </blist> <blist> <bibl id="bib5" idref="ref82" type="bt">5</bibl> <bibtext> Barzilai, S., and C. A. 
Chinn. 2025. " How Do Source Evaluation Criteria Develop? A Microgenetic Study of Growth of Epistemic Ideals." Computers in Human Behavior 172 : 108729. https://doi.org/10.1016/j.chb.2025.108729.</bibtext> </blist> <blist> <bibl id="bib6" idref="ref34" type="bt">6</bibl> <bibtext> Bauer, E., S. Greiff, A. C. Graesser, K. Scheiter, and M. Sailer. 2025. " Looking Beyond the Hype: Understanding the Effects of AI on Learning." Educational Psychology Review 37 : 45. https://doi.org/10.1007/s10648‐025‐10020‐8.</bibtext> </blist> <blist> <bibl id="bib7" idref="ref17" type="bt">7</bibl> <bibtext> Bayne, T., and I. Williams. 2023. " The Turing Test Is Not a Good Benchmark for Thought in LLMs." Nature Human Behaviour 7, no. 11 : 1806 – 1807. https://doi.org/10.1038/s41562‐023‐01710‐w.</bibtext> </blist> <blist> <bibl id="bib8" idref="ref89" type="bt">8</bibl> <bibtext> Bjaaland, I. F., M. M. Nilsen, G. Guajardo, and M. S. Hauge. 2025. Study Survey 2024. Norwegian Agency for Quality Assurance in Education.</bibtext> </blist> <blist> <bibl id="bib9" idref="ref115" type="bt">9</bibl> <bibtext> Björnsson, C. H. 1968. Läsbarhet [Readability]. Liber.</bibtext> </blist> <blist> <bibtext> Braasch, J. L. G., and I. Bråten. 2017. " The Discrepancy‐Induced Source Comprehension (D‐ISC) Model: Basic Assumptions and Preliminary Evidence." Educational Psychologist 52, no. 3 : 167 – 181. https://doi.org/10.1080/00461520.2017.1323219.</bibtext> </blist> <blist> <bibtext> Braasch, J. L. G., I. Bråten, H. I. Strømsø, Ø. Anmarkrud, and L. E. Ferguson. 2013. " Promoting Secondary School Students' Evaluation of Source Features of Multiple Documents." Contemporary Educational Psychology 38, no. 3 : 180 – 195. https://doi.org/10.1016/j.cedpsych.2013.03.003.</bibtext> </blist> <blist> <bibtext> Bråten, I., E. W. Brante, and H. I. Strømsø. 2018a. " What Really Matters: The Role of Behavioral Engagement in Multiple Document Literacy Tasks." Journal of Research in Reading 41, no. 4 : 680 – 699. 
https://doi.org/10.1111/1467‐9817.12247.</bibtext> </blist> <blist> <bibtext> Bråten, I., E. W. Brante, and H. I. Strømsø. 2019. " Teaching Sourcing in Upper‐Secondary School: A Comprehensive Sourcing Intervention with Follow‐Up Data." Reading Research Quarterly 54, no. 4 : 481 – 505. https://doi.org/10.1002/rrq.253.</bibtext> </blist> <blist> <bibtext> Bråten, I., Y. E. Haverkamp, and Ø. Anmarkrud. 2025a. " Gaining a Deeper Understanding of the Deep Cloze Reading Comprehension Test: Examining Potential Contributors and Consequences." Reading and Writing 38, no. 2 : 425 – 446. https://doi.org/10.1007/s11145‐024‐10521‐y.</bibtext> </blist> <blist> <bibtext> Bråten, I., N. Latini, and L. Salmerón. 2025b. " Source Evaluation." In Handbook of Writing Research, edited by C. A. MacArthur, S. Graham, and J. Fitzgerald, 3rd ed., 308 – 323. Guilford.</bibtext> </blist> <blist> <bibtext> Bråten, I., M. T. McCrudden, E. Stang Lund, E. W. Brante, and H. I. Strømsø. 2018b. " Task‐Oriented Learning with Multiple Documents: Effects of Topic Familiarity, Author Expertise, and Content Relevance on Document Selection, Processing, and Use." Reading Research Quarterly 53, no. 3 : 345 – 365. https://doi.org/10.1002/rrq.197.</bibtext> </blist> <blist> <bibtext> Bråten, I., O. Skovdahl, Ø. Anmarkrud, and H. I. Strømsø. 2025c. Does Reading Still Make You Smarter? It depends. Reading and Writing. Advance online publication. https://doi.org/10.1007/s11145‐025‐10668‐2.</bibtext> </blist> <blist> <bibtext> Bråten, I., M. Stadtler, and L. Salmerón. 2018c. " The Role of Sourcing in Discourse Comprehension." In Handbook of Discourse Processes, edited by M. F. Schober, M. A. Britt, and D. N. Rapp, 2nd ed., 141 – 166. Routledge.</bibtext> </blist> <blist> <bibtext> Bråten, I., H. I. Strømsø, and R. Andreassen. 2016. " Sourcing in Professional Education: Do Text Factors Make Any Difference? " Reading and Writing 29, no. 8 : 1599 – 1628. 
https://doi.org/10.1007/s11145‐015‐9611‐y.</bibtext> </blist> <blist> <bibtext> Bråten, I., H. I. Strømsø, and M. A. Britt. 2009. " Trust Matters: Examining the Role of Source Evaluation in Students' Construction of Meaning Within and Across Multiple Texts." Reading Research Quarterly 44, no. 1 : 6 – 28. https://doi.org/10.1598/RRQ.44.1.1.</bibtext> </blist> <blist> <bibtext> Bråten, I., H. I. Strømsø, and L. Salmerón. 2011. " Trust and Mistrust When Students Read Multiple Information Sources About Climate Change." Learning and Instruction 21, no. 2 : 180 – 192. https://doi.org/10.1016/j.learninstruc.2010.02.002.</bibtext> </blist> <blist> <bibtext> Britt, M. A., and C. Aglinskas. 2002. " Improving Students' Ability to Identify and Use Source Information." Cognition and Instruction 20, no. 4 : 485 – 522. https://doi.org/10.1207/S1532690XCI2004_2.</bibtext> </blist> <blist> <bibtext> Britt, M. A., and J. Sommer. 2004. " Facilitating Textual Integration With Macro‐Structure Focusing Tasks." Reading Psychology 25, no. 4 : 313 – 339. https://doi.org/10.1080/02702710490522658.</bibtext> </blist> <blist> <bibtext> Cao, Y., S. Li, Y. Liu, et al. 2023. "A Comprehensive Survey of AI‐Generated Content (AIGC): A History of Generative AI From GAN to ChatGPT." https://arxiv.org/pdf/2303.04226.</bibtext> </blist> <blist> <bibtext> Caulfield, M., and S. Wineburg. 2023. Verified. University of Chicago Press.</bibtext> </blist> <blist> <bibtext> Chemero, A. 2023. " LLMs Differ From Human Cognition Because They Are Not Embodied." Nature Human Behaviour 7, no. 11 : 1845 – 1854. https://doi.org/10.1038/s41562‐023‐01723‐5.</bibtext> </blist> <blist> <bibtext> Chiu, T. K. F., Z. Ahmad, M. Ismailov, and I. T. Sanusi. 2024. " What Are Artificial Intelligence Literacy and Competency? A Comprehensive Framework to Support Them." Computers and Education Open 6 : 100171. https://doi.org/10.1016/j.caeo.2024.100171.</bibtext> </blist> <blist> <bibtext> Christhilf, K., A. Potter, J. P. 
Magliano, K. S. McCarthy, L. K. Allen, and D. S. McNamara. 2025. " Constructed Responses as a Window Into Strategic Processing: The Role of Prompts in Multiple‐Document Reading." Discourse Processes : 1 – 25. https://doi.org/10.1080/0163853X.2025.2578594.</bibtext> </blist> <blist> <bibtext> Coiro, J. 2021. " Toward a Multifaceted Heuristic of Digital Reading to Inform Assessment, Research, Practice, and Policy." Reading Research Quarterly 56, no. 1 : 9 – 31. https://doi.org/10.1002/rrq.302.</bibtext> </blist> <blist> <bibtext> Deng, R., M. Jiang, X. Yu, Y. Lu, and S. Liu. 2025. " Does ChatGPT Enhance Student Learning? A Systematic Review and Meta‐Analysis of Experimental Studies." Computers and Education 227 : 105224. https://doi.org/10.1016/j.compedu.2024.105224.</bibtext> </blist> <blist> <bibtext> Dennis, S. 2024. " Transformer Models as Predication Machines." Discourse Processes 61, no. 6–7 : 355 – 358. https://doi.org/10.1080/0163853X.2024.2362038.</bibtext> </blist> <blist> <bibtext> Dietrich, L. K., and S. Grassini. 2025. " Assessing ChatGPT Acceptance and Use in Education: A Comparative Study Among German‐Speaking Students and Teachers." Education and Information Technologies 30, no. 15 : 22151 – 22176. https://doi.org/10.1007/s10639‐025‐13658‐7.</bibtext> </blist> <blist> <bibtext> Dogruel, L., P. Masur, and S. Joeckel. 2022. " Development and Validation of an Algorithm Literacy Scale for Internet Users." Communication Methods and Measures 16, no. 2 : 115 – 133. https://doi.org/10.1080/19312458.2021.1968361.</bibtext> </blist> <blist> <bibtext> Duarte, P. 2025. "Number of ChatGPT Users (July 2025)." explodingtopics.com/blog/chatgpt‐users.</bibtext> </blist> <blist> <bibtext> Eager, B., and R. Brunton. 2023. " Prompting Higher Education Towards AI‐Augmented Teaching and Learning Practice." Journal of University Teaching and Learning Practice 20, no. 5 : 2. https://doi.org/10.53761/1.20.5.02.</bibtext> </blist> <blist> <bibtext> Elstad, E., and H. Eriksen. 
2024. " High School Teachers' Adoption of Generative AI: Antecedents of Instructional AI Utility in the Early Stages of School‐Specific Chatbot Implementation." Nordic Journal of Comparative and International Education (NJCIE) 8, no. 1 : 5736. https://doi.org/10.7577/njcie.5736.</bibtext> </blist> <blist> <bibtext> Federiakin, D., D. Molerov, O. Zlatkin‐Troitschanskaia, and A. Maur. 2024. " Prompt Engineering as a New 21st Century Skill." Frontiers in Education 9 : 1366434. https://doi.org/10.3389/feduc.2024.1366434.</bibtext> </blist> <blist> <bibtext> Field, A. 2018. Discovering Statistics Using IBM SPSS Statistics. 5th ed. Sage.</bibtext> </blist> <blist> <bibtext> Frankfurt, H. 2005. On Bullshit. Princeton.</bibtext> </blist> <blist> <bibtext> Gil, L., I. Bråten, E. Vidal‐Abarca, and H. I. Strømsø. 2010. " Summary Versus Argument Tasks When Working with Multiple Documents: Which Is Better for Whom? " Contemporary Educational Psychology 35, no. 3 : 157 – 173. https://doi.org/10.1016/j.cedpsych.2009.11.002.</bibtext> </blist> <blist> <bibtext> Glenberg, A., and C. R. Jones. 2023. It Takes a Body to Understand the World—Why ChatGPT and Other Language AIs Don't Understand What They're Saying. The Conversation. https://theconversation.com.</bibtext> </blist> <blist> <bibtext> Gruenhagen, J. H., P. M. Sinclair, J.‐A. Carroll, P. R. A. Baker, A. Wilson, and D. Demant. 2024. " The Rapid Rise of Generative AI and Its Implication for Academic Integrity: Students Perceptions and Use of Chatbots for Assistance With Assessments." Computers and Education: Artificial Intelligence 7 : 100273. https://doi.org/10.1016/j.caeai.2024.100273.</bibtext> </blist> <blist> <bibtext> Haverkamp, Y. E., I. Bråten, N. Latini, and H. I. Strømsø. 2024. " Effects of Media Multitasking on the Processing and Comprehension of Multiple Documents: Does Main Idea Summarization Make a Difference? " Contemporary Educational Psychology 77 : 102271. 
https://doi.org/10.1016/j.cedpsych.2024.102271.</bibtext> </blist> <blist> <bibtext> Hayes, A. F. 2022. Introduction to Mediation, Moderation, and Conditional Process Analysis: A Regression‐Based Approach. 3rd ed. Guilford.</bibtext> </blist> <blist> <bibtext> Heikkilä, M. 2025. The 'Hallucinations' That Haunt AI: Why Chatbots Struggle to Tell the Truth. Financial Times. https://<ulink href="http://www.ft.com/content/7a4e7eae-f004-486a-987f-4a2e4dbd34fb">www.ft.com/content/7a4e7eae-f004-486a-987f-4a2e4dbd34fb</ulink>.</bibtext> </blist> <blist> <bibtext> Hicks, M. T., J. Humphries, and J. Slater. 2024. " ChatGPT Is Bullshit." Ethics and Information Technology 26 : 38. https://doi.org/10.1007/s10676‐024‐09775‐5.</bibtext> </blist> <blist> <bibtext> Higgs, J. M., and A. Stornaiuolo. 2024. " Being Human in the Age of Generative AI: Young People's Ethical Concerns About Writing and Living With Machines." Reading Research Quarterly 59, no. 4 : 632 – 650. https://doi.org/10.1002/rrq.552.</bibtext> </blist> <blist> <bibtext> Incognito, O., and C. Tarchi. 2024. " The Association Between Sourcing Skills and Intertextual Integration in Lower Secondary School Students." European Journal of Psychology of Education 39 : 1485 – 1500. https://doi.org/10.1007/s10212‐023‐00750‐0.</bibtext> </blist> <blist> <bibtext> Jensen, K. L., and C. Elbro. 2022. " Clozing in on Reading Comprehension: A Deep Cloze Test of Global Inference Making." Reading and Writing 35, no. 5 : 1221 – 1237. https://doi.org/10.1007/s11145‐021‐10230‐w.</bibtext> </blist> <blist> <bibtext> Kalantzis, M., and B. Cope. 2025. " Literacy in the Time of Artificial Intelligence." Reading Research Quarterly 60, no. 1 : 591. https://doi.org/10.1002/rrq.591.</bibtext> </blist> <blist> <bibtext> Kamoun, F., W. El Ayeb, I. Jabri, S. Sifi, and F. Iqbal. 2024. 
" Exploring Students' and Faculty's Knowledge, Attitudes, and Perceptions Towards ChatGPT: A Cross‐Sectional Empirical Study." Journal of Information Technology Education: Research 23 : 5239. https://doi.org/10.28945/5239.</bibtext> </blist> <blist> <bibtext> Kasneci, E., K. Sessler, S. Kücherman, et al. 2023. " ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education." Learning and Individual Differences 103 : 102274. https://doi.org/10.1016/j.lindif.2023.102274.</bibtext> </blist> <blist> <bibtext> Kiili, C., I. Bråten, and J. Coiro. 2026. "The Framework of Prior Knowledge in Online Credibility Evaluation." Manuscript under revision.</bibtext> </blist> <blist> <bibtext> Kiili, C., I. Bråten, H. I. Strømsø, M. S. Hagerman, E. Räikkönen, and A. Jyrkiäinen. 2022. " Adolescents' Credibility Justifications When Evaluating Online Texts." Education and Information Technologies 27, no. 6 : 7421 – 7450. https://doi.org/10.1007/s10639‐022‐10907‐x.</bibtext> </blist> <blist> <bibtext> Kiili, C., D. J. Leu, J. Utrainen, et al. 2018. " Reading to Learn From Online Information: Modeling the Factor Structure." Journal of Literacy Research 50, no. 3 : 304 – 334. https://doi.org/10.1177/1086296X18784640.</bibtext> </blist> <blist> <bibtext> Kiili, C., E. Räikkönen, I. Bråten, H. I. Strømsø, and M. S. Hagerman. 2023. " Examining the Structure of Credibility Evaluation When Sixth Graders Read Online Texts." Journal of Computer Assisted Learning 39, no. 3 : 954 – 969. https://doi.org/10.1111/jcal.12779.</bibtext> </blist> <blist> <bibtext> Kiili, C., H. I. Strømsø, I. Bråten, J. Ruotsalainen, and E. Räikkönen. 2024. " Reading Comprehension Skills and Prior Knowledge Serve as Resources When Adolescents Justify the Credibility of Online Texts." Reading Psychology 45, no. 7 : 662 – 689. https://doi.org/10.1080/02702711.2024.2351485.</bibtext> </blist> <blist> <bibtext> Kintsch, W. 1998. Comprehension: A Paradigm for Cognition. 
Cambridge University Press.</bibtext> </blist> <blist> <bibtext> Kosmyna, N., E. Hauptmann, Y. T. Yuan, et al. 2025. " Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Task." arXiv. https://doi.org/10.48550/arXiv.2506.08872.</bibtext> </blist> <blist> <bibtext> Latini, N., I. Bråten, Ø. Anmarkrud, and L. Salmerón. 2019. " Investigating Effects of Reading Medium and Reading Purpose on Behavioral Engagement and Textual Integration in a Multiple Document Context." Contemporary Educational Psychology 59 : article 101797. https://doi.org/10.1016/j.cedpsych.2019.101797.</bibtext> </blist> <blist> <bibtext> Laun, M., and F. Wolff. 2025. " Chatbots in Education: Hype or Help? A Meta‐Analysis." Learning and Individual Differences 119 : 102646. https://doi.org/10.1016/j.lindif.2025.102646.</bibtext> </blist> <blist> <bibtext> Lermann Henestrosa, A., and J. Kimmerle. 2024. " The Effects of Assumed AI vs. Human Authorship on the Perception of a GPT‐Generated Text." Journalism and Media 5 : 1085 – 1097. https://doi.org/10.3390/journalmedia5030069.</bibtext> </blist> <blist> <bibtext> List, A. 2025. " Integrating Prior Knowledge and Multiple Texts: Expanding the Documents Model Framework." Reading and Writing : 1 – 26. https://doi.org/10.1007/s11145‐025‐10641‐z.</bibtext> </blist> <blist> <bibtext> Long, D., and B. Magerko. 2020. " What Is AI Literacy? Competencies and Design Considerations." In CHI'20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1 – 14. Association for Computing Machinery.</bibtext> </blist> <blist> <bibtext> Macedo‐Rouet, M., A. Potocki, L. Scharrer, et al. 2019. " How Good Is This Page? Benefits and Limits of Prompting on Adolescents' Evaluation of Web Information Quality." Reading Research Quarterly 54, no. 3 : 299 – 321. https://doi.org/10.1002/rrq.241.</bibtext> </blist> <blist> <bibtext> McCarthy, K. S., and E. F. Yan. 2024. 
" Reading Comprehension and Constructive Learning: Policy Considerations in the Age of Artificial Intelligence." Policy Insights From the Behavioral and Brain Sciences 11, no. 1 : 19 – 26. https://doi.org/10.1177/23727322231218891.</bibtext> </blist> <blist> <bibtext> McCrudden, M. T., I. Bråten, and L. Salmerón. 2023. " Learning from Multiple Texts." In International Encyclopedia of Education, edited by R. J. Tierney, F. Rizvi, and K. Ercikan, vol. 6, 4th ed., 353 – 363. Elsevier.</bibtext> </blist> <blist> <bibtext> Melumad, S., and J. H. Yun. 2025. " Experimental Evidence of the Effects of Large Language Models Versus Web Search on Depth of Learning." PNAS Nexus 4 : pgaf316. https://doi.org/10.1093/pnasnexus/pgaf316.</bibtext> </blist> <blist> <bibtext> Milmo, D. 2025. Norwegian Files Complaint After ChatGPT Falsely Said He Had Murdered His Children. Guardian. https://<ulink href="http://www.theguardian.com/technology/2025/mar/21/norwegian&amp;#8208;files&amp;#8208;complaint&amp;#8208;after&amp;#8208;chatgpt&amp;#8208;falsely&amp;#8208;said&amp;#8208;he&amp;#8208;had&amp;#8208;murdered&amp;#8208;his&amp;#8208;children">www.theguardian.com/technology/2025/mar/21/norwegian&amp;#8208;files&amp;#8208;complaint&amp;#8208;after&amp;#8208;chatgpt&amp;#8208;falsely&amp;#8208;said&amp;#8208;he&amp;#8208;had&amp;#8208;murdered&amp;#8208;his&amp;#8208;children</ulink>.</bibtext> </blist> <blist> <bibtext> Mitchell, M. 2023. " How Do We Know How Smart AI Systems Are? " Science 381 : 5957. https://doi.org/10.1126/science.adj5957.</bibtext> </blist> <blist> <bibtext> Mitchell, M. 2025. " Why AI Chatbots Lie to Us." Science 389 : 3922. https://doi.org/10.1126/science.aea3922.</bibtext> </blist> <blist> <bibtext> Mittelstadt, B., S. Wachter, and C. Russell. 2023. " To Protect Science, We Must Use LLMs as Zero‐Shot Translators." Nature Human Behaviour 7, no. 11 : 1830 – 1832. https://doi.org/10.1038/s41562‐023‐01744‐0.</bibtext> </blist> <blist> <bibtext> Moghavvemi, S., and F. 
A. Jam. 2025. " Unraveling the Influential Factors Driving Persistent Adoption of ChatGPT in Learning Environments." Education and Information Technologies 30, no. 15 : 22443 – 22470. https://doi.org/10.1007/s10639‐025‐13662‐x.</bibtext> </blist> <blist> <bibtext> Morrell‐Mengual, V., O. Fernández‐Garcia, C. Berenguer, J. Ortega‐Barón, M. D. Gil‐Llario, and V. Estruch‐García. 2025. " Characteristics, Motivations and Attitudes of Students Using ChatGPT and Other Language Model‐Based Chatbots in Higher Education." Education and Information Technologies 30, no. 15 : 22257 – 22274. https://doi.org/10.1007/s10639‐025‐13650‐1.</bibtext> </blist> <blist> <bibtext> Nazaretsky, T., P. Mejia‐Domenzain, V. Swamy, J. Frej, and T. Käser. 2025. " The Critical Role of Trust in Adopting AI‐Powered Educational Technology for Learning: An Instrument for Measuring Student Perceptions." Computers and Education: Artificial Intelligence 8 : 100368. https://doi.org/10.1016/j.caeai.2025.100368.</bibtext> </blist> <blist> <bibtext> Nichols, T. P., A. Thrall, J. Quiros, and E. Dixon‐Román. 2024. " Speculative Capture: Literacy After Platformization." Reading Research Quarterly 59, no. 2 : 211 – 218. https://doi.org/10.1002/rrq.535.</bibtext> </blist> <blist> <bibtext> Obreja, D. M., R. Rughinis, and D. Rosner. 2025. " Mapping the Multidimensional Trend of Generative AI: A Bibliometric Analysis and Qualitative Thematic Review." Computers in Human Behavior Reports 17 : 100576. https://doi.org/10.1016/j.chbr.2024.100576.</bibtext> </blist> <blist> <bibtext> OECD. 2024. "OECD Survey on Drivers of Trust in Public Institutions 2024. Results—Country Notes: Norway." 
https://<ulink href="http://www.oecd.org/en/publications/2024/06/oecd-survey-on-drivers-of-trust-in-public-institutions-2024-results-country-notes%5f33192204/norway%5f417608c9.html">www.oecd.org/en/publications/2024/06/oecd-survey-on-drivers-of-trust-in-public-institutions-2024-results-country-notes%5f33192204/norway%5f417608c9.html</ulink>.</bibtext> </blist> <blist> <bibtext> PricewaterhouseCoopers. 2025. "Evaluering av Bruk av Kunstig Intelligens i Tromsø Kommune (Evaluation of the Use of Artificial Intelligence in Tromsø Municipality)." https://<ulink href="http://www.pwc.no/no/publikasjoner/evaluering-av-bruk-av-kunstig-intelligens-i-tromso-kommune.pdf">www.pwc.no/no/publikasjoner/evaluering-av-bruk-av-kunstig-intelligens-i-tromso-kommune.pdf</ulink>.</bibtext> </blist> <blist> <bibtext> Ravselj, D., D. Kerzic, N. Tomazevic, et al. 2025. " Higher Education Students' Perceptions of ChatGPT: A Global Study of Early Reactions." PLoS One 20, no. 2 : 0315011. https://doi.org/10.1371/journal.pone.0315011.</bibtext> </blist> <blist> <bibtext> Richter, T., and J. Maier. 2017. " Comprehension of Multiple Documents With Conflicting Information: A Two‐Step Model of Validation." Educational Psychologist 52, no. 3 : 148 – 166. https://doi.org/10.1080/00461520.2017.1322968.</bibtext> </blist> <blist> <bibtext> Richter, T., H. Münchow, and J. Abendroth. 2020. " The Role of Validation in Integrating Multiple Perspectives." In Handbook of Learning From Multiple Representations and Perspectives, edited by P. van Meter, A. List, D. Lombardi, and P. Kendeou, 259 – 275. 
Routledge.</bibtext> </blist> <blist> <bibtext> Robinson, B., and Y. Hollett. 2024. "Literacy in the Age of AI." Reading Research Quarterly 59, no. 4: 555–559. https://doi.org/10.1002/rrq.581.</bibtext> </blist> <blist> <bibtext> Rowsell, J. 2025. The Comfort of Screens: Literacy in Postdigital Times. Cambridge University Press.</bibtext> </blist> <blist> <bibtext> Savvidou, S. M., I.-A. Diakidoy, and L. Mason. 2025. "Multiple-Text Comprehension and Evaluation: The Influence of Reading Goal, Belief Consistency, and Argument Type." Reading Research Quarterly 60, no. 1: e568. https://doi.org/10.1002/rrq.568.</bibtext> </blist> <blist> <bibtext> Schwartz, D. L., and J. D. Bransford. 1998. "A Time for Telling." Cognition and Instruction 16, no. 4: 475–522. https://doi.org/10.1207/s1532690xci1604_4.</bibtext> </blist> <blist> <bibtext> Sedlbauer, J., J. Cincera, M. Slavik, and A. Hartlova. 2024. "Students' Reflections on Their Experience With ChatGPT." Journal of Computer Assisted Learning 40, no. 4: 15–26. https://doi.org/10.1111/jcal.12967.</bibtext> </blist> <blist> <bibtext> Shanahan, M. 2022. "Talking About Large Language Models." Computation and Language 67: 68–79. https://doi.org/10.48550/arXiv.2212.03551.</bibtext> </blist> <blist> <bibtext> Stadler, M., M. Bannert, and M. Sailer. 2024. "Cognitive Ease at a Cost: LLMs Reduce Mental Effort but Compromise Depth in Student Scientific Inquiry." Computers in Human Behavior 160: 108386. https://doi.org/10.1016/j.chb.2024.108386.</bibtext> </blist> <blist> <bibtext> Stojanov, A., Q. Liu, and J. H. L. Koh. 2024. "University Students' Self-Reported Reliance on ChatGPT for Learning: A Latent Profile Analysis." Computers and Education: Artificial Intelligence 6: 100243. https://doi.org/10.1016/j.caeai.2024.100243.</bibtext> </blist> <blist> <bibtext> Suriano, R., A. Plebe, A. Acciai, and R. A. Fabio. 2025. "Student Interaction With ChatGPT Can Promote Complex Critical Thinking Skills." 
Learning and Instruction 95: 102011. https://doi.org/10.1016/j.learninstruc.2024.102011.</bibtext> </blist> <blist> <bibtext> Tabachnick, B. G., and L. S. Fidell. 2014. Using Multivariate Statistics. 6th ed. Pearson.</bibtext> </blist> <blist> <bibtext> Thomm, E., and R. Bromme. 2012. "'It Should at Least Seem Scientific!' Textual Features of 'Scientificness' and Their Impact on Lay Assessments of Online Information." Science Education 96, no. 2: 187–211. https://doi.org/10.1002/sce.20480.</bibtext> </blist> <blist> <bibtext> Tromsø Kommune. 2025. "Ny Barnehage- og Skolestruktur—Kunnskapsgrunnlag (New Kindergarten and School Structure—Knowledge Base)." https://tromso.kommune.no/sites/default/files/2025-02/Kunnskapsgrunnlag%20-%20Ny%20barnehage-%20og%20skolestruktur.pdf?v=581.</bibtext> </blist> <blist> <bibtext> Wechsler, D. 2008. Wechsler Adult Intelligence Scale—Fourth Edition. NCS Pearson.</bibtext> </blist> <blist> <bibtext> Weidlich, J., D. Gasevic, H. Drachsler, and P. Kirschner. 2025. "ChatGPT in Education: An Effect in Search of a Cause." Journal of Computer Assisted Learning 41, no. 5: 70105. https://doi.org/10.1111/jcal.70105.</bibtext> </blist> <blist> <bibtext> Xing, W., N. Nixon, S. Crossley, et al. 2025. "The Use of Large Language Models in Education." International Journal of Artificial Intelligence in Education 35, no. 2: 439–443. https://doi.org/10.1007/s40593-025-00457-x.</bibtext> </blist> <blist> <bibtext> Yu, S.-C., Y.-M. Huang, and T.-T. Wu. 2024. "Tool, Threat, Tutor, Talk, and Trend: College Students' Attitudes Toward ChatGPT." Behavioral Science 14: 755. https://doi.org/10.3390/bs14090755.</bibtext> </blist> <blist> <bibtext> Zhong, B., and X. Liu. 2025. "Evaluating AI Literacy of Secondary Students: Framework and Scale Development." Computers and Education 227: 105230. https://doi.org/10.1016/j.compedu.2024.105230.</bibtext> </blist> <blist> <bibtext> Zilber, A. 2025. 
Google's AI Is 'Hallucinating,' Spreading Dangerous Info—Including a Suggestion to Add Glue to Pizza Sauce. New York Post.</bibtext> </blist> </ref> <aug> <p>By Natalia Latini; Ivar Bråten and Helge I. Strømsø</p> </aug> |
|---|---|
| Items | – Title: Mind over Model? Students' Evaluation and Use of ChatGPT-Generated versus Human-Generated Texts – Language: English – Authors: Natalia Latini; Ivar Bråten (ORCID 0000-0002-9242-9087); Helge I. Strømsø – Source: Reading Research Quarterly. 2026 61(1). – Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us – Peer Reviewed: Y – Page Count: 21 – Publication Date: 2026 – Document Type: Journal Articles; Reports - Research – Education Level: High Schools; Secondary Education – Descriptors: Student Attitudes; Artificial Intelligence; Technology Uses in Education; High School Students; Credibility; Value Judgment; Trust (Psychology) – DOI: 10.1002/rrq.70087 – ISSN: 0034-0553; 1936-2722 – Abstract: This experimental study investigated how high-school students judged the credibility of ChatGPT-generated versus human-generated texts on two different topics, as well as how they justified their text credibility judgments and used the texts in a post-reading integrative writing task. Results showed that, across both topics, students judged the texts to be less credible when they were presented as generated by ChatGPT than when they were presented as generated by a human, and they also justified their credibility judgments more by referring to how the texts were generated when they were presented as generated by ChatGPT. However, on the post-reading writing task, students reading ChatGPT-generated texts displayed a more integrated understanding of the texts on the two topics than did students reading human-generated texts. These findings may have not only theoretical but also practical implications, suggesting that although students may put less trust in model-generated texts due to the way they are created, it may be possible to harness their critical stance toward such texts in the service of deeper and more integrated understanding of the issues discussed in the texts. – Abstractor: As Provided – Entry Date: 2026 – Accession Number: EJ1494624 |
| PLink | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1494624 |
| RecordInfo | BibRecord: BibEntity: Identifiers: – Type: doi Value: 10.1002/rrq.70087 Languages: – Text: English PhysicalDescription: Pagination: PageCount: 21 Subjects: – SubjectFull: Student Attitudes Type: general – SubjectFull: Artificial Intelligence Type: general – SubjectFull: Technology Uses in Education Type: general – SubjectFull: High School Students Type: general – SubjectFull: Credibility Type: general – SubjectFull: Value Judgment Type: general – SubjectFull: Trust (Psychology) Type: general Titles: – TitleFull: Mind over Model? Students' Evaluation and Use of ChatGPT-Generated versus Human-Generated Texts Type: main BibRelationships: HasContributorRelationships: – PersonEntity: Name: NameFull: Natalia Latini – PersonEntity: Name: NameFull: Ivar Bråten – PersonEntity: Name: NameFull: Helge I. Strømsø IsPartOfRelationships: – BibEntity: Dates: – D: 01 M: 01 Type: published Y: 2026 Identifiers: – Type: issn-print Value: 0034-0553 – Type: issn-electronic Value: 1936-2722 Numbering: – Type: volume Value: 61 – Type: issue Value: 1 Titles: – TitleFull: Reading Research Quarterly Type: main |