Affordances and Limitations of Using Large Language Models to Generate Qualitative Data about Mental Health Perceptions in Engineering

Bibliographic Details
Title: Affordances and Limitations of Using Large Language Models to Generate Qualitative Data about Mental Health Perceptions in Engineering
Language: English
Authors: Jeanne Sanders (ORCID 0000-0002-8865-5444), John Mobley IV (ORCID 0000-0003-0828-3896), Isabel Miller (ORCID 0000-0002-9774-5812), Nicola W. Sochacka (ORCID 0000-0002-9731-6911), Paul A. Jensen (ORCID 0000-0002-1257-9836), Karin J. Jensen (ORCID 0000-0001-9456-5042)
Source: Journal of Engineering Education. 2026 115(1).
Availability: Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Peer Reviewed: Y
Page Count: 30
Publication Date: 2026
Sponsoring Agency: National Science Foundation (NSF)
Contract Number: 2342384
Document Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Descriptors: Artificial Intelligence, Engineering Education, Educational Research, Affordances, Barriers, College Faculty, Undergraduate Students, Stress Management, Attitudes, Brainstorming, Bias
DOI: 10.1002/jee.70037
ISSN: 1069-4730; 2168-9830
Abstract: Background: Generative artificial intelligence (AI) large-language models (LLMs) have significant potential as research tools. However, the broader implications of using these tools are still emerging. Few studies have explored using LLMs to generate data for qualitative engineering education research. Purpose/Hypothesis: We explore the following questions: (i) What are the affordances and limitations of using LLMs to generate qualitative data in engineering education, and (ii) in what ways might these data reproduce and reinforce dominant cultural narratives in engineering education, including narratives of high stress? Design/Methods: We analyzed similarities and differences between LLM-generated conversational data (ChatGPT) and qualitative interviews with engineering faculty and undergraduate engineering students from multiple institutions. We identified patterns, affordances, limitations, and underlying biases in generated data. Results: LLM-generated content contained similar responses to interview content. Varying the prompt persona (e.g., demographic information) increased the response variety. When prompted for ways to decrease stress in engineering education, LLM responses more readily described opportunities for structural change, while participants' responses more often described personal changes. LLM data more frequently stereotyped a response than participants did, meaning that LLM responses lacked the nuance and variation that naturally occurs in interviews. Conclusions: LLMs may be a useful tool in brainstorming, for example, during protocol development and refinement. However, the bias present in the data indicates that care must be taken when engaging with LLMs to generate data. Specially trained LLMs that are based only on data from engineering education hold promise for future research.
Abstractor: As Provided
Entry Date: 2026
Accession Number: EJ1495847
Database: ERIC