Spectral Degradation and Speaking Style Effects on Emotional Prosody Perception Are Largely Independent of Cross-Modal Dual-Tasking.

Bibliographic Details
Title:	Spectral Degradation and Speaking Style Effects on Emotional Prosody Perception Are Largely Independent of Cross-Modal Dual-Tasking.
Authors:	Xie, Zilong^1,2 zx22c@fsu.edu
Source:	American Journal of Audiology. Mar 2026, Vol. 35 Issue 1, p218-230. 13p.
Subject Terms:	Data analysis, Emotions, Attention, Speech perception, Auditory perception, Visual perception, Task performance, Research funding, Long short-term memory, Descriptive statistics, Physiological aspects of speech, Psycholinguistics, Statistics, Human voice, Data analysis software, Reaction time
Abstract:	Purpose: This study examined the extent to which spectral degradation and speaking style (child-directed vs. adult-directed speech) affect emotion recognition from prosodic cues and how these effects are modulated by concurrent tasks involving nonauditory sensory input. Method: Adults with normal hearing completed an emotion recognition task under three conditions: alone (auditory single-task), concurrently with a low-load visual memory task (four identical images), and with a high-load visual memory task (four different images). Stimuli consisted of semantically neutral sentences spoken in five emotions (angry, happy, neutral, sad, and scared) and two speaking styles (child-directed and adult-directed). All sentences were vocoded to simulate spectral degradation. Emotion recognition was assessed using a single-interval, five-alternative, forced-choice paradigm, in which the participants were asked to indicate which of five emotions was associated with each heard sentence. Results: Emotion recognition was significantly reduced for vocoded stimuli, as indicated by lower sensitivity (d') and prolonged reaction times (RTs). Child-directed speech led to better performance than adult-directed speech, although its facilitative effect was reduced under vocoded conditions. Dual-tasking impaired performance, with lower d' values in both dual-task conditions and slower RTs under high-load dual-task conditions. Crucially, dual-task effects did not significantly vary with spectral degradation or speaking style. Conclusions: Top-down cognitive demands from cross-modal dual-tasking and bottom-up stimulus factors, such as spectral degradation and speaking style, independently influence emotion recognition from prosodic cues. These findings provide insight into how cochlear implant users perceive emotional speech in complex, multimodal environments. [ABSTRACT FROM AUTHOR]
	Copyright of American Journal of Audiology is the property of American Speech-Language-Hearing Association and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Education Research Complete

Description
ISSN:	10590889
DOI:	10.1044/2025_AJA-25-00190