Atypical Vocal Imitation of Speech and Song in Autism Spectrum Disorder: Evidence from Mandarin Speakers

Bibliographic Details
Title:	Atypical Vocal Imitation of Speech and Song in Autism Spectrum Disorder: Evidence from Mandarin Speakers
Language:	English
Authors:	Li Wang (ORCID 0000-0001-5318-2408), Peter Q. Pfordresher, Cunmei Jiang (ORCID 0000-0002-0264-5924), Fang Liu (ORCID 0000-0002-7776-0222)
Source:	Autism: The International Journal of Research and Practice. 2025 29(2):408-423.
Availability:	SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://sagepub.com
Peer Reviewed:	Y
Page Count:	16
Publication Date:	2025
Sponsoring Agency:	National Science Foundation (NSF), Division of Behavioral and Cognitive Sciences (BCS)
Contract Number:	1848930
Document Type:	Journal Articles Reports - Research
Descriptors:	Mandarin Chinese, Singing, Autism Spectrum Disorders, Imitation, Speech Communication, Tone Languages, Children, Adolescents, Foreign Countries, Intonation
Geographic Terms:	China
Assessment and Survey Identifiers:	Autism Diagnostic Observation Schedule, Peabody Picture Vocabulary Test, Raven Progressive Matrices
DOI:	10.1177/13623613241275395
ISSN:	1362-3613 1461-7005
Abstract:	Vocal imitation in English-speaking autistic individuals has been shown to be atypical. Speaking a tone language such as Mandarin facilitates vocal imitation skills among non-autistic individuals, yet no studies have examined whether this effect holds for autistic individuals. To address this question, we compared vocal imitation of speech and song between 33 autistic Mandarin speakers and 30 age-matched non-autistic peers. Participants were recorded while imitating 40 speech and song stimuli with varying pitch and duration patterns. Acoustic analyses showed that autistic participants imitated relative pitch (but not absolute pitch) less accurately than non-autistic participants for speech, whereas for song the two groups performed comparably on both absolute and relative pitch matching. Regarding duration matching, autistic participants imitated relative duration (inter-onset interval between consecutive notes/syllables) less accurately than non-autistic individuals for both speech and song, while their lower performance on absolute duration matching of the notes/syllables was presented only in the song condition. These findings indicate that experience with tone languages does not mitigate the challenges autistic individuals face in imitating speech and song, highlighting the importance of considering the domains and features of investigation and individual differences in cognitive abilities and language backgrounds when examining imitation in autism.
Abstractor:	As Provided
Entry Date:	2025
Accession Number:	EJ1465400
Database:	ERIC

<anid>AN0183028988;f9d01feb.25;2025Feb17.01:29;v2.2.500</anid> <title id="AN0183028988-1">Atypical vocal imitation of speech and song in autism spectrum disorder: Evidence from Mandarin speakers </title> <p>Vocal imitation in English-speaking autistic individuals has been shown to be atypical. Speaking a tone language such as Mandarin facilitates vocal imitation skills among non-autistic individuals, yet no studies have examined whether this effect holds for autistic individuals. To address this question, we compared vocal imitation of speech and song between 33 autistic Mandarin speakers and 30 age-matched non-autistic peers. Participants were recorded while imitating 40 speech and song stimuli with varying pitch and duration patterns. Acoustic analyses showed that autistic participants imitated relative pitch (but not absolute pitch) less accurately than non-autistic participants for speech, whereas for song the two groups performed comparably on both absolute and relative pitch matching. Regarding duration matching, autistic participants imitated relative duration (inter-onset interval between consecutive notes/syllables) less accurately than non-autistic individuals for both speech and song, while their lower performance on absolute duration matching of the notes/syllables was presented only in the song condition. 
These findings indicate that experience with tone languages does not mitigate the challenges autistic individuals face in imitating speech and song, highlighting the importance of considering the domains and features of investigation and individual differences in cognitive abilities and language backgrounds when examining imitation in autism. Atypical vocal imitation has been identified in English-speaking autistic individuals, whereas the characteristics of vocal imitation in tone-language-speaking autistic individuals remain unexplored. By comparing speech and song imitation, the present study reveals a unique pattern of atypical vocal imitation across speech and music domains among Mandarin-speaking autistic individuals. The findings suggest that tone language experience does not compensate for difficulties in vocal imitation in autistic individuals and extend our understanding of vocal imitation in autism across different languages.</p> <p>Keywords: acoustics; autism; song; speech; vocal imitation</p> <hd id="AN0183028988-2">Introduction</hd> <p>Imitation is an essential aspect of skill development ([<reflink idref="bib31" id="ref1">31</reflink>]). In the first few years of life, children rapidly learn new skills, such as the typical uses of objects and the basics of their mother tongue. The rapid learning abilities of young children can be attributed, in part, to humans' remarkable capacity to imitate what they see and hear ([<reflink idref="bib80" id="ref2">80</reflink>]). Starting from infancy, typically developing children learn to imitate others' object-directed actions, gestures, body movements, and sounds or words ([<reflink idref="bib53" id="ref3">53</reflink>]). 
The process of imitating others or being imitated not only facilitates the development of skills but also lays the foundation for interaction and communication with others, for example, by expressing interests in their caregivers or peers, sharing emotions as well as paying attention to others ([<reflink idref="bib32" id="ref4">32</reflink>]; [<reflink idref="bib81" id="ref5">81</reflink>]).</p> <p>However, deviations in imitation, especially in the vocal domain, can exert a profound impact on the development of social interaction and communication, as exemplified in autism spectrum disorder (ASD; [<reflink idref="bib12" id="ref6">12</reflink>]; [<reflink idref="bib19" id="ref7">19</reflink>]; [<reflink idref="bib23" id="ref8">23</reflink>]; [<reflink idref="bib30" id="ref9">30</reflink>]; [<reflink idref="bib63" id="ref10">63</reflink>]; [<reflink idref="bib82" id="ref11">82</reflink>]; [<reflink idref="bib84" id="ref12">84</reflink>]). Research has shown that autistic and non-autistic individuals differ in how they vocally imitate sounds and speech, particularly in terms of pitch and duration patterns. For example, when autistic individuals try to imitate prosodic patterns, such as making a sentence sound like a question or a statement, or expressing likes or dislikes, they often exhibit prolonged durations of the sentences compared to their non-autistic peers ([<reflink idref="bib19" id="ref13">19</reflink>]; [<reflink idref="bib63" id="ref14">63</reflink>]). In addition, autistic individuals tend to use a higher pitch when imitating the stress patterns in nonsense words (i.e. make-up words, like "<emph>tauveeb</emph>") than non-autistic individuals ([<reflink idref="bib82" id="ref15">82</reflink>]). 
Studies also find that when autistic individuals imitate speech to convey statements, questions, or emotions, their patterns are different from those of non-autistic individuals in both pitch and duration characteristics ([<reflink idref="bib23" id="ref16">23</reflink>]; [<reflink idref="bib30" id="ref17">30</reflink>]; [<reflink idref="bib83" id="ref18">83</reflink>]). Understanding these acoustic differences (e.g. pitch and duration) in vocal imitation can inform the development of more effective communication strategies and interventions for autistic individuals ([<reflink idref="bib51" id="ref19">51</reflink>]).</p> <p>Notably, the majority of these investigations have been conducted with speakers of non-tonal languages, and the literature lacks representation from speakers of tone languages. The world's languages can be classified into tone (e.g. Mandarin, Cantonese) versus non-tonal (e.g. English) languages, depending on how they use pitch to convey meaning ([<reflink idref="bib89" id="ref20">89</reflink>]; [<reflink idref="bib91" id="ref21">91</reflink>]). Specifically, across tone and non-tonal languages, pitch is used to convey prosodic meaning ([<reflink idref="bib36" id="ref22">36</reflink>]), including intonation such as statement-question intonation ([<reflink idref="bib83" id="ref23">83</reflink>]) and emotions like excitement and sadness ([<reflink idref="bib71" id="ref24">71</reflink>]). However, pitch additionally serves a lexical function of distinguishing different word meanings in tone languages ([<reflink idref="bib35" id="ref25">35</reflink>]). For example, with the same syllable /ma/, the word 妈 with a high-level tone (i.e. Tone 1 in Mandarin) means "mother," whereas the word 马 with a falling-rising tone (i.e. Tone 3 in Mandarin) means "horse." 
Thus, unlike in English, the imitation of pitch-related features in Mandarin occurs in parallel, with prosodic meaning represented at the sentence level and lexical meaning at the syllable or word level ([<reflink idref="bib46" id="ref26">46</reflink>]; [<reflink idref="bib92" id="ref27">92</reflink>]). Due to the additional role pitch plays in tone languages, enhanced pitch processing abilities in tone language speakers have been widely demonstrated (see [<reflink idref="bib47" id="ref28">47</reflink>] for review), underscoring the need to explore vocal imitation in autistic individuals within tonal linguistic contexts.</p> <p>This study, therefore, seeks to provide a more nuanced exploration of vocal imitation, specifically among autistic Mandarin speakers. In addition to addressing the lacunae in existing literature, we also examined the matching between the model and imitated sounds, a critical measure of imitation accuracy that is often overlooked in previous acoustic studies. As depicted in Figure 1, without considering model sounds, a direct comparison of the acoustic features (e.g. pitch and duration) of the imitated sounds between the autistic and non-autistic groups provided insights solely into the characteristics of imitated sounds, rather than imitation accuracy. This oversight failed to capture participants' vocal imitation ability <emph>per se</emph>, that is, the ability to match the acoustic features of the model sounds through imitation ([<reflink idref="bib54" id="ref29">54</reflink>]; [<reflink idref="bib83" id="ref30">83</reflink>]). Comparing imitated sounds to the original targets offers valuable insights into the nature of vocal imitation differences in autism. This, in turn, can inform targeted clinical interventions and contribute to the broader understanding of vocal imitation abilities in autism. 
In an effort to fill this gap, our previous study examined speech and song imitation in an English-speaking sample, who were instructed to imitate exactly the pitch and timing patterns of the sentences they heard (i.e. Model sounds) while their voices were being recorded (i.e. Imitated sounds) ([<reflink idref="bib83" id="ref31">83</reflink>]). The vocal imitation ability was measured by comparing the pitch- and duration-related parameters between the model and the imitated sounds, with smaller differences indicating more accurate imitation. Results revealed that vocal imitation differences exist among English-speaking autistic individuals across speech and music domains, especially in terms of absolute pitch and duration matching ([<reflink idref="bib83" id="ref32">83</reflink>]).</p> <p>Graph: Figure 1. The illustration of vocal imitation process.</p> <p>Using the same paradigm, the current study strived to deepen the insights into vocal imitation among Mandarin-speaking autistic individuals. Through acoustic analysis, we aimed to quantify speech and song imitation abilities of Mandarin-speaking autistic and non-autistic individuals, addressing the following questions: (<reflink idref="bib1" id="ref33">1</reflink>) Do imitation abilities of Mandarin-speaking autistic individuals differ from non-autistic individuals in terms of pitch-related features across speech and music domains? (<reflink idref="bib2" id="ref34">2</reflink>) Do Mandarin-speaking autistic individuals differ from non-autistic individuals with respect to duration-related feature matching in vocal imitation? Based on the differences in how pitch is used in Mandarin and English speech, we hypothesized that vocal imitation of pitch-related features in Mandarin-speaking autistic individuals may not be affected, unlike English speakers. 
This expectation arose from the elevated sensitivity and proficiency in processing pitch observed in Mandarin speakers ([<reflink idref="bib47" id="ref35">47</reflink>]). Regarding duration-related features, a cross-linguistic study found that machine learning using speech rhythm can differentiate autistic from non-autistic individuals across English and Cantonese, suggesting that speech rhythm is an important feature of autism that is evident in multiple languages ([<reflink idref="bib41" id="ref36">41</reflink>]). We therefore predicted that, like English speakers, Mandarin-speaking autistic individuals may have difficulty in imitating duration patterns in both speech and music. Based on previous findings on English speakers ([<reflink idref="bib83" id="ref37">83</reflink>]), we also hypothesized that Mandarin-speaking autistic participants would show poorer performance on absolute feature matching, but not relative feature matching as compared to non-autistic participants.</p> <hd id="AN0183028988-3">Method</hd> <p></p> <hd id="AN0183028988-4">Participants</hd> <p>A group of 33 autistic children (aged between 7 and 16) and 30 age-matched non-autistic children took part in the study. All were native speakers of Mandarin and reported no history of other neurological or psychiatric disorders. They were recruited from special educational facilities and mainstream schools in Nanchang and Nanjing, China. The autistic children all had a clinical diagnosis of autism using either <emph>DSM</emph>-IV or <emph>DSM</emph>-5 ([<reflink idref="bib2" id="ref38">2</reflink>], [<reflink idref="bib3" id="ref39">3</reflink>]) which was further supported by the Autism Diagnostic Observation Schedule—Second Edition (ADOS-2) ([<reflink idref="bib48" id="ref40">48</reflink>]) conducted by the first author (with research reliability for administration and scoring). All autistic participants were administered the ADOS-2 Module 3 according to their developmental and language levels. 
Total scores on the ADOS-2 were converted to a comparative score (CS) of 1–10, with 10 representing the highest severity of autism-related symptoms ([<reflink idref="bib21" id="ref41">21</reflink>]; [<reflink idref="bib25" id="ref42">25</reflink>]). All participants had normal hearing in both ears, with pure-tone air conduction thresholds of 25 dB HL or better at frequencies of 0.5, 1, 2, and 4 kHz, as assessed using an Amplivox manual audiometer (Model 116). Participants completed a nonverbal IQ test using the Raven's Standard Progressive Matrices Test (RSPM) ([<reflink idref="bib70" id="ref43">70</reflink>]) and a receptive vocabulary test using the Chinese version of the Peabody Picture Vocabulary Test-Revised (PPVT-R) ([<reflink idref="bib22" id="ref44">22</reflink>]; [<reflink idref="bib74" id="ref45">74</reflink>]). The standardized scores for RSPM and PPVT-R were calculated as described by [<reflink idref="bib85" id="ref46">85</reflink>]. For RSPM, the standardized scores were derived using the means and standard deviations from a Chinese normative study ([<reflink idref="bib93" id="ref47">93</reflink>]). As the Chinese norms for PPVT-R covered only ages 3.5 to 9 ([<reflink idref="bib74" id="ref48">74</reflink>]), we used American norms ([<reflink idref="bib22" id="ref49">22</reflink>]) to calculate the standardized scores. A correlation analysis showed a significant positive relationship (<emph>r</emph> = 0.95) between the standardized scores based on the Chinese norms and those based on the American norms for participants aged 9 and below, validating this methodology. The Chinese version of the forward digit span task was used to assess verbal short-term memory ([<reflink idref="bib87" id="ref50">87</reflink>]). Participants' musical training background and their ability to identify a musical note without a reference tone (i.e. 
absolute pitch or perfect pitch) ([<reflink idref="bib17" id="ref51">17</reflink>]) were collected using a caregiver-reported questionnaire, and their years of formal musical training were summed across all instruments including voice ([<reflink idref="bib83" id="ref52">83</reflink>]). Participants' perceptual skills were assessed using a statement-question intonation discrimination task, taken from a comparative study investigating speech and music perception ([<reflink idref="bib45" id="ref53">45</reflink>]; [<reflink idref="bib85" id="ref54">85</reflink>]). As can be seen in Table 1, the results of Welch's <emph>t</emph>-test showed that the autistic and non-autistic groups were comparable on all background measures, except the PPVT-R scores, which were taken into account in the statistical models.</p> <p>Table 1. Characteristics of the autism (n = 33) and non-autism groups (n = 30).</p> <p>Graph</p> <p> <ephtml> &lt;table&gt;&lt;colgroup&gt;&lt;col align="left" /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." 
/&gt;&lt;/colgroup&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="left"&gt;Background measures&lt;/th&gt;&lt;th align="left"&gt;Autism&lt;/th&gt;&lt;th align="left"&gt;Non-Autism&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;t&lt;/italic&gt;&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;p&lt;/italic&gt;&lt;/th&gt;&lt;th align="left"&gt;Cohen's &lt;italic&gt;d&lt;/italic&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Gender (F:M)&lt;/td&gt;&lt;td&gt;5:28&lt;/td&gt;&lt;td&gt;4:26&lt;/td&gt;&lt;td /&gt;&lt;td /&gt;&lt;td /&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;10.29 (2.50)&lt;/td&gt;&lt;td&gt;11.50 (2.83)&lt;/td&gt;&lt;td&gt;1.79&lt;/td&gt;&lt;td&gt;0.08&lt;/td&gt;&lt;td&gt;0.45&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Musical training&lt;/td&gt;&lt;td&gt;0.88 (1.32)&lt;/td&gt;&lt;td&gt;0.50 (1.11)&lt;/td&gt;&lt;td&gt;1.24&lt;/td&gt;&lt;td&gt;0.22&lt;/td&gt;&lt;td&gt;0.31&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;RSPM&lt;/td&gt;&lt;td&gt;110.12 (15.77)&lt;/td&gt;&lt;td&gt;112.72 (10.26)&lt;/td&gt;&lt;td&gt;0.78&lt;/td&gt;&lt;td&gt;0.44&lt;/td&gt;&lt;td&gt;0.20&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&lt;bold&gt;124.33 (25.87)&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;141.77 (12.80)&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;3.44&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.001&lt;xref ref-type="table-fn" rid="tfn1"&gt;*&lt;/xref&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.85&lt;/bold&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Digit span&lt;/td&gt;&lt;td&gt;8.49 (0.91)&lt;/td&gt;&lt;td&gt;8.07 (1.11)&lt;/td&gt;&lt;td&gt;1.63&lt;/td&gt;&lt;td&gt;0.11&lt;/td&gt;&lt;td&gt;0.41&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Self-reported absolute pitch&lt;/td&gt;&lt;td&gt;&lt;italic&gt;n&lt;/italic&gt; = 2&lt;/td&gt;&lt;td&gt;&lt;italic&gt;n&lt;/italic&gt; = 3&lt;/td&gt;&lt;td /&gt;&lt;td /&gt;&lt;td /&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Perception-Natural speech&lt;/td&gt;&lt;td&gt;1.57 (0.87)&lt;/td&gt;&lt;td&gt;1.81 
(0.75)&lt;/td&gt;&lt;td&gt;1.19&lt;/td&gt;&lt;td&gt;0.24&lt;/td&gt;&lt;td&gt;0.30&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Perception-Gliding tone&lt;/td&gt;&lt;td&gt;1.60 (0.86)&lt;/td&gt;&lt;td&gt;1.93 (0.63)&lt;/td&gt;&lt;td&gt;1.71&lt;/td&gt;&lt;td&gt;0.09&lt;/td&gt;&lt;td&gt;0.43&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ADOS-CS&lt;/td&gt;&lt;td&gt;6.97 (2.31)&lt;/td&gt;&lt;td&gt;NA&lt;/td&gt;&lt;td /&gt;&lt;td /&gt;&lt;td /&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>1 <emph>Note</emph>. Musical training: years of musical training; RSPM: standard score of Raven's Standard Progressive Matrices Test; PPVT-R: standard score of Peabody Picture Vocabulary Test-Revised; Digit span: raw score of verbal short-term memory; Perception-Natural speech and Perception-Gliding tone: D-prime values for subtest scores, with higher values representing better perception skill; ADOS-CS: comparative score of ADOS, with 10 representing the highest severity of autism-related symptoms. Bold values indicate statistical significance at <emph>p</emph> &lt; 0.05. <emph>p</emph> &lt; 0.05, <emph>p</emph> &lt; 0.01, <emph>p</emph> &lt; 0.001.</p> <hd id="AN0183028988-5">Community involvement</hd> <p>There was no community involvement in the present study.</p> <hd id="AN0183028988-6">Stimuli</hd> <p>The model stimuli were 10 sentences either spoken or sung with an early focus or a late focus from [<reflink idref="bib44" id="ref55">44</reflink>], yielding 40 sentences with two to six syllables each (see Table 2 for the list of sentences and Supplementary Table 1 for musical notations of the sung stimuli). The inclusion of different sentence lengths was to control for the effect of stimulus length on imitation performance ([<reflink idref="bib44" id="ref56">44</reflink>]). 
The manipulation of the different focus conditions of the sentences ensured the inclusion of a variety of pitch and duration patterns in the speech stimuli, as focused words normally show a higher pitch and longer duration than their unfocused counterparts in Mandarin speech ([<reflink idref="bib46" id="ref57">46</reflink>]; [<reflink idref="bib92" id="ref58">92</reflink>]). For example, in the top right panel of Figure 2, the sentence "<bold>她</bold>的包?" ["<bold>Her</bold> bag?"] has an initial focus on the word "<bold>她</bold>" ["<bold>Her</bold>"], which has a higher pitch and longer duration than the same unfocused word in the bottom right panel of Figure 2, where the sentence "她的<bold>包</bold>?" ["Her <bold>bag</bold>?"] has a final focus on the word "<bold>包</bold>" ["<bold>bag</bold>"]. As can be seen from the top and bottom left panels of Figure 2, the corresponding song stimuli approximated the global melodic contours and timing variations of the speech stimuli. Both to accommodate participants' vocal range and to ensure that participants of different ages or genders were exposed to the same pitch and duration patterns of the speech/song stimuli, we adopted the male and female versions of the stimuli from [<reflink idref="bib44" id="ref59">44</reflink>]. The female model was originally recorded by a 27-year-old Mandarin-speaking female student who was born and raised in Beijing. To ensure that the stimuli encountered by male and female participants had identical pitch intervals and rhythmic patterns, the female model was synthesized (preserving the absolute pitches and formant frequencies of the original recordings), and the male model was generated from the female model by lowering the original pitches by one octave and shifting the frequencies of the original formants by a ratio of 0.78 to achieve male voice characteristics, using the "change gender" command in Praat ([<reflink idref="bib8" id="ref60">8</reflink>]). 
The ecological validity of the synthesized female and male models was tested and confirmed in [<reflink idref="bib44" id="ref61">44</reflink>], where Mandarin-speaking female and male adult participants with and without congenital amusia performed the same imitation task using the same stimulus set. None of the participants in [<reflink idref="bib44" id="ref62">44</reflink>] noted any unnaturalness of the stimuli, and no significant differences were found in imitation performance between the participants of different genders for either the amusic or the non-amusic group. Thus, the current study adopted the same stimulus set as in [<reflink idref="bib44" id="ref63">44</reflink>]. We also did not observe any significant differences in imitation performance across female and male participants in the current sample (see Supplementary Table 2). The male version was used for male participants ⩾12 years old, and the female version was used for female participants regardless of age as well as male participants &lt; 12 years old, as research indicates that children up to 12 show similar pitch ranges ([<reflink idref="bib52" id="ref64">52</reflink>]; [<reflink idref="bib56" id="ref65">56</reflink>]; [<reflink idref="bib76" id="ref66">76</reflink>]).</p> <p>Table 2. Stimuli used in the experiment.</p> <p>Graph</p> <p> <ephtml> &lt;table&gt;&lt;colgroup&gt;&lt;col align="left" /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." 
/&gt;&lt;/colgroup&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="left"&gt;Stimuli with an early focus&lt;/th&gt;&lt;th align="left"&gt;Stimuli with a late focus&lt;/th&gt;&lt;th align="left"&gt;Chinese Pinyin&lt;/th&gt;&lt;th align="left"&gt;English translation&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#40657;&lt;/bold&gt;&amp;#36710;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#40657;&lt;bold&gt;&amp;#36710;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Hei1 che1?&lt;/td&gt;&lt;td&gt;Black car?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#38738;&lt;/bold&gt;&amp;#22825;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#38738;&lt;bold&gt;&amp;#22825;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Qing1 tian1?&lt;/td&gt;&lt;td&gt;Blue sky?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#22905;&lt;/bold&gt;&amp;#30340;&amp;#21253;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#22905;&amp;#30340;&lt;bold&gt;&amp;#21253;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Ta1 de0 bao1?&lt;/td&gt;&lt;td&gt;Her bag?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#19977;&lt;/bold&gt;&amp;#39063;&amp;#26143;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#19977;&lt;bold&gt;&amp;#39063;&lt;/bold&gt;&amp;#26143;&amp;#65311;&lt;/td&gt;&lt;td&gt;San1 ke1 xing1?&lt;/td&gt;&lt;td&gt;Three stars?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#20908;&amp;#22825;&lt;/bold&gt;&amp;#30340;&amp;#39118;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#20908;&amp;#22825;&amp;#30340;&lt;bold&gt;&amp;#39118;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Dong1 tian1 de0 feng1?&lt;/td&gt;&lt;td&gt;The winter's wind?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;#20889;&lt;bold&gt;&amp;#20182;&lt;/bold&gt;&amp;#20070;&amp;#19978;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#20889;&amp;#20182;&lt;bold&gt;&amp;#20070;&lt;/bold&gt;&amp;#19978;&amp;#65311;&lt;/td&gt;&lt;td&gt;Xie3 ta1 shu1 shang0?&lt;/td&gt;&lt;td&gt;Write on his 
book?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;bold&gt;&amp;#28422;&amp;#40657;&lt;/bold&gt;&amp;#30340;&amp;#22825;&amp;#31354;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#28422;&amp;#40657;&amp;#30340;&lt;bold&gt;&amp;#22825;&amp;#31354;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Qi1 hei1 de0 tian1 kong1?&lt;/td&gt;&lt;td&gt;Pitch-black sky?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;#23567;&lt;bold&gt;&amp;#19969;&lt;/bold&gt;&amp;#38271;&amp;#39640;&amp;#20102;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#23567;&amp;#19969;&amp;#38271;&lt;bold&gt;&amp;#39640;&lt;/bold&gt;&amp;#20102;&amp;#65311;&lt;/td&gt;&lt;td&gt;Xiao3 ding1 zhang3 gao1 le0&lt;/td&gt;&lt;td&gt;Xiao Ding grew taller?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;#32769;&lt;bold&gt;&amp;#37101;&lt;/bold&gt;&amp;#30340;&amp;#29483;&amp;#20002;&amp;#20102;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#32769;&amp;#37101;&amp;#30340;&lt;bold&gt;&amp;#29483;&lt;/bold&gt;&amp;#20002;&amp;#20102;&amp;#65311;&lt;/td&gt;&lt;td&gt;Lao3 guo1 de0 mao1 diu1 le0&lt;/td&gt;&lt;td&gt;Lao Guo's cat is lost?&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;#23567;&amp;#26041;&lt;bold&gt;&amp;#22825;&amp;#22825;&lt;/bold&gt;&amp;#21152;&amp;#29677;&amp;#65311;&lt;/td&gt;&lt;td&gt;&amp;#23567;&amp;#26041;&amp;#22825;&amp;#22825;&lt;bold&gt;&amp;#21152;&amp;#29677;&lt;/bold&gt;&amp;#65311;&lt;/td&gt;&lt;td&gt;Xiao3 Fang1 tian1 tian1 jia1 ban1?&lt;/td&gt;&lt;td&gt;Xiao Fang works overtime every day?&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>Graph: Figure 2. The pitch-time trajectory of the sentence "她的包? vs. 她的包? (Ta1 de0 bao1?/ Her bag?) under different conditions by female/male model speakers.</p> <hd id="AN0183028988-7">Procedure</hd> <p>The presentation of the model stimuli and the recording of the imitations were both done using Praat ([<reflink idref="bib8" id="ref67">8</reflink>]). 
Participants were seated in a quiet room and were presented with four practice trials (two speech and two song, with items different from those in the experimental trials) to familiarize themselves with the task and the recording environment. Following the practice trials, participants were presented with each of the 40 speech/song sentences one at a time in pseudorandom order and were instructed to imitate exactly the pitch and timing patterns of the sentences to the best of their ability, while their voices were recorded via a Roland RUBIX22 USB Audio Interface. Each sentence was played once and was replayed only when participants failed to catch the words, not when they wanted to hear it again in order to imitate it better.</p> <hd id="AN0183028988-8">Data analysis</hd> <p>Recordings were analyzed in Praat using ProsodyPro, a software tool designed for the automatic analysis of large amounts of speech data ([<reflink idref="bib88" id="ref68">88</reflink>]). To ensure precise acoustic measurements, we adopted a hybrid approach. This involved initial automated processes using ProsodyPro and subsequent manual verification by trained phoneticians (authors LW and FL) to extract the pitch and duration of each syllable rhyme. Syllable/note duration was calculated as the length of the syllable rhyme, and the onset of the syllable rhyme was defined as the syllable/note onset time. The median F0s (fundamental frequencies) of the syllable rhymes were extracted to indicate pitch heights. Octave errors in pitch imitation were corrected: when the imitated pitch was more than 6 semitones (half an octave) apart from the model pitch, the value was adjusted to 12 minus the imitated pitch. In total, fewer than 4.11% of the data samples needed to be adjusted, and most of these errors were caused by creaky voices, resulting in decreased F0 ([<reflink idref="bib34" id="ref69">34</reflink>]). 
Trained phoneticians manually added these missed vocal pulse marks for F0 based on the waveforms and spectrograms, to avoid having erroneous outliers misleading imitation results.</p> <p>We used absolute pitch and duration matching to refer to the ability to imitate individual syllables/notes based on their acoustic features, irrespective of their relationship with surrounding syllables/notes. In addition, following [<reflink idref="bib44" id="ref70">44</reflink>] and previous singing or pitch-matching studies ([<reflink idref="bib15" id="ref71">15</reflink>], [<reflink idref="bib16" id="ref72">16</reflink>], [<reflink idref="bib14" id="ref73">14</reflink>]; [<reflink idref="bib65" id="ref74">65</reflink>]; [<reflink idref="bib67" id="ref75">67</reflink>]; [<reflink idref="bib86" id="ref76">86</reflink>]), we also measured the number of pitch contour, pitch interval, and time errors that deviated from the corresponding model's pitch direction or specific pitch interval or duration value. The pitch was measured in "cents" (100 cents = one semitone), a unit of measure based on the equal-tempered scale in music, to facilitate a more nuanced representation of pitch distinctions and a finer resolution in the assessment of pitch differences. Detailed definitions of these measures are provided below.</p> <p> <bold>The absolute pitch deviation (in cents):</bold> Median F0 was extracted from each syllable rhyme and then subtracted from that of their matched model to find the pitch deviation (in absolute value) for each imitated rhyme. The deviations were averaged over all syllables/notes in each utterance/melody and the bigger the value, the less accurate the imitation in terms of absolute pitch matching.</p> <p> <bold>The relative pitch deviation (in cents):</bold> The pitch interval was calculated as the absolute difference in median F0 between two consecutive syllables/notes, and then subtracted from their matched model's pitch interval (in absolute value). 
The deviations were averaged over all intervals in each utterance/melody and the bigger the value, the less accurate the imitation in terms of relative pitch matching.</p> <p> <bold>The number of pitch contour errors:</bold> Pitch contour errors were defined as imitated pitch intervals that differed from the corresponding model pitch intervals regarding pitch directions (up, down, or level). Pitch direction was considered to be up or down if the difference in pitch interval was higher or lower by 50 cents or more; otherwise (the difference was within 50 cents), the pitch intervals were considered to form a level/flat pitch direction. The number of contour errors was summed over each utterance/melody.</p> <p> <bold>The number of pitch interval errors:</bold> Pitch interval errors were defined as imitated pitch intervals that were larger or smaller than the corresponding model pitch intervals by 100 cents without considering the pitch direction. Specifically, imitated and model pitch intervals were compared using absolute values. The number of pitch interval errors was summed over each utterance/melody.</p> <p> <bold>The absolute duration difference (in milliseconds):</bold> Duration was extracted from each syllable rhyme and then subtracted from their matched model's production to find the absolute difference for each rhyme. The differences were averaged over all rhymes in each utterance/melody and the larger the value, the less accurate the imitation in terms of absolute duration matching.</p> <p> <bold>The relative duration difference (in milliseconds):</bold> Interonset interval (IOI) was calculated as the difference between the onsets of two consecutive syllables/notes, and then subtracted from their matched model's IOI (in absolute value). 
The differences were averaged over all IOIs in each utterance/melody and the larger the value, the less accurate the imitation in terms of relative duration matching.</p> <p> <bold>The number of time errors:</bold> Time errors were defined as imitated syllables/notes that were more than 25% longer or shorter than the corresponding model syllables/notes ([<reflink idref="bib15" id="ref77">15</reflink>], [<reflink idref="bib16" id="ref78">16</reflink>]; [<reflink idref="bib68" id="ref79">68</reflink>]). In Western tonal music, the durations of different events such as sixteenth notes (1/4 of a beat), eighth notes (1/2 of a beat), and quarter notes (1 beat) are in simple integer ratio relationships ([<reflink idref="bib20" id="ref80">20</reflink>]). Similarly, speech rhythm can also be measured in relative terms ([<reflink idref="bib61" id="ref81">61</reflink>]; [<reflink idref="bib62" id="ref82">62</reflink>]). Thus, using a 25% deviation to count time errors not only captures the violation of the time signature in music but also makes the comparison of spoken and musical rhythm possible. The number of time errors was summed over each utterance/melody.</p> <p>All statistical analyses were conducted using RStudio ([<reflink idref="bib73" id="ref83">73</reflink>]). We performed linear mixed-effects analysis, which is robust to violations of statistical assumptions ([<reflink idref="bib24" id="ref84">24</reflink>]; [<reflink idref="bib75" id="ref85">75</reflink>]). The <emph>lme4</emph> ([<reflink idref="bib6" id="ref86">6</reflink>]; [<reflink idref="bib9" id="ref87">9</reflink>]) and <emph>lmerTest</emph> ([<reflink idref="bib38" id="ref88">38</reflink>]) packages were used, with each of the above-mentioned pitch and duration variables as the dependent variable and with Group (effect-coded: Non-autism vs Autism), Condition (effect-coded: Speech vs Music), and the interaction between Group and Condition as fixed effects. 
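</p> <p>The pitch and duration measures defined in this section can be sketched in a few lines of code. This is an illustrative reconstruction from the verbal definitions, not the authors' analysis pipeline. All function and key names are hypothetical, the inputs are assumed to be per-syllable median F0 values (Hz) and rhyme onsets and durations (ms) already aligned with the model, and a pitch interval error is assumed to mean a size mismatch of at least 100 cents.</p>

```python
import math
from statistics import mean


def to_cents(f_hz, ref_hz):
    """Signed distance of f_hz from ref_hz in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_hz / ref_hz)


def signed_intervals(f0_hz):
    """Signed pitch intervals (cents) between consecutive syllables/notes."""
    return [to_cents(b, a) for a, b in zip(f0_hz, f0_hz[1:])]


def direction(interval_cents, threshold=50.0):
    """Classify an interval as up (+1), down (-1), or level (0)."""
    if interval_cents >= threshold:
        return 1
    if interval_cents <= -threshold:
        return -1
    return 0


def pitch_measures(imit_hz, model_hz, interval_threshold=100.0):
    """Per-utterance pitch measures for one imitation and its model."""
    imit_iv = signed_intervals(imit_hz)
    model_iv = signed_intervals(model_hz)
    # Interval sizes are compared in absolute value, ignoring direction.
    size_diffs = [abs(abs(i) - abs(m)) for i, m in zip(imit_iv, model_iv)]
    contour_errors = sum(
        direction(i) != direction(m) for i, m in zip(imit_iv, model_iv)
    )
    return {
        "absolute_pitch_deviation": mean(
            abs(to_cents(i, m)) for i, m in zip(imit_hz, model_hz)
        ),
        "relative_pitch_deviation": mean(size_diffs),
        "contour_errors": contour_errors,
        # Assumption: an error is a size mismatch of >= 100 cents.
        "interval_errors": sum(d >= interval_threshold for d in size_diffs),
    }


def duration_measures(imit_on, imit_dur, model_on, model_dur, tolerance=0.25):
    """Per-utterance timing measures; onsets and durations in milliseconds."""
    imit_ioi = [b - a for a, b in zip(imit_on, imit_on[1:])]
    model_ioi = [b - a for a, b in zip(model_on, model_on[1:])]
    return {
        "absolute_duration_difference": mean(
            abs(i - m) for i, m in zip(imit_dur, model_dur)
        ),
        "relative_duration_difference": mean(
            abs(i - m) for i, m in zip(imit_ioi, model_ioi)
        ),
        # A time error is a duration more than 25% longer or shorter.
        "time_errors": sum(
            abs(i - m) > tolerance * m for i, m in zip(imit_dur, model_dur)
        ),
    }
```

<p>For example, a flat imitation at 220 Hz of a model that rises and then falls by an octave (220, 440, 220 Hz) yields an absolute pitch deviation of 400 cents, a relative pitch deviation of 1200 cents, two contour errors, and two interval errors.</p> <p>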
To take into account the significant group differences in receptive vocabulary and the relatively wide age range, we also added PPVT-R scores and age (both variables were mean-centered) to the models. Years of musical training were significantly associated with only one of the pitch metrics: More musical training was associated with fewer pitch interval errors (<emph>B</emph> = −0.06, SE<emph>B</emph> = 0.03, <emph>t</emph>(61.41) = −2.26, <emph>p</emph> = 0.03). Thus, in the interest of space, musical experience was not considered in the models. All models were fit using the maximal random effects structure that converged with two random factors (subject and item) ([<reflink idref="bib4" id="ref89">4</reflink>]; [<reflink idref="bib5" id="ref90">5</reflink>]). When the maximal model failed to converge, the random correlations were removed first. If the model still failed to converge, the random effect with the lowest variance was iteratively removed until the model converged. Subsequent post hoc comparisons, if any, were conducted using the <emph>emmeans</emph> package with Holm-Bonferroni correction for multiple comparisons ([<reflink idref="bib42" id="ref91">42</reflink>]).</p> <hd id="AN0183028988-9">Results</hd> <p></p> <hd id="AN0183028988-10">Absolute pitch deviation</hd> <p>Figure 3(a) shows the distribution of absolute pitch deviations for each group in both the Speech and the Music conditions. These values were obtained by averaging the absolute pitch deviations across the syllables/notes (ranging from two to six) within each of the 20 utterances/melodies produced by each participant. These averages captured participants' performance across the entire stimulus set while minimizing the variations caused by extreme values (e.g. due to creaky voice). 
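</p> <p>The Holm-Bonferroni adjustment applied to the post hoc comparisons is a step-down procedure that can be sketched as follows. The function is a hypothetical stand-alone illustration of the general procedure, not the code used by the <emph>emmeans</emph> package: the smallest of m p-values is multiplied by m, the next by m − 1, and so on, with each adjusted value capped at 1 and forced to be no smaller than the one before it.</p>

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity across the ordered comparisons.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

<p>With three hypothetical raw p-values of 0.01, 0.04, and 0.03, the adjusted values are approximately 0.03, 0.06, and 0.06, so only the first comparison would remain significant at the 0.05 level.</p> <p>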
Results revealed a main effect of Condition (<emph>B</emph> = −22.55, SE<emph>B</emph> = 5.49, <emph>t</emph>(31.54) = −4.11, <emph>p</emph> &lt; 0.001) and a Group × Condition interaction (<emph>B</emph> = −7.78, SE<emph>B</emph> = 3.36, <emph>t</emph>(51.99) = −2.31, <emph>p</emph> = 0.02). Post hoc analyses with Holm-Bonferroni correction for multiple comparisons suggested no group differences in either condition (Speech: <emph>t</emph>(72.8) = 0.17, <emph>p</emph> = 0.88; Music: <emph>t</emph>(72.6) = 1.67, <emph>p</emph> = 0.10); instead, the interaction was driven by both groups performing better on absolute pitch matching for music than for speech, with the trend being more pronounced in the autism group (<emph>t</emph>(48.5) = 4.76, <emph>p</emph> &lt; 0.001, Music: M(<emph>SD</emph>) = 142.8(108.29); Speech: M(<emph>SD</emph>) = 201.96(110.98)) than in the non-autism group (<emph>t</emph>(50.9) = 2.27, <emph>p</emph> = 0.03, Music: M(<emph>SD</emph>) = 163.31(113.68); Speech: M(<emph>SD</emph>) = 192.22(112.98)). No other main effects were significant (Table 3).</p> <p>Graph: Figure 3. Pitch-related measures for the autism and non-autism groups. (a) Absolute pitch deviations (in cents), with black lines representing mean values. (b) Relative pitch deviations (in cents), with black lines representing mean values. (c) The number of pitch contour errors, with error bars representing the standard deviation. (d) Number of pitch interval errors, with error bars representing the standard deviation. Different plots are selected depending on the nature of the data type, with (a) and (b) representing continuous data, (c) and (d) representing discrete data.</p> <p>Table 3. Coefficients for the linear mixed-effects models for pitch-related measures.</p> <p>Graph</p> <p> <ephtml> &lt;table&gt;&lt;colgroup&gt;&lt;col align="left" /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." 
/&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;/colgroup&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="left"&gt;Measure&lt;/th&gt;&lt;th align="left"&gt;Effect&lt;/th&gt;&lt;th align="left"&gt;Estimate&lt;/th&gt;&lt;th align="left"&gt;Std. Error&lt;/th&gt;&lt;th align="left"&gt;df&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;t&lt;/italic&gt;&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;p&lt;/italic&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Absolute pitch deviation&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&amp;#8211;9.47&lt;/td&gt;&lt;td&gt;9.76&lt;/td&gt;&lt;td&gt;59.03&lt;/td&gt;&lt;td&gt;&amp;#8211;0.97&lt;/td&gt;&lt;td&gt;0.34&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;22.55&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;5.49&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;31.54&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;4.11&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" rid="tfn2"&gt;**&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&amp;#8211;0.64&lt;/td&gt;&lt;td&gt;0.43&lt;/td&gt;&lt;td&gt;59.01&lt;/td&gt;&lt;td&gt;&amp;#8211;1.48&lt;/td&gt;&lt;td&gt;0.14&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;&amp;#8211;3.44&lt;/td&gt;&lt;td&gt;3.34&lt;/td&gt;&lt;td&gt;59.04&lt;/td&gt;&lt;td&gt;&amp;#8211;1.03&lt;/td&gt;&lt;td&gt;0.31&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;7.78&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;3.36&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;51.99&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;2.31&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.02&lt;xref ref-type="table-fn" rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Relative pitch 
deviation&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;7.04&lt;/td&gt;&lt;td&gt;5.25&lt;/td&gt;&lt;td&gt;58.70&lt;/td&gt;&lt;td&gt;1.34&lt;/td&gt;&lt;td&gt;0.19&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;44.77&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;9.02&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;23.89&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;4.97&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" rid="tfn2"&gt;**&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&amp;#8211;0.36&lt;/td&gt;&lt;td&gt;0.23&lt;/td&gt;&lt;td&gt;58.64&lt;/td&gt;&lt;td&gt;&amp;#8211;1.55&lt;/td&gt;&lt;td&gt;0.13&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;&amp;#8211;2.89&lt;/td&gt;&lt;td&gt;1.79&lt;/td&gt;&lt;td&gt;58.76&lt;/td&gt;&lt;td&gt;&amp;#8211;1.61&lt;/td&gt;&lt;td&gt;0.11&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;7.20&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;3.41&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;60.62&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&amp;#8211;&lt;bold&gt;2.11&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.04&lt;xref ref-type="table-fn" rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Pitch contour errors&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.06&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.02&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;57.72&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;2.63&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.01&lt;xref ref-type="table-fn" 
rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.18&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.04&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;24.82&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;4.13&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&amp;#8211;0.0006&lt;/td&gt;&lt;td&gt;0.001&lt;/td&gt;&lt;td&gt;57.64&lt;/td&gt;&lt;td&gt;&amp;#8211;0.60&lt;/td&gt;&lt;td&gt;0.55&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;&amp;#8211;0.009&lt;/td&gt;&lt;td&gt;0.008&lt;/td&gt;&lt;td&gt;57.86&lt;/td&gt;&lt;td&gt;&amp;#8211;1.09&lt;/td&gt;&lt;td&gt;0.28&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;0.02&lt;/td&gt;&lt;td&gt;0.02&lt;/td&gt;&lt;td&gt;60.66&lt;/td&gt;&lt;td&gt;1.05&lt;/td&gt;&lt;td&gt;0.30&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Pitch interval errors&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&amp;#8211;0.01&lt;/td&gt;&lt;td&gt;0.04&lt;/td&gt;&lt;td&gt;55.82&lt;/td&gt;&lt;td&gt;&amp;#8211;0.30&lt;/td&gt;&lt;td&gt;0.76&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.24&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.06&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;23.50&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;4.16&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" 
rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&amp;#8211;0.002&lt;/td&gt;&lt;td&gt;0.002&lt;/td&gt;&lt;td&gt;58.91&lt;/td&gt;&lt;td&gt;&amp;#8211;1.06&lt;/td&gt;&lt;td&gt;0.29&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.03&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.01&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;59.10&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;2.60&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.01&lt;xref ref-type="table-fn" rid="tfn2"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;&amp;#8211;0.04&lt;/td&gt;&lt;td&gt;0.02&lt;/td&gt;&lt;td&gt;60.91&lt;/td&gt;&lt;td&gt;&amp;#8211;1.68&lt;/td&gt;&lt;td&gt;0.10&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>2 Bold values indicate statistical significance at <emph>p</emph> &lt; 0.05. *<emph>p</emph> &lt; 0.05, **<emph>p</emph> &lt; 0.01, ***<emph>p</emph> &lt; 0.001.</p> <hd id="AN0183028988-11">Relative pitch deviation</hd> <p>Figure 3(b) shows the distribution of the relative pitch deviations for each group in both the Speech and the Music conditions. Results revealed a significant main effect of Condition (<emph>B</emph> = −44.77, SE<emph>B</emph> = 9.02, <emph>t</emph>(23.89) = −4.97, <emph>p</emph> &lt; 0.001) and a significant interaction between Group and Condition (<emph>B</emph> = −7.20, SE<emph>B</emph> = 3.41, <emph>t</emph>(60.62) = −2.11, <emph>p</emph> = 0.04). 
Post hoc analyses with Holm-Bonferroni correction for multiple comparisons suggested that both groups showed better relative pitch matching for music than for speech (Autism: <emph>t</emph>(29.9) = 5.42, <emph>p</emph> &lt; 0.001; Non-autism: <emph>t</emph>(31.1) = 3.88, <emph>p</emph> &lt; 0.001), and the autism group performed worse than the non-autism group in the speech condition (<emph>t</emph>(102) = −2.27, <emph>p</emph> = 0.03, Autism: M(<emph>SD</emph>) = 215.08(133.52); Non-autism: M(<emph>SD</emph>) = 179.73(107.35)) but not in the music condition (<emph>t</emph>(102) = 0.03, <emph>p</emph> = 0.98, Autism: M(<emph>SD</emph>) = 113.88(86.71); Non-autism: M(<emph>SD</emph>) = 104.87(68.31)). No other main effects were significant (see Table 3).</p> <hd id="AN0183028988-12">Number of pitch contour errors</hd> <p>Figure 3(c) shows the distribution of the number of pitch contour errors for each group in both the Speech and Music conditions. These values were obtained by summing errors over two to six syllables/notes within each of the 20 utterances/melodies produced by each participant. Results revealed, as shown in Table 3, significant main effects of Group (<emph>B</emph> = 0.06, SE<emph>B</emph> = 0.02, <emph>t</emph>(57.72) = 2.63, <emph>p</emph> = 0.01) and Condition (<emph>B</emph> = −0.18, SE<emph>B</emph> = 0.04, <emph>t</emph>(24.82) = −4.13, <emph>p</emph> &lt; 0.001), as both groups made fewer contour errors in the music condition (Autism: M(<emph>SD</emph>) = 6.88(5.37), Non-autism: M(<emph>SD</emph>) = 3.20(2.73)) than in the speech condition (Autism: M(<emph>SD</emph>) = 12.88(4.85), Non-autism: M(<emph>SD</emph>) = 11.33(3.90)), and the autism group exhibited more pitch contour errors than the non-autism group across both conditions. 
The Group × Condition interaction and the effects of PPVT-R and Age were not significant.</p> <hd id="AN0183028988-13">Number of pitch interval errors</hd> <p>Figure 3(d) shows the distribution of the number of pitch interval errors for each group in both the Speech and Music conditions. As shown in Table 3, the linear mixed-effects model revealed a significant main effect of Condition (<emph>B</emph> = −0.24, SE<emph>B</emph> = 0.06, <emph>t</emph>(23.50) = −4.16, <emph>p</emph> &lt; 0.001), as both groups showed fewer pitch interval errors in the music condition (M(<emph>SD</emph>) = 23.3(7.37)) than in the speech condition (M(<emph>SD</emph>) = 32.24(5.46)). Age was a significant predictor of the performance on pitch interval errors (<emph>B</emph> = −0.03, SE<emph>B</emph> = 0.01, <emph>t</emph>(59.10) = −2.60, <emph>p</emph> = 0.01), with older age associated with fewer interval errors. No other main effects or interactions were significant. In addition, Pearson correlations confirmed the significant association between pitch interval errors and age (<emph>r</emph>(124) = −0.21, <emph>p</emph> = 0.02), but not with PPVT-R (<emph>r</emph>(124) = −0.04, <emph>p</emph> = 0.59).</p> <hd id="AN0183028988-14">Absolute duration difference</hd> <p>Figure 4(a) shows the distribution of the absolute duration differences for each group in both the Speech and Music conditions. The linear mixed-effects model revealed, as shown in Table 4, significant main effects of Group (<emph>B</emph> = 13.95, SE<emph>B</emph> = 4.91, <emph>t</emph>(58.81) = 2.84, <emph>p</emph> = 0.006) and Condition (<emph>B</emph> = 64.65, SE<emph>B</emph> = 4.61, <emph>t</emph>(61.02) = 14.03, <emph>p</emph> &lt; 0.001), as well as a Group × Condition interaction (<emph>B</emph> = 14.35, SE<emph>B</emph> = 4.61, <emph>t</emph>(61.02) = 3.11, <emph>p</emph> = 0.003). 
Post hoc analyses with Holm-Bonferroni correction for multiple comparisons suggested that both groups showed larger absolute duration differences in the music condition than in the speech condition (Autism: <emph>t</emph>(61.1) = −12.41, <emph>p</emph> &lt; 0.001; Non-Autism: <emph>t</emph>(60.9) = −7.54, <emph>p</emph> &lt; 0.001), and the autism group produced larger absolute duration differences than did the non-autism group in the music condition (<emph>t</emph>(119) = −4.21, <emph>p</emph> &lt; 0.001, Autism: M(<emph>SD</emph>) = 222.42(121.64); Non-Autism: M(<emph>SD</emph>) = 156.5(84.82)) but not in the speech condition (<emph>t</emph>(119) = 0.06, <emph>p</emph> = 0.95, Autism: M(<emph>SD</emph>) = 64.99(50.05); Non-Autism: M(<emph>SD</emph>) = 56.08(29.31)). Receptive vocabulary was a significant predictor of the performance on absolute duration matching (<emph>B</emph> = −0.58, SE<emph>B</emph> = 0.22, <emph>t</emph>(58.76) = −2.71, <emph>p</emph> = 0.009; see Table 4), with larger vocabulary associated with greater accuracy in absolute duration matching. The effect of Age was not significant. Again, Pearson correlations confirmed the significant association between the absolute duration differences and PPVT-R (<emph>r</emph>(124) = −0.22, <emph>p</emph> = 0.02), but not with age (<emph>r</emph>(124) = −0.02, <emph>p</emph> = 0.79).</p> <p>Graph: Figure 4. Duration-related measures for the autism and non-autism groups. (a) Absolute duration differences (in milliseconds), with black lines representing mean values. (b) Relative duration differences (in milliseconds), with black lines representing mean values. (c) Number of time errors, with error bars representing the standard deviation. 
Different plots are selected depending on the nature of the data type, with (a) and (b) representing continuous data, and (c) representing discrete data.</p> <p>Table 4. Coefficients for the linear mixed-effects models for duration-related measures.</p> <p>Graph</p> <p> <ephtml> &lt;table&gt;&lt;colgroup&gt;&lt;col align="left" /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;col align="char" char="." /&gt;&lt;/colgroup&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="left"&gt;Measure&lt;/th&gt;&lt;th align="left"&gt;Effect&lt;/th&gt;&lt;th align="left"&gt;Estimate&lt;/th&gt;&lt;th align="left"&gt;Std. Error&lt;/th&gt;&lt;th align="left"&gt;df&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;t&lt;/italic&gt;&lt;/th&gt;&lt;th align="left"&gt;&lt;italic&gt;p&lt;/italic&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Absolute duration difference&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&lt;bold&gt;13.95&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;4.91&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;58.81&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;2.84&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.006&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;64.65&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;4.61&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;61.02&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;14.03&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" 
rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.58&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.22&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;58.76&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;2.71&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.009&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;0.19&lt;/td&gt;&lt;td&gt;1.68&lt;/td&gt;&lt;td&gt;58.85&lt;/td&gt;&lt;td&gt;0.11&lt;/td&gt;&lt;td&gt;0.91&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;14.35&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;4.61&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;61.02&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;3.11&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.003&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Relative duration difference&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&lt;bold&gt;10.12&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;4.90&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;58.94&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;2.07&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.04&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;42.86&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;4.67&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;60.96&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;9.18&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" rid="tfn3"&gt;*&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.63&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.22&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;58.88&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;2.90&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.005&lt;xref ref-type="table-fn" 
rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;0.80&lt;/td&gt;&lt;td&gt;1.68&lt;/td&gt;&lt;td&gt;58.98&lt;/td&gt;&lt;td&gt;0.48&lt;/td&gt;&lt;td&gt;0.63&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;8.53&lt;/td&gt;&lt;td&gt;4.67&lt;/td&gt;&lt;td&gt;60.96&lt;/td&gt;&lt;td&gt;1.83&lt;/td&gt;&lt;td&gt;0.07&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td rowspan="5"&gt;Time errors&lt;/td&gt;&lt;td&gt;Group&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.15&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.07&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;68.00&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;2.08&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.04&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;0.48&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.07&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;69.65&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#8211;6.68&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;&amp;#60;&lt;/bold&gt; 0.001&lt;xref ref-type="table-fn" rid="tfn3"&gt;*&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PPVT-R&lt;/td&gt;&lt;td&gt;&amp;#8211;0.005&lt;/td&gt;&lt;td&gt;0.003&lt;/td&gt;&lt;td&gt;58.92&lt;/td&gt;&lt;td&gt;&amp;#8211;1.67&lt;/td&gt;&lt;td&gt;0.10&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Age&lt;/td&gt;&lt;td&gt;&amp;#8211;0.001&lt;/td&gt;&lt;td&gt;0.02&lt;/td&gt;&lt;td&gt;59.04&lt;/td&gt;&lt;td&gt;&amp;#8211;0.44&lt;/td&gt;&lt;td&gt;0.67&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Group &amp;#215; Condition&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.14&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;0.06&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;61.13&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;&lt;bold&gt;2.15&lt;/bold&gt;&lt;/td&gt;&lt;td&gt;0.04&lt;xref ref-type="table-fn" rid="tfn3"&gt;&lt;/xref&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>3 Bold values indicate 
statistical significance at <emph>p</emph> &lt; 0.05. *<emph>p</emph> &lt; 0.05, **<emph>p</emph> &lt; 0.01, ***<emph>p</emph> &lt; 0.001.</p> <hd id="AN0183028988-15">Relative duration difference</hd> <p>Figure 4(b) shows the distribution of the relative duration differences for each group in both the Speech and Music conditions. The linear mixed-effects model revealed significant main effects of Group (<emph>B</emph> = 10.12, SE<emph>B</emph> = 4.90, <emph>t</emph>(58.94) = 2.07, <emph>p</emph> = 0.04) and Condition (<emph>B</emph> = 42.86, SE<emph>B</emph> = 4.67, <emph>t</emph>(60.96) = 9.18, <emph>p</emph> &lt; 0.001). Both groups showed larger relative duration differences in the music condition than in the speech condition, and the autism group produced larger relative duration differences than did the non-autism group not only in the music condition (Autism: M(<emph>SD</emph>) = 163.15(129.26); Non-Autism: M(<emph>SD</emph>) = 116.66(80.55)) but also in the speech condition (Autism: M(<emph>SD</emph>) = 60.64(53.99); Non-Autism: M(<emph>SD</emph>) = 48.2(30.89)). Similarly, receptive vocabulary was a significant predictor of performance on relative duration matching (<emph>B</emph> = −0.63, SE<emph>B</emph> = 0.22, <emph>t</emph>(58.88) = −2.90, <emph>p</emph> = 0.005): the larger the receptive vocabulary of the participants, the greater the accuracy in their relative duration matching. The interaction between Group and Condition and the effect of Age were not significant (see Table 4). 
Pearson correlations confirmed the significant association between the relative duration differences and PPVT-R (<emph>r</emph>(124) = −0.26, <emph>p</emph> = 0.003), but not with age (<emph>r</emph>(124) = 0.007, <emph>p</emph> = 0.94).</p> <hd id="AN0183028988-16">Number of time errors</hd> <p>Figure 4(c) shows the distribution of the number of time errors for each group in both the Speech and Music conditions. The linear mixed-effects model revealed significant main effects of Group (<emph>B</emph> = 0.15, SE<emph>B</emph> = 0.07, <emph>t</emph>(68) = 2.08, <emph>p</emph> = 0.04) and Condition (<emph>B</emph> = −0.48, SE<emph>B</emph> = 0.07, <emph>t</emph>(69.65) = −6.68, <emph>p</emph> &lt; 0.001). The interaction between Group and Condition was also significant (<emph>B</emph> = 0.14, SE<emph>B</emph> = 0.06, <emph>t</emph>(61.13) = 2.15, <emph>p</emph> = 0.04). Post hoc analyses with Holm-Bonferroni correction for multiple comparisons suggested that both groups showed fewer time errors in music imitation than in speech imitation (Autism: <emph>t</emph>(70.6) = 3.61, <emph>p</emph> &lt; 0.001; Non-Autism: <emph>t</emph>(69.8) = 6.28, <emph>p</emph> &lt; 0.001), and the autism group performed worse than the non-autism group in the music condition (<emph>t</emph>(128) = −2.98, <emph>p</emph> = 0.003, Autism: M(<emph>SD</emph>) = 33.67(16.53); Non-Autism: M(<emph>SD</emph>) = 20.7(15.31)), but not in the speech condition (<emph>t</emph>(129) = −0.12, <emph>p</emph> = 0.90, Autism: M(<emph>SD</emph>) = 46.27(11.22); Non-Autism: M(<emph>SD</emph>) = 45.03(10.63)). 
The effects of PPVT-R and Age did not reach significance (see Table 4).</p> <hd id="AN0183028988-17">Discussion</hd> <p>Using matched speech and song stimuli, the present study investigated vocal imitation in Mandarin-speaking autistic and non-autistic individuals. Our acoustic analysis unveiled distinct patterns in vocal imitation performance between the two groups.</p> <p>For speech imitation, Mandarin-speaking autistic participants were less accurate than non-autistic individuals in matching relative pitch and duration. For song imitation, they showed reduced performance on both relative and absolute duration matching. These results are inconsistent with the patterns observed in English speakers ([<reflink idref="bib83" id="ref105">83</reflink>]), where English-speaking autistic individuals exhibited differences with absolute but not relative pitch and duration matching in both speech and music conditions. Specifically, we did not observe reduced absolute pitch matching in Mandarin-speaking autistic individuals, for either speech or song, contrary to the evidence presented by English-speaking individuals. The reason for this may be related to the tone language background. Indeed, [<reflink idref="bib18" id="ref106">18</reflink>] found that tone language speakers display a remarkably precise and stable form of absolute pitch when reproducing words. This may be because absolute pitch originally evolved as a feature of speech, similar to other features such as vowel quality, and speakers of tone languages naturally acquire this feature during critical periods of speech acquisition ([<reflink idref="bib18" id="ref107">18</reflink>]). Moreover, when using machine learning-based analysis to differentiate speech produced by autistic and non-autistic individuals, variations of voice pitch (e.g. absolute features) were significant between the two groups only for English speakers but not for Cantonese speakers ([<reflink idref="bib41" id="ref108">41</reflink>]). 
Thus, our Mandarin-speaking autistic participants, despite their relatively smaller receptive vocabularies compared to their peers, still had the advantage of a tone language background and showed comparable performance to non-autistic participants in terms of absolute pitch matching.</p> <p>Regarding duration matching, the present findings complement those of [<reflink idref="bib41" id="ref109">41</reflink>], where both English- and Cantonese-speaking autistic individuals exhibited atypical rhythm production relative to non-autistic individuals. Our results from Mandarin speakers further reveal that such rhythmic differences may be primarily driven by relative rather than absolute duration-matching abilities. In contrast, for English speakers, speech rhythm differences between autistic and non-autistic groups were evident in absolute rather than relative duration matching ([<reflink idref="bib83" id="ref110">83</reflink>]). Consequently, although differences with speech duration matching are shared across linguistic groups in autism, the underlying cause as related to absolute versus relative duration feature matching may vary across languages. In addition, consistent with previous studies ([<reflink idref="bib11" id="ref111">11</reflink>]; [<reflink idref="bib39" id="ref112">39</reflink>]), the current results showed that participants with higher receptive vocabulary abilities performed better in imitating the absolute and relative duration of notes/syllables. This relationship suggests that a larger receptive vocabulary may be linked to better temporal processing and timing control, which are crucial for accurate duration imitation and speech production. Therefore, future research should incorporate receptive verbal skills, along with expressive language, to provide a more holistic understanding of language abilities and their impact on duration imitation skills. 
Consistent with the hypothesis linking linguistic and musical rhythm ([<reflink idref="bib61" id="ref113">61</reflink>]; [<reflink idref="bib62" id="ref114">62</reflink>]), atypical duration matching in the autism group was observed not only in speech but also in song imitation.</p> <p>In terms of the research questions posed and our predictions, our finding of reduced duration matching but intact pitch matching during song imitation in autism is consistent with our hypothesis. Contrary to our hypothesis, however, both reduced relative pitch and duration matching were present during speech imitation in autism. This finding is to some extent in line with previous results showing atypical pitch and duration production of speech in autism ([<reflink idref="bib12" id="ref115">12</reflink>]; [<reflink idref="bib23" id="ref116">23</reflink>]; [<reflink idref="bib30" id="ref117">30</reflink>]). Our results further indicate that imitation differences in speech might only be observed in relative rather than absolute features in Mandarin-speaking autistic individuals. As speaking a tone language is one of the most robust ways to improve the ability to process pitch, including both perception and production ([<reflink idref="bib7" id="ref118">7</reflink>]; [<reflink idref="bib10" id="ref119">10</reflink>]; [<reflink idref="bib13" id="ref120">13</reflink>]; [<reflink idref="bib43" id="ref121">43</reflink>]; [<reflink idref="bib66" id="ref122">66</reflink>]), we hypothesized that experience with a native tone language might have a compensatory effect on possible pitch matching difficulties in Mandarin-speaking autistic individuals. That is, we expected that in the current imitation tasks, autistic participants would show reduced duration but not pitch imitation in both speech and song compared to non-autistic participants. 
However, the results suggest that this compensatory effect is present only when imitating song stimuli.</p> <p>To the best of our knowledge, pitch and duration matching in speech and song imitation has not been previously studied in Mandarin-speaking autistic individuals, making it difficult to find evidence to explain why Mandarin-speaking autistic individuals show preservation of relative pitch in music but not in speech during vocal imitation. One possibility might relate to the different precision requirements for pitch processing between speech and music. Ample evidence suggests that, to achieve adequate communication, a higher degree of pitch precision is required in conveying musical meaning than speech meaning ([<reflink idref="bib44" id="ref123">44</reflink>]; [<reflink idref="bib59" id="ref124">59</reflink>], [<reflink idref="bib60" id="ref125">60</reflink>]). Indeed, the present study, together with previous studies ([<reflink idref="bib44" id="ref126">44</reflink>]; [<reflink idref="bib49" id="ref127">49</reflink>]; [<reflink idref="bib83" id="ref128">83</reflink>]), found that both autistic and non-autistic individuals imitated song more accurately than speech on all pitch-related measures. Thus, the compensatory effect of experience with a native tone language on autistic individuals seems to work only when pitch precision is required, as in music, but not when pitch approximation is sufficient, as in speech. The absence of this compensatory effect in speech may have led to reduced performance in the autism group compared to the non-autism group. Another possibility may be linked to the multiple roles of pitch in tone languages. 
As noted earlier, unlike in intonation languages, pitch in tone languages carries prosodic meaning at the sentence level and lexical meaning at the syllable or word level in parallel, which increases the complexity and difficulty of pitch imitation in the speech condition ([<reflink idref="bib46" id="ref129">46</reflink>]; [<reflink idref="bib92" id="ref130">92</reflink>]). Finally, extensive research has shown a dissociation between musical (enhanced or intact) and linguistic (reduced) skills in autism (for reviews, see [<reflink idref="bib57" id="ref131">57</reflink>]; [<reflink idref="bib58" id="ref132">58</reflink>]; [<reflink idref="bib69" id="ref133">69</reflink>]). Autistic individuals also show typical brain activations and connectivity to musical stimuli but not to speech stimuli ([<reflink idref="bib40" id="ref134">40</reflink>]; [<reflink idref="bib77" id="ref135">77</reflink>]). Thus, typical pitch imitation for songs among autistic Mandarin speakers is in line with the wider literature. Further studies are needed to explore these possibilities.</p> <p>Interestingly, autistic participants made more pitch contour errors than non-autistic participants across speech and music domains. Mandarin has four lexical tones (high level, high rising, falling-rising, and high falling), which correspond to four different shapes of pitch contour ([<reflink idref="bib28" id="ref136">28</reflink>]). Research has found that Mandarin speakers are more sensitive to pitch contours than speakers of intonation languages ([<reflink idref="bib29" id="ref137">29</reflink>]; [<reflink idref="bib43" id="ref138">43</reflink>]; [<reflink idref="bib90" id="ref139">90</reflink>]). In addition, a recent study examined the pitch production of Cantonese tones (CT) in Cantonese- and Mandarin-speaking autistic and non-autistic children ([<reflink idref="bib12" id="ref140">12</reflink>]). 
That study found that autistic children exhibited atypical pitch production for contour tones with steeper slopes (i.e. CT25 in the study) but not for level tones (i.e. CT55, CT33, and CT22) or contour tones with flatter slopes (i.e. CT21, CT23). In the present study, pitch contours were defined based on the pitch heights of two consecutive syllables/notes: a contour was classified as up or down if the pitch rose or fell by 50 cents or more from one syllable/note to the next, and as flat otherwise. Each participant had 60 values of pitch contour errors for each condition. Of the 120 values created by the male/female model, only six were flat contours. Thus, the current results extend the findings of [<reflink idref="bib12" id="ref141">12</reflink>], suggesting that autistic children who speak a tone language might differ from their peers in producing pitch contours across syllables in both speech and music domains. In addition, older age was associated with fewer pitch interval errors, suggesting that age-related maturation positively influences the accuracy of pitch interval imitation. Evidence from the music domain suggests that there are learning and transfer effects in vocal matching of pitch intervals ([<reflink idref="bib27" id="ref142">27</reflink>]), which aligns with our findings. However, among the pitch-related parameters, the effect of age was observed only for pitch interval matching, so these results should be interpreted with caution and warrant further investigation.</p> <p>Moreover, in line with previous studies ([<reflink idref="bib44" id="ref143">44</reflink>]; [<reflink idref="bib49" id="ref144">49</reflink>]; [<reflink idref="bib59" id="ref145">59</reflink>], [<reflink idref="bib60" id="ref146">60</reflink>]), both autistic and non-autistic Mandarin speakers showed greater sensitivity to duration in speech than in song, while exhibiting greater sensitivity to pitch in song than in speech. 
This suggests that pitch imitation is independent of the imitation of duration across different domains (speech vs music) ([<reflink idref="bib15" id="ref147">15</reflink>], [<reflink idref="bib16" id="ref148">16</reflink>]; [<reflink idref="bib20" id="ref149">20</reflink>]; [<reflink idref="bib49" id="ref150">49</reflink>]). These results are consistent with perception research suggesting that the perception of speech content is most affected by degradation in the temporal dimension, while the perception of melodic content is most affected by degradation in the spectral dimension ([<reflink idref="bib1" id="ref151">1</reflink>]).</p> <p>While our study provides valuable insights into vocal imitation in autistic individuals within tonal linguistic contexts, several limitations should be acknowledged. First, due to our task demands, we recruited participants whose cognitive functioning ranged from the typical to the higher end of the distribution on the autism spectrum. This limits the generalizability of our current findings to individuals with cognitive disadvantages, a research area that remains to be explored. In addition, given the severe shortage of reliable and standardized speech and language assessment tools available in the Chinese language, especially in Mandarin ([<reflink idref="bib33" id="ref152">33</reflink>]), the PPVT-R was chosen to measure receptive vocabulary skills. While the PPVT-R is a well-established instrument for assessing vocabulary, it focuses specifically on receptive vocabulary and does not fully capture the participants' overall language abilities. In particular, without a measure of expressive language, we cannot rule out the possibility that group differences may be influenced by variations in expressive language abilities. 
It should also be noted that, because the available Chinese norms of the PPVT-R are limited, we supplemented our analysis with the American norms ([<reflink idref="bib22" id="ref153">22</reflink>]) for standardization purposes. This reliance on norms that are more than 40 years old may explain the higher receptive vocabulary abilities observed in our sample. Future research would benefit from the development and validation of comprehensive assessments of both receptive and expressive language, as well as pragmatic skills, tailored to the linguistic characteristics of the Mandarin-speaking population ([<reflink idref="bib94" id="ref154">94</reflink>]) to provide a more holistic understanding of language abilities and vocal imitation skills. Finally, the age range of our participants was relatively wide, including both children and adolescents. While we incorporated age as a factor in the statistical analysis to account for potential age-related variations, the nonsignificant age effects in most results suggest that, within the current sample, age may not be a prominent factor influencing vocal imitation abilities. However, it is crucial to recognize that puberty introduces substantial alterations to the vocal apparatus, along with developmental changes in the vocal tract and vocal fold length ([<reflink idref="bib26" id="ref155">26</reflink>]). Despite our efforts to control for age-related differences, the extent to which variability in the timing and degree of development-related voice changes contributes to the outcomes in vocal imitation remains to be assessed. 
Future investigations with a more refined age focus or additional measures to directly assess and control for development could offer a more comprehensive understanding of the intricate interplay between vocal imitation abilities in autism and developmental changes.</p> <p>Finally, it is worth exploring the potential clinical relevance of the current results on relative versus absolute feature matching during vocal imitation in autism. Research has shown that effective imitation of vocal features enhances language acquisition in both typical development ([<reflink idref="bib37" id="ref156">37</reflink>]; [<reflink idref="bib50" id="ref157">50</reflink>]) and in autism ([<reflink idref="bib72" id="ref158">72</reflink>]; [<reflink idref="bib78" id="ref159">78</reflink>]). It has been suggested that social reinforcement through caregivers' vocal imitation can facilitate infants' vocalizations ([<reflink idref="bib55" id="ref160">55</reflink>]; [<reflink idref="bib64" id="ref161">64</reflink>]), and slowing down the presentation of vocal sounds can better induce vocal imitation in autistic children ([<reflink idref="bib79" id="ref162">79</reflink>]). Thus, autistic children's language learning may benefit from vocal imitation of sung materials, an area of research that warrants experimental investigations.</p> <hd id="AN0183028988-18">Conclusion</hd> <p>This study assessed, for the first time, the vocal imitation ability of Mandarin-speaking autistic individuals, using speech and song stimuli matched for linguistic content and pitch contour. The results indicated that Mandarin-speaking autistic individuals showed atypical duration but not pitch matching during song imitation, whereas for speech imitation only relative but not absolute pitch and duration matching was atypical. In addition, Mandarin-speaking autistic individuals showed differences in imitating pitch contours across speech and song. 
These findings reveal atypical vocal imitation across speech and music domains among Mandarin-speaking autistic individuals, with a unique pattern that differs from that reported in previous studies of non-tonal language speakers. This study therefore extends our understanding of vocal imitation in autism across different languages. Future research should examine vocal imitation in other linguistic contexts to consolidate the current results.</p> <hd id="AN0183028988-19">Supplemental Material</hd> <p>Supplemental material, sj-docx-1-aut-10.1177_13623613241275395 for Atypical vocal imitation of speech and song in autism spectrum disorder: Evidence from Mandarin speakers by Li Wang, Peter Q Pfordresher, Cunmei Jiang and Fang Liu in Autism</p> <ref id="AN0183028988-20"> <title> References </title> <blist> <bibl id="bib1" idref="ref33" type="bt">1</bibl> <bibtext> Albouy P., Benjamin L., Morillon B., Zatorre R. J. (2020). Distinct sensitivity to spectrotemporal modulation supports brain asymmetry for speech and melody. Science, 367(6481), 1043–1047. https://doi.org/10.1126/science.aaz3468</bibtext> </blist> <blist> <bibl id="bib2" idref="ref34" type="bt">2</bibl> <bibtext> American Psychiatric Association. (1994). American Psychiatric Association diagnostic and statistical manual of mental disorders (DSM-IV) (4th ed.).</bibtext> </blist> <blist> <bibl id="bib3" idref="ref39" type="bt">3</bibl> <bibtext> American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (DSM-5). <ulink href="http://dsm.psychiatryonline.org/doi/book/10.1176/appi.books.9780890425596">http://dsm.psychiatryonline.org/doi/book/10.1176/appi.books.9780890425596</ulink></bibtext> </blist> <blist> <bibl id="bib4" idref="ref89" type="bt">4</bibl> <bibtext> Barr D. J. (2013). Random effects structure for testing interactions in linear mixed-effects models. Frontiers in Psychology, 4, Article 328. 
https://doi.org/10.3389/fpsyg.2013.00328</bibtext> </blist> <blist> <bibl id="bib5" idref="ref90" type="bt">5</bibl> <bibtext> Barr D. J., Levy R., Scheepers C., Tily H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001</bibtext> </blist> <blist> <bibl id="bib6" idref="ref86" type="bt">6</bibl> <bibtext> Bates D., Maechler M., Bolker B. (2012). lme4: Linear mixed-effects models using S4 classes (R package version 0.999999-0). R Core Team.</bibtext> </blist> <blist> <bibl id="bib7" idref="ref118" type="bt">7</bibl> <bibtext> Bidelman G. M., Hutka S., Moreno S. (2013). Tone language speakers and musicians share enhanced perceptual and cognitive abilities for musical pitch: Evidence for bidirectionality between the domains of language and music. PLOS ONE, 8(4), Article e60676. https://doi.org/10.1371/journal.pone.0060676</bibtext> </blist> <blist> <bibl id="bib8" idref="ref60" type="bt">8</bibl> <bibtext> Boersma P., Weenink D. (2001). Praat, a system for doing phonetics by computer. Glot International, 5, 341–345.</bibtext> </blist> <blist> <bibl id="bib9" idref="ref87" type="bt">9</bibl> <bibtext> Brauer M., Curtin J. J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychological Methods, 23(3), 389–411. https://doi.org/10.1037/met0000159</bibtext> </blist> <blist> <bibtext> Burnham D., Kasisopa B., Reid A., Luksaneeyanawin S., Lacerda F., Attina V., Rattanasone N. X., Schwarz I.-C., Webster D. (2015). Universality and language-specific experience in the perception of lexical tone and pitch. Applied Psycholinguistics, 36(6), 1459–1491. https://doi.org/10.1017/S0142716414000496</bibtext> </blist> <blist> <bibtext> Carello C., LeVasseur V. M., Schmidt R. C. (2002). 
Movement sequencing and phonological fluency in (putatively) nonimpaired readers. Psychological Science, 13(4), 375–379. https://doi.org/10.1111/1467-9280.00467</bibtext> </blist> <blist> <bibtext> Chen F., Cheung C.-H., Peng G. (2022). Linguistic tone and non-linguistic pitch imitation in children with autism spectrum disorders: A cross-linguistic investigation. Journal of Autism and Developmental Disorders, 52(5), 2325–2343. https://doi.org/10.1007/s10803-021-05123-4</bibtext> </blist> <blist> <bibtext> Creel S. C., Weng M., Fu G., Heyman G. D., Lee K. (2018). Speaking a tone language enhances musical pitch perception in 3–5-year-olds. Developmental Science, 21(1), Article e12503. https://doi.org/10.1111/desc.12503</bibtext> </blist> <blist> <bibtext> Dalla Bella S., Berkowska M., Sowiński J. (2011). Disorders of pitch production in tone deafness. Frontiers in Psychology, 2, Article 164. https://doi.org/10.3389/fpsyg.2011.00164</bibtext> </blist> <blist> <bibtext> Dalla Bella S., Giguère J.-F., Peretz I. (2007). Singing proficiency in the general population. The Journal of the Acoustical Society of America, 121(2), 1182–1189. https://doi.org/10.1121/1.2427111</bibtext> </blist> <blist> <bibtext> Dalla Bella S., Giguère J.-F., Peretz I. (2009). Singing in congenital amusia. The Journal of the Acoustical Society of America, 126(1), 414–424. https://doi.org/10.1121/1.3132504</bibtext> </blist> <blist> <bibtext> Deutsch D. (2013). Absolute pitch. In Deutsch D. (Ed.), The psychology of music (pp. 141–182). Elsevier. https://doi.org/10.1016/B978-0-12-381460-9.00005-5</bibtext> </blist> <blist> <bibtext> Deutsch D., Henthorn T., Dolson M. (2004). Absolute pitch, speech, and tone language: Some experiments and a proposed framework. Music Perception, 21(3), 339–356.</bibtext> </blist> <blist> <bibtext> Diehl J. J., Paul R. (2012). Acoustic differences in the imitation of prosodic patterns in children with autism spectrum disorders. 
Research in Autism Spectrum Disorders, 6(1), 123–134. https://doi.org/10.1016/j.rasd.2011.03.012</bibtext> </blist> <blist> <bibtext> Drake C., Palmer C. (2000). Skill acquisition in music performance: Relations between planning and temporal control. Cognition, 74(1), 1–32. https://doi.org/10.1016/S0010-0277(99)00061-X</bibtext> </blist> <blist> <bibtext> Duda M., Kosmicki J. A., Wall D. P. (2014). Testing the accuracy of an observation-based classifier for rapid detection of autism risk. Translational Psychiatry, 4(8), Article e424. https://doi.org/10.1038/tp.2014.65</bibtext> </blist> <blist> <bibtext> Dunn L. M., Dunn L. M. (1981). Peabody Picture Vocabulary Test-Revised. American Guidance Service.</bibtext> </blist> <blist> <bibtext> Fosnot S. M., Jun S.-A. (1999). Prosodic characteristics in children with stuttering or autism during reading and imitation. <ulink href="http://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS1999/papers/p14%5f1925.pdf">https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS1999/papers/p14%5f1925.pdf</ulink></bibtext> </blist> <blist> <bibtext> Gelman A., Hill J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.</bibtext> </blist> <blist> <bibtext> Gotham K., Pickles A., Lord C. (2009). Standardizing ADOS scores for a measure of severity in autism spectrum disorders. Journal of Autism and Developmental Disorders, 39(5), 693–705. https://doi.org/10.1007/s10803-008-0674-3</bibtext> </blist> <blist> <bibtext> Harries M. L. L., Walker J. M., Williams D. M., Hawkins S., Hughes I. A. (1997). Changes in the male voice at puberty. Archives of Disease in Childhood, 77(5), 445–447. https://doi.org/10.1136/adc.77.5.445</bibtext> </blist> <blist> <bibtext> Harvey N., Garwood J., Palencia M. (1987). Vocal matching of pitch intervals: Learning and transfer effects. Psychology of Music, 15(1), 90–106. 
https://doi.org/10.1177/0305735687151007</bibtext> </blist> <blist> <bibtext> Howie J. M. (1976). Acoustical studies of Mandarin vowels and tones. Cambridge University Press.</bibtext> </blist> <blist> <bibtext> Huang T., Johnson K. (2011). Language specificity in speech perception: Perception of Mandarin tones by native and nonnative listeners. Phonetica, 67(4), 243–267. https://doi.org/10.1159/000327392</bibtext> </blist> <blist> <bibtext> Hubbard K., Trauner D. A. (2007). Intonation and emotion in autistic spectrum disorders. Journal of Psycholinguistic Research, 36(2), 159–173. https://doi.org/10.1007/s10936-006-9037-4</bibtext> </blist> <blist> <bibtext> Hurley S. L., Chater N. (2005). Perspectives on imitation: Imitation, human development, and culture. MIT Press.</bibtext> </blist> <blist> <bibtext> Ingersoll B. (2008). The social role of imitation in autism: Implications for the treatment of imitation deficits. Infants &amp; Young Children, 21(2), 107–119. https://doi.org/10.1097/01.IYC.0000314482.24087.14</bibtext> </blist> <blist> <bibtext> Jin L., Zhu H. (2023). Developing standardized speech and language assessment tools in Mandarin Chinese: A context for improving reading and writing. Journal of Chinese Writing Systems, 7(3), 150–160. https://doi.org/10.1177/25138502231195119</bibtext> </blist> <blist> <bibtext> Johnson K. (2011). Acoustic and auditory phonetics. John Wiley &amp; Sons.</bibtext> </blist> <blist> <bibtext> Klein D., Zatorre R. J., Milner B., Zhao V. (2001). A cross-linguistic PET study of tone perception in Mandarin Chinese and English speakers. NeuroImage, 13(4), 646–653. https://doi.org/10.1006/nimg.2000.0738</bibtext> </blist> <blist> <bibtext> Krishnan A., Gandour J. T. (2009). The role of the auditory brainstem in processing linguistically-relevant pitch patterns. Brain and Language, 110(3), 135–148. https://doi.org/10.1016/j.bandl.2009.03.005</bibtext> </blist> <blist> <bibtext> Kuhl P. K., Meltzoff A. N. (1996). 
Infant vocalizations in response to speech: Vocal imitation and developmental change. The Journal of the Acoustical Society of America, 100(4), 2425–2438. https://doi.org/10.1121/1.417951</bibtext> </blist> <blist> <bibtext> Kuznetsova A., Brockhoff P. B., Christensen R. H. B. (2017). LmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13</bibtext> </blist> <blist> <bibtext> Ladányi E., Persici V., Fiveash A., Tillmann B., Gordon R. L. (2020). Is atypical rhythm a risk factor for developmental speech and language disorders? Wires Cognitive Science, 11(5), Article e1528. https://doi.org/10.1002/wcs.1528</bibtext> </blist> <blist> <bibtext> Lai G., Pantazatos S. P., Schneider H., Hirsch J. (2012). Neural systems for speech and song in autism. Brain, 135(3), 961–975. https://doi.org/10.1093/brain/awr335</bibtext> </blist> <blist> <bibtext> Lau J. C. Y., Patel S., Kang X., Nayar K., Martin G. E., Choy J., Wong P. C. M., Losh M. (2022). Cross-linguistic patterns of speech prosodic differences in autism: A machine learning study. PLOS ONE, 17(6), Article e0269637. https://doi.org/10.1371/journal.pone.0269637</bibtext> </blist> <blist> <bibtext> Lenth R., Singmann H., Love J., Buerkner P., Herve M. (2018). Emmeans: Estimated marginal means, aka least-squares means (1(2)) (R package version). https://github.com/rvlenth/emmeans</bibtext> </blist> <blist> <bibtext> Li Y., Tang C., Lu J., Wu J., Chang E. F. (2021). Human cortical encoding of pitch in tonal and non-tonal languages. Nature Communications, 12(1), Article 1. https://doi.org/10.1038/s41467-021-21430-x</bibtext> </blist> <blist> <bibtext> Liu F., Jiang C., Pfordresher P. Q., Mantell J. T., Xu Y., Yang Y., Stewart L. (2013). Individuals with congenital amusia imitate pitches more accurately in singing than in speaking: Implications for music and language processing. Attention, Perception, &amp; Psychophysics, 75(8), 1783–1798. 
https://doi.org/10.3758/s13414-013-0506-1</bibtext> </blist> <blist> <bibtext> Liu F., Jiang C., Thompson W. F., Xu Y., Yang Y., Stewart L. (2012). The mechanism of speech processing in congenital amusia: Evidence from Mandarin speakers. PLOS ONE, 7(2), Article e30374. https://doi.org/10.1371/journal.pone.0030374</bibtext> </blist> <blist> <bibtext> Liu F., Xu Y. (2005). Parallel encoding of focus and interrogative meaning in Mandarin intonation. Phonetica, 62(2–4), 70–87. https://doi.org/10.1159/000090090</bibtext> </blist> <blist> <bibtext> Liu J., Hilton C. B., Bergelson E., Mehr S. A. (2023). Language experience predicts music processing in a half-million speakers of fifty-four languages. Current Biology, 33(10), 1916–1925.e4. https://doi.org/10.1016/j.cub.2023.03.067</bibtext> </blist> <blist> <bibtext> Lord C., Rutter M., DiLavore P., Risi S., Gotham K., Bishop S. (2012). Autism diagnostic observation schedule–2nd edition (ADOS-2). Western Psychological Corporation.</bibtext> </blist> <blist> <bibtext> Mantell J. T., Pfordresher P. Q. (2013). Vocal imitation of song and speech. Cognition, 127(2), 177–202. https://doi.org/10.1016/j.cognition.2012.12.008</bibtext> </blist> <blist> <bibtext> Masur E. F., Olson J. (2008). Mothers' and infants' responses to their partners' spontaneous action and vocal/verbal imitation. Infant Behavior &amp; Development, 31(4), 704–715. https://doi.org/10.1016/j.infbeh.2008.04.005</bibtext> </blist> <blist> <bibtext> Mazaheri S., Soleymani Z. (2018). Imitation skill in children with autism spectrum disorder and its influence on their language acquisition and communication skills. Journal of Modern Rehabilitation, 12(3), 141–148.</bibtext> </blist> <blist> <bibtext> Mecke A.-C., Sundberg J. (2010). Gender differences in children's singing voices: Acoustic analyses and results of a listening test. The Journal of the Acoustical Society of America, 127(5), 3223–3231. 
https://doi.org/10.1121/1.3372730</bibtext> </blist> <blist> <bibtext> Meltzoff A. N. (2017). Elements of a comprehensive theory of infant imitation. Behavioral and Brain Sciences, 40, Article e396. https://doi.org/10.1017/S0140525X1600193X</bibtext> </blist> <blist> <bibtext> Mercado E., Mantell J. T., Pfordresher P. Q. (2014). Imitating sounds: A cognitive approach to understanding vocal imitation. Comparative Cognition &amp; Behavior Reviews, 9, 17–74. https://doi.org/10.3819/ccbr.2014.90002</bibtext> </blist> <blist> <bibtext> Neimy H., Pelaez M., Carrow J., Monlux K., Tarbox J. (2017). Infants at risk of autism and developmental disorders: Establishing early social skills. Behavioral Development Bulletin, 22(1), 6–22. https://doi.org/10.1037/bdb0000046</bibtext> </blist> <blist> <bibtext> Nicollas R., Garrel R., Ouaknine M., Giovanni A., Nazarian B., Triglia J.-M. (2008). Normal voice in children between 6 and 12 years of age: Database and nonlinear analysis. Journal of Voice, 22(6), 671–675. https://doi.org/10.1016/j.jvoice.2007.01.009</bibtext> </blist> <blist> <bibtext> O'Connor K. (2012). Auditory processing in autism spectrum disorder: A review. Neuroscience &amp; Biobehavioral Reviews, 36(2), 836–854. https://doi.org/10.1016/j.neubiorev.2011.11.008</bibtext> </blist> <blist> <bibtext> Ouimet T., Foster N. E. V., Tryfon A., Hyde K. L. (2012). Auditory-musical processing in autism spectrum disorders: A review of behavioral and brain imaging studies. Annals of the New York Academy of Sciences, 1252, 325–331. https://doi.org/10.1111/j.1749-6632.2012.06453.x</bibtext> </blist> <blist> <bibtext> Patel A. D. (2008). Music, language, and the brain. Oxford University Press.</bibtext> </blist> <blist> <bibtext> Patel A. D. (2011). Why would musical training benefit the neural encoding of speech? The OPERA hypothesis. Frontiers in Psychology, 2, Article 142. https://doi.org/10.3389/fpsyg.2011.00142</bibtext> </blist> <blist> <bibtext> Patel A. D., Daniele J. R. 
(2003). An empirical comparison of rhythm in language and music. Cognition, 87(1), B35–B45. https://doi.org/10.1016/S0010-0277(02)00187-7</bibtext> </blist> <blist> <bibtext> Patel A. D., Iversen J. R., Rosenberg J. C. (2006). Comparing the rhythm and melody of speech and music: The case of British English and French. The Journal of the Acoustical Society of America, 119(5), 3034–3047. https://doi.org/10.1121/1.2179657</bibtext> </blist> <blist> <bibtext> Paul R., Bianchi N., Augustyn A., Klin A., Volkmar F. (2008). Production of syllable stress in speakers with autism spectrum disorders. Research in Autism Spectrum Disorders, 2(1), 110–124. https://doi.org/10.1016/j.rasd.2007.04.001</bibtext> </blist> <blist> <bibtext> Pelaez M., Borroto A. R., Carrow J. (2018). Infant vocalizations and imitation as a result of adult contingent imitation. Behavioral Development, 23(1), 81–88. https://doi.org/10.1037/bdb0000074</bibtext> </blist> <blist> <bibtext> Pfordresher P. Q., Brown S. (2007). Poor-pitch singing in the absence of 'tone deafness'. Music Perception, 25(2), 95–115. https://doi.org/10.1525/mp.2007.25.2.95</bibtext> </blist> <blist> <bibtext> Pfordresher P. Q., Brown S. (2009). Enhanced production and perception of musical pitch in tone language speakers. Attention, Perception, &amp; Psychophysics, 71(6), 1385–1398. https://doi.org/10.3758/APP.71.6.1385</bibtext> </blist> <blist> <bibtext> Pfordresher P. Q., Brown S., Meier K. M., Belyk M., Liotti M. (2010). Imprecise singing is widespread. The Journal of the Acoustical Society of America, 128(4), 2182–2190. https://doi.org/10.1121/1.3478782</bibtext> </blist> <blist> <bibtext> Prince J. B., Pfordresher P. Q. (2012). The role of pitch and temporal diversity in the perception and production of musical sequences. Acta Psychologica, 141(2), 184–198. https://doi.org/10.1016/j.actpsy.2012.07.013</bibtext> </blist> <blist> <bibtext> Quintin E.-M. (2019). 
Music-evoked reward and emotion: Relative strengths and response to intervention of people with ASD. Frontiers in Neural Circuits, 13, Article 49. https://doi.org/10.3389/fncir.2019.00049</bibtext> </blist> <blist> <bibtext> Raven J., Raven J. C., Court J. H. (1998). Raven manual: Section 3 – Standard progressive matrices. Oxford Psychologists Press.</bibtext> </blist> <blist> <bibtext> Rodero E. (2011). Intonation and emotion: Influence of pitch levels and contour type on creating emotions. Journal of Voice, 25(1), e25–e34. https://doi.org/10.1016/j.jvoice.2010.02.002</bibtext> </blist> <blist> <bibtext> Ross D. E., Greer R. D. (2003). Generalized imitation and the mand: Inducing first instances of speech in young children with autism. Research in Developmental Disabilities, 24(1), 58–74. https://doi.org/10.1016/S0891-4222(02)00167-1</bibtext> </blist> <blist> <bibtext> RStudio Team. (2020). RStudio: Integrated Development for R. RStudio, PBC. <ulink href="http://www.rstudio.com/">http://www.rstudio.com/</ulink></bibtext> </blist> <blist> <bibtext> Sang B., Miao X. (1990). The revision of trail norm of Peabody picture vocabulary test revised (PPVT-R) in Shanghai proper. Psychological Science, 5, 20–25.</bibtext> </blist> <blist> <bibtext> Schielzeth H., Dingemanse N. J., Nakagawa S., Westneat D. F., Allegue H., Teplitsky C., Réale D., Dochtermann N. A., Garamszegi L. Z., Araya-Ajoy Y. G. (2020). Robustness of linear mixed-effects models to violations of distributional assumptions. Methods in Ecology and Evolution, 11(9), 1141–1152. https://doi.org/10.1111/2041-210X.13434</bibtext> </blist> <blist> <bibtext> Sergeant D. C., Welch G. F. (2009). Gender differences in long-term average spectra of children's singing voices. Journal of Voice, 23(3), 319–336. https://doi.org/10.1016/j.jvoice.2007.10.010</bibtext> </blist> <blist> <bibtext> Sharda M., Midha R., Malik S., Mukerji S., Singh N. C. (2015). 
Fronto-temporal connectivity is preserved during sung but not spoken word listening, across the autism spectrum. Autism Research, 8(2), 174–186. https://doi.org/10.1002/aur.1437</bibtext> </blist> <blist> <bibtext> Tarbox J., Madrid W., Aguilar B., Jacobo W., Schiff A., Ninness C. (2009). Use of chaining to increase complexity of echoics in children with autism. Journal of Applied Behavior Analysis, 42(4), 901–906. https://doi.org/10.1901/jaba.2009.42-901</bibtext> </blist> <blist> <bibtext> Tardif C., Lainé F., Rodriguez M., Gepner B. (2007). Slowing down presentation of facial movements and vocal sounds enhances facial expression recognition and induces facial–vocal imitation in children with autism. Journal of Autism and Developmental Disorders, 37(8), 1469–1484. https://doi.org/10.1007/s10803-006-0223-x</bibtext> </blist> <blist> <bibtext> Tomasello M., Kruger A. C., Ratner H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16(3), 495–511. https://doi.org/10.1017/S0140525X0003123X</bibtext> </blist> <blist> <bibtext> Uzgiris I. C. (1981). Two functions of imitation during infancy. International Journal of Behavioral Development, 4(1), 1–12. https://doi.org/10.1177/016502548100400101</bibtext> </blist> <blist> <bibtext> Van Santen J. P. H., Prud'hommeaux E. T., Black L. M., Mitchell M. (2010). Computational prosodic markers for autism. Autism: The International Journal of Research and Practice, 14(3), 215–236. https://doi.org/10.1177/1362361309363281</bibtext> </blist> <blist> <bibtext> Wang L., Beaman C. P., Jiang C., Liu F. (2021). Perception and production of statement-question intonation in autism spectrum disorder: A developmental investigation. Journal of Autism and Developmental Disorders, 52, 3456–3472. https://doi.org/10.1007/s10803-021-05220-4</bibtext> </blist> <blist> <bibtext> Wang L., Pfordresher P. Q., Jiang C., Liu F. (2021). 
Individuals with autism spectrum disorder are impaired in absolute but not relative pitch and duration matching in speech and song imitation. Autism Research, 14(11), 2355–2372. https://doi.org/10.1002/aur.2569</bibtext> </blist> <blist> <bibtext> Wang L., Xiao S., Jiang C., Hou Q., Chan A. H. D., Wong P. C. M., Liu F. (2023). The form and function processing of lexical tone and intonation in tone-language-speaking children with autism spectrum disorder. The Journal of the Acoustical Society of America, 154(1), 467–481. https://doi.org/10.1121/10.0020271</bibtext> </blist> <blist> <bibtext> Ward W. D., Burns E. M. (1978). Singing without auditory feedback. Journal of Research in Singing, 1(2), 4–44.</bibtext> </blist> <blist> <bibtext> Wechsler D. (2003). Wechsler Intelligence Scale for Children–Fourth Edition (WISC-IV). The Psychological Corporation.</bibtext> </blist> <blist> <bibtext> Xu Y. (2013). ProsodyPro - a tool for large-scale systematic prosody analysis. In Tools and resources for the analysis of speech Prosody (TRASP 2013) (pp. 7–10). Aix-en-Provence.</bibtext> </blist> <blist> <bibtext> Xu Y. (2019). Prosody, tone, and intonation. In Katz W. F., Assmann P. F. (Eds.), The Routledge handbook of phonetics (pp. 314–356). Routledge.</bibtext> </blist> <blist> <bibtext> Xu Y., Gandour J. T., Francis A. L. (2006). Effects of language experience and stimulus complexity on the categorical perception of pitch direction. The Journal of the Acoustical Society of America, 120(2), 1063–1074. https://doi.org/10.1121/1.2213572</bibtext> </blist> <blist> <bibtext> Yip M. (2002). Tone. Cambridge University Press.</bibtext> </blist> <blist> <bibtext> Yuan J. (2011). Perception of intonation in Mandarin Chinese. The Journal of the Acoustical Society of America, 130(6), 4063–4069. https://doi.org/10.1121/1.3651818</bibtext> </blist> <blist> <bibtext> Zhang H. (1989). Standardization research on Raven's standard progressive matrices in China. 
Acta Psychologica Sinica, 21(2), 3–11.</bibtext> </blist> <blist> <bibtext> Zhang Y., Dai X., Zhou J. (2021). The development of lexical semantics for Mandarin-speaking children in China: An exploratory study based on the East China Normal University Vocabulary Test. Journal of Chinese Writing Systems, 5(3), 205–217. https://doi.org/10.1177/25138502211025645</bibtext> </blist> </ref> <ref id="AN0183028988-21"> <title> Footnotes </title> <blist> <bibtext> The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a European Research Council (ERC) Starting Grant, ERC-StG-2015, CAASD, 678733, to F.L. and C.J., and a National Science Foundation (NSF) Grant, BCS-1848930, to P.Q.P.</bibtext> </blist> <blist> <bibtext> Li Wang https://orcid.org/0000-0001-5318-2408</bibtext> </blist> <blist> <bibtext> Cunmei Jiang https://orcid.org/0000-0002-0264-5924</bibtext> </blist> <blist> <bibtext> Fang Liu https://orcid.org/0000-0002-7776-0222</bibtext> </blist> <blist> <bibtext> Supplemental material for this article is available online.</bibtext> </blist> </ref> <aug> <p>By Li Wang; Peter Q Pfordresher; Cunmei Jiang and Fang Liu</p> </aug>
Header	DbId: eric DbLabel: ERIC An: EJ1465400 AccessLevel: 3 PubType: Academic Journal PubTypeId: academicJournal PreciseRelevancyScore: 0
Items	– Name: AbstractInfo Label: Abstractor Group: Ab Data: As Provided – Name: DateEntry Label: Entry Date Group: Date Data: 2025 – Name: AN Label: Accession Number Group: ID Data: EJ1465400
PLink	https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1465400
RecordInfo	BibRecord: BibEntity: Identifiers: – Type: doi Value: 10.1177/13623613241275395 Languages: – Text: English PhysicalDescription: Pagination: PageCount: 16 StartPage: 408 Subjects: – SubjectFull: Mandarin Chinese Type: general – SubjectFull: Singing Type: general – SubjectFull: Autism Spectrum Disorders Type: general – SubjectFull: Imitation Type: general – SubjectFull: Speech Communication Type: general – SubjectFull: Tone Languages Type: general – SubjectFull: Children Type: general – SubjectFull: Adolescents Type: general – SubjectFull: Foreign Countries Type: general – SubjectFull: Intonation Type: general – SubjectFull: China Type: general – SubjectFull: Autism Diagnostic Observation Schedule Type: general – SubjectFull: Peabody Picture Vocabulary Test Type: general – SubjectFull: Raven Progressive Matrices Type: general Titles: – TitleFull: Atypical Vocal Imitation of Speech and Song in Autism Spectrum Disorder: Evidence from Mandarin Speakers Type: main BibRelationships: HasContributorRelationships: – PersonEntity: Name: NameFull: Li Wang – PersonEntity: Name: NameFull: Peter Q. Pfordresher – PersonEntity: Name: NameFull: Cunmei Jiang – PersonEntity: Name: NameFull: Fang Liu IsPartOfRelationships: – BibEntity: Dates: – D: 01 M: 02 Type: published Y: 2025 Identifiers: – Type: issn-print Value: 1362-3613 – Type: issn-electronic Value: 1461-7005 Numbering: – Type: volume Value: 29 – Type: issue Value: 2 Titles: – TitleFull: Autism: The International Journal of Research and Practice Type: main