Style-Content progressive aggregation network with stable diffusion.
| Title: | Style-Content progressive aggregation network with stable diffusion. |
|---|---|
| Authors: | Yuan, Tiebiao¹ (AUTHOR), Yu, Yangyang¹,² (AUTHOR) yuantb@tjrac.edu.cn, Ji, Ning¹ (AUTHOR) |
| Source: | Applied Intelligence. Aug 2025, Vol. 55, Issue 12, pp. 1-16. 16 p. |
| Abstract: | The task of text-to-image generation has matured significantly with the advancement of diffusion models; however, achieving precise control over the details and style of generated images remains a challenge. Existing methods often rely on complex text prompts to describe details while attempting to integrate style information. Nevertheless, the single-stage attention mechanism in diffusion models struggles to effectively capture multi-scale features and the relationship between style and content, resulting in feature amalgamation that compromises the quality of generated images. To address this issue, we propose a Style-Content Progressive Aggregation (SCPA) network, which integrates and aggregates multi-scale features from style images and text prompts through the coordinated design of two complementary modules. Specifically, the Style-Content Decoupling (SCD) module disentangles the style and content features of the style image, and reconstructs a learnable content template based on the extracted style features, thereby preventing the original content features from interfering with text understanding. The Style-Content Coupling (SCC) module then extracts multi-scale pixel-level content features from the text prompt and progressively integrates style elements into the template, enabling fine-grained fusion of content and style. This progressive aggregation strategy effectively enhances the quality of prior guidance provided to the diffusion model. Extensive experimental results demonstrate that the SCPA network can generate more artistically appealing images and offers a new direction for the integration of text-to-image generation models with traditional style transfer techniques. [ABSTRACT FROM AUTHOR] |
| Copyright: | Copyright of Applied Intelligence is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Engineering Source |
| Publication Type: | Academic Journal |
| Language: | English |
| DOI: | 10.1007/s10489-025-06751-4 |
| ISSN: | 0924-669X (print) |
| Accession Number: | 186944971 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=186944971 |