Style-Content progressive aggregation network with stable diffusion.

Bibliographic Details
Title: Style-Content progressive aggregation network with stable diffusion.
Authors: Yuan, Tiebiao [1] (AUTHOR); Yu, Yangyang [1,2] (AUTHOR), yuantb@tjrac.edu.cn; Ji, Ning [1] (AUTHOR)
Source: Applied Intelligence. Aug 2025, Vol. 55, Issue 12, pp. 1-16 (16 pp.).
Abstract: The task of text-to-image generation has matured significantly with the advancement of diffusion models; however, achieving precise control over the details and style of generated images remains a challenge. Existing methods often rely on complex text prompts to describe details while attempting to integrate style information. Nevertheless, the single-stage attention mechanism in diffusion models struggles to effectively capture multi-scale features and the relationship between style and content, resulting in feature amalgamation that compromises the quality of generated images. To address this issue, we propose a Style-Content Progressive Aggregation (SCPA) network, which integrates and aggregates multi-scale features from style images and text prompts through the coordinated design of two complementary modules. Specifically, the Style-Content Decoupling (SCD) module disentangles the style and content features of the style image, and reconstructs a learnable content template based on the extracted style features, thereby preventing the original content features from interfering with text understanding. The Style-Content Coupling (SCC) module then extracts multi-scale pixel-level content features from the text prompt and progressively integrates style elements into the template, enabling fine-grained fusion of content and style. This progressive aggregation strategy effectively enhances the quality of prior guidance provided to the diffusion model. Extensive experimental results demonstrate that the SCPA network can generate more artistically appealing images and offers a new direction for the integration of text-to-image generation models with traditional style transfer techniques. [ABSTRACT FROM AUTHOR]
Copyright of Applied Intelligence is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
FullText Text:
  Availability: 0
Header DbId: egs
DbLabel: Engineering Source
An: 186944971
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Style-Content progressive aggregation network with stable diffusion.
– Name: Author
  Label: Authors
  Group: Au
  Data: Yuan, Tiebiao [1] (AUTHOR); Yu, Yangyang [1,2] (AUTHOR), yuantb@tjrac.edu.cn; Ji, Ning [1] (AUTHOR)
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Applied Intelligence. Aug 2025, Vol. 55, Issue 12, pp. 1-16 (16 pp.).
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=186944971
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1007/s10489-025-06751-4
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 16
        StartPage: 1
    Titles:
      – TitleFull: Style-Content progressive aggregation network with stable diffusion.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Yuan, Tiebiao
      – PersonEntity:
          Name:
            NameFull: Yu, Yangyang
      – PersonEntity:
          Name:
            NameFull: Ji, Ning
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 08
              Text: Aug2025
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 0924669X
          Numbering:
            – Type: volume
              Value: 55
            – Type: issue
              Value: 12
          Titles:
            – TitleFull: Applied Intelligence
              Type: main