Enhanced radiology report generation via comprehensive sequence rearrangement and multi-scale cross-region attention.

Bibliographic Details
Title: Enhanced radiology report generation via comprehensive sequence rearrangement and multi-scale cross-region attention.
Authors: Deng, Yan [1] (AUTHOR) 2404273956@qq.com; Qin, Qibing [2] (AUTHOR) qinbing@wfu.edu.cn; Hu, Wei [1] (AUTHOR) wei.workstation@gmail.com; Hu, Jianming [3] (AUTHOR) hujianming@cqnu.edu.cn; Yan, Dengwei [1] (AUTHOR) dwyan@cqnu.edu.cn; Zhang, Wenfeng [1] (AUTHOR) itzhangwf@cqnu.edu.cn; Qiao, Jing [4] (AUTHOR) 15114585538@163.com
Source: Visual Computer. Mar 2026, Vol. 42, Issue 4, pp. 1-15. 15 p.
Abstract: In the medical domain, accurate and detailed radiology reports are pivotal for disease diagnosis and treatment. Despite existing methods showing promise, challenges persist in extracting effective features and focusing on critical regions. To address these issues, we introduce a radiology report generation model, CSR-LMCA, which integrates comprehensive sequence rearrangement with multi-scale cross-region attention. Our model enhances focus on disease-related areas through Saliency-guided Discriminative Attention Mapping (SDAM), significantly improving lesion region identification and background noise suppression. Additionally, the Sequence Rearrangement Mamba (SR-Mamba) module efficiently extracts discriminative features from rearranged long sequences. The Local Multi-scale Cross-region Attention (LMCA) mechanism models local attention relationships and performs cross-region information fusion, strengthening the model’s ability to capture global features and focus on key areas. Experiments on the IU X-ray and MIMIC-CXR datasets demonstrate that CSR-LMCA outperforms state-of-the-art methods, achieving BLEU-4 scores of 0.175 and 0.118, respectively, on these datasets. Here we show that our model not only generates informative and coherent radiology reports but also offers significant improvements in text completeness, coherence, and readability. The code and datasets are available at: . [ABSTRACT FROM AUTHOR]
Database: Engineering Source
ISSN: 0178-2789
DOI: 10.1007/s00371-026-04384-3