Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images

Li, Rui and Zheng, Shunyi and Zhang, Ce and Duan, Chenxi and Su, Jianlin and Atkinson, Peter (2022) Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing, 60: 5607713. ISSN 0196-2892

Text (Multi_Attention_Network_TGRS_accepted)
Multi_Attention_Network_TGRS_accepted.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.

Semantic segmentation of remote sensing images plays an important role in land resource management, yield estimation, and economic assessment. Although deep convolutional neural networks have significantly increased the accuracy of semantic segmentation in remote sensing images, standard models still have several limitations. First, in encoder-decoder architectures such as U-Net, the use of multi-scale features leads to overuse of information, since similar low-level features are exploited repeatedly at multiple scales. Second, long-range dependencies of feature maps are not sufficiently explored, so the feature representations associated with each semantic class are not optimized. Third, although the dot-product attention mechanism has been introduced into semantic segmentation to model long-range dependencies, its high time and space complexity impedes its use in application scenarios with large-scale input. This paper proposes a Multi-Attention-Network (MANet) to address these issues by extracting contextual dependencies through multiple efficient attention modules. A novel kernel attention mechanism with linear complexity is proposed to alleviate the large computational demand of attention. We integrate local feature maps extracted by ResNeXt-101 with their corresponding global dependencies and adaptively reweight interdependent channel maps based on kernel attention and channel attention. Numerical experiments on three large-scale fine-resolution remote sensing images captured by different satellites demonstrate that the proposed MANet outperforms DeepLab V3+, PSPNet, FastFCN, and other benchmark approaches.
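
The complexity argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract only states that kernel attention has linear complexity, so the softplus feature map, the function names, and the matrix sizes below are assumptions chosen for illustration. The idea shown is the general one behind kernelized attention: if the similarity between a query and a key is written as a product of positive feature maps, the matrix product can be reordered so that the N x N similarity matrix is never built, and the cost grows linearly rather than quadratically with the number of positions N.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); the result is always positive.
    # Assumption: softplus is used here only as an example of a positive feature map.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def feature_map(rows):
    # Apply the feature map elementwise to an N x d matrix (list of lists).
    return [[softplus(v) for v in row] for row in rows]


def quadratic_kernel_attention(Q, K, V):
    # Reference form: computes every query-key similarity, O(N^2 * d).
    phi_q, phi_k = feature_map(Q), feature_map(K)
    out = []
    for q in phi_q:
        sims = [sum(a * b for a, b in zip(q, k)) for k in phi_k]
        total = sum(sims)  # row normalizer, positive because all features are positive
        out.append([
            sum(s * v[j] for s, v in zip(sims, V)) / total
            for j in range(len(V[0]))
        ])
    return out


def linear_kernel_attention(Q, K, V):
    # Reordered form: summarize keys and values once, then reuse, O(N * d^2).
    phi_q, phi_k = feature_map(Q), feature_map(K)
    d, d_v = len(phi_k[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d)]  # phi(K)^T V, shape d x d_v
    k_sum = [0.0] * d                     # column sums of phi(K)
    for k, v in zip(phi_k, V):
        for i in range(d):
            k_sum[i] += k[i]
            for j in range(d_v):
                kv[i][j] += k[i] * v[j]
    out = []
    for q in phi_q:
        norm = sum(q[i] * k_sum[i] for i in range(d))
        out.append([
            sum(q[i] * kv[i][j] for i in range(d)) / norm
            for j in range(d_v)
        ])
    return out


def random_matrix(n, d, rng):
    # Hypothetical helper: an n x d matrix with entries in [-1, 1].
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
N, D = 16, 4  # hypothetical sizes: 16 positions, 4 channels
Q, K, V = (random_matrix(N, D, rng) for _ in range(3))
slow = quadratic_kernel_attention(Q, K, V)
fast = linear_kernel_attention(Q, K, V)
max_diff = max(abs(a - b) for r1, r2 in zip(slow, fast) for a, b in zip(r1, r2))
```

Both functions compute the same weighted average of the rows of V, so `max_diff` should differ from zero only by floating-point rounding. The second form stores a d x d_v summary instead of an N x N matrix, which is what makes attention feasible on large inputs when the number of positions is much larger than the channel dimension.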

Item Type:
Journal Article
Journal or Publication Title:
IEEE Transactions on Geoscience and Remote Sensing
Additional Information:
©2021 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Uncontrolled Keywords:
fine-resolution remote sensing images; attention mechanism; semantic segmentation; electrical and electronic engineering; earth and planetary sciences (all)
ID Code:
Deposited By:
Deposited On:
13 Jul 2021 15:10
Last Modified:
12 Feb 2024 00:41