PMAA: A Progressive Multi-scale Attention Autoencoder Model for High-Performance Cloud Removal from Multi-temporal Satellite Imagery

Xuechao Zou, Kai Li, Junliang Xing, Pin Tao, Yachao Cui
Qinghai University, Tsinghua University
ECAI 2023

Abstract

Satellite imagery analysis plays a pivotal role in remote sensing; however, information loss due to cloud cover significantly impedes its application. Although existing deep cloud removal models have achieved notable outcomes, they scarcely consider contextual information. This study introduces a high-performance cloud removal architecture, termed Progressive Multi-scale Attention Autoencoder (PMAA), which concurrently harnesses global and local information to construct robust contextual dependencies using a novel Multi-scale Attention Module (MAM) and a novel Local Interaction Module (LIM). PMAA establishes long-range dependencies of multi-scale features using MAM and modulates the reconstruction of fine-grained details utilizing LIM, enabling simultaneous representation of fine- and coarse-grained features at the same level. With the help of diverse and multi-scale features, PMAA consistently outperforms the previous state-of-the-art model CTGAN on two benchmark datasets. Moreover, PMAA boasts considerable efficiency advantages, with only 0.5% and 14.6% of the parameters and computational complexity of CTGAN, respectively. These comprehensive results underscore PMAA's potential as a lightweight cloud removal network suitable for deployment on edge devices to accomplish large-scale cloud removal tasks. Our source code and pre-trained models are available at https://github.com/XavierJiezou/PMAA.
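To make the two ideas named in the abstract concrete, below is a minimal PyTorch sketch: a multi-scale attention block that lets full-resolution features attend over tokens pooled at several scales (the MAM idea of long-range, multi-scale dependencies), and a lightweight depthwise-convolutional gate that modulates fine-grained detail (the LIM idea of local interaction). The module names, channel sizes, pooling scales, and head count here are illustrative assumptions, not the authors' actual implementation; refer to the linked repository for the real code.

# Minimal sketch of MAM-style multi-scale attention and a LIM-style
# local gate. All hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAttention(nn.Module):
    """Self-attention over a token sequence gathered from several pooled scales."""

    def __init__(self, channels: int, scales=(1, 2, 4), num_heads: int = 4):
        super().__init__()
        self.scales = scales
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Build one token sequence from average-pooled copies of the feature map.
        tokens = []
        for s in self.scales:
            p = F.adaptive_avg_pool2d(x, (max(h // s, 1), max(w // s, 1)))
            tokens.append(p.flatten(2).transpose(1, 2))  # (B, h*w/s^2, C)
        seq = self.norm(torch.cat(tokens, dim=1))
        # Full-resolution tokens (scale 1) query all scales: long-range,
        # cross-scale context with a residual connection.
        q = seq[:, : h * w]
        out, _ = self.attn(q, seq, seq)
        return x + out.transpose(1, 2).reshape(b, c, h, w)

class LocalInteraction(nn.Module):
    """Depthwise-convolutional gate that re-weights local, fine-grained detail."""

    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(self.dw(x))

if __name__ == "__main__":
    feats = torch.randn(2, 32, 16, 16)  # e.g. encoder features from multi-temporal inputs
    y = LocalInteraction(32)(MultiScaleAttention(32)(feats))
    print(y.shape)  # torch.Size([2, 32, 16, 16])

Because attention runs over pooled tokens rather than full-resolution pairs, and the local gate is a single depthwise convolution, both blocks stay cheap, which is consistent with the parameter and FLOP savings over CTGAN reported above.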

Results


BibTeX

@inproceedings{zou2023pmaa,
  title={PMAA: A Progressive Multi-scale Attention Autoencoder Model for High-Performance Cloud Removal from Multi-temporal Satellite Imagery},
  author={Zou, Xuechao and Li, Kai and Xing, Junliang and Tao, Pin and Cui, Yachao},
  booktitle={European Conference on Artificial Intelligence (ECAI)},
  year={2023}
}

Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grants No. 62222606 and 62076238, in part by the Research on Efficiency Design of 3D Virtual Interactive Scene project (k992146), and in part by the Research Foundation of the Key Laboratory of Spaceborne Information Intelligent Interpretation.