Knowledge Transfer and Domain Adaptation for
Fine-Grained Remote Sensing Image Segmentation

1Qinghai University, 2Beijing Jiaotong University, 3Tsinghua University

arXiv 2024

*Equal Contribution, Corresponding Author

Abstract

Fine-grained remote sensing image segmentation is essential for accurately identifying detailed objects in remote sensing images. Recently, vision transformer models (VTMs) pre-trained on large-scale datasets have demonstrated strong zero-shot generalization. However, directly applying them to specific downstream tasks often suffers from domain shift. We introduce a novel end-to-end learning paradigm that combines knowledge guidance with domain refinement to enhance performance. It consists of two key components: the Feature Alignment Module (FAM) and the Feature Modulation Module (FMM). FAM aligns features from a CNN-based backbone with those from the pretrained VTM's encoder via channel transformation and spatial interpolation, and transfers knowledge through a KL-divergence loss and an L2-normalization constraint. FMM then adapts the transferred knowledge to the target domain to mitigate domain shift. We also introduce a fine-grained grass segmentation dataset and show, through experiments on two datasets, that our method yields significant gains of 2.57 mIoU on the grass dataset and 3.73 mIoU on the cloud dataset. These results highlight the potential of combining knowledge transfer with domain adaptation to overcome domain-related challenges and data limitations. The code and checkpoints are available on GitHub.

Method

Framework: Overview of the proposed framework, which integrates knowledge transfer and domain adaptation for fine-grained remote sensing image segmentation. Knowledge transfer is achieved through a feature alignment module together with two loss constraints: a KL-divergence loss \( \mathcal{L}_{\text{kl}}\) and an L2-normalization loss \( \mathcal{L}_{\text{mse}}\). Domain adaptation is facilitated by a feature modulation module and supervised with a cross-entropy loss \( \mathcal{L}_{\text{ce}}\) and an auxiliary loss \( \mathcal{L}_{\text{aux}}\). In summary, we propose a novel end-to-end learning paradigm combining knowledge guidance with domain refinement.
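The alignment step above can be sketched in PyTorch. This is a minimal illustration, not the paper's exact implementation: the 1x1 convolution for channel transformation, the bilinear interpolation, the temperature, and all layer names are assumptions chosen to match the description of FAM and its two loss constraints.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAlignmentModule(nn.Module):
    """Sketch of FAM: map CNN backbone features to the VTM encoder's
    channel count (channel transformation), then resize to the VTM
    feature map's spatial size (spatial interpolation)."""

    def __init__(self, cnn_channels: int, vtm_channels: int):
        super().__init__()
        # 1x1 conv as an assumed channel-transformation layer
        self.channel_proj = nn.Conv2d(cnn_channels, vtm_channels, kernel_size=1)

    def forward(self, cnn_feat: torch.Tensor, vtm_feat: torch.Tensor) -> torch.Tensor:
        aligned = self.channel_proj(cnn_feat)
        # match the VTM feature map's spatial resolution
        aligned = F.interpolate(aligned, size=vtm_feat.shape[-2:],
                                mode="bilinear", align_corners=False)
        return aligned

def transfer_losses(aligned: torch.Tensor, vtm_feat: torch.Tensor, tau: float = 1.0):
    """Assumed forms of the two knowledge-transfer constraints:
    KL divergence between channel distributions, and an L2 (MSE)
    constraint on channel-normalized features."""
    log_p = F.log_softmax(aligned.flatten(2) / tau, dim=1)
    q = F.softmax(vtm_feat.flatten(2) / tau, dim=1)
    l_kl = F.kl_div(log_p, q, reduction="batchmean")
    l_mse = F.mse_loss(F.normalize(aligned, dim=1),
                       F.normalize(vtm_feat, dim=1))
    return l_kl, l_mse
```

In training, `l_kl` and `l_mse` would be weighted and added to the segmentation objectives \( \mathcal{L}_{\text{ce}}\) and \( \mathcal{L}_{\text{aux}}\); the weighting scheme is not specified here.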

Results

Visualization on the Grass Dataset.

[Figure: qualitative segmentation results on the grass dataset; class legend: Low, Middle Low, Middle, Middle High, High]

Visualization on the Cloud Dataset.

[Figure: qualitative segmentation results on the cloud dataset; class legend: Clear Sky, Thick Cloud, Thin Cloud, Cloud Shadow]

Quantitative Comparison with Existing Methods

Evaluation on Fine-Grained Grass Segmentation.
Method mIoU ↑ OA ↑ F1 ↑
FCN 47.47 67.85 61.99
PSPNet 47.95 69.12 62.55
DeepLabV3+ 47.95 68.97 62.50
UNet 48.17 69.77 62.34
SegFormer 48.29 68.93 62.82
Mask2Former 44.93 65.90 58.91
DINOv2 47.57 71.54 61.70
KTDA (Ours) 50.86 74.26 65.01

Evaluation on Fine-Grained Cloud Segmentation.
Method mIoU ↑ OA ↑ F1 ↑
MCDNet 33.85 69.75 42.76
SCNN 32.38 71.22 52.41
CDNetv1 34.58 68.16 45.80
KappaMask 42.12 76.63 68.47
UNetMobv2 47.76 82.00 56.91
CDNetv2 43.63 78.56 70.33
HRCloudNet 43.51 77.04 71.36
KTDA (Ours) 51.49 83.55 60.08

BibTeX

@misc{ktda,
      title={Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation}, 
      author={Shun Zhang and Xuechao Zou and Kai Li and Congyan Lang and Shiying Wang and Pin Tao and Tengfei Cao},
      year={2024},
      eprint={2412.06664},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.06664}, 
}

Acknowledgements

This work was partially supported by the Natural Science Foundation of Qinghai Province under Grant No. 2024-ZJ-708 and the National Natural Science Foundation of China under Grant No. 62072027.