Using Global Land Cover Product as Prompt for Cropland Mapping via Visual Foundation Model (2310.10219v1)
Abstract: Data-driven deep learning methods have shown great potential in cropland mapping. However, due to multiple factors such as cropland attributes (topography, climate, crop type) and imaging conditions (viewing angle, illumination, scale), croplands in different scenes exhibit a large domain gap. This makes it difficult for models trained on specific scenes to generalize directly to other scenes. A common way to handle this problem is the "Pretrain+Fine-tuning" paradigm. Unfortunately, given the varied characteristics of cropland shaped by these factors, it is hard to bridge the complex domain gap between pre-trained data and target data using only sparse fine-tuning samples as general constraints. Moreover, as the number of model parameters grows, fine-tuning is no longer an easy, low-cost task. With the emergence of prompt learning via visual foundation models, the "Pretrain+Prompting" paradigm redesigns the optimization target by introducing an individual prompt for each sample, which simplifies domain adaptation from generic to specific scenes during model inference. We therefore introduce the "Pretrain+Prompting" paradigm to cropland scene interpretation and design an auto-prompting (APT) method based on freely available global land cover products. It achieves fine-grained adaptation from generic scenes to specialized cropland scenes without introducing additional labeling costs. To the best of our knowledge, this work pioneers the exploration of the domain adaptation problem in cropland mapping from a prompt learning perspective. Experiments on two sub-meter cropland datasets from southern and northern China demonstrate that the proposed method, built on visual foundation models, outperforms traditional supervised learning and fine-tuning approaches in remote sensing.
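The core idea described above is to turn a coarse, freely available global land cover (GLC) product into per-sample prompts for a promptable visual foundation model such as SAM. The sketch below illustrates that prompting idea only; it is not the paper's APT method. The cropland class value (40, following the ESA WorldCover legend), the checkpoint path, and the helper name are assumptions for illustration.

```python
# Minimal sketch: sample point prompts from a coarse GLC raster that is aligned
# with the image patch, and feed them to SAM as positive (foreground) prompts.
# NOTE: this is an illustrative approximation, not the paper's APT method.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor


def cropland_mask_from_glc_prompts(image, glc, cropland_value=40,
                                   n_points=16, checkpoint="sam_vit_b.pth"):
    """image: HxWx3 uint8 RGB patch; glc: HxW land cover raster aligned to it.

    cropland_value=40 assumes an ESA WorldCover-style legend; adjust for the
    GLC product actually used.
    """
    sam = sam_model_registry["vit_b"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)

    # Positive prompts: pixels the GLC product labels as cropland.
    ys, xs = np.nonzero(glc == cropland_value)
    if len(ys) == 0:
        return np.zeros(glc.shape, dtype=bool)
    idx = np.random.choice(len(ys), size=min(n_points, len(ys)), replace=False)
    coords = np.stack([xs[idx], ys[idx]], axis=1)   # SAM expects (x, y) order
    labels = np.ones(len(idx), dtype=np.int64)      # 1 = foreground point

    # One mask per prompt point; union them into a single cropland map.
    out = np.zeros(glc.shape, dtype=bool)
    for pt, lb in zip(coords, labels):
        masks, _, _ = predictor.predict(point_coords=pt[None, :],
                                        point_labels=lb[None],
                                        multimask_output=False)
        out |= masks[0]
    return out
```

In this sketch the GLC raster only supplies prompt locations, so its coarse resolution does not cap the output resolution: the mask boundaries come from the foundation model applied to the sub-meter imagery.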