MT-CYP-Net: Multi-Task Network for Pixel-Level Crop Yield Prediction Under Very Few Samples

Published 17 May 2025 in cs.CV and cs.AI | (2505.12069v1)

Abstract: Accurate and fine-grained crop yield prediction plays a crucial role in advancing global agriculture. However, the accuracy of pixel-level yield estimation based on satellite remote sensing data has been constrained by the scarcity of ground truth data. To address this challenge, we propose a novel approach called the Multi-Task Crop Yield Prediction Network (MT-CYP-Net). This framework introduces an effective multi-task feature-sharing strategy, where features extracted from a shared backbone network are simultaneously utilized by both crop yield prediction decoders and crop classification decoders with the ability to fuse information between them. This design allows MT-CYP-Net to be trained with extremely sparse crop yield point labels and crop type labels, while still generating detailed pixel-level crop yield maps. Concretely, we collected 1,859 yield point labels along with corresponding crop type labels and satellite images from eight farms in Heilongjiang Province, China, in 2023, covering soybean, maize, and rice crops, and constructed a sparse crop yield label dataset. MT-CYP-Net is compared with three classical machine learning and deep learning benchmark methods in this dataset. Experimental results not only indicate the superiority of MT-CYP-Net compared to previous methods on multiple types of crops but also demonstrate the potential of deep networks on precise pixel-level crop yield prediction, especially with limited data labels.

Summary

Overview of MT-CYP-Net: A Multi-Task Network for Crop Yield Prediction

The paper "MT-CYP-Net: Multi-Task Network for Pixel-Level Crop Yield Prediction Under Very Few Samples" by Liu et al. introduces an approach to predicting crop yields at pixel-level resolution from satellite remote sensing data. Crop yield prediction has direct implications for global food security and agricultural policy-making, yet the scarcity of ground truth data has limited the accuracy of pixel-level estimates. The authors propose the Multi-Task Crop Yield Prediction Network (MT-CYP-Net) to address this issue and report that it outperforms classical machine learning and deep learning baselines on a dataset collected from farms in China covering soybean, maize, and rice.

Methodology

MT-CYP-Net is a deep multi-task learning (MTL) framework in which yield prediction and crop classification share features. A common backbone network feeds a distinct decoder for each task, so the sparse labels of both tasks supervise the same shared representation. The network is trained using 1,859 crop yield point labels and corresponding crop type labels, gathered alongside satellite imagery from eight farms in Heilongjiang Province, China, in 2023.
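Training on sparse point labels can be illustrated with a masked loss, in which only labeled pixels contribute. The following is a minimal sketch, not the authors' loss: the function name `sparse_multitask_loss`, the use of mean squared error plus cross-entropy, the `weight` parameter, and the convention that `None` marks an unlabeled pixel are all assumptions made for illustration.

```python
import math

def sparse_multitask_loss(yield_pred, yield_label, class_logits, class_label, weight=1.0):
    """Masked regression loss plus masked classification loss (illustrative).

    All inputs are flat per-pixel lists. A label of None marks an unlabeled
    pixel, which contributes nothing to the loss.
    """
    # Mean squared error over pixels that carry a yield point label.
    sq_errors = [(p - y) ** 2 for p, y in zip(yield_pred, yield_label) if y is not None]
    reg_loss = sum(sq_errors) / len(sq_errors) if sq_errors else 0.0

    # Cross-entropy over pixels that carry a crop type label.
    ce_terms = []
    for logits, label in zip(class_logits, class_label):
        if label is None:
            continue
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
        ce_terms.append(log_sum_exp - logits[label])
    cls_loss = sum(ce_terms) / len(ce_terms) if ce_terms else 0.0

    # The relative weighting of the two tasks is a hypothetical hyperparameter.
    return reg_loss + weight * cls_loss

# Four pixels: one yield label, two crop type labels, the rest unlabeled.
yield_pred = [0.5, 0.7, 0.2, 0.9]
yield_label = [None, 0.6, None, None]
class_logits = [[0.0, 0.0, 0.0] for _ in range(4)]  # three crop classes
class_label = [None, 1, 2, None]
loss = sparse_multitask_loss(yield_pred, yield_label, class_logits, class_label)
```

Because unlabeled pixels are skipped, the prediction maps can stay dense while supervision stays sparse, which is the property the summary attributes to the method.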

The model is a unified encoder-decoder architecture: the encoder extracts features from the input image, and these features are passed to two decoders, one dedicated to pixel-level yield prediction (regression) and the other to crop type classification (segmentation). Key to the network's efficiency are the Task Consistency Learning (TCL) blocks, which mediate feature sharing between the two tasks and thereby improve overall prediction accuracy.
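The shared-encoder, two-decoder layout can be sketched with a toy per-pixel model. This is a deliberately simplified illustration under stated assumptions: the real network is a deep image model, whereas here the "encoder" is a single linear layer with ReLU, both "decoders" are linear heads, and all function names and weights are hypothetical. The TCL blocks are omitted because this summary does not describe their internals.

```python
def encoder(pixel_bands, weights):
    """Shared backbone stand-in: one linear layer with ReLU, applied per pixel."""
    return [max(0.0, sum(w * b for w, b in zip(row, pixel_bands))) for row in weights]

def yield_head(features, w):
    """Regression decoder stand-in: one scalar yield value per pixel."""
    return sum(wi * f for wi, f in zip(w, features))

def class_head(features, w):
    """Classification decoder stand-in: one logit per crop class."""
    return [sum(wi * f for wi, f in zip(row, features)) for row in w]

def forward(image, enc_w, yield_w, class_w):
    """Run both task heads on features computed once by the shared encoder."""
    yield_map, class_map = [], []
    for pixel in image:
        feats = encoder(pixel, enc_w)  # computed once, reused by both heads
        yield_map.append(yield_head(feats, yield_w))
        logits = class_head(feats, class_w)
        class_map.append(logits.index(max(logits)))  # predicted crop class index
    return yield_map, class_map

# Two pixels with two spectral bands each; all weights are toy values.
image = [[1.0, 0.0], [0.0, 1.0]]
enc_w = [[1.0, 0.0], [0.0, 1.0]]
yield_w = [0.5, 0.25]
class_w = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # three crop classes
yield_map, class_map = forward(image, enc_w, yield_w, class_w)
```

The point of the sketch is only the data flow: every pixel receives both a yield value and a crop class from one set of shared features.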

Experimental Results

In the experiments conducted by Liu et al., MT-CYP-Net was benchmarked on the authors' dataset against classical machine learning methods, such as Random Forest, XGBoost, and LightGBM, and against other deep learning models. The model consistently outperformed these methods across multiple metrics. Notably, MT-CYP-Net achieved a root mean square error (RMSE) of 0.1472 with the ResNeSt-50d backbone using all Sentinel-2 bands, surpassing FPN-DenseNet161 and U-Net-based implementations. Ablation studies further indicated that the multi-task design is beneficial, with each task improving when it can draw on features from the other.
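For reference, RMSE is the square root of the mean squared difference between predicted and measured values. A minimal sketch follows; the function name `rmse` and the sample numbers are illustrative and are not taken from the paper.

```python
import math

def rmse(predictions, targets):
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))

# Three exact predictions and one that is off by 4: mean squared error is 4.
example = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0])  # 2.0
```

In a sparse-label setting such as this one, the metric would be evaluated only at pixels that have a ground-truth yield measurement.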

Implications and Future Directions

The design and results of MT-CYP-Net have direct implications for precision agriculture. By requiring fewer ground-truth samples, the approach lowers a significant barrier in crop yield prediction and reduces the cost and effort of data collection. This could enable wider deployment of high-resolution yield prediction systems across diverse agricultural landscapes, supporting better-informed agricultural management and policy-making.

Future developments may explore extending MT-CYP-Net's framework to integrate temporal data, allowing it to capture crop growth dynamics across growing seasons. Additionally, expansion to support multimodal inputs, such as SAR imagery alongside optical data, could further improve robustness under various environmental conditions.

In conclusion, MT-CYP-Net is a promising advance in agricultural remote sensing because it can leverage limited ground-truth data for pixel-level crop yield prediction. Its results suggest potential for adoption at scale in real-world agricultural systems.