A CNN-RNN Framework for Crop Yield Prediction (1911.09045v2)

Published 20 Nov 2019 in cs.LG, q-bio.QM, and stat.ML

Abstract: Crop yield prediction is extremely challenging due to its dependence on multiple factors such as crop genotype, environmental factors, management practices, and their interactions. This paper presents a deep learning framework using convolutional neural networks (CNN) and recurrent neural networks (RNN) for crop yield prediction based on environmental data and management practices. The proposed CNN-RNN model, along with other popular methods such as random forest (RF), deep fully-connected neural networks (DFNN), and LASSO, was used to forecast corn and soybean yield across the entire Corn Belt (including 13 states) in the United States for years 2016, 2017, and 2018 using historical data. The new model achieved a root-mean-square-error (RMSE) of 9% and 8% of their respective average yields, substantially outperforming all other methods that were tested. The CNN-RNN model has three salient features that make it a potentially useful method for other crop yield prediction studies. (1) The CNN-RNN model was designed to capture the time dependencies of environmental factors and the genetic improvement of seeds over time without having their genotype information. (2) The model demonstrated the capability to generalize the yield prediction to untested environments without significant drop in the prediction accuracy. (3) Coupled with the backpropagation method, the model could reveal the extent to which weather conditions, accuracy of weather predictions, soil conditions, and management practices were able to explain the variation in the crop yields.

Summary

The paper introduces a hybrid CNN-RNN framework combining spatial feature extraction via CNNs with temporal modeling through LSTM-enhanced RNNs to boost prediction accuracy.
It achieves an RMSE of approximately 9% of average yield for corn and 8% for soybean, outperforming random forest (RF), DFNN, and LASSO.
Feature selection using guided backpropagation reveals that solar radiation and temperature are critical factors for effective crop yield prediction.

Crop Yield Prediction Using a CNN-RNN Framework: An Expert Analysis

The paper "A CNN-RNN Framework for Crop Yield Prediction" by Saeed Khaki, Lizhi Wang, and Sotirios V. Archontoulis presents a deep learning approach for predicting crop yield, focusing on corn and soybean in the United States Corn Belt. It combines Convolutional Neural Networks (CNNs) with Recurrent Neural Networks (RNNs) to address the difficulty of predicting yield from interacting genetic, environmental, and management factors.

Methodological Innovation

The paper introduces a hybrid model that merges CNNs and RNNs to handle both the temporal and the spatial dependencies in the data. The CNN component captures the time dependencies of environmental factors such as weather, along with the spatial dependencies of soil data, while the RNN component accounts for the temporal dependencies across years, which is how the model reflects the genetic improvement of seeds over time without genotype information. Long Short-Term Memory (LSTM) cells strengthen the RNN's capacity to model sequential data and mitigate the vanishing gradient problem.
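The data flow described above can be illustrated with a minimal pure-Python sketch. It is not the authors' architecture: the scalar hidden state, mean pooling, two-weight mixing step, parameter names, and all numeric values are assumptions chosen only to keep the example small. It shows a 1-D convolution encoding each year's weather series and the soil profile, followed by an LSTM cell stepping across years.

```python
import math

def conv1d_relu(series, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(series) - k + 1):
        z = sum(w * x for w, x in zip(kernel, series[i:i + k])) + bias
        out.append(max(0.0, z))
    return out

def mean_pool(values):
    """Collapse a feature map into a single number by averaging."""
    return sum(values) / len(values)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (hidden size 1, an assumption)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def predict(weather_by_year, soil_by_depth, params):
    """Encode each year with the CNN parts, then run the LSTM across years."""
    soil_feat = mean_pool(conv1d_relu(soil_by_depth, params["soil_kernel"]))
    h, c = 0.0, 0.0
    for weather in weather_by_year:
        weather_feat = mean_pool(conv1d_relu(weather, params["weather_kernel"]))
        # Two-weight mix standing in for a fully connected layer (assumption).
        x = params["mix_weather"] * weather_feat + params["mix_soil"] * soil_feat
        h, c = lstm_step(x, h, c, params["lstm"])
    return params["out_w"] * h + params["out_b"]

# Hypothetical, untrained parameters and made-up inputs (illustrative only).
gate_names = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")
params = {
    "weather_kernel": [0.2, 0.5, 0.2],
    "soil_kernel": [0.6, 0.4],
    "mix_weather": 0.8,
    "mix_soil": 0.3,
    "lstm": {name: 0.5 for name in gate_names},
    "out_w": 50.0,
    "out_b": 150.0,
}
weather_by_year = [
    [0.1, 0.4, 0.7, 0.9, 0.6, 0.2],  # one within-year weather series per year
    [0.2, 0.5, 0.8, 0.8, 0.5, 0.1],
    [0.0, 0.3, 0.6, 1.0, 0.7, 0.3],
]
soil_by_depth = [0.3, 0.4, 0.5, 0.2]  # one soil variable at four depths

estimate = predict(weather_by_year, soil_by_depth, params)
print(estimate)
```

Because the LSTM output lies strictly between -1 and 1, the estimate in this toy setup always falls between out_b - out_w and out_b + out_w; a trained model would learn all of these weights from data.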

Experimental Results

The paper compares the CNN-RNN model against established methods: random forest (RF), deep fully-connected neural networks (DFNN), and LASSO. The CNN-RNN model achieves a root-mean-square error (RMSE) of approximately 9% of the average yield for corn and 8% for soybean, substantially outperforming the other methods.
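To make the reported metric concrete, here is a small sketch of RMSE expressed as a percentage of the average observed yield. The function names and the yield values are hypothetical and are not taken from the paper.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def rmse_percent_of_mean(actual, predicted):
    """RMSE as a percentage of the mean observed value."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual

# Made-up yields in bushels per acre (illustrative only).
observed = [170.0, 180.0, 190.0, 200.0]
forecast = [160.0, 190.0, 180.0, 210.0]

print(rmse(observed, forecast))                            # 10.0
print(round(rmse_percent_of_mean(observed, forecast), 2))  # 5.41
```

Normalizing by the average yield makes errors comparable across crops whose typical yields differ, which is why a single percentage can be quoted for corn and another for soybean.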

A novel aspect of this work is the feature selection analysis using guided backpropagation, which elucidates the importance of various input features, including weather components and soil conditions, as well as their temporal influence on crop yield prediction. Notably, solar radiation and temperature emerged as critical factors, reflecting agronomic realities concerning photosynthesis and crop growth stages.
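The guided backpropagation rule can be sketched on a tiny one-hidden-layer ReLU network. This is an illustration of the general technique, not the paper's implementation: the network, weights, and the two input features are hypothetical. Ordinary backpropagation passes a gradient through a ReLU wherever the forward input was positive; guided backpropagation additionally drops negative incoming gradients.

```python
def forward(x, W1, w2):
    """Compute hidden pre-activations z and the scalar output y."""
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    a = [max(0.0, zj) for zj in z]  # ReLU
    y = sum(wj * aj for wj, aj in zip(w2, a))
    return z, y

def input_gradient(x, W1, w2, guided):
    """Gradient of y with respect to x, with or without the guided rule."""
    z, _ = forward(x, W1, w2)
    grad_a = list(w2)  # dy/da_j equals w2[j]
    grad_z = []
    for zj, gj in zip(z, grad_a):
        # Plain rule: pass where the forward input was positive.
        # Guided rule: also require the incoming gradient to be positive.
        passes = zj > 0.0 and (gj > 0.0 or not guided)
        grad_z.append(gj if passes else 0.0)
    # Push the hidden-layer gradient back through the first linear layer.
    return [sum(W1[j][i] * grad_z[j] for j in range(len(W1)))
            for i in range(len(x))]

# Hypothetical weights and a two-feature input (illustrative only).
x = [1.0, 2.0]
W1 = [[1.0, 0.5], [0.5, -1.0], [-1.0, 1.0]]
w2 = [1.0, 2.0, -3.0]

print(input_gradient(x, W1, w2, guided=False))  # [4.0, -2.5]
print(input_gradient(x, W1, w2, guided=True))   # [1.0, 0.5]
```

In the guided result only hidden units that both fired and contributed positively to the output are traced back to the inputs, so the per-feature scores are easier to read as importance values.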

Implications and Future Directions

The implications of this research are multifaceted. Practically, the ability to generalize yield predictions to untested environments without genotype information enhances decision-making related to crop management and agricultural policy. The model's ability to integrate simulated weather data also underscores its potential utility in forecasting scenarios where real-time meteorological data may be unavailable.
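One way to check generalization to untested environments is to train only on earlier years at some locations and evaluate on the forecast year at locations the model never saw. The sketch below shows such a split; the record fields, location labels, and values are hypothetical, and the exact protocol used in the paper may differ.

```python
def split_untested(records, test_year, held_out_locations):
    """Train on earlier years at seen locations; test on unseen locations."""
    held = set(held_out_locations)
    train = [r for r in records
             if r["year"] < test_year and r["location"] not in held]
    test = [r for r in records
            if r["year"] == test_year and r["location"] in held]
    return train, test

# Made-up records (illustrative only).
records = [
    {"location": "A", "year": 2016, "yield_bu_ac": 170.0},
    {"location": "A", "year": 2017, "yield_bu_ac": 175.0},
    {"location": "A", "year": 2018, "yield_bu_ac": 180.0},
    {"location": "B", "year": 2016, "yield_bu_ac": 160.0},
    {"location": "B", "year": 2017, "yield_bu_ac": 165.0},
    {"location": "B", "year": 2018, "yield_bu_ac": 172.0},
    {"location": "C", "year": 2017, "yield_bu_ac": 150.0},
    {"location": "C", "year": 2018, "yield_bu_ac": 158.0},
]

train, test = split_untested(records, test_year=2018, held_out_locations=["C"])
print(len(train), len(test))  # 4 1
```

Filtering on both year and location keeps future data and held-out locations out of training, so the test error reflects performance in an environment that is new in both respects.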

Theoretically, this framework reinforces the value of neural network architectures that combine convolutional and recurrent components for capturing intricate relationships in agricultural data. Future research might explore integrating genotype data, which the current model does not use, to refine accuracy further. Additionally, applying this approach to other crops and geographic regions could broaden its applicability.

Conclusion

In conclusion, this paper advances crop yield prediction with a CNN-RNN architecture that models interactions between environmental and management factors over time. Its methodological contributions, together with validation across three forecast years and two crops, add to the understanding and practical application of deep learning in agronomy. The tests of generalization to untested environments and the feature importance analysis are useful steps towards making these models not only more accurate but also more interpretable and actionable.
