Crop Yield Prediction Using Deep Neural Networks (1902.02860v3)

Published 7 Feb 2019 in cs.LG, stat.AP, and stat.ML

Abstract: Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires fundamental understanding of the functional relationship between yield and these interactive factors, and to reveal such relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have a superior prediction accuracy, with a root-mean-square-error (RMSE) being 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without significant drop in the prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on the crop yield than genotype.

Citations (494)


Summary

The paper introduces a deep neural network with 21 hidden layers that predicts maize yield from genotype and environment data.
It trains the network with SGD and batch normalization to capture nonlinear genotype-by-environment interactions, and uses guided backpropagation for feature selection.
The model achieves a validation RMSE of 12% of the average yield with predicted weather data (11% with perfect weather data), and the results indicate that environmental factors affect yield more than genotype does.

Deep Neural Networks for Crop Yield Prediction: A Comprehensive Analysis

Crop yield prediction is a critical challenge in agricultural science and policy-making, given its implications for food security, resource management, and economic decisions. The paper "Crop Yield Prediction Using Deep Neural Networks" presents a deep learning model that predicts maize yield from genotype and environmental data. The authors were one of the winning teams in the 2018 Syngenta Crop Challenge, which illustrates how well deep neural networks (DNNs) can handle multi-factor agricultural prediction problems.

Methodological Insights

The paper uses a dataset of 2,267 maize hybrids planted in 2,247 locations, with data spanning 2008 to 2016. The deep neural network stands out for its depth: 21 hidden layers with 50 neurons each, which lets it capture complex, nonlinear relationships between inputs and outputs. Training relies on stochastic gradient descent (SGD), batch normalization, residual shortcuts, and maxout activation functions, techniques that are commonly used to keep very deep networks trainable and that help the model learn intricate genotype-by-environment (GxE) interactions.
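The architecture described above can be sketched as a forward pass in NumPy. This is only an illustration, not the authors' implementation: the number of maxout pieces (2), the placement of a residual shortcut around every two hidden layers, the initializer, and the input dimension are all assumptions, and batch normalization is shown without learned scale and shift parameters.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (training-mode statistics only;
    # learned scale/shift are omitted for brevity).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def maxout(x, w, b):
    # w: (pieces, in_dim, out_dim), b: (pieces, out_dim).
    # Compute every affine piece, then take the elementwise maximum.
    z = np.einsum("ni,pio->pno", x, w) + b[:, None, :]
    return z.max(axis=0)

def init_layer(rng, in_dim, out_dim, pieces=2):
    # Simple variance-scaling initializer (an assumption, not from the paper).
    scale = np.sqrt(2.0 / (in_dim + out_dim))
    w = rng.normal(0.0, scale, size=(pieces, in_dim, out_dim))
    b = np.zeros((pieces, out_dim))
    return w, b

def init_network(rng, in_dim, width=50, hidden_layers=21):
    dims = [in_dim] + [width] * hidden_layers
    hidden = [init_layer(rng, dims[i], dims[i + 1]) for i in range(hidden_layers)]
    w_out = rng.normal(0.0, np.sqrt(1.0 / width), size=(width, 1))
    return hidden, w_out

def forward(params, x):
    hidden, w_out = params
    # The first hidden layer projects the inputs to the hidden width.
    h = batch_norm(maxout(x, *hidden[0]))
    # The remaining 20 hidden layers form 10 residual blocks of two layers each.
    for i in range(1, len(hidden), 2):
        shortcut = h
        h = batch_norm(maxout(h, *hidden[i]))
        h = batch_norm(maxout(h, *hidden[i + 1]))
        h = h + shortcut
    # Linear output: one predicted yield per sample.
    return (h @ w_out).ravel()
```

Calling `forward(init_network(np.random.default_rng(0), in_dim=30), x)` on a batch `x` of shape `(n, 30)` returns `n` predictions; the input width of 30 is hypothetical.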

Feature selection is performed on the trained model with guided backpropagation, which reduces the dimensionality of the input data without a significant drop in prediction accuracy. This approach helps identify the genetic markers and environmental components that matter most, improving interpretability despite the complexity typical of deep learning systems.
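A minimal sketch of how guided backpropagation can rank input features is shown below. It swaps in a small ReLU network without biases, because the guided-backpropagation rule is simplest to state for ReLU units; the paper's network uses maxout, so this is not the authors' procedure. The function names and the mean-absolute-saliency ranking are assumptions made for illustration.

```python
import numpy as np

def forward(x, weights):
    # Plain ReLU network without biases; returns the output and the
    # pre-activations of every hidden layer.
    pre = []
    h = x
    for w in weights[:-1]:
        z = h @ w
        pre.append(z)
        h = np.maximum(z, 0.0)
    return h @ weights[-1], pre

def guided_backprop(x, weights):
    out, pre = forward(x, weights)
    # Gradient of the scalar output with respect to the last hidden activation.
    grad = np.ones_like(out) @ weights[-1].T
    for w, z in zip(reversed(weights[:-1]), reversed(pre)):
        # Guided rule: pass the signal only where the forward pre-activation
        # was positive AND the incoming gradient is positive.
        grad = grad * (z > 0) * (grad > 0)
        grad = grad @ w.T
    return grad  # shape (n_samples, n_features)

def rank_features(x, weights, top_k):
    # Average absolute guided gradient per input feature, largest first.
    saliency = np.abs(guided_backprop(x, weights)).mean(axis=0)
    order = np.argsort(saliency)[::-1][:top_k]
    return order, saliency
```

Features with the lowest saliency are candidates for removal; the reduced input set would then be used to retrain the model and confirm that accuracy holds.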

Key Findings and Results

The empirical results show that the DNN outperforms Lasso, shallow neural networks, and regression trees. Specifically, the model achieves a validation RMSE of 12% of the average yield with predicted weather data and 11% with perfect weather data. The results also indicate that environmental conditions have a greater effect on crop yield than genotype.
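The reported figures express RMSE relative to the average yield and to the standard deviation of yield. A small sketch of that calculation follows; the function name is hypothetical, and the use of the population standard deviation of the observed yields is an assumption, since the summary does not say which estimator the authors used.

```python
import numpy as np

def relative_rmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return {
        "rmse": rmse,
        # RMSE as a percentage of the mean observed yield.
        "pct_of_mean": 100.0 * rmse / y_true.mean(),
        # RMSE as a percentage of the (population) standard deviation.
        "pct_of_std": 100.0 * rmse / y_true.std(),
    }
```

A value of `pct_of_std` below 100 means the model's errors are smaller than those of a baseline that always predicts the mean yield.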

Moreover, the sensitivity analysis shows that weather accuracy matters: replacing predicted weather with perfect weather data lowers the validation RMSE from 12% to 11% of the average yield, and from 50% to 46% of the standard deviation. This dependence points to better weather forecasting as one route to more precise yield predictions.

Implications and Future Directions

This research has practical and theoretical implications for agriculture and computational biology. Practically, more accurate yield models can help policymakers and agronomists make informed decisions about crop management and food supply chain logistics. Theoretically, the work lays a foundation for further study of genotype-environment interactions using advanced machine learning techniques.

Future research could focus on enhancing the explainability of such models, addressing the common critique regarding the 'black box' nature of neural networks. This could involve integrating domain knowledge into the modeling process or developing hybrid models that balance accuracy with interpretability. Furthermore, expanding the dataset to include diverse crop types and incorporating more detailed environmental data could enhance the model's applicability across broader agricultural contexts.

In summary, the paper exemplifies the potential of deep learning in agricultural forecasting, providing a strong case for its integration into crop yield prediction frameworks. By advancing both the accuracy and methodological depth of such models, this research contributes significantly to the toolkit available for addressing the complex challenges of modern agriculture.
