FourCropNet: Multi-Crop Disease Detection CNN
- FourCropNet is a unified convolutional neural network that detects plant diseases in four key crops using integrated residual and channel attention mechanisms.
- It features a lightweight architecture optimized for real-time inference on edge devices, ensuring efficient deployment in resource-constrained agricultural settings.
- Empirical evaluations show it outperforms models like VGG16, MobileNet-V2, and EfficientNet-B1 with accuracies exceeding 95% on combined multi-crop disease datasets.
FourCropNet is a convolutional neural network (CNN) architecture developed for accurate, efficient plant disease detection across four economically significant crops: cotton, grape, soybean, and corn. Designed as a unified model for multi-crop visual disease diagnosis, it combines residual feature extraction, channel attention, and a computationally lightweight classification head to generalize across multiple classes and datasets. Benchmarking studies indicate that FourCropNet outperforms established models such as MobileNet, VGG16, and EfficientNet in accuracy, specificity, sensitivity, and F1-score, in both single-crop and combined multi-class settings. Its design is optimized for real-time inference on edge devices, facilitating practical deployment in resource-constrained agricultural scenarios (Khandagale et al., 11 Mar 2025).
1. Model Architecture
FourCropNet employs an integrated architectural scheme tailored for the detection of diverse crop diseases from RGB leaf images. The network processes 224×224×3 input images through three main feature extraction stages, followed by two fully-connected classification layers. The key architectural stages are:
- Initial convolution: A 3×3 convolution (32 filters) is followed by batch normalization and ReLU activation; 2×2 max pooling then reduces the feature map to 112×112×32.
- Residual blocks: Two pre-activation residual blocks, inspired by ResNet [He et al., CVPR 2016], facilitate efficient feature reuse and deeper representational learning. Each block applies a 3×3 convolution and skip connection. The first block maintains 32 channels (output 112×112×32), while the second increases to 64 channels and down-samples to 56×56×64.
- Residual + Attention block: Two successive 3×3 convolutions (128 filters) are each followed by batch normalization and ReLU. The block integrates a Squeeze-and-Excitation (SE) channel attention mechanism [Hu et al., 2018]: global average pooling (squeeze), an excitation phase via a two-layer MLP (dimensionality reduction ratio r=16), and per-channel rescaling. Max pooling yields an output of 28×28×128.
- Classification head: Feature maps are flattened and processed by two fully connected layers (256 and 128 units, respectively) with ReLU and 0.5 dropout, culminating in a 15-unit softmax for multi-class prediction.
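The feature-map sizes quoted in the stage list can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `trace_shapes` is hypothetical, and it assumes "same" padding in every convolution so that only the pooling and down-sampling steps change the spatial size.

```python
def trace_shapes(input_size=224):
    """Return (stage, height, width, channels) tuples following the stage list.

    Assumes convolutions preserve spatial size ("same" padding) and that each
    pooling or down-sampling step halves height and width.
    """
    stages = []
    size = input_size
    # Initial 3x3 convolution (32 filters) followed by 2x2 max pooling.
    size //= 2
    stages.append(("initial conv + pool", size, size, 32))
    # Residual block 1 keeps both resolution and channel count.
    stages.append(("residual block 1", size, size, 32))
    # Residual block 2 widens to 64 channels and halves the resolution.
    size //= 2
    stages.append(("residual block 2", size, size, 64))
    # Residual + SE block: 128 filters, then max pooling.
    size //= 2
    stages.append(("residual + SE block", size, size, 128))
    return stages


for name, height, width, channels in trace_shapes():
    print(f"{name:22s} {height}x{width}x{channels}")
```

Under these assumptions the trace reproduces the 112×112×32, 56×56×64, and 28×28×128 outputs stated for the stages.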
FourCropNet comprises approximately 6.5 million learnable parameters, with moderate depth (~25 layers) and a single SE-attention block, for an estimated ≈300 million FLOPs. This budget balances accuracy and efficiency, positioning FourCropNet between MobileNet-V3 (5.4M params, 200M FLOPs) and EfficientNet-B1 (7.8M params, 500M FLOPs).
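To make the single SE-attention block concrete, here is a minimal pure-Python sketch of the squeeze, excitation, and rescale steps. It is not the authors' implementation: the function name `se_block`, the random weights, the omission of biases, and the tiny 2×2 spatial grid are all assumptions chosen to keep the example small; only the 128 channels and the reduction ratio r=16 come from the description above.

```python
import math
import random


def se_block(feature_map, w1, w2):
    """Apply squeeze-and-excitation to a feature map stored as [C][H][W] lists.

    w1 has shape [C // r][C] and w2 has shape [C][C // r]; biases are omitted.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, layer 1: reduce C -> C // r, then ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, layer 2: expand C // r -> C, then sigmoid gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 128 channels and r = 16 as in the text, on a 2x2 grid for speed.
random.seed(0)
C, R, H, W = 128, 16, 2, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-0.1, 0.1) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.uniform(-0.1, 0.1) for _ in range(C // R)] for _ in range(C)]
y, gates = se_block(x, w1, w2)
print(len(y), len(y[0]), len(y[0][0]), len(gates))
```

With r=16 the hidden layer has 128 / 16 = 8 units, so the attention mechanism adds only two small weight matrices to the block.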
2. Training Setup and Datasets
FourCropNet’s training regime utilizes diverse datasets covering disease and healthy foliar images for each crop, as well as a combined 15-class dataset. The datasets are:
- CottonLeaf (5 classes), Grape (4 classes), Soybean (4 classes), Corn (4 classes), and a Combined set (15 classes, ~20,000 images) derived from the Kaggle “20k+ Multi-Class Crop Disease Images” dataset.
- Class sizes typically range from 800–1,500 images, with an 80%/10%/10% train/validation/test split.
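An 80%/10%/10% split of this kind could be produced as sketched below. The source does not say whether the split was stratified by class; per-class stratification, the function name `stratified_split`, and the example class names are assumptions made for illustration.

```python
import random


def stratified_split(labels, train_pct=80, val_pct=10, seed=0):
    """Return train/val/test index lists, splitting each class separately.

    Integer percentages avoid floating-point rounding surprises; whatever
    remains after the train and validation shares goes to the test set.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    splits = {"train": [], "val": [], "test": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = len(indices) * train_pct // 100
        n_val = len(indices) * val_pct // 100
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]
    return splits


# Hypothetical two-class example with class sizes in the 800-1,500 range.
labels = ["healthy"] * 1000 + ["diseased"] * 800
splits = stratified_split(labels)
print({name: len(indices) for name, indices in splits.items()})
```

Splitting within each class keeps the class proportions identical across the three subsets, which matters when class sizes vary between 800 and 1,500 images.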
Image preprocessing includes resizing to 224×224, random rotations (±15°), horizontal and vertical flipping, brightness and contrast modulation, and per-channel normalization (zero mean, unit variance). The network is trained using categorical cross-entropy loss with the Adam optimizer (initial learning rate 1e-4, reduced by ×0.1 every 10 epochs), batch size 32, weight decay 1e-5, and dropout 0.5 in the fully-connected layers; it typically converges within 30–40 epochs.
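The loss and the step-wise learning-rate decay described above can be written out directly. This is a sketch of the standard formulas rather than the authors' training code; the function names `categorical_cross_entropy` and `step_lr` are hypothetical.

```python
import math


def categorical_cross_entropy(probabilities, true_class):
    """Cross-entropy for one sample with a one-hot target: -log p[true_class]."""
    return -math.log(probabilities[true_class])


def step_lr(epoch, base_lr=1e-4, gamma=0.1, step=10):
    """Learning rate that starts at base_lr and is multiplied by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


# A uniform prediction over the 15 classes gives a loss of log(15).
print(categorical_cross_entropy([1 / 15] * 15, true_class=3))

# Learning rate at the start of each 10-epoch stage within a 40-epoch run.
for epoch in (0, 10, 20, 30):
    print(epoch, step_lr(epoch))
```

Over a 30–40 epoch run the schedule therefore passes through three or four stages, each with a learning rate one tenth of the previous one.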
3. Empirical Performance
FourCropNet’s performance is validated on held-out test sets using accuracy, specificity, sensitivity, and F1-score as metrics. The following summarizes evaluation across datasets:
| Dataset | Accuracy | Specificity | Sensitivity | F1-Score |
|---|---|---|---|---|
| CottonLeaf | 96.8% | 95.9% | 96.2% | 96.0% |
| Grape | 99.7% | 99.5% | 99.6% | 99.6% |
| Soybean | 96.0% | 95.2% | 95.5% | 95.3% |
| Corn | 99.5% | 99.3% | 99.4% | 99.4% |
| Combined | 95.3% | 95.7% | 95.0% | 95.3% |
In comparison to baseline architectures, FourCropNet demonstrates the following performance on the combined dataset:
| Model | Accuracy | Specificity | Sensitivity | F1-Score | Params (M) |
|---|---|---|---|---|---|
| VGG16 | 89.2% | 88.1% | 89.0% | 88.5% | 138.4 |
| MobileNet-V2 | 91.5% | 90.8% | 91.0% | 91.0% | 5.4 |
| EfficientNet-B1 | 93.8% | 93.2% | 93.5% | 93.3% | 7.8 |
| FourCropNet | 95.3% | 95.7% | 95.0% | 95.3% | 6.5 |
FourCropNet's accuracy is 1.5 percentage points higher than EfficientNet-B1 and 3.8 points higher than MobileNet-V2, at a comparable parameter count. Inference is also faster than EfficientNet-B1 (7 ms/image versus 9 ms/image on an NVIDIA GTX 1080Ti).
4. Scalability, Generalization, and Ablation
FourCropNet demonstrates robust scalability and generalization as the number of classes increases:
- AUC metrics from ROC analysis: single-crop (4-class) >0.995, two crops (8-class) >0.990, three crops (12-class) >0.985, all crops (15-class) 0.983.
- Performance shows graceful degradation with increasing class count, indicating resilience to inter-class visual similarity.
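A single AUC figure for a multi-class classifier is commonly obtained by averaging one-vs-rest binary AUCs. The source does not specify the method, so the macro one-vs-rest scheme below is an assumption; the function names `binary_auc` and `macro_ovr_auc` and the toy scores are hypothetical.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for s, flag in zip(scores, is_positive) if flag]
    negatives = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def macro_ovr_auc(probabilities, labels, n_classes):
    """Average the one-vs-rest AUC of each class (every class must appear in labels)."""
    aucs = []
    for k in range(n_classes):
        class_scores = [row[k] for row in probabilities]
        aucs.append(binary_auc(class_scores, [label == k for label in labels]))
    return sum(aucs) / n_classes


# Toy softmax outputs for six samples and three classes (illustrative only).
probabilities = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
]
labels = [0, 0, 1, 1, 2, 2]
print(macro_ovr_auc(probabilities, labels, n_classes=3))
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a test set of a few thousand images; a rank-based formulation would scale better.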
Ablation studies isolate the impact of model components:
- Removing SE-attention decreases combined accuracy from 95.3% to 93.7% (–1.6 percentage points).
- Replacing residual skip connections with plain convolution decreases combined accuracy from 95.3% to 94.1% (–1.2 percentage points).
- The residual and attention mechanisms thus each contribute approximately 1–2 percentage points to overall accuracy.
5. Computational Efficiency and Deployment
FourCropNet is engineered for practical inference speed and memory efficiency:
- Model size is ≈26 MB, suitable for deployment on edge devices.
- Inference latency is approximately 25–30 milliseconds per image on mid-range smartphone CPUs, and 7 milliseconds per image on NVIDIA GTX 1080Ti.
- Supports export to TensorFlow-Lite and ONNX formats for compatibility with portable and embedded hardware.
- Batch processing for offline inference on resource-limited GPUs remains feasible.
This computational profile facilitates robust field deployment for near real-time disease diagnosis.
6. Applications in Precision Agriculture
FourCropNet's application scope includes:
- Real-time in-field disease scanning via smartphone or drone-mounted cameras.
- Automated alerting to enable rapid intervention and early outbreak containment.
- Integration with farm management platforms for geo-tagged disease mapping, supporting spatial crop health analytics.
These features enable timely and precise diagnosis, reducing yield loss and unnecessary pesticide application while improving sustainability in multi-crop agricultural contexts. The architecture can also be adapted to new crop classes by fine-tuning on augmented or novel leaf image datasets, supporting extensibility in changing agricultural environments.
7. Significance and Impact
FourCropNet provides a scalable framework for multi-crop disease detection, with accuracy exceeding 95% on the complex 15-class task while remaining viable for edge-based deployment (Khandagale et al., 11 Mar 2025). Its combination of residual and attention-based feature learning within a lightweight parameter and FLOP budget exemplifies advances in practical deep learning for digital agriculture. Its higher measured accuracy than the VGG16, MobileNet-V2, and EfficientNet-B1 baselines and its suitability for mobile inference workflows address key challenges in precision agriculture, contributing to improved crop health monitoring and management.