InsectMamba: Insect Pest Classification with State Space Model (2404.03611v1)
Abstract: Classifying insect pests is a critical task in agricultural technology, vital for food security and environmental sustainability. Pest identification is difficult, however, because pests are often highly camouflaged and species are diverse. Existing methods struggle with the fine-grained feature extraction needed to distinguish closely related pest species. Although recent work has modified network structures and combined deep learning approaches to improve accuracy, challenges persist because pests closely resemble their surroundings. To address this problem, we introduce InsectMamba, a novel approach that integrates State Space Models (SSMs), Convolutional Neural Networks (CNNs), Multi-Head Self-Attention (MSA), and Multilayer Perceptrons (MLPs) within Mix-SSM blocks. This integration extracts comprehensive visual features by leveraging the strengths of each encoding strategy. We also propose a selective module that adaptively aggregates these features, enhancing the model's ability to discern pest characteristics. We evaluated InsectMamba against strong competitors on five insect pest classification datasets. The results demonstrate its superior performance, and an ablation study verifies the significance of each model component.
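The abstract does not spell out how the selective module combines the four branch outputs, so the following is only a rough sketch of one plausible reading: each branch (SSM, CNN, MSA, MLP) produces a feature vector of the same size, a gate scores each branch, and a softmax over those scores yields the weights of a weighted sum. The function names, the dot-product gate, and the toy feature values are all assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_aggregate(
    branch_features: Dict[str, List[float]],
    gate_vectors: Dict[str, List[float]],
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse per-branch features with softmax gating (hypothetical design).

    branch_features: one equally sized feature vector per branch.
    gate_vectors: one hypothetical gate vector per branch; the branch
        score is the dot product of the gate vector and the features.
    Returns the fused feature vector and the per-branch weights.
    """
    names = list(branch_features)
    scores = [
        sum(g * x for g, x in zip(gate_vectors[n], branch_features[n]))
        for n in names
    ]
    weights = softmax(scores)
    dim = len(branch_features[names[0]])
    fused = [
        sum(w * branch_features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy features standing in for the outputs of the four encoders.
features = {
    "ssm": [1.0, 0.0],
    "cnn": [0.0, 1.0],
    "msa": [1.0, 1.0],
    "mlp": [0.0, 0.0],
}
# All-zero gates give every branch the same score, so the fusion
# reduces to a plain average of the four branches.
zero_gates = {name: [0.0, 0.0] for name in features}
fused, weights = selective_aggregate(features, zero_gates)
```

With learned gate vectors the weights would differ per input, which is the sense in which such a module aggregates features adaptively; the uniform case above is only a sanity check.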