An Efficient MLP-based Point-guided Segmentation Network for Ore Images with Ambiguous Boundary (2402.17370v1)
Abstract: Precise segmentation of ore images is critical to the beneficiation process. Ores have a homogeneous appearance, which leads to low contrast and unclear boundaries and makes accurate segmentation and recognition difficult. This paper proposes a lightweight framework based on the Multi-Layer Perceptron (MLP) that focuses on the problem of edge blurring. Specifically, we introduce a lightweight backbone better suited to efficiently extracting low-level features. In addition, we design a feature pyramid network consisting of two MLP structures that balance local and global information, thus enhancing detection accuracy. Furthermore, we propose a novel loss function that guides the predicted points to match the instance edge points, yielding clear object boundaries. We have conducted extensive experiments to validate the efficacy of the proposed method. Our approach achieves a processing speed of over 27 frames per second (FPS) with a model size of only 73 MB. On the ore image dataset, it also delivers high accuracy relative to current state-of-the-art techniques, scoring 60.4 in $AP_{50}^{box}$ and 48.9 in $AP_{50}^{mask}$. The source code will be released at https://github.com/MVME-HBUT/ORENEXT.
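The abstract does not give the form of the point-guided loss, so the following is only an illustrative sketch of the general idea of pulling predicted points toward instance edge points. It uses a generic symmetric Chamfer-style distance, which is a stand-in and not the loss proposed in the paper. The function name `chamfer_distance` and the representation of points as `(x, y)` tuples are assumptions made for this example.

```python
import math


def chamfer_distance(pred_points, edge_points):
    """Symmetric mean nearest-neighbour distance between two 2D point sets.

    Hypothetical stand-in for a point-matching objective: the value is 0
    when every predicted point coincides with an edge point and vice versa,
    and grows as the two sets drift apart.
    """
    if not pred_points or not edge_points:
        raise ValueError("both point sets must be non-empty")

    def mean_nearest(src, dst):
        # For each source point, take the distance to its closest target
        # point, then average over all source points.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    # Average both directions so neither set can be ignored.
    return 0.5 * (mean_nearest(pred_points, edge_points)
                  + mean_nearest(edge_points, pred_points))


# Predicted points lie one pixel below the true edge points.
pred = [(0.0, 0.0), (1.0, 0.0)]
edge = [(0.0, 1.0), (1.0, 1.0)]
print(chamfer_distance(pred, edge))
```

In a training setting a quantity of this kind would be computed on differentiable point coordinates and minimized alongside the usual detection and mask losses; how the paper samples and matches its points is not stated in the abstract.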