Towards Automatic Power Battery Detection: New Challenge, Benchmark Dataset and Baseline
Abstract: We conduct a comprehensive study on a new task named power battery detection (PBD), which aims to localize the dense cathode and anode plate endpoints in X-ray images in order to evaluate the quality of power batteries. Manufacturers currently rely on manual visual inspection to complete PBD, which makes it difficult to balance detection accuracy and efficiency. To address this issue and draw more attention to this meaningful task, we first carefully collect a dataset, called X-ray PBD, which contains 1,500 diverse X-ray images selected from thousands of power batteries from 5 manufacturers, covering 7 types of visual interference. Then, we propose a novel segmentation-based solution for PBD, termed multi-dimensional collaborative network (MDCNet). With the help of line and counting predictors, the representation of the point segmentation branch can be improved in both semantic and detail aspects. In addition, we design an effective distance-adaptive mask generation strategy, which alleviates the visual challenge caused by the inconsistent distribution density of plates and provides MDCNet with stable supervision. Without any bells and whistles, our segmentation-based MDCNet consistently outperforms various other solutions based on corner detection, crowd counting, and general/tiny object detection, making it a strong baseline that can help facilitate future research in PBD. Finally, we discuss some potential difficulties and directions for future research. The source code and datasets will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/X-ray-PBD.
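The abstract does not spell out how the distance-adaptive mask generation works, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that each annotated endpoint is rendered as a disc whose radius scales with the distance to its nearest neighbouring endpoint, so that densely packed plates receive small, non-overlapping masks while sparse ones receive larger masks. The function name `distance_adaptive_mask` and the parameters `scale`, `r_min`, and `r_max` are hypothetical.

```python
import math


def distance_adaptive_mask(points, height, width, scale=0.4, r_min=1.0, r_max=8.0):
    """Render endpoints as discs whose radius adapts to local point density.

    points: list of (row, col) endpoint coordinates.
    Returns (mask, radii): a height x width list of 0/1 values and the
    radius chosen for each point. All parameters are illustrative assumptions.
    """
    radii = []
    for i, (r, c) in enumerate(points):
        # Distance from this endpoint to every other endpoint.
        others = [
            math.hypot(r - r2, c - c2)
            for j, (r2, c2) in enumerate(points)
            if j != i
        ]
        nearest = min(others) if others else float("inf")
        # Radius grows with nearest-neighbour distance, clamped to [r_min, r_max].
        radii.append(min(max(scale * nearest, r_min), r_max))

    mask = [[0] * width for _ in range(height)]
    for (r, c), rad in zip(points, radii):
        # Only visit the bounding box of the disc, clipped to the image.
        y0, y1 = max(0, math.floor(r - rad)), min(height - 1, math.ceil(r + rad))
        x0, x1 = max(0, math.floor(c - rad)), min(width - 1, math.ceil(c + rad))
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                if (y - r) ** 2 + (x - c) ** 2 <= rad ** 2:
                    mask[y][x] = 1
    return mask, radii


# Two tightly spaced endpoints and one isolated endpoint.
mask, radii = distance_adaptive_mask([(10, 10), (10, 14), (10, 40)], 20, 60)
```

In this toy example the two endpoints that are 4 pixels apart receive small discs that stay separate, while the isolated endpoint receives a disc capped at `r_max`. A fixed-radius mask would either merge the dense endpoints or under-represent the sparse one, which is the kind of density inconsistency the abstract refers to.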