Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes (2403.11572v1)

Published 18 Mar 2024 in cs.CV and cs.MM

Abstract: Instance segmentation is a fundamental task in computer vision with broad applications across various industries. In recent years, with the proliferation of deep learning and artificial intelligence applications, how to train effective models with limited data has become a pressing issue for both academia and industry. In the Visual Inductive Priors challenge (VIPriors2023), participants must train a model capable of precisely locating individuals on a basketball court, all while working with limited data and without the use of transfer learning or pre-trained models. We propose a Memory effIciency inStance Segmentation framework based on visual inductive prior flow propagation that effectively incorporates inherent prior information from the dataset into both the data preprocessing and data augmentation stages, as well as the inference phase. Experiments by our team (ACVLAB) demonstrate that our model achieves promising performance (0.509 AP@0.50:0.95) even under limited data and memory constraints.
