Annolid: Annotate, Segment, and Track Anything You Need (2403.18690v1)

Published 27 Mar 2024 in cs.CV and cs.AI

Abstract: Annolid is a deep learning-based software package designed for the segmentation, labeling, and tracking of research targets within video files, focusing primarily on animal behavior analysis. Based on state-of-the-art instance segmentation methods, Annolid now harnesses the Cutie video object segmentation model to achieve resilient, markerless tracking of multiple animals from single annotated frames, even in environments in which they may be partially or entirely concealed by environmental features or by one another. Our integration of Segment Anything and Grounding-DINO strategies additionally enables the automatic masking and segmentation of recognizable animals and objects by text command, removing the need for manual annotation. Annolid's comprehensive approach to object segmentation flexibly accommodates a broad spectrum of behavior analysis applications, enabling the classification of diverse behavioral states such as freezing, digging, pup huddling, and social interactions in addition to the tracking of animals and their body parts.
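The text-prompted masking pipeline described in the abstract (Grounding-DINO proposing boxes from a text query, Segment Anything converting each box into a pixel mask) can be sketched roughly as follows. This is a minimal illustration, not Annolid's actual implementation: the `detect_boxes` helper is a hypothetical stand-in for an open-set detector such as Grounding-DINO, while the `segment_anything` calls follow that library's published `SamPredictor` interface.

```python
# Minimal sketch of a text-prompted masking pipeline in the spirit of Annolid's
# Grounding-DINO + Segment Anything integration. Not the package's actual code.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor  # published SAM API


def detect_boxes(image_rgb: np.ndarray, prompt: str) -> list[tuple[int, int, int, int]]:
    """Hypothetical stand-in for an open-set detector such as Grounding-DINO.

    It should return pixel-coordinate (x0, y0, x1, y1) boxes for image regions
    matching the text prompt (e.g. "mouse").
    """
    raise NotImplementedError("plug in an open-set detector such as Grounding-DINO")


def masks_from_text(image_path: str, prompt: str, sam_checkpoint: str) -> list[np.ndarray]:
    """Turn a text prompt into per-instance binary masks for a single frame."""
    image_bgr = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

    # 1) Text prompt -> candidate bounding boxes (open-set detection).
    boxes = detect_boxes(image_rgb, prompt)

    # 2) Each box -> pixel mask via Segment Anything.
    sam = sam_model_registry["vit_h"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)

    masks = []
    for box in boxes:
        m, _scores, _ = predictor.predict(box=np.array(box), multimask_output=False)
        masks.append(m[0])  # boolean HxW mask for this instance
    return masks
```

In Annolid, per the abstract, first-frame masks such as these seed the Cutie video object segmentation model, which then propagates each instance through the remaining frames without further manual annotation.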

References (29)
  1. C. Yang, J. Forest, M. Einhorn, and T. A. Cleland, “Automated behavioral analysis using instance segmentation,” arXiv:2312.07723, 2023.
  2. S. Liu, Z. Zeng, T. Ren, F. Li, H. Zhang, J. Yang, C. Li, J. Yang, H. Su, J. Zhu et al., “Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection,” arXiv:2303.05499, 2023.
  3. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo, P. Dollár, and R. Girshick, “Segment Anything,” arXiv:2304.02643, 2023.
  4. L. Ke, M. Ye, M. Danelljan, Y. Liu, Y.-W. Tai, C.-K. Tang, and F. Yu, “Segment Anything in high quality,” arXiv:2306.01567, 2023.
  5. C. Zhang, D. Han, Y. Qiao, J. U. Kim, S.-H. Bae, S. Lee, and C. S. Hong, “Faster Segment Anything: Towards lightweight SAM for mobile applications,” arXiv:2306.14289, 2023.
  6. H. K. Cheng, S. W. Oh, B. Price, J.-Y. Lee, and A. Schwing, “Putting the object back into video object segmentation,” arXiv:2310.12982, 2023.
  7. F. Romero-Ferrero, M. G. Bergomi, R. C. Hinz, F. J. Heras, and G. G. De Polavieja, “idtracker.ai: tracking all individuals in small or large collectives of unmarked animals,” Nature Methods, vol. 16, no. 2, pp. 179–182, 2019.
  8. K. Wada, “Labelme: Image polygonal annotation with Python.” [Online]. Available: https://github.com/wkentaro/labelme
  9. K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in ICCV, 2017.
  10. Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo, and R. Girshick, “Detectron2,” 2019. [Online]. Available: https://github.com/facebookresearch/detectron2
  11. J. Fang, C. Yang, and T. A. Cleland, “Scoring rodent digging behavior with Annolid,” Soc. Neurosci. Abstr. 512.01, 2023.
  12. C. Zhou, X. Li, C. C. Loy, and B. Dai, “EdgeSAM: Prompt-in-the-loop distillation for on-device deployment of SAM,” arXiv:2312.06660, 2023.
  13. W. Wang, “Advanced auto labeling solution with added features,” CVHub, 2023. [Online]. Available: https://github.com/CVHub520/X-AnyLabeling
  14. G. Bradski, “The OpenCV Library,” Dr. Dobb’s Journal of Software Tools, 2000.
  15. H. K. Cheng, S. W. Oh, B. Price, A. Schwing, and J.-Y. Lee, “Tracking anything with decoupled video segmentation,” in ICCV, 2023.
  16. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in ECCV, 2014.
  17. T. D. Pereira, N. Tabris, A. Matsliah, D. M. Turner, J. Li, S. Ravindranath, E. S. Papadoyannis, E. Normand, D. S. Deutsch, Z. Y. Wang, G. C. McKenzie-Smith, C. C. Mitelut, M. D. Castro, J. D’Uva, M. Kislin, D. H. Sanes, S. D. Kocher, S. S.-H. Wang, A. L. Falkner, J. W. Shaevitz, and M. Murthy, “SLEAP: A deep learning system for multi-animal pose tracking,” Nature Methods, vol. 19, no. 4, 2022.
  18. J. Lauer, M. Zhou, S. Ye, W. Menegas, S. Schneider, T. Nath, M. M. Rahman, V. D. Santo, D. Soberanes, G. Feng, V. N. Murthy, G. Lauder, C. Dulac, M. Mathis, and A. Mathis, “Multi-animal pose estimation, identification and tracking with DeepLabCut,” Nature Methods, vol. 19, pp. 496–504, 2022.
  19. A. Pérez-Escudero, J. Vicente-Page, R. C. Hinz, S. Arganda, and G. G. De Polavieja, “idTracker: tracking individuals in a group by automatic identification of unmarked animals,” Nature Methods, vol. 11, no. 7, pp. 743–748, 2014.
  20. C. Kim, F. Li, A. Ciptadi, and J. M. Rehg, “Multiple hypothesis tracking revisited,” in ICCV, 2015.
  21. S. Tang, M. Andriluka, B. Andres, and B. Schiele, “Multiple people tracking by lifted multicut and person re-identification,” in CVPR, 2017.
  22. P. Bergmann, T. Meinhardt, and L. Leal-Taixe, “Tracking without bells and whistles,” in ICCV, 2019.
  23. H. K. Cheng and A. G. Schwing, “XMem: Long-term video object segmentation with an Atkinson-Shiffrin memory model,” in ECCV, 2022.
  24. A. Athar, J. Luiten, P. Voigtlaender, T. Khurana, A. Dave, B. Leibe, and D. Ramanan, “BURST: A benchmark for unifying object recognition, segmentation and tracking in video,” in WACV, 2023.
  25. X. Zou, J. Yang, H. Zhang, F. Li, L. Li, J. Gao, and Y. J. Lee, “Segment everything everywhere all at once,” arXiv:2304.06718, 2023.
  26. F. Li, H. Zhang, P. Sun, X. Zou, S. Liu, J. Yang, C. Li, L. Zhang, and J. Gao, “Semantic-SAM: Segment and recognize anything at any granularity,” arXiv:2307.04767, 2023.
  27. X. Zhao, W. Ding, Y. An, Y. Du, T. Yu, M. Li, M. Tang, and J. Wang, “Fast Segment Anything,” arXiv:2306.12156, 2023.
  28. Y. Xiong, B. Varadarajan, L. Wu, X. Xiang, F. Xiao, C. Zhu, X. Dai, D. Wang, F. Sun, F. Iandola, R. Krishnamoorthi, and V. Chandra, “EfficientSAM: Leveraged masked image pretraining for efficient Segment Anything,” arXiv:2312.00863, 2023.
  29. H. Cai, C. Gan, and S. Han, “EfficientViT: Enhanced linear attention for high-resolution low-computation visual recognition,” arXiv:2205.14756, 2022.