CoralVOS: Dataset and Benchmark for Coral Video Segmentation (2310.01946v1)

Published 3 Oct 2023 in cs.CV

Abstract: Coral reefs form some of the most valuable and productive marine ecosystems, providing habitat for many marine species. Coral reef surveying and analysis are currently confined to coral experts, who invest substantial effort in generating comprehensive and dependable reports (e.g., coral coverage, population, spatial distribution, etc.) from the collected survey data. Because dense coral analysis based on manual effort is so time-consuming, existing coral analysis algorithms compromise: they down-sample the data and conduct only sparse point-based coral analysis within selected frames. Such down-sampling, however, inevitably introduces estimation bias or even leads to wrong results. To address this issue, we propose to perform dense coral video segmentation, with no down-sampling involved. Through video object segmentation, we can generate more reliable and in-depth coral analysis than existing coral reef analysis algorithms. To support such dense analysis, we introduce a large-scale coral video segmentation dataset, CoralVOS, as demonstrated in Fig. 1. To the best of our knowledge, CoralVOS is the first dataset and benchmark supporting dense coral video segmentation. We conduct experiments on CoralVOS with 6 recent state-of-the-art video object segmentation (VOS) algorithms, fine-tuning them on the dataset and achieving observable performance improvements. The results show that there is still great potential for further improving segmentation accuracy. The dataset and trained models will be released upon acceptance of this work to foster the coral reef research community.
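
To make the down-sampling argument concrete, here is a minimal sketch (not from the paper; the toy mask layout, function names, and 25-point budget are illustrative assumptions) contrasting dense per-pixel coverage, as enabled by video segmentation, with a sparse point-based estimate of the kind the abstract describes:

```python
import numpy as np

def dense_coverage(mask: np.ndarray) -> float:
    """Coverage from a dense binary segmentation mask: the exact
    fraction of pixels labeled as coral."""
    return float(mask.mean())

def point_count_coverage(mask: np.ndarray, n_points: int = 25,
                         rng: np.random.Generator | None = None) -> float:
    """Coverage estimated the sparse way: drop n random points on the
    frame and count how many land on coral pixels."""
    if rng is None:
        rng = np.random.default_rng()
    ys = rng.integers(0, mask.shape[0], size=n_points)
    xs = rng.integers(0, mask.shape[1], size=n_points)
    return float(mask[ys, xs].mean())

# Toy frame: coral occupies the left 30% of a 100x100 scene.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[:, :30] = 1

print(dense_coverage(mask))            # exactly 0.30
print(point_count_coverage(mask, 25))  # noisy: varies run to run
```

On this toy frame the dense estimate recovers the true 30% coverage exactly, while the 25-point estimate fluctuates from run to run; aggregating such estimates over only a few selected frames compounds the error, which is the kind of estimation bias that dense, full-video segmentation avoids.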
