Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs (2401.00608v5)

Published 31 Dec 2023 in cs.CV and cs.AI

Abstract: Camera traps are important tools in animal ecology for biodiversity monitoring and conservation. However, their practical application is limited by issues such as poor generalization to new and unseen locations. Images are typically associated with diverse forms of context, which may exist in different modalities. In this work, we exploit the structured context linked to camera trap images to boost out-of-distribution generalization for species classification tasks in camera traps. For instance, a picture of a wild animal could be linked to details about the time and place it was captured, as well as structured biological knowledge about the animal species. While often overlooked by existing studies, incorporating such context offers several potential benefits for better image understanding, such as addressing data scarcity and enhancing generalization. However, effectively incorporating such heterogeneous context into the visual domain is a challenging problem. To address this, we propose a novel framework that reformulates species classification as link prediction in a multimodal knowledge graph (KG). This framework enables the seamless integration of diverse multimodal contexts for visual recognition. We apply this framework to out-of-distribution species classification on the iWildCam2020-WILDS and Snapshot Mountain Zebra datasets and achieve performance competitive with state-of-the-art approaches. Furthermore, our framework enhances sample efficiency for recognizing under-represented species.
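
To make the idea of classification as link prediction concrete, here is a minimal sketch. It is not the paper's implementation: it assumes a DistMult-style scoring function, a hypothetical relation named `has_species`, and hand-picked toy embeddings in place of learned ones. The image is treated as the head entity of a triple `(image, has_species, ?)`, and the predicted class is the species entity that scores highest as the tail.

```python
def distmult_score(head, relation, tail):
    """DistMult-style triple score: sum_i head[i] * relation[i] * tail[i]."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def predict_species(image_emb, relation_emb, species_embs):
    """Score every candidate species as the tail of (image, relation, ?)
    and return the best-scoring name together with all scores."""
    scores = {
        name: distmult_score(image_emb, relation_emb, emb)
        for name, emb in species_embs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy values chosen by hand for illustration (assumption: in a real system
# these would be learned, with the image embedding coming from a vision
# encoder and the species embeddings from the knowledge graph).
image_emb = [0.9, 0.1, 0.0]
has_species = [1.0, 1.0, 1.0]  # hypothetical relation embedding
species_embs = {
    "zebra": [1.0, 0.0, 0.0],
    "kudu": [0.0, 1.0, 0.0],
    "empty": [0.0, 0.0, 1.0],
}

best, scores = predict_species(image_emb, has_species, species_embs)
print(best, scores)
```

Under the same assumptions, context such as capture time or location would enter as additional triples in the graph, for example a hypothetical `(image, taken_at, location)`, so that entity embeddings are shaped by more than the pixels alone.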
