
Imbalance-aware Presence-only Loss Function for Species Distribution Modeling (2403.07472v1)

Published 12 Mar 2024 in cs.LG

Abstract: In the face of significant biodiversity decline, species distribution models (SDMs) are essential for understanding the impact of climate change on species habitats by connecting environmental conditions to species occurrences. Traditionally limited by a scarcity of species observations, these models have significantly improved in performance through the integration of larger datasets provided by citizen science initiatives. However, they still suffer from the strong class imbalance between species within these datasets, often resulting in the penalization of rare species--those most critical for conservation efforts. To tackle this issue, this study assesses the effectiveness of training deep learning models using a balanced presence-only loss function on large citizen science-based datasets. We demonstrate that this imbalance-aware loss function outperforms traditional loss functions across various datasets and tasks, particularly in accurately modeling rare species with limited observations.
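To make the idea of an imbalance-aware presence-only loss concrete, here is a minimal sketch. It is not the paper's exact formulation, which the abstract does not give. It assumes that every site without a recorded presence is treated as a pseudo-absence, and that each species' presence and pseudo-absence terms are reweighted by inverse frequency so that rare species contribute as much to the loss as common ones. The function name `balanced_presence_only_loss` and the weighting scheme are illustrative assumptions.

```python
import numpy as np


def balanced_presence_only_loss(probs, presence, eps=1e-7):
    """Inverse-frequency weighted binary cross-entropy (illustrative sketch).

    probs:    (n_sites, n_species) predicted presence probabilities.
    presence: (n_sites, n_species) array, 1 where the species was observed,
              0 elsewhere (assumed here to act as a pseudo-absence).
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    presence = np.asarray(presence, dtype=float)
    n_sites, n_species = presence.shape

    # Per-species counts of presences and pseudo-absences.
    n_pos = presence.sum(axis=0)
    n_neg = n_sites - n_pos

    # Inverse-frequency weights (assumed scheme); guard against division by zero.
    w_pos = n_sites / np.maximum(n_pos, 1.0)
    w_neg = n_sites / np.maximum(n_neg, 1.0)

    # Weighted cross-entropy; weights broadcast across the site axis.
    per_entry = -(
        w_pos * presence * np.log(probs)
        + w_neg * (1.0 - presence) * np.log(1.0 - probs)
    )

    # Each species carries total weight 2 * n_sites when it has at least one
    # presence and one pseudo-absence, so this yields a weighted mean.
    return per_entry.sum() / (2.0 * n_sites * n_species)


# Toy data: species 0 is common (3 of 4 sites), species 1 is rare (1 of 4).
presence = np.array([[1, 0], [1, 0], [1, 0], [0, 1]])
good = np.where(presence == 1, 0.9, 0.1)

miss_rare = good.copy()
miss_rare[3, 1] = 0.1      # miss the rare species' only presence
miss_common = good.copy()
miss_common[0, 0] = 0.1    # miss one presence of the common species

loss_rare = balanced_presence_only_loss(miss_rare, presence)
loss_common = balanced_presence_only_loss(miss_common, presence)
```

Under this weighting, missing the only presence of the rare species costs more than missing one of several presences of the common species, which is the behaviour an imbalance-aware loss is meant to encourage.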
