What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits (2409.02335v2)
Abstract: A grand challenge in biology is to discover evolutionary traits: features of organisms common to a group of species with a shared ancestor in the tree of life (also referred to as the phylogenetic tree). With the growing availability of image repositories in biology, there is a tremendous opportunity to discover evolutionary traits directly from images in the form of a hierarchy of prototypes. However, current prototype-based methods are mostly designed to operate over a flat structure of classes and face several challenges in discovering hierarchical prototypes, including the issue of learning over-specific prototypes at internal nodes. To overcome these challenges, we introduce the framework of Hierarchy aligned Commonality through Prototypical Networks (HComP-Net). The key novelties in HComP-Net are an over-specificity loss that avoids learning over-specific prototypes, a discriminative loss that ensures prototypes at an internal node are absent in the contrasting set of species with different ancestry, and a masking module that allows over-specific prototypes to be excluded at higher levels of the tree without hampering classification performance. We empirically show that, compared to baselines, HComP-Net learns prototypes that are accurate, semantically consistent, and generalizable to unseen species.
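The notion of an over-specific prototype can be illustrated with a toy sketch. This is not the HComP-Net loss or masking module, whose formulations are not given in the abstract. It assumes only that each prototype at an internal node has some activation score per descendant species, and treats a prototype as over-specific when it fails to activate on at least one descendant. The names `is_over_specific`, `keep_common_prototypes`, the threshold value, and all scores are hypothetical.

```python
from typing import Dict, List

# Hypothetical activation scores in [0, 1]: prototype -> {descendant species -> score}.
# The species and prototype names and the numbers are made up for illustration.
PrototypeScores = Dict[str, Dict[str, float]]


def is_over_specific(scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Treat a prototype as over-specific if it is weakly activated
    (score below `threshold`) on at least one descendant species,
    i.e. it is not common to all descendants of the internal node."""
    return min(scores.values()) < threshold


def keep_common_prototypes(prototypes: PrototypeScores, threshold: float = 0.5) -> List[str]:
    """Return the names of prototypes that activate on every descendant species."""
    return [
        name
        for name, scores in prototypes.items()
        if not is_over_specific(scores, threshold)
    ]


# One internal node with three descendant species (toy values).
node_prototypes: PrototypeScores = {
    "p0": {"species_a": 0.90, "species_b": 0.80, "species_c": 0.85},  # shared by all
    "p1": {"species_a": 0.95, "species_b": 0.10, "species_c": 0.05},  # only species_a
}

kept = keep_common_prototypes(node_prototypes)
print(kept)
```

In this toy setting, `p1` is dropped because it responds strongly to only one of the three descendants, which is the kind of prototype the abstract describes as over-specific at an internal node.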