Towards Generalization in Subitizing with Neuro-Symbolic Loss using Holographic Reduced Representations (2312.15310v1)

Published 23 Dec 2023 in cs.CV, cs.LG, and q-bio.NC

Abstract: While deep learning has enjoyed significant success in computer vision tasks over the past decade, many shortcomings remain from a Cognitive Science (CogSci) perspective. In particular, the ability to subitize, i.e., to quickly and accurately identify small counts of items (fewer than six), is not well learned by current Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs) trained with a standard cross-entropy (CE) loss. In this paper, we demonstrate that adapting tools from CogSci research can improve the subitizing generalization of CNNs and ViTs; specifically, we develop an alternative loss function using Holographic Reduced Representations (HRRs). To investigate how this neuro-symbolic approach to learning affects the subitizing capability of CNNs and ViTs, we focus on specially crafted problems that isolate generalization along specific aspects of subitizing. Via saliency maps and out-of-distribution performance, we empirically observe that the proposed HRR loss improves subitizing generalization, though it does not completely solve the problem. In addition, we find that ViTs perform considerably worse than CNNs in most respects on subitizing, except along one axis where an HRR-based loss provides improvement.
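
As a rough illustration of the core operation behind this approach, the following is a minimal NumPy sketch of HRR binding and unbinding (circular convolution and correlation, computed via FFT, following Plate 1995), together with a hypothetical similarity-based target for a count label. The `count_key` symbol, the per-count class vectors, and the negative-cosine training suggestion are illustrative assumptions for exposition; they are not the paper's exact loss construction.

```python
import numpy as np

def fft_bind(a, b):
    """HRR binding: circular convolution, computed in the Fourier domain."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.shape[-1])

def fft_unbind(c, a):
    """Approximate unbinding: circular correlation with the conjugate of a."""
    return np.fft.irfft(np.fft.rfft(c) * np.conj(np.fft.rfft(a)), n=c.shape[-1])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

rng = np.random.default_rng(0)
d = 1024
# HRR symbol vectors: i.i.d. N(0, 1/d) components keep binding
# approximately invertible via circular correlation.
count_key = rng.normal(0.0, 1.0 / np.sqrt(d), d)       # hypothetical shared key
count_symbols = {k: rng.normal(0.0, 1.0 / np.sqrt(d), d) for k in range(1, 6)}

# Hypothetical target for an image containing 3 items: bind the shared
# "count" key to the symbol for 3. A network could then be trained to emit
# a vector near this target (e.g., via negative cosine similarity) rather
# than matching a one-hot cross-entropy target.
target = fft_bind(count_key, count_symbols[3])

# Decoding a (here, noiseless) prediction: unbind the key, then pick the
# most similar count symbol.
decoded = fft_unbind(target, count_key)
scores = {k: cosine(decoded, v) for k, v in count_symbols.items()}
print(max(scores, key=scores.get))  # -> 3 with high probability
```

Because binding distributes information across all vector components, the decoded vector is only approximately equal to the original symbol; a nearest-neighbor (maximum-similarity) readout over the known count symbols recovers the discrete answer.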
