Memory Efficient Neural Processes via Constant Memory Attention Block (2305.14567v3)

Published 23 May 2023 in cs.LG and cs.CV

Abstract: Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty. Recent state-of-the-art methods, however, leverage expensive attention mechanisms, limiting their applications, particularly in low-resource settings. In this work, we propose Constant Memory Attentive Neural Processes (CMANPs), an NP variant that only requires constant memory. To do so, we first propose an efficient update operation for Cross Attention. Leveraging the update operation, we propose Constant Memory Attention Block (CMAB), a novel attention block that (i) is permutation invariant, (ii) computes its output in constant memory, and (iii) performs constant computation updates. Finally, building on CMAB, we detail Constant Memory Attentive Neural Processes. Empirically, we show CMANPs achieve state-of-the-art results on popular NP benchmarks while being significantly more memory efficient than prior methods.
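
The core mechanism described in the abstract is an update operation that lets cross attention from a fixed set of latent vectors absorb newly arriving context points without re-attending over the full input set. Below is a minimal sketch of one way such a constant-memory update can be realized with running softmax statistics; the class name, the single-head dot-product form, and the absence of learned key/value projections are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: rolling cross-attention from fixed latent queries
# to a stream of (key, value) pairs, kept in O(num_latents * dim) state.
# Names and the single-head, projection-free form are assumptions, not the
# paper's actual CMAB code.
import numpy as np


class RollingCrossAttention:
    """Cross-attention whose state size is independent of how many
    context points have been folded in so far."""

    def __init__(self, latents: np.ndarray):
        self.q = latents                      # (L, d) fixed latent queries
        L, d = latents.shape
        self.scale = 1.0 / np.sqrt(d)
        self.max_logit = np.full(L, -np.inf)  # running max per query (stability)
        self.denom = np.zeros(L)              # running softmax denominator
        self.numer = np.zeros((L, d))         # running softmax-weighted value sum

    def update(self, keys: np.ndarray, values: np.ndarray) -> None:
        """Fold a new batch of (keys, values) into the running statistics;
        memory cost does not depend on previously seen points."""
        logits = self.q @ keys.T * self.scale           # (L, n_new)
        new_max = np.maximum(self.max_logit, logits.max(axis=1))
        rescale = np.exp(self.max_logit - new_max)      # rescale old statistics
        w = np.exp(logits - new_max[:, None])           # (L, n_new)
        self.denom = self.denom * rescale + w.sum(axis=1)
        self.numer = self.numer * rescale[:, None] + w @ values
        self.max_logit = new_max

    def output(self) -> np.ndarray:
        """Current cross-attention output for every latent query, shape (L, d)."""
        return self.numer / self.denom[:, None]


# Usage: context points arrive in chunks; the result after all chunks equals
# a single full-batch softmax attention over the same data (up to float error).
rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 16))
attn = RollingCrossAttention(latents)
for _ in range(5):
    k = rng.normal(size=(10, 16))
    attn.update(k, k)                # values == keys here just for brevity
print(attn.output().shape)           # (8, 16)
```

Because the running maximum keeps the exponentials numerically stable, the stored numerator, denominator, and max together summarize all context seen so far, which is what makes the per-update memory constant in the number of context points.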
