
Energy-efficiency Limits on Training AI Systems using Learning-in-Memory (2402.14878v2)

Published 21 Feb 2024 in cs.LG, cs.AI, and cs.AR

Abstract: Learning-in-memory (LIM) is a recently proposed paradigm to overcome fundamental memory bottlenecks in training machine learning systems. While compute-in-memory (CIM) approaches can address the so-called memory-wall (i.e. energy dissipated due to repeated memory read access), they are agnostic to the energy dissipated due to repeated memory writes at the precision required for training (the update-wall), and they do not account for the energy dissipated when transferring information between short-term and long-term memories (the consolidation-wall). The LIM paradigm proposes that these bottlenecks, too, can be overcome if the energy barrier of physical memories is adaptively modulated such that the dynamics of memory updates and consolidation match the Lyapunov dynamics of gradient-descent training of an AI model. In this paper, we derive new theoretical lower bounds on energy dissipation when training AI systems using different LIM approaches. The analysis presented here is model-agnostic and highlights the trade-off between energy efficiency and the speed of training. The resulting non-equilibrium energy-efficiency bounds have a similar flavor to Landauer's energy-dissipation bounds. We also extend these limits by taking into account the number of floating-point operations (FLOPs) used for training, the size of the AI model, and the precision of the training parameters. Our projections suggest that the energy-dissipation lower bound for training a brain-scale AI system (comprising $10^{15}$ parameters) using LIM is $10^8 \sim 10^9$ Joules, which is of the same order of magnitude as Landauer's adiabatic lower bound and $6$ to $7$ orders of magnitude lower than the projections obtained using lower bounds for state-of-the-art AI accelerator hardware.
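
To make the scale of these numbers concrete, the following back-of-envelope sketch (not the paper's actual derivation; the temperature and bit-operation count below are illustrative assumptions) scales the Landauer limit of $k_B T \ln 2$ per irreversible bit operation by an assumed total operation count:

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
T_ROOM = 300.0       # assumed operating temperature, K

# Landauer limit: minimum dissipation per irreversible bit operation.
E_BIT = K_B * T_ROOM * math.log(2)   # ~2.87e-21 J

def landauer_bound(num_bit_ops: float, temperature: float = T_ROOM) -> float:
    """Adiabatic (Landauer) lower bound E >= N * k_B * T * ln(2)
    for N irreversible bit operations at temperature T."""
    return num_bit_ops * K_B * temperature * math.log(2)

# Purely illustrative operation count (NOT taken from the paper):
# roughly 3.5e28 irreversible bit operations place the bound at the
# lower end of the 10^8 - 10^9 J range quoted in the abstract.
assumed_bit_ops = 3.5e28

print(f"{E_BIT:.3e} J per bit operation")
print(f"{landauer_bound(assumed_bit_ops):.2e} J total lower bound")
```

Because the bound scales linearly with the assumed operation count and temperature, different assumptions about training FLOPs, parameter count, and precision shift the result proportionally, which is why such projections are naturally expressed as an order-of-magnitude range.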

