The Landscape of Compute-near-memory and Compute-in-memory: A Research and Commercial Overview (2401.14428v1)

Published 24 Jan 2024 in cs.AR

Abstract: In today's data-centric world, where data fuels numerous application domains, with machine learning at the forefront, handling the enormous volume of data efficiently in terms of time and energy presents a formidable challenge. Conventional computing systems and accelerators are continually being pushed to their limits to stay competitive. In this context, computing near-memory (CNM) and computing-in-memory (CIM) have emerged as potentially game-changing paradigms. This survey introduces the basics of CNM and CIM architectures, including their underlying technologies and working principles. We focus particularly on CIM and CNM architectures that have either been prototyped or commercialized. While surveying the evolving CIM and CNM landscape in academia and industry, we discuss the potential benefits in terms of performance, energy, and cost, along with the challenges associated with these cutting-edge computing paradigms.
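To make the CIM principle surveyed here concrete: in a resistive (e.g., ReRAM) crossbar, a matrix-vector multiply happens directly in the memory array, since stored conductances multiply applied voltages (Ohm's law) and bitline currents sum the products (Kirchhoff's current law). The sketch below is a minimal numerical model of that idea, not code from the paper; the function name and values are illustrative only.

```python
import numpy as np

def crossbar_mac(G, V):
    """Toy model of an analog crossbar multiply-accumulate:
    weights stored as conductances G, inputs applied as voltages V,
    and each bitline current is the analog dot product I = G @ V,
    computed in a single parallel step inside the array."""
    return G @ V

G = np.array([[1.0, 0.5],
              [0.2, 0.8]])   # conductances (stored weight matrix)
V = np.array([0.3, 0.6])     # input voltages
I = crossbar_mac(G, V)       # bitline currents
# I[0] = 1.0*0.3 + 0.5*0.6 = 0.6
```

In a real device the result must still be digitized (ADCs) and conductances drift, which is part of the cost/accuracy trade-off the survey discusses.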

Authors (4)
  1. Asif Ali Khan (13 papers)
  2. Hamid Farzaneh (3 papers)
  3. Jeronimo Castrillon (31 papers)
  4. João Paulo C. de Lima (4 papers)
Citations (11)