Towards Efficient and Trustworthy AI Through Hardware-Algorithm-Communication Co-Design (2309.15942v1)
Abstract: AI algorithms based on neural networks have been designed for decades with the goal of maximising some measure of accuracy. This has led to two undesired effects. First, model complexity has risen exponentially when measured in terms of computation and memory requirements. Second, state-of-the-art AI models are largely incapable of providing trustworthy measures of their uncertainty, possibly 'hallucinating' their answers and discouraging their adoption for decision-making in sensitive applications. With the goal of realising efficient and trustworthy AI, in this paper we highlight research directions at the intersection of hardware and software design that integrate physical insights into computational substrates, neuroscientific principles concerning efficient information processing, information-theoretic results on optimal uncertainty quantification, and communication-theoretic guidelines for distributed processing. Overall, the paper advocates for novel design methodologies that target not only accuracy but also uncertainty quantification, while leveraging emerging computing hardware architectures that move beyond the traditional von Neumann digital computing paradigm to embrace in-memory, neuromorphic, and quantum computing technologies. An important overarching principle of the proposed approach is to view the stochasticity inherent in the computational substrate and in the communication channels between processors as a resource to be leveraged for the purpose of representing and processing classical and quantum uncertainty.
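To make the closing principle concrete, the following is a minimal, hypothetical sketch (in NumPy, not taken from the paper) of how hardware stochasticity can serve as a resource for uncertainty quantification: each forward pass reads weights corrupted by device-like Gaussian read noise, and averaging many such noisy passes yields an ensemble predictive distribution whose spread acts as a proxy for model uncertainty. The noise model, the parameter names `read_noise_std` and `n_samples`, and the toy linear classifier are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def noisy_forward(x, W, read_noise_std, rng):
    # One forward pass with additive Gaussian "read noise" on the weights,
    # a crude stand-in for the intrinsic stochasticity of an analog
    # in-memory computing substrate (assumed noise model).
    W_noisy = W + rng.normal(0.0, read_noise_std, size=W.shape)
    return softmax(x @ W_noisy)

def predictive_distribution(x, W, read_noise_std=0.05, n_samples=100):
    # Average many noisy passes: a Monte Carlo ensemble in which the
    # hardware noise, rather than explicit software sampling, supplies
    # the randomness. The mean is the predictive distribution; the
    # per-class spread is a rough uncertainty estimate.
    probs = np.stack([noisy_forward(x, W, read_noise_std, rng)
                      for _ in range(n_samples)])
    return probs.mean(axis=0), probs.std(axis=0)

# Toy usage: a random linear classifier over 4 features and 3 classes.
W = rng.normal(size=(4, 3))
x = rng.normal(size=(1, 4))
mean_p, std_p = predictive_distribution(x, W)
print("predictive mean:", np.round(mean_p, 3))
print("per-class spread:", np.round(std_p, 3))
```

In this view, larger read noise broadens the ensemble and inflates the reported uncertainty, which is the sense in which substrate stochasticity is "leveraged" rather than suppressed; the sketch says nothing about how the paper itself proposes to calibrate or exploit such noise.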