
Toward High Performance, Programmable Extreme-Edge Intelligence for Neuromorphic Vision Sensors utilizing Magnetic Domain Wall Motion-based MTJ (2402.15121v1)

Published 23 Feb 2024 in cs.AR, cs.ET, and eess.IV

Abstract: The desire to empower resource-limited edge devices with computer vision (CV) must overcome the high energy consumption of collecting and processing vast sensory data. To address the challenge, this work proposes an energy-efficient non-von-Neumann in-pixel processing solution for neuromorphic vision sensors employing emerging (X) magnetic domain wall magnetic tunnel junction (MDWMTJ) for the first time, in conjunction with CMOS-based neuromorphic pixels. Our hybrid CMOS+X approach performs in-situ massively parallel asynchronous analog convolution, exhibiting low power consumption and high accuracy across various CV applications by leveraging the non-volatility and programmability of the MDWMTJ. Moreover, our developed device-circuit-algorithm co-design framework captures device constraints (low tunnel-magnetoresistance, low dynamic range) and circuit constraints (non-linearity, process variation, area consideration) based on Monte Carlo simulations and device parameters utilizing GF22nm FD-SOI technology. Our experimental results suggest we can achieve an average of 45.3% reduction in backend-processor energy, maintaining similar front-end energy compared to the state-of-the-art and high accuracy of 79.17% and 95.99% on the DVS-CIFAR10 and IBM DVS128-Gesture datasets, respectively.


Summary

  • The paper presents a processing-in-pixel framework that integrates MDWMTJ devices into neuromorphic vision sensors and reports an average 45.3% reduction in backend-processor energy while keeping front-end energy similar to the state of the art.
  • The architecture supports three analog weight configurations that balance accuracy and efficiency, reaching accuracies of 79.17% on DVS-CIFAR10 and 95.99% on IBM DVS128-Gesture.
  • The device-circuit-algorithm co-design enables post-fabrication reprogrammability, so the same sensor hardware can be retargeted to different edge vision applications.

Enhancing Edge Intelligence with Hybrid CMOS+X Neuromorphic Vision Sensors

Introduction to Hybrid CMOS+X Processing-in-Pixel Architecture

Embedding computer vision directly in edge devices is limited mainly by energy consumption and throughput. Conventional designs separate the sensor hardware from the processing platform, so large volumes of sensory data must be collected and moved before any computation happens, which is too costly for edge computing scenarios. One promising alternative is to add in-pixel processing to neuromorphic vision sensors (NVS). This keeps the low energy consumption and high temporal precision of NVS while cutting the cost of moving and processing that data downstream.

The paper introduces an energy-efficient processing-in-pixel hardware solution for spiking convolutional neural networks (CNNs) aimed at neuromorphic vision applications. It uses a hybrid CMOS+X approach, where X is an emerging device technology: magnetic domain wall magnetic tunnel junctions (MDWMTJs) are combined with conventional CMOS-based neuromorphic pixels to perform in-situ, massively parallel, asynchronous analog convolution with high accuracy and low power consumption.
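To make the idea of asynchronous, event-driven convolution concrete, the following is a minimal software sketch and not the paper's analog circuit. It assumes a stride-1 kernel without padding, a simple accumulate-and-fire rule with reset to zero, and hypothetical names (`Event`, `in_pixel_conv`, `threshold`) that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    polarity: int   # +1 for a brightness increase, -1 for a decrease
    t: float        # timestamp in arbitrary units


def in_pixel_conv(events, kernel, height, width, threshold=1.0):
    """Accumulate kernel-weighted events per output site and emit a spike
    whenever a site's accumulated value reaches `threshold` (assumed rule)."""
    k = len(kernel)
    out_h, out_w = height - k + 1, width - k + 1
    membrane = [[0.0] * out_w for _ in range(out_h)]
    spikes = []
    for ev in events:
        # Visit every output site whose receptive field covers (ev.x, ev.y).
        for oy in range(max(0, ev.y - k + 1), min(out_h, ev.y + 1)):
            for ox in range(max(0, ev.x - k + 1), min(out_w, ev.x + 1)):
                weight = kernel[ev.y - oy][ev.x - ox]
                membrane[oy][ox] += ev.polarity * weight
                if membrane[oy][ox] >= threshold:
                    spikes.append((ox, oy, ev.t))
                    membrane[oy][ox] = 0.0  # reset after firing
    return spikes
```

Each incoming event updates only the output sites it touches, which is the software analogue of computing when events arrive instead of on full frames.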

Key Contributions and Proposed Innovations

The proposed hybrid CMOS+X neuromorphic architecture distinguishes itself through several key innovations and contributions:

  • Introduction of a processing-in-pixel-in-memory (P²M) framework for neuromorphic vision sensors that integrates MDWMTJ devices as core compute elements, providing non-volatile weight storage and programmable computation.
  • Development of three analog weight configurations (CMOS-based, MDWMTJ-based, and hybrid CMOS+X), enabling diverse computation strategies that balance accuracy, energy efficiency, and programmability.
  • Integration of a device-circuit-algorithm co-design framework that encapsulates device and circuit constraints within the algorithmic development process, yielding substantial energy savings and modest accuracy trade-offs under constrained conditions.
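The device constraints named above (low tunnel magnetoresistance and low dynamic range) can be illustrated with a simplified behavioral model. This sketch assumes that the device conductance interpolates linearly between the antiparallel and parallel values as the domain wall moves, and that a weight is programmed by choosing one of a few evenly spaced wall positions. The functions `mtj_conductance` and `program_weight` and all parameter values are hypothetical, not taken from the paper.

```python
def mtj_conductance(position, g_ap, tmr):
    """Conductance for a normalized domain-wall position in [0, 1].

    Assumes a linear blend of the antiparallel (g_ap) and parallel (g_p)
    conductances, with TMR defined as (g_p - g_ap) / g_ap.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be in [0, 1]")
    g_p = g_ap * (1.0 + tmr)
    return position * g_p + (1.0 - position) * g_ap


def program_weight(weight, levels):
    """Snap a weight in [0, 1] to the nearest of `levels` evenly spaced
    domain-wall positions (a stand-in for a limited number of device states)."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    clipped = min(1.0, max(0.0, weight))
    return round(clipped * (levels - 1)) / (levels - 1)


def dynamic_range(tmr):
    """Ratio of largest to smallest conductance under the model above."""
    return 1.0 + tmr
```

Under this model a TMR of 1.0 (100%) gives a conductance ratio of only 2, which is why a training procedure that is aware of the device range matters for accuracy.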

This framework not only demonstrates a reduction in backend processor energy consumption by an average of 45.3% across tested neuromorphic datasets but also maintains competitive accuracies of 79.17% and 95.99% on the DVS-CIFAR10 and IBM DVS128-Gesture datasets, respectively. These results showcase the potential of hybrid CMOS+X architectures in achieving high-performance, programmable extreme-edge intelligence.
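Because the reported 45.3% figure applies to backend-processor energy only, with front-end energy roughly unchanged, the system-level saving depends on how much of the baseline energy the backend accounts for. The short calculation below illustrates this. The function name `overall_saving` and the backend share used in the usage note are hypothetical placeholders, not values from the paper.

```python
def overall_saving(backend_share, backend_reduction=0.453):
    """Fraction of total baseline energy saved when only the backend
    portion shrinks and the front-end energy is assumed unchanged."""
    if not 0.0 <= backend_share <= 1.0:
        raise ValueError("backend_share must be in [0, 1]")
    return backend_share * backend_reduction
```

For instance, if the backend hypothetically consumed 80% of the baseline energy, `overall_saving(0.8)` would give a total saving of about 36%.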

Reprogrammability and Versatility

A distinct feature of the proposed architecture is its reprogrammability and versatility. With hybrid CMOS+X configurations, the system can be fine-tuned after fabrication to suit various applications, a crucial consideration for next-generation neuromorphic computing devices. The ability to reconfigure in-pixel processing parameters post-fabrication allows for broader application potential without requiring hardware redesign or replacement, facilitating adaptation to evolving computational needs.

Technical Evaluation and Device-Circuit-Algorithm Co-Design

The paper evaluates the proposed hardware through HSPICE simulations that incorporate realistic device and circuit constraints, including non-linearity, process variation, and limited tunnel magnetoresistance (TMR). Folding these constraints into the algorithmic evaluation keeps the reported results close to the performance expected from a fabricated system.
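The effect of process variation on an analog multiply-accumulate can be approximated in software with a small Monte Carlo experiment. The sketch below is a simplified stand-in for circuit-level simulation: it assumes each weight is perturbed independently by multiplicative Gaussian noise, and the names `noisy_dot`, `monte_carlo_error`, and `sigma` are hypothetical.

```python
import random
import statistics


def noisy_dot(inputs, weights, sigma, rng):
    """Dot product in which every weight is scaled by (1 + Gaussian noise)."""
    return sum(x * w * (1.0 + rng.gauss(0.0, sigma))
               for x, w in zip(inputs, weights))


def monte_carlo_error(inputs, weights, sigma, trials=1000, seed=0):
    """Return (mean, standard deviation) of the output error over many
    random draws of the assumed per-weight variation."""
    rng = random.Random(seed)  # fixed seed for repeatable runs
    ideal = sum(x * w for x, w in zip(inputs, weights))
    errors = [noisy_dot(inputs, weights, sigma, rng) - ideal
              for _ in range(trials)]
    return statistics.mean(errors), statistics.pstdev(errors)
```

Sweeping `sigma` and feeding the resulting error spread back into training is one way such variation could be captured at the algorithm level.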

Looking Ahead: Future Developments in AI at the Edge

The approach points toward embedding energy-efficient, programmable computing directly within neuromorphic vision sensors. As research in spintronics and post-CMOS technologies progresses, further gains in the power efficiency, speed, and versatility of neuromorphic computing architectures can be expected. Continued development of device-circuit-algorithm co-design methodologies will be important for applying these emerging technologies across a wider range of edge computing applications.

In conclusion, the proposed hybrid CMOS+X neuromorphic architecture is a step toward compact, energy-efficient, and versatile computing for edge devices. By targeting energy consumption and processing throughput, the work provides a foundation for further research on deploying AI in neuromorphic sensors at the edge.
