Trainable Analogue Block (TAB)

Updated 16 April 2026

TAB is a neuromorphic hardware architecture that integrates diverse analog neurons with heterogeneous tuning curves for universal function approximation.
It leverages random device mismatch and systematic voltage offsets to create robust, low-power computation with minimal calibration.
TAB supports both offline batch learning and on-chip online adaptation, achieving low regression error and high throughput in sequence detection.

The Trainable Analogue Block (TAB) is a neuromorphic hardware architecture inspired by the principles of neural population coding in biological nervous systems. It exploits random device mismatch in advanced VLSI processes to instantiate a diverse ensemble of nonlinear encoding neurons, enabling robust, low-power, and highly adaptable computation for regression, classification, and temporal sequence learning. TABs are characterized by large hidden layers of analog “neurons” with heterogeneous tuning curves, systematically offset to guarantee distinct input-output mappings, and a programmable linear readout for training on arbitrary tasks with minimal reliance on precise hardware matching or calibration (Thakur et al., 2015, Thakur et al., 2015, Thakur et al., 2015, Hohenheim et al., 2022).

1. Theoretical Foundations: Population Coding and Random Projections

TAB design is grounded in biological population coding, wherein ensembles of broadly tuned neurons collaborate to encode stimuli. Each neuron’s tuning curve $f_i(x)$ (often tanh or Gaussian) covers a segment of the stimulus space. Heterogeneous tuning curves, achieved in TAB by random device mismatch and systematic voltage offsets, ensure full coverage and decorrelation, so that any target output can be constructed by a linear decoder:

$\hat{y}(x) = \sum_{i=1}^L w_i f_i(x),$

where $L$ is the number of hidden units, $w_i$ are the output weights, and $f_i(x)$ are the hidden activations. In silicon, each tunable “neuron” is implemented by a differential transistor pair, with mismatch in threshold voltage, slope factor, and bias currents setting the diversity of tuning curves (Thakur et al., 2015, Thakur et al., 2015).

Typical activation forms in deployed TAB systems are:

Sigmoidal: $f_i(x) = \tanh(\alpha_i x + \Delta_i + \varepsilon_i)$, where $\alpha_i$ is the gain (from bias/mismatch), $\Delta_i$ is a deliberate systematic offset, and $\varepsilon_i$ is a small random mismatch term.
Gaussian (in theory): $f_i(x) = \exp\left[ -\frac{(x - \mu_i)^2}{2\sigma_i^2} \right]$ (not commonly implemented in silicon hardware, but used in population coding models).

Systematic offsets are typically injected as a stepped reference voltage: $V_{\mathrm{ref},i} = V_{\mathrm{ref},0} + i\,\Delta V$, with the levels $V_{\mathrm{ref},i}$ distributed evenly by a resistive ladder (Thakur et al., 2015). This heterogeneity is a critical enabler for universal approximation with a linear decoder.
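As a minimal sketch of the sigmoidal tuning-curve form above, the following Python (NumPy) snippet builds a population of tanh neurons with evenly stepped offsets and small random perturbations. The function name `tab_hidden_layer`, the nominal gain, the offset span, and the mismatch standard deviations are illustrative assumptions, not values taken from the cited work.

```python
import numpy as np

def tab_hidden_layer(x, n_neurons=64, gain=4.0, offset_span=4.0,
                     gain_sigma=0.1, mismatch_sigma=0.05, seed=0):
    """Return a (len(x), n_neurons) matrix of tanh tuning-curve responses.

    All numeric defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Systematic offsets Delta_i: evenly spaced steps, as a resistive ladder would give.
    delta = np.linspace(-offset_span, offset_span, n_neurons)
    # Random gain spread and offset error stand in for device mismatch.
    alpha = gain * (1.0 + gain_sigma * rng.standard_normal(n_neurons))
    eps = mismatch_sigma * rng.standard_normal(n_neurons)
    # f_i(x) = tanh(alpha_i * x + Delta_i + eps_i), one column per neuron.
    return np.tanh(np.outer(x, alpha) + delta + eps)

x = np.linspace(-1.0, 1.0, 201)
H = tab_hidden_layer(x)
print(H.shape)  # (201, 64)
```

Each column of `H` is one neuron's tuning curve sampled on `x`; the stepped offsets place the zero crossings at different points of the input range, and the random terms keep the curves from being exact shifted copies of one another.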

2. Hardware Architecture and Circuit Implementation

The canonical TAB hardware instantiates three layers: an input layer, a much larger hidden layer, and a linear output layer. Essential hardware elements include:

Hidden (Nonlinear) Neuron Block: An NMOS or CMOS differential pair (M1/M2) operating in weak inversion implements a $\tanh$-like transfer; bias is set by M3. The input and reference voltages give each neuron its own tuning curve location. All transistors are minimum-sized to maximize the process-induced mismatch (Thakur et al., 2015, Thakur et al., 2015).

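The drain currents satisfy:

$I_{1,2} = \frac{I_b}{1 + \exp\left[\mp \kappa (V_{\mathrm{in}} - V_{\mathrm{ref}})/U_T\right]},$

so that $I_1 - I_2 = I_b \tanh\left[\kappa (V_{\mathrm{in}} - V_{\mathrm{ref}})/(2 U_T)\right]$, where $I_b$ is the bias current set by M3, $\kappa$ is the slope factor, and $U_T$ is the thermal voltage.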

Output Weighting Block: A binary-weighted current splitter (“R-2R” tree) controlled by a digital register, allowing programmable weights $w_i$ per neuron at high precision (typically 10–13 bits). Each branch of the splitter is steered by a digital bit to either add or discard the associated fraction of the neuron current, effecting an accurate weighted sum at the output (Thakur et al., 2015, Thakur et al., 2015, Thakur et al., 2015).
Systematic Offsets and Mismatch: Device mismatch is leveraged as a resource, not a defect; systematic voltage offsets ensure the population codes span the stimulus space even in low-mismatch processes (Thakur et al., 2015, Thakur et al., 2015).
Scaling: Physical TAB arrays have been realized with hundreds of neurons and digital weight registers, with minimum per-neuron area and sub-μW power per neuron (Thakur et al., 2015). In smaller prototypes, time-multiplexed measurement across offset settings emulates larger populations (Thakur et al., 2015).
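The two core blocks can be illustrated with a short idealized model. This is a sketch under stated assumptions: the neuron follows the standard exponential weak-inversion law with slope factor `kappa`, thermal voltage `u_t`, and a tail current normalized to 1, and the splitter halves the current perfectly at each stage with MSB-first codes. The function names and default values are hypothetical, and weight sign handling and splitter mismatch are not modeled.

```python
import numpy as np

def neuron_current(v_in, v_ref, i_bias=1.0, kappa=0.7, u_t=0.0258):
    """Differential output I1 - I2 of an ideal weak-inversion differential pair.

    kappa and u_t (volts) are assumed typical values, not measured ones.
    """
    dv = kappa * (np.asarray(v_in, dtype=float) - v_ref) / u_t
    i1 = i_bias / (1.0 + np.exp(-dv))  # branch steered by the input gate
    i2 = i_bias / (1.0 + np.exp(dv))   # branch steered by the reference gate
    return i1 - i2

def splitter_output(i_neuron, code, n_bits=10):
    """Current kept by an ideal binary-weighted splitter for a digital code."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range for the given bit width")
    total = 0.0
    for k in range(1, n_bits + 1):
        bit = (code >> (n_bits - k)) & 1   # MSB first
        total += bit * i_neuron / 2 ** k   # branch k carries I / 2^k
    return total

v_in = np.linspace(0.3, 0.7, 9)
i_diff = neuron_current(v_in, v_ref=0.5)
# The differential current traces a tanh of the input-reference voltage gap.
print(np.allclose(i_diff, np.tanh(0.7 * (v_in - 0.5) / (2 * 0.0258))))  # True

print(splitter_output(1.0, 0b1000000000))  # 0.5 (only the MSB branch is kept)
print(splitter_output(1.0, 2 ** 10 - 1))   # 0.9990234375 (= 1023/1024)
```

In this idealization a 10-bit code scales the neuron current by `code / 1024`, so the weight resolution is one part in 1024 of the full-scale current.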

3. Learning Algorithms: Batch and Online Approaches

TAB supports training by making only the output-layer weights programmable, while all internal random projections and biases remain fixed post-fabrication. Two main training modes are used:

Batch (Offline) Learning:

For regression or classification, collect $N$ input-output pairs $(x_n, y_n)$ and compute the matrix $H \in \mathbb{R}^{N \times L}$ of hidden responses, with $H_{ni} = f_i(x_n)$. Solve the regularized least-squares problem:

$\min_{w} \; \|Hw - y\|_2^2 + \lambda \|w\|_2^2,$

where $y$ stacks the targets and $\lambda \ge 0$ is the regularization strength, leading to the ridge regression solution:

$w = (H^\top H + \lambda I)^{-1} H^\top y,$

or unregularized by Moore–Penrose pseudoinverse $w = H^{+} y$ (Thakur et al., 2015, Thakur et al., 2015). This is equivalent to the Extreme Learning Machine (ELM) paradigm.
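A minimal software sketch of this batch procedure, assuming a simulated tanh hidden layer in place of measured chip responses. The helper name `train_readout`, the gain and offset values, the regularization strength, and the sine target are illustrative assumptions.

```python
import numpy as np

def train_readout(H, y, lam=1e-6):
    """Ridge-regression output weights: solve (H^T H + lam I) w = H^T y."""
    n_hidden = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)                # N = 200 training inputs
L = 50                                         # number of hidden neurons
alpha = 3.0 + 0.3 * rng.standard_normal(L)     # assumed gain spread (mismatch)
delta = np.linspace(-3.0, 3.0, L)              # assumed stepped offsets
H = np.tanh(np.outer(x, alpha) + delta)        # hidden responses, shape (N, L)
y = np.sin(np.pi * x)                          # example regression target

w = train_readout(H, y)
rmse = np.sqrt(np.mean((H @ w - y) ** 2))
print(w.shape, rmse)
```

Solving the regularized normal equations with `np.linalg.solve` avoids forming an explicit inverse. Setting `lam` to zero and using `np.linalg.pinv(H) @ y` would give the pseudoinverse variant; in hardware, the resulting weights would additionally be quantized before being loaded into the weight registers.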

Online (Hardware-Friendly) Learning:

Sign-based Online Learning (SOL) adapts weights incrementally using the sign of the output error and hidden neuron response:

$w_i \leftarrow w_i + \eta\,\operatorname{sign}(e)\,\operatorname{sign}(h_i),$

where $e = y - \hat{y}(x)$ is the output error, $h_i = f_i(x)$ is the hidden activation, and $\eta$ is a normalization parameter. This method, hardware-realized via counters and thresholding of analog currents, enables highly efficient online adaptation with minimal area and energy per update (Thakur et al., 2015).

For the batch mode, training is performed off-chip, and weights are downloaded to shift registers; for SOL, learning can proceed autonomously within the chip.
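The online mode can be sketched in software as follows. This assumes the update takes the sign-sign form $w_i \leftarrow w_i + \eta\,\operatorname{sign}(e)\,\operatorname{sign}(h_i)$, with $e$ the target minus the current output; the function name `sol_step`, the step size, the simulated hidden layer, and the sine target are illustrative assumptions rather than the published circuit behaviour.

```python
import numpy as np

def sol_step(w, h, y_target, eta):
    """Apply one sign-based update and return the new weight vector."""
    e = y_target - w @ h                      # output error for this sample
    return w + eta * np.sign(e) * np.sign(h)  # each weight moves by at most eta

rng = np.random.default_rng(0)
L = 50
alpha = 3.0 + 0.3 * rng.standard_normal(L)    # assumed gain spread (mismatch)
delta = np.linspace(-3.0, 3.0, L)             # assumed stepped offsets

def hidden(x):
    """Hidden activations of all L neurons for a scalar input x."""
    return np.tanh(alpha * x + delta)

w = np.zeros(L)
for epoch in range(20):
    for x_n in rng.uniform(-1.0, 1.0, size=200):
        w = sol_step(w, hidden(x_n), np.sin(np.pi * x_n), eta=1e-3)

x_test = np.linspace(-1.0, 1.0, 101)
H_test = np.tanh(np.outer(x_test, alpha) + delta)
print(np.sqrt(np.mean((H_test @ w - np.sin(np.pi * x_test)) ** 2)))
```

Because only signs are used, each update reduces to incrementing or decrementing a counter per weight. Every step shifts the output at the current sample by $\eta \sum_i |h_i|$ toward the target, so the step size trades convergence speed against the residual error floor.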

4. Experimental Results and Performance Metrics

TAB systems exhibit robust performance in function regression, classification, and temporal pattern recognition. Key empirical results include:

Function Approximation: Regression of $\sin$, a second test function, and $\mathrm{sinc}$ targets with 34–456 hidden neurons demonstrated RMSEs of 0.018 (sin), 0.005 (second function), and 0.035 (sinc) over the tested input range. For more complex tasks, error decreases as the number of hidden neurons $L$ grows, consistent with function approximation theory (Thakur et al., 2015, Thakur et al., 2015).
Impact of Hidden Unit Diversity and Weight Resolution: With hundreds of randomly instantiated neurons, normalized RMS error can reach below 2% for smooth targets; at low weight resolution, quantization error dominates, but 10–13 bits suffice for sub-1% error (Thakur et al., 2015).
Sequence Learning: Two TAB sequence-learner circuits detect and memorize two-pulse temporal patterns (A→B or B→A) with input delays in the nanosecond range. Design A offers retrainability via reset but imposes strict timing constraints; Design B is train-once with a more relaxed delay tolerance. Dynamic power lies between 100 and 241 μW; throughput (in ops/s) and unoptimized circuit area are also reported (Hohenheim et al., 2022).
Learning Algorithm Behavior: SOL converges to low error for both smooth and highly oscillatory functions, with larger hidden layers yielding faster and more reliable convergence (Thakur et al., 2015). Digital TAB (NeuPS) supports high-throughput handwritten digit recognition with up to 95% classification accuracy on MNIST using online sign-based updates.
Fault Tolerance: Device mismatch assures that no two neurons are identical; thus, the loss of a neuron degrades performance minimally, enhancing robustness (Thakur et al., 2015).
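The effect of weight resolution can be explored with a small simulation. This is a sketch assuming signed weights rounded onto a uniform grid scaled to the largest weight magnitude; the function name `quantize_weights`, the simulated hidden layer, and the regularization value are illustrative assumptions, and the printed errors are properties of this toy model rather than chip measurements.

```python
import numpy as np

def quantize_weights(w, n_bits):
    """Round weights onto a symmetric signed uniform grid (n_bits >= 2)."""
    w = np.asarray(w, dtype=float)
    scale = np.max(np.abs(w))
    if scale == 0.0:
        return w.copy()
    step = scale / (2 ** (n_bits - 1) - 1)   # LSB size of the signed code
    return np.round(w / step) * step

# Toy readout trained by ridge regression on simulated tanh neurons.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
L = 50
alpha = 3.0 + 0.3 * rng.standard_normal(L)
delta = np.linspace(-3.0, 3.0, L)
H = np.tanh(np.outer(x, alpha) + delta)
y = np.sin(np.pi * x)
w = np.linalg.solve(H.T @ H + 1e-3 * np.eye(L), H.T @ y)

for n_bits in (4, 6, 8, 10, 13):
    wq = quantize_weights(w, n_bits)
    rmse = np.sqrt(np.mean((H @ wq - y) ** 2))
    print(n_bits, rmse)
```

Each quantized weight is off by at most half a step, and since $|f_i(x)| \le 1$ the output error added by quantization is bounded by $L$ times half a step; this is why the required bit width grows with both hidden-layer size and weight dynamic range.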

5. Applications and Technological Implications

TAB enables a range of mixed-signal and neuromorphic applications:

Analog Front Ends: Post-silicon reconfigurability for precision ADC/DAC blocks, where adaptive population codes can compensate for manufacturing variation and enable tuning to application requirements (Thakur et al., 2015).
Low-power Pattern Recognition: Always-on sensor nodes, portable medical devices, and embedded robotics benefit from the ultra-low standby power and rapid learning cycles associated with TAB (Thakur et al., 2015, Thakur et al., 2015).
Temporal Pattern Recognition: Fast (sub-100 ns) sequence learners for event-driven sensing, where tuning is performed by simple pulse timing rather than algorithmic optimization (Hohenheim et al., 2022).

TAB’s resilience to mismatch allows deployment in advanced process nodes without custom layout or labor-intensive calibration, shortening design cycles (Thakur et al., 2015, Thakur et al., 2015). Potential extension directions identified include on-chip online learning, multi-output/multi-channel TAB arrays, spike-based adaptation, and hardware exploration of alternative nonlinearities (e.g., Gaussian bumps).

6. Comparative Advantages, Limitations, and Future Prospects

Advantages:

Converts random device mismatch into computationally useful diversity.
Requires only linear (digital) learning at the output layer post-fabrication.
Insensitive to process variability, with no need for calibration or area-expensive transistor matching.
Extreme energy and area efficiency (on the order of 1.2 μW per neuron and 1 mm$^2$ for hundreds of neurons).
Universally reconfigurable: a single TAB macro can be trained for arbitrary tasks across process nodes (Thakur et al., 2015, Thakur et al., 2015, Thakur et al., 2015).

Limitations:

Standard (batch-mode) implementations require offline training or external computation; true on-chip online learning is an area of current research.
Representational power is ultimately limited by the linear (single-layer) readout; complex high-dimensional tasks may require larger arrays or layered architectures.
In sequence-learning designs, trade-offs exist between retrainability (Design A) and delay-range robustness (Design B), and analog pulse distortion may obscure some timing intervals (Hohenheim et al., 2022).

Future Directions:

Proposed enhancements include on-chip learning (e.g., LMS, spike-based rules), scalable multi-output architectures, and specialized circuits for spatiotemporal processing (Thakur et al., 2015, Thakur et al., 2015, Hohenheim et al., 2022). Application space encompasses adaptive analog ICs, neuromorphic sensors, and high-throughput reconfigurable pattern recognition under severe power and mismatch constraints. Exploration of new activation functions and deeper network structures for advanced tasks remains an open research avenue.

7. Summary Table: Key TAB Characteristics and Benchmarks

Attribute	Value/Description	Source
Process node	65 nm CMOS	(Thakur et al., 2015, Thakur et al., 2015)
Hidden layer size	34–456 neurons (prototypes); 1k+ possible	(Thakur et al., 2015, Thakur et al., 2015)
Activation nonlinearity	Tanh (differential pair); Gaussian (in theory)	(Thakur et al., 2015)
Output weight precision	10–13 bits (current splitter)	(Thakur et al., 2015, Thakur et al., 2015)
Per-neuron power	On the order of 1.2 μW	(Thakur et al., 2015, Thakur et al., 2015)
Regression RMSE	0.005–0.035 (sin, a second test function, sinc)	(Thakur et al., 2015, Thakur et al., 2015)
Learning mode	Offline batch (pseudoinverse), online SOL	(Thakur et al., 2015, Thakur et al., 2015)
Sequence learning throughput	Reported in ops/s; dynamic power 100–241 μW	(Hohenheim et al., 2022)
Key applications	Mixed-signal analog blocks, always-on pattern recognition, sequence detection	(Thakur et al., 2015, Hohenheim et al., 2022)

TAB exemplifies the integration of biological coding concepts with aggressive semiconductor scaling, harnessing randomness for high-capacity, efficient, and reconfigurable computation across analog and neuromorphic domains.