Deep Differentiable Logic Gate Networks
- Deep Differentiable Logic Gate Networks are ultra-lightweight neural models that replace conventional arithmetic neurons with differentiable, Boolean truth tables.
- They leverage rate coding and continuous relaxations to enable gradient-based training while ensuring ultra-low-power, single-cycle inference on FPGA/ASIC platforms.
- Empirical evaluations, such as on ECG arrhythmia detection, demonstrate competitive 94%+ accuracy with dramatically reduced hardware resources and energy consumption.
Deep Differentiable Logic Gate Networks (LGNs) are a class of ultra-lightweight, reconfigurable neural models that replace conventional arithmetic-based neurons with differentiable, logic-inspired lookup tables—generally implemented directly using FPGA or ASIC logic primitives. These networks offer single-cycle, ultra-low-power inference suitable for edge deployment, wearable, and implantable devices. LGNs operate by structuring each neuron as a Boolean logic gate or truth table, and achieve end-to-end differentiability through careful continuous relaxations during training, enabling backpropagation and gradient-based optimization. This methodology enables deployment on hardware with minimal computational and energy requirements, while maintaining competitive accuracy compared to conventional deep learning architectures (Mommen et al., 16 Jan 2026).
1. Architectural Principles and Mathematical Formulation
Deep Differentiable Logic Gate Networks consist of feed-forward layers, each containing logic-based “neurons.” Each neuron is formulated as an $n$-input, single-output Boolean function. The canonical hardware realization is a multiplexer (MUX), i.e., a truth table with $2^n$ binary addresses. The selection of inputs for each neuron is typically sampled randomly at initialization and remains fixed throughout training. Each output channel aggregates the activity of a group of such neurons (often corresponding to a class in classification tasks) via a population count prior to the readout (Mommen et al., 16 Jan 2026).
Mathematically, the output of a single $n$-input LUT-based neuron is

$$y = \sum_{a=0}^{2^n - 1} w_a \prod_{i=1}^{n} \left[ a_i x_i + (1 - a_i)(1 - x_i) \right],$$

where $w_a$ are the trainable parameters representing the output value for each address $a$, $x_i$ are the (possibly continuous) inputs to the neuron, and $a_i$ is the $i$th bit in the binary representation of $a$ (i.e., the selector for input $x_i$ at address $a$) (Mommen et al., 16 Jan 2026). This expression yields a differentiable mapping from the continuous inputs to the LUT output during training and collapses to a pure MUX (digital logic) at inference by binarizing $w_a$.
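The soft-MUX sum above can be sketched in a few lines of plain Python; the names `lut_neuron`, `x`, and `w` are illustrative, not from the paper:

```python
def lut_neuron(x, w):
    """Differentiable n-input LUT neuron (soft MUX).

    x: length-n list of continuous inputs in [0, 1].
    w: length-2**n list of trainable table entries, one per address a.
    Returns sum_a w_a * prod_i [a_i * x_i + (1 - a_i) * (1 - x_i)],
    a convex combination of the table entries.
    """
    n = len(x)
    y = 0.0
    for a in range(2 ** n):
        match_prob = 1.0                  # P(input bit vector == address a)
        for i in range(n):
            a_i = (a >> i) & 1            # i-th bit of address a
            match_prob *= a_i * x[i] + (1 - a_i) * (1 - x[i])
        y += w[a] * match_prob
    return y
```

With binary inputs the sum collapses to a single table lookup, e.g. `lut_neuron([1, 0], w)` returns `w[1]`, which is the hard-MUX behavior used at inference.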
2. Differentiable Training, Rate Coding, and Binarization
During training, Deep Differentiable LGNs exploit the differentiable, soft assignment induced by the MUX equation above. Each product term in the sum encodes the probability that the input bit vector matches address $a$, turning the entire expression into a convex combination of the stored LUT values $w_a$, which enables gradient propagation with respect to both the LUT values and the continuous inputs (Mommen et al., 16 Jan 2026).
A notable technique for bridging the train-in-continuous and infer-in-discrete regimes is rate coding. Each input value is interpreted not merely as a binary bit, but as a firing probability encoding the expected activation over a temporal or batchwise bit stream. This enables the network to operate on real-valued signals during training while mapping naturally to deterministic binary logic at deployment. The hardware implementation can directly use binary input values or (optionally) a stream of bits whose firing frequency matches the continuous probability, with little degradation in accuracy and minimal hardware overhead. For ECG classification on the MIT-BIH dataset, rate coding with a fixed stream length achieved performance virtually identical to that of full-precision continuous activations (Mommen et al., 16 Jan 2026).
After training, LUT parameters are binarized to hard $\{0, 1\}$ values per neuron, ensuring a tight match with FPGA or ASIC implementations.
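Both steps can be illustrated together. The sketch below binarizes a hypothetical trained 2-input table (the 0.5 threshold, the soft values, and the stream length are assumptions for illustration; the thresholded table here happens to be XOR) and then rate-codes continuous activations as Bernoulli bit streams through the resulting hard MUX:

```python
import random

random.seed(0)

def mux(bits, table):
    """Hard MUX: return the table entry addressed by the binary inputs."""
    addr = sum(b << i for i, b in enumerate(bits))
    return table[addr]

# Hypothetical trained soft table for a 2-input neuron, binarized at 0.5.
w_soft = [0.1, 0.9, 0.8, 0.2]
w_bin = [int(v > 0.5) for v in w_soft]     # -> [0, 1, 1, 0]: an XOR table

# Rate coding: each continuous activation x_i becomes a bit stream firing
# with probability x_i; averaging hard-MUX outputs over the stream
# recovers the soft neuron's expected output.
x = [0.7, 0.2]
T = 20000                                  # stream length (illustrative)
outs = []
for _ in range(T):
    bits = [int(random.random() < p) for p in x]
    outs.append(mux(bits, w_bin))
rate_out = sum(outs) / T                   # ~= 0.7*0.8 + 0.3*0.2 = 0.62
```

The stream average converges to the probability that the binary inputs address a 1-entry of the table, which is exactly the soft-MUX expectation.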
3. Hardware Mapping and Energy Characteristics
Each $n$-input neuron maps to a single multiplexer (MUX), which is a standard primitive on modern FPGAs (e.g., six-input LUTs on Xilinx Artix-7). The entire network is a cascade of such MUXes, each taking inputs from the previous layer and producing one binary output.
Key features include:
- LUT count: For a model with $N$ neurons per layer and $L$ layers, the total logic resource is $N \times L$ times the number of LUTs per neuron. For MIT-BIH ECG classification, only 2000 six-input LUTs per layer sufficed for 94%+ accuracy (Mommen et al., 16 Jan 2026).
- Latency: Inference latency is a single clock cycle per layer, with overall pipeline latency on the order of 10 ns.
- Power Consumption: Total dynamic power at 100 MHz was estimated at 5–7 mW; energy per inference is 50–70 pJ, which is three to six orders of magnitude less than typical deep CNNs or SVMs on the same datasets (Mommen et al., 16 Jan 2026).
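The reported power and energy figures are mutually consistent; as a back-of-envelope check (treating one inference as a single ~10 ns pipeline pass, per the latency reported in the text):

```python
# Energy per inference = dynamic power x inference time.
power_w = [5e-3, 7e-3]             # reported 5-7 mW dynamic power
latency_s = 10e-9                  # ~10 ns per classified sample
energy_pj = [p * latency_s * 1e12 for p in power_w]   # joules -> picojoules
# -> approximately [50.0, 70.0], matching the reported 50-70 pJ per inference
```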
The activity is fully digital, with no multiply-accumulate, accumulation, or high-bit arithmetic units. Post-training, all parameters and pathways are explicitly Boolean, and the only operations required are MUX selection and (optionally) counting for output aggregation.
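A minimal sketch of this post-training regime: each layer is a list of (input-indices, table) pairs, and the readout is a population count over each class's group of output neurons. The wiring and tables below are toy values, not from the paper:

```python
def addr(bits, idx):
    """Pack the selected input bits into a LUT address."""
    return sum(bits[j] << i for i, j in enumerate(idx))

def infer(bits, layers, n_classes):
    """Pure-Boolean inference: cascaded hard-MUX layers, then a
    population count over each class's group of output neurons."""
    for layer in layers:
        bits = [table[addr(bits, idx)] for idx, table in layer]
    group = len(bits) // n_classes
    counts = [sum(bits[c * group:(c + 1) * group]) for c in range(n_classes)]
    return counts.index(max(counts))

# Toy one-layer network: four 2-input neurons, two classes (two neurons each).
layer = [([0, 1], [0, 1, 1, 0]),   # XOR(b0, b1)
         ([2, 3], [0, 0, 0, 1]),   # AND(b2, b3)
         ([0, 2], [1, 0, 0, 1]),   # XNOR(b0, b2)
         ([1, 3], [0, 1, 1, 1])]   # OR(b1, b3)
```

For example, `infer([1, 0, 0, 0], [layer], 2)` selects the class whose neuron group fires most; note that the only operations involved are table lookups and counting, with no arithmetic on weights.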
4. Practical Application and Empirical Results
The primary demonstration to date is inter-patient ECG arrhythmia classification (MIT-BIH arrhythmia database, four-class AAMI stratification), using a two-layer LGN with either 2000 six-input or 3000 four-input LUTs per layer. These models achieved 94.24–94.26% accuracy and index values of 0.643–0.646, using only 2.89k–6.17k FLOPs per inference (including preprocessing and readout) (Mommen et al., 16 Jan 2026). By comparison, state-of-the-art conventional neural and SVM-based approaches consume between three and six orders of magnitude more operations and power. In real hardware, the entire network fits within 2–3k physical LUTs on a Xilinx Artix-7 device at under 7 mW peak dynamic power, and classifies a sample in 10 ns.
These models generalize well across unseen patients (inter-patient splits), with error rates substantially lower than previous logic-gate neural network baselines and even outperforming deep CNNs on this metric.
5. Comparative Assessment with Related Architectures
Deep Differentiable LGNs (as defined here) may be distinguished from:
- Classic weightless neural networks: which lack gradient-based differentiation and typically require full enumeration or evolutionary search to optimize LUT contents.
- LUT networks with continuous or polynomial transfer functions: these use floating-point-valued LUT entries with matching or quantized activations, but may also rely on batch normalization, adder trees, and partial summations, increasing area and energy (Guo, 9 Jun 2025, Andronic et al., 2023).
- Post-training quantization or indirect LUT mapping: where a conventional floating-point DNN is trained and then “folded” into tables, potentially incurring mismatches in accuracy or hardware efficiency (Cardinaux et al., 2018).
The LGN framework distinguishes itself by directly training the MUX (Boolean) functional form, supporting rate coding for continuous-valued input robustness, and precisely matching the deployed hardware structure; thus achieving superior accuracy-efficiency trade-offs in the aforementioned application (Mommen et al., 16 Jan 2026).
6. Limitations, Challenges, and Prospects
The main constraint is scalability: as the number of inputs $n$ increases, the table size per neuron grows exponentially ($2^n$ entries). For hardware-limited designs, small $n$ (such as the four- and six-input neurons used here) balances expressivity with reasonable area. Extremely deep or wide networks may require hierarchical or ensemble compositions, or further advances in hardware-aware regularization and skip-connectivity. While rate coding offers improved robustness, real-time inference is naturally optimized for deterministic, per-bit input streams.
A further concern is the optimization of input selection. Current practice employs fixed random connectivity per neuron, but data-driven or learned input assignment methods may further enhance accuracy and/or resource efficiency. The approach is also inherently binary; extending to multi-bit (quantized) outputs or hybrid logic–arithmetic pipelines may offer enhanced representational power for more complex tasks.
In summary, Deep Differentiable Logic Gate Networks combine the theoretical efficiency of Boolean logic with the practical tractability of modern gradient-based optimization, achieving ultra-low-latency, ultra-low-power, high-accuracy inference for edge sensing tasks such as arrhythmia detection, and providing a template for the next generation of hardware-native neural architectures (Mommen et al., 16 Jan 2026).