
Residue Number System (RNS)

Updated 21 November 2025
  • RNS is a carry-free number system defined over pairwise-coprime moduli, enabling parallel, component-wise arithmetic operations.
  • It leverages modular arithmetic in each residue channel to eliminate carry propagation, leading to constant-time addition, subtraction, and multiplication.
  • The choice of moduli directly affects bit efficiency and dynamic range, making RNS ideal for applications in digital signal processing, cryptographic accelerators, and quantum computing.

A residue number system (RNS) is a non-weighted, carry-free number representation defined over a set of pairwise-coprime moduli, with component-wise, parallelizable arithmetic operations. Formally, each integer $X$ within a dynamic range $[0, M)$, where $M = \prod_{i=1}^k m_i$ and $\{m_i\}$ is a set of pairwise-coprime moduli, is mapped to its residue vector $(r_1, \dots, r_k)$, where $r_i = X \bmod m_i$ (Liu et al., 2020, Dutta et al., 2012). This structure enables highly parallel hardware implementations of addition, subtraction, and multiplication and underpins advanced architectures in cryptography, digital signal processing, deep neural network acceleration, quantum computing, and photonic/analog computing.

1. Formal Definition, Representation, and Reconstruction

Given pairwise-coprime moduli $\{m_1, \dots, m_k\}$, any integer $X \in [0, M)$ is uniquely encoded as

$$X \longleftrightarrow (r_1, \dots, r_k), \qquad r_i = X \bmod m_i.$$

Reconstruction employs the Chinese Remainder Theorem (CRT):

$$X \equiv \sum_{i=1}^k r_i\, M_i\, N_i \pmod{M}, \qquad M_i = M/m_i, \qquad N_i = M_i^{-1} \bmod m_i.$$

This map is bijective by the coprimality constraint, and arithmetic on residue vectors is performed element-wise modulo each $m_i$, so no carries propagate between channels (Demirkiran et al., 2023, Dutta et al., 2012, 0901.1123).
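As a concrete sketch (plain Python with illustrative helper names, not code from the cited works), encoding and CRT reconstruction for a small moduli set might look like:

```python
from math import prod

def to_rns(x, moduli):
    """Encode integer x as its residue vector (r_1, ..., r_k)."""
    return [x % m for m in moduli]

def from_rns(residues, moduli):
    """Reconstruct x from its residues via the Chinese Remainder Theorem."""
    M = prod(moduli)
    x = 0
    for r_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        N_i = pow(M_i, -1, m_i)   # modular inverse (Python 3.8+)
        x += r_i * M_i * N_i
    return x % M

moduli = [3, 5, 7]                # pairwise coprime, dynamic range M = 105
assert to_rns(23, moduli) == [2, 3, 2]
assert from_rns([2, 3, 2], moduli) == 23
```

The round trip is exact for every integer in $[0, M)$, which is precisely the bijectivity property stated above.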

2. Carry-Free Arithmetic: Basic Operations and Advantages

Component-wise modular arithmetic is the central strength of RNS:

  • Addition: $(X+Y) \bmod m_i = (r_i + s_i) \bmod m_i$
  • Subtraction: $(X-Y) \bmod m_i = (r_i - s_i) \bmod m_i$
  • Multiplication: $(X \times Y) \bmod m_i = (r_i \times s_i) \bmod m_i$

This enables constant-time addition, subtraction, and multiplication with no inter-channel carry chain, short per-channel word lengths, and a high degree of hardware parallelism. Component-wise modularity is leveraged in DNN hardware (Liu et al., 2020), cryptographic accelerators (Garg et al., 2016), and high-throughput DSP blocks (Dutta et al., 2012).
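The element-wise rules above can be sketched directly (a minimal software illustration; function names are ours, not from the cited works):

```python
moduli = [3, 5, 7]   # pairwise coprime; dynamic range M = 105

def rns_add(r, s):
    """Channel-wise addition: no information crosses channels."""
    return [(a + b) % m for a, b, m in zip(r, s, moduli)]

def rns_mul(r, s):
    """Channel-wise multiplication: likewise carry-free across channels."""
    return [(a * b) % m for a, b, m in zip(r, s, moduli)]

x, y = 17, 4
rx = [x % m for m in moduli]
ry = [y % m for m in moduli]

# Each residue channel computes independently, matching the integer result mod m_i.
assert rns_add(rx, ry) == [(x + y) % m for m in moduli]
assert rns_mul(rx, ry) == [(x * y) % m for m in moduli]
```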

3. Moduli Selection, Bit Efficiency, and Dynamic Range

The choice of moduli directly determines RNS bit efficiency, area, frequency, and dynamic range:

  • Bit efficiency: for a moduli set $\{m_1, \dots, m_k\}$ with range $M = \prod_i m_i$, the efficiency is $\eta = \log_2 M / \sum_{i=1}^k \lceil \log_2 m_i \rceil$ (Dutta et al., 2012).
  • Canonical sets: $\{2^n-1,\ 2^n,\ 2^n+1\}$ and $\{2^n,\ 2^{2n}-1,\ 2^{2n}+1\}$ offer high dynamic ranges with efficient reverse converters (0901.1123, Dutta et al., 2012).
  • Bit-efficient construction: start with the core three-moduli set and append the smallest coprime values as needed; total slice utilization and critical path improve over classical methods (Dutta et al., 2012).
| Moduli Set | Dynamic Range | Typical Use |
|---|---|---|
| $\{2^n-1,\ 2^n,\ 2^n+1\}$ | $2^n (2^{2n}-1)$ | DSP, DNN, crypto |
| $\{2^n,\ 2^{2n}-1,\ 2^{2n}+1\}$ | $2^n (2^{4n}-1)$ | High dynamic-range systems |
| Custom coprime sets | Custom (product of moduli) | Fault tolerance |

Increasing the number of moduli grows the dynamic range multiplicatively, enabling sub-word residues and low-precision arithmetic for large-range computations (Demirkiran et al., 2023).
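The efficiency metric can be made concrete with a short sketch (illustrative helper name) computing $\eta$ and the dynamic range for the canonical three-moduli set:

```python
from math import prod, log2, ceil

def bit_efficiency(moduli):
    """Return (eta, M) with eta = log2(M) / sum(ceil(log2(m_i)))."""
    M = prod(moduli)
    bits = sum(ceil(log2(m)) for m in moduli)   # residue bits per channel
    return log2(M) / bits, M

n = 8
moduli = [2**n - 1, 2**n, 2**n + 1]   # canonical set {2^n-1, 2^n, 2^n+1}
eta, M = bit_efficiency(moduli)

# Dynamic range matches the table entry 2^n (2^{2n} - 1).
assert M == 2**n * (2**(2 * n) - 1)
```

Note that the $2^n+1$ channel costs $n+1$ residue bits, which is why $\eta < 1$ for this set.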

4. Hardware, Boolean, Photonic, and Quantum Realizations

Hardware Boolean Minimization:

  • Modular reduction and multiplication circuits synthesized using truth-table decomposition, SOP minimization, and combinational AND-OR/XOR gates achieve order-of-magnitude improvements in area and speed over standard EDA flows (Gorodecky et al., 2018).
  • Specialized residue generators, e.g., mod $2^n+1$ with diminished-1 representation, further reduce area and latency, and extend to the conjugate modulus $2^n-1$, yielding bi-residue generators with shared hardware (Piestrak et al., 17 May 2025).
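The diminished-1 idea can be illustrated in software (a sketch of the representation only, not the cited generator design): a nonzero value $x \in [1, 2^n]$ is stored as $x-1$ on $n$ bits, and mod-$(2^n+1)$ addition reduces to an $n$-bit add with an inverted end-around carry. Zero requires a separate flag in real hardware and is excluded here:

```python
def dim1_add(a_star, b_star, n):
    """Add two diminished-1 operands modulo 2**n + 1.

    a_star = a - 1 and b_star = b - 1 with a, b in [1, 2**n].
    Assumes the true sum (a + b) mod (2**n + 1) is nonzero
    (the zero case needs a separate flag, as in hardware designs).
    """
    t = a_star + b_star
    carry = t >> n                      # carry out of the n-bit addition
    # Inverted end-around carry: add 1 only when no carry occurred.
    return (t & ((1 << n) - 1)) + (1 - carry)

n = 4                                   # modulus 2**4 + 1 = 17
a, b = 9, 13
s_star = dim1_add(a - 1, b - 1, n)
assert s_star + 1 == (a + b) % (2**n + 1)   # (9 + 13) mod 17 = 5
```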

Photonic and Optical RNS:

  • RNS digit-wise shifting is mapped to spatial routing in hybrid photonic–plasmonic (HPP) switch networks: each modulus bank realizes its residues spatially via one-hot encoded waveguides, and arithmetic uses cascaded 2×2 HPP switches under static voltage controls (Peng et al., 2017).
  • Wavelength-division multiplexing enables O(100) parallel RNS computations, with CRT reconstruction performed off-chip, achieving sub-20 ps per operation at femtojoule-scale energies.

Quantum RNS:

  • Distributing modular arithmetic across independent residue channels maps naturally onto parallel quantum registers, reducing circuit depth and improving resilience to noise in NISQ-era devices (Gaur et al., 7 Jun 2024, Gaur et al., 21 Jun 2025).

5. Advanced Methods: Multilayer, Redundant, and Fault-Tolerant RNS

Recursive (Multi-layer) RNS:

  • Constructs arbitrary-precision systems by recursively stacking virtual RNS layers, using carry-free Montgomery reduction at each level. The algorithm supports modular operations on RSA-scale (2048+ bit) moduli using only small-modulus arithmetic (e.g., 8 bits) at the hardware level (Hollmann et al., 2018).
  • Layered base extension, pseudo-residue handling, and redundancy ensure correctness and resistance to side-channel attacks.
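The Montgomery step used at each layer can be illustrated with a plain single-channel sketch (standard integer REDC; the cited work distributes this across small residue channels, which is not shown here):

```python
def montgomery_redc(T, N, R, N_prime):
    """REDC: return T * R^{-1} mod N, for 0 <= T < R * N.

    R is a power of two coprime to N, and N_prime = -N^{-1} mod R.
    The division by R is exact, so no trial division by N is needed.
    """
    m = (T * N_prime) % R          # chosen so that T + m*N is divisible by R
    t = (T + m * N) // R
    return t - N if t >= N else t

N = 97                             # odd modulus
R = 128                            # power of two > N, coprime to N
N_prime = (-pow(N, -1, R)) % R

x, y = 45, 67
# Map into the Montgomery domain, multiply, reduce, and map back.
x_bar = (x * R) % N
y_bar = (y * R) % N
prod_bar = montgomery_redc(x_bar * y_bar, N, R, N_prime)
assert montgomery_redc(prod_bar, N, R, N_prime) == (x * y) % N
```

Because REDC avoids division by $N$, a layered RNS system can realize it with carry-free small-modulus operations only.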

Redundant RNS (R-RNS) and Error Correction:

  • Generalizes RNS by adding redundant moduli (e.g., for an RRNS(n,k) code), enabling error detection/correction through majority-voting reconstructions and per-channel correction (Demirkiran et al., 2023).
  • Digit-level redundancy, such as Signed-Digit SD-RNS, provides per-channel, per-digit carry-free addition and multiplication, with constant-time performance for additions and improvements in mixed operations. SD-RNS achieves 1.27× speedup over pure RNS and 2.25× over binary, with energy reductions up to 60% for DNN inference benchmarks (Mousavi et al., 10 Aug 2024).
| Number System | Add Time | Mul Time | Energy | Best Application Scenario |
|---|---|---|---|---|
| BNS (binary) | Highest | Highest | Highest | None |
| RNS | Lowest | Higher | Lower | Addition-dominated workloads |
| SD-RNS | Low | Low | Lowest | Mixed operations, DNN |
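RRNS decoding by majority vote can be sketched by brute force over all $k$-subsets of channels (an illustrative toy decoder; practical implementations use far more efficient algorithms):

```python
from itertools import combinations
from math import prod
from collections import Counter

def crt(residues, moduli):
    """Chinese Remainder Theorem reconstruction for the given channels."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        M_i = M // m
        x += r * M_i * pow(M_i, -1, m)
    return x % M

def rrns_decode(residues, moduli, k):
    """Reconstruct from every k-subset of channels and majority-vote."""
    votes = Counter()
    for idx in combinations(range(len(moduli)), k):
        x = crt([residues[i] for i in idx], [moduli[i] for i in idx])
        votes[x] += 1
    return votes.most_common(1)[0][0]

moduli = [7, 11, 13, 17, 19]       # RRNS(5, 3): any 3 channels determine X
X = 500                            # within the legitimate range [0, 7*11*13)
residues = [X % m for m in moduli]
residues[2] = (residues[2] + 4) % moduli[2]   # corrupt one channel
assert rrns_decode(residues, moduli, k=3) == X
```

All subsets avoiding the corrupted channel agree on the true value, so a single channel error is outvoted.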

6. RNS in Contemporary Computing: DNN Acceleration, High-Dimensional Methods, and Applications

Deep Learning Acceleration:

  • Large-tile Winograd convolution layers in quantized DNNs are accelerated by performing entire Winograd transformations in RNS, with all intermediate arithmetic in 8 or 16 bits and no loss in accuracy (Liu et al., 2020).
  • Analog/photonic DNN accelerators leverage RNS to decompose high-precision dot products into multiple concurrent low-precision MAC arrays, eliminating energy-prohibitive high-precision ADCs, achieving ≥99% of FP32 accuracy with 6-bit ADCs and 10²–10⁶× energy reduction (Demirkiran et al., 2023, Demirkiran et al., 2023).
  • Photonic tensor cores use RNS to realize modular arithmetic directly in phase, enabling high-speed (10 GHz) dot-products with only 5–6 bit conversion, achieving up to 23.8× throughput and 32.1× energy-delay-product gains over CMOS systolic arrays (Demirkiran et al., 2023).
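The decomposition behind these accelerators can be sketched in software: one high-precision dot product becomes several independent low-precision dot products, one per residue channel, with a single CRT reconstruction at the end (helper names are illustrative):

```python
from math import prod

def rns_dot(a, b, moduli):
    """Per-channel low-precision dot products, then CRT reconstruction.

    Valid as long as the true dot product fits in [0, prod(moduli)).
    """
    M = prod(moduli)
    # Each channel accumulates entirely within its small modulus.
    channel_sums = [
        sum((x % m) * (y % m) for x, y in zip(a, b)) % m
        for m in moduli
    ]
    x = 0
    for r, m in zip(channel_sums, moduli):
        M_i = M // m
        x += r * M_i * pow(M_i, -1, m)
    return x % M

moduli = [251, 241, 239]           # small coprime moduli, M > result range
a = [120, 37, 255, 98]
b = [64, 210, 33, 77]
assert rns_dot(a, b, moduli) == sum(x * y for x, y in zip(a, b))
```

No channel ever handles a value wider than its small modulus, which is what lets analog/photonic designs use low-precision converters.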

Hyperdimensional and Neuromorphic Computing:

  • RNS is mapped into high-dimensional phasor/complex vector representations, supporting additive and multiplicative binding as Hadamard/phasor operators. Decoding employs resonator networks exploiting the RNS factor structure, yielding exponential dynamic range versus memory (Kymn et al., 2023).
  • These frameworks replicate grid-cell–like coding, solve NP-hard problems such as subset sum, and provide robust, noise-tolerant representations for machine learning tasks.
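The additive binding property can be sketched numerically in a simplified form, using a single unit phasor per modulus channel rather than full high-dimensional vectors (helper names are ours, not from the cited work):

```python
import cmath

moduli = [3, 5, 7]

def phasor_encode(x, moduli):
    """One unit phasor per channel: exp(2*pi*i * (x mod m) / m)."""
    return [cmath.exp(2j * cmath.pi * (x % m) / m) for m in moduli]

def hadamard(u, v):
    """Element-wise (Hadamard) product -- additive binding of encoded values."""
    return [a * b for a, b in zip(u, v)]

# Multiplying phasor codes adds the encoded values channel-wise mod m_i.
x, y = 4, 9
bound = hadamard(phasor_encode(x, moduli), phasor_encode(y, moduli))
target = phasor_encode(x + y, moduli)
assert all(abs(p - q) < 1e-9 for p, q in zip(bound, target))
```

Because phase adds under multiplication, the code for $x+y$ is exactly the Hadamard product of the codes for $x$ and $y$, mirroring carry-free RNS addition.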

7. Theoretical and Practical Implications

  • RNS arithmetic eliminates the carry chain, facilitating massive hardware parallelism and constant-time arithmetic at all bit-widths.
  • Hardware realizations report up to 30× higher speed and 15× area reduction versus standard synthesis flows, and substantially improved fault tolerance when enhanced with digit- or modulus-level redundancy (Gorodecky et al., 2018, Mousavi et al., 10 Aug 2024).
  • In analog and photonic systems, RNS decouples per-channel converter precision from overall accuracy, allowing precise DNN training and inference at minimal data-converter energy (Demirkiran et al., 2023, Demirkiran et al., 2023).
  • In quantum computing, parallel distribution of RNS residue operations reduces circuit depth and enhances resilience to noise, offering a practical path to scalable quantum arithmetic in the NISQ era (Gaur et al., 7 Jun 2024, Gaur et al., 21 Jun 2025).

A plausible implication is that RNS, especially when combined with redundancy or implemented in non-traditional substrates, is positioned as a foundational mechanism for future highly parallel, energy-efficient, and noise-resilient arithmetic across digital, analog, photonic, and quantum computing platforms.
