
Normalized Base-2 Encoding (NB2E)

Updated 13 December 2025
  • Normalized Base-2 Encoding (NB2E) is a deterministic method that converts continuous scalar values into fixed-length binary vectors via hierarchical dyadic step functions.
  • NB2E outperforms traditional Fourier and raw encodings by preserving phase and amplitude during extrapolation beyond the training domain.
  • Its implementation leverages normalized inputs and bit-level precision, enabling standard MLPs to reliably extend periodic function modeling outside observed ranges.

Normalized Base-2 Encoding (NB2E) is a deterministic method for encoding continuous scalar values as fixed-length binary vectors, designed to enable vanilla multi-layer perceptrons (MLPs) to extrapolate periodic functions well beyond their training range without requiring prior knowledge of functional form. NB2E leverages hierarchical, power-of-two “step” basis functions and bit-wise positional encoding—transforming real scalars into bit vectors whose structure aligns with the phase of periodic signals. This methodology has demonstrated robust out-of-distribution extrapolation on a diverse set of periodic functions, outperforming established encodings such as Fourier features in terms of phase and amplitude preservation outside the training interval (Powell et al., 11 Dec 2025).

1. Mathematical Formulation

Let $x$ denote a real scalar input. NB2E first normalizes $x$ by a positive constant $z$:

$$x' = \frac{x}{z}, \qquad x' \in [0,1)$$

Given an integer precision parameter $N$, the encoding $\gamma(x)$ is a length-$N$ binary vector $[B_1, B_2, \dots, B_N]$, where

$$x' = \sum_{i=1}^{N} B_i 2^{-i} \qquad\Longleftrightarrow\qquad B_i = \left\lfloor 2^i x' \right\rfloor - 2 \left\lfloor 2^{i-1} x' \right\rfloor, \quad i = 1, \ldots, N$$

As $N \to \infty$, NB2E encodes every point in the unit interval exactly; practical use cases employ $N = 32$–$48$ for double-precision resolution. Each $B_i \in \{0,1\}$ is the $i$-th bit in the binary expansion of $x'$, with bit $B_i$ switching value at dyadic intervals of width $2^{-i}$.
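
As a concrete illustration, here is a minimal scalar implementation of this bit extraction in Python; the function and argument names (`nb2e_bits`, `z`, `n_bits`) are illustrative, not taken from the paper.

```python
import math

def nb2e_bits(x: float, z: float, n_bits: int = 32) -> list:
    """Return the first n_bits bits of the binary expansion of x' = x / z."""
    x_prime = x / z  # normalize into [0, 1); assumes 0 <= x < z
    return [math.floor(2**i * x_prime) - 2 * math.floor(2**(i - 1) * x_prime)
            for i in range(1, n_bits + 1)]

# x' = 0.625 = 0.101 in binary, so the leading bits are 1, 0, 1, then zeros.
print(nb2e_bits(6.25, 10.0, n_bits=8))  # [1, 0, 1, 0, 0, 0, 0, 0]
```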

2. Motivations and Theoretical Rationale

NB2E is motivated by limitations in traditional positional and Fourier encodings. Classical approaches such as

$$\gamma_{\rm FFE}(x) = [\sin(2\pi x),\, \cos(2\pi x),\, \sin(4\pi x),\, \dots]$$

decompose $x$ across continuous frequencies but fail to support extrapolation of unknown periodicities outside the training window. In contrast, NB2E provides a hierarchical, dyadic decomposition in which each bit $B_i$ is a step function alternating on dyadic intervals of width $2^{-i}$, localizing phase information at all scales. The normalization to $[0,1)$ is necessary to maintain consistent bit semantics across the domain and to ensure each bit attains both values during training. This structure enables neural networks to recognize and generalize repeating bit patterns far beyond the observed data, as the sketch below illustrates.
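
A small demonstration of this dyadic step structure (illustrative, not from the paper): bit $i$, computed as $\lfloor 2^i x' \rfloor \bmod 2$, which is equivalent to the floor-difference formula above, alternates on intervals of width $2^{-i}$ and repeats its pattern in every window of width $2^{-(i-1)}$.

```python
import math

def bit(i: int, x_prime: float) -> int:
    # floor(2^i x') - 2 floor(2^(i-1) x') equals floor(2^i x') mod 2
    return math.floor(2**i * x_prime) % 2

for i in (1, 2, 3):
    print(f"bit {i}:", [bit(i, k / 16) for k in range(16)])
# bit 1: eight 0s then eight 1s          (switches once, at x' = 0.5)
# bit 2: 0,0,0,0,1,1,1,1 repeated twice  (switches every 0.25)
# bit 3: 0,0,1,1 repeated four times     (switches every 0.125)
```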

3. Algorithmic Description and Implementation

The NB2E encoding algorithm is as follows. Given input $(x, z, N)$:

  1. Compute $x' \leftarrow x / z$
  2. For each $i = 1$ to $N$:
    • $u_i \leftarrow \left\lfloor 2^i x' \right\rfloor$
    • $u_{i-1} \leftarrow \left\lfloor 2^{i-1} x' \right\rfloor$
    • $B_i \leftarrow u_i - 2u_{i-1}$
  3. Return $\gamma(x) = [B_1, \ldots, B_N]$

This process is compactly expressed as

$$\gamma(x)_i = \left\lfloor 2^i (x/z) \right\rfloor - 2 \left\lfloor 2^{i-1} (x/z) \right\rfloor, \quad i = 1, \ldots, N$$
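
A vectorized NumPy sketch of this compact form (an implementation choice; the paper does not prescribe a library):

```python
import numpy as np

def nb2e(x, z: float, n_bits: int = 32) -> np.ndarray:
    """Map an array of scalars to a (len(x), n_bits) binary feature matrix."""
    i = np.arange(1, n_bits + 1)                        # bit indices 1..N
    x_prime = (np.asarray(x, dtype=np.float64) / z)[:, None]
    return np.floor(x_prime * 2.0**i) - 2.0 * np.floor(x_prime * 2.0**(i - 1))

print(nb2e([6.25], z=10.0, n_bits=8)[0])  # x' = 0.625 -> [1. 0. 1. 0. 0. 0. 0. 0.]
```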

4. Empirical Behavior and Quantitative Results

NB2E has been evaluated on a suite of periodic functions:

  • $f(x) = \sin(x)$
  • $f(x) = \sin(x) + 2.5\sin(1.4x + 3.7)$
  • $f(x) = 2\left(\frac{x}{3.1} - \left\lfloor \frac{x}{3.1} + 0.5 \right\rfloor\right) + 4\left|\frac{x}{5} - \left\lfloor \frac{x}{5} + 0.5 \right\rfloor\right| - 1$ (a sawtooth plus triangle composite; see the code sketch after this list)
  • A composite of beats, exponential decay, and square wave (see Appendix A of (Powell et al., 11 Dec 2025))
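
The first three benchmark targets above, written out in NumPy for reference (the fourth composite is defined in Appendix A of the paper and omitted here):

```python
import numpy as np

def f1(x):
    return np.sin(x)

def f2(x):
    return np.sin(x) + 2.5 * np.sin(1.4 * x + 3.7)

def f3(x):
    # sawtooth of period 3.1 plus triangle wave of period 5, shifted down by 1
    saw = x / 3.1 - np.floor(x / 3.1 + 0.5)
    tri = np.abs(x / 5 - np.floor(x / 5 + 0.5))
    return 2 * saw + 4 * tri - 1
```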

Experimental setup: $10^4$ i.i.d. samples $x \sim \mathcal{U}[0, X_{\max}]$, with training on $x' \le 0.7$ (normalized) and testing on $0.7 < x' < 1$ (pure extrapolation); a sketch of this split follows the results table. Several encoding/architecture baselines were assessed:

| Input Encoding | In-Domain MAE | Extrapolation MAE | Behavior Outside Training |
|---|---|---|---|
| Raw $x'$ | $\gg 1$ | N/A | Fails on both train and test |
| Fixed Fourier (FFE) | $\sim 10^{-3}$ | $\sim 50\%$ drift | Loses phase and amplitude alignment |
| NB2E | $\sim 10^{-3}$ | $10^{-3}$–$10^{-2}$ | Preserves phase and amplitude |
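
A sketch of the sampling and split described above, reusing the `nb2e` helper from Section 3; the seed, $X_{\max}$ value, and choice of target are illustrative assumptions.

```python
import numpy as np

def nb2e(x, z, n_bits=32):  # as sketched in Section 3
    i = np.arange(1, n_bits + 1)
    x_prime = (np.asarray(x, dtype=np.float64) / z)[:, None]
    return np.floor(x_prime * 2.0**i) - 2.0 * np.floor(x_prime * 2.0**(i - 1))

rng = np.random.default_rng(0)
X_MAX = 20 * np.pi                            # assumed span covering several periods
x = rng.uniform(0.0, X_MAX, size=10_000)      # 10^4 i.i.d. samples on [0, X_max]
x_prime = x / X_MAX                           # normalized with z = X_max

f = np.sin                                    # one of the benchmark targets
train, test = x_prime <= 0.7, x_prime > 0.7   # pure extrapolation past 0.7
X_train, y_train = nb2e(x[train], X_MAX), f(x[train])
X_test, y_test = nb2e(x[test], X_MAX), f(x[test])
```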

NB2E-MLP configurations use five dense hidden layers (each with 512 ELU units and $L_2$ regularization $\lambda = 10^{-4}$), the AdamW optimizer with cosine annealing, batch size $1000$, and $4000$ epochs. NB2E MLPs maintained signal period and amplitude well beyond the training interval, where other encodings drifted significantly or failed outright.
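
A PyTorch sketch matching the quoted configuration; the learning rate is an assumption, and AdamW's decoupled weight decay stands in for the paper's $L_2$ term.

```python
import torch
import torch.nn as nn

n_bits = 32
layers, in_dim = [], n_bits
for _ in range(5):                              # five hidden layers, 512 ELU units each
    layers += [nn.Linear(in_dim, 512), nn.ELU()]
    in_dim = 512
layers.append(nn.Linear(512, 1))                # scalar regression head
model = nn.Sequential(*layers)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3,  # lr assumed
                              weight_decay=1e-4)            # lambda = 1e-4
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=4000)
loss_fn = nn.L1Loss()                           # MAE, matching the reported metric
```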

5. Internal Representation and Bit-Phase Dynamics

Analysis of internal activations was conducted by recording the 512-dimensional hidden representations of the NB2E-MLP over a full period of $\sin(x)$. Dimensionality reduction via UMAP followed by DBSCAN clustering revealed that hidden representations form clusters corresponding to both:

  • The phase of the periodic target function.
  • Local bit patterns (e.g., “bits 5–7”).

Each cluster spans a narrow phase interval but is distributed over the full positional axis; beyond $x' > 0.7$, the model continues to assign samples to the correct bit-phase clusters, reconstructing $\sin(x)$ in proper phase. For complex composite signals, the network’s representations form higher-dimensional “chart” embeddings, effectively factorizing phase and bit pattern for each periodic component.
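
A sketch of such an analysis using the umap-learn and scikit-learn packages; this is not the authors' exact pipeline, and the activation file and clustering parameters are placeholders.

```python
import numpy as np
import umap                           # from the umap-learn package
from sklearn.cluster import DBSCAN

# hidden: (n_samples, 512) activations of the last hidden layer, recorded over
# one period of sin(x); capturing them (e.g. via a forward hook) is omitted.
hidden = np.load("hidden_activations.npy")  # placeholder path

embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(hidden)
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(embedding)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 marks noise
print(f"{n_clusters} bit-phase clusters found")
```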

6. Limitations and Extensions

Key limitations:

  • Training must encompass several complete periods (at least 5–7 cycles) for robust learning of bit-phase dynamics.
  • The greatest normalized training input, $x'_{\max}$, must exceed $0.5$ (for bit 1), $0.75$ (for bit 2), and so on; otherwise some bits never switch during training, producing phase resets at unseen dyadic boundaries (see the check after this list).
  • Extrapolation is constrained to dyadic boundaries up to $x' < 1$; structure outside this interval is not guaranteed.
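
The bit-coverage condition in the second limitation can be checked directly; a minimal illustration (not from the paper):

```python
import math

def bit(i: int, x_prime: float) -> int:
    return math.floor(2**i * x_prime) % 2

# With training confined to x' < 0.5, bit 1 is constantly 0: the network
# never observes it switch, so the 0 -> 1 transition at x' = 0.5 is unseen.
train_xs = [k / 100 for k in range(50)]            # x' in [0, 0.5)
print({bit(1, xp) for xp in train_xs})             # {0}
print({bit(1, xp) for xp in train_xs + [0.6]})     # {0, 1} once x'_max > 0.5
```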

Potential extensions identified include:

  • Learning relative bit-differences (similar to relative positional encodings).
  • Integrating NB2E into sequence modeling architectures (e.g., RNNs, transformers), especially for irregular time series.
  • Extending to multi-dimensional input spaces via interleaved binary grids.

7. Comparison to Other Encodings and Model Classes

NB2E differs from other input encodings in several key respects:

  • Fourier encodings provide the same power-of-two frequency scales but, being continuous, lack the discrete positional landmarks needed for out-of-domain phase matching.
  • Random Fourier features create kernel approximations but are not suited for deterministic phase extrapolation.
  • Symbolic regression and physics-informed neural networks (PINNs) demand either known functional forms or explicit structure search.
  • SIREN (sinusoidal activations) can be combined with NB2E, a combination yielding lower mean absolute error (MAE $\sim 10^{-4}$) in preliminary testing, though comprehensive analysis is pending.

NB2E’s transformation of scalars into a structured binary hierarchy allows standard MLPs to “factor” phase information, enabling interpolation and projection beyond the observed domain. In contrast, continuous and Fourier encodings do not induce the necessary discrete synchrony for extrapolatory phase alignment (Powell et al., 11 Dec 2025).

References

  1. Powell et al., 11 Dec 2025.
