Normalized Base-2 Encoding (NB2E)
- Normalized Base-2 Encoding (NB2E) is a deterministic method that converts continuous scalar values into fixed-length binary vectors via hierarchical dyadic step functions.
- NB2E outperforms traditional Fourier and raw encodings by preserving phase and amplitude during extrapolation beyond the training domain.
- Its implementation leverages normalized inputs and bit-level precision, enabling standard MLPs to reliably model periodic functions outside the observed range.
Normalized Base-2 Encoding (NB2E) is a deterministic method for encoding continuous scalar values as fixed-length binary vectors, designed to enable vanilla multi-layer perceptrons (MLPs) to extrapolate periodic functions well beyond their training range without requiring prior knowledge of functional form. NB2E leverages hierarchical, power-of-two “step” basis functions and bit-wise positional encoding—transforming real scalars into bit vectors whose structure aligns with the phase of periodic signals. This methodology has demonstrated robust out-of-distribution extrapolation on a diverse set of periodic functions, outperforming established encodings such as Fourier features in terms of phase and amplitude preservation outside the training interval (Powell et al., 11 Dec 2025).
1. Mathematical Formulation
Let $x \in \mathbb{R}$ denote a real scalar input. NB2E first normalizes $x$ by a positive constant $N$, chosen so that all inputs of interest satisfy $0 \le x' < 1$:

$$x' = \frac{x}{N}.$$
Given an integer precision parameter $B$, the encoding is a length-$B$ binary vector $\mathbf{b}(x') = (b_1, b_2, \ldots, b_B)$, where

$$b_k = \left\lfloor 2^k x' \right\rfloor \bmod 2, \qquad k = 1, \ldots, B.$$
As $B \to \infty$, NB2E encodes every point in the unit interval exactly; practical use cases employ $B$ up to $48$ for double-precision resolution. Each $b_k$ corresponds to the $k$-th bit in the binary expansion of $x'$, with bit $k$ switching value at dyadic intervals of width $2^{-k}$.
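For concreteness (the value here is chosen purely for illustration), take $x' = 0.40625 = 0.01101_2$ and $B = 5$; applying the bit formula gives

$$b_k = \left\lfloor 2^k \cdot 0.40625 \right\rfloor \bmod 2 \quad \Longrightarrow \quad \mathbf{b} = (0, 1, 1, 0, 1),$$

recovering exactly the first five digits of the binary expansion.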
2. Motivations and Theoretical Rationale
NB2E is motivated by limitations in traditional positional and Fourier encodings. Classical approaches such as

$$\gamma(x) = \big(\sin(2^0 \pi x), \cos(2^0 \pi x), \ldots, \sin(2^{L-1} \pi x), \cos(2^{L-1} \pi x)\big)$$

decompose $x$ across continuous frequencies but fail to support extrapolation of unknown periodicities outside the training window. In contrast, NB2E provides a hierarchical, dyadic decomposition wherein bit $k$ represents a step function of period $2^{1-k}$, localizing phase information at all scales. The normalization to $[0, 1)$ is necessary to maintain consistent bit semantics across the domain and to ensure each bit attains both values during training. This structure enables neural networks to recognize and generalize repeating bit patterns far beyond the observed data.
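A minimal numerical check of this dyadic step structure (a sketch, not code from the paper; `nb2e_bit` is an illustrative name):

```python
import numpy as np

def nb2e_bit(x_norm, k):
    """k-th NB2E bit of normalized input: a square wave of period 2**(1 - k)."""
    return np.floor(2**k * x_norm).astype(int) % 2

x = np.arange(0.0, 1.0, 0.125)  # 0, 0.125, ..., 0.875
for k in (1, 2, 3):
    print(f"bit {k}:", nb2e_bit(x, k))
# bit 1: [0 0 0 0 1 1 1 1]  -> switches at 0.5
# bit 2: [0 0 1 1 0 0 1 1]  -> switches every 0.25
# bit 3: [0 1 0 1 0 1 0 1]  -> switches every 0.125
```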
3. Algorithmic Description and Implementation
The NB2E encoding algorithm is as follows. Given input $x$:
- Compute $x' = x / N$.
- For each $k = 1$ to $B$: set $b_k = \lfloor 2^k x' \rfloor \bmod 2$.
- Return $\mathbf{b} = (b_1, \ldots, b_B)$.

This process is compactly expressed as

$$\mathrm{NB2E}_B(x) = \Big( \big\lfloor 2^k x / N \big\rfloor \bmod 2 \Big)_{k=1}^{B}.$$
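A minimal reference implementation of this procedure (a sketch following the formulation above; the argument names and the example values of $N$ and $B$ are assumptions):

```python
import numpy as np

def nb2e_encode(x, N, B=48):
    """Encode scalar(s) x as length-B NB2E bit vectors.

    x : scalar or array of real inputs
    N : positive normalization constant with 0 <= x / N < 1
    B : number of bits (precision)
    Returns an integer array of shape (..., B) with entries in {0, 1}.
    """
    x_norm = np.asarray(x, dtype=np.float64) / N
    ks = np.arange(1, B + 1)                        # bit indices 1..B
    scaled = x_norm[..., None] * (2.0 ** ks)        # 2^k * x'
    return np.floor(scaled).astype(np.int64) % 2    # b_k = floor(2^k x') mod 2

# Example: encode a batch of inputs on [0, 10) with N = 10
bits = nb2e_encode(np.array([0.0, 2.5, 7.3]), N=10.0, B=8)
print(bits)  # row for x = 2.5 -> x' = 0.25 -> [0 1 0 0 0 0 0 0]
```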
4. Empirical Behavior and Quantitative Results
NB2E has been evaluated on a suite of periodic functions, including a composite of beats, exponential decay, and a square wave (see Appendix A of (Powell et al., 11 Dec 2025)).
Experimental setup: i.i.d. input samples, with training restricted to $0 \le x' < 0.7$ (normalized) and testing on $0.7 < x' < 1$ (pure extrapolation). Several encoding/architecture baselines were assessed (a data-generation sketch follows the table):
| Input Encoding | In-Domain MAE | Extrapolation MAE | Behavior Outside Training |
|---|---|---|---|
| Raw | N/A | N/A | Fails on both train and test |
| Fixed Fourier (FFE) | – | – (drift) | Loses phase and amplitude alignment |
| NB2E | – | – | Preserves phase and amplitude |
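A sketch of the data-generation protocol as described above (the sample count and normalization constant are assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 10.0                                  # assumed normalization constant
x = rng.uniform(0.0, N, size=100_000)     # i.i.d. samples (count assumed)
x_norm = x / N

train = x[x_norm < 0.7]                   # training region: 0 <= x' < 0.7
test = x[x_norm >= 0.7]                   # extrapolation region: 0.7 <= x' < 1
```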
NB2E-MLP configurations use five dense hidden layers (512 ELU units each, with weight regularization), the AdamW optimizer with cosine annealing, batch size $1000$, and $4000$ epochs. NB2E MLPs maintained signal period and amplitude well beyond the training interval, where other encodings drifted significantly or failed outright.
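A configuration sketch matching this description (PyTorch is an assumption, as are the learning rate and weight-decay coefficient; the paper's framework and regularization value are not reproduced here):

```python
import torch
import torch.nn as nn

B = 48  # NB2E input width in bits

# Five dense hidden layers of 512 ELU units, as described above.
model = nn.Sequential(
    nn.Linear(B, 512), nn.ELU(),
    nn.Linear(512, 512), nn.ELU(),
    nn.Linear(512, 512), nn.ELU(),
    nn.Linear(512, 512), nn.ELU(),
    nn.Linear(512, 512), nn.ELU(),
    nn.Linear(512, 1),
)

# AdamW with cosine annealing over 4000 epochs.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=4000)
loss_fn = nn.L1Loss()  # MAE, matching the reported metric
```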
5. Internal Representation and Bit-Phase Dynamics
Analysis of internal activations was conducted by recording the hidden representations (512-dimensional) of the NB2E-MLP over a full period of the target signal. Dimensionality reduction via UMAP and clustering with DBSCAN revealed that hidden representations form clusters corresponding to both:
- The phase of the periodic target function.
- Local bit patterns (e.g., “bits 5–7”).
Each cluster spans a narrow phase interval but is distributed over the full positional axis; beyond the training interval, the model continues to assign samples to the correct bit-phase clusters, reconstructing the signal in proper phase. For complex composite signals, the network’s representations form higher-dimensional “chart” embeddings, effectively factorizing phase and bit pattern for each periodic component.
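A sketch of this analysis pipeline, assuming the `umap-learn` and `scikit-learn` packages; the activation file, `eps`, and `min_samples` values are hypothetical:

```python
import numpy as np
import umap                        # umap-learn package
from sklearn.cluster import DBSCAN

# H: (n_samples, 512) hidden activations recorded from the trained
# NB2E-MLP over one period of the target signal (extraction of H is
# model-specific and omitted here).
H = np.load("hidden_activations.npy")  # hypothetical file

emb = umap.UMAP(n_components=2, random_state=0).fit_transform(H)
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(emb)
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
```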
6. Limitations and Extensions
Key limitations:
- Training must encompass several complete periods (at least 5–7 cycles) for robust learning of bit-phase dynamics.
- The greatest normalized training input, $x'_{\max}$, must exceed $0.5$ (for bit 1), $0.75$ (for bit 2), and so on; otherwise, some bits never switch, producing phase resets at dyadic boundaries unseen during training.
- Extrapolation is constrained to the nearest dyadic boundary, at most up to the normalization limit $x' = 1$; structure outside this interval is not guaranteed (a diagnostic sketch follows this list).
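Given these constraints, a quick diagnostic can report which bits never switch over a training range (a sketch following the bit formula of Section 1; `unswitched_bits` is an illustrative name):

```python
import numpy as np

def unswitched_bits(x_train_norm, B=16):
    """Return bit indices k whose value never changes over the normalized
    training inputs; such bits reset phase at unseen dyadic boundaries."""
    ks = np.arange(1, B + 1)
    bits = np.floor(x_train_norm[:, None] * 2.0 ** ks).astype(int) % 2
    return ks[bits.min(axis=0) == bits.max(axis=0)]

x_train = np.linspace(0.0, 0.4, 1000)  # training only covers x' < 0.4
print(unswitched_bits(x_train))        # bit 1 never switches: [1]
```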
Potential extensions identified include:
- Learning relative bit-differences (similar to relative positional encodings).
- Integrating NB2E into sequence modeling architectures (e.g., RNNs, transformers), especially for irregular time series.
- Extending to multi-dimensional input spaces via interleaved binary grids (a speculative 2-D sketch follows this list).
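As an illustration of the last point, a 2-D extension could interleave per-dimension NB2E bits in Morton (Z-order) fashion. This sketch is speculative and not from the paper; all names are illustrative:

```python
import numpy as np

def nb2e_2d_interleaved(x_norm, y_norm, B=8):
    """Interleave the NB2E bits of two normalized coordinates as
    (x1, y1, x2, y2, ...), yielding a Z-order-style binary grid code."""
    ks = np.arange(1, B + 1)
    bx = np.floor(x_norm * 2.0 ** ks).astype(int) % 2
    by = np.floor(y_norm * 2.0 ** ks).astype(int) % 2
    out = np.empty(2 * B, dtype=int)
    out[0::2], out[1::2] = bx, by
    return out

print(nb2e_2d_interleaved(0.25, 0.5, B=3))  # [0 1 1 0 0 0]
```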
7. Comparison to Other Encodings and Model Classes
NB2E differs from other input encodings in several key respects:
- Fourier encodings provide identical frequency scales but, being continuous, lack the discrete positional landmarks for out-of-domain phase matching.
- Random Fourier features create kernel approximations but are not suited for deterministic phase extrapolation.
- Symbolic regression and physics-informed neural networks (PINNs) demand either known functional forms or explicit structure search.
- SIREN (sinusoidal activations) can be combined with NB2E; preliminary testing of this combination yielded lower mean absolute error, though comprehensive analysis is pending.
NB2E’s transformation of scalars into a structured binary hierarchy allows standard MLPs to “factor” phase information, enabling interpolation and projection beyond the observed domain. In contrast, continuous and Fourier encodings do not induce the necessary discrete synchrony for extrapolatory phase alignment (Powell et al., 11 Dec 2025).