- The paper presents a multi-scale representation that hierarchically encodes integer sets to reveal the distribution of prime numbers.
- It employs recursive aggregation of binary patterns, showing that only a subset of possible configurations occurs due to inherent arithmetic constraints.
- Probabilistic models and invariant histograms are developed, enabling efficient reconstruction algorithms for both prime and Mersenne number analyses.
Multi-Scale Representation of Integer Sets and Its Application to Prime Numbers
Introduction and Motivation
The paper introduces a multi-scale, hierarchical framework for representing and analyzing arithmetic properties of integer sets, with a particular focus on prime numbers. The approach encodes properties such as primality into nested sequences of small integers (0–255), enabling both local and global analysis of integer sets at varying granularities. This method is inspired by multi-resolution techniques in signal processing, such as wavelet analysis, but is tailored to the discrete, combinatorial structure of the integers.
The central idea is to partition the set of natural numbers into blocks at multiple scales, encode the presence or absence of a property (e.g., primality) within each block as a binary pattern, and then recursively aggregate these patterns to form higher-level summaries. This yields a tree-like structure where each node summarizes the property distribution over increasingly large intervals.
Hierarchical Construction and Encoding
The multi-scale representation is constructed as follows:
- Level 1 (Fine Granularity): The integers are partitioned into blocks of 8. Each block is encoded as an 8-bit binary string, where each bit indicates whether the corresponding integer is prime. This binary string is then converted to a decimal value in [0,255].
- Level 2 (Intermediate Granularity): Consecutive Level 1 blocks are grouped eight at a time. Each group is encoded as an 8-bit binary string in which each bit indicates whether the corresponding Level 1 block contains at least one prime. This again yields a value in [0,255].
- Level 3 (Coarse Granularity): The process is repeated, aggregating Level 2 blocks into larger blocks, and so on.
This recursive encoding is formalized by the sequence of functions f(k), where k denotes the scale. The construction ensures that each k-pattern encodes the distribution of the property over $8^k$ consecutive integers.
(Figure 1)
Figure 1: Multi-scale tree illustrating three resolution levels. Each node encodes the presence of prime numbers in a specific interval: 512 integers (level 3), 64 integers (level 2), and 8 integers (level 1).
This hierarchical structure allows for efficient navigation between scales, facilitating both coarse and fine-grained analysis of the distribution of primes.
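The three-level construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the helper names `prime_flags`, `pack_bits`, and `encode_levels` are hypothetical, and the bit order (most significant bit for the first integer of a block, with blocks starting at 1) is an assumption chosen to agree with the reconstruction routine shown later in the text.

```python
import math


def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is 1 when n is prime, for 0 <= n <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            # Clear every multiple of i starting at i*i.
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def pack_bits(flags):
    """Pack groups of 8 truthy/falsy flags into values in [0, 255], MSB first."""
    values = []
    for start in range(0, len(flags), 8):
        value = 0
        for bit in flags[start : start + 8]:
            value = (value << 1) | int(bool(bit))
        values.append(value)
    return values


def encode_levels(limit, depth=3):
    """Return [level1, level2, ...] for the integers 1..limit (assumed layout)."""
    if limit % 8 ** depth:
        raise ValueError("limit must be a multiple of 8**depth")
    level = pack_bits(prime_flags(limit)[1:])  # Level 1: one bit per integer
    levels = [level]
    for _ in range(depth - 1):
        # Higher levels: one bit per child block, set when the child is non-empty.
        level = pack_bits([value != 0 for value in level])
        levels.append(level)
    return levels


level1, level2, level3 = encode_levels(512)
print(len(level1), len(level2), len(level3))  # 64 8 1
```

With `limit = 512` the tree has a single Level 3 node covering 512 integers, eight Level 2 nodes covering 64 integers each, and 64 Level 1 blocks of 8 integers, matching the sizes quoted for Figure 1.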
Statistical Properties and Pattern Restrictions
A key empirical finding is that, due to arithmetic constraints, not all possible patterns occur at each scale. For example, at Level 1, only 14 out of 256 possible patterns are realized. This is a consequence of the distribution of primes and the impossibility of certain configurations: even integers greater than 2 are never prime, and apart from 3, 5, 7, three consecutive odd numbers cannot all be prime because one of them is divisible by 3.
The paper provides a detailed classification of Level 1 patterns, associating each with specific configurations of primes within a block. For instance, the pattern 40 (binary 00101000) corresponds to the presence of a twin prime pair, while 106 (binary 01101010) encodes a block containing four primes.
The distribution of these patterns is highly skewed: the majority of blocks contain no primes, a significant fraction contain a single prime, and blocks with three or four primes are exceedingly rare. The empirical histogram of Level 1 patterns is invariant across large intervals, suggesting a scale-independent statistical structure in the distribution of primes.

Figure 2: Histogram of f(2) values, showing the empirical distribution of Level 2 patterns over a large interval.
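The pattern restriction at Level 1 can be checked empirically on a finite range. The sketch below is an illustration under assumptions, not the paper's code: the function name `level1_patterns` is hypothetical, blocks are taken to be 8k+1 through 8k+8 with the first integer in the most significant bit, and the range of 1000 blocks is an arbitrary choice.

```python
import math
from collections import Counter


def level1_patterns(num_blocks):
    """Return the Level 1 value of each block 8k+1..8k+8 for k < num_blocks."""
    limit = 8 * num_blocks
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    patterns = []
    for k in range(num_blocks):
        value = 0
        for n in range(8 * k + 1, 8 * k + 9):
            value = (value << 1) | flags[n]  # first integer ends up in the MSB
        patterns.append(value)
    return patterns


histogram = Counter(level1_patterns(1000))
print(len(histogram), "distinct patterns")
for value, count in histogram.most_common():
    print(f"{value:3d} ({value:08b}): {count}")
```

The number of distinct keys in `histogram` can be compared with the 14 patterns reported above, and the per-pattern counts give a small-scale view of the skew in the distribution.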
Mathematical Modeling of Pattern Distributions
The paper develops probabilistic models to estimate the frequency of each pattern at every scale. The key assumption is that the probability of an odd integer being prime is approximately uniform and given by $p(m) \approx \frac{2\,\mathrm{li}(m)}{m}$, where $\mathrm{li}(m)$ is the logarithmic integral.
For Level 1, the expected count of each pattern in an interval $[1, m]$ is modeled as:
$$\pi_m^{(1)}(j) \approx C_j^{(1)}(m)\,\frac{m}{8}\,p(m)^{k_j}\,\bigl(1 - p(m)\bigr)^{4 - k_j}$$
where $k_j$ is the number of primes in pattern $j$, and $C_j^{(1)}(m)$ is a correction factor accounting for dependencies and arithmetic constraints.
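To make the Level 1 estimate concrete, the following sketch evaluates its uncorrected part, that is, the expression above with the correction factor set to 1. Several choices here are assumptions rather than details from the paper: the function names are hypothetical, the logarithmic integral is approximated by the offset form (the integral of 1/ln t from 2 to m, which differs from li(m) by a constant of roughly 1.05), and the integral is computed with Simpson's rule after the substitution t = e^u.

```python
import math


def offset_log_integral(m, steps=2000):
    """Approximate the integral of dt/ln(t) over [2, m] with Simpson's rule.

    The substitution t = exp(u) turns the integrand into exp(u)/u, which is
    smooth on [ln 2, ln m]. `steps` must be even.
    """
    a, b = math.log(2.0), math.log(m)
    h = (b - a) / steps
    total = math.exp(a) / a + math.exp(b) / b
    for i in range(1, steps):
        u = a + i * h
        total += (4 if i % 2 else 2) * math.exp(u) / u
    return total * h / 3


def odd_prime_probability(m):
    """p(m): modeled chance that an odd integer up to m is prime."""
    return 2.0 * offset_log_integral(m) / m


def uncorrected_count(m, k):
    """Expected number of blocks in [1, m] showing one given pattern with k primes,
    before the correction factor is applied."""
    p = odd_prime_probability(m)
    return (m / 8) * p ** k * (1 - p) ** (4 - k)


m = 10 ** 6
for k in range(5):
    print(k, round(uncorrected_count(m, k), 1))
```

Dividing an observed pattern count by `uncorrected_count(m, k)` gives an empirical value for the corresponding correction factor, which is one way to see how far the independence assumption is from the data.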
Analogous expressions are derived for higher levels, with the probability of a block containing at least one prime at Level $k$ given recursively by:
$$q_k(m) = 1 - \bigl(1 - q_{k-1}(m)\bigr)^8$$
and the expected histogram counts at Level $k$ are given by binomial-type formulas involving $q_{k-1}(m)$.
Empirical results confirm the accuracy of these models, with correction factors $C_j^{(k)}(m)$ typically close to 1 for large $m$.
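The recursion for the block-occupancy probability is simple to iterate. The sketch below assumes a base case that the text does not spell out: a Level 1 block holds four odd integers, each prime independently with probability p, so the chance that it contains at least one prime is taken to be 1 - (1 - p)^4. The function name `occupancy_probabilities` is hypothetical.

```python
def occupancy_probabilities(p, levels):
    """Return [q_1, ..., q_levels], the modeled chance that a block at each
    level contains at least one prime, given the odd-integer probability p."""
    q = 1.0 - (1.0 - p) ** 4  # assumed base case for Level 1
    result = [q]
    for _ in range(levels - 1):
        q = 1.0 - (1.0 - q) ** 8  # each block aggregates 8 children
        result.append(q)
    return result


for level, q in enumerate(occupancy_probabilities(0.1, 4), start=1):
    print(level, q)
```

Under this base case the recursion has the closed form 1 - q_k = (1 - p)^(4 * 8^(k-1)), so the probability of an empty block shrinks very quickly with the level for fixed p but never reaches zero.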
Invariant Histogram Shape and Density Breakdown
A notable empirical observation is that the histograms of pattern frequencies at each scale converge to an invariant shape as the interval size increases. This suggests a form of statistical self-similarity in the distribution of primes across scales.
The paper also analyzes the breakdown of density: for sufficiently large intervals, there exist blocks at higher scales that contain no primes. The threshold for this breakdown is estimated using the derived probabilistic models, yielding explicit (albeit very large) bounds on the interval size beyond which the density property fails.
Algorithmic Reconstruction
The hierarchical encoding is invertible. The paper provides an explicit algorithm for reconstructing the set of integers (e.g., primes) corresponding to a given high-level pattern. The algorithm traverses the tree from the top down, decoding each level's binary pattern to recover the positions of the property at the finest scale.
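def D2B(n):
    """Return the positions (0 = most significant) of the 1 bits in the 8-bit value n."""
    return [j for j in range(8) if (n >> (7 - j)) & 1]

def reconstruct_primes(pattern3, pattern2, pattern1, C):
    """Recover the primes under every Level 3 node whose value equals C."""
    prime_list = []
    for n, val3 in enumerate(pattern3):
        if val3 == C:
            for j in D2B(C):  # occupied Level 2 children of node n
                m = 8 * n + j
                for i in D2B(pattern2[m]):  # occupied Level 1 children of node m
                    k = 8 * m + i
                    for p in D2B(pattern1[k]):  # set bits within Level 1 block k
                        prime_list.append(8 * k + p + 1)
    return prime_list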
This algorithm is efficient and exploits the compactness of the multi-scale representation.
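A round trip on a small set illustrates the invertibility claim. The sketch below is an illustration under assumptions: the helper names are hypothetical, the member set is an arbitrary example rather than data from the paper, and, unlike the routine above, the decoder visits every Level 3 node instead of filtering on a single value C.

```python
def bit_positions(value):
    """Positions (0 = most significant) of the 1 bits in an 8-bit value."""
    return [j for j in range(8) if (value >> (7 - j)) & 1]


def pack_bits(flags):
    """Pack groups of 8 booleans into values in [0, 255], MSB first."""
    values = []
    for start in range(0, len(flags), 8):
        value = 0
        for bit in flags[start : start + 8]:
            value = (value << 1) | int(bit)
        values.append(value)
    return values


def build_levels(members, span=512):
    """Encode membership of 1..span at three levels (span a multiple of 512)."""
    level1 = pack_bits([n in members for n in range(1, span + 1)])
    level2 = pack_bits([v != 0 for v in level1])
    level3 = pack_bits([v != 0 for v in level2])
    return level1, level2, level3


def decode_all(level1, level2, level3):
    """Walk the tree top down and list every encoded integer in increasing order."""
    found = []
    for n, v3 in enumerate(level3):
        for j in bit_positions(v3):
            m = 8 * n + j
            for i in bit_positions(level2[m]):
                k = 8 * m + i
                for p in bit_positions(level1[k]):
                    found.append(8 * k + p + 1)
    return found


members = {2, 3, 5, 7, 11, 13, 499}
levels = build_levels(members)
print(decode_all(*levels) == sorted(members))  # True
```

Empty subtrees are skipped entirely during decoding, which is where the compactness of the higher levels pays off for sparse sets.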
Application to Mersenne Numbers
The framework is adapted to study Mersenne numbers, with the property of interest being primality within the set M={22m+1−j}. The analysis reveals that, for m≥3, only a small subset of patterns occur, and no block contains more than one Mersenne prime. This reflects the extreme sparsity of Mersenne primes and demonstrates the flexibility of the multi-scale approach for analyzing other structured integer sets.
Implications and Future Directions
The multi-scale representation provides a novel tool for the statistical and structural analysis of integer sets. Its main advantages are:
- Compactness: Large intervals are summarized by sequences of small integers, facilitating efficient storage and computation.
- Scalability: The method is applicable to arbitrarily large intervals and can be adapted to other arithmetic properties.
- Pattern Analysis: The invariant histogram shapes and pattern restrictions offer new perspectives on the global and local structure of primes.
Potential future developments include:
- Refinement of the correction factors in the probabilistic models, possibly via analytic number theory.
- Extension to overlapping or variable-size partitions, which may capture additional structure.
- Application to other arithmetic sets (e.g., twin primes, k-almost primes) and to the study of gaps and local irregularities.
Conclusion
The paper presents a systematic, multi-scale framework for encoding and analyzing arithmetic properties of integer sets, with detailed application to the distribution of prime numbers. The approach reveals persistent statistical structures across scales, provides accurate probabilistic models for pattern frequencies, and enables efficient reconstruction and analysis of large integer intervals. The method opens new avenues for both empirical and theoretical investigation in analytic and computational number theory.