Multi-scale representation of integer sets: application to prime numbers

Published 3 Jun 2025 in math.RA | (2506.03005v2)

Abstract: We propose a multi-scale analysis method for studying arithmetic properties of integer sets, such as primality. Our approach organizes information through a hierarchy of nested sequences, where each level enables a hierarchical expression of the studied property by examining patterns at varying levels of granularity. To illustrate the method, we apply it to prime numbers. While this does not claim any new breakthroughs on this classical problem, the approach allows for analysis of the studied property across large integer sequences and reveals characteristics observable at different scales. By limiting ourselves to the case of prime numbers, we build sequences with values in {0, ..., 255}, which have the advantage of simplifying the reading, at different scales, of the encoded property. We free ourselves from the numerous digits of large integers by replacing them with small integers between 0 and 255. We have also highlighted, at different scales, histograms composed of at most 256 values. We have observed that for a sufficiently large interval, they all share a same invariant shape, which can be viewed as a characteristic of prime numbers. Each value in the histogram represents the count of a subset of prime numbers. We have proposed an estimation for each value in the histogram and at all scales. We hope that the proposed framework will be useful for investigating arithmetic properties.

Authors (1)

Summary

  • The paper presents a multi-scale representation that hierarchically encodes integer sets to reveal the distribution of prime numbers.
  • It employs recursive aggregation of binary patterns, showing that only a subset of possible configurations occurs due to inherent arithmetic constraints.
  • Probabilistic models and invariant histograms are developed, enabling efficient reconstruction algorithms for both prime and Mersenne number analyses.

Multi-Scale Representation of Integer Sets and Its Application to Prime Numbers

Introduction and Motivation

The paper introduces a multi-scale, hierarchical framework for representing and analyzing arithmetic properties of integer sets, with a particular focus on prime numbers. The approach encodes properties such as primality into nested sequences of small integers (0–255), enabling both local and global analysis of integer sets at varying granularities. This method is inspired by multi-resolution techniques in signal processing, such as wavelet analysis, but is tailored to the discrete, combinatorial structure of the integers.

The central idea is to partition the set of natural numbers into blocks at multiple scales, encode the presence or absence of a property (e.g., primality) within each block as a binary pattern, and then recursively aggregate these patterns to form higher-level summaries. This yields a tree-like structure where each node summarizes the property distribution over increasingly large intervals.

Hierarchical Construction and Encoding

The multi-scale representation is constructed as follows:

  • Level 1 (Fine Granularity): The integers are partitioned into blocks of 8. Each block is encoded as an 8-bit binary string, where each bit indicates whether the corresponding integer is prime. This binary string is then converted to a decimal value in $[0, 255]$.
  • Level 2 (Intermediate Granularity): Level 1 blocks are grouped into blocks of 8, and each is encoded as a binary string indicating whether the corresponding Level 1 block contains at least one prime. This again yields a value in $[0, 255]$.
  • Level 3 (Coarse Granularity): The process is repeated, aggregating Level 2 blocks into larger blocks, and so on.

This recursive encoding is formalized by the sequence of functions $f^{(k)}$, where $k$ denotes the scale. The construction ensures that each $k$-pattern encodes the distribution of the property over $8^k$ consecutive integers.
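The three levels can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the helper names `sieve`, `pack`, `level1`, and `coarsen` are hypothetical, and the bit order (first integer of a block in the most significant bit, block $k$ covering $8k+1, \dots, 8k+8$) is an assumption chosen to agree with the reconstruction code and the pattern values 40 and 106 quoted in this summary.

```python
from math import isqrt

def sieve(n):
    """Return is_prime[0..n] computed with the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return is_prime

def pack(bits):
    """Pack 8 booleans into an integer in 0..255, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | int(b)
    return value

def level1(n_max):
    """Level 1 values: block k encodes primality of 8k+1 .. 8k+8 (assumed layout)."""
    is_prime = sieve(n_max)
    return [pack(is_prime[8 * k + 1 : 8 * k + 9]) for k in range(n_max // 8)]

def coarsen(values):
    """Next level: bit i is set when the i-th child value is nonzero."""
    return [pack(v != 0 for v in values[8 * k : 8 * k + 8])
            for k in range(len(values) // 8)]

f1 = level1(512)   # 64 Level 1 values
f2 = coarsen(f1)   # 8 Level 2 values
f3 = coarsen(f2)   # 1 Level 3 value covering 1..512
```

Applying `coarsen` repeatedly gives the higher levels, so the interval length passed to `level1` should be a multiple of $8^k$ for the deepest level $k$ of interest.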

(Figure 1)

Figure 1: Multi-scale tree illustrating three resolution levels. Each node encodes the presence of prime numbers in a specific interval: 512 integers (level 3), 64 integers (level 2), and 8 integers (level 1).

This hierarchical structure allows for efficient navigation between scales, facilitating both coarse and fine-grained analysis of the distribution of primes.

Statistical Properties and Pattern Restrictions

A key empirical finding is that, due to arithmetic constraints, not all possible patterns occur at each scale. For example, at Level 1, only 14 out of 256 possible patterns are realized. Beyond the first block, even integers are never prime, so only the four odd positions of a block can be set, and any three consecutive odd numbers include a multiple of 3, which rules out configurations with three consecutive odd primes.

The paper provides a detailed classification of Level 1 patterns, associating each with specific configurations of primes within a block. For instance, the pattern 40 corresponds to the presence of a twin prime pair, while 106 encodes a block containing four primes.
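The pattern restriction is easy to check numerically. The sketch below is an illustration under assumptions, not code from the paper: the helper names are hypothetical, and the bit layout (block $k$ covers $8k+1, \dots, 8k+8$, first integer in the most significant bit) is assumed so that the values 40 and 106 decode as described above.

```python
from math import isqrt

def sieve(n):
    """Return is_prime[0..n] computed with the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return is_prime

def level1(n_max):
    """Level 1 values for blocks 8k+1 .. 8k+8, most significant bit first."""
    is_prime = sieve(n_max)
    values = []
    for k in range(n_max // 8):
        v = 0
        for b in is_prime[8 * k + 1 : 8 * k + 9]:
            v = (v << 1) | int(b)
        values.append(v)
    return values

def positions(value):
    """Positions (0 = most significant bit) of the set bits of an 8-bit value."""
    return [j for j in range(8) if (value >> (7 - j)) & 1]

f1 = level1(1024)
distinct = sorted(set(f1))
print(len(distinct), distinct)
print(positions(40))    # offsets 2 and 4, i.e. 8k+3 and 8k+5
print(positions(106))   # offsets 1, 2, 4, 6, i.e. 2, 3, 5, 7 in the first block
```

Under this layout the value 106 appears only for the first block, which is the one block where an even number (2) and three consecutive odd numbers (3, 5, 7) are all prime.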

The distribution of these patterns is highly skewed: the majority of blocks contain no primes, a significant fraction contain a single prime, and blocks with three or four primes are exceedingly rare. The empirical histogram of Level 1 patterns is invariant across large intervals, suggesting a scale-independent statistical structure in the distribution of primes.

(Figure 2)

Figure 2: Histogram of $f^{(2)}$ values, showing the empirical distribution of Level 2 patterns over a large interval.

Mathematical Modeling of Pattern Distributions

The paper develops probabilistic models to estimate the frequency of each pattern at every scale. The key assumption is that the probability of an odd integer being prime is approximately uniform and given by $p(m) \approx \frac{2 \operatorname{li}(m)}{m}$, where $\operatorname{li}(m)$ is the logarithmic integral.

For Level 1, the expected count of each pattern in an interval $[1, m]$ is modeled as:

$$\pi^{(1)}_m(j) \approx C^{(1)}_j(m) \, \frac{m}{8} \, p(m)^{k_j} (1-p(m))^{4-k_j}$$

where $k_j$ is the number of primes in the pattern, and $C^{(1)}_j(m)$ is a correction factor accounting for dependencies and arithmetic constraints.
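To make the Level 1 estimate concrete, the following sketch evaluates it numerically. It is an illustration under stated assumptions: the correction factor is simply set to 1, the logarithmic integral is computed with the standard series $\operatorname{li}(x) = \gamma + \ln\ln x + \sum_{k \ge 1} (\ln x)^k / (k \cdot k!)$, and the function names are hypothetical rather than taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329

def li(x, terms=200):
    """Logarithmic integral for x > 1 via gamma + ln ln x + sum (ln x)^k / (k * k!)."""
    u = math.log(x)
    total = EULER_GAMMA + math.log(u)
    term = 1.0
    for k in range(1, terms + 1):
        term *= u / k            # term now equals u^k / k!
        total += term / k
    return total

def prime_probability(m):
    """p(m) = 2 li(m) / m, the modeled chance that an odd integer up to m is prime."""
    return 2.0 * li(m) / m

def expected_level1_count(m, k_j, correction=1.0):
    """Modeled count of one Level 1 pattern with k_j primes among its 4 odd positions.

    correction stands in for C_j^(1)(m); the default 1.0 is an assumption.
    """
    p = prime_probability(m)
    return correction * (m / 8) * p ** k_j * (1 - p) ** (4 - k_j)

m = 10 ** 6
for k_j in range(5):
    print(k_j, round(expected_level1_count(m, k_j)))
```

With the correction factor fixed at 1, summing the estimate over all 16 subsets of the four odd positions returns exactly $m/8$, the number of blocks, which is a quick consistency check on the formula.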

Analogous expressions are derived for higher levels, with the probability of a block containing at least one prime at Level $k$ given recursively by:

$$q_k(m) = 1 - (1 - q_{k-1}(m))^8$$

The expected histogram counts at Level $k$ are given by binomial-type formulas involving $q_{k-1}(m)$.
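The recursion can be iterated directly. The sketch below is hedged: the summary does not state the base case, so $q_1 = 1 - (1-p)^4$ (four independent odd positions per Level 1 block) is an assumption, and the function name is hypothetical. It tracks the complement $1 - q_k$, the modeled probability of an empty block, because $q_k$ itself rounds to 1.0 in floating point after a few levels.

```python
def block_probabilities(p, levels):
    """Return [(q_k, 1 - q_k)] for k = 1..levels.

    p is the modeled probability that an odd integer is prime.
    Assumed base case: 1 - q_1 = (1 - p)^4.
    Recursion from the text: 1 - q_k = (1 - q_{k-1})^8.
    """
    empty = (1.0 - p) ** 4          # probability that a Level 1 block has no prime
    result = [(1.0 - empty, empty)]
    for _ in range(levels - 1):
        empty = empty ** 8          # all 8 children must be empty
        result.append((1.0 - empty, empty))
    return result

for k, (q, empty) in enumerate(block_probabilities(0.157, 4), start=1):
    print(k, q, empty)
```

Under the assumed base case the empty-block probability has the closed form $(1-p)^{4 \cdot 8^{k-1}}$, which makes clear how quickly empty blocks become unlikely at coarse scales for a fixed $p$.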

Empirical results confirm the accuracy of these models, with correction factors $C^{(k)}_j(m)$ typically close to 1 for large $m$.

Invariant Histogram Shape and Density Breakdown

A notable empirical observation is that the histograms of pattern frequencies at each scale converge to an invariant shape as the interval size increases. This suggests a form of statistical self-similarity in the distribution of primes across scales.

The paper also analyzes the breakdown of density: for sufficiently large intervals, there exist blocks at higher scales that contain no primes. The threshold for this breakdown is estimated using the derived probabilistic models, yielding explicit (albeit very large) bounds on the interval size beyond which the density property fails.
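For small scales the breakdown can be observed by direct search instead of the probabilistic estimate. The sketch below is a brute-force illustration, not the paper's method: `first_empty_block` is a hypothetical helper, and it assumes that block $k$ at a given level covers the integers $8^{\text{level}} k + 1, \dots, 8^{\text{level}} (k+1)$.

```python
from math import isqrt

def sieve(n):
    """Return is_prime[0..n] computed with the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return is_prime

def first_empty_block(level, n_max):
    """Index of the first block of 8**level integers in [1, n_max] with no prime.

    Returns None when every complete block in the range contains a prime.
    """
    size = 8 ** level
    is_prime = sieve(n_max)
    for k in range(n_max // size):
        if not any(is_prime[size * k + 1 : size * (k + 1) + 1]):
            return k
    return None

print(first_empty_block(1, 1024))   # first Level 1 block without a prime
print(first_empty_block(2, 4096))   # None: no empty 64-integer block this early
```

The search cost grows with the interval, so this only locates the breakdown at the finest scales; the much larger thresholds at higher levels are where the model-based bounds are needed.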

Algorithmic Reconstruction

The hierarchical encoding is invertible. The paper provides an explicit algorithm for reconstructing the set of integers (e.g., primes) corresponding to a given high-level pattern. The algorithm traverses the tree from the top down, decoding each level's binary pattern to recover the positions of the property at the finest scale.

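def D2B(n):
    """Return the positions (0 = most significant bit) of the 1s in the 8-bit form of n."""
    return [j for j in range(8) if (n >> (7 - j)) & 1]

def reconstruct_primes(pattern3, pattern2, pattern1, C):
    """List the primes under every Level 3 node whose value equals C."""
    prime_list = []
    for n, val3 in enumerate(pattern3):
        if val3 != C:
            continue
        for j in D2B(C):                      # non-empty Level 2 children of node n
            m = 8 * n + j                     # index into pattern2
            for i in D2B(pattern2[m]):        # non-empty Level 1 children of node m
                k = 8 * m + i                 # index into pattern1
                for p in D2B(pattern1[k]):    # prime positions inside block k
                    prime_list.append(8 * k + p + 1)  # block k covers 8k+1 .. 8k+8
    return prime_list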

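The traversal descends only into blocks whose bit is set, so intervals without primes are skipped entirely, and each index is obtained by simple integer arithmetic on the compact multi-scale values.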
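A round trip shows the invertibility in practice. The sketch below is an illustration under assumptions rather than the paper's code: it builds three levels from a sieve and then decodes every Level 3 node (instead of filtering on a single value), with hypothetical helper names and the same assumed layout of block $k$ covering $8k+1, \dots, 8k+8$, most significant bit first.

```python
from math import isqrt

def sieve(n):
    """Return is_prime[0..n] computed with the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return is_prime

def pack(bits):
    """Pack 8 booleans into an integer in 0..255, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | int(b)
    return value

def positions(value):
    """Positions (0 = most significant bit) of the set bits of an 8-bit value."""
    return [j for j in range(8) if (value >> (7 - j)) & 1]

def encode(n_max):
    """Return (f3, f2, f1) for the integers 1..n_max; n_max must be a multiple of 512."""
    is_prime = sieve(n_max)
    f1 = [pack(is_prime[8 * k + 1 : 8 * k + 9]) for k in range(n_max // 8)]
    f2 = [pack(v != 0 for v in f1[8 * k : 8 * k + 8]) for k in range(len(f1) // 8)]
    f3 = [pack(v != 0 for v in f2[8 * k : 8 * k + 8]) for k in range(len(f2) // 8)]
    return f3, f2, f1

def decode(f3, f2, f1):
    """Walk the tree top down and return all encoded primes in increasing order."""
    primes = []
    for n, v3 in enumerate(f3):
        for j in positions(v3):
            m = 8 * n + j
            for i in positions(f2[m]):
                k = 8 * m + i
                for p in positions(f1[k]):
                    primes.append(8 * k + p + 1)
    return primes

recovered = decode(*encode(1024))
expected = [i for i, flag in enumerate(sieve(1024)) if flag]
print(recovered == expected)
```

Note that only the Level 1 sequence is strictly needed to recover the primes; the higher levels act as an index that lets the decoder skip empty regions.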

Application to Mersenne Numbers

The framework is adapted to study Mersenne numbers, with the property of interest being primality within the set $\mathcal{M} = \{2^{2m+1}-j\}$. The analysis reveals that, for $m \geq 3$, only a small subset of patterns occur, and no block contains more than one Mersenne prime. This reflects the extreme sparsity of Mersenne primes and demonstrates the flexibility of the multi-scale approach for analyzing other structured integer sets.

Implications and Future Directions

The multi-scale representation provides a novel tool for the statistical and structural analysis of integer sets. Its main advantages are:

  • Compactness: Large intervals are summarized by sequences of small integers, facilitating efficient storage and computation.
  • Scalability: The method is applicable to arbitrarily large intervals and can be adapted to other arithmetic properties.
  • Pattern Analysis: The invariant histogram shapes and pattern restrictions offer new perspectives on the global and local structure of primes.

Potential future developments include:

  • Refinement of the correction factors in the probabilistic models, possibly via analytic number theory.
  • Extension to overlapping or variable-size partitions, which may capture additional structure.
  • Application to other arithmetic sets (e.g., twin primes, k-almost primes) and to the study of gaps and local irregularities.

Conclusion

The paper presents a systematic, multi-scale framework for encoding and analyzing arithmetic properties of integer sets, with detailed application to the distribution of prime numbers. The approach reveals persistent statistical structures across scales, provides accurate probabilistic models for pattern frequencies, and enables efficient reconstruction and analysis of large integer intervals. The method opens new avenues for both empirical and theoretical investigation in analytic and computational number theory.
