A Lower Bound for the Fourier Entropy of Boolean Functions on the Biased Hypercube (2511.07739v2)
Abstract: We study Boolean functions on the $p$-biased hypercube $(\{0,1\}^n,\mu_p^n)$ through the lens of Fourier/spectral entropy, i.e., the Shannon entropy of the squared Fourier coefficients. Motivated by recent progress on upper bounds toward the Fourier-Entropy-Influence (FEI) conjecture, we prove a complementary lower bound in terms of squared influences: for every $f:(\{0,1\}^n,\mu_p^n)\to \{-1,1\}$ we have
$$ \mathrm{Ent}_p(f)\ \ge\ 4p(1-p)(2p-1)^2\cdot\sum_{k=1}^n \mathrm{Inf}^{(p)}_k(f)^2. $$
Explain it Like I'm 14
A Lower Bound for the Fourier Entropy of Boolean Functions on the Biased Hypercube — Explained Simply
What is this paper about?
This paper studies special yes/no functions (called Boolean functions) that take several bits as input. But instead of each bit being a fair coin (heads/tails with equal chance), each bit is biased: it is 1 with probability $p$ and 0 with probability $1-p$. The authors look at how “complicated” these functions are when you break them into simple building blocks (their Fourier expansion), and they measure this complexity using something called Fourier entropy. They prove a new inequality that gives a general lower bound on this entropy, in terms of how sensitive the function is to flipping each input bit.
The big questions
Put simply, the paper asks:
- If a function’s output cares a lot about its inputs (especially in a squared sense), how “spread out” must its Fourier representation be?
- Can we connect two important measures of complexity — entropy (how spread out the “energy” is across Fourier coefficients) and influences (how often flipping a particular bit changes the output) — for the biased setting where inputs are not equally likely?
The main result answers this with a clean inequality that holds for every Boolean function under any bias $p$:
$$ \mathrm{Ent}_p(f)\ \ge\ 4p(1-p)(2p-1)^2\cdot\sum_{k=1}^n \mathrm{Inf}^{(p)}_k(f)^2. $$
Key ideas and methods (in everyday language)
To understand the approach, here are the key ideas, explained informally:
- Boolean functions and the biased cube: Think of $n$ switches, each either 0 or 1. Normally you might flip a fair coin to decide each switch; here, each switch is 1 with probability $p$, not necessarily 1/2.
- Fourier expansion as a remix: Any Boolean function can be decomposed into a sum of simple patterns (like “notes” in music). The squared coefficients tell you how much each pattern contributes. The Fourier entropy measures how spread out these contributions are — high entropy means the function uses many patterns; low entropy means it relies on just a few.
- Influence as “how much one switch matters”: The influence of bit $k$ is the chance that flipping only bit $k$ flips the function’s output. The total influence adds up these chances over all bits. Here, the authors focus on the sum of squared influences, which very roughly gives a stricter measure of “how strongly sensitive” the function is.
- Random restrictions (freeze some bits): The authors repeatedly “freeze” some coordinates (fix a subset of inputs) and average over the rest — like blurring part of a picture to see what structure remains. This is a common tool that simplifies analysis while preserving key properties.
- A clever “moment” trick: They define a quantity that depends on a small parameter $\varepsilon$ and on the set of “live” coordinates (the ones not frozen). This quantity acts like a knob you can turn. The crucial identity is that the derivative at $\varepsilon=0$ of this restricted-moment functional recovers exactly the Fourier entropy.
So, understanding how this moment changes with $\varepsilon$ lets them control the entropy.
- Step-by-step (telescoping) and inequalities: They grow one coordinate at a time (add one bit to the “live” set) and sum the changes along the way. Using standard inequalities (like Jensen’s inequality and a Cauchy–Schwarz-type inequality), they bound the change. Then they connect this change to the influences by carefully relating neighboring Fourier coefficients to how often flipping a bit changes the function.
- Putting it together: The chain rule (adding one bit at a time), the moment derivative, and the inequalities combine to prove the main lower bound on entropy in terms of squared influences, with a bias-dependent factor.
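To make these quantities concrete, here is a minimal numerical sketch (not code from the paper): it brute-forces the $p$-biased Fourier coefficients of a 3-bit majority function in the orthonormal basis $\varphi_S(x)=\prod_{i\in S}(x_i-p)/\sqrt{p(1-p)}$, computes the Fourier entropy and the flip-probability influences as defined in the glossary below, and checks the main lower bound. The example function, the bias value, and the use of natural logarithms are illustrative assumptions.

```python
import itertools
import math

# A minimal numerical sketch (not from the paper): brute-force the p-biased
# Fourier quantities for a small example (majority of 3 bits) and check the
# proven lower bound  Ent_p(f) >= 4p(1-p)(2p-1)^2 * sum_k Inf_k(f)^2.
# Natural logarithms are used; the log base is an assumption here.

n, p = 3, 0.3
sigma = math.sqrt(p * (1 - p))

def f(x):                       # majority of 3 bits, output in {-1, +1}
    return 1 if sum(x) >= 2 else -1

def mu(x):                      # p-biased probability of a point x in {0,1}^n
    return math.prod(p if xi == 1 else 1 - p for xi in x)

points = list(itertools.product((0, 1), repeat=n))

def phi(S, x):                  # orthonormal p-biased character for subset S
    return math.prod((x[i] - p) / sigma for i in S)

# Fourier coefficients  \hat f(S) = E_{mu_p}[ f(x) * phi_S(x) ]
subsets = [S for r in range(n + 1) for S in itertools.combinations(range(n), r)]
fhat = {S: sum(mu(x) * f(x) * phi(S, x) for x in points) for S in subsets}

# Fourier entropy of the squared coefficients (they sum to 1 by Parseval)
ent = sum(c * c * math.log(1 / (c * c)) for c in fhat.values() if abs(c) > 1e-12)

# p-biased influence of bit k: probability that flipping bit k flips f
def inf(k):
    return sum(mu(x) for x in points
               if f(x) != f(tuple(1 - xi if i == k else xi
                                  for i, xi in enumerate(x))))

sq_inf = sum(inf(k) ** 2 for k in range(n))
lower = 4 * p * (1 - p) * (2 * p - 1) ** 2 * sq_inf

print(f"Ent_p(f) = {ent:.4f}   lower bound = {lower:.4f}   holds: {ent >= lower}")
```

For $p=0.3$ this prints an entropy of roughly 1.8 against a lower bound of roughly 0.07, so the inequality holds with plenty of room on this small example.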
The main result and why it matters
The central theorem says: for every Boolean function $f:(\{0,1\}^n,\mu_p^n)\to\{-1,1\}$,
$$ \mathrm{Ent}_p(f)\ \ge\ 4p(1-p)(2p-1)^2\cdot\sum_{k=1}^n \mathrm{Inf}^{(p)}_k(f)^2. $$
Interpretation:
- The left side is how spread out the function’s Fourier energy is.
- The right side measures, in a strict (squared) way, how sensitive the function is to its inputs, multiplied by a factor that depends on the bias $p$.
- If the input bits are biased (i.e., $p\neq 1/2$), then $(2p-1)^2>0$, so the lower bound carries real information. If the inputs are perfectly fair ($p=1/2$), then $(2p-1)^2=0$, so the bound becomes trivial; this is expected because in the fair case, strong lower bounds of this type are much harder or impossible with current techniques. In other words, this result is especially informative for biased settings, which are very important in probability, random graphs, and theoretical computer science.
Why important:
- It complements famous “upper bound” results (the Fourier-Entropy-Influence conjecture and progress towards it) by giving a general “lower bound.” That is, it tells you not only that entropy cannot be too big relative to influence (upper bounds) but also that it cannot be too small if the function is sensitive (lower bounds).
- It gives concrete guarantees: if many bits (in a squared sense) matter to the output, the Fourier spectrum cannot be packed into just a few coefficients — the function must be spectrally rich.
Simple consequences the authors derive
Here are two immediate takeaways:
- How many Fourier coefficients can be nonzero? If the sum of squared influences is large, then the function must have many nonzero Fourier coefficients. More precisely, by Shannon’s inequality the number of nonzero coefficients is at least exponential in $4p(1-p)(2p-1)^2\sum_k \mathrm{Inf}^{(p)}_k(f)^2$, so the bound forces a large spectral support when squared influences are large.
- Noise stability gets capped by entropy: Imagine you copy the input $x$ to get $y$, but re-randomize each bit independently with a small probability $\varepsilon$. Noise stability is the expected agreement between $f(x)$ and $f(y)$. Earlier work (Keller–Kindler) showed quantitatively that small squared-influence mass forces low stability. Using the new lower bound, the paper shows that stability is upper-bounded by a function of the entropy:
$$ S_\varepsilon(f)\ \le\ (6e+1)\cdot\Big(\frac{\mathrm{Ent}_p(f)}{(2p-1)^2}\Big)^{\alpha(p,\varepsilon)\,\varepsilon}. $$
Intuition: if the entropy is small (the spectrum is concentrated on few coefficients), the new bound forces the squared-influence mass to be small, and Keller–Kindler then forces the function to be noise sensitive; so a genuinely noise-stable function must carry substantial Fourier entropy.
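To make the noise-stability definition concrete, here is a small Monte Carlo sketch (an illustration, not the paper's code); the example function and the assumption that re-randomized bits are redrawn from the $p$-biased distribution are ours:

```python
import random

# A small Monte Carlo sketch (illustrative): estimate the noise stability
# S_eps(f) = E[f(x) f(y)], where x ~ mu_p^n and y is obtained from x by
# re-randomizing each bit independently with probability eps, redrawing the
# re-randomized bits from the p-biased distribution (an assumption here).

def majority(x):                          # example function, output in {-1, +1}
    return 1 if sum(x) > len(x) / 2 else -1

def noise_stability(f, n, p, eps, samples=200_000, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        x = [1 if rng.random() < p else 0 for _ in range(n)]
        y = [(1 if rng.random() < p else 0) if rng.random() < eps else xi
             for xi in x]
        total += f(x) * f(y)
    return total / samples

print(noise_stability(majority, n=5, p=0.3, eps=0.1))
```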
A note on sharpness and a conjecture
The authors believe the optimal constant in front of the squared influences should be the binary entropy of $q=4p(1-p)$, namely $h(q)=-q\log q-(1-q)\log(1-q)$, instead of the proven $q(1-q)=4p(1-p)(2p-1)^2$. They give two classic examples (the “dictatorship” function that depends on just one bit, and the “parity” function that multiplies several bits together) where the conjectured factor is exactly right. Their proven bound is a bit weaker (missing a logarithmic factor) in extreme bias regimes. This suggests a clear and exciting target for future research.
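To see why the dictator exactly matches the conjectured constant, here is a short check (a sketch using standard $p$-biased conventions: outputs in $\{-1,1\}$ and the orthonormal character $\varphi_k(x)=(x_k-p)/\sqrt{p(1-p)}$). Writing $q=4p(1-p)$,
$$ f(x)=2x_k-1=(2p-1)+2\sqrt{p(1-p)}\,\varphi_k(x), $$
so the squared Fourier weights are $\hat f(\emptyset)^2=(2p-1)^2=1-q$ and $\hat f(\{k\})^2=4p(1-p)=q$, giving $\mathrm{Ent}_p(f)=h(q)$. Flipping bit $k$ always flips the output, so $\mathrm{Inf}^{(p)}_k(f)=1$ and $\sum_k \mathrm{Inf}^{(p)}_k(f)^2=1$; hence $\mathrm{Ent}_p(f)=h(q)\cdot\sum_k \mathrm{Inf}^{(p)}_k(f)^2$ exactly.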
What’s the broader impact?
- It strengthens our understanding of how structural sensitivity (influences) forces spectral complexity (entropy) in the biased setting.
- It adds a new tool to the toolkit for studying noise sensitivity, sharp thresholds in random graphs, and complexity questions in theoretical computer science.
- It points toward a sharper, elegant truth (with the $h(q)$ factor) that matches exactly on important examples, encouraging further work to close the gap.
Overall, the paper provides a clean, general lower bound that fits neatly alongside recent progress on upper bounds, and it opens a promising path toward a fully tight relationship between entropy and influences for biased Boolean functions.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a concise list of what the paper leaves unresolved, highlighting concrete directions for future research.
- Tightening the constant: Prove or refute the conjecture that the optimal bias-dependent constant is the binary entropy of the edge-noise parameter, i.e., show Ent_p(f) ≥ h(q)·∑_k Inf_k^{(p)}[f]² for all Boolean f, where q=4p(1−p). The current proof yields q(1−q), which is weaker by a logarithmic factor in extreme-bias regimes (a small numerical comparison of the two constants follows this list).
- Locating the slack in the proof: Identify and replace the proof steps responsible for the loss (the use of Jensen’s inequality, lower bounding h(θ) by min{θ,1−θ}·log(1/min{θ,1−θ}), and log(1+t) ≥ t/(1+t)) to recover h(q). Develop sharper inequalities tailored to the restricted-moment functional to avoid these relaxations.
- Extremizers and stability: Characterize all functions that attain equality in the conjectured bound with constant h(q) (dictatorships and parity functions are extremizers). Prove a stability theorem: if Ent_p(f) is close to h(q)·∑_k Inf_k^{(p)}[f]², then f is structurally close to a product of one-bit dictators/parities.
- Generalization to inhomogeneous biases: Extend the lower bound to product measures with coordinate-dependent biases p_i (non-uniform μ_{p_1}×⋯×μ_{p_n}). Determine whether a bound of the form Ent_p(f) ≥ ∑_i h(q_i)·Inf_i^{(p)}[f]² (with q_i=4p_i(1−p_i)) or a more intricate coupling across coordinates is correct.
- Beyond ±1-valued functions: Develop analogous lower bounds for real-valued f with bounded variance (e.g., Var[f]=1), non-binary output alphabets, or bounded range f∈[−1,1], and clarify how normalization (variance/bias) enters the constant.
- Min-entropy and Rényi analogues: Establish lower bounds for Fourier min-entropy and Rényi entropies (α≠1) under μ_p in terms of squared influences (or related sensitivity parameters), and determine bias-dependent optimal constants.
- Uniform cube limitation: The bound is identically 0 at p=1/2 (q=1), which is unavoidable in full generality. Identify meaningful structural restrictions (e.g., forbidding all mass on a single Fourier coefficient, eliminating all degree-1 mass, or assuming small level-1 weight) under which nontrivial lower bounds can be obtained on the uniform cube.
- Chain optimization: Investigate whether the choice of coordinate chain (ordering) or more sophisticated restriction schemes (random chains, block restrictions, adaptive chains) can strengthen the lower bound or recover h(q).
- Noise stability corollary: Make α(p,ε) explicit (from Keller–Kindler) and quantify tightness of S_ε(f) ≤ (6e+1)·(Ent_p(f)/(2p−1)²)^{α(p,ε)·ε}. Determine whether the exponent and prefactor are optimal and find families that achieve (or separate) this bound.
- Empirical/tightness benchmarks: Compute Ent_p(f) and ∑_k Inf_k^{(p)}[f]² for canonical classes (majority, tribes, monotone graph properties at critical thresholds, juntas) across biases p to assess tightness, identify gaps, and guide constant improvements.
- Algorithmic estimation: Develop sample-efficient algorithms to estimate ∑_k Inf_k^{(p)}[f]² under μ_p from random examples, turning the bound into a practical certificate of minimum spectral entropy. Provide finite-sample guarantees and concentration bounds for the restricted-moment functional M_{J,ε,p}.
- Incorporating higher-degree structure: Explore lower bounds that include level weights (e.g., ∑_S |S|·f̂(S)² or higher-degree influences), monotonicity constraints, or structural parameters (e.g., junta size) to strengthen constants beyond q(1−q).
- Second-order/moment refinements: Go beyond the first derivative at ε=0 in the moment functional M_{J,ε,p} (e.g., optimize over ε>0, use second-order expansions, or convexity/concavity properties of M_{J,ε,p}) to improve constants toward h(q).
- Cross-measure extensions: Extend the framework to q-ary hypercubes and Gaussian settings (biased analogues) and determine the correct entropy–influence-squared lower bounds and constants in these domains.
- Equality cases for the proven bound: Determine whether any nontrivial family attains the proven constant q(1−q) (not just up to a logarithmic factor), and characterize when the Engel/Sedrakyan and Jensen steps are tight.
- Leveraging hypercontractivity: Integrate sharp p-biased hypercontractivity or isoperimetric inequalities (referenced but not used in the proof) to derive stronger lower bounds or bridge to the conjectured h(q) constant.
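As a complement to the first item above, here is a tiny numerical comparison (illustrative only; natural logarithms assumed) of the proven constant q(1−q) against the conjectured constant h(q) across biases, showing the logarithmic gap as p approaches 0 or 1:

```python
import math

# Illustrative comparison (not from the paper): the proven constant q(1-q)
# versus the conjectured constant h(q) = -q*ln q - (1-q)*ln(1-q), q = 4p(1-p).
# The ratio h(q) / (q(1-q)) grows like log(1/q) as the bias p approaches 0 or 1,
# which is the "logarithmic factor" gap in extreme-bias regimes.

def binary_entropy(q):
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

for p in (0.3, 0.1, 0.01, 0.001, 0.0001):
    q = 4 * p * (1 - p)
    proven, conjectured = q * (1 - q), binary_entropy(q)
    print(f"p={p:<7} q(1-q)={proven:.5f}  h(q)={conjectured:.5f}  "
          f"ratio={conjectured / proven:.2f}")
```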
Glossary
- BKS framework: A paradigm in the analysis of Boolean functions by Benjamini–Kalai–Schramm that links sensitivity to influences under noise. "In the BKS framework~\cite{BKS1999}, the squared–influence mass is the canonical quantitative driver of sensitivity under this resampling operation"
- Binary entropy function: The function h(q) = −q log q − (1 − q) log(1 − q) capturing the entropy of a Bernoulli variable with parameter q. "is given by the binary entropy function ."
- Dictatorship function: A Boolean function depending on a single coordinate, typically f(x) = 2x_k − 1. "Dictatorship function. Let ."
- Discrete derivative operator: The operator ∂i measuring the effect of flipping coordinate i, defined via restricted evaluations. "the th (discrete) derivative operator on is defined by "
- ε-moment: A parameterized moment of restricted Fourier coefficients used to analyze entropy via random restrictions. "we define the -moment of -biased -restricted Fourier coefficients for as"
- Fourier coefficients: The weights in the Fourier expansion relative to the p-biased basis, denoted by . "the real numbers are known as its Fourier coefficients w.r.t. ."
- Fourier entropy: The Shannon entropy of the squared Fourier coefficients of a Boolean function. "the spectral/Fourier entropy of with respect to the measure is "
- Fourier expansion: Representation of a function as a sum of basis characters indexed by subsets S. "This representation is known as the Fourier expansion of the function "
- Fourier-Min-Entropy-Influence Conjecture: A conjecture asserting that min-entropy is bounded in terms of influence. "A weaker conjecture is the so-called Fourier-Min-Entropy-Influence Conjecture which asks if the min-entropy could be also bounded by constant times of ."
- Fourier-Entropy-Influence (FEI) Conjecture: The conjecture that Fourier entropy is bounded above by a constant times total influence. "A central theme in the analysis of Boolean functions is the interplay between spectral quantities and combinatorial sensitivity parameters such as influences. For , let denote the -biased influence of coordinate , and let be the total influence. In the direction of upper bounds on Fourier entropy, it is natural to recall the original Fourier-Entropy-Influence (FEI) Conjecture of Friedgut and Kalai in 1996"
- Hypercontractive constant: A parameter governing hypercontractivity inequalities; here, the p-biased version influences noise sensitivity bounds. "where essentially comes from the -biased hypercontractive constant (see~\cite{KK2013} for the precise definition and conditions)."
- Noise stability: The correlation of a function under independent resampling of inputs with rate ε. "The noise stability of is defined by "
- Parseval's identity: The equality relating the L2 norm of a function to the sum of squares of its Fourier coefficients. "Parseval's identity gives "
- Plancherel's identity: The statement that the inner product equals the dot product of Fourier coefficients. "Next we have Plancherel's identity, which states that the inner product of and is precisely the dot product of their vectors of Fourier coefficients."
- p-biased hypercube: The domain where each coordinate is 1 with probability p independently. "We study Boolean functions on the -biased hypercube "
- p-biased influence: The probability that flipping a coordinate changes the function value under measure μp. "let denote the -biased influence of coordinate "
- p-biased measure: The product measure on the cube with bias p for 1’s. "Consider the discrete cube endowed with the $p$-biased measure $\mu_p^n$"
- Random restriction: A technique where a subset of coordinates is fixed randomly according to μp. "a random restriction of $f$ to $J$ is $f_{J^c\to z}$, $z\in\{0,1\}^{J^c}$ drawn from $\mu_p^{J^c}$"
- Sedrakyan/Bergström/Engel's form: An inequality giving a lower bound via Cauchy–Schwarz (Engel's form). "[Sedrakyan/Bergstr\"om/Engel's form]"
- Shannon entropy: The measure of uncertainty H = −∑ p log p; here applied to squared Fourier coefficients. "We study Boolean functions on the $p$-biased hypercube $(\{0,1\}^n,\mu_p^n)$ through the lens of Fourier/spectral entropy, i.e., the Shannon entropy of the squared Fourier coefficients."
- Shannon's inequality: The bound H(p) ≤ log|supp(p)| relating entropy to support size. "It follows from Shannon's inequality $H(p)=\sum_S p_S \log(1/p_S)\le\log|{\rm supp}(p)|$"
Practical Applications
Below is an analysis of practical, real-world applications suggested by the research paper "A Lower Bound for the Fourier Entropy of Boolean Functions on the Biased Hypercube".
Immediate Applications
- Quantitative Finance
- Application: The findings regarding the biased hypercube and influences can be applied to model and analyze financial markets, where decisions can be modeled as Boolean functions with varying degrees of influence on market outcomes.
- Sector: Finance
- Tools/Products: Risk assessment algorithms, market sensitivity analysis tools.
- Assumptions: Assumes the market behavior can be simplified and represented as Boolean functions.
- Information Theory and Cryptography
- Application: Insights into Fourier entropy influencing Boolean functions could improve error detection and correction algorithms, as well as secure data transmission methods.
- Sector: Software, Cybersecurity
- Tools/Products: Enhanced cryptographic protocols, error-correction models in digital communication.
- Assumptions: The biasing model aligns with real-world data transmission errors and attacks.
Long-Term Applications
- Machine Learning
- Application: Machine learning models can utilize concepts from the biased hypercube and Fourier entropy to better understand feature importance and model interpretability, particularly in ensemble methods or neural networks.
- Sector: Artificial Intelligence
- Tools/Products: Feature importance estimation tools, model diagnostic frameworks.
- Dependencies: Requires further research to integrate with existing machine learning frameworks and datasets.
- Quantum Computing
- Application: The lower bound for Fourier entropy could aid in developing algorithms that leverage quantum bits, which inherently operate in superposition with Boolean-like computations.
- Sector: Quantum Computing
- Tools/Products: Quantum algorithm development platforms, quantum error correction codes.
- Dependencies: Needs extensive scaling and adaptation for quantum systems, which are still under active development and exploration.
- Artificial Intelligence in Healthcare
- Application: Used in developing diagnostic tools that assess different medical conditions based on various influential factors modelled as Boolean functions.
- Sector: Healthcare
- Tools/Products: Diagnostic AI systems, health monitoring applications.
- Dependencies: Requires integration with large-scale health data and collaborative development with medical professionals.
Each application is paired with assumptions or dependencies that highlight the research context and potential limitations that may impact the deployment and scaling of these innovations.