Probably Approximately Symmetric Framework
- The PAS framework integrates plug‐in and pseudo-PML methods to achieve near-optimal estimation of symmetric properties under varying sample count regimes.
- It defines symmetry as invariance under label permutations and employs sample splitting to allocate ‘easy’ and ‘hard’ symbols for improved estimation accuracy.
- The approach extends to geometric symmetry detection by approximating shape distortions through sublinear sampling while ensuring theoretical performance guarantees.
The Probably Approximately Symmetric (PAS) framework denotes two distinct but foundational approaches in the literature. One line (Charikar et al., 2020) develops a general-purpose framework for optimal symmetric property estimation over distributions—the focus of information-theoretic property estimation. The other (Korman et al., 2014) introduces PAS for efficient symmetry detection in geometric shapes, with strong theoretical guarantees. Both share the core idea of blending probabilistic and approximate methods to handle symmetry, but their technical domains and mechanisms are fundamentally different.
1. Symmetric Property Estimation: Principle and Formalism
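Given a finite alphabet $\mathcal{X}$ of size $k$ with discrete distribution $p \in \Delta_{\mathcal{X}}$ (the probability simplex over $\mathcal{X}$), symmetric properties are scalar functions $f(p)$ that are invariant under relabeling of $\mathcal{X}$. A broad and important subclass is the separable symmetric properties, $f(p) = \sum_{x \in \mathcal{X}} g(p_x)$ for scalar $g$. Empirically, for $n$ i.i.d. samples $X_1, \dots, X_n$, the counts $N_x = |\{i : X_i = x\}|$ define the profile $\phi = (\phi_j)_{j \ge 1}$ with $\phi_j = |\{x : N_x = j\}|$, the histogram of symbol count frequencies.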
A central fact is that an estimator of a symmetric property can be taken to depend only on the profile $\phi$, not on the labeling. Classic properties such as support size ($\sum_{x} \mathbf{1}[p_x > 0]$), Shannon entropy ($-\sum_{x} p_x \log p_x$), and $\ell_1$-distance to uniformity fit this formalism.
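The profile described above can be illustrated with a short Python sketch. The function name `profile` and the toy samples are illustrative assumptions, not taken from the source; the point is only that relabeling the symbols leaves the profile unchanged.

```python
from collections import Counter

def profile(samples):
    """Return the profile as a dict {j: number of symbols seen exactly j times}."""
    counts = Counter(samples)               # per-symbol counts N_x
    return dict(Counter(counts.values()))   # phi_j = #{x : N_x = j}

samples = ["a", "b", "a", "c", "a", "b"]

# Apply an arbitrary relabeling (a permutation of the symbol names).
relabel = {"a": "z", "b": "y", "c": "x"}
relabeled = [relabel[s] for s in samples]

# One symbol seen 3 times, one seen twice, one seen once.
print(profile(samples))
# The profile is invariant under relabeling.
print(profile(samples) == profile(relabeled))
```

Any estimator written as a function of `profile(samples)` is therefore automatically label-invariant.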
2. The Plug-in Estimator and the “Easy” Regime
When the per-symbol sample count is large, empirical estimation (the plug-in estimator $f(\hat{p})$, where $\hat{p}_x = N_x / n$) is minimax-optimal for many properties $f$. Specifically, for $g$ smooth in the region where $p_x$ is “large”, and $n$ exceeding the effective support scale (tuned to the property), the bias and variance are controlled:
- For Shannon entropy, if 3, the sample complexity is 4.
- For 5-distance to uniformity, in the regime 6, 7 suffices.
In these “easy” regions, the PAS framework reverts to the empirical estimator, unifying both the trivial and complex regimes.
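As a rough illustration of the plug-in estimator for Shannon entropy, the following Python sketch evaluates it on a well-sampled and on an undersampled toy input. The sample sizes, alphabet sizes, and the name `plugin_entropy` are assumptions chosen for the example.

```python
import math
import random
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (empirical) Shannon entropy in nats."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

# Well-sampled case: 4 symbols, each observed 250 times.
easy = list("abcd") * 250
print(plugin_entropy(easy), math.log(4))

# Undersampled case: 100 draws from a uniform distribution on 1000 symbols.
rng = random.Random(0)
hard = [rng.randrange(1000) for _ in range(100)]
print(plugin_entropy(hard), math.log(1000))
```

In the second case the plug-in value cannot exceed $\log n$ (here $\log 100$), so it necessarily underestimates the true entropy $\log 1000$. This is the kind of bias that motivates treating rarely seen symbols separately.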
3. The Difficult Region: Profile Maximum Likelihood and Pseudo-PML
When $p_x$ becomes small, or $g$ is nonsmooth in the relevant range, empirical methods suffer from significant bias. For such hard regimes, the Profile Maximum Likelihood (PML) estimator is introduced:
- The PML distribution $p_{\mathrm{PML}}$ solves $p_{\mathrm{PML}} \in \arg\max_{p} \Pr_{p}(\phi)$, maximizing the probability of observing the sample profile $\phi$ under $p$.
- Acharya–Das–Orlitsky–Suresh (ADOS’16) established that substituting 5 into 6 yields universal minimax-optimal estimators for bounded 7.
However, exact PML is computationally intractable in practice, and no efficient exact algorithm is known. The PAS framework replaces exact PML with computationally feasible approximate variants (pseudo-PML) by restricting attention to subsets $S \subseteq \mathcal{X}$ of “difficult” symbols (typically those with small counts). The $S$-pseudo-profile $\phi_S$ is computed, and its corresponding pseudo-PML $p_S$ optimized, only over $S$, exploiting lower complexity for tractability (e.g., via convex relaxations or Sinkhorn scaling).
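To make the PML objective concrete, here is a brute-force Python sketch for a tiny case. It is not the pseudo-PML algorithm: it assumes a fixed two-symbol alphabet, enumerates all sequences, and uses a plain grid search. All function names are hypothetical.

```python
import itertools
import math
from collections import Counter

def profile(seq):
    """Profile as a sorted tuple of (multiplicity j, number of symbols seen j times)."""
    return tuple(sorted(Counter(Counter(seq).values()).items()))

def profile_probability(p, target, n):
    """Probability that n i.i.d. draws from p have profile `target`.

    Sums over all len(p)**n sequences, so it is only usable for tiny inputs.
    """
    total = 0.0
    for seq in itertools.product(range(len(p)), repeat=n):
        if profile(seq) == target:
            total += math.prod(p[x] for x in seq)
    return total

def grid_pml_two_symbols(target, n, steps=100):
    """Grid search over p = (q, 1 - q); a toy stand-in for the PML optimization."""
    best_q, best_val = None, -1.0
    for i in range(steps + 1):
        q = i / steps
        val = profile_probability((q, 1 - q), target, n)
        if val > best_val:
            best_q, best_val = q, val
    return best_q, best_val

observed = ["a", "a", "b"]   # one symbol seen twice, one seen once
q, val = grid_pml_two_symbols(profile(observed), n=len(observed))
print(q, val)
```

For this profile the objective is $3q(1-q)$, which the grid search maximizes at $q = 1/2$. The enumeration grows as $k^n$, which illustrates why exact PML does not scale and why restricting the optimization to a small subset of symbols is attractive.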
4. The PAS Framework: Two-Stage Estimation and Sample Complexity
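PAS is a two-stage procedure (subset selection, then estimation), carried out in four steps:
- Sample Splitting: Split the $n$ samples into two independent batches $X^{(1)}$ and $X^{(2)}$.
- Subset Selection: Use $X^{(1)}$ to define the hard subset $S$ (symbols with frequency in a target set $F$), and $G = \mathcal{X} \setminus S$ as the good subset.
- Pseudo-PML Estimation on $S$: On $X^{(2)}$, estimate the $S$-pseudo-profile, and compute a $\beta$-approximate pseudo-PML $p_S^{\beta}$.
- Combined Estimation: For $x \in G$, use plug-in with bias correction. Return
$$\hat{f} = \sum_{x \in S} g\big(p_S^{\beta}(x)\big) + \sum_{x \in G} \tilde{g}(\hat{p}_x),$$
where $\hat{p}_x$ is the empirical frequency of $x$ and $\tilde{g}$ denotes the bias-corrected plug-in term.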
Main sample complexity result: For a property with “complexity” 6 (e.g., 7 for entropy), PAS attains 8 with high probability as soon as
9
This rate matches known instance-optimal bounds throughout both easy and hard regimes.
Comparison with previous PML-based methods ([ADOS’16]): PAS eliminates the need for property-specific polynomial approximations and broadens near-optimality.
5. Algorithmic Structure and Implementation
The PAS algorithm can be summarized as follows:
- Input: $n$ samples, property $f$, threshold set $F$.
- Step 1: Split samples into two batches $X^{(1)}$, $X^{(2)}$.
- Step 2: Define the hard subset $S$ from $X^{(1)}$ via $F$; let $G = \mathcal{X} \setminus S$.
- Step 3: Extract the $S$-pseudo-profile from $X^{(2)}$.
- Step 4: Compute a $\beta$-approximate $S$-pseudo-PML $p_S^{\beta}$ using convex-concave surrogates, Sinkhorn, or local methods.
- Step 5: For $x \in G$, use plug-in with correction; for $x \in S$, use $p_S^{\beta}$.
- Step 6: Return combined estimator as above.
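The control flow of the steps above can be sketched in Python for Shannon entropy. This is a minimal sketch under several assumptions that do not come from the source: the hard subset is chosen by a simple count threshold, the plug-in correction is a first-order (Miller-Madow-style) term, and the pseudo-PML solver is replaced by a pluggable callable `hard_estimator` whose default is a naive plug-in stand-in.

```python
import math
import random
from collections import Counter

def plugin_term(c, n):
    """Plug-in contribution -p log p for a symbol with count c out of n."""
    p = c / n
    return -p * math.log(p) if c > 0 else 0.0

def naive_hard_estimator(counts, n):
    """Placeholder for a pseudo-PML solver (NOT implemented here): plain plug-in."""
    return sum(plugin_term(c, n) for c in counts)

def two_stage_entropy(samples, threshold, hard_estimator=naive_hard_estimator, seed=0):
    """Split / select / estimate / combine, for entropy in nats."""
    # Step 1: random split into two batches.
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:]
    c1, c2 = Counter(first), Counter(second)
    n2 = len(second)

    # Step 2: symbols with a small count in the first batch are "hard".
    hard = {x for x in set(c1) | set(c2) if c1[x] <= threshold}

    # Step 5 (good symbols): plug-in plus a first-order bias correction.
    easy_part = sum(
        plugin_term(c, n2) + (1 - c / n2) / (2 * n2)
        for x, c in c2.items() if x not in hard
    )

    # Steps 3-5 (hard symbols): hand their second-batch counts to the solver.
    hard_counts = [c for x, c in c2.items() if x in hard]
    hard_part = hard_estimator(hard_counts, n2)

    # Step 6: combine.
    return easy_part + hard_part

samples = list("aabbccdd") * 50
print(two_stage_entropy(samples, threshold=5), math.log(4))
```

Passing a different `hard_estimator` swaps the treatment of the hard subset without touching the rest of the pipeline, which mirrors how the framework confines the expensive optimization to $S$.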
The pseudo-PML optimization dominates runtime but is practical for 6 or 7.
6. Applications and Worked Examples
PAS achieves near-optimal sample complexity in the estimation of core symmetric properties:
| Property | Complexity 8 | Sample Complexity | Empirical Suffices When |
|---|---|---|---|
| Shannon entropy 9 | 0 | 1 | 2 |
| 3 to uniformity 4 | 5 | 6 | 7 |
| Support size under 8 | 9 | 0 | all 1 (PML-optimal) |
For entropy, PAS transitions from plug-in in the easy regime to pseudo-PML on the rare-symbol tail, capturing missing-mass behavior. For support size estimation, PML plug-in is minimax-optimal for all 2.
7. Summary and Further Directions
The PAS framework (Charikar et al., 2020) provides a unified, instance-optimal estimation strategy for a broad class of separable symmetric properties, integrating plug-in estimators and tractable PML-based correction. The key algorithmic insight is constraining expensive optimization to small “hard” subsets, ensuring computational feasibility while matching information-theoretic lower bounds. Open questions include developing polynomial-time 3 approximations for general PML, extending to non-separable properties (e.g., Rényi entropy), and generalizing PAS to more complex statistical settings (e.g., multi-sample estimation, testing).
Secondary usage—PAS in geometric symmetry detection (Korman et al., 2014)—follows similar probabilistic-approximate principles. Here, a rigid transformation 4 is an 5-symmetry of a shape 6 if its distortion in 7 norm (integrating the level-set difference over the ball 8) is at most 9. The algorithm samples 00 at density tied to the total variation of the shape, using sublinear random sampling to estimate distortion, and achieves 01-probability correctness within user-specified accuracy 02 and complexity 03.
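The idea of estimating a transformation's distortion from random samples can be sketched as follows. This is not the algorithm of Korman et al.: it assumes a binary 2D shape given as an indicator function, measures distortion as the fraction of the unit disk on which the shape and its transformed copy disagree, and uses plain Monte Carlo sampling. All names and parameters are hypothetical.

```python
import math
import random

def sample_unit_disk(rng):
    """Draw a point uniformly from the unit disk."""
    r = math.sqrt(rng.random())
    theta = 2 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

def estimate_distortion(shape, transform, num_samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of the disk where shape and shape∘T differ."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(num_samples):
        x, y = sample_unit_disk(rng)
        if shape(x, y) != shape(*transform(x, y)):
            mismatches += 1
    return mismatches / num_samples

def ellipse(x, y):
    """Indicator of an axis-aligned ellipse with semi-axes 0.8 and 0.4."""
    return (x / 0.8) ** 2 + (y / 0.4) ** 2 <= 1.0

def reflect_y_axis(x, y):
    return -x, y

def rotate_quarter_turn(x, y):
    return -y, x

eps = 0.05
d_reflect = estimate_distortion(ellipse, reflect_y_axis)
d_rotate = estimate_distortion(ellipse, rotate_quarter_turn)
print(d_reflect <= eps)   # reflection is accepted as an approximate symmetry
print(d_rotate <= eps)    # quarter-turn rotation is rejected
```

The number of samples, not the resolution of the shape, drives the cost. That is the sense in which sampling-based distortion estimates can be sublinear in the input size.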
Both paradigms demonstrate the power of combining probabilistic correctness with approximate or subsampled optimization to achieve theoretical tightness and computational practicality—unifying the theory and practice of symmetric estimation and detection across statistics and geometry (Charikar et al., 2020, Korman et al., 2014).