Generalization of regeneration stability to high-cardinality educational features

Determine whether the regeneration stability of the Non-Parametric Gaussian Copula (NPGC) synthesizer, measured as preservation of fidelity across repeated iterations of a synthetic feedback loop, generalizes to datasets containing high-cardinality educational features such as course identifiers or learning objective codes.

Background

The paper introduces the Non-Parametric Gaussian Copula (NPGC) synthesizer and evaluates its regeneration stability under a synthetic feedback loop (iterative regeneration) protocol, showing minimal degradation of fidelity across iterations on the Adult dataset.

In the Discussion, the authors note as a limitation that their regeneration stability analysis was conducted on a single dataset. They explicitly leave open whether these stability properties extend to high-cardinality educational variables (e.g., course IDs or learning objective codes).
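
To make the protocol concrete, the following is a minimal sketch of a synthetic feedback loop applied to a single high-cardinality categorical column. It is not the NPGC synthesizer: the stand-in `resample_marginal` simply draws with replacement from the empirical marginal, and the column of 500 course IDs, the Zipf-like weights, and the choice of total variation distance as the fidelity measure are all assumptions made for illustration. The paper's actual fidelity metrics and protocol details may differ.

```python
import random
from collections import Counter


def total_variation(p, q):
    """Total variation distance between two count dictionaries."""
    n_p, n_q = sum(p.values()), sum(q.values())
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0) / n_p - q.get(k, 0) / n_q) for k in keys)


def resample_marginal(column, rng):
    """Stand-in synthesizer (NOT NPGC): sample with replacement
    from the empirical marginal of the training column."""
    return rng.choices(column, k=len(column))


def feedback_loop(real, generations, rng):
    """Fit each generation on the previous generation's output and
    record (fidelity to the real data, number of distinct categories)."""
    real_counts = Counter(real)
    current = real
    history = []
    for _ in range(generations):
        current = resample_marginal(current, rng)
        counts = Counter(current)
        history.append((total_variation(real_counts, counts), len(counts)))
    return history


rng = random.Random(0)

# Hypothetical high-cardinality column: 500 course IDs, Zipf-like frequencies.
course_ids = [f"COURSE_{i:04d}" for i in range(500)]
weights = [1.0 / (i + 1) for i in range(500)]
real = rng.choices(course_ids, weights=weights, k=5000)

history = feedback_loop(real, generations=10, rng=rng)
for gen, (tv, support) in enumerate(history, start=1):
    print(f"generation {gen}: TV distance = {tv:.3f}, distinct IDs = {support}")
```

With this stand-in, a category that is missed in one generation can never reappear in a later one, so the number of distinct IDs can only stay flat or shrink. That is a property of the toy resampler, not a claim about NPGC, but it illustrates why rare levels of a high-cardinality feature are the natural place to look for degradation under iterative regeneration.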

References

Additionally, our regeneration stability analysis was conducted on a single dataset; whether these properties generalize to high-cardinality educational features (e.g., course IDs or learning objective codes) remains an open question.

Stable and Privacy-Preserving Synthetic Educational Data with Empirical Marginals: A Copula-Based Approach (2604.04195 - Ramos et al., 5 Apr 2026) in Section 7 (Discussion), Limitations and scope conditions