Two Embedding Theorems for Data with Equivalences under Finite Group Action (1207.6986v2)
Abstract: There is recent interest in compressing data sets for non-sequential settings, where the lack of an obvious ordering on the data space requires notions of data equivalence to be considered. For example, Varshney & Goyal (DCC, 2006) considered multiset equivalences, while Choi & Szpankowski (IEEE Trans. IT, 2012) considered isomorphic equivalences in graphs. Here equivalences are considered under a relatively broad framework: finite-dimensional, non-sequential data spaces with equivalences under group action, for which analogues of two well-studied embedding theorems are derived, namely the Whitney embedding theorem and the Johnson-Lindenstrauss lemma. Only the canonical data points need to be carefully embedded, each such point representing a set of data points equivalent under group action. Two-step embeddings are considered: first, a group invariant is applied to account for equivalences; second, a linear embedding takes the result down to low dimensions. Our results require hypotheses on the discriminability of the applied invariant, notions related to separating invariants (Dufresne, 2008) and completeness in pattern recognition (Kakarala, 1992). In the latter theorem (the Johnson-Lindenstrauss analogue), the embedding complexity depends on the size of the canonical part, which may be significantly smaller than the whole data set, by up to a factor equal to the size of the group.
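
The two-step embedding described in the abstract can be illustrated with a minimal sketch. It assumes one specific case: the symmetric group acting on vectors by permuting coordinates (multiset equivalence), sorting as the group invariant, and a random Gaussian matrix as the linear step. These choices, and the names `sort_invariant`, `random_projection` and `embed`, are illustrative assumptions and not the paper's construction.

```python
import random


def sort_invariant(x):
    """Step 1 (assumed invariant): sorting is unchanged by any
    permutation of coordinates, and two vectors have the same sorted
    form exactly when one is a permutation of the other."""
    return sorted(x)


def random_projection(n, k, seed=0):
    """Step 2 (assumed linear map): a k-by-n matrix with i.i.d.
    Gaussian entries scaled by 1/sqrt(k), in the style of
    Johnson-Lindenstrauss random projections."""
    rng = random.Random(seed)
    scale = 1.0 / k ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]


def embed(x, matrix):
    """Apply the invariant, then the linear map."""
    v = sort_invariant(x)
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


n, k = 8, 3
A = random_projection(n, k)

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
x_perm = x[::-1]  # same multiset as x, different order
y = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0]  # a different multiset

# Equivalent points share one embedding, so only the canonical
# (here: sorted) representative ever reaches the linear step.
print(embed(x, A) == embed(x_perm, A))
print(len(embed(y, A)))
```

In this sketch the linear map only ever sees sorted vectors, which mirrors the abstract's point that only canonical data points need to be carefully embedded. How small `k` can be while still preserving distances between canonical points is what the paper's theorems address; the value used here is arbitrary.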