Beyond Black Box Densities: Parameter Learning for the Deviated Components (2202.02651v2)
Abstract: As we collect additional samples from a data population for which a known density function estimate may have been previously obtained by a black box method, the increased complexity of the data set may result in the true density being deviated from the known estimate by a mixture distribution. To model this phenomenon, we consider the \emph{deviating mixture model} $(1-\lambda^{*})h_0 + \lambda^{*} (\sum_{i = 1}^{k} p_{i}^{*} f(x|\theta_{i}^{*}))$, where $h_0$ is a known density function, while the deviated proportion $\lambda^{*}$ and the latent mixing measure $G_{*} = \sum_{i = 1}^{k} p_{i}^{*} \delta_{\theta_i^{*}}$ associated with the mixture distribution are unknown. Via a novel notion of distinguishability between the known density $h_{0}$ and the deviated mixture distribution, we establish rates of convergence for the maximum likelihood estimates of $\lambda^{*}$ and $G_{*}$ under the Wasserstein metric. Simulation studies are carried out to illustrate the theory.
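
To make the model definition concrete, the following is a minimal sketch of the deviating mixture density and the log-likelihood that a maximum likelihood estimate would maximize. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not fix $h_0$ or the family $f$, so the code assumes a standard normal $h_0$ and a unit-variance Gaussian location family $f(x|\theta)$. The function names `normal_pdf`, `deviating_mixture_pdf`, and `log_likelihood` are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Gaussian density; used here for both h_0 and f (an assumption)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def deviating_mixture_pdf(x, lam, weights, thetas, h0=normal_pdf, f=normal_pdf):
    """Evaluate (1 - lam) * h0(x) + lam * sum_i weights[i] * f(x | thetas[i])."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(weights) != len(thetas):
        raise ValueError("weights and thetas must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    # Deviated component: a finite mixture with mixing measure sum_i p_i * delta_{theta_i}.
    mixture = sum(p * f(x, theta) for p, theta in zip(weights, thetas))
    return (1.0 - lam) * h0(x) + lam * mixture


def log_likelihood(samples, lam, weights, thetas):
    """Log-likelihood of i.i.d. samples under the deviating mixture model."""
    return sum(
        math.log(deviating_mixture_pdf(x, lam, weights, thetas)) for x in samples
    )
```

Setting `lam=0` recovers the known density $h_0$, and `lam=1` recovers the pure mixture. A parameter estimate in this sketch would be any `(lam, weights, thetas)` that maximizes `log_likelihood` over the observed samples; the optimizer itself is left out because the abstract does not specify one.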