The Sample Complexity of Multi-Distribution Learning for VC Classes (2307.12135v1)
Published 22 Jul 2023 in cs.LG and stat.ML
Abstract: Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k)(d + k) + \min\{\epsilon^{-1} dk, \epsilon^{-4} \ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
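For reference, the two bounds from the abstract can be placed side by side (same notation: VC dimension $d$, $k$ distributions, accuracy $\epsilon$); the gap the paper highlights lies in the extra $\ln(k)$ factor on $d$ and the $\min\{\cdot\}$ term of the upper bound:

$$
\text{Upper bound: } O\!\big(\epsilon^{-2}\ln(k)(d + k) + \min\{\epsilon^{-1} dk,\ \epsilon^{-4}\ln(k)\, d\}\big),
\qquad
\text{Lower bound: } \Omega\!\big(\epsilon^{-2}(d + k\ln(k))\big).
$$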