How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders (2210.08344v2)

Published 15 Oct 2022 in cs.LG and cs.CV

Abstract: Masked Autoencoders (MAE) based on a reconstruction task have risen to be a promising paradigm for self-supervised learning (SSL) and achieve state-of-the-art performance across different benchmark datasets. However, despite its impressive empirical success, there is still limited theoretical understanding of it. In this paper, we propose a theoretical understanding of how masking matters for MAE to learn meaningful features. We establish a close connection between MAE and contrastive learning, which shows that MAE implicitly aligns the mask-induced positive pairs. Built upon this connection, we develop the first downstream guarantees for MAE methods and analyze the effect of the mask ratio. Moreover, as a result of the implicit alignment, we also point out the dimensional collapse issue of MAE, and propose a Uniformity-enhanced MAE (U-MAE) loss that can effectively address this issue and bring significant improvements on real-world datasets, including CIFAR-10, ImageNet-100, and ImageNet-1K. Code is available at https://github.com/zhangq327/U-MAE.
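The U-MAE loss described above augments the standard MAE reconstruction objective with a uniformity regularizer to counter dimensional collapse. Below is a minimal PyTorch sketch of such an objective; the function name `u_mae_loss`, the weight `lam`, and the squared pairwise-similarity penalty are illustrative assumptions, not necessarily the paper's exact formulation (the authors' implementation is in the linked repository).

```python
import torch
import torch.nn.functional as F

def u_mae_loss(pred, target, mask, z, lam=0.01):
    """Hypothetical sketch of a U-MAE-style objective.

    pred, target: (B, N, D) per-patch reconstructions and pixel targets
    mask:         (B, N) binary mask, 1 = masked (reconstructed) patch
    z:            (B, d) pooled encoder features for the batch
    lam:          weight of the uniformity regularizer (assumed name/value)
    """
    # Standard MAE reconstruction loss, averaged over masked patches only.
    rec = ((pred - target) ** 2).mean(dim=-1)            # (B, N) per-patch MSE
    rec_loss = (rec * mask).sum() / mask.sum().clamp(min=1)

    # Uniformity penalty: discourage dimensional collapse by penalizing
    # high pairwise similarity between normalized features in the batch.
    z = F.normalize(z, dim=-1)
    sim = z @ z.t()                                       # (B, B) cosine similarities
    B = z.size(0)
    off_diag = sim[~torch.eye(B, dtype=torch.bool, device=z.device)]
    unif_loss = (off_diag ** 2).mean()

    return rec_loss + lam * unif_loss
```

In this sketch, setting `lam=0` recovers a plain MAE reconstruction loss, so the regularizer can be ablated directly, in the spirit of the paper's comparison between MAE and U-MAE.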

Authors (3)
  1. Qi Zhang (785 papers)
  2. Yifei Wang (141 papers)
  3. Yisen Wang (120 papers)
Citations (59)
