
Data Transformation Insights in Self-supervision with Clustering Tasks (2002.07384v1)

Published 18 Feb 2020 in stat.ML, cs.LG, math.ST, and stat.TH

Abstract: Self-supervision is key to extending the use of deep learning to label-scarce domains. For most self-supervised approaches, data transformations play an important role. However, until now the impact of transformations has not been studied. Furthermore, different transformations may have different impacts on the system. We provide novel insights into the use of data transformations in self-supervised tasks, especially those pertaining to clustering. We show theoretically and empirically that a certain set of transformations is helpful for the convergence of self-supervised clustering. We also show cases where transformations are not helpful or are even harmful. We show a faster convergence rate with valid transformations for convex as well as a certain family of non-convex objectives, along with a proof of convergence to the original set of optima. We run experiments on synthetic as well as real-world data. Empirically, our results conform with the theoretical insights provided.

Authors (5)
  1. Abhimanu Kumar (10 papers)
  2. Aniket Anand Deshmukh (15 papers)
  3. Urun Dogan (19 papers)
  4. Denis Charles (17 papers)
  5. Eren Manavoglu (7 papers)
Citations (1)
