
Localized Latent Updates for Fine-Tuning Vision-Language Models (2212.06556v1)

Published 13 Dec 2022 in cs.CV, cs.CL, and cs.LG

Abstract: Although massive pre-trained vision-language models like CLIP show impressive generalization capabilities for many tasks, it still often remains necessary to fine-tune them for improved performance on specific datasets. When doing so, it is desirable that updating the model is fast and that the model does not lose its capabilities on data outside of the dataset, as is often the case with classical fine-tuning approaches. In this work we suggest a lightweight adapter that only updates the model's predictions close to seen datapoints. We demonstrate the effectiveness and speed of this relatively simple approach in the context of few-shot learning, where our results both on classes seen and unseen during training are comparable with or improve on the state of the art.
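The core idea, an adapter whose effect is confined to the neighborhood of the few-shot training examples, can be sketched roughly as follows. This is a minimal illustration only, assuming a Gaussian kernel over cosine distances to stored support embeddings and a zero-initialized linear correction; the class name, the `bandwidth` parameter, and the blending scheme are assumptions made for the sketch, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

class LocalizedAdapter(torch.nn.Module):
    """Hypothetical locality-gated adapter on top of frozen CLIP features.

    A learned correction is blended into the latent feature only near the
    stored support (few-shot) embeddings; far from them, the frozen
    model's feature passes through unchanged, preserving behavior on
    out-of-dataset inputs.
    """

    def __init__(self, support_feats: torch.Tensor, dim: int, bandwidth: float = 0.1):
        super().__init__()
        # Frozen, L2-normalized embeddings of the few-shot training images (N x D).
        self.register_buffer("support", F.normalize(support_feats, dim=-1))
        self.bandwidth = bandwidth
        # Small trainable correction; zero-initialized so the adapter
        # starts as a no-op.
        self.delta = torch.nn.Linear(dim, dim, bias=False)
        torch.nn.init.zeros_(self.delta.weight)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        feats = F.normalize(feats, dim=-1)            # B x D
        # Cosine distance to the nearest support point, mapped to a
        # locality weight in [0, 1] via a Gaussian-style kernel.
        sims = feats @ self.support.t()               # B x N
        dist = 1.0 - sims.max(dim=-1).values          # B
        weight = torch.exp(-dist / self.bandwidth)    # B
        # Blend: apply the learned update near support points, keep the
        # frozen feature elsewhere.
        updated = feats + self.delta(feats)
        w = weight.unsqueeze(-1)
        return F.normalize(w * updated + (1.0 - w) * feats, dim=-1)
```

Because only the small `delta` layer is trained while the backbone stays frozen, updating the model under such a scheme is fast, and the locality gate is what keeps predictions on distant (unseen) data close to the original model's.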

Authors (3)
  1. Moritz Ibing (6 papers)
  2. Isaak Lim (9 papers)
  3. Leif Kobbelt (23 papers)
Citations (1)
