Papers
Topics
Authors
Recent
2000 character limit reached

Personalization Toolkit: Training Free Personalization of Large Vision Language Models

Published 4 Feb 2025 in cs.CV | (2502.02452v2)

Abstract: Large Vision LLMs (LVLMs) have significant potential to provide personalized assistance by adapting to the unique needs and preferences of individual users. The personalization of LVLMs has emerged as a field that focuses on customizing models to recognize specific object instances and provide tailored responses. However, current methodologies depend on time-consuming test-time training for each user and object, which proves to be impractical. This paper introduces a novel, training-free approach to LVLM personalization by leveraging pre-trained vision foundation models to extract distinct features, retrieval-augmented generation (RAG) techniques to recognize instances in the visual input, and visual prompting methods. Our model-agnostic vision toolkit enables flexible and efficient personalization without the need for extensive retraining. We demonstrate state-of-the-art results, surpassing conventional training-based approaches, and set a new benchmark for LVLM personalization.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.