Personalization Toolkit: Training Free Personalization of Large Vision Language Models

Published 4 Feb 2025 in cs.CV | (2502.02452v2)

Abstract: Large Vision Language Models (LVLMs) have significant potential to provide personalized assistance by adapting to the unique needs and preferences of individual users. The personalization of LVLMs has emerged as a field that focuses on customizing models to recognize specific object instances and provide tailored responses. However, current methodologies depend on time-consuming test-time training for each user and object, which is impractical. This paper introduces a novel, training-free approach to LVLM personalization by leveraging pre-trained vision foundation models to extract distinct features, retrieval-augmented generation (RAG) techniques to recognize instances in the visual input, and visual prompting methods. Our model-agnostic vision toolkit enables flexible and efficient personalization without the need for extensive retraining. We demonstrate state-of-the-art results, surpassing conventional training-based approaches, and set a new benchmark for LVLM personalization.
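The retrieval step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each personalized object is stored as a feature vector (here hand-written toy vectors standing in for the output of a pre-trained vision model), that recognition is done by cosine similarity against a fixed threshold, and that a match is passed to the LVLM as extra text in the prompt. The names `InstanceMemory`, `retrieve`, `build_prompt`, and the threshold value are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InstanceMemory:
    """Hypothetical store of reference features for personalized objects."""

    def __init__(self):
        self.entries = []  # list of (name, embedding, context) tuples

    def add(self, name, embedding, context):
        # No training happens here: adding an object is a single append.
        self.entries.append((name, list(embedding), context))

    def retrieve(self, query, threshold=0.8):
        """Return (name, context, score) for the best match, or None."""
        best = None
        for name, embedding, context in self.entries:
            score = cosine(query, embedding)
            if score >= threshold and (best is None or score > best[2]):
                best = (name, context, score)
        return best


def build_prompt(question, match):
    """Prepend retrieved context to the question (assumed prompt format)."""
    if match is None:
        return question
    name, context, _ = match
    return f"The highlighted object is {name}. {context}\n{question}"


# Toy vectors stand in for features from a vision foundation model.
memory = InstanceMemory()
memory.add("my mug", [1.0, 0.0, 0.0], "It is a blue ceramic mug.")
memory.add("my dog", [0.0, 1.0, 0.0], "He is a three-year-old beagle.")

match = memory.retrieve([0.9, 0.1, 0.0])
prompt = build_prompt("Where is it in this image?", match)
```

In this sketch, registering a new object only appends one entry to the memory, which is the sense in which a retrieval-based design avoids per-user and per-object training.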

Authors (4)