Fast Vocabulary Transfer for Language Model Compression (2402.09977v1)
Abstract: Real-world business applications require a trade-off between LLM performance and size. We propose a new method for model compression that relies on vocabulary transfer. We evaluate the method on various vertical domains and downstream tasks. Our results indicate that vocabulary transfer can be effectively used in combination with other compression techniques, yielding a significant reduction in model size and inference time while marginally compromising on performance.
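The abstract does not spell out the transfer step itself, but a common way to realize vocabulary transfer is to train a new, in-domain tokenizer and initialize each of its token embeddings from the original model's embeddings of that token's decomposition under the original tokenizer. The sketch below illustrates that idea only; the function name `transfer_vocabulary`, the mean-pooling initialization, and the checkpoint/tokenizer paths are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of vocabulary transfer (assumption: each new token's embedding is
# initialized as the mean of the old embeddings of its sub-tokens under the old tokenizer).
import torch
from transformers import AutoModel, AutoTokenizer


def transfer_vocabulary(old_tokenizer, new_tokenizer, old_embeddings: torch.Tensor) -> torch.Tensor:
    """Build an embedding matrix for `new_tokenizer` from `old_embeddings`."""
    hidden_size = old_embeddings.size(1)
    new_embeddings = torch.empty(len(new_tokenizer), hidden_size)
    old_vocab = old_tokenizer.get_vocab()

    for token, new_id in new_tokenizer.get_vocab().items():
        if token in old_vocab:
            # Token already exists in the old vocabulary: copy its embedding directly.
            new_embeddings[new_id] = old_embeddings[old_vocab[token]]
            continue
        # Otherwise decompose the new token with the old tokenizer.
        # (Depending on the tokenizer, subword markers such as WordPiece "##" may need stripping.)
        old_ids = old_tokenizer.encode(token, add_special_tokens=False)
        if old_ids:
            new_embeddings[new_id] = old_embeddings[old_ids].mean(dim=0)
        else:
            # Fallback for tokens the old tokenizer cannot decompose.
            new_embeddings[new_id] = old_embeddings.mean(dim=0)
    return new_embeddings


# Hypothetical usage: move a general-domain checkpoint onto an in-domain tokenizer,
# then continue with fine-tuning / distillation as usual.
old_tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
new_tok = AutoTokenizer.from_pretrained("path/to/in-domain-tokenizer")  # placeholder path
model = AutoModel.from_pretrained("distilbert-base-uncased")

new_emb = transfer_vocabulary(old_tok, new_tok, model.get_input_embeddings().weight.data)
model.resize_token_embeddings(len(new_tok))
model.get_input_embeddings().weight.data.copy_(new_emb)
```

A smaller in-domain vocabulary shrinks the embedding matrix, which is where the size and inference-time savings combine naturally with distillation or quantization.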