MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation (2406.07529v4)

Published 11 Jun 2024 in cs.LG

Abstract: Model merging has emerged as an effective approach to combine multiple single-task models into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the objectives of different tasks can lead to trade-offs during the merging process. In real-world applications, a set of solutions with various trade-offs can be more informative, helping practitioners make decisions based on diverse preferences. In this paper, we introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. It amortizes the substantial computational cost of evaluations needed to estimate the Pareto front by using quadratic approximation surrogate models derived from a pre-selected set of scaling coefficients. Experimental results on vision and natural language processing tasks demonstrate that MAP can accurately identify the Pareto front, providing practitioners with flexible solutions to balance competing task objectives. We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
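
The abstract describes the core recipe: merge single-task models with a weighted average controlled by scaling coefficients, evaluate the merged model at a small pre-selected set of coefficients, fit a quadratic surrogate of each task's metric over those coefficients, and then scan the cheap surrogates to approximate the Pareto front. The sketch below illustrates that idea only; it is not the authors' implementation. The toy `true_metric` stands in for actually merging and evaluating models, and all function names and shapes are illustrative assumptions.

```python
# Minimal sketch of the MAP idea (not the authors' code): fit a quadratic
# surrogate of each task metric over the merging coefficients using a few
# real evaluations, then estimate the Pareto front from the surrogates.
import numpy as np

def quadratic_features(C):
    """Map coefficients c -> [1, c_i, c_i * c_j (i <= j)] features."""
    n, k = C.shape
    cross = np.stack([C[:, i] * C[:, j]
                      for i in range(k) for j in range(i, k)], axis=1)
    return np.hstack([np.ones((n, 1)), C, cross])

def fit_surrogates(C_sample, metrics):
    """Least-squares fit of one quadratic surrogate per task.
    C_sample: (n_samples, n_tasks) pre-selected scaling coefficients.
    metrics:  (n_samples, n_tasks) task metrics measured at those coefficients."""
    X = quadratic_features(C_sample)
    coefs, *_ = np.linalg.lstsq(X, metrics, rcond=None)
    return coefs  # (n_features, n_tasks)

def predict(coefs, C):
    return quadratic_features(C) @ coefs

def pareto_front(points):
    """Indices of non-dominated rows (higher is better on every task)."""
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(np.all(points >= p, axis=1) &
                           np.any(points > p, axis=1))
        if not dominated:
            keep.append(i)
    return np.array(keep)

# Toy usage with 2 tasks. In the real setting each row of `C_sample` would
# parameterize a merged model (e.g. a weighted average of task vectors),
# which is then evaluated on every task; here a synthetic metric stands in.
rng = np.random.default_rng(0)
C_sample = rng.uniform(0, 1, size=(15, 2))        # pre-selected coefficients
true_metric = lambda C: np.stack(
    [1 - (C[:, 0] - 0.8) ** 2 - 0.3 * C[:, 1] ** 2,
     1 - (C[:, 1] - 0.8) ** 2 - 0.3 * C[:, 0] ** 2], axis=1)
surrogate = fit_surrogates(C_sample, true_metric(C_sample))

C_dense = rng.uniform(0, 1, size=(5000, 2))       # cheap surrogate-only scan
front = pareto_front(predict(surrogate, C_dense))
print(C_dense[front][:5])                         # sample Pareto-optimal coefficients
```

The amortization comes from the split between the two loops: only the 15 pre-selected coefficient vectors require real (expensive) evaluations, while the 5000-point scan that traces out the trade-off curve runs entirely on the quadratic surrogates.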

Authors (10)
  1. Lu Li (166 papers)
  2. Tianyu Zhang (111 papers)
  3. Zhiqi Bu (42 papers)
  4. Suyuchen Wang (16 papers)
  5. Huan He (45 papers)
  6. Jie Fu (229 papers)
  7. Yonghui Wu (115 papers)
  8. Jiang Bian (229 papers)
  9. Yong Chen (299 papers)
  10. Yoshua Bengio (601 papers)
Citations (3)