
Mixture of Experts in Image Classification: What's the Sweet Spot? (2411.18322v1)

Published 27 Nov 2024 in cs.CV and cs.LG

Abstract: Mixture-of-Experts (MoE) models have shown promising potential for parameter-efficient scaling across various domains. However, the implementation in computer vision remains limited, and often requires large-scale datasets comprising billions of samples. In this study, we investigate the integration of MoE within computer vision models and explore various MoE configurations on open datasets. When introducing MoE layers in image classification, the best results are obtained for models with a moderate number of activated parameters per sample. However, such improvements gradually vanish when the number of parameters per sample increases.
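The abstract's distinction between a model's total parameters and its activated parameters per sample can be made concrete with a small calculation. The sketch below is illustrative only and assumes a generic top-k routed MoE feed-forward layer (a bias-free linear router plus identical two-layer MLP experts); the function name `moe_param_counts` and the layer sizes are hypothetical and are not taken from the paper.

```python
def moe_param_counts(d_model: int, d_hidden: int, num_experts: int, top_k: int):
    """Return (total, active_per_sample) parameter counts for one MoE FFN layer.

    Assumptions (not from the paper): each expert is a two-layer MLP with
    biases, the router is a single bias-free linear map from d_model to
    num_experts, and each input is routed to exactly top_k experts.
    """
    if not 1 <= top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")

    # One expert: Linear(d_model -> d_hidden) + Linear(d_hidden -> d_model), with biases.
    expert = (d_model * d_hidden + d_hidden) + (d_hidden * d_model + d_model)
    # Router: one logit per expert for each input.
    router = d_model * num_experts

    total = router + num_experts * expert   # parameters stored in the layer
    active = router + top_k * expert        # parameters used for a single input
    return total, active


# Hypothetical sizes, chosen only for illustration.
total, active = moe_param_counts(d_model=384, d_hidden=1536, num_experts=8, top_k=2)
print(f"total parameters:  {total:,}")
print(f"active per sample: {active:,}")
print(f"active fraction:   {active / total:.2%}")
```

Under these assumptions, adding experts grows the stored parameter count while the per-sample count depends only on `top_k`, which is the quantity the abstract refers to as activated parameters per sample.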
