Causes of DeBERTa’s underperformance in large universes and variability across small universes

Determine whether the underperformance of the encoder-only DeBERTa model in the large investment universe arises from model size or other factors, and explain the causes of DeBERTa’s varying performance across different small investment universes when used for news-based stock return prediction.

Background

Empirical results in the paper show that decoder-only models, particularly Mistral, perform robustly across investment universes, whereas the encoder-only DeBERTa underperforms in the large universe and gives inconsistent results across the smaller universes.

The paper explicitly leaves the reasons for DeBERTa’s behavior open: it is unclear whether model size or other factors are responsible in the large universe, and why performance varies across the small universes.

Understanding these causes could inform model selection and scaling decisions for financial news-based return prediction.

References

Several open questions remain for future research. For instance, it is unclear whether the underperformance of encoder-only DeBERTa in the large investment universe is due to the model size or other factors, and why DeBERTa has varying performance in different small universes.

— Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow (2407.18103 - Guo et al., 25 Jul 2024) in Conclusion
