Differential impact of appending EOS tokens in encoder-only versus decoder-only LLMs
Determine how appending an End-of-Sequence (EOS) token to the input sequence affects encoder-only models (e.g., DeBERTa) versus decoder-only models (e.g., Mistral, Llama3) when these models are fine-tuned to predict forward stock returns from financial newsflow, and clarify how the impact differs across the two model families.
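The mechanical step under discussion is simply terminating each tokenized news text with the EOS id before fine-tuning. A minimal sketch of that step, assuming a Hugging Face-style `encode()`/`eos_token_id` interface; `DummyTokenizer`, its toy vocabulary, and the sample headline are illustrative stand-ins, not the paper's actual tokenizer or data:

```python
class DummyTokenizer:
    """Toy tokenizer mimicking the encode()/eos_token_id interface
    (illustrative stand-in; not the paper's tokenizer)."""
    eos_token_id = 2  # id conventionally reserved for </s> in many vocabularies

    def encode(self, text):
        # Map each word to a fake vocabulary id >= 3, keeping low ids for specials.
        return [3 + (hash(w) % 100) for w in text.split()]


def encode_with_eos(tokenizer, text):
    """Tokenize `text` and append the EOS token id, as is done for both
    encoder-only and decoder-only LLMs in the setup described above."""
    ids = tokenizer.encode(text)
    return ids + [tokenizer.eos_token_id]


tok = DummyTokenizer()
ids = encode_with_eos(tok, "Company X beats quarterly earnings estimates")
assert ids[-1] == tok.eos_token_id  # sequence now terminates with EOS
```

For encoder-only models the hidden state at this appended EOS position can serve as a bottleneck sequence representation, which is one reason its presence may matter more there than for decoder-only models.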
References
In experiments, we observed that appending the EOS token is more helpful for encoder-only LLMs. For a comparison on the same ground, we append EOS tokens for both encoder-only and decoder-only LLMs and leave the study on the different impacts of appending tokens to future work.
                — Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow
                
                (2407.18103 - Guo et al., 25 Jul 2024) in Section 3.2 (Methodology), Bottleneck Representations vs. Aggregated Representations