
Addressing omitted factor bias by inferring latent factors directly from returns

Develop generative Transformer models, such as Generative Pretrained Transformers (GPT), that infer latent asset-pricing factors directly from returns, thereby mitigating the omitted variable bias inherent in Transformer-based factor models that rely solely on observable characteristic-sorted portfolios. Rigorously evaluate whether this approach improves out-of-sample fit and investment performance relative to observable-factor inputs.


Background

The paper applies Transformer architectures (including SERT and a pre-trained Transformer) to asset pricing using observable characteristic-sorted portfolios as inputs. While effective, this setup is susceptible to omitted factor bias because new or unobserved factors cannot be incorporated directly. The authors suggest that a generative approach that extracts factors from returns may overcome this limitation.

They specifically point to generative Transformer models (e.g., GPT) as candidates for extracting factors directly from returns, thereby potentially addressing missing-factor issues and improving prediction and strategy performance.
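To make the proposal concrete, the sketch below illustrates what "inferring latent factors directly from a return panel" means in the simplest linear case, using an SVD/PCA-style extraction as a stand-in for the generative Transformer the authors envision. All dimensions, variable names, and the simulated data are hypothetical; this is not the paper's method, only a minimal illustration of the input/output shapes involved.

```python
import numpy as np

# Illustrative sketch only: the paper proposes generative Transformers (e.g., GPT)
# to infer latent factors from returns; a linear SVD extraction stands in here.
rng = np.random.default_rng(0)

T, N, K = 240, 50, 3  # months, assets, latent factors (hypothetical sizes)
true_factors = rng.normal(size=(T, K))
loadings = rng.normal(size=(K, N))
returns = true_factors @ loadings + 0.1 * rng.normal(size=(T, N))

# Infer latent factors from the return panel itself -- no observable
# characteristic-sorted portfolios required -- via SVD of demeaned returns.
demeaned = returns - returns.mean(axis=0)
U, S, Vt = np.linalg.svd(demeaned, full_matrices=False)
latent_factors = U[:, :K] * S[:K]  # T x K estimated factor series
est_loadings = Vt[:K]              # K x N estimated factor exposures

# In-sample R^2 as a crude check of how much return variation the
# inferred factors explain.
fitted = latent_factors @ est_loadings
r2 = 1 - ((demeaned - fitted) ** 2).sum() / (demeaned ** 2).sum()
print(round(r2, 3))
```

A generative Transformer would replace the linear SVD step with a learned, nonlinear mapping from returns to factors, which is precisely where the open question lies: whether that added flexibility translates into better out-of-sample fit and strategy performance.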

References

However, some open questions remain for future researchers. For example, because we use observable factors in the factor model, omitted factor issues may arise, causing omitted variable bias or missing factor problems, even though Transformer models already moderate such issues via their autoencoder structure and self-attention mechanism. This limitation stems from the observable input factors: we are not able to add factors that have not yet been explored. Future researchers can attempt alternative models from the Transformer model family for extracting factors directly from the returns, for example, generative Transformer models such as the generative pretrained Transformer (GPT).

Asset Pricing in Pre-trained Transformer (2505.01575 - Lai, 2 May 2025) in Section 6, Conclusion