Applicability of SPY-trained WSJ headline-embedding models to individual stocks

Determine whether machine learning stock prediction models that integrate Wall Street Journal headline embeddings (generated using OpenAI text-embedding models) with macroeconomic and financial inputs, trained and validated on SPDR S&P 500 ETF Trust (SPY) data, can be reliably applied to specific individual stocks and achieve comparable predictive performance.

Background

The paper develops and evaluates multiple machine learning architectures (FFNN, LSTM, GRU, TCN, and NN-HMM) to predict next-day SPY movements, using Wall Street Journal headline embeddings reduced via PCA alongside economic indicators such as the U.S. Dollar Index (DXY) and Treasury yields. Across 390 trained models, the inclusion of headline embeddings substantially improves performance.
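The feature construction described above can be illustrated with a minimal sketch. It is not the paper's code: the data are random stand-ins, and the embedding width (64), the number of retained components (8), the two macro columns, and the helper names `pca_reduce` and `build_features` are all assumptions chosen for illustration.

```python
import numpy as np

def pca_reduce(embeddings, n_components):
    """Project rows onto the top principal components (SVD of centered data).

    Note: this fits PCA on every row passed in. In a real experiment the
    projection would presumably be fitted on the training split only.
    """
    centered = embeddings - embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def build_features(embeddings, macro, n_components):
    """Concatenate PCA-reduced headline embeddings with macro indicators."""
    reduced = pca_reduce(embeddings, n_components)
    return np.hstack([reduced, macro])

# Synthetic stand-ins (assumptions, not real data).
rng = np.random.default_rng(0)
n_days = 50
headline_emb = rng.normal(size=(n_days, 64))  # one embedding per trading day
macro = rng.normal(size=(n_days, 2))          # e.g. DXY and a Treasury yield
close = 100 + np.cumsum(rng.normal(size=n_days))  # hypothetical closing prices

# Features from day t are paired with the direction of the move on day t+1.
X = build_features(headline_emb, macro, n_components=8)[:-1]
y = (close[1:] > close[:-1]).astype(int)
```

Dropping the last feature row and the first price keeps each input aligned with the following day's label, which is what "next-day" prediction requires. Any of the listed architectures could then consume `X` and `y`.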

Despite these results on SPY (a broad market ETF), the authors explicitly identify uncertainty about whether the same modeling approach is transferable to individual equities. They suggest transfer learning as a potential strategy but emphasize that the applicability to specific stocks remains unresolved.
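One way to picture the transfer-learning idea is a warm start: fit a model on SPY, then continue training from those weights on a much smaller single-stock sample. The sketch below is a hypothetical illustration only. It swaps in plain logistic regression for the paper's neural architectures, uses synthetic data, and every name in it (`fit_logistic`, `w_spy`, `w_stock`, the sample sizes) is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    """Mean binary cross-entropy of a linear-logit model."""
    p = np.clip(sigmoid(X @ w), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_logistic(X, y, w_init=None, lr=0.1, steps=200):
    """Batch gradient descent; w_init allows a warm start from another model."""
    w = np.zeros(X.shape[1]) if w_init is None else w_init.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def simulate(rng, n, w_true):
    """Draw synthetic features and up/down labels from a logistic model."""
    X = rng.normal(size=(n, len(w_true)))
    y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
    return X, y

rng = np.random.default_rng(1)
w_market = rng.normal(size=10)                       # hypothetical SPY relationship
w_single = w_market + rng.normal(scale=0.5, size=10)  # related but distinct stock

X_spy, y_spy = simulate(rng, 500, w_market)      # large source sample
X_stock, y_stock = simulate(rng, 60, w_single)   # small target sample

w_spy = fit_logistic(X_spy, y_spy)                                # pretrain on SPY
w_stock = fit_logistic(X_stock, y_stock, w_init=w_spy, steps=50)  # fine-tune

loss_before = log_loss(w_spy, X_stock, y_stock)
loss_after = log_loss(w_stock, X_stock, y_stock)
```

Comparing `loss_before` with `loss_after` only shows that fine-tuning improves the in-sample fit on the target stock. Whether that carries over to held-out data for a real equity is exactly the question this problem leaves open.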

References

However, one remaining challenge is the issue of data reusability: Although the model performs well on general stock predictions, its applicability to specific stocks remains an open question [4].

— News Sentiment Embeddings for Stock Price Forecasting (2507.01970 - Qayyum, 19 Jun 2025) in Subsection "Challenges of Hyperparameter Optimization"
