
Can Base ChatGPT be Used for Forecasting without Additional Optimization? (2404.07396v3)

Published 11 Apr 2024 in econ.GN, cs.AI, and q-fin.EC

Abstract: This study investigates whether OpenAI's ChatGPT-3.5 and ChatGPT-4 can forecast future events. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of our experiments (mid 2023) stopped at September 2021, and ask about events that happened in 2022. We employed two prompting strategies: direct prediction and what we call future narratives which ask ChatGPT to tell fictional stories set in the future with characters retelling events that happened in the past, but after ChatGPT's training data had been collected. We prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 trials, we find that future narrative prompts significantly enhanced ChatGPT-4's forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures like the Federal Reserve Chair, Jerome Powell. As a falsification exercise, we repeated our experiments in May 2024 at which time the models included more recent training data. ChatGPT-4's accuracy significantly improved when the training window included the events being prompted for, achieving 100% accuracy in many instances. The poorer accuracy for events outside of the training window suggests that in the 2023 prediction experiments, ChatGPT-4 was forming predictions based solely on its training data. Narrative prompting also consistently outperformed direct prompting. These findings indicate that narrative prompts leverage the models' capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs' predictive capabilities and suggests potential future applications in analytical contexts.

Unlocking the Predictive Capabilities of GPT-3.5 and GPT-4 Through Innovative Prompting Strategies

Introduction to Predictive Modeling with GPT

The potential of generative LLMs for predictive analysis has attracted significant attention within the AI research community. This paper examines the ability of OpenAI's ChatGPT-3.5 and ChatGPT-4 to forecast future events, namely the winners of the 2022 Academy Awards and economic indicators for late 2021 and early 2022. The models' training data cutoff of September 2021 provides a clean demarcation for the experiments: any accurate answer about 2022 events must be inferred from pre-cutoff data rather than simply recalled.

Methodological Approach

The research employed a dual-prompt strategy to evaluate the models' forecasting precision:

  1. Direct Prediction: Straightforward prompts requesting predictions for specific future outcomes.
  2. Future Narratives: Prompts inviting the models to construct fictional stories set in the future, in which characters recount events occurring after September 2021 as established fact.

The narrative approach, in particular, was designed to sidestep the models' built-in reluctance to make explicit predictions about the future. By engaging GPT-3.5 and GPT-4 in creative storytelling, we explored whether these models would reveal predictive insights woven into their generated text.
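The contrast between the two strategies can be sketched as prompt templates. The wording below is illustrative only; the paper's actual prompts are not reproduced here, and the function names are hypothetical:

```python
# Illustrative templates for the paper's two prompting strategies.
# The exact wording used in the study differs; this only shows the
# structural difference between asking directly and asking via narrative.

def direct_prompt(event: str) -> str:
    """Direct prediction: ask the model for the outcome outright."""
    return f"Predict the following outcome: {event}"

def future_narrative_prompt(event: str, narrator: str, setting_year: int) -> str:
    """Future narrative: frame a story set after the event, so the model
    'recalls' the outcome as past fact instead of predicting it."""
    return (
        f"Write a scene set in {setting_year} in which {narrator} "
        f"looks back and recounts, as established fact, {event}"
    )

event = "the winner of Best Actor at the 2022 Academy Awards"
print(direct_prompt(event))
print(future_narrative_prompt(event, "a film critic", 2023))
```

Sending each template to the model and parsing the outcome from the story, rather than from a refusal-prone direct answer, is the core of the narrative method.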

Findings and Implications

Predictive Performance on the Academy Awards

The paper found a marked difference in predictive accuracy between direct and narrative prompts. GPT-4, when given narrative prompts, correctly anticipated the winners of several major Academy Award categories at rates well above those achieved with direct prompting. The Best Picture category, however, remained elusive, suggesting that the model's forecasting degrades in categories with broader nominee pools.
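The evaluation itself reduces to tallying, across repeated trials, how often each strategy's responses name the true outcome. A minimal sketch of that scoring logic follows; the trial responses below are made up for illustration, though Will Smith was in fact the 2022 Best Actor winner:

```python
# Minimal sketch of the evaluation: score each trial response against
# the known 2022 outcome and compare accuracy across strategies.

def accuracy(responses, truth):
    """Fraction of trial responses that mention the true outcome."""
    hits = sum(1 for r in responses if truth.lower() in r.lower())
    return hits / len(responses)

truth = "Will Smith"  # actual 2022 Best Actor winner

# Hypothetical trial outputs, not data from the paper:
direct_runs = ["I cannot predict that", "Will Smith", "Uncertain"]
narrative_runs = ["Will Smith", "Will Smith", "Denzel Washington"]

print(f"direct:    {accuracy(direct_runs, truth):.2f}")
print(f"narrative: {accuracy(narrative_runs, truth):.2f}")
```

The paper's 100-trial design follows this shape: many repeated prompts per strategy, with accuracy computed as the proportion of trials naming the eventual winner.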

Economic Forecasting

The narrative prompting approach also shed light on GPT-4's potential in economic forecasting. While direct prompts yielded negligible insights, fictional narratives attributed to authoritative figures like Federal Reserve Chair Jerome Powell revealed surprisingly cogent predictions of inflation and unemployment rates. These findings underscored the narrative method's effectiveness in eliciting indirectly predictive data from the models.

Ethical and Practical Considerations

The paper's findings prompt a discussion on the ethical parameters governing GPT's use for predictive tasks. The success of narrative prompts in bypassing direct prediction limitations raises questions about the broader implications for LLM applications in sensitive areas such as finance and healthcare. Aligning the creative exploitation of these models with OpenAI's ethical guidelines necessitates a nuanced understanding of their operational frameworks and potential societal impacts.

Future Directions

The differential success rates across various prediction tasks invite further exploration into refining prompting techniques. Future research could delve into:

  • The underlying mechanisms enabling narrative prompts to elicit more accurate predictions.
  • The development of hybrid prompting strategies that balance direct and narrative elements.
  • The exploration of other domains where LLMs might offer predictive utility, guided by ethical considerations.

Conclusion

This investigation into the predictive capabilities of GPT-3.5 and GPT-4 highlights the untapped potential of LLMs as forecasting tools. By leveraging creative prompting strategies, we can enhance our understanding and utilization of these models beyond their conventional applications. As we continue to explore the frontier of AI's predictive prowess, maintaining a commitment to ethical standards will be paramount in harnessing the full potential of this technology for the benefit of society.

Authors (2)
  1. Van Pham
  2. Scott Cunningham