
Can Base ChatGPT be Used for Forecasting without Additional Optimization? (2404.07396v3)

Published 11 Apr 2024 in econ.GN, cs.AI, and q-fin.EC

Abstract: This study investigates whether OpenAI's ChatGPT-3.5 and ChatGPT-4 can forecast future events. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of our experiments (mid 2023) stopped at September 2021, and ask about events that happened in 2022. We employed two prompting strategies: direct prediction and what we call future narratives which ask ChatGPT to tell fictional stories set in the future with characters retelling events that happened in the past, but after ChatGPT's training data had been collected. We prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 trials, we find that future narrative prompts significantly enhanced ChatGPT-4's forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures like the Federal Reserve Chair, Jerome Powell. As a falsification exercise, we repeated our experiments in May 2024 at which time the models included more recent training data. ChatGPT-4's accuracy significantly improved when the training window included the events being prompted for, achieving 100% accuracy in many instances. The poorer accuracy for events outside of the training window suggests that in the 2023 prediction experiments, ChatGPT-4 was forming predictions based solely on its training data. Narrative prompting also consistently outperformed direct prompting. These findings indicate that narrative prompts leverage the models' capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs' predictive capabilities and suggests potential future applications in analytical contexts.

Unlocking the Predictive Capabilities of GPT-3.5 and GPT-4 Through Innovative Prompting Strategies

Introduction to Predictive Modeling with GPT

The potential of generative LLMs for predictive analysis has attracted significant attention within the AI research community. This paper examines the ability of OpenAI's ChatGPT-3.5 and ChatGPT-4 to forecast future events, namely the winners of the 2022 Academy Awards and economic indicators for late 2021 and early 2022. The models' training data cutoff of September 2021 provides a clean demarcation for the experiments: any accurate answer about 2022 events must be inferred from pre-cutoff data rather than simply recalled.

Methodological Approach

The research employed a dual-prompt strategy to evaluate the models' forecasting precision:

  1. Direct Prediction: Straightforward prompts requesting predictions for specific future outcomes.
  2. Future Narratives: Prompts inviting the models to construct fictional stories set in the future, in which characters recount events occurring after September 2021 as established fact.

The narrative approach, in particular, was designed to sidestep the models' built-in reluctance to make explicit predictions about the future. By engaging GPT-3.5 and GPT-4 in creative storytelling, we explored whether these models would reveal predictive insights woven into their generated text.
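The contrast between the two strategies can be sketched as prompt templates. The wording below is illustrative only; the paper's actual prompts are not reproduced here, and the function names are hypothetical:

```python
# Illustrative templates for the paper's two prompting strategies.
# The exact wording used in the study differs; this only shows the
# structural difference between asking directly and asking via narrative.

def direct_prompt(event: str) -> str:
    """Direct prediction: ask the model for the outcome outright."""
    return f"Predict the following outcome: {event}"

def future_narrative_prompt(event: str, narrator: str, setting_year: int) -> str:
    """Future narrative: frame a story set after the event, so the model
    'recalls' the outcome as past fact instead of predicting it."""
    return (
        f"Write a scene set in {setting_year} in which {narrator} "
        f"looks back and recounts, as established fact, {event}"
    )

event = "the winner of Best Actor at the 2022 Academy Awards"
print(direct_prompt(event))
print(future_narrative_prompt(event, "a film critic", 2023))
```

Sending each template to the model and parsing the outcome from the story, rather than from a refusal-prone direct answer, is the core of the narrative method.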

Findings and Implications

Predictive Performance on the Academy Awards

The paper found a marked difference in predictive accuracy between direct and narrative prompts. GPT-4, when given narrative prompts, correctly anticipated the winners of several major Academy Award categories at rates well above those achieved with direct prompting. The Best Picture category, however, remained elusive, suggesting that the model's forecasting degrades in categories with broader nominee pools.
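The evaluation itself reduces to tallying, across repeated trials, how often each strategy's responses name the true outcome. A minimal sketch of that scoring logic follows; the trial responses below are made up for illustration, though Will Smith was in fact the 2022 Best Actor winner:

```python
# Minimal sketch of the evaluation: score each trial response against
# the known 2022 outcome and compare accuracy across strategies.

def accuracy(responses, truth):
    """Fraction of trial responses that mention the true outcome."""
    hits = sum(1 for r in responses if truth.lower() in r.lower())
    return hits / len(responses)

truth = "Will Smith"  # actual 2022 Best Actor winner

# Hypothetical trial outputs, not data from the paper:
direct_runs = ["I cannot predict that", "Will Smith", "Uncertain"]
narrative_runs = ["Will Smith", "Will Smith", "Denzel Washington"]

print(f"direct:    {accuracy(direct_runs, truth):.2f}")
print(f"narrative: {accuracy(narrative_runs, truth):.2f}")
```

The paper's 100-trial design follows this shape: many repeated prompts per strategy, with accuracy computed as the proportion of trials naming the eventual winner.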

Economic Forecasting

The narrative prompting approach also shed light on GPT-4's potential in economic forecasting. While direct prompts yielded negligible insights, fictional narratives attributed to authoritative figures like Federal Reserve Chair Jerome Powell revealed surprisingly cogent predictions of inflation and unemployment rates. These findings underscored the narrative method's effectiveness in eliciting indirectly predictive data from the models.

Ethical and Practical Considerations

The paper's findings prompt a discussion on the ethical parameters governing GPT's use for predictive tasks. The success of narrative prompts in bypassing direct prediction limitations raises questions about the broader implications for LLM applications in sensitive areas such as finance and healthcare. Aligning the creative exploitation of these models with OpenAI's ethical guidelines necessitates a nuanced understanding of their operational frameworks and potential societal impacts.

Future Directions

The differential success rates across various prediction tasks invite further exploration into refining prompting techniques. Future research could delve into:

  • The underlying mechanisms enabling narrative prompts to elicit more accurate predictions.
  • The development of hybrid prompting strategies that balance direct and narrative elements.
  • The exploration of other domains where LLMs might offer predictive utility, guided by ethical considerations.

Conclusion

This investigation into the predictive capabilities of GPT-3.5 and GPT-4 highlights the untapped potential of LLMs as forecasting tools. By leveraging creative prompting strategies, we can enhance our understanding and utilization of these models beyond their conventional applications. As we continue to explore the frontier of AI's predictive prowess, maintaining a commitment to ethical standards will be paramount in harnessing the full potential of this technology for the benefit of society.

Authors (2)
  1. Van Pham
  2. Scott Cunningham