Identify the cause of GPT-4’s macro-level attribution performance discrepancy across funds

Determine why GPT-4’s accuracy varied across funds when it performed macro-level performance attribution calculations at the “GICS Type” level using the generate_prompt_macro approach within a LangChain OpenAI Functions Agent and pandas workflow. Specifically, explain why Portfolio Defensive achieved perfect accuracy on the first run while Portfolio Growth and Portfolio Value required several attempts. Evaluate whether prompt complexity and the absence of numerical examples in the prompt contribute to this inconsistency.

Background

The paper evaluates GPT-4-based AI agents for performance attribution tasks, including macro-level calculations using a prompt (generate_prompt_macro) that provides formulas and step-by-step instructions to compute allocation and selection effects at the parent “GICS Type” level. Although the agent eventually achieved perfect accuracy for all three portfolios, the authors report that only Portfolio Defensive produced correct results on the first attempt, whereas Portfolio Growth and Portfolio Value required several attempts to reach perfect accuracy.
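To make the target calculation concrete, the sketch below computes allocation and selection effects at a parent level with pandas. The formulas actually given in generate_prompt_macro are not reproduced on this page, so the code assumes a common Brinson-style formulation in which the selection effect absorbs the interaction term. The function name, the column names (gics_type, w_p, w_b, r_p, r_b), and the toy numbers are all hypothetical and are not taken from the paper.

```python
import pandas as pd


def macro_attribution(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate sector rows to the parent level and compute two effects.

    ASSUMPTION: Brinson-style formulas with interaction folded into selection.
    Expects columns: gics_type, w_p, w_b (weights), r_p, r_b (returns).
    Groups with zero total weight would divide by zero and are not handled here.
    """
    # Return contributions of each sector in the portfolio and the benchmark.
    tmp = df.assign(
        p_contrib=df["w_p"] * df["r_p"],
        b_contrib=df["w_b"] * df["r_b"],
    )
    g = tmp.groupby("gics_type")[["w_p", "w_b", "p_contrib", "b_contrib"]].sum()

    # Parent-level returns are weight-averaged sector returns.
    g["r_p"] = g["p_contrib"] / g["w_p"]
    g["r_b"] = g["b_contrib"] / g["w_b"]
    total_benchmark_return = g["b_contrib"].sum()

    g["allocation"] = (g["w_p"] - g["w_b"]) * (g["r_b"] - total_benchmark_return)
    g["selection"] = g["w_p"] * (g["r_p"] - g["r_b"])
    return g[["allocation", "selection"]]


# Toy data for illustration only (hypothetical values).
example = pd.DataFrame(
    {
        "gics_type": ["Cyclical", "Cyclical", "Defensive", "Sensitive"],
        "w_p": [0.30, 0.20, 0.30, 0.20],
        "w_b": [0.25, 0.25, 0.25, 0.25],
        "r_p": [0.02, 0.01, -0.01, 0.03],
        "r_b": [0.015, 0.01, 0.0, 0.02],
    }
)
result = macro_attribution(example)
print(result)
```

Under these assumptions, and when portfolio and benchmark weights each sum to one, the two effects summed over all groups equal the active return. That identity gives a simple consistency check for any agent-produced answer.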

This inconsistency raises a reliability question about GPT-4’s ability to follow complex, multi-step instructions consistently across different datasets or funds. The authors suggest that the issue may be related to the prompt’s complexity and possibly the need for numerical examples, but they explicitly state that the exact reason is unknown.
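One way to quantify this kind of inconsistency is to score each attempt against a reference calculation and record how many attempts a fund needs before its first correct result. The sketch below is a hypothetical scoring helper, not the authors' evaluation code. The function names, the tolerance, and the placeholder outcome lists are assumptions made for illustration and do not reflect results from the paper.

```python
from typing import Mapping, Optional, Sequence


def matches_reference(
    predicted: Mapping[str, float],
    reference: Mapping[str, float],
    tol: float = 1e-6,
) -> bool:
    """True when both mappings share the same keys and agree within tol."""
    if predicted.keys() != reference.keys():
        return False
    return all(abs(predicted[k] - reference[k]) <= tol for k in reference)


def attempts_to_first_success(outcomes: Sequence[bool]) -> Optional[int]:
    """1-based index of the first correct attempt, or None if none succeeded."""
    for attempt_number, is_correct in enumerate(outcomes, start=1):
        if is_correct:
            return attempt_number
    return None


# Placeholder outcomes for two unnamed funds (hypothetical, not from the paper).
run_log = {
    "fund_a": [True],
    "fund_b": [False, False, True],
}
summary = {fund: attempts_to_first_success(o) for fund, o in run_log.items()}
print(summary)
```

Running the same scoring with and without numerical examples in the prompt would give a direct comparison of first-attempt success rates, which is the comparison the problem statement asks for.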

References

We are not sure of the reasons for this performance discrepancy. It could be related to the fact that the prompt is complicated with several formulas and instructions, and it may need numerical examples in the prompt.

— Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? (2403.10482 - Melo et al., 15 Mar 2024) in Section 5, Objective #2, Method #1 (Macro level calculations)
