
Formula-One Prompting (F-1)

Updated 3 February 2026
  • Formula-One Prompting (F-1) is a strategy for LLMs that formalizes governing equations in LaTeX to enhance mathematical reasoning.
  • It utilizes a two-phase approach: first extracting symbolic equations and then adaptively choosing a solving method based on equation complexity.
  • Experimental results show that F-1 significantly improves accuracy in finance, physics, and cryptography compared to conventional CoT and PoT methods.

Formula-One Prompting (F-1) is a prompting strategy for LLMs designed to improve reasoning in applied mathematics by explicitly formulating governing equations as intermediate representations. Unlike conventional Chain-of-Thought (CoT) or Program-of-Thought (PoT) prompting, F-1 introduces a two-phase approach: extracting key symbolic equations first, then adaptively selecting a solution strategy—direct substitution, CoT, or PoT—based on the structure of those equations. Experimental results indicate that F-1 delivers significant accuracy gains in domains requiring retrieval or synthesis of mathematical laws, such as finance, physics, and cryptography (Nitarach et al., 27 Jan 2026).

1. Motivation and Background

Traditional prompting methods have shown limitations in domains where mathematical reasoning hinges on recognizing and applying domain-specific governing equations. Chain-of-Thought (CoT) prompting lays out natural language steps but is prone to losing track of crucial domain constraints and often introduces compounding symbolic errors. Program-of-Thought (PoT) prompting, while more precise in numerical calculations, is less effective at expressing high-level mathematical relationships, frequently resulting in verbose or inefficient computational routines even when a closed-form solution exists.

In applied problem settings—such as calculating compound interest, applying physical laws like F = ma, or analyzing cryptographic constructs—extracting the salient formula or equation is central to successful reasoning. The F-1 approach leverages this insight by directing LLMs to explicitly formalize the relationships in symbolic LaTeX before any further computation, thus aligning model behavior with expert practices in science and engineering disciplines.

2. Two-Phase F-1 Methodology

Given a problem statement P, F-1 guides the model through the following workflow:

P \xrightarrow{\text{Phase 1: Formalization}} E \xrightarrow{\text{Phase 2: Adaptive Solving}} A

where E denotes one or more LaTeX-formatted equations and A the boxed final answer.

2.1 Phase I: Equation Formulation

The initial phase requires the model to:

  1. Extract all data “givens”—numerical rates, constants, parameters.
  2. Identify the target variable (quantity to compute or prove).
  3. Write key symbolic equations connecting givens to the target, exclusively in LaTeX.

Example (Finance):

Problem: A bank offers 5% annual interest compounded monthly. If the principal is \$1000, find the amount after 2 years.

F-1 Phase 1 output: \text{Given: } P = 1000,\ r = 0.05,\ m = 12,\ t = 2 \qquad \text{Formula: } A = P\Bigl(1 + \frac{r}{m}\Bigr)^{mt}
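The extracted formula can be checked numerically (a quick verification sketch, not part of the F-1 pipeline itself):

```python
# Substitute the Phase-1 givens into the compound-interest formula.
P, r, m, t = 1000, 0.05, 12, 2
A = P * (1 + r / m) ** (m * t)
print(round(A, 2))  # 1104.94
```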

Example (Cryptography):

Problem: Prove that H(s,j,x) = f^n_{g^n_j(x)}(s) is a PRF if F and G are PRFs.

F-1 Phase 1 output: \begin{aligned} &\text{Security goal: } \Bigl|\Pr[\mathcal{A}^{H}_{\mathrm{real}}=1] - \Pr[\mathcal{A}^{H}_{\mathrm{rand}}=1]\Bigr| \le \mathrm{negl}(n). \\ &\text{Hybrid 0: } O_0(x) = f^n_{g^n_j(x)}(s) \quad \text{Hybrid 1: } O_1(x) = f^n_{r_G(x)}(s) \quad \text{Hybrid 2: } O_2(x) = r(x) \end{aligned}

2.2 Phase II: Adaptive Solving Strategy

Based on the explicit equations E generated in Phase I, the model is instructed to select a solving strategy. The guiding decision rule is empirically determined and operates as follows:

\text{Strategy}(E) = \begin{cases} \text{Direct} & \text{if } |\{\text{substitutions}\}| \le 2 \text{ and closed-form} \\ \text{PoT} & \text{if } \#\text{ops}(E) \ge 3 \text{ or iterative/recursive} \\ \text{CoT} & \text{otherwise} \end{cases}

where \#\text{ops}(E) counts arithmetic or code-like operations.
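The decision rule can be sketched as a small function over features of the extracted equations (the feature names and their extraction are hypothetical simplifications; the paper determines the thresholds empirically):

```python
def choose_strategy(n_substitutions, n_ops, closed_form, iterative):
    """Select a solving strategy from equation features,
    following the decision rule above (checked in order)."""
    if n_substitutions <= 2 and closed_form:
        return "Direct"
    if n_ops >= 3 or iterative:
        return "PoT"
    return "CoT"

print(choose_strategy(2, 1, closed_form=True, iterative=False))   # Direct
print(choose_strategy(4, 5, closed_form=False, iterative=True))   # PoT
print(choose_strategy(3, 2, closed_form=True, iterative=False))   # CoT
```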

The overall process is encapsulated in the following pseudocode:

def formula_one_prompt(problem_text):
    # Single LLM call: the user template enforces both F-1 phases.
    prompt = SYSTEM_PROMPT + "\n" + USER_TEMPLATE.format(problem=problem_text)
    response = LLM.generate(prompt, temperature=0)  # greedy decoding
    return response
Here, the USER_TEMPLATE enforces: (1) Phase 1: Write key equations in LaTeX, (2) Phase 2: Choose Direct/CoT/PoT, solve, verify, and box the final answer.
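A hypothetical instantiation of these prompt constants, consistent with the description above (the system prompt is quoted from Section 3; the template wording is an assumption and may differ from the paper's):

```python
SYSTEM_PROMPT = "You are an AI assistant that solves problems mainly through equations."

# Hypothetical template; the paper's exact wording is not reproduced here.
USER_TEMPLATE = (
    "Problem: {problem}\n\n"
    "Phase 1: List the givens and the target variable, then write the key "
    "governing equations in LaTeX.\n"
    "Phase 2: Based on the equations, choose Direct substitution, "
    "Chain-of-Thought, or Program-of-Thought; solve; verify; and put the "
    "final answer in \\boxed{{}}."
)

prompt = SYSTEM_PROMPT + "\n" + USER_TEMPLATE.format(problem="Compute 2 + 2.")
print(prompt)
```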

3. Experimental Setting and Implementation

F-1 is validated across several LLMs and mathematical problem benchmarks:

  • Proprietary: GPT-5, Gemini 2.5 Pro
  • Open-source: DeepSeek-V3.1, Qwen3-235B, Qwen3-30B

Benchmarks span 2,116 problems:

  • IMO-Bench: 460 competition math/proof problems
  • OlympiadBench: 1,438 problems (subdivided into OE_math, OE_physics, TP_math, TP_physics)
  • FinanceMath: 200 applied finance problems
  • AICrypto: 18 cryptographic proof tasks

The system prompt for F-1 is “You are an AI assistant that solves problems mainly through equations.” No special tokens are required beyond LaTeX math delimiters and “\boxed{}”. Inference operates at temperature zero (greedy decoding), and all strategy-switching thresholds are empirical rather than hard-coded.

4. Empirical Results

Macro-averaged accuracy across benchmarks and models demonstrates consistent and significant improvements:

  • F-1: 61.06
  • CoT: 55.30 (F-1 gain: +5.76)
  • PoT: 52.64 (F-1 gain: +8.42)

The gain is especially pronounced in applied domains:

  • FinanceMath: F-1 = 56.30%, CoT = 43.00% (Δ = +13.30)
  • AICrypto: F-1 = 87.54%, CoT = 80.30% (Δ = +7.24)
  • OlympiadBench Physics: F-1 = 44.92%, CoT = 42.37% (Δ = +2.55)
  • OlympiadBench Math: F-1 = 86.35%, CoT = 85.91% (Δ = +0.44)

Selection accuracy on differentiable problems—problems where methods yield different outcomes—was highest for applied mathematics (e.g., 73% on FinanceMath, 69.9% on OlympiadBench). F-1 attains approximately 81–84% of the maximal possible selection accuracy (the upper bound defined by always choosing whichever baseline succeeds on each instance).
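The selection upper bound can be made concrete: an oracle selector picks, for each problem, whichever baseline succeeds, so the ceiling is the fraction of problems solvable by at least one method. A minimal sketch on illustrative per-instance outcomes (synthetic data, not from the paper):

```python
def oracle_upper_bound(cot_correct, pot_correct):
    """Fraction of problems solvable by at least one baseline:
    the ceiling an ideal strategy selector could reach."""
    return sum(c or p for c, p in zip(cot_correct, pot_correct)) / len(cot_correct)

# Illustrative outcomes for 5 problems (True = solved by that method).
cot = [True, False, True, False, True]
pot = [False, True, True, False, False]
print(oracle_upper_bound(cot, pot))  # 0.8 — 4 of 5 solvable by some baseline
```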

5. Best Practices for F-1 Prompt Construction

Empirical findings recommend:

  • Structuring prompts in two clearly delimited phases: Phase 1 (LaTeX equations) and Phase 2 (solution).
  • Using minimalist directives to avoid over-generation; excessive verbosity can degrade performance.
  • Reinforcing solution verification and requiring the boxed final answer to curb hallucinations.
  • Priming models for novel domains by providing in-domain equation exemplars in the system prompt.
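As a sketch of the last point, a domain-primed system prompt might prepend an in-domain exemplar equation (the optics exemplar below is hypothetical, not taken from the paper):

```python
# Hypothetical domain-primed system prompt; the thin-lens exemplar is
# invented here for illustration.
SYSTEM_PROMPT = (
    "You are an AI assistant that solves problems mainly through equations.\n"
    "Exemplar (optics): \\text{Thin lens: } "
    "\\frac{1}{f} = \\frac{1}{d_o} + \\frac{1}{d_i}"
)
print(SYSTEM_PROMPT)
```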

6. Limitations and Prospects for Extension

F-1's effectiveness is sensitive to model scale; equation formalization may fail on models smaller than roughly 30B parameters, which lack robust symbolic abstraction. The single-call architecture precludes backtracking or error correction if an inappropriate solving strategy is chosen, suggesting that multi-call or plan-and-solve variants could offer further gains. The F-1 methodology is presently validated only on equation-centric tasks; generalizing to domains with more loosely defined constraints, such as legal or ethical reasoning, will require new formalization schemas. Reported statistics are macro-averages over five models; quantifying variance via bootstrapping or random-seed variation remains future work.

7. Significance and Outlook

Formula-One Prompting operationalizes explicit equation formalization and harnesses equation structure to adaptively select among direct, CoT, or PoT approaches, all within a single LLM call. The method achieves improvements of +5.76 percentage points over CoT and +8.42 percentage points over PoT on average, with the greatest gains in applied mathematics (up to +13.30 percentage points on the FinanceMath benchmark). The introduction of an intermediate equation step enforces structural alignment with expert problem-solving approaches in science, finance, and cryptography, while its low-footprint implementation facilitates practical adoption (Nitarach et al., 27 Jan 2026). A plausible implication is that equation-first prompting represents a scalable, model-agnostic means of closing the gap between LLM mathematical reasoning and domain expert performance, especially in domains governed by formal mathematical laws.
