
ICON: In-Context Operator Networks

Updated 31 October 2025
  • ICON is a transformer-based framework that learns operators from few-shot input-output pairs without parameter updates.
  • It integrates surrogate operator predictions into optimal control strategies, achieving high accuracy in complex, dynamic environments.
  • The approach adapts to diverse kernel forms and market dynamics, providing robust, scalable solutions for data-driven decision-making.

In-Context Operator Networks (ICON) are transformer-based neural architectures enabling data-driven learning of operators with a novel inference paradigm: after extensive offline pre-training, the model is prompted at inference with a small context (few-shot) of input-output pairs, from which it infers and applies the underlying operator without any parameter update. Originally introduced by Yang et al. (2023), ICON demonstrates robust operator generalization properties and serves as a foundation model for complex decision and prediction problems where the governing dynamics are unknown or changing, as exemplified in linear propagator frameworks for optimal order execution with transient market impact.

1. Mathematical Background: Propagator Models and Transient Impact

ICON is applied in the context of linear propagator models for order execution, as formulated in Bouchaud et al. (2004) and Gatheral (2010). In these models, a trader's liquidation schedule is governed by an admissible trading rate $u_t$, producing the inventory process

$$X_t = x - \int_0^t u_s \, ds.$$

The execution price at time $t$ is $P_t = S_t - Y_t$, where $S_t$ is the unaffected asset price and $Y_t$ is the transient price impact. The impact process is given by a convolution operator:

$$Y_t = \lambda \int_0^t G(t-s)\, u_s \, ds,$$

where $G(\cdot)$ is the propagator kernel and $\lambda > 0$ is the impact coefficient. Two kernel families are commonly considered: exponential decay, $G(t) = \exp(-\beta t)$, and power law, $G(t) = (\ell + t)^{-\gamma}$. ICON is designed to learn the operator $\boldsymbol{I}_\theta$ defined by the propagator equation.
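As a concrete illustration (an editorial sketch, not code from the source), the impact convolution can be discretised with a left-endpoint Riemann sum; the kernel parameters and grid below are placeholder values.

```python
import numpy as np

def transient_impact(u, kernel, lam=1.0, dt=0.01):
    """Discretised propagator impact Y_t = lam * int_0^t G(t - s) u_s ds
    (left-endpoint Riemann sum on a uniform time grid)."""
    n = len(u)
    t = np.arange(n) * dt
    return np.array([lam * np.sum(kernel(t[i] - t[:i + 1]) * u[:i + 1]) * dt
                     for i in range(n)])

def exp_kernel(tau, beta=1.0):
    return np.exp(-beta * tau)          # exponential decay G(t) = e^{-beta t}

def power_kernel(tau, ell=0.1, gamma=0.5):
    return (ell + tau) ** (-gamma)      # power law G(t) = (ell + t)^{-gamma}

u = np.ones(100)                        # constant trading rate over [0, 1)
Y = transient_impact(u, exp_kernel)     # impact builds up, then saturates
```

For constant buying under the exponential kernel, the discrete sum tracks the closed form $Y_t = \lambda (1 - e^{-\beta t})/\beta$, which is a quick sanity check on the discretisation.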

The goal is to solve the stochastic control problem

$$\max_u J(u) = \mathbb{E}\left[\int_0^T \big(S_t - (\boldsymbol{I}_\theta(u))_t\big)\, u_t\, dt - \varepsilon \int_0^T u_t^2\, dt - \phi \int_0^T X_t^2\, dt + X_T S_T - \varrho X_T^2\right],$$

where the penalties $\varepsilon, \phi, \varrho$ regularize execution behavior and terminal inventory.
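For a single discretised sample path, the objective above can be evaluated term by term; the grid size and penalty values in this sketch are illustrative, not from the source.

```python
import numpy as np

def execution_objective(u, S, Y, x0, dt, eps, phi, rho):
    """Discretised J(u) on one sample path: execution revenue at impacted
    prices, minus temporary cost and running inventory penalty, plus
    terminal liquidation value and terminal inventory penalty."""
    X = x0 - np.cumsum(u) * dt              # inventory X_t = x0 - int_0^t u ds
    J = np.sum((S - Y) * u) * dt            # revenue: int (S_t - Y_t) u_t dt
    J -= eps * np.sum(u ** 2) * dt          # temporary cost: eps int u^2 dt
    J -= phi * np.sum(X ** 2) * dt          # running penalty: phi int X^2 dt
    J += X[-1] * S[-1] - rho * X[-1] ** 2   # terminal: X_T S_T - rho X_T^2
    return J

# Sanity check: no impact, no penalties, flat price S = 1, full liquidation
# of x0 = 1 at constant rate over [0, 1) should yield revenue J = 1.
n, dt = 100, 0.01
J = execution_objective(u=np.ones(n), S=np.ones(n), Y=np.zeros(n),
                        x0=1.0, dt=dt, eps=0.0, phi=0.0, rho=0.0)
```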

2. ICON Framework: Training and Inference Protocols

ICON employs a transformer backbone to learn mappings from the trading rate trajectory $u$ to the impact trajectory $Y$. Offline, ICON is pre-trained on sampled operator classes, including diverse kernel forms, parameter settings, and discretizations. Each training example consists of a context set $\{(u^j, Y^j)\}_{j=1}^M$ generated by a specific operator $\boldsymbol{I}_\theta$, together with a query input $u^0$ and target output $Y^0 = \boldsymbol{I}_\theta(u^0)$.
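A sketch of how one such pre-training example might be generated; the kernel families match the source, but the parameter ranges and trajectory distribution here are illustrative choices, not the paper's exact sampling scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_training_example(M=5, n=50, dt=0.02):
    """Draw one operator (a random propagator kernel), then generate M
    context (u, Y) pairs plus a held-out query pair under that operator."""
    if rng.random() < 0.5:                           # exponential family
        beta = rng.uniform(0.5, 5.0)
        G = lambda tau: np.exp(-beta * tau)
    else:                                            # power-law family
        ell, gamma = rng.uniform(0.05, 0.5), rng.uniform(0.2, 0.8)
        G = lambda tau: (ell + tau) ** (-gamma)
    lam = rng.uniform(0.5, 2.0)
    t = np.arange(n) * dt

    def apply_operator(u):                           # Y = I_theta(u)
        return np.array([lam * np.sum(G(t[i] - t[:i + 1]) * u[:i + 1]) * dt
                         for i in range(n)])

    context = [(u, apply_operator(u)) for u in rng.standard_normal((M, n))]
    u_query = rng.standard_normal(n)
    return context, u_query, apply_operator(u_query)

context, u_query, Y_query = sample_training_example()
```

Every pair in a given example is produced by the same operator, while the operator itself varies across examples; this is what forces the transformer to infer the operator from the context rather than memorise any single kernel.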

In the online inference stage, the pre-trained ICON model receives a limited set of (u, Y) context pairs from the new, possibly unobserved, propagator kernel. ICON then predicts the impact curve for any query trajectory u0u^0, yielding a functional surrogate: I^(u0;{(uj,Yj)}j=1M)Iθ(u0)\boldsymbol{\hat{I}}(u^0;\{(u^j, Y^j)\}_{j=1}^M) \approx \boldsymbol{I}_\theta(u^0) No retraining or weight adaptation is performed. This approach leverages the transformer’s permutation invariance and contextual modeling to establish a few-shot, prompt-based inference regime.
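Because the propagator operator happens to be linear in $u$, a least-squares kernel fit over the context pairs can serve as a minimal stand-in that exposes the same few-shot interface. This is an editorial sketch, not the ICON architecture; the transformer surrogate is what handles the general, nonparametric case.

```python
import numpy as np

def fit_discrete_kernel(context, dt):
    """Recover the discrete kernel g_k ~ lam * G(k dt) by least squares
    from few-shot (u, Y) pairs, using Y_i = dt * sum_{j<=i} g_{i-j} u_j."""
    n = len(context[0][0])
    rows, targets = [], []
    for u, Y in context:
        for i in range(n):
            row = np.zeros(n)
            row[:i + 1] = u[i::-1] * dt       # coefficient of g_k is u_{i-k}
            rows.append(row)
            targets.append(Y[i])
    g, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return g

def surrogate_impact(g, u, dt):
    """Apply the fitted operator to a query trajectory u."""
    n = len(u)
    return np.array([dt * np.sum(g[:i + 1] * u[i::-1]) for i in range(n)])

# Example: recover an exponential kernel from three noise-free context pairs
rng = np.random.default_rng(1)
n, dt, beta, lam = 20, 0.05, 2.0, 1.0
t = np.arange(n) * dt
true_op = lambda u: np.array([lam * np.sum(np.exp(-beta * (t[i] - t[:i + 1]))
                                           * u[:i + 1]) * dt for i in range(n)])
ctx = [(u, true_op(u)) for u in rng.standard_normal((3, n))]
g = fit_discrete_kernel(ctx, dt)
u0 = rng.standard_normal(n)       # query trajectory outside the context set
```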

3. Surrogate Operator Integration in Optimal Control

To solve the order execution problem when the propagator model is unknown or shifting, the ICON surrogate operator is integrated into a discretized cost functional:

$$\max_u J_{\mathrm{ICON}}(u) := \mathbb{E}\left[\sum_i \Big(\big(S_{t_i} - \boldsymbol{\hat{I}}(u)_{t_i}\big)\, u_{t_i} - \varepsilon u_{t_i}^2 - \phi X_{t_i}^2\Big)\, \Delta t + X_T S_T - \varrho X_T^2\right].$$

A neural network policy $u_{t_i} = \mathrm{NN}_\vartheta(t_i, \alpha_{t_i})$ parameterizes the action schedule and is optimized using stochastic gradient descent with backpropagation through the frozen ICON network (Editor's term: "ICON-OCnet" for this combined architecture).
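As a minimal sketch of the optimization step, and deliberately simpler than ICON-OCnet: here the discretised schedule is optimized directly rather than through a neural policy, and central finite differences stand in for backpropagation through the frozen surrogate. The toy objective and penalty values are illustrative.

```python
import numpy as np

def optimize_schedule(J, n, iters=200, lr=0.5, h=1e-5):
    """Gradient ascent on a discretised schedule u_0..u_{n-1}; central
    finite differences of J replace backprop through a frozen surrogate."""
    u = np.full(n, 1.0 / n)                  # TWAP-like initial guess
    for _ in range(iters):
        grad = np.zeros(n)
        for i in range(n):
            e = np.zeros(n)
            e[i] = h
            grad[i] = (J(u + e) - J(u - e)) / (2 * h)
        u += lr * grad
    return u

# Toy objective: temporary trading cost plus terminal inventory penalty.
x0, dt, eps, rho = 1.0, 0.1, 0.1, 10.0
def J(u):
    return -eps * dt * np.sum(u ** 2) - rho * (x0 - dt * np.sum(u)) ** 2

u_star = optimize_schedule(J, n=10)          # near-uniform liquidation
```

For this quadratic objective the optimum is the uniform rate $u^* = \varrho / (\varepsilon + \varrho\, n\, \Delta t)$, so the result can be checked in closed form.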

This setup enables direct agent-level policy learning in environments with nonparametric, data-inferred state dynamics, overcoming classical limitations associated with analytic model fitting.

4. Empirical Performance and Generalization Properties

ICON achieves high accuracy in impact prediction for unseen operator classes: prediction errors are routinely less than 1% even when only 5 prompt pairs are provided and query trajectories are outside the training support. ICON successfully generalizes to kernels and parameter settings not encountered during pre-training, surpassing the limitations of parametric model-based approaches. ICON-OCnet reliably recovers the correct optimal execution strategies as derived by Abi Jaber and Neuman (2022) for the generating models.

ICON demonstrates strong transfer learning and data efficiency characteristics—a plausible implication is robust adaptation to market regime shifts without the need for retraining, critical in stochastic control settings with structural uncertainty.

5. Technical Advantages, Flexibility, and Robustness

ICON’s transformer architecture enables:

  • Model-free operator learning: No explicit kernel parameter or form required.
  • Adaptivity to arbitrary context size and discretization granularity.
  • Robustness to context ordering, noisy observations, and variable-length input/output sequences.
  • Seamless transfer to new operator forms or domains from a handful of prompt samples.

Unlike classical parametric identification, ICON directly surrogates the operator from empirical context, allowing tight integration with neural policy optimization frameworks. A plausible implication is that stochastic control and optimal execution problems with non-Markovian dynamics induced by non-exponential kernels, previously considered intractable, become amenable to data-driven solution with ICON.

6. Summary Table: Key Concepts and Formulas

  • Inventory dynamics: $X_t = x - \int_0^t u_s\, ds$
  • Price impact operator: $(\boldsymbol{I}_\theta(u))_t = \lambda \int_0^t G(t-s)\, u_s\, ds$
  • ICON operator surrogate: $\boldsymbol{\hat{I}}(u;\, \text{context})$
  • Execution control objective: $J(u)$, $J_{\mathrm{ICON}}(u)$ as above
  • In-context operator use: few-shot inference from $\{(u^j, Y^j)\}_{j=1}^M$
  • Policy parameterization: $u_{t_i} = \mathrm{NN}_\vartheta(t_i, \alpha_{t_i})$

7. Significance and General Applicability

ICON provides a principled and general methodology for operator learning and adaptation in real-world stochastic control problems. Its few-shot, prompt-driven inference paradigm, supported by extensive pre-training, presents a scalable approach for rapidly inferring and deploying surrogate dynamics in situations where the underlying system is only partially observed, subject to change, or fundamentally unknown. ICON’s empirical assessment establishes its viability for high-accuracy operator recovery and agent optimization in optimal order execution frameworks, suggesting direct applicability for a broader class of control and prediction problems in financial mathematics, engineering, and scientific computing.
