Trajectory-Based Hamiltonian Learning

Updated 1 August 2025
  • The paper presents trajectory-based Hamiltonian learning as a data-driven approach that reconstructs Hamiltonian functions by partitioning the time horizon into intervals and applying the chattering algorithm.
  • It employs convex optimization by discretizing controls and solving linear programs to efficiently approximate solutions to nonconvex, nonsmooth optimal control problems.
  • The method is validated in real-time scheduling and control applications, demonstrating robustness, computational efficiency, and adaptability to dynamic system changes.

Trajectory-based Hamiltonian learning refers to a suite of methodologies that leverage observed or generated system trajectories to reconstruct, approximate, or optimize the Hamiltonian functions underlying dynamical systems. This approach is central to physics-informed machine learning, optimal control, safe systems engineering, and real-time operations, enabling practitioners to bypass intractable analytic solutions in favor of data-driven yet physically consistent inference and control design. The core idea is to use trajectory data—whether direct state-control observations or indirect high-dimensional sensor outputs—to capture the structure and properties of the system’s Hamiltonian in a way that respects conservation laws and stability, and to obtain pragmatic solutions to problems such as optimal scheduling, collision avoidance, and safety certification.

1. Mathematical Foundations and Hamiltonian Formulation

Trajectory-based Hamiltonian learning builds on the classical control theory framework in which an optimal control problem is formulated as

$$J(t, x(t), u(t)) = \int_0^T g(t, x(t), u(t))\, dt + \Psi(x(T)),$$

subject to the dynamics

$$x(t) = x(0) + \int_0^t f(\tau, x(\tau), u(\tau))\, d\tau.$$

The Hamiltonian is defined as

$$H(t, x, p, u) = g(t, x, u) + p^\top f(t, x, u),$$

with the costate (momentum) $p(t)$ governed by

$$\left(\frac{dx}{dt}\right)^\top = \frac{\partial H}{\partial p}, \qquad \left(\frac{dp}{dt}\right)^\top = -\frac{\partial H}{\partial x},$$

and the optimal control (according to Pontryagin’s Minimum Principle) characterized by

$$H(t, x^*(t), p(t), u^*(t)) \leq H(t, x^*(t), p(t), u(t)) \quad \text{for all admissible } u.$$

This formulation leads to a two-point boundary value problem, which for most nonlinear or nonconvex systems is prohibitively challenging to solve directly.
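To make the formulation concrete, here is a minimal Python sketch of the Hamiltonian and canonical equations for a hypothetical scalar system with $f(x,u) = u$ and $g(x,u) = x^2 + u^2$; the system, the names, and the closed-form minimizer are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of the Hamiltonian formulation for a toy scalar system
# (illustrative, not the paper's example). Dynamics f(x,u) = u and running
# cost g(x,u) = x^2 + u^2 give H(x,p,u) = x^2 + u^2 + p*u.

def f(x, u):
    return u                      # dx/dt

def g(x, u):
    return x**2 + u**2            # running cost

def H(x, p, u):
    return g(x, u) + p * f(x, u)  # Hamiltonian: g + p^T f

def canonical_rhs(x, p, u):
    # dx/dt = dH/dp = f(x,u);  dp/dt = -dH/dx = -2x
    return f(x, u), -2.0 * x

# Pontryagin: minimize H over u; here dH/du = 2u + p = 0 gives u* = -p/2.
def u_star(p):
    return -0.5 * p

x, p = 1.0, 0.4
u = u_star(p)
print(H(x, p, u), canonical_rhs(x, p, u))
```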

2. Chattering Algorithm and Time Partitioning

A central construct in trajectory-based Hamiltonian learning, exemplified by the chattering algorithm (Kumar et al., 2017), is to partition the time horizon $[0, T]$ into $I$ small intervals $[t_i, t_{i+1}]$. On each interval, the local variational problem is reduced to tractable subproblems:

$$\min_{u(t) \in U} \int_{t_i}^{t_{i+1}} H(t, x(t), p(t), u(t))\, dt,$$

subject to

$$x(t) = x_{t_i} + \int_{t_i}^t \frac{\partial H}{\partial p}\, dt, \qquad p(t) = p_{t_{i+1}} - \int_{t_{i+1}}^t \frac{\partial H}{\partial x}\, dt.$$

The innovative step is the "chattering" approximation, which uses a finite set of $K$ control levels $c_k^{t_i}$ and convex weights $\alpha_k^{t_i} \geq 0$ with $\sum_k \alpha_k^{t_i} = 1$, so that on short intervals

$$u(t) \approx \sum_{k=1}^K \alpha_k^{t_i} c_k^{t_i}.$$

The Hamiltonian integral over the interval is then expressed as

$$\int_{t_i}^{t_{i+1}} H(\cdot)\, dt \approx \Delta_{t_i} \sum_{k=1}^K \alpha_k^{t_i}\, H(t_i, x_{t_i}, p_{t_i}, c_k^{t_i}),$$

and the state and costate propagate via

$$x_{t_{i+1}} = x_{t_i} + \Delta_{t_i} \sum_{k=1}^K \alpha_k^{t_i} \left.\frac{\partial H}{\partial p}\right|_{c_k^{t_i}},$$

$$p_{t_{i+1}} = p_{t_i} - \Delta_{t_i} \sum_{k=1}^K \alpha_k^{t_i} \left.\frac{\partial H}{\partial x}\right|_{c_k^{t_i}}.$$

For each interval, the $\alpha_k^{t_i}$ are found by solving a linear program (a relaxed knapsack problem):

$$\min_{\{\alpha_k^{t_i}\}} \sum_{k=1}^K H(t_i, x_{t_i}, p_{t_i}, c_k^{t_i})\, \alpha_k^{t_i} \quad \text{subject to} \quad \sum_{k=1}^K \alpha_k^{t_i} = 1, \quad 0 \leq \alpha_k^{t_i} \leq 1.$$

This recursive process constructs a global trajectory as a union of locally averaged controls, bypassing intractable dynamic programming. A sketch of one such step follows.
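The following sketch illustrates one chattering step under the same toy Hamiltonian as above: it solves the relaxed-knapsack linear program with scipy.optimize.linprog and propagates the state and costate with the $\alpha$-averaged derivatives. The control levels, step size, and number of intervals are illustrative assumptions.

```python
# A minimal sketch of one chattering step (toy Hamiltonian; the control
# levels and grid are illustrative, not the paper's setup).
import numpy as np
from scipy.optimize import linprog

def H(t, x, p, u):
    return x**2 + u**2 + p * u          # toy Hamiltonian g + p*f with f = u

def dH_dp(t, x, p, u):                  # = f(t, x, u)
    return u

def dH_dx(t, x, p, u):                  # = dg/dx for this toy system
    return 2.0 * x

def chattering_step(t_i, x_i, p_i, levels, dt):
    # Relaxed-knapsack LP: minimize sum_k H_k * alpha_k subject to
    # sum_k alpha_k = 1 and 0 <= alpha_k <= 1. On the simplex the optimum
    # concentrates all weight on the minimizing level(s); we call linprog
    # to mirror the general formulation.
    Hk = np.array([H(t_i, x_i, p_i, c) for c in levels])
    res = linprog(c=Hk, A_eq=np.ones((1, len(levels))), b_eq=[1.0],
                  bounds=[(0.0, 1.0)] * len(levels), method="highs")
    alpha = res.x
    # Propagate state and costate with the alpha-averaged derivatives.
    x_next = x_i + dt * sum(a * dH_dp(t_i, x_i, p_i, c) for a, c in zip(alpha, levels))
    p_next = p_i - dt * sum(a * dH_dx(t_i, x_i, p_i, c) for a, c in zip(alpha, levels))
    return x_next, p_next, alpha

levels = np.linspace(-1.0, 1.0, 5)      # K = 5 discrete control levels
x, p = 1.0, 0.4
for i in range(10):                     # I = 10 intervals on [0, T]
    x, p, alpha = chattering_step(0.1 * i, x, p, levels, dt=0.1)
print(x, p, alpha)
```

Because the feasible set is the probability simplex, the LP optimum simply selects the level(s) minimizing $H$, which is why each subproblem can often be solved analytically.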

3. Relaxation, Convexification, and Computational Tractability

The chattering framework confers several distinctive advantages:

  • Non-convex and non-smooth optimal control problems are "relaxed" into a sequence of convex (linear) subproblems. Chattering combinations form a dense subset of the space of admissible controls (in the Lebesgue-integral sense).
  • The reduction to convex programs ensures that each subproblem (whose dimensionality scales with $K$) can be solved efficiently, often analytically, enabling implementation at timescales compatible with real-time systems.
  • The two-point boundary value problem is replaced by an initial value problem: the state and costate at $t_0$ are initialized and updated iteratively, with correction steps to enforce terminal boundary conditions (by monitoring endpoint error and adjusting the costate initialization; see the single-shooting sketch after this list).
  • The trajectory construction is robust to classical defects of optimal control: local minima due to nonconvexity, lack of smoothness, and the curse of dimensionality in numerical dynamic programming are all mitigated.
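As a concrete illustration of the initial-value reformulation, the single-shooting sketch below (for the same toy problem as above, with $\Psi = 0$ so the transversality condition is $p(T) = 0$) corrects the costate initialization by root-finding on the endpoint error; the integrator, bracket, and horizon are illustrative assumptions.

```python
# A minimal single-shooting sketch of the IVP-with-correction idea
# (toy scalar problem). We integrate the canonical equations forward from
# a guessed p(0) and adjust the guess until the transversality condition
# p(T) = dPsi/dx(x(T)) holds; with Psi = 0 the endpoint error is p(T).
from scipy.optimize import brentq

def shoot(p0, x0=1.0, T=1.0, n=200):
    # Forward Euler on dx/dt = u* = -p/2 and dp/dt = -2x (from the toy H).
    dt, x, p = T / n, x0, p0
    for _ in range(n):
        x, p = x + dt * (-0.5 * p), p + dt * (-2.0 * x)
    return p                              # endpoint error for target p(T) = 0

# Correct the costate initialization by root-finding on the endpoint error.
p0 = brentq(shoot, -10.0, 10.0)
print("corrected p(0):", p0, "residual p(T):", shoot(p0))
```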

4. Data-Driven Hamiltonian Learning, Adaptivity, and Feedback

This framework allows the structure of the system Hamiltonian and cost-to-go to be inferred from observed or generated trajectory data:

  • The iterative control updates and local state-costate propagations yield a natural data-driven learning process, in which the control weights $\alpha_k^{t_i}$ and the resulting system trajectories form a basis for refining the empirical model of the Hamiltonian.
  • Real-time adaptivity is achieved by continuously updating the local approximation as new state and disturbance measurements become available. Disturbances and model mismatch are handled by re-solving the local problems, making the method suited to settings with rapid environmental change or imperfect modeling (a minimal feedback loop is sketched after this list).
  • The chattering approach is compatible with feedback designs. By tracking error propagation in the costate and employing extremal corrections at each time step, the system can be steered adaptively toward optimality.
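A minimal feedback loop under the same toy model might look as follows: at each step the measured (disturbed) state replaces the predicted one and the local problem is re-solved. The disturbance model, gains, and step count are illustrative assumptions.

```python
# A minimal sketch of feedback via re-solving the local problem after each
# measurement (toy model; disturbance magnitude is illustrative). On the
# simplex constraint the local LP optimum reduces to picking the control
# level that minimizes H at the current (x, p).
import numpy as np

rng = np.random.default_rng(0)
levels = np.linspace(-1.0, 1.0, 5)

def H(x, p, u):
    return x**2 + u**2 + p * u            # same toy Hamiltonian as above

x, p, dt = 1.0, 0.4, 0.1
for i in range(20):
    u = min(levels, key=lambda c: H(x, p, c))  # re-solved local LP optimum
    x_pred = x + dt * u                        # predicted state (f = u)
    x = x_pred + 0.01 * rng.standard_normal()  # measurement with disturbance
    p = p - dt * 2.0 * x                       # costate update from new state
print(x, p)
```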

5. Applications in Real-Time Enterprise Scheduling and Control

When deployed on real-time scheduling and control processes (e.g., enterprise food distribution), the chattering-based trajectory learning paradigm delivers:

  • Computational efficiency, as the time to solution for each optimization scales with the number of discrete control levels, not the overall dimension of the state/control space.
  • Robustness to nonlinearities and nonsmoothness: the chattering method does not impose restrictive assumptions (such as global convexity, differentiability, or continuous cost functions) and thereby accommodates practical systems characterized by discontinuities and abrupt regime shifts.
  • Empirical effectiveness: the method has been validated in operational scenarios, producing near-optimal schedules and trajectories with minimal computational overhead, facilitating near–real-time decision making for high-dimensional enterprise systems.

6. Implications for Hamiltonian Learning and Trajectory-Based Optimization

Trajectory-based Hamiltonian learning via the chattering algorithm represents a versatile, computationally tractable, and empirically validated approach to approximating optimal policies and system evolution. By partitioning the time horizon, convexifying local controls, and iterating on the averaged solution space, the method:

  • Avoids the bottleneck of solving high-dimensional Hamiltonian two-point boundary value problems directly.
  • Provides a principled mechanism to approximate the solutions of Pontryagin’s Minimum Principle in systems where analytic or classic numeric techniques are infeasible.
  • Lays the groundwork for iterative data-driven refinement of system models, as discretized controls and their induced state transitions reflect the underlying structure of the true Hamiltonian and the associated response surfaces.
  • Facilitates feedback incorporation and adaptivity, which are critical in real-world control and scheduling applications.

7. Summary and Outlook

Trajectory-based Hamiltonian learning based on the chattering algorithm partitions the optimal control problem into sequential, data-driven linear programs interfaced via locally averaged controls. This approach, by construction, yields:

  • A relaxation of nonconvex, nonsmooth control problems and their reduction to tractable convex optimization,
  • Efficient propagation of state and costate with repeated local correction,
  • Near-real-time implementation capacity, and
  • Enhanced understanding and empirical learning of system Hamiltonians from observed (or simulated) trajectory data.

This paradigm offers both conceptual and engineering advances for trajectory optimization and Hamiltonian learning, with demonstrated benefits in computational efficiency, robustness, and practical adaptivity for complex scheduling and real-time control environments (Kumar et al., 2017).
References (1)