Model-Based Planner for Ball Trajectory Prediction

Updated 29 August 2025
  • The planner integrates image-based detections with physical and probabilistic models, achieving accurate ball trajectory predictions even under occlusions.
  • It uses a probabilistic graphical model and Mixed Integer Programming to ensure smooth state transitions, physically plausible motion, and consistent ball-player interactions.
  • Empirical validation on sports like volleyball, basketball, and soccer shows that the system robustly bridges occlusions and rapid state changes, outperforming simpler models.

A model-based planner for ball trajectory prediction is a system that infers, reconstructs, or predicts the path of a ball in dynamic sports or robotic scenes by integrating learned or observed image data, physical constraints, and models of ball–player interaction. Such planners address the main difficulties of tracking or forecasting ball motion in team sports such as volleyball, basketball, and soccer: occlusion, abrupt state transitions (e.g., from "in_possession" to "flying"), and the need for physically plausible motion.

1. Problem Definition and Core Methodology

The goal of a model-based planner for ball trajectory prediction is to accurately estimate the time-ordered sequence of a ball's 3D locations, $\{X^t\}$, and semantic states, $\{S^t\}$, even with unreliable or missing detections due to occlusion, motion blur, or ambiguous image evidence. This is achieved by coupling direct image-based detections with models of ball dynamics and explicit representations of player–ball interactions.

The methodology central to this paradigm involves two key ingredients:

  1. Probabilistic Graphical Model for Temporal Coupling. At each frame $t$, the ball is described by its 3D position $X^t$, its semantic state $S^t$ (e.g., "flying", "rolling", "in_possession"), and the associated image evidence $I^t$. A probabilistic temporal model links these variables across time via potential functions:

$$\Psi(X,S,I) = \frac{1}{Z}\, \Psi_I(X^1, S^1, I^1) \prod_{t=2}^{T} \left[\Psi_X(X^{t-1}, S^{t-1}, X^t) \cdot \Psi_S(S^{t-1}, S^t) \cdot \Psi_I(X^t, S^t, I^t)\right]$$

  • $\Psi_I$: image potential from detector and classifier
  • $\Psi_S$: state transition smoothness (e.g., "rolling" cannot switch directly to "flying" without an intermediate "in_possession" state)
  • $\Psi_X$: state-dependent physical transition constraint (e.g., maximum allowed velocity or proper ballistic motion)
  2. Mixed Integer Programming (MIP) Tracking Formulation. The estimation is recast as a global optimization, specifically a Mixed Integer Program, over a graph structure:
    • Ball Graph $G_b = (V_b, E_b)$: Nodes encode candidate triples $(x_i, s_i, t_i)$; edges link feasible transitions between nodes at consecutive times.
    • Variables: Binary flows $f_{ij} \in \{0,1\}$ indicate activation of edge $(i, j)$ (a ball transition in space and state).
    • Cost Function: Each edge carries a cost that sums the logarithms of the relevant potentials:

    $$c_{bi}^j = \log \Psi_X(x_i, s_i, x_j) + \log \Psi_S(s_i, s_j) + \log \Psi_I(x_j, s_j, I^{t_j})$$

    • Optimization:

    $$\text{maximize } \sum_{(i, j) \in E_b} f_{ij}\, c_{bi}^j$$

    Subject to:

    • Binary constraints ($f_{ij} \in \{0,1\}$)
    • One active transition at $t = 1$
    • Flow conservation at intermediate nodes
    • (Continuous) physical constraints as additional MIP constraints (see Section 3)
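To make the edge costs concrete, the following is a minimal sketch under simplifying assumptions, not the MIP itself. If the continuous physical constraints and the player coupling are dropped, a single unit of flow with conservation through a time-layered graph is just a path, so the best-scoring trajectory can be found by dynamic programming. This is a swapped-in technique used only for illustration. The helper names (`log_psi_s`, `log_psi_x`, `best_path`), the potential values, and the toy candidates are all hypothetical.

```python
import math

def log_psi_s(s_prev, s_next):
    """Toy, unnormalized state-transition log-potential (illustrative numbers)."""
    if (s_prev, s_next) == ("rolling", "flying"):
        return -math.inf  # forbidden direct switch
    return math.log(0.8) if s_prev == s_next else math.log(0.1)

def log_psi_x(x_prev, s_prev, x_next, max_step):
    """Toy physical log-potential: hard cap on per-frame displacement.
    It ignores s_prev for brevity; a real potential would be state-dependent."""
    return 0.0 if math.dist(x_prev, x_next) <= max_step else -math.inf

def best_path(frames, max_step=2.0):
    """frames[t] is a list of candidates (x, s, log_psi_i).
    Returns (total log-score, index of the chosen candidate in each frame)."""
    score = [cand[2] for cand in frames[0]]  # first-frame image potential
    back = []
    for t in range(1, len(frames)):
        new_score, ptr = [], []
        for xj, sj, log_i in frames[t]:
            best, arg = -math.inf, 0
            for i, (xi, si, _) in enumerate(frames[t - 1]):
                # Accumulated score plus the edge cost: log Psi_X + log Psi_S + log Psi_I
                cost = (score[i] + log_psi_x(xi, si, xj, max_step)
                        + log_psi_s(si, sj) + log_i)
                if cost > best:
                    best, arg = cost, i
            new_score.append(best)
            ptr.append(arg)
        score = new_score
        back.append(ptr)
    j = max(range(len(score)), key=score.__getitem__)
    total, path = score[j], [j]
    for ptr in reversed(back):  # follow back-pointers to recover the path
        j = ptr[j]
        path.append(j)
    path.reverse()
    return total, path

# Toy candidates: the far-away detections have higher image scores
# but violate the displacement cap, so the physically reachable ones win.
frames = [
    [((0.0, 0.0), "in_possession", math.log(0.9))],
    [((1.0, 0.0), "flying", math.log(0.6)), ((5.0, 0.0), "flying", math.log(0.9))],
    [((2.0, 0.0), "flying", math.log(0.7)), ((6.0, 0.0), "flying", math.log(0.9))],
]
score, path = best_path(frames)
```

In this toy run the selected path follows the nearby candidates in every frame even though the distant detections score higher on image evidence alone. The full formulation needs a MIP solver precisely because the continuous constraints and the player coupling break this simple path structure.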

2. State Space and Interaction Modeling

The planner models the state space of the ball using semantically distinct and physically meaningful labels. Transitions between states are strictly regularized:

  • States: Typical examples include "in_possession", "flying", "rolling".
  • State Transition Potential: $\Psi_S(s_i, s_j) = P(S^t = s_j \mid S^{t-1} = s_i)$ is learned from annotated data to reflect feasible transitions and temporal smoothness.
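One simple way to obtain such a transition potential is to count consecutive state pairs in annotated sequences and normalize each row. The sketch below assumes this plain frequency estimate; the function name `estimate_transitions` and the two toy label sequences are hypothetical, and the source does not specify the exact estimator used.

```python
from collections import Counter

STATES = ("in_possession", "flying", "rolling")

def estimate_transitions(sequences, states=STATES):
    """Return {(s_prev, s_next): P(s_next | s_prev)} from per-frame label sequences."""
    counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1
    table = {}
    for si in states:
        row_total = sum(counts[(si, sj)] for sj in states)
        for sj in states:
            # Rows with no observations are left at zero probability.
            table[(si, sj)] = counts[(si, sj)] / row_total if row_total else 0.0
    return table

# Toy annotations (illustrative only).
sequences = [
    ["in_possession", "in_possession", "flying", "flying", "in_possession"],
    ["rolling", "rolling", "in_possession", "flying"],
]
psi_s = estimate_transitions(sequences)
```

Because "rolling" is never directly followed by "flying" in these toy annotations, that transition receives probability zero, which is how an infeasible switch would be ruled out in the edge costs.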

Ball–player interaction is codified structurally:

  • When a ball node is assigned the state "in_possession", a proximity constraint is enforced: the ball's estimated location must lie within distance $D_p$ of an active player detection. This is formulated over a player graph $G_p = (V_p, E_p)$, with $p_k^l$ the player-graph variable on edge $(k, l)$, and enforced for each such ball node $j$ as:

$$\sum_{(k, l) \in E_p,\; t_l = t_j,\; \|x_j - x_l\|_2 \leq D_p} p_k^l \;\geq\; \sum_{i : (i, j) \in E_b} f_{ij}$$

  • This guarantees that whenever the ball is considered possessed, a player is actually present within $D_p$ of it, anchoring the ball to a plausible agent.
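The possession constraint can be checked for a single ball node as in the sketch below. It assumes the player variables and ball flows have already been fixed to 0 or 1, so it only verifies the inequality rather than solving for the variables; the function name `possession_constraint_holds` and all positions are hypothetical.

```python
import math

def possession_constraint_holds(ball_xyz, incoming_ball_flow, players, d_p):
    """players: list of (position, active) pairs at the same frame as the ball node.
    Left side: active players within d_p of the ball.
    Right side: total ball flow entering this 'in_possession' node."""
    lhs = sum(active for pos, active in players if math.dist(ball_xyz, pos) <= d_p)
    return lhs >= incoming_ball_flow

# Toy frame: two active players and one inactive detection (illustrative values).
players = [((0.0, 0.0, 0.0), 1), ((10.0, 0.0, 0.0), 1), ((0.5, 0.0, 0.0), 0)]

near_ok = possession_constraint_holds((0.3, 0.0, 1.0), 1, players, d_p=1.5)
far_ok = possession_constraint_holds((5.0, 0.0, 1.0), 1, players, d_p=1.5)
unused_ok = possession_constraint_holds((5.0, 0.0, 1.0), 0, players, d_p=1.5)
```

A possession node far from every active player can therefore carry no flow, while an unused node (incoming flow 0) satisfies the constraint trivially.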

3. Incorporation of Physical Constraints

Physical feasibility is enforced through explicit constraints in the MIP that are informed by ballistics and contact dynamics:

  • Discrepancy Constraint: The continuous 3D position $P^t$ (prediction) must remain close to the discrete candidate $X^t$:

$$\|P^t - X^t\| \leq D_l$$

  • Ballistics Constraint: For any state indicating free motion ("flying", "rolling"), the following second-order, state-dependent difference constraint is imposed for each coordinate $c$:

$$A^{s, c}\left(P^t_c - 2P^{t-1}_c + P^{t-2}_c\right) + B^{s, c}\left(P^t_c - P^{t-1}_c\right) + C^{s, c} P^t_c - F^{s, c} \leq K\left(3 - M^t_{s, c} - M^{t-1}_{s, c} - M^{t-2}_{s, c}\right)$$

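  • Conditional Enforcement: $M^t_{s, c}$ acts as a binary switch indicating that state $s$ is active at frame $t$, and $K$ is a large constant. When the state is active in all three frames $t$, $t-1$, and $t-2$, the right-hand side is $0$ and the physical constraint binds; otherwise the right-hand side is at least $K$ and the constraint is effectively switched off (e.g., ballistic constraints are enforced only during "flying").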
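The switching behaviour can be illustrated numerically for the vertical coordinate of a ball in free flight. The sketch below assumes gravity-only coefficients ($A = 1$, $B = C = 0$, $F = -g/\text{fps}^2$), a frame rate of 25 fps, and a value of $K$ chosen arbitrarily; it evaluates the one-sided inequality exactly as written above. The function names and sample heights are hypothetical.

```python
G = 9.81      # gravitational acceleration in m/s^2
FPS = 25.0    # assumed frame rate
BIG_K = 1e3   # assumed large constant K

def ballistic_lhs(p_t, p_t1, p_t2, a, b, c, f):
    """Left-hand side of the constraint for one coordinate.
    p_t, p_t1, p_t2 are the positions at frames t, t-1, t-2."""
    return a * (p_t - 2 * p_t1 + p_t2) + b * (p_t - p_t1) + c * p_t - f

def constraint_satisfied(lhs, m_t, m_t1, m_t2, big_k=BIG_K, tol=1e-9):
    """The right-hand side is 0 only when all three indicators equal 1."""
    return lhs <= big_k * (3 - m_t - m_t1 - m_t2) + tol

def height(k, z0=2.0, v0=5.0):
    """Height of a ball in free flight at frame k (no drag)."""
    t = k / FPS
    return z0 + v0 * t - 0.5 * G * t * t

coeffs = dict(a=1.0, b=0.0, c=0.0, f=-G / FPS**2)

# A true parabola has second difference -g/fps^2, so the left-hand side is ~0.
lhs_flight = ballistic_lhs(height(2), height(1), height(0), **coeffs)

# A ball that "hovers" at constant height violates the constraint while flying...
lhs_hover = ballistic_lhs(1.0, 1.0, 1.0, **coeffs)
hover_while_flying = constraint_satisfied(lhs_hover, 1, 1, 1)
# ...but the same motion is permitted once the state is inactive in any frame.
hover_when_inactive = constraint_satisfied(lhs_hover, 1, 0, 1)
```

Setting a single indicator to 0 raises the right-hand side to $K$, which is why the constraint only restricts triples of consecutive frames that all share the state.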

4. Learning of Potentials and Empirical Parameterization

The image and state-transition potentials are learned from real, annotated ball sport sequences, while the physical coefficients are set from the underlying physics:

  • Image Evidence Potential:

$$\Psi_I(x, s, I) = \sigma_s\big(P_b(x \mid I) \cdot P_c(s \mid x, I)\big)$$

with $\sigma_s(y) = 1 / \left(1 + e^{-(\theta_{s0} + \theta_{s1} y)}\right)$

  • State Transition Potential: Inferred from the empirical co-occurrence of state transitions in real game data.
  • Physical Coefficients: The coefficients $A, B, C, F$ for the second-order physical constraints are explicitly set according to the underlying physics (e.g., for gravity, $A = 1$, $F = -g / \text{fps}^2$).
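The image evidence potential is a state-specific logistic function of the product of the detector and classifier scores. The sketch below evaluates it directly; the parameter values `theta0 = -2.0` and `theta1 = 6.0` are hypothetical stand-ins for learned values, and `image_potential` is an illustrative name.

```python
import math

def image_potential(p_ball, p_state, theta0, theta1):
    """sigma_s(P_b * P_c), with sigma_s(y) = 1 / (1 + exp(-(theta0 + theta1 * y)))."""
    y = p_ball * p_state
    return 1.0 / (1.0 + math.exp(-(theta0 + theta1 * y)))

# Hypothetical parameters for one state; in practice they would be fitted per state.
theta0, theta1 = -2.0, 6.0

strong = image_potential(0.9, 0.8, theta0, theta1)  # confident detection and state
weak = image_potential(0.2, 0.3, theta0, theta1)    # weak detection and state
none = image_potential(0.0, 0.0, theta0, theta1)    # no evidence at all
```

With a positive `theta1` the potential increases with the combined score, and the offset `theta0` sets the value assigned to a candidate with no image support, so such candidates stay possible but costly.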

This combination ensures that the cost surface guiding the optimizer reflects both the visual and transition statistics learned from actual play and the physics imposed through the motion constraints.

5. Empirical Validation and Robustness

The model is validated empirically on challenging sequences in volleyball, basketball, and soccer, each representing different interaction schemas and occlusion/visibility patterns. The MIP-based model yields:

  • Recovery of Physically Realistic Trajectories: E.g., in volleyball, enforcing second-order constraints and player coupling "finds" plausible parabolic arcs after player strikes, even under severe occlusions.
  • Smooth and Semantically Consistent State Trajectories: The system avoids unphysical state jumps (e.g., instant switch from "rolling" to "flying" without intermediary contact).
  • Occlusion Bridging: The model robustly interpolates when direct visual evidence disappears (for example, when a rolling ball in soccer is hidden from view).
  • Superior Accuracy: Across all domains, the formulation outperforms baselines which model either physics or player–interaction alone.

6. Mathematical Summary Table

| Component | Mathematical Formulation | Role |
| --- | --- | --- |
| Probabilistic model energy | $\Psi(X,S,I) = \frac{1}{Z} \Psi_I(X^1,S^1,I^1)\prod_{t=2}^{T} [\cdots]$ | Global trajectory likelihood |
| Edge cost in ball graph | $c_{bi}^j = \log \Psi_X + \log \Psi_S + \log \Psi_I$ | Encodes evidence & feasibility |
| Ballistics/physics constraint | $A^{s,c}(P^t_c - 2P^{t-1}_c + P^{t-2}_c) + \cdots \leq K(3 - M^t_{s,c} - \cdots)$ | Ball motion plausibility by state |
| Possession constraint on player coupling | $\sum_{(k,l)\in E_p,\, \lVert x_j - x_l \rVert_2 \leq D_p} p_k^l \geq \sum_{i} f_{ij}$ | Ensures proximity during possession |

7. Implications and Scope

By framing trajectory prediction as a joint optimization over detection reliability, semantic state constraints, and physically grounded dynamics, the planner delivers significant improvements in robustness and interpretability for broadcast and analytic applications in sports. The formulation is inherently extensible: new game-specific knowledge (e.g., for rugby, hockey) can be encoded as additional states, constraints, or graph structure.

The coupling of ball–player proximity, discrete state transitions, and continuous physics injects strong priors that allow the system to "explain away" missing data with physically plausible interpolations, a property not observed in pure track–linkage or pure dynamical models. The MIP structure also lends itself to integrating new sources of soft evidence or high-level annotations with minimal architectural change.

In summary, model-based planners for ball trajectory prediction offer a principled and extensible solution for physically constrained tracking and anticipation, with demonstrated benefits across a range of ball–player interaction modalities and diverse real-world game environments (Maksai et al., 2015).

References (1)