
Newtonian Motion Primitives Explained

Updated 10 December 2025
  • Newtonian Motion Primitives are canonical motion patterns defined by constant-acceleration dynamics, modeling fundamental motions like free fall, throws, and ramp sliding.
  • They serve as benchmarks and enforcement targets in post-training generative video models using physics-grounded reward functions that leverage optical flow and mass proxies.
  • Evaluations on the NewtonBench-60K suite demonstrate improved visual fidelity and reduced trajectory errors, with consistent RMSE reduction across in-distribution and OOD settings.

Newtonian Motion Primitives (NMPs) define a set of canonical, physically plausible motion patterns governed by constant-acceleration dynamics, which serve as benchmarks and enforcement targets for evaluating and improving the physical realism of generative video models. Characterized by their adherence to Newton’s laws, NMPs formalize five essential classes of single-object motion: free fall, horizontal throw, parabolic throw, and sliding on a ramp (downward and upward), spanning a range of gravitational and frictional regimes. These primitives are central to methods that use verifiable, physics-grounded reward functions to post-train generative video models, closing the gap between visual plausibility and physical correctness (Le et al., 29 Nov 2025).

1. Definition and Formalization of Newtonian Motion Primitives

NMPs are defined as trajectories parameterized by constant-acceleration equations, modeling archetypal object motions under Newtonian mechanics. The five primitives, as instantiated in the NewtonBench-60K benchmark, are:

| Primitive | Defining Dynamics | Constant Acceleration Components |
| --- | --- | --- |
| Free Fall (NMP-F) | Released from rest under uniform gravity $g$ with no initial velocity. | $a_x = 0$, $a_y = -g$ |
| Horizontal Throw (NMP-TH) | Launched with horizontal velocity $v_0$, zero vertical speed, under gravity. | $a_x = 0$, $a_y = -g$ |
| Parabolic Throw (NMP-TP) | Arbitrary velocity $(v_{0x}, v_{0y})$ under gravity. | $a_x = 0$, $a_y = -g$ |
| Ramp Sliding Down (NMP-RD) | Sliding down incline $\theta$ with kinetic friction $\mu_k$. | $a_s = +g(\sin\theta - \mu_k \cos\theta)$ |
| Ramp Sliding Up (NMP-RU) | Initial uphill speed on same ramp, subject to gravity/friction, resulting in negative acceleration. | $a_s = -g(\sin\theta - \mu_k \cos\theta)$ |

Ramp sliding accelerations $a_s$ are projected onto the image-plane axes using the unit tangent vector $\hat{s} = (t_x, t_y)$ of the ramp: $(a_x, a_y) = (a_s t_x, a_s t_y)$.

NMPs are chosen to exemplify motions for which Newton’s second law, applied to a constant net force, implies constant acceleration, encompassing both gravity-induced and contact-dynamics scenarios.
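As a minimal sketch of what constant-acceleration dynamics mean in practice, the following Python snippet samples a parabolic throw and a downhill slide at 32 frames and 16 fps and checks that the second difference of position is the same for every frame triple. The helper names (`trajectory`, `ramp_down_acceleration`), the value of $g$, the launch parameters, and the ramp orientation are illustrative assumptions, not values taken from the benchmark, and the units are world units rather than pixels.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def trajectory(p0, v0, a, n_frames=32, fps=16):
    """Positions p(t) = p0 + v0*t + 0.5*a*t^2 sampled once per frame, shape (n_frames, 2)."""
    t = np.arange(n_frames)[:, None] / fps
    return np.asarray(p0) + np.asarray(v0) * t + 0.5 * np.asarray(a) * t**2

def ramp_down_acceleration(theta, mu_k, g=G):
    """Along-ramp acceleration a_s = g (sin(theta) - mu_k cos(theta))."""
    return g * (np.sin(theta) - mu_k * np.cos(theta))

# Parabolic throw (NMP-TP): a_x = 0, a_y = -g.
throw = trajectory(p0=(0.0, 1.0), v0=(2.0, 3.0), a=(0.0, -G))

# Ramp sliding down (NMP-RD): project a_s onto the downhill unit tangent.
theta, mu_k = np.deg2rad(30.0), 0.2
tangent = np.array([np.cos(theta), -np.sin(theta)])  # assumed ramp orientation
a_ramp = ramp_down_acceleration(theta, mu_k) * tangent
slide = trajectory(p0=(0.0, 1.0), v0=(0.0, 0.0), a=a_ramp)

# For constant acceleration, p[t+1] - 2 p[t] + p[t-1] equals a * dt^2 at every t.
dt = 1 / 16
second_diff = throw[2:] - 2 * throw[1:-1] + throw[:-2]
assert np.allclose(second_diff, np.array([0.0, -G]) * dt**2)
```

The same second-difference check applies to `slide` with `a_ramp` in place of gravity, which is why one kinematic test can cover all of the primitives.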

2. Verifiable, Physics-Grounded Reward Functions

Two verifiable rewards complement each other to enforce adherence to NMP dynamics within video diffusion models:

a) Newtonian Kinematic Constraint:

Derived from discrete kinematics, the constant-acceleration residual for velocity proxies is:

$$v_{t+1} - 2v_t + v_{t-1} = 0$$

Operationalized via optical-flow proxies $\phi_t$ (approximating $v_t \Delta t$), the loss is:

$$L_{\text{kin}} = \sum_t \|\phi_{t+1} - 2\phi_t + \phi_{t-1}\|_2^2,$$

penalizing any frame-to-frame change in the estimated acceleration in the image plane.
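The kinematic loss can be illustrated with a short NumPy sketch. It assumes the dense flow field has already been reduced to one 2D vector per frame pair (how that reduction is done is not specified here), and the function name `kinematic_loss` and the synthetic flow values are hypothetical.

```python
import numpy as np

def kinematic_loss(flow):
    """Sum over t of ||phi_{t+1} - 2 phi_t + phi_{t-1}||^2 for flow of shape (T, 2)."""
    flow = np.asarray(flow, dtype=float)
    residual = flow[2:] - 2.0 * flow[1:-1] + flow[:-2]
    return float(np.sum(residual**2))

dt = 1 / 16
t = np.arange(31) * dt

# Flow proxy phi_t ~ v_t * dt for a throw with velocity (2, 3 - 9.81 t).
ideal_flow = np.stack([np.full_like(t, 2.0), 3.0 - 9.81 * t], axis=1) * dt

# The same flow with small random jitter, standing in for a physically implausible clip.
rng = np.random.default_rng(0)
jittery_flow = ideal_flow + 0.01 * rng.standard_normal(ideal_flow.shape)

print(kinematic_loss(ideal_flow))    # numerically zero
print(kinematic_loss(jittery_flow))  # strictly positive
```

An all-zero flow sequence also yields zero loss, so a stationary object satisfies the constraint trivially; the kinematic term therefore cannot be used on its own.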

b) Mass-Conservation Reward:

Prevents degenerate solutions (e.g., objects slowing to near-zero speed) by anchoring motion to consistent visual identity/mass:

$$R_{\text{mass}} = \frac{1}{T} \sum_{t=0}^{T-1} \|z_t^{\text{gen}} - z_t^{\text{sim}}\|_2^2,$$

where $z_t^{\text{gen}}$ and $z_t^{\text{sim}}$ are high-level encoder features for generated and simulated reference frames, serving as mass proxies. Minimizing $R_{\text{mass}}$ maintains temporal invariance of object attributes.

The combined loss for post-training is $L_{\text{phys}} = \lambda_{\text{kin}} L_{\text{kin}} + \lambda_{\text{mass}} L_{\text{mass}}$, where $L_{\text{mass}}$ denotes the mass term $R_{\text{mass}}$ treated as a loss to be minimized.
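A minimal NumPy sketch of the mass term and the weighted combination follows. The function names, the feature shape `(T, D)`, and the default weights are assumptions made for illustration; the weights used in practice are not stated here.

```python
import numpy as np

def mass_term(z_gen, z_sim):
    """(1/T) * sum_t ||z_t^gen - z_t^sim||^2 for per-frame features of shape (T, D)."""
    diff = np.asarray(z_gen, dtype=float) - np.asarray(z_sim, dtype=float)
    return float(np.mean(np.sum(diff**2, axis=1)))

def physics_loss(l_kin, l_mass, lam_kin=1.0, lam_mass=1.0):
    """L_phys = lam_kin * L_kin + lam_mass * L_mass (weights are placeholders)."""
    return lam_kin * l_kin + lam_mass * l_mass

# Toy features: 4 frames, 3 feature dimensions.
z_sim = np.ones((4, 3))
z_gen = np.zeros((4, 3))
l_mass = mass_term(z_gen, z_sim)          # each frame contributes 3.0, so the mean is 3.0
l_phys = physics_loss(0.5, l_mass, lam_kin=2.0, lam_mass=0.1)
```

Because the mass term compares generated features against a simulated reference frame by frame, a clip whose object stalls or changes appearance is penalized even when its kinematic residual is zero.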

3. Extraction of Physical Proxies

Physical quantities required for reward evaluation are extracted with frozen, pretrained utility models:

  • Velocity Proxy: Optical flow is estimated for each frame pair $(V_t, V_{t+1})$ via frozen RAFT, yielding $\phi_t = (\phi_t^{(x)}, \phi_t^{(y)})$, from which $v_t \approx \phi_t/\Delta t$.
  • Mass Proxy: High-level frame encodings $z_t$ are obtained from a frozen V-JEPA2 video encoder. These features capture object identity and material, providing a differentiable stand-in for mass in the reward computation.

No depth, force, or direct trajectory supervision is used; all supervision arises from these measurable proxies.

4. Post-Training with NewtonRewards

The NewtonRewards algorithm applies verifiable reward functions to post-train video diffusion models:

  1. Initialize from a pretrained generator $G_\theta$ (OpenSora v1.2), text- and frame-conditioned.
  2. For each iteration:
    • Sample latent noise $\epsilon$ and conditioning $c$; generate a 32-frame clip $V = G_\theta(\epsilon, c)$.
    • Compute optical flow ($\phi_t$) and mass proxy embeddings ($z_t^{\text{gen}}$) with frozen RAFT and V-JEPA2 on $V$.
    • Pair features with reference simulated features ($z_t^{\text{sim}}$); calculate $L_{\text{kin}}$, $L_{\text{mass}}$.
    • Aggregate into $L_{\text{phys}}$; update $G_\theta$ via the AdamW optimizer (learning rate $1\mathrm{e}{-5}$, batch size 1).
    • Utility models remain fixed during all updates.

Only the parameters of $G_\theta$ are updated; the frozen utility models act purely as differentiable reward evaluators, enabling scalable, reward-driven adaptation to physics priors.

5. NewtonBench-60K Evaluation Suite

The NewtonBench-60K dataset provides ground-truth trajectories for the five NMPs across a wide regime of physical parameters:

  • Composition: 50K training clips (10K per NMP), 10K benchmark clips (2K per NMP), evenly split into in-distribution (ID) and out-of-distribution (OOD) settings (parameter shifts in height, velocity, friction).
  • Rendering pipeline: Kubric + PyBullet + Blender, 32 frames/clip at 16 fps and $512 \times 512$ resolution.
  • Metrics:
    • Physics: Velocity RMSE, Acceleration RMSE computed from centroids extracted via SAM2 segmentation.
    • Visual: Trajectory L2 error, Chamfer distance, Binary mask Intersection over Union (IoU).

These metrics dissect both physical and visual fidelity of generated video sequences.
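The physics metrics and the mask IoU can be sketched as follows, assuming object centroids and binary masks have already been extracted. The exact metric definitions are not spelled out here, so the forward finite differences, the RMSE over per-frame Euclidean errors, and the function names are assumptions.

```python
import numpy as np

def velocity_accel_rmse(c_gen, c_ref, fps=16):
    """RMSE between finite-difference velocities and accelerations of two centroid tracks (T, 2)."""
    dt = 1.0 / fps

    def rmse(a, b):
        return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

    v_gen, v_ref = np.diff(c_gen, axis=0) / dt, np.diff(c_ref, axis=0) / dt
    a_gen, a_ref = np.diff(v_gen, axis=0) / dt, np.diff(v_ref, axis=0) / dt
    return rmse(v_gen, v_ref), rmse(a_gen, a_ref)

def mask_iou(m_gen, m_ref):
    """Intersection over Union of two binary masks; defined as 1.0 when both are empty."""
    m_gen, m_ref = np.asarray(m_gen, bool), np.asarray(m_ref, bool)
    union = np.logical_or(m_gen, m_ref).sum()
    return float(np.logical_and(m_gen, m_ref).sum() / union) if union else 1.0

# Reference: free fall. Generated: same fall plus a spurious horizontal drift of 1 unit/s.
t = np.arange(32) / 16
c_ref = np.stack([np.zeros_like(t), -0.5 * 9.81 * t**2], axis=1)
c_gen = c_ref + np.stack([t, np.zeros_like(t)], axis=1)
v_err, a_err = velocity_accel_rmse(c_gen, c_ref)
```

In this toy case the drift shows up as a velocity error of one unit per second while the acceleration error stays at zero, which illustrates why the two RMSE values are reported separately.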

6. Quantitative Impact and Ablative Analyses

Application of NewtonRewards confers consistent improvements across NMPs and metric suites:

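| Test Regime | Model | L2 | CD | IoU | vRMSE | aRMSE | Avg. Gain (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ID | OpenSora (SFT) | 0.1098 | 0.3159 | 0.1103 | 0.2792 | 3.3244 | |
| ID | +NewtonRewards | 0.0962 | 0.2930 | 0.1266 | 0.2628 | 3.0432 | +9.75 |
| OOD | OpenSora (SFT) | 0.1297 | 0.4082 | 0.0998 | 0.4230 | 6.1451 | |
| OOD | +NewtonRewards | 0.1207 | 0.3780 | 0.1025 | 0.3816 | 5.1575 | +8.60 |
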
  • All five NMPs show reduced trajectory and contour error, improved IoU, and lower RMSE in velocity/acceleration under both ID and OOD.
  • Ablations confirm that visual-feature alignment alone yields minor spatial gains but may substantially worsen velocity/acceleration errors.
  • Removing the kinematic loss nearly nullifies motion regularization; eliminating the mass term results in degenerate “reward hacking” with >66% speed collapse.
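One plausible reading of the "Avg. Gain" column is the mean relative improvement over the five metrics, counting IoU as higher-is-better and the others as lower-is-better. That interpretation is an assumption, as is the helper name `average_gain`; the sketch below applies it to the tabulated values.

```python
def average_gain(baseline, improved, higher_is_better=("IoU",)):
    """Mean relative improvement in percent across all metrics in `baseline`."""
    gains = []
    for name, base in baseline.items():
        delta = improved[name] - base
        if name not in higher_is_better:
            delta = -delta  # for error metrics, a decrease is an improvement
        gains.append(100.0 * delta / base)
    return sum(gains) / len(gains)

id_sft = {"L2": 0.1098, "CD": 0.3159, "IoU": 0.1103, "vRMSE": 0.2792, "aRMSE": 3.3244}
id_nr = {"L2": 0.0962, "CD": 0.2930, "IoU": 0.1266, "vRMSE": 0.2628, "aRMSE": 3.0432}
ood_sft = {"L2": 0.1297, "CD": 0.4082, "IoU": 0.0998, "vRMSE": 0.4230, "aRMSE": 6.1451}
ood_nr = {"L2": 0.1207, "CD": 0.3780, "IoU": 0.1025, "vRMSE": 0.3816, "aRMSE": 5.1575}

id_gain = average_gain(id_sft, id_nr)
ood_gain = average_gain(ood_sft, ood_nr)
```

Under this reading the ID gain comes out at about 9.75% and the OOD gain at about 8.6%, in line with the reported column up to rounding of the tabulated values.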

7. Significance and Broader Context

NMPs, as structured benchmarks and enforcement targets, enable discriminative assessment of physical plausibility in video generation. Their integration within NewtonRewards demonstrates that post-training with verifiable, physics-grounded rewards is feasible using only measurable proxies, without reliance on explicit trajectory or force data. This approach maintains quantitative fidelity to Newton’s laws across in-distribution and novel physical regimes, including frictional contacts and parameter shifts. A plausible implication is that such post-training could serve as a foundation for scaling physics-aware video generation to broader, more complex settings, provided further extension of proxy models and motion primitive classes (Le et al., 29 Nov 2025).
