
Multiresolution Horn–Schunck Framework

Updated 15 February 2026
  • The framework is a coarse-to-fine optical flow estimation method that uses a pyramid-based image decomposition and iterative Horn–Schunck updates.
  • It employs bilinear interpolation and warping to propagate and refine flow estimates, effectively handling large displacements and noise.
  • Empirical evaluations on the MPI-Sintel benchmark show approximately 25% reduction in endpoint error and 23% decrease in angular error versus the classic approach.

The multiresolution Horn–Schunck (MR-HS) framework is a coarse-to-fine optical flow estimation methodology for sequential image analysis. It extends the classical global variational Horn–Schunck approach by integrating a pyramid-based scale-space decomposition and bilinear interpolation for inter-level flow propagation. This structure enables robust, accurate motion estimation—even in scenarios with large displacements or challenging image conditions—by progressively refining flow fields from coarse to fine spatial resolutions (Ziani, 20 Nov 2025).

1. Pyramid-Based Image Decomposition

The MR-HS framework relies on constructing a Gaussian image pyramid for both input frames $I_t$ and $I_{t+1}$. For images of size $W \times H$, a sequence of $L+1$ images $\{I^0, I^1, \dots, I^L\}$ is built, where

  • $I^0$ is the original (finest) image,
  • Each subsequent level ($\ell = 1, \dots, L$) is formed by Gaussian convolution with kernel $g$ (e.g., $3 \times 3$, $\sigma \approx 1$) followed by decimation (downsampling by 2).

This is implemented by the operation:

$$I^\ell(x,y) = \sum_{i=-1}^{+1}\sum_{j=-1}^{+1} g(i,j)\, I^{\ell-1}(2x+i, 2y+j)$$

Alternatively, smoothing and downsampling are often combined via 2×2 bilinear interpolation:

$$I^\ell(x, y) = \sum_{m=0}^1\sum_{n=0}^1 w_{mn}(u, v)\, I^{\ell-1}(2x+m, 2y+n)$$

where $u = 2x - \lfloor 2x \rfloor$, $v = 2y - \lfloor 2y \rfloor$, and $w_{mn}$ are the classic bilinear weights.

At the coarsest level ($L$), the optical flow fields $u^L$, $v^L$ are initialized to zero.
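The pyramid construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function names `gaussian_kernel_1d`, `downsample`, and `build_pyramid` are hypothetical, the 3×3 kernel is assumed to be a separable sampled Gaussian with $\sigma = 1$, and borders are assumed to be replicated.

```python
import numpy as np

def gaussian_kernel_1d(sigma=1.0):
    """Three-tap sampled Gaussian, normalized to sum to 1 (assumed kernel)."""
    x = np.array([-1.0, 0.0, 1.0])
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def downsample(img, sigma=1.0):
    """One pyramid step: 3x3 separable smoothing, then keep every second pixel."""
    k = gaussian_kernel_1d(sigma)
    p = np.pad(img.astype(np.float64), 1, mode="edge")  # border replication (assumption)
    # Horizontal pass followed by vertical pass; together they apply g(i, j) = k[i] * k[j].
    h = k[0] * p[:, :-2] + k[1] * p[:, 1:-1] + k[2] * p[:, 2:]
    s = k[0] * h[:-2, :] + k[1] * h[1:-1, :] + k[2] * h[2:, :]
    return s[::2, ::2]  # decimation by 2 in each direction

def build_pyramid(img, levels):
    """Return [I^0, I^1, ..., I^L] with I^0 the full-resolution image."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels):
        pyr.append(downsample(pyr[-1]))
    return pyr
```

Calling `build_pyramid(frame, 4)` on a 640×384 frame would produce five levels, the coarsest being 40×24 pixels; the same call is applied to both input frames.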

2. Coarse-to-Fine Flow Estimation Strategy

Optical flow refinement proceeds hierarchically from the coarsest (smallest) resolution to the finest (full) resolution:

  1. Prolongation: The estimated flow field $(u^\ell, v^\ell)$ at level $\ell$ is upsampled to level $\ell-1$ using bilinear interpolation. The magnitude is doubled to account for the spatial scaling:

$$u^{\ell-1}_0(x, y) = 2 \sum_{m=0}^1\sum_{n=0}^1 w_{mn}\left(\tfrac{x}{2} - i, \tfrac{y}{2} - j\right) u^\ell(i + m, j + n)$$

with analogous treatment for $v$, where $(i, j) = (\lfloor x/2 \rfloor, \lfloor y/2 \rfloor)$.

  2. Warping: The finer-level $I^{\ell-1}_{t+1}$ is warped towards $I^{\ell-1}_t$ using the upsampled flow to account for currently estimated motion, reducing bias from large displacements.
  3. Initialization and Refinement: The upsampled flow initializes the Horn–Schunck solver at the next finer level. The energy functional minimized per level is:

$$E_\ell(u, v) = \iint \left[ \left(I_x u + I_y v + I_t\right)^2 + \alpha^2\left(\|\nabla u\|^2 + \|\nabla v\|^2\right) \right] dx\, dy$$

Spatial derivatives ($I_x, I_y$) and temporal difference ($I_t = I^{\ell-1}_{t+1} - I^{\ell-1}_t$) are taken on the warped images.
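A possible NumPy sketch of the prolongation and warping steps follows. The helper names `bilinear_sample`, `prolongate`, and `warp` are hypothetical, and two conventions are assumed rather than taken from the source: $u$ is the horizontal and $v$ the vertical displacement, and warping is done backward, sampling $I_{t+1}$ at $(x+u, y+v)$ with clamped borders.

```python
import numpy as np

def bilinear_sample(f, x, y):
    """Sample 2-D array f at real-valued (x, y) = (column, row), clamping at the borders."""
    h, w = f.shape
    x = np.clip(x, 0.0, w - 1.0)
    y = np.clip(y, 0.0, h - 1.0)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    a, b = x - x0, y - y0  # fractional offsets inside the 2x2 cell
    return ((1 - a) * (1 - b) * f[y0, x0] + a * (1 - b) * f[y0, x1]
            + (1 - a) * b * f[y1, x0] + a * b * f[y1, x1])

def prolongate(u, v, shape):
    """Upsample a coarse flow to `shape` and double it to match the finer grid spacing."""
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W].astype(np.float64)
    return (2.0 * bilinear_sample(u, xx / 2, yy / 2),
            2.0 * bilinear_sample(v, xx / 2, yy / 2))

def warp(img, u, v):
    """Backward warp: output(x, y) = img(x + u, y + v)."""
    H, W = img.shape
    yy, xx = np.mgrid[0:H, 0:W].astype(np.float64)
    return bilinear_sample(img, xx + u, yy + v)
```

Passing the target shape explicitly to `prolongate` keeps the sketch usable when a level has odd dimensions and the finer grid is not exactly twice the coarser one.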

3. Iterative Horn–Schunck Solver and Numerical Updates

At each level and refinement iteration, the MR-HS algorithm applies Gauss–Seidel (or Jacobi/SOR) updates based on local averages:

$$\bar u_{i, j} = \frac{1}{5}\left(u^k_{i, j} + u^k_{i-1, j} + u^k_{i+1, j} + u^k_{i, j-1} + u^k_{i, j+1}\right)$$

$$\bar v_{i, j} = \frac{1}{5}\left(v^k_{i, j} + v^k_{i-1, j} + v^k_{i+1, j} + v^k_{i, j-1} + v^k_{i, j+1}\right)$$

Given these, the flow is updated elementwise:

$$u^{k+1}_{i, j} = \bar u_{i, j} - \frac{I_x(\bar u_{i, j} I_x + \bar v_{i, j} I_y + I_t)}{\alpha^2 + I_x^2 + I_y^2 + \varepsilon}$$

$$v^{k+1}_{i, j} = \bar v_{i, j} - \frac{I_y(\bar u_{i, j} I_x + \bar v_{i, j} I_y + I_t)}{\alpha^2 + I_x^2 + I_y^2 + \varepsilon}$$

with $\varepsilon \sim 10^{-5}$ for regularization. Iterations are performed until a prescribed convergence criterion ($\|\Delta u\|_2 < \tau$) is met or a set maximum number of steps is reached.
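The update equations translate almost directly into array code. The sketch below uses the Jacobi variant (every pixel is updated from the previous iterate), which is the form the formulas above are written in; the function names, the default tolerance `tol`, and the replicated-border averaging are illustrative assumptions.

```python
import numpy as np

def local_average(f):
    """Five-point mean (centre plus four neighbours) with replicated borders."""
    p = np.pad(f, 1, mode="edge")
    return (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
            + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0

def horn_schunck(Ix, Iy, It, u, v, alpha=1.0, eps=1e-5, max_iter=200, tol=1e-4):
    """Jacobi-style Horn-Schunck iterations starting from the flow (u, v)."""
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2 + eps
    for _ in range(max_iter):
        u_bar, v_bar = local_average(u), local_average(v)
        # Shared residual term of the two update formulas.
        r = (Ix * u_bar + Iy * v_bar + It) / denom
        u_new, v_new = u_bar - Ix * r, v_bar - Iy * r
        converged = np.linalg.norm(u_new - u) < tol  # ||delta u||_2 < tau
        u, v = u_new, v_new
        if converged:
            break
    return u, v
```

For a horizontal intensity ramp shifted by one pixel ($I_x = 1$, $I_y = 0$, $I_t = -1$ everywhere), the iteration started from zero flow should settle near $u = 1$, $v = 0$. A Gauss–Seidel or SOR version would update pixels in place and typically needs fewer sweeps.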

4. Bilinear Interpolation and Boundary Considerations

All resampling (both image downsampling and flow upsampling) uses the standard 2×2 bilinear kernel:

$$f(x, y) = \sum_{m=0}^1\sum_{n=0}^1 w_{mn}(a, b)\,f(i+m, j+n)$$

where $(i, j) = (\lfloor x \rfloor, \lfloor y \rfloor)$, $a = x-\lfloor x \rfloor$, $b = y - \lfloor y \rfloor$, and the bilinear weights are $w_{00} = (1-a)(1-b)$, $w_{10} = a(1-b)$, $w_{01} = (1-a)b$, and $w_{11} = ab$.

Boundary handling is implemented with either index clamping (border replication) or zero padding; in practice, implementations rely on library routines, for example OpenCV with the cv2.INTER_LINEAR interpolation flag and the cv2.BORDER_REPLICATE border mode.

A key detail is that prolongation uses the same bilinear weights as downsampling but includes the scaling factor of 2 to correct flow magnitudes at each resolution.
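To make the two boundary policies concrete, here is a small scalar sketch of the bilinear kernel that either clamps out-of-range indices or reads them as zero. The function name `sample_bilinear` and its `border` argument are hypothetical and only meant to illustrate the difference.

```python
import numpy as np

def sample_bilinear(f, x, y, border="replicate"):
    """Bilinear sample of f at (x, y) = (column, row).

    border="replicate" clamps neighbour indices to the image;
    any other value treats out-of-range neighbours as 0 (zero padding).
    """
    h, w = f.shape
    i, j = int(np.floor(x)), int(np.floor(y))
    a, b = x - i, y - j

    def at(col, row):
        if border == "replicate":
            return f[min(max(row, 0), h - 1), min(max(col, 0), w - 1)]
        inside = 0 <= row < h and 0 <= col < w
        return f[row, col] if inside else 0.0

    return ((1 - a) * (1 - b) * at(i, j) + a * (1 - b) * at(i + 1, j)
            + (1 - a) * b * at(i, j + 1) + a * b * at(i + 1, j + 1))
```

On the 2×2 array `[[1, 2], [3, 4]]`, sampling half a pixel past the right edge of the first row, at `(1.5, 0.0)`, gives 2.0 with replication but 1.0 with zero padding, which is why replication is usually the safer choice for flow fields and warped images.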

5. Parameter Selection and Convergence Behavior

Critical parameters influencing MR-HS performance include:

  • Smoothness weight $\alpha$: Tunes the balance of data fidelity and smoothness, typical range $[0.5, 5]$; for Sintel data, $\alpha \approx 1$ is effective.
  • Pyramid depth $L$: Chosen such that the coarsest grid contains at least $20 \times 20$ pixels; for $640 \times 384$ frames, values $L = 3$ or $4$ are common.
  • Iterations per level: Fewer (50–100) for coarse levels, more (200–500) for finer resolutions; typical convergence occurs after a few hundred iterations monitored via the $\ell_2$ norm of flow updates.
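The depth rule can be turned into a one-function helper. This is a sketch under the assumption that each level halves both dimensions with integer division; `max_pyramid_depth` and `min_side` are hypothetical names.

```python
def max_pyramid_depth(width, height, min_side=20):
    """Largest L for which the coarsest level is still at least min_side x min_side."""
    depth = 0
    # Go one level deeper only while the next, smaller level remains large enough.
    while min(width, height) // 2 ** (depth + 1) >= min_side:
        depth += 1
    return depth
```

For 640×384 frames this yields a maximum depth of 4 (coarsest level 40×24), in line with the values quoted above; choosing $L = 3$ simply stops one level earlier.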

Convergence guarantees exist under small-displacement and convexity conditions (see Mitiche & Mansouri, 2004). Empirically, the coarse-to-fine initialization mitigates poor local minima: large displacements are captured at coarse scales, where they span only a few pixels, and are then refined at successively finer scales.

6. Computational Complexity and Quantitative Performance

The iteration cost per level is $\mathcal{O}(N_\ell)$, with $N_\ell = (W/2^\ell) \times (H/2^\ell)$. Total computational effort is approximately:

$$\sum_{\ell=0}^L I_\ell N_\ell \approx \frac{4}{3} I N$$

assuming a constant iteration count $I$ per level, where $N = N_0 = W \times H$. Warping and gradient computations add only $\mathcal{O}(N_\ell)$ work per level, so the overall cost stays within $\mathcal{O}(N \times L \times I)$ and is effectively linear in the number of pixels, with a small constant factor from the pyramid.
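The factor $\tfrac{4}{3}$ follows from a geometric series, assuming each level holds exactly one quarter of the pixels of the level below it and uses the same iteration count $I$:

$$\sum_{\ell=0}^{L} I\,N_\ell = I N \sum_{\ell=0}^{L} 4^{-\ell} = I N\,\frac{1 - 4^{-(L+1)}}{1 - \tfrac{1}{4}} = \frac{4}{3}\, I N \left(1 - 4^{-(L+1)}\right) < \frac{4}{3}\, I N$$

For $L = 4$ the bracket equals $1 - 4^{-5} \approx 0.999$, so under these assumptions the whole pyramid costs about one third more than running the same number of iterations at full resolution only.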

Empirical evaluation on the MPI-Sintel benchmark ("final" pass) demonstrates:

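| Scene | HS AAE (°) | HS EPE (px) | MR-HS AAE (°) | MR-HS EPE (px) | Levels |
|---|---|---|---|---|---|
| Alley_1 | 12.46 | 2.62 | 6.61 | 1.81 | 4 |
| Bamboo_2 | 10.83 | 1.68 | 8.81 | 1.17 | 4 |
| Market_2 | 19.08 | 0.47 | 15.31 | 0.41 | 3 |
| Mountain_1 | 17.13 | 3.90 | 15.28 | 2.78 | 4 |
| Average | 14.88 | 2.17 | 11.50 | 1.54 | |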

MR-HS provides approximately 25% reduction in endpoint error (EPE) and 23% reduction in angular error (AAE) relative to classic single-scale Horn–Schunck, with negligible added computational cost.
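As an arithmetic check on the table, the sketch below recomputes the relative reductions from the four listed scenes. The text does not say how the scenes were aggregated, so two common choices are shown: the reduction of the scene-averaged metric, and the average of the per-scene reductions. The helper names are hypothetical.

```python
# Per-scene (AAE in degrees, EPE) pairs copied from the table above.
hs = {"Alley_1": (12.46, 2.62), "Bamboo_2": (10.83, 1.68),
      "Market_2": (19.08, 0.47), "Mountain_1": (17.13, 3.90)}
mr = {"Alley_1": (6.61, 1.81), "Bamboo_2": (8.81, 1.17),
      "Market_2": (15.31, 0.41), "Mountain_1": (15.28, 2.78)}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def reduction_of_means(k):
    """Relative drop of the scene-averaged metric (k=0: AAE, k=1: EPE)."""
    return 1.0 - mean(m[k] for m in mr.values()) / mean(h[k] for h in hs.values())

def mean_of_reductions(k):
    """Average of the per-scene relative drops (k=0: AAE, k=1: EPE)."""
    return mean(1.0 - mr[s][k] / hs[s][k] for s in hs)

for name, k in (("AAE", 0), ("EPE", 1)):
    print(f"{name}: {reduction_of_means(k):.1%} (of means), "
          f"{mean_of_reductions(k):.1%} (mean of per-scene)")
```

On these four scenes the AAE reduction comes out at roughly 23 to 24% under either aggregation, while the EPE reduction lies between about 26% and 29% depending on which one is used.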

7. Algorithmic Workflow

The following pseudocode synthesizes the end-to-end MR-HS procedure, capturing all core algorithmic steps:

Input: I_t, I_{t+1}, levels L, smoothness α, iters_per_level
Output: flow u^0,v^0 at full resolution

// 1. Build image pyramids
I^0_t = I_t;   I^0_{t+1} = I_{t+1}
for ℓ = 1 to L do
  I^ℓ_t     = downsample( I^{ℓ-1}_t )
  I^ℓ_{t+1} = downsample( I^{ℓ-1}_{t+1} )
end for

// 2. Initialize flow at coarsest level
u^L = zeros( size(I^L) )
v^L = zeros( size(I^L) )

// 3. Coarse-to-fine  
for ℓ = L down to 0 do
  if ℓ < L then
    // prolongate previous level's flow
    (u^ℓ, v^ℓ) = bilinear_upsample( u^{ℓ+1}, v^{ℓ+1} ) × 2
  end if

  // warp I_{t+1} toward I_t under current flow
  I^ℓ_{t+1,w} = warp( I^ℓ_{t+1}, u^ℓ, v^ℓ )

  // compute image derivatives on (I^ℓ_t, I^ℓ_{t+1,w})
  [I_x, I_y] = spatial_gradients( I^ℓ_t )
  I_t = I^ℓ_{t+1,w} - I^ℓ_t

  // refine flow by Horn–Schunck iterations
  for k = 1 to iters_per_level[ℓ] do
    for each pixel (i,j) in scan order do
      compute local averages ū, v̄
      update u^ℓ_{i,j}, v^ℓ_{i,j} by the HS formula above
    end for
    if convergence then break
  end for
end for

return (u^0, v^0)

This MR-HS pipeline combines spatial multiresolution, variational regularization, and bilinear prolongation, yielding a principled, scalable approach for optical flow computation suitable for diverse applications in computer vision (Ziani, 20 Nov 2025).
