
Continuous Cost Aggregation (CCA)

Updated 23 January 2026
  • Continuous Cost Aggregation (CCA) is an algorithmic framework that estimates continuous subpixel disparities from dual-pixel sensor data using local parabolic modeling and quadratic cost aggregation.
  • The method employs path-wise quadratic aggregation with adaptive smoothness constraints to robustly refine disparity estimates, especially in low-texture or blurred regions.
  • CCA leverages a multi-scale fusion strategy, combining coarse-scale priors with fine-scale data, and propagates only quadratic cost coefficients, which reduces memory requirements compared to traditional cost-volume approaches.

Continuous Cost Aggregation (CCA) is an algorithmic framework for extracting continuous, subpixel disparities from Dual-Pixel (DP) sensor data, leveraging local parabolic modeling, path-wise quadratic aggregation, and multi-scale coefficient fusion. CCA was introduced to address the challenges posed by DP images' tiny baseline and non-uniform point spread function (PSF), which prevent conventional stereo matching algorithms from yielding accurate depth. CCA embeds closed-form subpixel disparity estimation in a semi-global matching (SGM) paradigm and propagates quadratic cost coefficients efficiently, enabling pixel-wise minimization without discrete winner-take-all steps at substantially reduced memory cost relative to classical cost-volume approaches (Monin et al., 2023).

1. Local Parabolic Modeling of Pixelwise Matching Cost

CCA begins with rectified dual-pixel images $I_L, I_R$ and computes a discrete per-pixel cost volume $C_{\text{int}}(p,d)$ for integer disparities $d \in [d_{\min}, \dots, d_{\max}]$, typically via metrics such as the Sum of Absolute Differences (SAD) or Normalized Cross-Correlation (NCC). Due to the minute DP baseline, subpixel accuracy is essential. Thus, for each pixel $p$, a parabola is fit locally around the integer cost minimum:

  • Identify $d^0_p = \arg\min_n C_{\text{int}}(p,n)$
  • Fit $C_{p,d^0_p}(\Delta d) = a_p \Delta d^2 + b_p \Delta d + c_p$ using the costs at $d^0_p-1$, $d^0_p$, $d^0_p+1$
  • Coefficient calculation:
    • $a_p = [C_{\text{int}}(p,d^0_p+1) + C_{\text{int}}(p,d^0_p-1) - 2C_{\text{int}}(p,d^0_p)] / 2$
    • $b_p = [C_{\text{int}}(p,d^0_p+1) - C_{\text{int}}(p,d^0_p-1)] / 2$
    • $c_p = C_{\text{int}}(p,d^0_p)$
  • Transform back to the global disparity $d = d^0_p + \Delta d$ to produce $C_p(d) = \alpha_p d^2 + \beta_p d + \gamma_p$, where
    • $\alpha_p = a_p$
    • $\beta_p = b_p - 2 a_p d^0_p$
    • $\gamma_p = c_p + a_p (d^0_p)^2 - b_p d^0_p$

These parabolic models admit unique minima $d_p^{\text{loc}} = -\beta_p/(2\alpha_p)$, and the curvature $\alpha_p$ serves as a confidence measure; in ambiguous regions the parabola is flattened by scaling $\alpha_p$.
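
The 3-point fit and the change of variables above can be sketched in a few lines of NumPy. The function name, the cost-slice layout, and the toy values below are illustrative assumptions rather than a reference implementation:

```python
import numpy as np

def fit_local_parabola(cost_slice, d_min):
    """Fit a parabola around the integer cost minimum of one pixel.

    cost_slice holds matching costs over integer disparities
    d_min, d_min+1, ..., d_max. Returns the global-coordinate
    coefficients (alpha, beta, gamma) of C_p(d) = alpha*d^2 + beta*d + gamma.
    """
    n0 = int(np.argmin(cost_slice))
    # keep the 3-point neighbourhood inside the search range
    n0 = int(np.clip(n0, 1, len(cost_slice) - 2))
    c_minus, c_0, c_plus = cost_slice[n0 - 1], cost_slice[n0], cost_slice[n0 + 1]

    # local parabola in Delta_d around the integer minimum d0
    a = (c_plus + c_minus - 2.0 * c_0) / 2.0
    b = (c_plus - c_minus) / 2.0
    c = c_0

    # shift from local coordinates to global disparity d = d0 + Delta_d
    d0 = d_min + n0
    alpha = a
    beta = b - 2.0 * a * d0
    gamma = c + a * d0**2 - b * d0
    return alpha, beta, gamma

# toy example: costs over disparities -2..2 with a subpixel minimum near d = 0.2
costs = np.array([4.0, 1.5, 0.4, 0.9, 3.0])
alpha, beta, gamma = fit_local_parabola(costs, d_min=-2)
print(-beta / (2.0 * alpha))   # continuous local minimum d_p^loc
```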

2. Path-wise Quadratic Aggregation and Smoothness Constraint

CCA applies semi-global matching by aggregating quadratic cost functions along $R$ scanline directions. For each path $r$, the propagated cost is expressed as a quadratic $L_p^r(d)$, combining the local parabola and a quadratic smoothness penalty. At pixel $p$:

  • Let the predecessor's optimum be $m_{p-1} = \arg\min_d L_{p-1}^r(d)$
  • Aggregate: $L_p^r(d) = C_p(d) + P_{\text{adapt}} (d - m_{p-1})^2$
  • The coefficients update as:
    • $A_p^r = \alpha_p + P_{\text{adapt}}$
    • $B_p^r = \beta_p + P_{\text{adapt}} \cdot (B_{p-1}^r / A_{p-1}^r)$
    • $\Gamma_p^r = \gamma_p + P_{\text{adapt}} \left(B_{p-1}^r / (2 A_{p-1}^r)\right)^2$
    • with $m_{p-1} = -B_{p-1}^r / (2 A_{p-1}^r)$

The adaptive smoothness weight $P_{\text{adapt}} = P \, A_{p-1}^r \exp\!\left[-(I_p - I_{p-1})^2 / \sigma^2\right]$ adjusts the smoothing strength according to local image gradients, preserving discontinuities at strong edges.
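
A minimal sketch of this forward recurrence along a single scanline is given below; the variable names, the values of P and sigma, and the toy inputs are assumptions for illustration, and Γ is omitted because it does not affect the minimizer:

```python
import numpy as np

def aggregate_scanline(alpha, beta, intensity, P=0.5, sigma=10.0):
    """One forward pass of the quadratic recurrence along one scanline.

    alpha, beta : local parabola coefficients along the line (1-D arrays)
    intensity   : image intensities along the same line (edge-aware weight)
    Returns the path coefficients A_p^r, B_p^r for this direction.
    """
    n = len(alpha)
    A = np.empty(n)
    B = np.empty(n)
    A[0], B[0] = alpha[0], beta[0]          # first pixel: no predecessor
    for p in range(1, n):
        # adaptive smoothness weight, damped across strong image edges
        grad2 = (intensity[p] - intensity[p - 1]) ** 2
        P_adapt = P * A[p - 1] * np.exp(-grad2 / sigma**2)
        # L_p^r(d) = C_p(d) + P_adapt*(d - m_{p-1})^2, expanded in d
        A[p] = alpha[p] + P_adapt
        B[p] = beta[p] + P_adapt * (B[p - 1] / A[p - 1])
    return A, B

# toy scanline: local minima at d = 1 everywhere except one outlier pixel at d = 3
alpha = np.full(6, 1.0)
beta = np.array([-2.0, -2.0, -6.0, -2.0, -2.0, -2.0])
A, B = aggregate_scanline(alpha, beta, np.zeros(6))
print(-B / (2 * A))   # path-wise minima are pulled toward their neighbours
```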

3. Closed-form Subpixel Disparity Extraction

After aggregation over all $R$ directions, each pixel $p$ has $R$ quadratic costs $\{A_p^r, B_p^r, \Gamma_p^r\}$. Summation yields a total quadratic cost:

$$S_p(d) = \sum_{r=1}^R L_p^r(d) = \Big(\sum_r A_p^r\Big) d^2 + \Big(\sum_r B_p^r\Big) d + \Big(\sum_r \Gamma_p^r\Big)$$

The subpixel disparity is extracted in closed form by solving for the minimum:

$$d_p^* = -\frac{\sum_r B_p^r}{2\sum_r A_p^r}$$

This direct coefficient inversion eliminates the need for discrete label selection and enables efficient pixel-wise minimization.
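
Assuming per-direction coefficient maps are available (for example from scanline passes like the sketch in Section 2), the extraction reduces to two sums and one division. The array shapes and random values below are purely illustrative:

```python
import numpy as np

# hypothetical per-direction coefficients, shape (R, H, W)
R, H, W = 4, 2, 3
rng = np.random.default_rng(0)
A_dirs = rng.uniform(0.5, 2.0, size=(R, H, W))    # positive curvatures
B_dirs = rng.uniform(-4.0, 0.0, size=(R, H, W))

A_sum = A_dirs.sum(axis=0)            # sum_r A_p^r
B_sum = B_dirs.sum(axis=0)            # sum_r B_p^r
disparity = -B_sum / (2.0 * A_sum)    # closed-form subpixel minimum per pixel
print(disparity)
```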

4. Multi-scale Aggregation and Pyramid Fusion

CCA implements a multi-scale strategy to enhance robustness against depth-dependent defocus blur. Using an image pyramid with scales $s = S, \dots, 1$:

  1. Run CCA at the coarsest scale $S$ to obtain coefficient maps $A_{p,S}$, $B_{p,S}$.
  2. Upsample priors for scale $S-1$ (with upscaling factor $F$ between scales):
    • $B_{p,S-1}^{\text{prior}} = \text{Upsample}(B_{p,S} \cdot F)$
    • $A_{p,S-1}^{\text{prior}} = \text{Upsample}(A_{p,S} / F^2)$
  3. Inject the priors with weight $w$ before fine-scale CCA:
    • $\alpha_{p,S-1} \leftarrow \alpha_{p,S-1} + w \cdot A_{p,S-1}^{\text{prior}}$
    • $\beta_{p,S-1} \leftarrow \beta_{p,S-1} + w \cdot B_{p,S-1}^{\text{prior}}$
  4. Re-run CCA down to full resolution.

In areas with weak texture or blur, the coarse-scale belief steers aggregation; in regions with high local confidence, fine-scale data predominate.
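
A sketch of the prior upsampling and injection (steps 2 and 3) under the simplifying assumption of nearest-neighbour upsampling is shown below; the function names, the factor F = 2, and the weight w are illustrative:

```python
import numpy as np

def upsample_prior(A_coarse, B_coarse, F=2):
    """Upsample coarse-scale coefficient maps to the next finer scale.

    Nearest-neighbour repetition is used for simplicity; the scaling by F
    follows the prior definitions quoted above.
    """
    A_prior = np.kron(A_coarse, np.ones((F, F))) / F**2
    B_prior = np.kron(B_coarse, np.ones((F, F))) * F
    return A_prior, B_prior

def inject_prior(alpha_fine, beta_fine, A_prior, B_prior, w=0.1):
    """Blend the upsampled coarse belief into the fine-scale parabolas."""
    return alpha_fine + w * A_prior, beta_fine + w * B_prior

# toy usage: a 1x2 coarse map upsampled to 2x4
A_c = np.array([[2.0, 1.0]])
B_c = np.array([[-4.0, -1.0]])
A_pr, B_pr = upsample_prior(A_c, B_c, F=2)
print(A_pr.shape)   # (2, 4)
```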

5. Algorithmic Structure

CCA proceeds as follows:

for s = S down to 1:
    # 2. compute sparse cost volume -> local parabolas
    for each pixel p:
        compute C_int(p,d) for d in D
        find d0_p = argmin_d C_int(p,d)
        fit (a_p, b_p, c_p) via 3-point parabola around d0_p
        form global coefficients (α_p, β_p, γ_p)
        if s < S:
            α_p += w * A_prior(p)
            β_p += w * B_prior(p)
    # 3. multi-iteration, multi-direction aggregation
    initialize (A_p, B_p) = 0 for all p   # Γ_p is not tracked: it does not affect the arg min
    repeat T_s times:
        for direction r in 1..R:
            for pixel p along scanline r:
                if p is first in line:
                    A_curr = α_p; B_curr = β_p
                else:
                    # the predecessor minimum m = -B_prev/(2*A_prev) is absorbed into B_curr
                    P_adapt = P * A_prev * exp[-(I_p - I_{p-1})^2 / σ^2]
                    A_curr = α_p + P_adapt
                    B_curr = β_p + P_adapt * (B_prev / A_prev)
                accumulate:
                    A_p += A_curr
                    B_p += B_curr
                (A_prev, B_prev) ← (A_curr, B_curr)
    # optional: normalize (A_p, B_p)
    # 4. extract subpixel disparity
    for each p:
        d*_p = -B_p / (2 * A_p)
    # 5. prepare priors for next scale
    if s > 1:
        A_prior = Upsample(A_p) / F^2
        B_prior = Upsample(B_p) * F
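
To complement the pseudocode, the following is a compact, runnable NumPy sketch of the single-scale aggregation core (one iteration over four scanline directions). The parameter defaults, variable names, and the choice of four directions are assumptions for illustration, and Γ is again omitted because it does not affect the minimizer.

```python
import numpy as np

def cca_single_scale(alpha, beta, image, P=0.5, sigma=10.0):
    """Single-scale, single-iteration aggregation over 4 directions
    (left-to-right, right-to-left, top-to-bottom, bottom-to-top)."""
    H, W = alpha.shape
    A_sum = np.zeros((H, W))
    B_sum = np.zeros((H, W))

    def scan(a, b, im):
        """Forward recurrence along the last axis; returns path coefficients."""
        A = np.empty_like(a)
        B = np.empty_like(b)
        A[:, 0], B[:, 0] = a[:, 0], b[:, 0]
        for x in range(1, a.shape[1]):
            P_adapt = P * A[:, x - 1] * np.exp(-(im[:, x] - im[:, x - 1]) ** 2 / sigma**2)
            A[:, x] = a[:, x] + P_adapt
            B[:, x] = b[:, x] + P_adapt * (B[:, x - 1] / A[:, x - 1])
        return A, B

    for transpose in (False, True):      # rows, then columns
        for flip in (False, True):       # forward, then backward
            a, b, im = alpha, beta, image
            if transpose:
                a, b, im = a.T, b.T, im.T
            if flip:
                a, b, im = a[:, ::-1], b[:, ::-1], im[:, ::-1]
            A, B = scan(a, b, im)
            if flip:
                A, B = A[:, ::-1], B[:, ::-1]
            if transpose:
                A, B = A.T, B.T
            A_sum += A
            B_sum += B

    return -B_sum / (2.0 * A_sum)        # closed-form subpixel disparity map

# toy usage: flat image, local minima at d = 1 except a small outlier patch at d = 3
H, W = 8, 8
alpha = np.ones((H, W))
beta = -2.0 * np.ones((H, W))
beta[3:5, 3:5] = -6.0
print(cca_single_scale(alpha, beta, np.zeros((H, W))))
```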

6. Experimental Protocols and Quantitative Results

CCA has been quantitatively evaluated on several datasets:

  • DSLR: Canon DP dataset (Punnappurath et al., ICCP 2020)
  • Phone: Google Pixel 2/3 (Garg et al., ICCV 2019)
  • Standard Stereo: Middlebury 2014 (quarter-resolution)

Metrics include the affine-invariant errors AI(1) and AI(2), $1 - |\rho_s|$ (one minus the absolute Spearman rank correlation), bad-pixel rates at thresholds of 0.5, 1, and 2 pixels, and RMSE. Key results are summarized in the tables below:

Table 1. DSLR Results: AI(1), AI(2), 1−|ρ_s|, and their geometric mean (lower is better)

| Method | AI(1) | AI(2) | 1−\|ρ_s\| | Geo. Mean |
|--------------|-------|-------|-----------|-----------|
| SDoF | 0.087 | 0.129 | 0.291 | 0.144 |
| DPdisp | 0.047 | 0.074 | 0.082 | 0.065 |
| DPE | 0.061 | 0.098 | 0.103 | 0.110 |
| CCA | 0.041 | 0.068 | 0.061 | 0.053 |
| CCA + filter | 0.036 | 0.061 | 0.049 | 0.048 |

Table 2. Phone (Pixel) Results (same metrics, lower is better)

| Method | AI(1) | AI(2) | 1−\|ρ_s\| | Geo. Mean |
|--------------|-------|-------|-----------|-----------|
| SDoF | 0.027 | 0.037 | 0.236 | 0.063 |
| CCA | 0.026 | 0.036 | 0.225 | 0.059 |
| CCA + filter | 0.025 | 0.035 | 0.217 | 0.057 |

Table 3. Middlebury ¼-res Results (non-occluded, lower is better)

| Method | bad<0.5 px | bad<1 px | bad<2 px | RMSE |
|--------------|------------|----------|----------|------|
| SGM | 26.1 % | 17.2 % | 12.2 % | 9.90 |
| CCA | 26.2 % | 18.3 % | 13.2 % | 5.20 |
| SGM + filter | 23.5 % | 15.2 % | 10.5 % | 4.04 |
| CCA + filter | 24.6 % | 16.7 % | 11.6 % | 4.04 |
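
For reference, the affine-invariant metrics in the tables can be approximated as in the sketch below. The published protocol (Garg et al., ICCV 2019) uses a metric-specific, confidence-weighted affine fit, so this simplified least-squares version is only indicative; scipy is assumed for the Spearman correlation:

```python
import numpy as np
from scipy.stats import spearmanr

def affine_invariant_errors(pred, gt):
    """Fit a per-image affine map a*pred + b to the ground truth by least
    squares, then report mean-absolute (AI(1)-style) and RMS (AI(2)-style)
    errors plus 1 - |Spearman rho|."""
    X = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (a, b), *_ = np.linalg.lstsq(X, gt.ravel(), rcond=None)
    aligned = a * pred + b
    ai1 = np.mean(np.abs(aligned - gt))
    ai2 = np.sqrt(np.mean((aligned - gt) ** 2))
    rho, _ = spearmanr(pred.ravel(), gt.ravel())
    return ai1, ai2, 1.0 - abs(rho)

# toy check: a noisy affine transform of the ground truth scores well
rng = np.random.default_rng(1)
gt = rng.uniform(0.0, 5.0, size=(16, 16))
pred = 0.5 * gt + 0.1 + rng.normal(0.0, 0.05, size=(16, 16))
print(affine_invariant_errors(pred, gt))
```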

CCA delivers continuous, subpixel disparities in closed form and operates in $\mathcal{O}(WHR + WHD)$ time with $\mathcal{O}(WH)$ memory (for an image of width $W$ and height $H$, $D$ integer disparity candidates, and $R$ aggregation directions), circumventing the need for full cost-volume storage. On DP images, the method surpasses prior learning-based and non-learning baselines; on standard stereo, it is comparable to Semi-Global Matching (SGM) while being markedly more memory-efficient.
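
As a rough illustration of the memory claim, the arithmetic below compares a full cost volume against the two coefficient maps CCA retains; the image size, disparity count, and float32 storage are assumptions, not figures from the paper:

```python
# assumed: 4000x3000 image, D = 64 integer disparity candidates, float32 values
W, H, D = 4000, 3000, 64
bytes_per_value = 4
cost_volume_mb = W * H * D * bytes_per_value / 1e6   # ~3072 MB for a full volume
cca_coeff_mb = W * H * 2 * bytes_per_value / 1e6     # ~96 MB for the A and B maps
print(cost_volume_mb, cca_coeff_mb)
```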

7. Context and Implications

CCA addresses specific limitations of DP disparity extraction, namely sensitivity to PSF variation and the infeasibility of conventional stereo matching due to small disparities. The quadratic form persists throughout coefficient propagation, enabling closed-form minimization and robust multi-scale fusion. A plausible implication is that CCA’s framework could generalize to other continuous-label matching tasks where cost functions are locally convex and aggregatable under quadratic constraints. Its memory and time efficiency offer practical advantages for embedded or real-time applications. CCA’s competitive performance on standard stereo benchmarks (Middlebury) with a reduced computational footprint demonstrates its potential for broader adoption across passive depth sensing modalities (Monin et al., 2023).
