RDIS Algorithm: Optimization, Depth & Imputation

Updated 15 June 2026
  • RDIS is an acronym shared by three separate methods for nonconvex optimization, monocular depth estimation, and time series imputation; they are related conceptually rather than by a common formulation.
  • In nonconvex optimization, RDIS uses hypergraph-based variable selection and recursive decomposition to break a problem into independent subproblems, which can yield exponential speedups.
  • In depth estimation, RDIS uses ordinal pretraining on relative depth; in time series imputation, it combines random drop imputation with ensemble self-training. Both aim to improve prediction robustness and accuracy.

The acronym RDIS refers to three distinct algorithms and frameworks in the literature, each addressing a separate core problem: (1) Recursive Decomposition for Nonconvex Optimization; (2) Relative Depth in Stereo for monocular depth estimation pretraining; (3) Random Drop Imputation with Self-Training for incomplete time series imputation. This article covers each framework in turn, giving definitions, core methodology, and empirical findings.

1. Recursive Decomposition for Nonconvex Optimization

1.1. Problem Formulation

RDIS solves the global optimization problem $\min_{x \in \mathbb{R}^n} f(x)$, where $f$ is continuously differentiable and has at least one global minimizer $x^*$ with finite value $f^* = f(x^*) > -\infty$. The variable indices are denoted $I = \{1,2,\ldots,n\}$. For any subset $C \subseteq I$, the variable vector $x$ is partitioned into $x_C$ and $x_U$ with $U = I \setminus C$. A partial assignment $\rho_C$ fixes $x_C$; $f|_{\rho_C}(x_U)$ denotes the function with $x_C$ fixed (Friesen et al., 2016).

1.2. Recursive Decomposition Strategy

RDIS alternates two phases:

  • Value Selection: Select a cutset $x_C$ of variables (via hypergraph partitioning of the factorized $f$), then optimize $f$ over $x_C$ (holding $x_U$ fixed) to obtain an assignment $\rho_C$.
  • Decomposition: Simplify the restricted function $f|_{\rho_C}(x_U)$ by omitting or approximating negligible terms. Identify $k$ independent subfunctions $f_i(x_{U_i})$ on disjoint variable blocks $x_{U_i}$, then recurse on each.

This approach exploits local separability after key variable assignments, similar to DPLL-style SAT solvers and recursive conditioning in inference (Friesen et al., 2016).

1.3. Variable Selection via Hypergraph Partitioning

For $f$ expressed as a sum of terms, define a hypergraph with one vertex per term and one hyperedge per variable, each hyperedge connecting all terms that involve that variable. A $k$-way partition that minimizes the number of cut hyperedges under a balance constraint yields a small cutset $x_C$ whose assignment decomposes the residual function. Tools such as PaToH are used for hypergraph partitioning (Friesen et al., 2016).
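
The effect of fixing a cutset can be illustrated with a minimal sketch. It does not call PaToH or perform balanced partitioning; it only shows how terms fall into independent blocks once a hand-picked cutset is assigned. The function name `independent_components` and the toy term list are illustrative assumptions, not part of the original method.

```python
from collections import defaultdict

def independent_components(terms, cutset):
    """Group terms into blocks that share no free variable once `cutset` is fixed.

    terms  : list of tuples of variable indices, one tuple per additive term
    cutset : set of variable indices assumed to be assigned already
    """
    parent = list(range(len(terms)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Each free variable acts as a hyperedge joining every term that uses it.
    terms_by_var = defaultdict(list)
    for t, variables in enumerate(terms):
        for v in variables:
            if v not in cutset:
                terms_by_var[v].append(t)
    for members in terms_by_var.values():
        for t in members[1:]:
            parent[find(t)] = find(members[0])

    groups = defaultdict(list)
    for t in range(len(terms)):
        groups[find(t)].append(t)
    return sorted(groups.values())

# Chain-structured toy objective: f = f0(x0,x1) + f1(x1,x2) + f2(x2,x3) + f3(x3,x4)
terms = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(independent_components(terms, cutset=set()))  # a single connected block
print(independent_components(terms, cutset={2}))    # two independent blocks
```

In this toy chain, fixing the single variable `x2` splits four coupled terms into two blocks that can be optimized separately, which is the property a good cutset is meant to provide.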

1.4. Pseudocode and Algorithmic Structure

The main RDIS pseudocode proceeds as follows:

  1. Choose cutset $x_C$ via hypergraph cut.
  2. For each restart, partition $x$ into $(x_C, x_U)$ and optimize $f$ over $x_C$ with a user-chosen nonconvex subspace optimizer $S$, yielding assignment $\rho_C$.
  3. Simplify the restricted function $f|_{\rho_C}(x_U)$ using a user-set tolerance.
  4. Decompose the simplified function into independent components and recurse on each.
  5. Update the global record if a new best function value is found.
  6. Terminate according to a preset criterion, e.g., fixed outer restarts or full variable assignment (Friesen et al., 2016).
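
The recursive structure of these steps can be sketched on a toy problem. This is a simplified illustration under stated assumptions, not the authors' implementation: the subspace optimizer is replaced by enumeration over a small fixed grid (`GRID`), the cutset is a single variable chosen by a crude "appears in most terms" heuristic instead of a hypergraph cut, and the simplification step and restarts are omitted. All names (`components`, `solve`, `GRID`) are hypothetical.

```python
import itertools

GRID = [-2.0, -1.0, 0.0, 1.0, 2.0]  # assumed candidate values per variable

def components(terms, free_vars):
    """Split terms into groups that share no free variable."""
    groups = []  # list of (set_of_free_vars, list_of_terms)
    for vars_, fn in terms:
        touched = set(vars_) & free_vars
        merged_vars, merged_terms = set(touched), [(vars_, fn)]
        rest = []
        for group_vars, group_terms in groups:
            if group_vars & touched:
                merged_vars |= group_vars
                merged_terms += group_terms
            else:
                rest.append((group_vars, group_terms))
        groups = rest + [(merged_vars, merged_terms)]
    return groups

def solve(terms, free_vars, assignment):
    """Return (best_value, best_assignment) for the sum of `terms` over free_vars."""
    free_vars = set(free_vars)
    if len(free_vars) <= 1:
        # Leaf: enumerate the remaining variable (if any) directly.
        names = sorted(free_vars)
        best = (float("inf"), dict(assignment))
        for values in itertools.product(GRID, repeat=len(names)):
            trial = {**assignment, **dict(zip(names, values))}
            total = sum(fn(*(trial[v] for v in vars_)) for vars_, fn in terms)
            if total < best[0]:
                best = (total, trial)
        return best
    # Cutset: the free variable that appears in the most terms (crude heuristic).
    cut = max(sorted(free_vars), key=lambda v: sum(v in vars_ for vars_, _ in terms))
    best = (float("inf"), dict(assignment))
    for value in GRID:  # value selection for the cutset variable
        trial = {**assignment, cut: value}
        total, merged = 0.0, dict(trial)
        # Decompose the residual problem and recurse on each independent block.
        for group_vars, group_terms in components(terms, free_vars - {cut}):
            sub_value, sub_assignment = solve(group_terms, group_vars, trial)
            total += sub_value
            merged.update(sub_assignment)
        if total < best[0]:
            best = (total, merged)
    return best

# Toy chain objective: sum_i (x_{i+1} - x_i - 1)^2 over five variables.
step = lambda a, b: (b - a - 1.0) ** 2
chain = [((i, i + 1), step) for i in range(4)]
value, assignment = solve(chain, {0, 1, 2, 3, 4}, {})
print(value, assignment)
```

Because the blocks produced after fixing the cutset share no free variables, the minimum of their sum is the sum of their minima, which is why each block can be solved independently.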

1.5. Theoretical Guarantees

Assume that every recursion level decomposes the problem into $k$ subproblems using a cutset of size $d$, and let $T(n)$ be the number of subspace optimizer calls required on $n$ variables. A recurrence analysis over the recursion levels bounds $T(n)$; the closed-form bound is stated in Friesen et al. (2016).

This result implies exponential speedups over grid search or random-restart descent, under mild technical conditions for global convergence. When the subspace optimizer satisfies Armijo-type sufficient decrease and gradient-norm conditions, all limit points are stationary, and global optimality is achieved with high probability under random restarts (Friesen et al., 2016).

1.6. Use of Standard Optimizers

RDIS is agnostic to the choice of local optimizer $S$ at recursion leaves. $S$ may be gradient descent with restarts, Levenberg–Marquardt, or a similar method; the optimizer works on the chosen cutset block and treats all other variables as fixed. If the remaining variable set is empty, RDIS directly applies $S$ to the full initial problem (Friesen et al., 2016).

1.7. Empirical Evaluation

Empirical results demonstrate RDIS's advantage in several domains:

  • Structure from Motion: On bundle adjustment (up to 23,000 variables), RDIS consistently finds lower reprojection error than Levenberg–Marquardt (LM) and block-coordinate LM, with advantages increasing at scale.
  • Highly Multimodal Synthetic Functions: RDIS outperforms conjugate gradient and block variants by orders of magnitude in objective value and time.
  • Protein Sidechain Placement: On 21 proteins (up to 943 variables), RDIS attains lower energy than CGD and BCD-CGD, with the simplification tolerance trading speed against final energy.

The combination of intelligent cutset selection, function simplification, and recursive decomposition yields exponential gains over standard multistart or block-coordinate techniques (Friesen et al., 2016).

2. RDIS Dataset and Method for Monocular Depth Estimation

2.1. Dataset Construction and Label Semantics

The RDIS dataset is built from 70 rectified 3D movies, yielding 97,652 stereo keyframes. Semi-Global Matching (SGM) is used to compute dense disparity maps, followed by boundary correction and quality control. Relative depth ground truth is encoded as an ordinal relationship on each point pair, "closer", "farther", or "equal", based on a thresholded disparity difference (Cao et al., 2018).
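
A minimal sketch of how a disparity map could be turned into an ordinal label for a point pair. The numeric encoding (+1, -1, 0), the absolute-difference threshold `tau`, and the function name `ordinal_label` are assumptions made for illustration; the source only states that labels come from a thresholded disparity difference. The sketch relies on the standard stereo fact that larger disparity means a closer point.

```python
import numpy as np

def ordinal_label(disparity, p, q, tau=1.0):
    """Return +1 if point p is closer than q, -1 if farther, 0 if roughly equal.

    disparity : 2-D array of disparities (larger disparity = closer to the camera)
    p, q      : (row, col) pixel coordinates
    tau       : assumed threshold on the absolute disparity difference
    """
    diff = disparity[p] - disparity[q]
    if diff > tau:
        return 1
    if diff < -tau:
        return -1
    return 0

disparity = np.array([[10.0, 10.5],
                      [3.0, 20.0]])
print(ordinal_label(disparity, (0, 0), (1, 0)))  # p has much larger disparity than q
print(ordinal_label(disparity, (0, 0), (0, 1)))  # difference is within the threshold
print(ordinal_label(disparity, (1, 0), (1, 1)))  # p has much smaller disparity than q
```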

2.2. Network Architecture

The approach employs a "wide" ResNet (seven units) in the pre-activation BatchNorm–ReLU style, initialized from ImageNet+Places365 pretraining. The network head is configured for regression during pretraining and for per-pixel classification during finetuning (Cao et al., 2018).

2.3. Pretraining on Ordinal Depth

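Pretraining uses ordinal pairs sampled from each image and a ranking loss on the predicted relative depth of the two points in each pair. Empirically, transfer performance depends on the number of pairs sampled per image; the best-performing setting is reported in Cao et al. (2018).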
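
As an illustration only, the sketch below implements one widely used pairwise ranking loss for ordinal depth labels: a logistic penalty for ordered pairs and a squared penalty for "equal" pairs. Whether this matches the paper's loss term by term is an assumption, as are the label encoding (+1 closer, -1 farther, 0 equal), the convention that a higher score means a closer point, and the function name.

```python
import numpy as np

def pairwise_ranking_loss(z_p, z_q, labels):
    """Mean ranking loss over point pairs.

    z_p, z_q : predicted scores for the two points of each pair (higher = closer)
    labels   : +1 if p is closer, -1 if p is farther, 0 if roughly equal
    """
    diff = z_p - z_q
    ordered = np.logaddexp(0.0, -labels * diff)  # log(1 + exp(-label * diff))
    equal = diff ** 2                            # pull "equal" pairs together
    return float(np.mean(np.where(labels == 0, equal, ordered)))

labels = np.array([1, -1, 0])
agree = pairwise_ranking_loss(np.array([2.0, 0.0, 1.0]), np.array([0.0, 2.0, 1.0]), labels)
violate = pairwise_ranking_loss(np.array([0.0, 2.0, 3.0]), np.array([2.0, 0.0, 1.0]), labels)
print(agree, violate)  # predictions that respect the labels give the smaller loss
```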

2.4. Depth as Classification and Information Gain Loss

Finetuning discretizes depth into a fixed number of bins that are uniform in log-space, and the network outputs per-pixel logits over those bins. The multinomial logistic loss is modulated by an information gain matrix that assigns a nonzero weight to bins near the ground-truth bin. Near-correct predictions therefore receive partial credit, which improves the gradient signal for ambiguous cases (Cao et al., 2018).
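
A minimal NumPy sketch of the two ingredients described above: log-space binning and a cross-entropy weighted by an information gain matrix. The weight form `exp(-alpha * (p - q)**2)`, the value of `alpha`, and the function names are assumptions chosen for illustration and are not confirmed by the text here.

```python
import numpy as np

def log_space_bins(depth, d_min, d_max, n_bins):
    """Map metric depth to integer bin indices that are uniform in log-depth."""
    edges = np.linspace(np.log(d_min), np.log(d_max), n_bins + 1)
    # Interior edges only, so indices fall in 0 .. n_bins - 1.
    return np.digitize(np.log(depth), edges[1:-1])

def info_gain_loss(logits, target_bins, alpha=0.2):
    """Cross-entropy in which bins near the target also receive weight.

    logits      : (N, B) array of per-pixel scores over B depth bins
    target_bins : (N,) integer ground-truth bin indices
    alpha       : assumed sharpness of the weighting; large alpha approaches one-hot
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    bins = np.arange(logits.shape[1])
    weights = np.exp(-alpha * (bins[None, :] - target_bins[:, None]) ** 2)
    return float(-(weights * log_prob).sum(axis=1).mean())

bins = log_space_bins(np.array([1.0, 3.1, 10.0]), d_min=1.0, d_max=10.0, n_bins=4)
target = np.array([0])
near_miss = info_gain_loss(np.array([[0.0, 5.0, 0.0, 0.0]]), target)
far_miss = info_gain_loss(np.array([[0.0, 0.0, 0.0, 5.0]]), target)
print(bins, near_miss, far_miss)  # the near miss is penalized less than the far miss
```

With a very large `alpha` the weights collapse to a one-hot row and the loss reduces to the ordinary multinomial logistic loss.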

2.5. Evaluation and Ablation

Ablation studies confirm the benefits of RDIS pretraining, network width, and the information gain loss. The full method achieves state-of-the-art results on NYU v2 and KITTI, outperforming prior methods on root-mean-squared error, relative error, log error, and threshold accuracy. The pretraining also generalizes to relative-depth benchmarks: on the DIW test set it lowers WHDR below the prior best (Cao et al., 2018).
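
For reference, the metrics named above are commonly computed as in the following sketch. The exact definitions used in the paper are assumed to follow these conventions (absolute relative error, log10 error, and the 1.25 threshold ratio); the function name `depth_metrics` is hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common monocular depth metrics for positive depth arrays of equal shape."""
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "log10": float(np.mean(np.abs(np.log10(pred) - np.log10(gt)))),
        "delta1": float(np.mean(ratio < 1.25)),  # accuracy within threshold 1.25
    }

gt = np.array([1.0, 2.0, 4.0])
perfect = depth_metrics(gt, gt)
doubled = depth_metrics(2.0 * gt, gt)
print(perfect)
print(doubled)
```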

3. Random Drop Imputation with Self-Training (Time Series)

3.1. Problem Context and Main Steps

Given an incomplete multivariate time series together with its original observation mask, RDIS trains an imputation model to estimate the unobserved entries. The methodology consists of:

  • Random-Drop Imputation (RDI): Mask a random subset of the observed entries and train the model to recover them with an explicit loss on those entries.
  • Self-training: Train an ensemble of models. The ensemble generates pseudo-labels for the originally missing entries; these are filtered by prediction variance and used for further fine-tuning (Choi et al., 2020).
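
The random-drop step can be sketched as a mask operation. This is a minimal illustration, assuming a boolean observation mask and an independent per-entry drop probability; the function name `random_drop` and the parameter `drop_prob` are hypothetical.

```python
import numpy as np

def random_drop(observed, drop_prob, rng):
    """Hide a random subset of observed entries.

    observed  : boolean array, True where a value was actually measured
    drop_prob : assumed per-entry probability of hiding an observed value
    Returns (input_mask, drop_mask): what the model may see, and what it must recover.
    """
    drop = observed & (rng.random(observed.shape) < drop_prob)
    return observed & ~drop, drop

rng = np.random.default_rng(0)
observed = np.array([[True, True, False],
                     [True, False, True]])
input_mask, drop_mask = random_drop(observed, drop_prob=0.5, rng=rng)
print(input_mask)
print(drop_mask)
```

Because the dropped entries were originally observed, their true values are available as supervision, which is what makes the loss on them explicit.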

3.2. Explicit Imputation and Self-training Losses

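The loss on each random-dropped instance is a reconstruction error computed only over the newly dropped entries, whose true values are known. The self-training loss adds pseudo-value targets at confident unobserved positions, meaning positions where the ensemble's prediction variance falls below a threshold, and penalizes the model's deviation from those targets (Choi et al., 2020).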
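
A minimal sketch of the two quantities involved: a masked reconstruction error and variance-filtered pseudo-labels from an ensemble. The squared-error form, the use of the ensemble mean as the pseudo-value, and the names `masked_mse`, `pseudo_labels`, and `var_threshold` are assumptions for illustration.

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Mean squared error over the entries selected by a boolean mask."""
    mask = mask.astype(float)
    return float(((pred - target) ** 2 * mask).sum() / max(mask.sum(), 1.0))

def pseudo_labels(ensemble_preds, observed, var_threshold):
    """Select confident pseudo-values for originally missing entries.

    ensemble_preds : (n_models, T, D) array of imputations from each model
    observed       : (T, D) boolean mask of originally observed entries
    Returns (pseudo_values, confident_mask).
    """
    mean = ensemble_preds.mean(axis=0)
    variance = ensemble_preds.var(axis=0)
    confident = (~observed) & (variance < var_threshold)
    return mean, confident

observed = np.array([[True, False],
                     [False, True]])
ensemble_preds = np.array([
    [[0.0, 1.0], [0.0, 2.0]],
    [[0.0, 1.0], [5.0, 2.0]],
    [[0.0, 1.0], [10.0, 2.0]],
])
values, confident = pseudo_labels(ensemble_preds, observed, var_threshold=0.1)
print(values)
print(confident)  # only the missing entry on which the models agree is kept
print(masked_mse(np.array([[1.0, 2.0]]), np.array([[0.0, 2.0]]), np.array([[True, False]])))
```

The self-training loss can then be taken as `masked_mse(model_output, values, confident)`, added to the explicit loss on dropped entries.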

3.3. Pseudocode and Model-Agnosticity

The RDIS framework supports any imputation backbone (e.g., GRU, Bi-GRU, Transformer, TCN, GAN). The pseudocode alternates RDI training (explicit drop and recovery) with periodic self-training cycles (ensemble pseudo-label generation, entropy filtering, model update). Ensemble size, drop probability, entropy threshold, and update frequency are the principal hyperparameters (Choi et al., 2020).

3.4. Empirical Validation

Empirical comparisons on the Air Quality (11 variables, 48 time steps) and Gas Sensor (19 variables) datasets show that RDIS with a Bi-GRU backbone attains the lowest mean squared error among the compared methods, including BRITS at a 50% missing rate. Ablations indicate that ensemble RDI and the self-training stage each improve performance, with the largest gains at high missing rates (Choi et al., 2020).

3.5. Practical Considerations

Using an ensemble increases computational cost and requires careful tuning of the drop rate, entropy threshold, and update frequency. Pseudo-label quality depends on the ensemble's variance estimates, and RDIS produces only point estimates; extensions to full predictive distributions are left to future work (Choi et al., 2020).

4. Comparison of RDIS Methodologies

Context | RDIS Meaning | Core Principle
Nonconvex Optimization | Recursive Decomposition for Nonconvex Optimization | Divide-and-conquer optimization
Depth Estimation | Relative Depth in Stereo (dataset and pretraining) | Ordinal pretraining
Time Series Imputation | Random Drop Imputation with Self-Training | Explicit drop + ensembling

All three RDIS methods rely on some form of decomposition (explicit or statistical) and on auxiliary structure (graph partitioning, ordinal relationships, ensemble consensus) to improve performance in their respective domains. Each reports state-of-the-art or competitive empirical results in its target application (Friesen et al., 2016; Cao et al., 2018; Choi et al., 2020).

5. Significance and Research Impact

As recursive decomposition for nonconvex optimization, RDIS advanced scalable global optimization by importing principles from combinatorial problem solving, and it outperformed prior continuous methods on vision and molecular modeling tasks (Friesen et al., 2016). As a depth estimation pretraining method, RDIS makes dense ordinal-depth supervision widely available, compensating for the scarcity of metric ground truth and producing robust monocular predictors (Cao et al., 2018). In imputation, RDIS provides explicit supervision and variance-filtered pseudo-labels for incomplete time series, outperforming strong baselines at high missing rates (Choi et al., 2020). The common thread is the augmentation of training or search with problem-informed structure, whether decompositional, ordinal, or variance-filtered.
