
PW-DICE: Ultrasound & Imitation Advances

Updated 3 April 2026
  • PW-DICE for ultrasound imaging employs a score-based diffusion shortcut to reconstruct high-quality compounded B-mode images from a single plane wave, reducing diffusion steps by roughly 60%.
  • PW-DICE for offline imitation learning uses a primal optimal transport framework with f-divergence regularization to achieve robust state-occupancy matching between learner and expert.
  • Both methods deliver practical gains (computational efficiency in imaging, theoretical unification in imitation learning) while highlighting shared challenges in hyperparameter tuning and scalability.

PW-DICE encompasses two distinct methodological advances in machine learning: one in inverse problems for ultrasound (US) reconstruction via diffusion models, and another in offline imitation learning, where Primal Wasserstein DICE provides a regularized, optimal transport-based approach for state-occupancy matching. Despite the coincident acronym, the two contributions are unrelated in their application domains and theoretical underpinnings. This entry documents both lines in depth, reflecting their independent developments as established in (Li et al., 2023) and (Yan et al., 2023).

1. PW-DICE for Ultrasound Imaging

PW-DICE—"single plane wave takes a shortcut to plane wave compounding"—is a reconstruction paradigm designed to synthesize high-quality compounded B-mode US images from a single low-quality plane-wave (PW) acquisition using a score-based diffusion model shortcut (Li et al., 2023). The core innovation is leveraging the structural similarity between single-PW and compounded-PW images to bypass initialization from pure Gaussian noise, thereby reducing sampling cost.

Problem Formulation

US image reconstruction is modeled as a linear inverse problem. Let $y \in \mathbb{R}^{L \cdot r}$ be the measured radio-frequency data from $L$ transducer elements with $r$ samples each; $x \in \mathbb{R}^p$ discretizes the field of scattering coefficients, and $H \in \mathbb{R}^{(L r) \times p}$ models the physical measurement. Signal formation is $y = H x + n$, with $n$ Gaussian noise. Standard delay-and-sum (DAS) beamforming computes $x' = H^\top y$, applied either to a single PW (yielding $x_s$) or to compounded multi-angle acquisitions (yielding $x_c$). The goal is to reconstruct high-quality $x_c$ given only $x_s$.
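The linear forward model and its DAS adjoint can be sketched in a few lines of NumPy. All sizes here are toy values for illustration; the paper's $L$, $r$, and $p$ are far larger, and the random $H$ stands in for the true physical measurement operator:

```python
import numpy as np

# Hypothetical toy sizes; the real L (elements), r (samples), p (pixels)
# are much larger, and H encodes acoustic propagation, not random values.
L_elems, r_samples, p_pixels = 8, 16, 32
rng = np.random.default_rng(0)

H = rng.normal(size=(L_elems * r_samples, p_pixels))  # measurement model
x_true = rng.normal(size=p_pixels)                    # scattering coefficients
noise = 0.01 * rng.normal(size=L_elems * r_samples)

y = H @ x_true + noise        # signal formation: y = Hx + n
x_das = H.T @ y               # delay-and-sum beamforming as the adjoint H^T y

assert x_das.shape == (p_pixels,)
```

The adjoint $H^\top y$ is exactly the algebraic view of DAS: each pixel accumulates the delayed channel samples that could have originated from it.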

Algorithmic Structure

PW-DICE modifies the standard diffusion generative process as follows:

  1. Rather than sampling from maximum noise ($x \sim \mathcal{N}(0, \sigma_{\max}^2 I)$), it forward-diffuses the accessible single PW image $x_s$ up to an intermediate noise level $\sigma_{t^*}$, forming $x_{t^*} = x_s + \sigma_{t^*}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$.
  2. The reverse diffusion trajectory is initialized from $x_{t^*}$ and carried out for the remaining $t^*$ steps, using the same EDM (Elucidated Diffusion Model) sampling, typically employing the Heun 2nd-order solver.
  3. Measurement consistency is enforced at each reverse step by projecting the iterate onto the data-consistent set $\{x : H x = y\}$.

Empirically, starting the reverse process from the partially noised single-PW image $x_{t^*}$ (with a suitably chosen intermediate noise level $\sigma_{t^*}$) reduces diffusion steps from 50 (conventional) to 20, a roughly 60% reduction in computational workload. The resulting images match or slightly exceed the contrast-to-noise ratio (CNR) and generalized CNR (gCNR) of both conventional diffusion models and the multi-angle (75 PW) DAS reconstructions.
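The shortcut can be sketched as follows. This is a minimal illustration, not the paper's implementation: the denoiser is a trivial stand-in for the trained EDM network, the first-order Euler step replaces the Heun solver, and the pseudo-inverse projection is one simple choice of data-consistency step:

```python
import numpy as np

rng = np.random.default_rng(1)

def denoiser(x, sigma):
    # Stand-in for the trained EDM score network D(x, sigma);
    # it merely shrinks toward zero for illustration.
    return x / (1.0 + sigma**2)

def pw_dice_shortcut(x_s, H, y, sigmas_full, t_star):
    """Forward-diffuse the single-PW image x_s to an intermediate noise
    level, then run only t_star reverse steps, projecting onto the
    data-consistent set {x : Hx = y} after each step. Illustrative only."""
    sigmas = sigmas_full[-t_star:]                       # low-noise tail only
    x = x_s + sigmas[0] * rng.normal(size=x_s.shape)     # x_{t*} = x_s + sigma*eps
    H_pinv = np.linalg.pinv(H)                           # for the projection
    for i in range(len(sigmas) - 1):
        d = (x - denoiser(x, sigmas[i])) / sigmas[i]     # ODE derivative
        x = x + (sigmas[i + 1] - sigmas[i]) * d          # Euler step (Heun omitted)
        x = x + H_pinv @ (y - H @ x)                     # data consistency
    return x

# Toy problem: a 20-step shortcut out of a 50-step schedule.
H = rng.normal(size=(12, 8))
x_true = rng.normal(size=8)
y = H @ x_true
sigmas_full = np.geomspace(80.0, 0.002, 50)              # decreasing noise levels
x_hat = pw_dice_shortcut(H.T @ y, H, y, sigmas_full, t_star=20)
assert x_hat.shape == (8,)
```

The key saving is visible in the loop bound: only the last `t_star` entries of the noise schedule are ever visited.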

Architecture and Training

The denoiser/score network adopts the EDM architecture: a U-Net conditioned on the noise level, trained with the standard $L_2$ denoising score-matching loss. The model is trained on 300 in-vivo carotid frames (compounded from 75 PWs, angles uniformly distributed); the noise schedule, training hyperparameters, and number of diffusion steps follow (Li et al., 2023). Experiments are conducted on the Verasonics Vantage 256 system, with test data comprising 100 frames.
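The denoising objective reduces to penalizing the network's reconstruction error on noised inputs. A minimal sketch (the identity "denoiser" and the single fixed noise level are illustrative stand-ins; EDM's sigma-dependent loss weighting is omitted):

```python
import numpy as np

rng = np.random.default_rng(2)

def denoising_loss(denoiser, x_clean, sigma):
    """L2 denoising score-matching objective: noise a clean sample at level
    sigma and penalize the squared reconstruction error of the denoiser."""
    x_noised = x_clean + sigma * rng.normal(size=x_clean.shape)
    return float(np.mean((denoiser(x_noised, sigma) - x_clean) ** 2))

identity_denoiser = lambda x, sigma: x      # trivial stand-in network
x_clean = rng.normal(size=(16, 16))         # stand-in for a compounded frame
loss = denoising_loss(identity_denoiser, x_clean, sigma=0.5)
assert loss > 0.0
```

In training, `sigma` is sampled per batch element from the EDM noise distribution rather than held fixed.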

Ablation and Sensitivity

Performance is sensitive to the initialization noise level $\sigma_{t^*}$: a lower $\sigma_{t^*}$ enables fewer reverse steps but increases the risk of structural bias inherited from the single-PW input, while a higher $\sigma_{t^*}$ allows more robust convergence at greater cost. No principled approach to tuning these hyperparameters is provided; adaptive selection is proposed as future work.

Results

PW-DICE achieves comparable or superior CNR/gCNR to both DAS and conventional EDM at roughly 40% of the computational cost. Variance across runs is reduced owing to the structured single-PW prior. For all tested initialization levels, gCNR surpasses the 75-PW baseline beyond 10–20 reverse steps (Li et al., 2023).

2. PW-DICE for Offline Imitation from Observation

Primal Wasserstein DICE (PW-DICE) is an advancement in distribution correction estimation (DICE) methods for offline imitation learning from observation (LfO). It offers a primal optimal transport-based approach to discounted state-occupancy matching between learner and expert, theoretically unifying prior $f$-divergence-based approaches (Yan et al., 2023).

State-Occupancy Matching in LfO

Given a discounted MDP with policy $\pi$, the state-occupancy measure is $d^\pi(s) = (1-\gamma) \sum_{t=0}^{\infty} \gamma^t \Pr(s_t = s \mid \pi)$. In LfO, the goal is to match $d^\pi$ (learner) to $d^E$ (expert), given an expert state-only dataset $\mathcal{D}_E$ and a non-expert dataset $\mathcal{D}_I$ of state–action–next-state tuples.
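In the tabular case, the discounted state-occupancy measure has a closed form, $d^\pi = (1-\gamma)\,\mu_0^\top (I - \gamma P_\pi)^{-1}$, which a short NumPy sketch can verify (the transition matrix and initial distribution below are toy values):

```python
import numpy as np

# Toy 3-state MDP under a fixed policy: P[s, s'] is the induced state
# transition matrix; all numbers are illustrative.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7]])
mu0 = np.array([1.0, 0.0, 0.0])   # initial state distribution
gamma = 0.95

# Closed form of the discounted occupancy via the Neumann series
# (I - gamma P)^{-1} = sum_t gamma^t P^t.
d_pi = (1 - gamma) * mu0 @ np.linalg.inv(np.eye(3) - gamma * P)

assert abs(d_pi.sum() - 1.0) < 1e-9   # occupancies form a distribution
assert (d_pi >= 0).all()
```

Matching `d_pi` to an expert occupancy computed the same way is exactly the state-occupancy-matching objective of LfO.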

Primal Wasserstein Formulation

PW-DICE replaces $f$-divergence objectives with the primal 1-Wasserstein distance:

$$W_1(d^\pi, d^E) = \min_{\Pi \ge 0} \sum_{s, s'} \Pi(s, s')\, c(s, s')$$

subject to the marginal constraints

$$\sum_{s'} \Pi(s, s') = d^\pi(s), \qquad \sum_{s} \Pi(s, s') = d^E(s'),$$

and Bellman-flow constraints linking $d^\pi$ to the state–action occupancy $d^\pi(s, a)$.

Because the primal LP is intractable when the transport plan $\Pi$ is high-dimensional, pessimistic $f$-divergence regularizers are introduced, penalizing deviation of the learner occupancy from the empirical data distribution. A single-level dual, parameterized by learned potential functions, emerges through Lagrangian and Fenchel duality, resulting in a convex, unconstrained objective suitable for stochastic gradient estimation.
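To make the primal LP concrete, here is a tiny transport problem between two 3-state occupancies. For 1-D supports with cost $c(s,s') = |s - s'|$, the monotone (north-west corner) coupling solves the LP exactly, so no general LP solver is needed; the occupancies and cost are illustrative, not the paper's learned metric:

```python
import numpy as np

d_pi = np.array([0.5, 0.3, 0.2])   # learner occupancy (toy)
d_E  = np.array([0.2, 0.2, 0.6])   # expert occupancy (toy)
C = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)

# Greedy monotone coupling: optimal for 1-D supports with |s - s'| cost.
Pi = np.zeros((3, 3))
a, b = d_pi.copy(), d_E.copy()
i = j = 0
while i < 3 and j < 3:
    m = min(a[i], b[j])
    Pi[i, j] = m
    a[i] -= m
    b[j] -= m
    if a[i] <= 1e-12:
        i += 1
    if b[j] <= 1e-12:
        j += 1

assert np.allclose(Pi.sum(axis=1), d_pi)   # first marginal constraint
assert np.allclose(Pi.sum(axis=0), d_E)    # second marginal constraint
w1 = float((Pi * C).sum())                 # primal objective value
assert abs(w1 - 0.7) < 1e-9                # matches the 1-D CDF formula
```

In PW-DICE the analogous plan over a large state space is never materialized; the dual potentials play its role.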

Theoretical Unification

With suitable costs and limits, PW-DICE's dual strictly generalizes SMODICE and LobsDICE, unifying f-divergence and Wasserstein objectives under one framework. Theorem 2 explicitly shows that for appropriate cost and regularizers, the dual reduces to the standard KL-divergence SMODICE formulation.

Learning the Cost Metric

Rather than using a fixed cost, PW-DICE defines and learns the transport cost $c(s, s')$:

  • The reward-like term is computed as a log-ratio of smoothed non-expert and expert state densities, estimated with a learned state discriminator.
  • The reachability component is derived from a contrastively trained state embedding optimized with an InfoNCE loss over adjacent non-expert transitions.

This flexible, contrastively informed metric allows adaptation to the geometry of the state space.
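As an illustration of the contrastive component, a minimal NumPy sketch of an InfoNCE objective over a batch of (state, next-state) embedding pairs; the random embeddings are stand-ins for the outputs of the learned embedding network:

```python
import numpy as np

rng = np.random.default_rng(3)

def info_nce_loss(phi_s, phi_next):
    """InfoNCE over adjacent transitions: each state embedding should score
    its own successor (the diagonal) higher than the other successors in
    the batch. phi_* are (batch, dim) arrays."""
    logits = phi_s @ phi_next.T                      # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))       # positives on diagonal

batch, dim = 8, 4
phi_s = rng.normal(size=(batch, dim))      # stand-in embeddings of s
phi_next = rng.normal(size=(batch, dim))   # stand-in embeddings of s'
loss = info_nce_loss(phi_s, phi_next)
assert loss > 0.0
```

Minimizing this loss pulls embeddings of temporally adjacent states together, so distances in embedding space approximate reachability.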

Implementation and Optimization

All networks (dual heads, discriminator, embeddings) are MLPs with 256 units per layer; training uses Adam with learning rates of 3e-4 for the potentials and 1e-3 for the policy. Policy extraction reduces to weighted behavior cloning, maximizing the dual-weighted log-likelihood of non-expert actions. The algorithm alternates between training the cost embedding, the discriminator, and the dual potentials, and finally fits the policy via weighted BC.
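The weighted-BC extraction step amounts to a negative weighted log-likelihood. A minimal sketch with illustrative per-sample values (in practice the log-probabilities come from the policy network and the weights from the solved dual potentials):

```python
import numpy as np

rng = np.random.default_rng(4)

def weighted_bc_loss(log_pi, weights):
    """Weighted behavior cloning: minimize -E[w(s, a) log pi(a|s)] over the
    non-expert dataset, upweighting transitions the dual deems expert-like."""
    return float(-np.mean(weights * log_pi))

log_pi = np.log(rng.uniform(0.1, 0.9, size=32))   # stand-in log pi(a|s)
weights = rng.uniform(0.0, 2.0, size=32)          # stand-in dual weights
loss = weighted_bc_loss(log_pi, weights)
assert loss > 0.0
```

Because the weights enter only as multiplicative factors, this step reuses standard supervised-learning machinery unchanged.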

Empirical Evaluation

Empirical benchmarks include:

  • Tabular random MDPs: PW-DICE achieves lowest regret and TV distance.
  • MuJoCo continuous control (Hopper, HalfCheetah, Walker2d, Ant): achieves or surpasses state-of-the-art normalized return.
  • Ablations confirm superiority of the joint R + contrastive cost and robustness to regularization hyperparameters.

3. Comparative Summary Table

| Domain | Objective/Model | Main Contribution |
| --- | --- | --- |
| US Imaging (Li et al., 2023) | Score-based diffusion (EDM) | Reduced steps for PWC-quality reconstruction via shortcut initialization from a single PW |
| Offline LfO (Yan et al., 2023) | Primal OT + f-divergence regularized DICE | Unified, convex dual for flexible-cost occupancy matching |

PW-DICE thus refers to two domain-specific advances: one enabling efficient ultrasound compounding by initializing diffusion from a single measured instance, the other generalizing distribution-matching techniques via primal optimal transport and learned state geometry.

4. Significance and Context

For US imaging, PW-DICE reduces the computational cost of generative reconstruction while maintaining anatomical detail, with practical implications for real-time imaging and applications where compounded data are unavailable.

For offline RL and imitation, PW-DICE circumvents the limitations of fixed-metric optimal transport and $f$-divergence approaches, enabling metric learning and providing theoretical continuity among prior DICE algorithms. Its convex, unconstrained dual formulation and strong empirical results advance the state of the art for imitation from observation, particularly where collecting action-labeled expert data is challenging.

5. Limitations and Open Directions

PW-DICE for US imaging relies on careful selection of the intermediate noise level $\sigma_{t^*}$ and step count; no closed-form or adaptive tuning rules are currently provided. For the RL setting, the expressivity and stability of cost-metric learning, as well as scalability to highly complex state/action spaces, remain areas for further study. Both lines suggest follow-up work on hyperparameter adaptation, structured priors (imaging), and advanced cost-function design (RL) (Li et al., 2023, Yan et al., 2023).
