
OneNet: Unified Neural Architectures

Updated 1 February 2026
  • OneNet is the shared name of several specialized neural architectures and frameworks that streamline tasks from image segmentation to entity linking by eliminating redundant processing steps.
  • Its segmentation variant employs channel-wise 1D convolutions with pixel-unshuffle and pixel-shuffle, reducing parameters by up to 78% while preserving spatial locality.
  • The detection, forecasting, and entity linking modules integrate innovative methods such as Hungarian matching, RL-augmented ensembling, and multi-stage LLM prompting to achieve competitive empirical performance.

OneNet refers to a set of specialized neural architectures and frameworks that provide efficient, end-to-end solutions for tasks in image segmentation, object detection, time series forecasting under concept drift, and few-shot entity linking via LLM prompting. These systems share the principle of architectural or algorithmic unification designed to streamline classical pipelines, frequently reducing computational overhead or eliminating post-processing steps. The following overview synthesizes the main research contributions titled "OneNet" across these domains, highlighting their methodological foundations, technical choices, and demonstrated empirical impact.

1. Channel-Wise 1D Convolutional U-Net for Semantic Segmentation

OneNet for image segmentation implements a lightweight adaptation of the classical U-Net model by replacing all 2D convolutional and pooling/upsampling operations with channel-wise 1D convolutions, pixel-unshuffle downsampling, and pixel-shuffle upsampling modules (Byun et al., 2024).

Architectural Design

  • Encoder path utilizes pixel-unshuffle to transform spatial local neighborhoods into contiguous channel groups, converting a tensor $X \in \mathbb{R}^{C \times H \times W}$ to shape $(s^2 C) \times (H/s) \times (W/s)$, where $s$ is the scale factor.
  • Channel-wise 1D convolution processes each channel $c$ independently along a flattened spatial sequence, enabling spatial relationships to be encoded in the channel dimension:

$$y_{c,t} = \sum_{\Delta t = -K}^{K} w_{c,\Delta t}\, x_{c,t+\Delta t} + b_c$$

  • Decoder path applies pixel-shuffle to redistribute extra channels into higher spatial resolution.
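The two encoder operations can be sketched in NumPy (a minimal illustration, not the authors' implementation; the kernel size and same-padding choices here are assumptions):

```python
import numpy as np

def pixel_unshuffle(x, s):
    # (C, H, W) -> (s*s*C, H//s, W//s): each s-by-s spatial block of a channel
    # becomes s*s contiguous channels, preserving spatial locality
    C, H, W = x.shape
    x = x.reshape(C, H // s, s, W // s, s)
    x = x.transpose(0, 2, 4, 1, 3)               # (C, s, s, H//s, W//s)
    return x.reshape(C * s * s, H // s, W // s)

def channelwise_conv1d(x, w, b):
    # Apply an independent 1D kernel per channel along the flattened spatial
    # sequence (same padding), matching the y_{c,t} formula above
    C, H, W = x.shape
    K = (w.shape[1] - 1) // 2
    seq = np.pad(x.reshape(C, H * W), ((0, 0), (K, K)))
    out = np.stack([np.correlate(seq[c], w[c], mode="valid") for c in range(C)])
    return (out + b[:, None]).reshape(C, H, W)
```

Pixel-shuffle, used on the decoder path, is the exact inverse of `pixel_unshuffle`: it folds the `s*s` channel groups back into an `s`-times-larger spatial grid.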

Parameter Efficiency

  • Per encoder block, the parameter count drops from $54C^2$ (standard U-Net) to $12C^2$, a reduction of approximately 78%.
  • Full network variants show 47–71% reductions in parameters and a 78% reduction in FLOPs, with a negligible accuracy drop on medical segmentation tasks and minor degradation ($<15\%$) on high-class-count tasks.
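The $54C^2$ figure is consistent with a standard U-Net encoder block containing two 3×3 convolutions ($C \to 2C$, then $2C \to 2C$); a quick check of that count and of the headline reduction (the exact block composition is inferred here, not stated in the source):

```python
def unet_block_params(C, k=3):
    # Two k-by-k 2D convs: C -> 2C and 2C -> 2C (bias terms ignored)
    return k * k * (C * 2 * C) + k * k * (2 * C * 2 * C)   # 54 * C**2 for k=3

assert unet_block_params(64) == 54 * 64 ** 2
# OneNet's reported per-block count is 12*C^2, i.e. a ~78% reduction:
print(round(1 - 12 / 54, 3))   # 0.778
```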

Empirical Results

  • On PASCAL VOC and Oxford-IIIT Pet, OneNet attains mean IoU close to the baselines while realizing the parameter and FLOPs savings above.
  • On MSD Heart/Brain/Lung, accuracy differences are $<1\%$.

Implication: OneNet demonstrates that preserving spatial locality through channel manipulation allows dense 2D kernels to be dispensed with, favoring edge deployment without sacrificing performance in most segmentation scenarios.

2. End-to-End Object Detection via One-to-One Assignment

OneNet in object detection achieves fully differentiable, end-to-end pipeline operation by eliminating Non-Maximum Suppression (NMS) and enforcing one-to-one prediction through Hungarian bipartite matching (Sun et al., 2020).

Core Principles

  • Assigns exactly one positive candidate per ground-truth object, solving a cost matrix $C(i,j)$ between $N$ candidates and $M$ ground truths via the Hungarian algorithm.
  • Matching cost combines localization and classification criteria:

$$C_\mathrm{cls}(i,j) = -\log p_i(c_j)$$

$$C_\mathrm{loc}(i,j) = \lambda_1 \|b_i - \hat{b}_j\|_1 + \lambda_2\,(1 - \mathrm{GIoU}(b_i, \hat{b}_j))$$

$$C(i,j) = C_\mathrm{cls}(i,j) + C_\mathrm{loc}(i,j)$$

  • The inclusion of classification cost increases the score gap, crucial for producing one high-confidence prediction per object and collapsing duplicates without NMS.
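A toy version of the matching step (the cost terms follow the equations above; the default λ values and the exhaustive search standing in for the Hungarian algorithm are simplifications for this sketch):

```python
from itertools import permutations
import numpy as np

def pairwise_giou(a, b):
    # a: (N, 4), b: (M, 4) boxes as [x1, y1, x2, y2]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    inter = np.clip(np.minimum(a[:, None, 2:], b[None, :, 2:])
                    - np.maximum(a[:, None, :2], b[None, :, :2]), 0, None).prod(-1)
    union = area_a[:, None] + area_b[None, :] - inter
    enclose = (np.maximum(a[:, None, 2:], b[None, :, 2:])
               - np.minimum(a[:, None, :2], b[None, :, :2])).prod(-1)
    return inter / union - (enclose - union) / enclose

def one_to_one_match(probs, boxes, gt_labels, gt_boxes, lam1=1.0, lam2=1.0):
    # Cost matrix C(i, j) = C_cls(i, j) + C_loc(i, j) per the equations above
    C = (-np.log(probs[:, gt_labels])
         + lam1 * np.abs(boxes[:, None] - gt_boxes[None]).sum(-1)
         + lam2 * (1 - pairwise_giou(boxes, gt_boxes)))
    # Brute-force over assignments stands in for the Hungarian algorithm
    # at these toy sizes (same optimum, exponential cost)
    M = len(gt_boxes)
    best = min(permutations(range(len(boxes)), M),
               key=lambda p: sum(C[p[j], j] for j in range(M)))
    return list(best)   # best[j] = candidate index matched to ground-truth j
```

Because each ground truth receives exactly one candidate, every unmatched candidate is trained toward background, which is what collapses duplicates without NMS.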

Objective and Implementation

  • For matched pairs $(i^*, j)$: apply both classification and localization losses.
  • For unmatched candidates: only the background classification loss.
  • Empirically, OneNet matches or exceeds standard one-stage detectors with NMS on COCO and significantly outperforms them in crowded scenarios such as CrowdHuman.

Significance: The architecture proves that learning to suppress duplicates is possible at the assignment/cost level without altering head/backbone designs, setting the precedent for NMS-free one-stage detection.

3. Online Ensembling Network for Time Series Forecasting under Concept Drift

OneNet for concept drift time series forecasting ensembles two distinct predictors: one modeling cross-time dependencies and another modeling cross-variate dependencies (Zhang et al., 2023).

Framework Components

  • At each timestep $t$, OneNet observes an $M$-variate window $x_t$ and predicts $H$ future steps $\hat{y}_t$.
  • The cross-time forecaster $f_1$ treats channels independently; the cross-variable forecaster $f_2$ models channels jointly.
  • Ensemble weights $w_t$ are updated via an exponentially weighted average (EWA), a classic online convex programming approach:

$$w_{t+1,i} = \frac{w_{t,i}\,\exp(-\eta\,\ell_{t,i})}{\sum_{j=1}^{2} w_{t,j}\,\exp(-\eta\,\ell_{t,j})}$$

  • An RL-based module introduces a short-term bias $b_t$ by optimizing a policy $\pi_\theta$ on the concatenated state $s_t$, yielding a normalized final weight vector.
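The EWA step can be written directly from the update rule above (η and the loss values here are illustrative, and the RL bias term is omitted from this sketch):

```python
import numpy as np

def ewa_update(w, losses, eta=1.0):
    # Down-weight each forecaster in proportion to exp(-eta * loss),
    # then renormalize so the weights stay on the simplex
    w = w * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

w = np.array([0.5, 0.5])
for losses in [(1.0, 0.2), (0.9, 0.3), (0.8, 0.1)]:   # forecaster 2 keeps winning
    w = ewa_update(w, losses)
```

After a few rounds nearly all weight sits on the stronger forecaster; the RL-learned bias $b_t$ exists precisely because this shift can be too slow when the environment flips abruptly.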

Algorithmic Adaptation

  • RL bias enables faster adaptation to abrupt concept drift, mitigating slow-switch phenomena inherent in classical EWA.
  • Standard forecaster updates are decoupled; the RL block is trained via supervised regression using the online error.

Performance

  • On benchmarks (ETTh2, ETTm1, WTH, ECL), OneNet reduces cumulative MSE by 53.1% over FSNet and shows rapid responsiveness to environmental drift.
  • Internal regret and error spikes are suppressed relative to baselines.

A plausible implication is that the hybrid OCP-EWA with RL guidance sets a template for robust drift handling across mixed dependency modeling frameworks.

4. Few-Shot Entity Linking Framework via LLM Prompting

OneNet adopts a fine-tuning-free, multi-stage LLM prompt pipeline for robust few-shot entity linking (Liu et al., 2024).

Pipeline Modules

  • Entity Reduction Processor (ERP): Summarizes and filters entity candidates to fit LLM token limits; achieves recall of 0.8–0.9 while reducing the candidate set from ~50 to 2–5.
  • Dual-Perspective Entity Linker (DEL):
    • Contextual: Applies chain-of-thought exemplars matched by a composite similarity score.
    • Prior: Prompts without context for intrinsic entity priors.
  • Entity Consensus Judger (ECJ): Resolves disagreement between DEL branches via a final LLM call.
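The three-stage flow can be orchestrated in a few lines (the `llm` callable and all prompt wording below are hypothetical placeholders, not the paper's templates):

```python
def onenet_link(mention, context, candidates, llm, k=3):
    # Stage 1 - ERP: prune candidates so prompts fit the LLM token limit
    shortlist = llm(f"Keep the {k} most plausible entities for '{mention}': {candidates}")
    # Stage 2 - DEL: two independent judgments, with and without context
    contextual = llm(f"In the context '{context}', which of {shortlist} is '{mention}'?")
    prior = llm(f"Ignoring context, which of {shortlist} is the usual sense of '{mention}'?")
    # Stage 3 - ECJ: accept agreement, otherwise ask the LLM to arbitrate
    if contextual == prior:
        return contextual
    return llm(f"One reading says {contextual}, another says {prior}; "
               f"which fits '{mention}' in '{context}'?")
```

Running the contextual and prior branches separately, then reconciling them, guards against a single prompt being dominated by either a misleading context or a strong lexical prior.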

Prompt Designs

  • Templates distill entity features (categories, context, semantic meaning), and adaptive selection of in-context exemplars outperforms random/category-only strategies.

Benchmark Results

  • On seven standard datasets, OneNet achieves micro-F$_1$ improvements of 4–11 points over the strongest prior few-shot and LLM baselines (e.g., Zephyr-7B-beta).
  • Efficiency analysis shows runtime per mention of 1–15s and sublinear token cost relative to raw entity linking.

Limitations

  • Inference is slower due to multiple LLM calls; mention detection is assumed gold.
  • Future work includes optimizing attention, integrating mention spotting, and summarizing context.

This suggests practical viability for LLM-based EL without the need for fine-tuning, particularly in domain-specific or low-resource regimes.

5. Commonalities, Technical Distinctions, and Application-Specific Optimizations

Across OneNet variants, several systematic themes emerge:

| Variant | Unifying Mechanism | Key Efficiency Feature | Eliminated Post-Processing |
|---|---|---|---|
| Segmentation (Byun et al., 2024) | Pixel-(un)shuffle + channel-wise 1D conv | Channel-local encoding, low parameter count | 2D convolutions/pooling |
| Detection (Sun et al., 2020) | Hungarian one-to-one matching | Score-gap maximization | Non-Maximum Suppression |
| Forecasting (Zhang et al., 2023) | RL-augmented ensembling | OCP with short-term adaptation | None (fully online) |
| Entity Linking (Liu et al., 2024) | Multi-stage LLM prompt pipeline | Summarization, dual-perspective voting | Fine-tuning, feature engineering |

Distinct OneNet solutions target different bottlenecks: memory and computation for segmentation, differentiability for detection, drift responsiveness for time series, and data scarcity for entity linking. The editor's term "unification by architectural or assignment-level reduction" captures the common thread.

6. Technical and Practical Impact

OneNet architectures have demonstrably advanced the state of the art in their respective fields:

  • Segmentation models are more efficient and tractable for edge deployment.
  • Object detectors are end-to-end trainable, with competitive AP and recall, especially for crowded scenes.
  • Time series forecasters are robust to sudden distributional shifts, outperforming established baselines by substantial error margins.
  • Entity linking frameworks generalize to new domains without retraining, leveraging LLM summarization and reasoning heuristics.

Implication: The general OneNet philosophy motivates the search for minimalistic, unified systems which remove legacy dependencies and hand-crafted steps, foregrounding neural assignment, modular prompt orchestration, and hybrid adaptive logic.

7. Limitations, Open Problems, and Future Directions

Current OneNet instantiations are subject to certain constraints:

  • Segmentation accuracy can degrade when receptive field requirements exceed what channel-shuffling provides; hybrid architectures warrant exploration.
  • Object detection could benefit from integration with transformer backbones for contextual reasoning beyond local assignments.
  • RL blocks incur nontrivial additional overhead; more scalable meta-learners are of interest.
  • LLM-based entity linking requires advances in efficient context encoding and automated mention detection.

Prospective work includes:

  • Extending pixel-shuffle/channel-wise encoding to video, depth estimation, and diffusion models.
  • Dynamic assignment and matching strategies in detection, e.g. learning cost weights.
  • Automated prompt engineering and curriculum design for LLM-driven entity linking.

These suggest substantial further opportunities for unification and efficiency gains via principled architectural reduction, assignment reframing, and prompt-driven reasoning.
