OneNet: Unified Neural Architectures
- OneNet is the shared name of several specialized neural architectures and frameworks that efficiently handle tasks from image segmentation to entity linking by eliminating redundant processing steps.
- Its segmentation variant employs channel-wise 1D convolutions with pixel-unshuffle and pixel-shuffle, reducing parameters by up to 78% while preserving spatial locality.
- The detection, forecasting, and entity linking modules integrate innovative methods such as Hungarian matching, RL-augmented ensembling, and multi-stage LLM prompting to achieve competitive empirical performance.
OneNet refers to a set of specialized neural architectures and frameworks that provide efficient, end-to-end solutions for tasks in image segmentation, object detection, time series forecasting under concept drift, and few-shot entity linking via LLM prompting. These systems share the principle of architectural or algorithmic unification designed to streamline classical pipelines, frequently reducing computational overhead or eliminating post-processing steps. The following overview synthesizes the main research contributions titled "OneNet" across these domains, highlighting their methodological foundations, technical choices, and demonstrated empirical impact.
1. Channel-Wise 1D Convolutional U-Net for Semantic Segmentation
OneNet for image segmentation implements a lightweight adaptation of the classical U-Net model by replacing all 2D convolutional and pooling/upsampling operations with channel-wise 1D convolutions, pixel-unshuffle downsampling, and pixel-shuffle upsampling modules (Byun et al., 2024).
Architectural Design
- Encoder path utilizes pixel-unshuffle to transform local spatial neighborhoods into contiguous channel groups, converting a tensor of shape (C, H, W) to (C·r², H/r, W/r), where r is the scale factor.
- Channel-wise 1D convolution processes each channel independently along a flattened spatial sequence, so spatial relationships are encoded in the channel dimension: each channel c is convolved with its own 1D kernel, y_c = w_c * x_c.
- Decoder path applies pixel-shuffle to redistribute extra channels into higher spatial resolution.
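The encoder/decoder operations above can be sketched in a few lines of NumPy; the function names and the shapes below are illustrative (a minimal sketch, not the paper's implementation):

```python
import numpy as np

def pixel_unshuffle(x, r):
    # (C, H, W) -> (C*r*r, H//r, W//r): each r x r spatial block
    # becomes r*r contiguous channels, preserving spatial locality
    C, H, W = x.shape
    x = x.reshape(C, H // r, r, W // r, r)
    x = x.transpose(0, 2, 4, 1, 3)            # (C, r, r, H//r, W//r)
    return x.reshape(C * r * r, H // r, W // r)

def pixel_shuffle(x, r):
    # inverse: (C*r*r, h, w) -> (C, h*r, w*r)
    Crr, h, w = x.shape
    C = Crr // (r * r)
    x = x.reshape(C, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)            # (C, h, r, w, r)
    return x.reshape(C, h * r, w * r)

def channelwise_conv1d(x, kernels):
    # x: (C, h, w); kernels: (C, k) -- one independent 1D kernel
    # per channel, applied along the flattened spatial sequence
    C, h, w = x.shape
    flat = x.reshape(C, h * w)
    out = np.stack([np.convolve(flat[c], kernels[c], mode="same")
                    for c in range(C)])
    return out.reshape(C, h, w)
```

Because pixel-shuffle exactly inverts pixel-unshuffle, the encoder–decoder pair is lossless with respect to spatial rearrangement; only the 1D convolutions learn parameters.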
Parameter Efficiency
- Per encoder block, the parameter count drops by approximately 78% relative to the corresponding standard U-Net block, since dense 2D kernels are replaced by per-channel 1D kernels.
- Full network variants show 47–71% reduction in parameters and 78% reduction in FLOPS, with a negligible accuracy drop on medical segmentation tasks and minor degradation on high-class-count tasks.
Empirical Results
- On PASCAL VOC and Oxford Pet, OneNet achieved mean IoU close to the U-Net baseline while delivering the parameter/FLOPS reductions above.
- On MSD Heart/Brain/Lung, accuracy differences from the baseline are negligible.
Implication: OneNet demonstrates that preserving spatial locality via channel manipulation allows dense 2D kernels to be dispensed with, favoring edge deployment without sacrificing performance in most segmentation scenarios.
2. End-to-End Object Detection via One-to-One Assignment
OneNet in object detection achieves fully differentiable, end-to-end pipeline operation by eliminating Non-Maximum Suppression (NMS) and enforcing one-to-one prediction through Hungarian bipartite matching (Sun et al., 2020).
Core Principles
- Assigns exactly one positive candidate per ground-truth object, solving a cost matrix between candidates and ground-truths via the Hungarian algorithm.
- Matching cost combines localization and classification criteria, typically a weighted sum C_match = λ_cls·C_cls + λ_L1·C_L1 + λ_giou·C_giou over classification, L1 box, and generalized-IoU terms.
- The inclusion of classification cost increases the score gap, crucial for producing one high-confidence prediction per object and collapsing duplicates without NMS.
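The one-to-one assignment can be illustrated with a tiny brute-force minimum-cost matcher (a sketch; the paper uses the Hungarian algorithm, which solves the same problem in O(n³), and the cost values here are made up):

```python
from itertools import permutations

def one_to_one_assign(cost):
    # cost[i][j]: matching cost between candidate i and ground-truth j
    # (e.g. a weighted sum of classification and localization terms).
    # Brute force over permutations -- fine for small illustrative matrices.
    n_cand, n_gt = len(cost), len(cost[0])
    best, best_perm = float("inf"), None
    for perm in permutations(range(n_cand), n_gt):
        total = sum(cost[perm[j]][j] for j in range(n_gt))
        if total < best:
            best, best_perm = total, perm
    # best_perm[j] is the single positive candidate for ground-truth j;
    # every other candidate is treated as background (no NMS needed)
    return best_perm, best
```

With three candidates and two objects, exactly two candidates become positives and the third falls to background, which is what collapses duplicate predictions.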
Objective and Implementation
- For matched candidate–ground-truth pairs: apply classification and localization losses.
- For unmatched candidates: only background classification loss.
- Empirically matches or exceeds standard one-stage detectors with NMS on COCO and significantly outperforms in crowded scenarios like CrowdHuman.
Significance: The architecture proves that learning to suppress duplicates is possible at the assignment/cost level without altering head/backbone designs, setting the precedent for NMS-free one-stage detection.
3. Online Ensembling Network for Time Series Forecasting under Concept Drift
OneNet for concept drift time series forecasting ensembles two distinct predictors: one modeling cross-time dependencies and another modeling cross-variate dependencies (Zhang et al., 2023).
Framework Components
- At each timestep t, OneNet observes an m-variate lookback window and predicts the next H future steps.
- Cross-time forecaster treats channels independently; cross-variable forecaster models channels jointly.
- Ensemble weights are updated via exponentially weighted averaging (EWA), a classic online convex programming approach: w_{i,t+1} ∝ w_{i,t}·exp(−η·ℓ_{i,t}), where ℓ_{i,t} is the loss of forecaster i at step t and η is the learning rate.
- An RL-based module introduces a short-term bias by optimizing a policy over a concatenated state representation, yielding a normalized final weight vector.
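The EWA step above can be sketched in a few lines (a minimal illustration of the multiplicative-weights update, with made-up loss values; the RL bias term is omitted):

```python
import math

def ewa_update(weights, losses, eta=1.0):
    # Exponentially weighted averaging: multiplicatively down-weight
    # the forecaster that incurred the higher recent loss, then normalize.
    raw = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    z = sum(raw)
    return [r / z for r in raw]
```

Starting from equal weights, a step where the cross-time forecaster's loss is 0.2 and the cross-variable forecaster's is 0.8 shifts mass toward the former; repeated steps make the ensemble track whichever dependency structure the current regime favors.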
Algorithmic Adaptation
- RL bias enables faster adaptation to abrupt concept drift, mitigating slow-switch phenomena inherent in classical EWA.
- Standard forecaster updates are decoupled; the RL block is trained via supervised regression using the online error.
Performance
- On benchmarks (ETTh2, ETTm1, WTH, ECL), OneNet reduces cumulative MSE by 53.1% over FSNet and shows rapid responsiveness to environmental drift.
- Internal regret and error spikes are suppressed relative to baselines.
A plausible implication is that EWA-style online convex programming with RL guidance sets a template for robust drift handling across mixed dependency-modeling frameworks.
4. Few-Shot Entity Linking Framework via LLM Prompting
OneNet adopts a fine-tuning-free, multi-stage LLM prompt pipeline for robust few-shot entity linking (Liu et al., 2024).
Pipeline Modules
- Entity Reduction Processor (ERP): Summarizes and filters entity candidates to fit LLM token limits; achieves recall of 0.8–0.9, reducing the candidate set from ~50 to 2–5.
- Dual-Perspective Entity Linker (DEL):
- Contextual: Applies chain-of-thought exemplars matched by a composite similarity score.
- Prior: Prompts without context for intrinsic entity priors.
- Entity Consensus Judger (ECJ): Resolves disagreement between DEL branches via a final LLM call.
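The three-stage control flow can be sketched as follows; the `llm` callable, stage names, and `top_k` parameter are illustrative stand-ins for the paper's actual prompts, not its API:

```python
def onenet_link(mention, context, candidates, llm, top_k=5):
    # Stage 1 -- Entity Reduction Processor: summarize/filter candidates
    # so the reduced set fits within the LLM's context window
    reduced = llm("reduce", mention, context, candidates)[:top_k]
    # Stage 2 -- Dual-Perspective Entity Linker: one prompt uses the
    # mention context, the other queries the model's entity prior only
    contextual = llm("link_contextual", mention, context, reduced)
    prior = llm("link_prior", mention, None, reduced)
    if contextual == prior:
        return contextual
    # Stage 3 -- Entity Consensus Judger: a final LLM call
    # resolves disagreement between the two perspectives
    return llm("judge", mention, context, [contextual, prior])
```

The consensus call is only issued when the two perspectives disagree, which keeps the number of LLM invocations per mention small in the common agreement case.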
Prompt Designs
- Templates distill entity features (categories, context, semantic meaning), and adaptive selection of in-context exemplars outperforms random/category-only strategies.
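Adaptive exemplar selection can be illustrated with a toy similarity-ranked retriever; Jaccard token overlap here is a simple stand-in for the paper's composite similarity score, and all names are hypothetical:

```python
def select_exemplars(query_tokens, pool, k=2):
    # pool: list of (tokens, exemplar) pairs; return the k exemplars
    # whose token sets overlap most with the query mention's context
    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a | b else 0.0
    ranked = sorted(pool, key=lambda p: jaccard(query_tokens, p[0]),
                    reverse=True)
    return [ex for _, ex in ranked[:k]]
```

Matching exemplars to the query in this way is what the benchmark comparison contrasts with random or category-only selection.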
Benchmark Results
- On seven standard datasets, OneNet achieves micro-F1 improvements of 4–11 points over the strongest prior few-shot and LLM baselines (e.g., Zephyr-7B-beta).
- Efficiency analysis shows runtime per mention of 1–15s and sublinear token cost relative to raw entity linking.
Limitations
- Inference is slower due to multiple LLM calls, and gold mention spans are assumed rather than detected.
- Future work includes optimizing attention, integrating mention spotting, and context summarization.
This suggests practical viability for LLM-based EL without the need for fine-tuning, particularly in domain-specific or low-resource regimes.
5. Commonalities, Technical Distinctions, and Application-Specific Optimizations
Across OneNet variants, several systematic themes emerge:
| Variant | Unifying Mechanism | Key Efficiency Feature | Eliminated Post-Processing |
|---|---|---|---|
| Segmentation (Byun et al., 2024) | Pixel-(un)shuffle + 1D conv | Channel-local encoding, low parameter count | 2D convolutions/pooling |
| Detection (Sun et al., 2020) | Hungarian matching (1-to-1) | Score-gap maximization | Non-Maximum Suppression |
| Forecasting (Zhang et al., 2023) | RL-augmented ensemble | OCP with short-term adaptation | None (online setting) |
| Entity Linking (Liu et al., 2024) | LLM prompt pipeline | Summarization, dual-perspective voting | Fine-tuning, feature engineering |
Distinct OneNet solutions target different bottlenecks: memory and computation for segmentation, differentiability for detection, drift responsiveness for time series, and data scarcity for entity linking. Broadly, each can be summarized as unification by architectural or assignment-level reduction.
6. Technical and Practical Impact
OneNet architectures have demonstrably advanced the state of the art in their respective fields:
- Segmentation models are more efficient and tractable for edge deployment.
- Object detectors are end-to-end trainable, with competitive AP and recall, especially for crowded scenes.
- Time series forecasters are robust to sudden distributional shifts, outperforming established baselines by substantial error margins.
- Entity linking frameworks generalize to new domains without retraining, leveraging LLM summarization and reasoning heuristics.
Implication: The general OneNet philosophy motivates the search for minimalistic, unified systems which remove legacy dependencies and hand-crafted steps, foregrounding neural assignment, modular prompt orchestration, and hybrid adaptive logic.
7. Limitations, Open Problems, and Future Directions
Current OneNet instantiations are subject to certain constraints:
- Segmentation accuracy can degrade when receptive field requirements exceed what channel-shuffling provides; hybrid architectures warrant exploration.
- Object detection could benefit from integration with transformer backbones for contextual reasoning beyond local assignments.
- RL blocks incur nontrivial additional overhead; more scalable meta-learners are of interest.
- LLM-based entity linking requires advances in efficient context encoding and automated mention detection.
Prospective work includes:
- Extending pixel-shuffle/channel-wise encoding to video, depth estimation, and diffusion models.
- Dynamic assignment and matching strategies in detection, e.g. learning cost weights.
- Automated prompt engineering and curriculum design for LLM-driven entity linking.
These suggest substantial further opportunities for unification and efficiency gains via principled architectural reduction, assignment reframing, and prompt-driven reasoning.