Router Upcycling in SDN and ML
- Router upcycling is the repurposing of legacy router hardware and software using modern SDN, open-source frameworks, and programmable data planes to meet emerging network and ML demands.
- It leverages architectural and algorithmic enhancements, such as SDN overlays and multi-tier FIB caching, to achieve rapid failover and scalable routing at reduced cost and energy consumption.
- The concept extends to machine learning by transforming dense neural models into Mixture-of-Experts architectures, enabling adaptive routing and improved model performance.
Router upcycling refers to a range of architectural, algorithmic, and systemic techniques designed to prolong the utility, enhance the capability, or repurpose existing router hardware or software by means not originally envisioned during manufacture. The concept is rooted in leveraging advances such as Software Defined Networking (SDN), programmable data planes, open-source software, and Mixture-of-Experts (MoE) model innovations to retrofit legacy, commodity, or otherwise constrained routers for contemporary requirements in networking and machine learning.
1. Architectural Principles and Classification
Router upcycling encompasses methodologies for both hardware and software layers. At the network infrastructure level, techniques include pairing legacy IP routers with SDN-enabled switches to augment forwarding behavior and recovery performance (Chang et al., 2015). In the software domain, upcycling has been propelled by repurposing models or control logic (e.g., open-source routers, resource-efficient protocol daemons (Granderath et al., 2023)), and at the algorithmic level, by adaptively reusing model components, as seen in MoE architectures where dense neural model weights are transformed into expert modules via “model upcycling” (Vavre et al., 13 Dec 2024, Ran et al., 31 Aug 2025).
Router upcycling strategies can be broadly classified by their focal target:
- Functional Upcycling: Extension of routing, forwarding, or management capabilities via software overlays or adjunct hardware.
- Performance Upcycling: Retrofitting systems for improved failover, control-plane convergence, or FIB scalability.
- Resource Upcycling: Utilizing previously underutilized hardware components (e.g., general-purpose CPUs, DRAM) or legacy devices to provide services at reduced cost and power.
- Intelligent Upcycling: Incorporating modern ML routing or expert selection strategies (notably, upcycling “routers” in MoE systems for more expressive inference).
This taxonomy encompasses approaches in both traditional networks (IP/Ethernet routing) and machine learning infrastructures.
2. Upcycling in Legacy Network Routers via SDN Integration
A key advancement in physical network device upcycling is the use of SDN overlays to “supercharge” convergence performance in legacy IP routers (Chang et al., 2015). In the canonical workflow:
- A legacy router is paired with an SDN-enabled switch.
- Routing announcement traffic (e.g., BGP UPDATEs) is intercepted by a routing daemon positioned logically between the router and its peers; this daemon rewrites next-hop fields to reference a Virtual IP Next-Hop (VNH) and associated Virtual MAC (VMAC).
- The router resolves the VNH via ARP, which the SDN controller intercepts and replies to, ensuring the FIB routes point to SDN-resolved pointers rather than physical next-hops.
- Both primary and backup forwarding entries for each “backup group”—the set of routers sharing a failover fate—are precomputed and installed in the SDN switch. On failure, only a small number of rules (per affected backup group) need to be atomically switched.
This hierarchical arrangement enables near-instantaneous data-plane reconfiguration (measured at ~150 ms, representing a 900× improvement over typical flat FIB convergence), without requiring any modification of the legacy router’s firmware or protocol implementation. The solution is exemplified in live deployments over Cisco Nexus 7k hardware and OpenFlow switches, with FPGA-based traffic measurements providing microsecond precision in quantifying failover durations. Operational and economic implications for operators include dramatically extended device lifetimes and removal of cost barriers to SDN deployment.
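The backup-group mechanism can be sketched in a few lines of Python. The `SdnSwitch` and `BackupGroup` names and data structures here are illustrative assumptions; an actual deployment would install these mappings as OpenFlow rules rather than dictionary entries:

```python
# Illustrative sketch of backup-group failover state in the SDN switch.
# VNH and "backup group" follow the paper's terminology; the classes and
# their fields are assumptions made for illustration only.

from dataclasses import dataclass


@dataclass
class BackupGroup:
    """Routers sharing a failover fate: one VNH maps to a primary
    physical next-hop and a precomputed backup."""
    vnh: str       # Virtual IP Next-Hop advertised to the legacy router
    primary: str   # physical next-hop currently in use
    backup: str    # precomputed alternate next-hop


class SdnSwitch:
    def __init__(self):
        self.groups: dict[str, BackupGroup] = {}
        self.rules: dict[str, str] = {}  # VNH -> active physical next-hop

    def install(self, group: BackupGroup) -> None:
        # Both primary and backup entries are precomputed up front;
        # only the active pointer lives in the rule table.
        self.groups[group.vnh] = group
        self.rules[group.vnh] = group.primary

    def on_failure(self, failed_next_hop: str) -> int:
        # On a link or peer failure, atomically flip every affected
        # backup group to its precomputed alternate: O(#groups) rule
        # changes rather than O(#FIB entries), which is the source
        # of the convergence speedup.
        flipped = 0
        for g in self.groups.values():
            if g.primary == failed_next_hop:
                self.rules[g.vnh] = g.backup
                flipped += 1
        return flipped


switch = SdnSwitch()
switch.install(BackupGroup(vnh="10.0.0.1", primary="eth1", backup="eth2"))
switch.install(BackupGroup(vnh="10.0.0.2", primary="eth1", backup="eth3"))
print(switch.on_failure("eth1"))  # both groups flip to their backups -> 2
```

Because the legacy router's FIB entries point at VNHs rather than physical next-hops, none of them need to be rewritten on failure; only the small per-group indirection changes.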
3. Upcycling of Forwarding Information Bases via Programmable Data Planes
FIB upcycling focuses on re-architecting the forwarding tables in routers to accommodate expanding Internet routing scales and limited, expensive memory. Employing a programmable FIB caching approach (Grigoryan et al., 2018), the router’s full set of routes is partitioned among three memory tiers:
- TCAM: Stores only the ~6,000 most frequently accessed prefixes, sufficient to cover over 99.9% of traffic. TCAM is fast but expensive and power-hungry (~15W/Mb).
- SRAM: Accommodates the next ~16,000 moderately popular routes, providing a compromise between lookup speed and cost.
- DRAM: Stores the entire routing table (e.g., 560K entries) but only serves traffic on cache misses, which in empirical evaluation amounts to below 0.1% of packets.
This multi-stage cache-based forwarding minimizes reliance on expensive TCAM, enabling upcycled routers to handle expanding Internet tables economically and with orders of magnitude reduced energy consumption. The programmable nature (e.g., via P4/PISA) allows dynamic adaptation to traffic patterns and controller-driven updates, ensuring ongoing scalability and flexibility for evolving routing workloads.
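The tiered lookup path can be sketched as follows. The `TieredFib` class, its capacity parameters, and the omission of hit-promotion and eviction logic are simplifying assumptions for illustration; a P4 target would realize each tier as a separate match stage:

```python
# Sketch of the three-tier FIB (TCAM/SRAM/DRAM) searched in order from
# fast/expensive to slow/cheap. Tier sizes follow the figures quoted
# above; promotion of hot prefixes into the caches is omitted.

import ipaddress


class TieredFib:
    def __init__(self, tcam_size=6_000, sram_size=16_000):
        self.tiers = [
            ("TCAM", tcam_size, {}),  # hottest ~6K prefixes
            ("SRAM", sram_size, {}),  # next ~16K prefixes
            ("DRAM", None, {}),       # full table; hit only on cache miss
        ]

    def add_route(self, prefix: str, next_hop: str) -> None:
        # The full table always lives in DRAM; caches fill on demand.
        self.tiers[2][2][ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, addr: str):
        ip = ipaddress.ip_address(addr)
        for name, _, table in self.tiers:
            # longest-prefix match within the tier
            best = max((p for p in table if ip in p),
                       default=None, key=lambda p: p.prefixlen)
            if best is not None:
                return name, table[best]
        return None


fib = TieredFib()
fib.add_route("0.0.0.0/0", "eth0")
fib.add_route("192.0.2.0/24", "eth1")
print(fib.lookup("192.0.2.5"))  # more-specific /24 wins -> ('DRAM', 'eth1')
```

The design point the sketch illustrates is that correctness never depends on the caches: a miss in TCAM and SRAM falls through to the complete DRAM table, so cache sizing only affects latency and power, not reachability.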
4. Software and Open Source Router Upcycling
Open-source firmware and modular routing stacks further underpin router upcycling by enabling the repurposing of commodity hardware (PCs, low-cost ASICs, wireless access points) as viable routing devices (Fatahi et al., 2022). Common projects and architectures include:
- Control Plane Implementations: Quagga, XORP (protocol suite daemons communicating via inter-process APIs).
- Data Plane Frameworks: Click (graph-based pluggable packet processing pipelines).
- Embedded Distributions: TomatoUSB, OpenWrt; the latter supports upcycling via extensible, resource-efficient protocol implementations such as “orc,” a RESTCONF daemon that enables automated configuration on devices with very limited CPU and memory (Granderath et al., 2023).
Underlying technical mechanisms notably include standardized separation between user-space control functions and kernel- or hardware-based data planes (ForCES interface, Linux netlink), modular stacking, and (where applicable) hardware acceleration using platforms such as NetFPGA and HERO. These approaches mitigate traditional constraints (CPU/memory limitations, bus bandwidth, lack of specialized forwarding engines) and facilitate distributed, virtualized, or hybrid control—yielding flexible, energy-aware routers adaptable to new workloads or topologies.
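A Click-style graph of pluggable elements can be sketched as follows. The element classes are illustrative Python stand-ins (actual Click elements are written in C++), intended only to show the push-based, composable pipeline structure:

```python
# Minimal sketch of a Click-style packet pipeline: pluggable elements
# wired into a processing graph. Element names are illustrative, not
# actual Click classes.

class Element:
    def __init__(self):
        self.next = None

    def connect(self, nxt):
        self.next = nxt
        return nxt

    def push(self, pkt):
        if self.next:
            self.next.push(pkt)


class TtlDecrement(Element):
    def push(self, pkt):
        pkt["ttl"] -= 1
        if pkt["ttl"] > 0:       # drop packets whose TTL expires
            super().push(pkt)


class Counter(Element):
    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, pkt):
        self.count += 1
        super().push(pkt)


# Wire up: decrement TTL, then count surviving packets.
dec, cnt = TtlDecrement(), Counter()
dec.connect(cnt)
for ttl in (1, 2, 5):
    dec.push({"ttl": ttl})
print(cnt.count)  # packets entering with ttl > 1 survive -> 2
```

The same graph abstraction is what lets open-source stacks swap a software element for a hardware-accelerated one (e.g., on NetFPGA) without restructuring the control plane.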
5. Router Upcycling in Machine Learning: Mixture-of-Experts and Mixture-of-Routers
The concept of upcycling extends to deep learning by transforming pre-trained dense neural networks into high-capacity Mixture-of-Experts (MoE) models (Vavre et al., 13 Dec 2024, Ran et al., 31 Aug 2025). In “model upcycling,” weights from a dense feed-forward layer are replicated across multiple “experts,” and new routing networks (routers) are initialized, typically with random weights, to assign tokens to experts.
Router Upcycling advances this paradigm by initializing the routers themselves from attention components (query and key matrices) of the dense model’s prior attention layers (Ran et al., 31 Aug 2025):
- Multiple sub-routers are created by extracting and concatenating matrices from individual attention heads, increasing representation diversity.
- Each token embedding $x$ is projected into multiple subspaces (one per sub-router), yielding queries $q_i = W_i^Q x$.
- Corresponding expert keys $k_{i,e}$ are matched with these queries via attention-style inner products $s_{i,e} = q_i^\top k_{i,e}$; the aggregated score is then softmaxed to yield routing weights for the token’s top-$k$ expert assignments.
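As a sketch, the multi-router scoring can be written in NumPy as below. The shapes, the mean aggregation over sub-router scores, and all variable names are assumptions made for illustration, not the paper's exact formulation:

```python
# Sketch of attention-style multi-router scoring for upcycled MoE
# routing. H sub-routers (one per attention head), E experts; the
# per-head query projections W_q would be extracted from the dense
# model's attention layers. Mean aggregation is an assumption.

import numpy as np


def route(x, W_q, expert_keys, top_k=2):
    """x: (d,) token embedding.
    W_q: (H, d_sub, d) per-sub-router query projections.
    expert_keys: (H, E, d_sub) one key per expert per sub-router."""
    queries = np.einsum("hsd,d->hs", W_q, x)        # one query per sub-router
    scores = np.einsum("hs,hes->he", queries, expert_keys)
    agg = scores.mean(axis=0)                        # aggregate sub-router scores
    top = np.argsort(agg)[-top_k:][::-1]             # top-k expert indices
    w = np.exp(agg[top] - agg[top].max())            # softmax over selected experts
    return top, w / w.sum()


rng = np.random.default_rng(0)
d, d_sub, H, E = 16, 4, 2, 8
experts, weights = route(rng.normal(size=d),
                         rng.normal(size=(H, d_sub, d)),
                         rng.normal(size=(H, E, d_sub)))
print(len(experts), round(float(weights.sum()), 6))  # 2 1.0
```

The key difference from conventional upcycling is in how `W_q` and `expert_keys` are initialized: from the dense model's attention components rather than at random, so the routers begin with diverse, structured token representations.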
Empirically, Router Upcycling provides over 2% improvement in benchmark performance versus conventional single-router upcycling approaches, lessens reliance on pretraining compute (often below 1% of that required for full pretraining), and achieves measurable gains in expert specialization (reduced cosine similarity among expert outputs across layers). The practical implication is improved scaling efficiency and robustness in high-capacity neural architectures.
6. Operational and Economic Implications
Router upcycling yields a range of operational benefits for both network operators and machine learning practitioners:
- Cost Savings: Repurposing or extending the operational life of legacy hardware avoids capital expenditure on new devices; using commodity or commodity-adjunct hardware (e.g., SDN switches) achieves performance improvements at a fraction of the cost of proprietary upgrades (Chang et al., 2015, Grigoryan et al., 2018).
- Energy Efficiency: Data-plane caching and minimal TCAM reliance reduce power consumption (Grigoryan et al., 2018); software upcycling allows dormant or underutilized hardware to perform advanced roles without additional energy costs.
- Performance and Flexibility: Hierarchical forwarding, fast failover, and programmable routing open possibilities for dynamic reallocation and policy-driven adaptability, reducing downtime and facilitating modern management frameworks.
- Scalability and Extensibility: Approaches such as programmable FIB caches and SDN overlays allow upcycled systems to scale as route tables and data-plane demands grow.
A plausible implication is that as upcycling techniques mature and become more standardized, the separation between purpose-built hardware and upcycled, software-driven solutions may diminish, leading to further democratization of advanced routing functionality.
7. Ongoing Research and Prospective Developments
Contemporary research addresses broadening the applicability and efficiency of router upcycling:
- Generalization: Extending backup-group overlays and hierarchical FIBs to other protocols (OSPF, IS-IS) and service chaining (Chang et al., 2015).
- Cache and Data Structure Optimization: Enhanced dynamic route reallocation and programmable cache management algorithms, leveraging emerging P4/PISA frameworks (Grigoryan et al., 2018).
- Hybrid Architectures: Further decoupling of control and data plane, via OpenFlow/ForCES or integration with NFV stacks (Fatahi et al., 2022).
- Algorithmic Upcycling: Adaptive router initialization in MoE networks, expanded diversity in query subspaces, and theoretical understanding of dynamic token routing (Ran et al., 31 Aug 2025).
- Resource Optimization: Minimizing memory and compute overhead for control-plane daemons within open source router firmware, e.g., optimizing YANG-JSON-UCI mappings (Granderath et al., 2023).
This suggests that the trajectory of router upcycling will continue to integrate advances from SDN, programmable hardware, open-source frameworks, and machine learning, with future work focusing on generalized applicability, formal guarantees on reliability, and further quantitative assessments of performance and resource consumption.