BSPPO: Trust-Based UAV Routing

Updated 26 January 2026

BSPPO is a framework that employs a security degree metric, combining historical credibility and adjacent reliability, to assess UAV trustworthiness.
It integrates SDN and blockchain technologies to enable tamper-proof, real-time trust updates and dynamic rerouting in adversarial environments.
The design features tunable parameters and efficient computation, enhancing network resilience, reducing latency, and lowering energy consumption.

A security degree metric is a quantitative measure that assesses the trustworthiness of a node—specifically, a UAV (Unmanned Aerial Vehicle)—in a network with respect to its authentication history and the reliability of adjacent nodes. Within UAV communication systems utilizing software-defined networking (SDN) and blockchain-based trust ledgers, the security degree metric enables dynamic, attack-resilient routing by providing a real-time indicator of each UAV’s operational integrity. Adoption of the security degree metric is central to architectures that require robust, low-latency, and energy-efficient communications in adversarial or failure-prone environments, as exemplified by its core role in the BSPPO (Beam Search–Proximal Policy Optimization) framework (Han et al., 19 Jan 2026).

1. Formal Definition and Mathematical Formulation

The security degree metric for a UAV node $i$ at hop $h$, denoted $SD_i$, is constructed as a weighted combination of two principal components: historical credibility and adjacent reliability.

Historical credibility $A_i^{h+1}$: This component quantifies a node’s self-authentication performance as the fraction of its authentication events that succeeded:

$A_i^{\,h+1} = \frac{S_i^{\,h+1}}{S_i^{\,h+1} + F_i^{\,h+1}}, \quad A_i^{\,h+1}\in[0,1]$

Here, $S_i^{h}$ and $F_i^{h}$ are the cumulative successful and failed authentications at hop $h$, incremented per event by

$\begin{aligned} S_i^{\,h+1} &= S_i^h + (1-\delta_{\rm data}) \\ F_i^{\,h+1} &= F_i^h + \delta_{\rm data} \end{aligned}$

with $\delta_{\rm data}\in\{0,1\}$ indicating whether an attack occurred ($\delta_{\rm data}=1$) or not ($\delta_{\rm data}=0$).
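The counter updates and the ratio above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `AuthHistory` and the `default` value returned before any event has been recorded are assumptions, since the source does not say how $A_i$ is initialized.

```python
from dataclasses import dataclass


@dataclass
class AuthHistory:
    """Cumulative authentication counters for one UAV (S_i and F_i)."""
    successes: int = 0
    failures: int = 0

    def record(self, delta_data: int) -> None:
        """Apply one event; delta_data = 1 means an attack was detected."""
        if delta_data not in (0, 1):
            raise ValueError("delta_data must be 0 or 1")
        self.successes += 1 - delta_data
        self.failures += delta_data

    def credibility(self, default: float = 1.0) -> float:
        """Return A_i = S / (S + F); `default` covers the no-history case (assumption)."""
        total = self.successes + self.failures
        return default if total == 0 else self.successes / total


history = AuthHistory()
for delta in (0, 0, 1, 0):   # three clean events and one detected attack
    history.record(delta)
a_i = history.credibility()  # 3 / (3 + 1)
```

Because the counters are cumulative, a single failure weighs less as the history grows, which is one reason the metric also draws on the neighborhood term.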

Adjacent reliability $RE_i^h$: This captures the trust context of $i$’s neighborhood as the mean reliability of its one-hop neighbors $\Gamma(i)$. Each neighbor $j$ tracks its own reliability $RE_j^h\in[0,1]$, which is decremented by $\beta$ whenever an attack is detected:

$RE_j^{\,h+1} = RE_j^h - \beta\,\delta_{\rm data}$

and aggregated as

$RE_i^h = \frac{1}{|\Gamma(i)|}\sum_{j\in\Gamma(i)} RE_j^h$
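As a rough sketch of the two formulas above, the following Python functions apply the per-neighbor decrement and then average over $\Gamma(i)$. The function names, the node labels, and the value of `beta` are hypothetical. Clamping the result to $[0,1]$ is also an assumption, added only so that repeated decrements stay inside the stated range.

```python
def update_reliability(re_j: float, delta_data: int, beta: float) -> float:
    """RE_j^{h+1} = RE_j^h - beta * delta_data, clamped to [0, 1] (clamp is an assumption)."""
    return min(1.0, max(0.0, re_j - beta * delta_data))


def adjacent_reliability(neighbors, reliability) -> float:
    """RE_i: mean of RE_j over the one-hop neighbors Gamma(i)."""
    if not neighbors:
        raise ValueError("node has no one-hop neighbors")
    return sum(reliability[j] for j in neighbors) / len(neighbors)


# Hypothetical neighborhood of UAV "u1".
reliability = {"u2": 1.0, "u3": 1.0, "u4": 0.5}
reliability["u3"] = update_reliability(reliability["u3"], delta_data=1, beta=0.25)
re_u1 = adjacent_reliability(["u2", "u3", "u4"], reliability)
```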

Final security degree $SD_i$: The SDN controller accesses the blockchain ledger for $A_i$ and $RE_i$, computing

$SD_i = \alpha\,RE_i + (1-\alpha)\,A_i, \quad \alpha\in[0,1]$

The parameter $\alpha$ sets the balance between the two components: a larger $\alpha$ gives more weight to the reliability of the UAV’s neighborhood, while a smaller $\alpha$ favors the UAV’s own authentication history.
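The blend is a one-line convex combination. The sketch below uses illustrative inputs, not figures from the paper, to show how the result moves between the two components as $\alpha$ varies; the function name `security_degree` is an assumption.

```python
def security_degree(re_i: float, a_i: float, alpha: float) -> float:
    """SD_i = alpha * RE_i + (1 - alpha) * A_i, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * re_i + (1.0 - alpha) * a_i


# Illustrative inputs: a clean personal history in a less reliable neighborhood.
re_i, a_i = 0.6, 0.9
sd_history_only = security_degree(re_i, a_i, alpha=0.0)   # equals A_i
sd_balanced = security_degree(re_i, a_i, alpha=0.5)       # midpoint of the two
sd_neighbors_only = security_degree(re_i, a_i, alpha=1.0) # equals RE_i
```

Since both inputs lie in $[0,1]$ and the weights sum to one, $SD_i$ also stays in $[0,1]$, which keeps a single threshold meaningful across nodes.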

2. Integration within Secure Routing Architectures

The security degree metric is designed to operate in conjunction with software-defined networking and blockchain trust management. The blockchain ensures tamper-proof updating and querying of $\{A_i,RE_i\}$ values, which the SDN controller aggregates to determine $SD_i$. This integration is critical for secure, centralized, and auditable decision-making, especially in environments where node compromise may be transient or dynamic. The metric’s real-time recalculation with every authentication event provides the agility required for rapid rerouting in response to attacks.
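To make the division of labor concrete, the sketch below replaces the blockchain with a simple hash-chained, append-only log and lets a controller function read the latest $\{A_i, RE_i\}$ pair from it. Everything here is an assumption for illustration: the class `TrustLedger`, its methods, and the record layout are hypothetical, and a real deployment would rely on an actual blockchain with consensus rather than a local list.

```python
import hashlib
import json


class TrustLedger:
    """Append-only, hash-chained log standing in for the blockchain (illustrative only)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def append(self, node_id: str, a_i: float, re_i: float) -> None:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        record = {"node": node_id, "A": a_i, "RE": re_i, "prev": prev_hash}
        record["hash"] = self._digest(record)
        self.blocks.append(record)

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks a hash link."""
        prev_hash = self.GENESIS
        for block in self.blocks:
            if block["prev"] != prev_hash or block["hash"] != self._digest(block):
                return False
            prev_hash = block["hash"]
        return True

    def latest(self, node_id: str):
        for block in reversed(self.blocks):
            if block["node"] == node_id:
                return block["A"], block["RE"]
        raise KeyError(node_id)


def controller_security_degree(ledger: TrustLedger, node_id: str, alpha: float) -> float:
    """Controller side: check ledger integrity, then blend the latest values."""
    if not ledger.verify():
        raise RuntimeError("ledger failed integrity check")
    a_i, re_i = ledger.latest(node_id)
    return alpha * re_i + (1.0 - alpha) * a_i


ledger = TrustLedger()
ledger.append("u1", a_i=0.9, re_i=0.6)
ledger.append("u1", a_i=0.8, re_i=0.6)   # a newer authentication outcome
sd_u1 = controller_security_degree(ledger, "u1", alpha=0.5)
```

Editing any stored value after the fact changes its digest, so `verify()` returns `False` and the controller refuses to compute $SD_i$ from the altered record.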

3. Role in the BSPPO Routing Framework

Within the BSPPO (Beam Search–Proximal Policy Optimization) routing framework, $SD_i$ is pivotal at multiple algorithmic stages:

Beam Search Candidate Screening: The average security degree along candidate paths serves as the scoring function:

$\mathrm{Score}_{\rm avg}(p) = \frac{1}{|p|}\sum_{v\in p} SD_v$

Only paths where all nodes satisfy $SD_v\ge\theta_{SD}$ are retained.

Dynamic Rerouting: Upon attack detection at any hop ($\delta_{\rm data}=1$), the affected node’s $SD_j$ is immediately updated and the node is removed from the candidate graph, triggering a fresh beam search.

This use of $SD_i$ ensures that only trustworthy UAVs participate in communication routes. The bi-level architecture—beam search for feasible high-security path selection, PPO for adaptive hop-by-hop rerouting—directly leverages the security degree metric to maintain network resilience against adversarial disruptions (Han et al., 19 Jan 2026).
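The screening and rerouting stages can be sketched together as follows. This is a simplified illustration under stated assumptions, not the BSPPO algorithm itself: the graph is a plain adjacency dictionary, the function names and example values are hypothetical, partial paths are ranked by the same average score as complete ones, and the PPO component is omitted entirely.

```python
import heapq


def avg_score(path, sd):
    """Score_avg(p): mean security degree over the nodes of path p."""
    return sum(sd[v] for v in path) / len(path)


def beam_search(graph, sd, src, dst, beam_width, max_hops, theta_sd):
    """Return loop-free src->dst paths whose nodes all meet theta_sd, best first."""
    if sd[src] < theta_sd:
        return []
    beam, complete = [[src]], []
    for _ in range(max_hops):
        candidates = []
        for path in beam:
            for nxt in graph.get(path[-1], ()):
                if nxt in path or sd[nxt] < theta_sd:
                    continue  # skip loops and nodes below the security threshold
                extended = path + [nxt]
                (complete if nxt == dst else candidates).append(extended)
        # Keep only the beam_width partial paths with the highest average SD.
        beam = heapq.nlargest(beam_width, candidates, key=lambda p: avg_score(p, sd))
        if not beam:
            break
    return sorted(complete, key=lambda p: avg_score(p, sd), reverse=True)


def reroute_after_attack(graph, sd, attacked, new_sd, current, dst, **search_args):
    """Handle delta_data = 1: record the updated SD, prune the node, search again."""
    sd = {**sd, attacked: new_sd}
    pruned = {u: [v for v in nbrs if v != attacked]
              for u, nbrs in graph.items() if u != attacked}
    return beam_search(pruned, sd, current, dst, **search_args)


# Hypothetical five-node topology and security degrees.
graph = {"s": ["a", "b"], "a": ["d"], "b": ["c"], "c": ["d"], "d": []}
sd = {"s": 0.9, "a": 0.8, "b": 0.7, "c": 0.7, "d": 0.9}
args = dict(beam_width=2, max_hops=4, theta_sd=0.6)

initial = beam_search(graph, sd, "s", "d", **args)
rerouted = reroute_after_attack(graph, sd, "a", 0.2, "s", "d", **args)
```

In this toy topology the first search prefers the route through `a`; once `a` is flagged and pruned, the fresh search falls back to the longer route through `b` and `c`.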

4. Algorithmic and Systemic Properties

The metric’s design aligns with several desirable system properties:

Adaptivity: $SD_i$ is updated in real time upon receiving new authentication outcomes and upon neighbor reliability changes. This facilitates immediate reaction to attacks or failures.
Transparency and Auditability: By storing all $A_i$ and $RE_i$ values on a blockchain ledger, the trust computation is auditable and tamper-resistant.
Parameterization: The blend parameter $\alpha$ allows system operators to tune the sensitivity of routing decisions to neighborhood context versus self-history, supporting environment-specific risk trade-offs.
Distributed Computability: While aggregation is orchestrated via SDN, the underlying updates ($S_i$, $F_i$, $RE_j$) are locally computable and easy to broadcast.

A plausible implication is that these properties make the security degree metric suitable for highly dynamic environments where global topology or attack patterns may change rapidly, and where distributed computation is essential.

5. Significance and Applications in Adversarial Networks

The deployment of the security degree metric directly addresses the challenge of minimizing delay, energy cost, and packet loss in UAV networks subject to adversarial behavior or transient node failure. The reported simulations show that BSPPO, through its use of $SD_i$, consistently outperforms alternative schemes (e.g., PPO-only, beam-search Q-learning, BS-actor critic) under varied attack densities, real-time reroute demands, and fluctuating packet sizes (Han et al., 19 Jan 2026).

This suggests the security degree metric is critical not only as a trust quantification tool but also as an enabler of robust, self-healing network control mechanisms over architectural substrates that combine SDN, blockchain, and hierarchical decision layers.

6. Hyperparameterization and Complexity Considerations

Implementation of security degree-based computation and its integration into routing algorithms requires tuning several hyperparameters, notably:

Beam width $B$ (number of paths per beam search iteration)
Maximum number of hops $H_{\max}$
Lower security threshold $\theta_{SD}$
Blend ratio $\alpha$
Update decrement factor $\beta$ for $RE_j^h$
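One way to keep these settings together is a small validated configuration object, sketched below. The class name, field names, and example values are hypothetical and are not taken from the paper. Restricting $\theta_{SD}$ and $\beta$ to $[0,1]$ is an assumption that follows from $SD_i$ and $RE_j$ being defined on that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BSPPOConfig:
    """Hypothetical container for the hyperparameters listed above."""
    beam_width: int   # B: paths kept per beam search iteration
    max_hops: int     # H_max: maximum number of hops
    theta_sd: float   # lower security threshold on SD
    alpha: float      # blend ratio between RE_i and A_i
    beta: float       # decrement applied to RE_j on attack

    def __post_init__(self):
        if self.beam_width < 1 or self.max_hops < 1:
            raise ValueError("beam_width and max_hops must be positive")
        for name in ("theta_sd", "alpha", "beta"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


# Illustrative values only.
config = BSPPOConfig(beam_width=4, max_hops=8, theta_sd=0.6, alpha=0.5, beta=0.1)
```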

The per-iteration computational cost, including security degree calculation and selection, is $\mathcal O(H_{\max} B \bar b \log (B \bar b))$ for beam search and $\mathcal O(H_{\max} d^2 + M d^2)$ for PPO rollouts and updates, where $d$ is the hidden-layer size of the network and $M$ is the minibatch size. Both terms grow at most polynomially in the tunable parameters, so the metric’s computation is compatible with online, large-scale deployment scenarios.
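To get a feel for the relative size of the two terms, the sketch below evaluates the expressions inside the $\mathcal O(\cdot)$ bounds for one illustrative setting. Several assumptions apply: $\bar b$ is read as the average branching factor (neighbors expanded per node), the logarithm is taken base 2, constant factors are ignored, and the parameter values are hypothetical. The results are growth indicators, not operation counts.

```python
import math


def beam_search_cost(h_max: int, beam_width: int, avg_branching: float) -> float:
    """H_max * B * b_bar * log2(B * b_bar), ignoring constant factors."""
    expanded = beam_width * avg_branching
    return h_max * expanded * math.log2(expanded)


def ppo_cost(h_max: int, hidden: int, minibatch: int) -> int:
    """H_max * d^2 + M * d^2, ignoring constant factors."""
    return (h_max + minibatch) * hidden ** 2


bs_cost = beam_search_cost(h_max=10, beam_width=5, avg_branching=4)
rl_cost = ppo_cost(h_max=10, hidden=128, minibatch=64)
ratio = rl_cost / bs_cost
```

Under these assumed values the PPO term is larger by roughly three orders of magnitude, which suggests the security screening adds little to the overall cost.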

7. Potential Extensions and Broader Relevance

While the described application is tailored to SDN-blockchain UAV networks, the approach to quantifying node trustworthiness via a weighted combination of local performance history and neighbor reputation has broader applicability. Potential extensions could investigate alternative formulations of historical credibility, or alternative aggregations of adjacent reliability (e.g., taking the maximum or minimum of $RE_j$ rather than the average), in other dynamic, adversarial networks. Its role as a real-time filter for candidate selection in hierarchical or hybrid control schemes suggests generalizability to other networked cyber-physical systems requiring resilient, transparent trust management.
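The aggregation variants mentioned above differ only in the reducer applied to the neighbors' $RE_j$ values. The sketch below makes that explicit; the function name, the `mode` argument, and the sample values are hypothetical, and only the mean corresponds to the formulation described in the source.

```python
from statistics import fmean

# Only "mean" matches the metric as defined; "min" and "max" are possible extensions.
AGGREGATORS = {"mean": fmean, "min": min, "max": max}


def aggregate_reliability(neighbor_re, mode: str = "mean") -> float:
    """Reduce the one-hop neighbors' RE_j values with the chosen aggregator."""
    values = list(neighbor_re)
    if not values:
        raise ValueError("node has no one-hop neighbors")
    return AGGREGATORS[mode](values)


neighbor_re = [1.0, 0.75, 0.5]
re_mean = aggregate_reliability(neighbor_re)           # averaging, as in the source
re_min = aggregate_reliability(neighbor_re, "min")     # pessimistic: worst neighbor
re_max = aggregate_reliability(neighbor_re, "max")     # optimistic: best neighbor
```

A minimum makes one unreliable neighbor dominate the score, whereas a maximum lets one reliable neighbor mask the rest, so the choice encodes how conservative the routing should be.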

The security degree metric provides a computationally efficient and auditable method for trust assessment in dynamic adversarial networks, and the reported simulations indicate that it is central to secure, low-latency, and energy-efficient UAV routing (Han et al., 19 Jan 2026).

References (1)

SDN-Blockchain Based Security Routing for UAV Communication via Reinforcement Learning (2026)