
SCLAR Framework: Cross-Layer Wireless Networks

Updated 27 February 2026
  • SCLAR Framework is a rigorous cross-layer metric that combines physical-layer (SINR and capacity) and MAC-layer (collision and jamming) outcomes in wireless networks.
  • It employs a ResNet-based deep Q-network to autonomously learn optimal transmission policies, achieving significant performance gains and rapid convergence in multi-cell uplink scenarios.
  • The design couples channel quality and transmission success, ensuring spectral efficiency while facing challenges like high offline training requirements and limited real-time adaptability.

The Sum Cross-Layer Achievable Rate (SCLAR) framework provides a rigorous, cross-layer metric and methodology for managing channel access in time-slotted uplink wireless networks subject to stochastic transmission schedules and adversarial jamming. Targeting multi-cell and multi-user environments, SCLAR explicitly couples physical-layer modulation, coding, and signal-to-interference-plus-noise ratio (SINR) with medium access control (MAC)-layer outcomes such as successful transmission, collision, and interference. Leveraging deep reinforcement learning (DRL), specifically residual network-based Deep Q-Networks (ResDQN), SCLAR enables intelligent user equipment to autonomously learn transmission policies that maximize network-wide throughput under hostile and uncertain conditions (Basit et al., 20 Jan 2025, Basit et al., 2024).

1. Mathematical Definition of SCLAR

At its core, SCLAR is defined in terms of physical-layer channel capacity, MAC-layer success, and a summation across users and time slots. For user equipment $n$ in cell $k$ during slot $t_f^s$:

  • The instantaneous post-SIC SINR is

$$\widehat{\mathrm{SINR}}^{[k]}_{\mathrm{UE}_n}[t_f^s] = \frac{[A^{[k]}_{\mathrm{UE}}]_{n,n}\, P^{[k]}_{\mathrm{UE}_n}\, \|h^{[k]}_{\mathrm{UE}_n}\|^4}{\sum_{n'\neq n} [A^{[k]}_{\mathrm{UE}}]_{n',n'}\, P^{[k]}_{\mathrm{UE}_{n'}}\, |h^{[k]\mathsf{H}}_{\mathrm{UE}_n} h^{[k]}_{\mathrm{UE}_{n'}}|^2 + \sum_m [I^{[k]}_{\mathrm{J}}]_{m,m}\, P^{[k]}_{\mathrm{J}_m}\, |h^{[k]\mathsf{H}}_{\mathrm{UE}_n} h^{[k]}_{\mathrm{J}_m}|^2 + \|h^{[k]}_{\mathrm{UE}_n}\|^2 \sigma^2}$$

  • The physical-layer capacity (in bits/s/Hz) is

$$C^{[k]}_{\mathrm{UE}_n}[t_f^s] = \log_2\bigl(1 + \widehat{\mathrm{SINR}}^{[k]}_{\mathrm{UE}_n}[t_f^s]\bigr)$$

  • The MAC-layer success rate $\xi^{[k]}_{\mathrm{UE}_n}[t_f^s]$ is the probability that user $n$'s packet is successfully received in slot $t_f^s$ (i.e., no collision, no jamming).
  • The cross-layer achievable rate (CLAR) is

$$R^{[k]}_{\mathrm{UE}_n}[t_f^s] = \xi^{[k]}_{\mathrm{UE}_n}[t_f^s]\, C^{[k]}_{\mathrm{UE}_n}[t_f^s]$$

  • The sum cross-layer achievable rate (SCLAR) for cell $k$ over a frame $t_f$ of $S$ slots, aggregated across the $N^{[k]}_{\mathrm{UE}}$ UEs, is

$$\mathrm{SCLAR}^{[k]}[t_f] = \sum_{n=1}^{N^{[k]}_{\mathrm{UE}}} \sum_{s=1}^{S} R^{[k]}_{\mathrm{UE}_n}[t_f^s]$$

This composite figure of merit incentivizes physical-layer spectral efficiency only if transmission is MAC-successful; packets lost to collisions or jamming contribute zero.
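Concretely, the gating of capacity by MAC success can be sketched in a few lines of Python (the SINR values and success indicators below are illustrative placeholders, not figures from the papers):

```python
import math

def clar(sinr, success):
    # Cross-layer achievable rate: MAC success gates the PHY capacity.
    capacity = math.log2(1.0 + sinr)   # C = log2(1 + SINR), bits/s/Hz
    return success * capacity          # R = xi * C; lost packets contribute 0

def sclar(sinr_table, success_table):
    # Sum over all UEs n and all S slots of the frame.
    return sum(clar(s, x)
               for row_s, row_x in zip(sinr_table, success_table)
               for s, x in zip(row_s, row_x))

# Two UEs, three slots: UE 0 loses slot 1 to a collision (success = 0).
sinr    = [[7.0, 7.0, 3.0],
           [15.0, 1.0, 15.0]]
success = [[1, 0, 1],
           [1, 1, 1]]
print(round(sclar(sinr, success), 3))  # → 14.0
```

A packet lost to collision or jamming (success = 0) contributes exactly zero, regardless of how favorable its SINR was.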

2. System and Channel Model

SCLAR is conceived for a multi-cell, time-slotted uplink network with the following structure:

  • Each of $K$ cells contains a cluster head (CH) equipped with $L$ antennas and $N^{[k]}_{\mathrm{UE}}$ single-antenna user equipments (UEs): one intelligent UE (iUE) and $N^{[k]}_{\mathrm{UE}} - 1$ predefined UEs (pUEs).
  • Transmissions are organized into frames indexed by $f = 1, \dots, F$, each with $S$ time slots $t_f^s$.
  • pUE transmission schedules are i.i.d. Bernoulli random variables with parameter $\Omega$: $\Pr\bigl([A^{[k]}_{\mathrm{UE}}]_{n,n} = 1\bigr) = \Omega$.
  • Malicious jammers ($M^{[k]}_{\mathrm{J}}$ per cell) follow periodic on/off activation over $(S_{\mathrm{on}}, S_{\mathrm{off}})$ slots, independent of UE scheduling.
  • Channel vectors for UEs and jammers experience small-scale Rayleigh fading and quasi-static spatial variation, with no large-scale fading modeled.
  • The received signal at CH is

$$y^{[k]} = H^{[k]}_{\mathrm{UE}} \bigl(P^{[k]}_{\mathrm{UE}}\bigr)^{1/2} A^{[k]}_{\mathrm{UE}} x^{[k]}_{\mathrm{UE}} + H^{[k]}_{\mathrm{J}} \bigl(P^{[k]}_{\mathrm{J}}\bigr)^{1/2} I^{[k]}_{\mathrm{J}} x^{[k]}_{\mathrm{J}} + \sum_{i \neq k} (\cdots) + n^{[k]}$$

  • Detection at the CH uses linear receivers ($\hat{x}^{[k]}_{\mathrm{UE}} = V^{[k]\mathsf{H}} y^{[k]}$), followed by matched-filter successive interference cancellation (SIC).
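The traffic and jamming schedules described above can be sketched directly; the parameter values here ($\Omega = 0.5$, a (2, 3) on/off pattern) are assumptions chosen for the example, not values from the papers:

```python
import random

def pue_schedule(num_pues, num_slots, omega, rng):
    # Each pUE transmits in each slot i.i.d. Bernoulli(omega).
    return [[1 if rng.random() < omega else 0 for _ in range(num_slots)]
            for _ in range(num_pues)]

def jammer_pattern(num_slots, s_on, s_off, phase=0):
    # Periodic on/off activation: s_on jammed slots, then s_off quiet slots.
    period = s_on + s_off
    return [1 if (t + phase) % period < s_on else 0 for t in range(num_slots)]

rng = random.Random(0)
sched = pue_schedule(num_pues=3, num_slots=10, omega=0.5, rng=rng)
jam = jammer_pattern(num_slots=10, s_on=2, s_off=3)
print(jam)  # → [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]
```

Note that the jammer pattern is deterministic once its phase is fixed, while the pUE schedule is stochastic, which is exactly the mix of structure and uncertainty the learner must cope with.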

3. SCLAR Maximization as a POMDP/MDP

The SCLAR maximization problem seeks the optimal transmit policies $A^{[k]}_{\mathrm{UE}}$ (binary slot allocations) for all UEs, maximizing

$$\sum_{n=1}^{N^{[k]}_{\mathrm{UE}}} r_n^{[k]\mathsf{T}} a_n^{[k]}$$

subject to $a_n^{[k]}(t_f^s) \in \{0, 1\}$ and unknown jammer actions $I^{[k]}_{\mathrm{J}}$. Here $r_n^{[k]}$ is the CLAR vector of UE $n$ over the $S$ slots.
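For a toy instance small enough to enumerate, the objective and its binary constraint can be checked by brute force. The values below are invented; the negative entry stands in for a slot where transmitting is penalized, mirroring the framework's negative rewards for bad outcomes:

```python
from itertools import product

def objective(r_vectors, a_vectors):
    # sum_n r_n^T a_n with binary slot-allocation vectors a_n.
    return sum(sum(r * a for r, a in zip(rn, an))
               for rn, an in zip(r_vectors, a_vectors))

# One UE, S = 3 slots; the best policy holds in the penalized slot 1.
r = [[2.0, -1.0, 3.0]]
best = max(product([0, 1], repeat=3), key=lambda a: objective(r, [list(a)]))
print(best, objective(r, [list(best)]))  # → (1, 0, 1) 5.0
```

Exhaustive search over the $2^S$ allocations is of course infeasible at realistic frame sizes, which is why the problem is handed to a learning agent.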

This dynamic scenario is naturally formulated as a partially observed Markov decision process (POMDP):

  • Agent: The iUE in each cell.
  • Action space: $\mathcal{A} = \{\mathtt{dispatch}, \mathtt{hold}\}$ for each slot.
  • Observation: After each slot, a 6-element one-hot encoding signals "idle," "busy," "success," "collision," "jammed-UE," or "jammed-pUE."
  • State: concatenation, over all UEs (pUEs + iUE), of their $(\text{last action}, \text{last ACK}, \text{last reward})$ triples.
  • Reward: Network-level,

$$r_{t+1} = \nu^{\mathrm{net}} \Bigl(\mathbb{U}^{\mathrm{iUE}}_t + \sum_{n=1}^{N^{[k]}} \mathbb{U}^{\mathrm{pUE}}_{n,t}\Bigr)$$

where the $\mathbb{U}$ terms are scaled CLARs and $\nu^{\mathrm{net}}$ sets reward magnitudes based on outcomes ("good", "excellent", "bad", "worst").

The transition model, incorporating all pUE and jammer schedules, is unknown; learning proceeds from observed transitions and rewards.
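A skeletal slot classifier illustrates the six-way one-hot observation. The outcome labels follow the list above; the classification rules themselves are a simplified stand-in for the full simulator:

```python
OUTCOMES = ["idle", "busy", "success", "collision", "jammed-UE", "jammed-pUE"]

def one_hot(outcome):
    # Encode one of the six slot outcomes as a 6-element one-hot vector.
    v = [0] * len(OUTCOMES)
    v[OUTCOMES.index(outcome)] = 1
    return v

def slot_outcome(iue_tx, num_pue_tx, jammed):
    # Classify the slot from the iUE's perspective (simplified rules).
    if jammed:
        return "jammed-UE" if iue_tx else "jammed-pUE"
    active = num_pue_tx + (1 if iue_tx else 0)
    if active == 0:
        return "idle"
    if active > 1:
        return "collision"
    return "success" if iue_tx else "busy"

print(one_hot(slot_outcome(iue_tx=True, num_pue_tx=0, jammed=False)))
# → [0, 0, 1, 0, 0, 0]
```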

4. Deep Reinforcement Learning: ResNet-Based Q-Learning

A ResNet-based DQN ("ResDQN") is used to approximate the optimal action-value function $Q^*(s, a)$:

  • Input: state vector ($|s_t|$-dimensional, e.g., 39).
  • Architecture: five residual blocks in series, each with two fully connected 32-unit ReLU layers and a skip connection. The output is passed to two 128-unit dense ReLU layers, then to a two-unit output layer for the $Q$-values (one per action).
  • Loss function: Mean-squared Bellman temporal difference (TD) error,

$$\mathcal{L}(\theta) = \mathbb{E}_{(s,a,r,s')}\Bigl[\bigl(r + \gamma \max_{a'} Q(s', a'; \theta^-) - Q(s, a; \theta)\bigr)^2\Bigr]$$

  • Training protocol:
    • Choose $a_t$ via $\epsilon$-greedy on $Q(s_t, \cdot; \theta)$.
    • Apply $a_t$; observe $(\text{ACK}, r_{t+1}, s_{t+1})$.
    • Store $(s_t, a_t, r_{t+1}, s_{t+1})$ in the replay buffer $\mathcal{D}$.
    • Sample a minibatch from $\mathcal{D}$; update $\theta$ via gradient descent.
    • Update the target network via soft update: $\theta^- \leftarrow \tau\theta + (1-\tau)\theta^-$.
    • Decay $\epsilon$ as learning proceeds.
    • Return $\pi^*(s) = \arg\max_a Q(s, a; \theta)$ at convergence.
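The described forward pass can be sketched in NumPy. The weights below are random placeholders; the initialization scale and exact head wiring are assumptions for illustration, not the papers' implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def res_block(x, w1, b1, w2, b2):
    # Two 32-unit ReLU layers with an identity skip connection.
    h = relu(x @ w1 + b1)
    h = relu(h @ w2 + b2)
    return x + h  # skip connection: output = input + residual

rng = np.random.default_rng(0)
state_dim, hidden, n_actions = 39, 32, 2

# Project the 39-dim state onto the 32-unit residual trunk.
w_in = rng.standard_normal((state_dim, hidden)) * 0.1
blocks = [tuple(rng.standard_normal(s) * 0.1
                for s in [(hidden, hidden), (hidden,), (hidden, hidden), (hidden,)])
          for _ in range(5)]                      # five residual blocks in series
w_h1 = rng.standard_normal((hidden, 128)) * 0.1   # two 128-unit dense ReLU layers
w_h2 = rng.standard_normal((128, 128)) * 0.1
w_out = rng.standard_normal((128, n_actions)) * 0.1

def q_values(state):
    x = relu(state @ w_in)
    for w1, b1, w2, b2 in blocks:
        x = res_block(x, w1, b1, w2, b2)
    x = relu(relu(x @ w_h1) @ w_h2)
    return x @ w_out                              # one Q-value per action

q = q_values(rng.standard_normal(state_dim))
print(q.shape)  # → (2,)
```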

Key DRL hyperparameters include learning rate $5 \times 10^{-4}$, discount factor $0.99$, replay buffer size $10^5$, minibatch size $64$, and soft update factor $\tau = 10^{-3}$ (Basit et al., 20 Jan 2025, Basit et al., 2024).
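The per-step update mechanics with the quoted discount and soft-update factors can be written out directly; the Q-values and rewards below are placeholders:

```python
GAMMA, TAU = 0.99, 1e-3  # discount factor and soft-update rate from the papers

def td_target(reward, next_q_values, done=False):
    # y = r + gamma * max_a' Q(s', a'; theta^-); no bootstrap at episode end.
    return reward if done else reward + GAMMA * max(next_q_values)

def soft_update(target_params, online_params, tau=TAU):
    # theta^- <- tau * theta + (1 - tau) * theta^-
    return [tau * p + (1 - tau) * t for t, p in zip(target_params, online_params)]

y = td_target(reward=1.0, next_q_values=[0.5, 2.0])
print(round(y, 4))  # → 2.98
target = soft_update([0.0, 0.0], [1.0, -1.0])
print(target)  # → [0.001, -0.001]
```

With $\tau = 10^{-3}$ the target network trails the online network slowly, which is what stabilizes the bootstrapped TD targets.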

5. Performance Evaluation and Comparative Results

SCLAR framework performance has been benchmarked using:

  • Simulation settings: Multi-cell, multi-UE (1 iUE + 10–35 pUEs per cell), 2–5 jammers, frame sizes 5–30 slots.
  • Baselines: Fully connected DQN (FC-DQN), GRU-DQN, network-aware UE (omniscient optimum), tabular Q-learning (Basit et al., 2024).
  • Key metrics: Instantaneous and average SCLAR, cumulative reward, convergence rate, and training loss.

Observed outcomes include:

  • ResDQN achieves within a few percent of the omniscient optimum SCLAR across all frame sizes.
  • Outperforms FC-DQN and GRU-DQN by 15–25% in final SCLAR and by 20–35% in convergence speed. For example, in (Basit et al., 2024), average SCLAR (bits/s/Hz) over slots 21–100: Tabular Q = 5.2, FC-DNN DQN = 11.8, ResDNN DQN = 18.7.
  • Demonstrated robustness to increased numbers of pUEs and jammers; convergence to the optimal reward in $\leq 2{,}000$ episodes.
  • Training loss declines monotonically; learning produces judicious action patterns that avoid collision and jamming.

6. Insights, Limitations, and Prospects

The SCLAR framework exhibits the following properties:

  • Cross-layer reward formulation directly couples physical-layer channel quality (via SINR) and MAC-layer success, enabling agents to adapt without explicit coordination for coexistence.
  • ResNet skip connections in the Q-network benefit training by facilitating identity mappings and more stable policy updates in partially observed MDPs.
  • Principal limitations are the need for substantial offline training, high memory requirements due to large replay buffers, and limited adaptability to rapid online environment changes due to partial observability.
  • Current design assumes a fixed number of iUEs per cell and a static frame structure; scenario generalization requires further development.

Extension avenues include multi-agent DRL for multiple iUEs and inter-cell coordination, continuous action spaces for joint time-slot and power allocations, transfer learning for mobility or variable topologies, and integration with reconfigurable intelligent surfaces for enhanced anti-jamming capabilities (Basit et al., 20 Jan 2025).

The SCLAR framework unifies and advances DRL-based channel access initiatives by providing a mathematically rigorous, physically grounded, and simulation-validated cross-layer performance metric. Direct comparison with DRL alternatives, including tabular Q-learning and fully-connected DNNs, demonstrates that the addition of residual connectivity in the Q-network architecture is crucial for robust and rapid learning in jam-prone, partially observed settings (Basit et al., 20 Jan 2025, Basit et al., 2024). SCLAR's general methodology supports diverse wireless scenarios subject to adversarial interference, limited feedback, and stringent coexistence requirements.
