
ColO-RAN: Developing Machine Learning-based xApps for Open RAN Closed-loop Control on Programmable Experimental Platforms (2112.09559v2)

Published 17 Dec 2021 in cs.NI and cs.LG

Abstract: In spite of the new opportunities brought about by the Open RAN, advances in ML-based network automation have been slow, mainly because of the unavailability of large-scale datasets and experimental testing infrastructure. This slows down the development and widespread adoption of Deep Reinforcement Learning (DRL) agents on real networks, delaying progress in intelligent and autonomous RAN control. In this paper, we address these challenges by proposing practical solutions and software pipelines for the design, training, testing, and experimental evaluation of DRL-based closed-loop control in the Open RAN. We introduce ColO-RAN, the first publicly-available large-scale O-RAN testing framework with software-defined radios-in-the-loop. Building on the scale and computational capabilities of the Colosseum wireless network emulator, ColO-RAN enables ML research at scale using O-RAN components, programmable base stations, and a "wireless data factory". Specifically, we design and develop three exemplary xApps for DRL-based control of RAN slicing, scheduling and online model training, and evaluate their performance on a cellular network with 7 softwarized base stations and 42 users. Finally, we showcase the portability of ColO-RAN to different platforms by deploying it on Arena, an indoor programmable testbed. Extensive results from our first-of-its-kind large-scale evaluation highlight the benefits and challenges of DRL-based adaptive control. They also provide insights on the development of wireless DRL pipelines, from data analysis to the design of DRL agents, and on the tradeoffs associated with training on a live RAN. ColO-RAN and the collected large-scale dataset will be made publicly available to the research community.

Citations (122)

Summary

  • The paper introduces ColO-RAN, a novel framework that integrates deep reinforcement learning-based xApps with SDR-driven experimental platforms for closed-loop Open RAN control.
  • It details three exemplary xApps for DRL-based control of RAN slicing, scheduling, and online model training, pairing autoencoders with the DRL agents to improve resilience to imperfect network telemetry.
  • Agents trained on a 3.4 GB dataset of live RAN performance traces deliver significant improvements in the PRB allocation ratio and in throughput for users below the 40th percentile.

The paper "ColO-RAN: Developing Machine Learning-based xApps for Open RAN Closed-loop Control on Programmable Experimental Platforms" introduces ColO-RAN, an O-RAN testing framework with software-defined radios-in-the-loop. The framework facilitates the design, training, testing, and experimental evaluation of deep reinforcement learning (DRL)-based closed-loop control in the Open Radio Access Network (RAN).

The key contributions of this work are:

  • The introduction of ColO-RAN, an open, large-scale, experimental O-RAN framework for training and testing machine learning (ML) solutions for next-generation RANs. It combines O-RAN components, a softwarized RAN framework, and Colosseum, an open wireless network emulator based on software-defined radios (SDR). ColO-RAN uses Colosseum as a wireless data factory to generate large-scale datasets for ML training, considering propagation and fading characteristics of real-world deployments. The ML models are deployed as xApps on the near-real-time RAN Intelligent Controller (RIC), which connects to RAN nodes through O-RAN-compliant interfaces for data collection and closed-loop control.
  • The development of three xApps for closed-loop control of RAN scheduling and slicing policies, and for the online training of DRL agents in live production environments. An innovative xApp design is proposed that combines an autoencoder with the DRL agent to improve resilience and robustness to imperfect network telemetry. ColO-RAN is used to provide insights on the performance of the DRL agents for adaptive RAN control at scale: the autoencoders and agents are trained on a 3.4 GB dataset with more than 73 hours of live RAN performance traces, and the resulting agents are evaluated while autonomously driving a programmable, software-defined RAN with 49 nodes.
  • An analysis of the tradeoffs of training DRL agents on live networks, using Colosseum and Arena with commercial smartphones. RAN performance is profiled during the DRL exploration phase and after training, showing how an extra online training step adapts a pre-trained model to deployment-specific parameters by fine-tuning its weights (see the sketch after this list).
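As referenced in the last bullet, online adaptation amounts to continuing training from pre-trained weights rather than from scratch. A minimal sketch of that idea (pure NumPy, all names and values illustrative; this is not the paper's training code):

```python
import numpy as np

rng = np.random.default_rng(2)

def train(weights: np.ndarray, env_sample, steps: int, lr: float) -> np.ndarray:
    """One-dimensional stand-in for policy training: nudge the weights
    toward whatever the environment currently rewards."""
    for _ in range(steps):
        weights = weights + lr * (env_sample() - weights)
    return weights

offline_env = lambda: rng.normal(loc=1.0)   # emulated scenario (e.g., Colosseum)
live_env = lambda: rng.normal(loc=1.3)      # deployment-specific conditions

pretrained = train(np.zeros(1), offline_env, steps=5000, lr=0.01)
# Online step: start from the pre-trained weights and adapt with far
# fewer live interactions than training from scratch would require.
fine_tuned = train(pretrained, live_env, steps=300, lr=0.01)
```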

The O-RAN architecture includes two RICs that perform network control procedures over different time scales: near-real-time and non-real-time. The non-real-time RIC operates at time scales larger than 1 s and can involve thousands of devices; its responsibilities include service management and orchestration (SMO), policy management, and the training and deployment of ML models. The near-real-time RIC implements control loops that span from 10 ms to 1 s and involve hundreds of Centralized Units (CUs) and Distributed Units (DUs). Load balancing, handover, RAN slicing policies, and scheduler configuration are examples of near-real-time RIC operations.
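As a rough illustration of how control procedures map onto the two RICs, the following sketch (hypothetical names, not from any O-RAN codebase) routes a control task to a loop based on its latency budget:

```python
from enum import Enum

class ControlLoop(Enum):
    NEAR_RT = "near-real-time RIC"   # control loops from 10 ms to 1 s
    NON_RT = "non-real-time RIC"     # operations at time scales above 1 s

def assign_loop(latency_budget_s: float) -> ControlLoop:
    """Map a control task's latency budget to the O-RAN loop that can honor it."""
    if 0.01 <= latency_budget_s <= 1.0:
        return ControlLoop.NEAR_RT
    if latency_budget_s > 1.0:
        return ControlLoop.NON_RT
    raise ValueError("sub-10 ms loops run in the RAN itself, outside the RICs")

# Examples from the text: slicing-policy updates fit the near-RT loop,
# while ML model (re)training belongs in the non-RT loop.
assert assign_loop(0.1) is ControlLoop.NEAR_RT
assert assign_loop(60.0) is ControlLoop.NON_RT
```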

The O-RAN specifications include guidelines for the management of ML models in cellular networks, organizing the ML workflow into five steps: (1) data collection; (2) model design; (3) model training and testing; (4) model deployment as an xApp; and (5) runtime inference and control.
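The five steps can be read as a linear pipeline. A minimal sketch of that flow, with hypothetical stand-in functions for each stage (illustrative only; none of these names come from the O-RAN specifications):

```python
import random

def collect_data(source):            # (1) data collection (e.g., KPM traces)
    return [source() for _ in range(100)]

def design_model():                  # (2) model design
    return {"weights": 0.0}

def train_and_test(model, data):     # (3) model training and testing
    model["weights"] = sum(data) / len(data)
    return model

def deploy_as_xapp(model):           # (4) model deployment as an xApp
    return lambda kpm: "reconfigure" if kpm > model["weights"] else "hold"

def control_loop(xapp, live_kpms):   # (5) runtime inference and control
    return [xapp(k) for k in live_kpms]

model = train_and_test(design_model(), collect_data(random.random))
actions = control_loop(deploy_as_xapp(model), [random.random() for _ in range(5)])
```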

Colosseum provides a hybrid Radio Frequency (RF) and compute environment for the deployment of ColO-RAN. The SMO features three compute nodes to train large ML models, 64 TB of storage for models and datasets, and the xApp catalog. The near-real-time RIC, implemented as a standalone Linux Container (LXC) that can be deployed on a Colosseum Standard Radio Node (SRN), connects to the RAN base stations through the E2 interface and supports multiple xApps interacting with them. The base stations leverage a joint implementation of the Third Generation Partnership Project (3GPP) DUs and CUs, running the SCOPE framework, which extends srsRAN with open interfaces for runtime reconfiguration of base station parameters and automatic collection of relevant Key Performance Measurements (KPMs).

The RAN supports network slicing with three slices for different Quality of Service (QoS) requirements: enhanced Mobile Broadband (eMBB), representing users requesting video traffic; massive Machine-Type Communications (mMTC), for sensing applications; and Ultra Reliable and Low Latency Communications (URLLC), for latency-constrained applications. For each slice, the base stations can adopt one of three scheduling policies, independently of the other slices: Round Robin (RR), Water Filling (WF), and Proportional Fair (PF).
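A compact way to picture the per-slice configuration space described above (slice and scheduler names follow the paper; the data structure and PRB values are a hypothetical sketch, not SCOPE's API):

```python
from dataclasses import dataclass
from enum import Enum

class Scheduler(Enum):
    RR = "round robin"
    WF = "water filling"
    PF = "proportional fair"

@dataclass
class SliceConfig:
    name: str             # eMBB, mMTC, or URLLC
    prbs: int             # PRBs allocated to this slice
    scheduler: Scheduler  # chosen independently per slice

# One possible base-station configuration: each slice gets its own
# PRB share and scheduling policy (values here are illustrative).
config = [
    SliceConfig("eMBB", prbs=36, scheduler=Scheduler.PF),
    SliceConfig("mMTC", prbs=3, scheduler=Scheduler.RR),
    SliceConfig("URLLC", prbs=11, scheduler=Scheduler.WF),
]
```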

Three xApps were developed to evaluate the impact of different ML strategies for closed-loop RAN control: sched-slicing, sched, and online-training. The control actions available to the xApps are the selection of the slicing policy (the number of Physical Resource Blocks (PRBs) allocated to each slice) and of the scheduling policy. The xApps extend the O-RAN Software Community (OSC) basic xApp framework and comprise two components: an interface to the RIC, which implements the Service Model (SM) and performs Abstract Syntax Notation One (ASN.1) encoding/decoding of RAN data and control messages, and the ML infrastructure itself, which includes the autoencoders and DRL agents.
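Structurally, each xApp couples a RIC-facing interface with the ML component. A minimal sketch of that split (class and method names are hypothetical, not the OSC framework's API; the ASN.1 handling is stubbed):

```python
class RicInterface:
    """RIC-facing half: service model plus ASN.1 encode/decode (stubbed here)."""
    def decode_kpms(self, asn1_bytes: bytes) -> list[float]:
        return [b / 255 for b in asn1_bytes]   # stand-in for ASN.1 decoding
    def encode_control(self, action: int) -> bytes:
        return bytes([action])                 # stand-in for ASN.1 encoding

class XApp:
    """ML half: an encoder compressing KPMs plus an agent picking an action."""
    def __init__(self, ric: RicInterface, encoder, agent):
        self.ric, self.encoder, self.agent = ric, encoder, agent
    def on_indication(self, asn1_bytes: bytes) -> bytes:
        kpms = self.ric.decode_kpms(asn1_bytes)
        action = self.agent(self.encoder(kpms))
        return self.ric.encode_control(action)

# Toy usage: the "encoder" averages KPMs and the "agent" thresholds the result.
xapp = XApp(RicInterface(),
            encoder=lambda k: sum(k) / len(k),
            agent=lambda z: 1 if z > 0.5 else 0)
control_msg = xapp.on_indication(bytes([200, 40, 180]))
```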

The DRL agents considered in this paper are trained with the Proximal Policy Optimization (PPO) algorithm, and autoencoders are used to reduce the size of the observation fed to each agent. The DRL agent of sched-slicing jointly selects the slicing and scheduling policy for a single base station and all slices; three variants were trained: a baseline (DRL-base), an agent that explores a reduced set of actions (DRL-reduced-actions), and an agent without the autoencoder, where input data is fed directly to the agent (DRL-no-autoencoder). The sched xApp includes three DRL agents that select in parallel the scheduling policy for each slice (eMBB, mMTC, and URLLC), each trained on slice-specific data. The online-training xApp supports training a DRL agent on live data from the RAN, performing exploration steps on the online RAN infrastructure itself.
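A hedged sketch of the autoencoder-based observation compression (pure NumPy with illustrative dimensions and a deliberately simplified training step; the paper's actual architectures and PPO implementation are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

class Autoencoder:
    """Tiny linear autoencoder: compress a KPM window before the agent sees it."""
    def __init__(self, n_in: int, n_latent: int):
        self.enc = rng.normal(scale=0.1, size=(n_in, n_latent))
        self.dec = rng.normal(scale=0.1, size=(n_latent, n_in))
    def encode(self, x: np.ndarray) -> np.ndarray:
        return x @ self.enc
    def train_step(self, x: np.ndarray, lr: float = 1e-3) -> float:
        z = self.encode(x)
        err = z @ self.dec - x           # reconstruction error
        self.dec -= lr * z.T @ err       # gradient step on the decoder only
        return float((err ** 2).mean())  # (simplified; a real AE updates both)

# KPM window: e.g., TBs, buffer occupancy, downlink rate over recent slots.
ae = Autoencoder(n_in=30, n_latent=4)
window = rng.random((1, 30))
for _ in range(100):
    ae.train_step(window)
observation = ae.encode(window)  # this compact vector is what PPO would consume
```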

The DRL agents are trained on a dataset collected in experiments on Colosseum. The large-scale RF scenario mimics a real-world cellular deployment in downtown Rome, Italy, with the positions of the base stations derived from the OpenCelliD database.

A key step in the design of ML-driven xApps is selecting the features that should be reported for RAN closed-loop control, a selection enabled by large-scale, heterogeneous datasets and wireless data factories. A correlation analysis helps identify the KPMs that provide a meaningful description of the network state with minimal redundancy. The DRL agents for the xApps in this paper take as input the number of Transport Blocks (TBs), the buffer occupancy (or the ratio of PRBs granted and requested, which is highly correlated with the buffer status), and the downlink rate.
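A minimal example of this kind of correlation analysis, using pandas on a synthetic KPM trace (column names follow the metrics in the text; the data and the 0.9 threshold are illustrative assumptions, not the paper's):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic KPM trace: the buffer occupancy and the granted/requested PRB
# ratio are constructed to be highly correlated, as the text observes.
buffer_occupancy = rng.random(1000)
kpms = pd.DataFrame({
    "tx_tbs": rng.poisson(20, 1000).astype(float),
    "buffer_occupancy": buffer_occupancy,
    "prb_granted_requested_ratio": 0.9 * buffer_occupancy + 0.1 * rng.random(1000),
    "dl_rate_mbps": rng.gamma(2.0, 5.0, 1000),
})

corr = kpms.corr().abs()
# Drop one member of any near-duplicate pair (|corr| > 0.9) to cut redundancy.
redundant = [c for i, c in enumerate(corr.columns)
             if any(corr.iloc[i, :i] > 0.9)]
selected = [c for c in kpms.columns if c not in redundant]
print(selected)  # keeps buffer_occupancy, drops the redundant PRB ratio
```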

The paper compares the performance of the sched and sched-slicing xApps, which perform different control actions. The first assumes a fixed slicing profile and includes three DRL agents that select the scheduling policy for each slice; the second jointly controls the slicing (i.e., the number of PRBs allocated to each slice) and scheduling policies with a single DRL agent. The joint control of slicing and scheduling improves the relevant metric for each slice, with the most significant gains in the PRB ratio and in the throughput of users below the 40th percentile.