- The paper introduces a federated fine-tuning architecture that offloads the parameter-heavy encoder to ground servers while satellites perform lightweight head tuning, enabling real-time on-board processing.
- It proposes a microservice inference framework that coordinates modular tasks across satellites, significantly cutting redundant computations.
- The study explores advanced communication strategies, energy-efficient neuromorphic computing, and generative AI for optimizing satellite edge AI deployments.
Remote sensing (RS) satellites generate massive amounts of data, driving a need for intelligent processing, especially for time-critical applications like extreme weather nowcasting, disaster monitoring, and battlefield surveillance. Traditional approaches involve downlinking raw data to ground stations for processing, which introduces significant latency and potential trustworthiness issues. This paper (2504.01676) explores satellite edge AI using Large AI Models (LAMs) as a solution, proposing architectures and technologies for performing AI tasks directly on-board satellites within space computing power networks (Space-CPN). The work focuses on two key phases: fine-tuning LAMs on-board and performing multi-task, multimodal inference.
The paper proposes a satellite federated fine-tuning architecture for training LAMs at the satellite edge. Given the resource constraints of satellites, fine-tuning large models entirely on-board is challenging. The proposed approach partitions the LAM (e.g., a Vision Transformer-based RS model like SpectralGPT (2504.01676)), deploying parameter-heavy components like the encoder on a ground cloud server and lighter components such as the embedding and head layers on LEO satellites. The fine-tuning process then involves parameter-efficient head tuning on the satellites.
The practical implementation of this fine-tuning architecture involves two phases:
- Feature Extraction: LEO satellites process local RS data through their on-board embedding layers to generate embedding vectors. These vectors are gathered within orbits via intra-orbit inter-satellite links (ISLs) and then sent to ground stations (GSs) via satellite-ground links (SGLs). The GSs relay data to the ground cloud server, which uses the encoder to extract features and sends them back to the satellites.
- Model Update: Satellites train their local head layers using the received feature vectors. Intra-orbit model aggregation is performed via ISLs to accelerate convergence. Periodically, updated heads are sent from satellites to GSs for global aggregation on the ground, and the resulting global head is broadcast back to the satellites. A decentralized head fine-tuning approach using only ISLs is also considered to mitigate single-point ground failure.
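The two phases above can be condensed into a minimal sketch of one fine-tuning round, assuming a linear head with squared loss and synthetic data; `ground_encoder` is a stand-in for the cloud-side frozen encoder, and all shapes and the simple-mean aggregation are illustrative, not the paper's exact protocol:

```python
import numpy as np

rng = np.random.default_rng(0)

def ground_encoder(embeddings):
    # Stand-in for the cloud-side frozen encoder (illustrative projection).
    proj = np.ones((embeddings.shape[1], 4)) / embeddings.shape[1]
    return embeddings @ proj

def head_gradient(head, feats, labels):
    # Gradient of mean squared error for a linear head.
    preds = feats @ head
    return feats.T @ (preds - labels) / len(labels)

n_sats, dim_emb, dim_feat, lr = 3, 8, 4, 0.1
heads = [np.zeros((dim_feat, 1)) for _ in range(n_sats)]

# Phase 1: satellites embed local RS data; the ground server extracts
# features from the uplinked embeddings and returns them over SGLs.
local_data = [rng.normal(size=(16, dim_emb)) for _ in range(n_sats)]
labels = [rng.normal(size=(16, 1)) for _ in range(n_sats)]
features = [ground_encoder(x) for x in local_data]

# Phase 2: each satellite updates its local head, then heads are
# aggregated (here: a simple mean, mimicking intra-orbit aggregation).
for s in range(n_sats):
    heads[s] -= lr * head_gradient(heads[s], features[s], labels[s])
global_head = sum(heads) / n_sats
```

In the actual architecture the embedding upload and feature return would traverse ISLs and SGLs, and aggregation would alternate between intra-orbit and global rounds.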
Implementing this requires efficient communication schemes across three link types:
- Intra-orbit: stable ring topologies support data gathering and model aggregation via a ring all-reduce based algorithm.
- Inter-orbit: used for decentralized aggregation, these links are time-varying and subject to Doppler shift; the paper suggests shortest-path algorithms such as Floyd-Warshall on a network graph weighted by link capacities to facilitate parallel transmission.
- Satellite-ground: with multiple satellites connecting to multiple GSs, maximizing the overall data rate is formulated as a network flow problem (solvable with Ford-Fulkerson) to coordinate transmission from orbits to the ground. Advanced techniques such as MIMO and Over-the-Air Computation (AirComp), potentially enhanced with lattice quantization for robustness at low SNR, are discussed for improving SGL efficiency and aggregation.
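The intra-orbit ring all-reduce can be illustrated with a minimal, non-pipelined sketch: each satellite forwards a running partial sum to its ring successor, and after n-1 hops every node holds the global average. Production implementations chunk the model vector to overlap communication with computation; this version is only a conceptual sketch:

```python
import numpy as np

def ring_allreduce(vectors):
    """Average vectors over a ring: each node forwards its accumulator
    to the next node, which adds its own vector; after n-1 hops every
    accumulator contains the full sum."""
    n = len(vectors)
    acc = [v.copy() for v in vectors]
    for _ in range(n - 1):
        # Each hop models one intra-orbit ISL transmission round.
        acc = [acc[(i - 1) % n] + vectors[i] for i in range(n)]
    return [a / n for a in acc]

# Four satellites with distinct local model vectors.
models = [np.array([float(i), 2.0 * i]) for i in range(4)]
avg = ring_allreduce(models)  # every satellite ends with the mean model
```

After the loop, all four nodes hold the same averaged vector, which is what intra-orbit model aggregation requires.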
For satellite edge LAM inference, the paper introduces a microservice-empowered architecture to handle multi-task, multimodal downstream applications. Traditional monolithic inference services deployed on-board lead to computation redundancy as common modules are repeatedly scheduled for different tasks. The microservice architecture virtualizes LAM functional modules (modality encoders, input projectors, backbone, etc.) into independent, loosely coupled microservices. This allows shared modules to be invoked simultaneously by different tasks, reducing redundant computation.
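A toy illustration of why shared microservices cut redundancy (the module and task names here are hypothetical, not from the paper): caching a shared encoder microservice's output lets two downstream tasks reuse one encoder pass instead of scheduling it twice, as a monolithic service would:

```python
from functools import lru_cache

calls = {"encoder": 0, "segment_head": 0, "detect_head": 0}

@lru_cache(maxsize=None)
def encoder(image_id):
    # Shared modality-encoder microservice; runs once per distinct input.
    calls["encoder"] += 1
    return f"features({image_id})"

def segmentation_task(image_id):
    # Task-specific head microservice invoking the shared encoder.
    calls["segment_head"] += 1
    return f"mask from {encoder(image_id)}"

def detection_task(image_id):
    calls["detect_head"] += 1
    return f"boxes from {encoder(image_id)}"

# Two downstream tasks on the same scene trigger only one encoder run.
segmentation_task("scene-42")
detection_task("scene-42")
```

In the real architecture the "cache" is replaced by orchestration logic that routes different tasks' requests through the same deployed microservice instance.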
The practical application of this architecture involves:
- Microservice Deployment: Determining which microservices should be pre-deployed on each satellite. This is framed as a complex optimization problem (matching inference task DAGs to time-varying satellite networks) considering heterogeneous satellite resources and dynamic topologies. The paper proposes formulating this as a Markov Decision Process (MDP) and using deep reinforcement learning (specifically PPO) to learn effective, latency-aware deployment strategies. Multi-agent RL is suggested for distributed deployment across sub-networks.
- Microservice Orchestration: Coordinating the sequence of microservice executions across different satellites for a given inference request. This involves designing energy-efficient service routing. If all microservices were on all satellites, it would be a minimum spanning tree problem; with distributed microservices and relay nodes, it becomes a minimum directed Steiner tree (DST) problem on an augmented graph representing computation and communication energy costs. A heuristic algorithm combining Dijkstra and path merging is proposed for static topologies. For dynamic topologies, combining RL with Graph Neural Networks (GNNs like GAT) is suggested to handle the time-varying nature of the satellite network graph.
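The Dijkstra-plus-path-merging heuristic for static topologies can be sketched as follows, under simplifying assumptions: a small hypothetical 5-satellite graph, scalar edge weights standing in for combined computation and communication energy, and terminals meaning the satellites hosting required microservices. Shortest paths from the request source to each terminal are merged (shared edges deduplicated) into one routing tree:

```python
import heapq

def dijkstra(graph, src):
    """Return (dist, parent) maps for a weighted adjacency-dict graph."""
    dist = {src: 0.0}
    parent = {src: None}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return dist, parent

def steiner_tree_heuristic(graph, src, terminals):
    """Merge shortest src->terminal paths into one set of tree edges."""
    _, parent = dijkstra(graph, src)
    edges = set()
    for t in terminals:
        node = t
        while parent[node] is not None:  # walk each path back to src
            edges.add((parent[node], node))
            node = parent[node]
    return edges

# Hypothetical topology; weights model per-hop energy cost.
g = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1, "E": 1},
    "E": {"D": 1},
}
tree = steiner_tree_heuristic(g, "A", ["D", "E"])
cost = sum(g[u][v] for u, v in tree)
```

This is a heuristic, not an exact DST solver: merging shortest paths gives a feasible tree whose cost upper-bounds the optimum, which is the usual trade-off for on-board computation budgets.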
Looking ahead, the paper discusses several future directions for satellite edge LAM:
- Task-Oriented Communications: Instead of transmitting all data, extract only information relevant to the downstream task. For multimodal RS data, this could involve using the Multimodal Information Bottleneck (MIB) to compress data while preserving task-relevant features. Variational information bottleneck and lattice coding can enhance robustness against noise in SGLs.
- Neuromorphic Computing: Address the high energy consumption of LAMs. Spiking Neural Networks (SNNs) with their event-driven, sparse computation offer a more energy-efficient paradigm. Integrating SNNs with emerging neuromorphic processors can provide a suitable hardware platform for energy-constrained satellites, as explored in projects like ESA's Neuro SatCom (2504.01676).
- Generative AI for Optimization: Leverage Generative AI (like diffusion models and LLMs) to solve complex resource allocation and optimization problems (deployment, orchestration, scheduling) in satellite edge AI networks. Diffusion models can be applied to combinatorial optimization problems like microservice deployment by formulating them as energy minimization. LLMs could potentially act as solvers for optimization problems, allowing natural language problem descriptions, although challenges remain in handling large-scale problems and complex constraints.
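To make the SNN energy argument above concrete, here is a minimal leaky integrate-and-fire (LIF) neuron sketch with illustrative constants: the neuron integrates input current with a leak and emits a discrete spike only when its membrane potential crosses a threshold, so downstream layers do work only on the sparse events:

```python
def lif_neuron(inputs, leak=0.9, threshold=1.0):
    """Leaky integrate-and-fire: accumulate input current with leak,
    spike and reset when the membrane potential crosses threshold."""
    v, spikes = 0.0, []
    for current in inputs:
        v = leak * v + current        # leaky integration
        if v >= threshold:
            spikes.append(1)          # event: spike emitted downstream
            v = 0.0                   # reset after firing
        else:
            spikes.append(0)          # no event, no downstream work
    return spikes

# Six input steps produce a sparse spike train; only the 1s cost
# downstream computation, which is the energy-saving mechanism.
train = lif_neuron([0.3, 0.3, 0.6, 0.0, 0.2, 0.9])
```

Neuromorphic processors exploit exactly this sparsity in hardware, consuming energy roughly in proportion to the number of spikes rather than the number of neurons.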
In summary, the paper (2504.01676) provides a conceptual framework for satellite edge AI with LAMs, proposing distributed architectures and specific technical solutions for fine-tuning and inference tasks, while highlighting key challenges and promising future research avenues like task-oriented communication, neuromorphic computing, and the application of generative AI for network optimization.