SpaceLM: LLM-Based Spacecraft Controller
- SpaceLM is a controller framework that fine-tunes decoder-only transformers (e.g., Llama-2 models) to output high-precision numeric control vectors for spacecraft guidance.
- It employs low-rank adaptation (LoRA) and supervised training with expert trajectories, achieving near-optimal performance with significantly fewer data points than traditional DNN controllers.
- The model supports real-time inference with low latency, robust safety checks, and versatile applications in tasks like low-thrust transfers, cislunar navigation, and powered descent guidance.
SpaceLM is a family of foundation-model-based controllers for space systems, leveraging fine-tuned LLMs, specifically decoder-only transformers, to solve a range of spacecraft control, guidance, and trajectory optimization tasks using string-based numeric outputs. SpaceLM demonstrates that general-purpose LLMs, such as Llama-2-7B and Llama-2-13B, can be efficiently adapted via supervised fine-tuning to output high-precision control vectors for diverse spaceflight problems, with data efficiency, generalization, and real-time deployment properties that are competitive with or superior to task-specific deep neural networks (Zucchelli et al., 28 Jan 2025).
1. Model Architecture and Numeric Encoding
SpaceLM is built on open-source decoder-only transformer architectures without modification to their attention or feed-forward structures. The core components include:
- Base Model: Tested variants include Llama-2-7B (7×10⁹ parameters, embedding dim ~4096, 32 layers) and Llama-2-13B (13×10⁹ parameters, embedding dim ~5120, 40 layers).
- Low-Rank Adaptation (LoRA): Each parameter matrix $W$ is augmented with a low-rank update, $W' = W + \frac{\alpha}{r} BA$ with trainable $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times d}$, where the rank $r$ differs between the 7B and 13B variants. This enables efficient supervised fine-tuning without requiring full model retraining.
- Tokenization and Numeric I/O: The tokenizer must include all characters necessary for high-precision floating-point output ("0–9", "-", ".", "[", "]", ",", " "). Each numeric output is formatted as a string with up to 10 significant digits, e.g., "0.0354481279". Consistent zero padding enhances alignment and prediction performance.
- Mixed-Precision Serving: The fine-tuned SpaceLM can be quantized to 16-bit or 8-bit precision using GPTQ-style or block-wise methods with negligible loss in accuracy.
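The string-based numeric I/O described above can be sketched as follows; `format_control` and `parse_control` are illustrative helper names (not from the paper), and the fixed 10-decimal convention mirrors the "0.0354481279" example above.

```python
def format_control(u, decimals=10):
    """Serialize a control vector to a bracketed string, e.g. "[0.0354481279, ...]".

    A fixed decimal width keeps every number the same length, giving the
    consistent zero padding that aids token alignment.
    """
    return "[" + ", ".join(f"{x:.{decimals}f}" for x in u) + "]"


def parse_control(s):
    """Inverse of format_control: recover the float vector from the model's text output."""
    return [float(tok) for tok in s.strip("[]").split(",")]
```

Round-tripping through these two functions is lossless to well below the precision the controller needs, which is what makes a text interface viable for numeric control.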
2. Fine-Tuning Protocols and Task Coverage
The fine-tuning process is entirely supervised, using trajectories generated by classical optimal-control solvers as "expert" demonstrations. Each data sample is presented as a state-action text sequence that maps spacecraft state prompts to controlled outputs, with loss applied to token-level cross-entropy (for text output) and optionally direct regression on parsed numeric values.
Specific tasks addressed include:
- 3D Linear Spring Toy: LTI state dynamics with infinite-horizon LQR expert control; prompt: "State: r=[…] v=[…] → Control: [uₓ, u_y, u_z]"
- Low-Thrust Orbit Transfer: Nonlinear gravitational dynamics with a minimum-energy transfer cost; prompt: "Input: pos=[…] vel=[…] → Output: thrust=[…, …, …]"
- Cislunar Linear Transfer (CR3BP Linearization): Linearized dynamics in the CR3BP rotating frame with linearly constrained control, using a discrete token encoding for the control sign.
- Powered Descent Guidance: Losslessly convexified 3-DoF descent solved via SOCP, with full state-to-thrust output at up to 11 significant digits.
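A minimal sketch of how one expert demonstration could be serialized into a (prompt, target) text pair for supervised fine-tuning, following the spring-toy template above; `make_sample` is a hypothetical helper, not an API from the paper.

```python
def make_sample(r, v, u, decimals=10):
    """Turn one expert state-action pair into a (prompt, target) text pair.

    r, v: position and velocity vectors of the spacecraft state;
    u: the expert control produced by a classical optimal-control solver.
    Token-level cross-entropy would then be applied to the target string.
    """
    fmt = lambda xs: "[" + ", ".join(f"{x:.{decimals}f}" for x in xs) + "]"
    prompt = f"State: r={fmt(r)} v={fmt(v)} -> Control: "
    return prompt, fmt(u)
```

During training, the model conditions on the prompt and is supervised only on the target digits, so the same pipeline works unchanged across tasks that differ only in their prompt template.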
Fine-tuning employs the AdamW optimizer with weight decay and a small fixed learning rate, the LoRA ranks noted above, batch sizes of 1–4 (with gradient accumulation), and 800k update steps per task (summed for multitask fine-tuning).
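The LoRA update used throughout can be sketched numerically as below; the dimensions, rank, and scaling are illustrative placeholders (the paper's exact ranks are not reproduced here), with NumPy standing in for the transformer's weight matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 16                 # illustrative embedding dim, LoRA rank, scaling

W = rng.standard_normal((d, d))          # frozen pretrained weight matrix
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialized


def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B receive
    # gradient updates during fine-tuning, so the base model stays untouched.
    return W @ x + (alpha / r) * (B @ (A @ x))
```

Because `B` starts at zero, the adapted model initially reproduces the base model exactly, the standard LoRA initialization; fine-tuning then moves only the small `A` and `B` factors.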
3. Data Efficiency and Generalization
SpaceLM exhibits strong data efficiency compared to task-specific DNNs and high robustness to distribution shift:
- 3D Spring Toy: Achieves a cost ratio $J/J^*$ near unity (i.e., near-optimal control) with just 30 trajectories, where typical DNNs need an order of magnitude more; accuracy degrades only gradually on out-of-distribution initial states.
- Orbit Transfer: Reaches small final-position RMSE (measured in distance units, DU) with 400–1600 trajectories; the 13B model is more sample-efficient than the 7B in low-data regimes.
- Cislunar Transfer: Mean final distance error of roughly 6 km with 1000 trajectories; accuracy degrades smoothly as data volume is reduced.
- Powered Descent, Multitasking: Joint fine-tuning across landing and orbit transfer costs only about 5% performance on landing and 2% on orbit transfer versus single-task training, with thrust constraints met about 95% of the time.
Relative data requirements and generalization:
| Model/Task | Data for Near-Optimality | OOD Robustness | Multi-task Degradation |
|---|---|---|---|
| SpaceLM | 30–400 trajectories | Slow RMSE increase | ~5% loss |
| DNN Controller | 100–1600 trajectories | RMSE roughly doubles under bias | Task-specific only |
For orbit transfer, LLMs succeed in 100/100 test cases versus 83/100 for classical shooting methods.
4. Performance Metrics and Empirical Results
SpaceLM controllers are evaluated with problem-specific metrics, including the output cost ratio $J/J^*$, RMSE of the final state or position, and compliance with physical or mission constraints:
- 3D spring toy: Cost ratio $J/J^*$ approaches unity even with small training sets.
- Low-thrust transfer: Small final-position RMSE (in DU), matching or exceeding classical optimal controllers while using less data.
- Cislunar linearization: Mean final error of roughly 6 km (1000 trajectories).
- Powered descent: Final position/velocity RMSE within 2% of optimal; thrust-constraint violations under 5%.
- Multitask SpaceLM: Minor performance loss compared to independent models, strong positive transfer effect.
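The two headline metrics above can be computed directly; these helpers are illustrative sketches, not code from the paper.

```python
import numpy as np


def cost_ratio(J_model, J_optimal):
    """Output cost ratio J / J*; values near 1.0 indicate near-optimal control."""
    return J_model / J_optimal


def final_state_rmse(predicted, reference):
    """RMSE between predicted and reference final states (e.g., position in DU)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))
```

Both metrics are computed per trajectory against the expert solver's solution and then aggregated over the test set.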
5. Deployment, Inference, and Safety
- Inference Loop: At runtime, the spacecraft state is serialized into a text prompt (on the order of 200 tokens). Model inference (decoding the numeric output) takes 1–20 ms per step on an A100-class GPU with 8-bit quantization. Demonstrated control-loop frequencies are 10–50 Hz, with end-to-end latency of 30–100 ms.
- Safety Protocols:
- Monte-Carlo validation on out-of-distribution trajectories to ensure bounded error and cost.
- Enforced rejected-action protocols: any infeasible or unsafe control output triggers fallback to a classical controller.
- Continual prediction-actual consistency checks, with triggers for re-fine-tuning or controller reversion on drift.
- Cross-validation with parallel classical controllers recommended for high-assurance missions.
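The safety protocols above suggest a simple guarded inference loop; `llm_infer` and `classical_controller` are stand-ins for the fine-tuned model and the fallback controller, and the feasibility check (a thrust-magnitude bound) is a minimal illustration of the rejected-action protocol.

```python
def control_step(state_prompt, llm_infer, classical_controller, u_max):
    """One guarded control step: query the LLM, validate its output, fall back if unsafe."""
    raw = llm_infer(state_prompt)
    try:
        u = [float(tok) for tok in raw.strip("[]").split(",")]
    except ValueError:
        return classical_controller(state_prompt)  # unparseable text -> classical fallback
    if len(u) != 3 or any(abs(c) > u_max for c in u):
        return classical_controller(state_prompt)  # infeasible control -> classical fallback
    return u
```

A real mission loop would add the consistency checks and drift triggers listed above, but the core pattern stays the same: the LLM output is never actuated without passing an explicit feasibility gate.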
6. Comparison to Conventional Controllers
- Data Requirement: SpaceLM achieves near-optimal control policy learning with an order of magnitude less data than standard DNNs.
- Generalization: Robustness to initial-state distribution and multi-task transfer is superior to classical architectures.
- Deployment: SpaceLM requires no architectural changes, relies only on LoRA-based adaptation, and is easily quantized.
- Latency: Real-time inference (30–100 ms) covers the requirements for guidance and navigation loops in space systems.
7. Significance and Applications
SpaceLM establishes that foundation-model-based LLMs—after modest data-efficient LoRA-based fine-tuning—can serve as real-time, generalizing, multitask controllers for a variety of space system problems, matching or exceeding DNN and classical controller benchmarks on data efficiency and deployment practicality. The architecture supports direct numeric output via text strings and exhibits strong inductive bias toward learning high-precision control laws from relatively small expert datasets.
Demonstrated use cases include low-thrust trajectory optimization, cislunar transfer, powered descent, and hybrid multitask scenarios, supporting robust deployment, safety validation, and fallback fail-safes. This suggests a path toward unified, adaptable controller models for autonomous spacecraft and complex mission profiles (Zucchelli et al., 28 Jan 2025).