PortAgent: LLM-driven VDS Adaptation
- PortAgent is an LLM-driven vehicle dispatching system for container terminals that leverages RAG, few-shot learning, and self-correction to overcome transfer challenges.
- The system’s Virtual Expert Team integrates knowledge retrieval, planning, coding, and debugging modules to ensure rapid, data-efficient deployment across heterogeneous terminal environments.
- Empirical validation demonstrates that one-shot prompting with a closed-loop self-correction workflow improves solver success rates while eliminating heavy reliance on port specialists.
PortAgent is an LLM-driven vehicle dispatching agent designed for Automated Container Terminals (ACTs) that automates the Vehicle Dispatching System (VDS) transfer workflow. The system addresses the longstanding challenge of poor VDS transferability across heterogeneous terminal environments, which stems from strong reliance on port specialists, high demands for terminal-specific data, and labor-intensive manual deployment. PortAgent eliminates these barriers by orchestrating a Virtual Expert Team (VET), leveraging Retrieval-Augmented Generation (RAG), few-shot example learning, and a closed-loop, role-prompted, self-correcting workflow to realize fast, data-efficient, fully automated VDS adaptation (Hu et al., 16 Dec 2025).
1. Transferability Challenges in VDS for Container Terminals
Automated Container Terminals deploy VDSs to coordinate fleets of Automated Guided Vehicles (AGVs) for container movement. Traditional VDSs exhibit poor transferability due to several intrinsic factors:
- Heterogeneous Network Topologies: Variations such as unidirectional loops vs. bidirectional grids require distinct algorithmic logic, rendering solutions non-generalizable.
- Diverse Resource Configurations: Differences in AGV fleet size, vehicle types, or crane placements compel substantial bespoke parameter tuning for each terminal.
- Terminal-Specific Operational Constraints: Vehicle prioritization schemes, route access rules, and speed limitations are highly customized, necessitating frequent dispatch logic rewrites.
Consequently, VDS transfer typically demands:
- Intensive involvement of engineers or operations research specialists,
- Large quantities of terminal-specific data, especially for ML- or RL-based solutions,
- Iterative manual deployment with repeated context handoffs.
PortAgent was developed to circumvent these bottlenecks, providing a scalable, rapid, and specialist-free approach to VDS transfer across terminal domains (Hu et al., 16 Dec 2025).
2. Virtual Expert Team Architecture
PortAgent decomposes the VDS transfer task into a chain of expert modules instantiated as role prompts within a single LLM. The architecture comprises four principal modules, collectively termed the Virtual Expert Team (VET):
| Module | Primary Function | Typical Output |
|---|---|---|
| Knowledge Retriever | RAG-driven retrieval of domain exemplars | Prompt with few-shot examples |
| Modeler | Chain-of-Thought VDS mathematical planning | JSON: “plan” (NL formulation), CoT |
| Coder | Maps plan to Python+Gurobi/Pyomo code | JSON: “code” (script), “reasoning” (CoT) |
| Debugger | Static analysis, sandbox execution, feedback | Validated code or correction instructions |
The VET operates on structured inputs (network/configuration/requirements in JSON) and cycles outputs through a feedback loop for iterative refinement. All roles are defined within the LLM via explicit role-prompting, orchestrated as a closed process without human-in-the-loop validation.
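As a concrete illustration, the module chain can be sketched as a JSON-passing pipeline. The `call_llm` helper and the payload field names below are hypothetical stand-ins for the role-prompted LLM calls, not the paper's actual interface:

```python
# Sketch of the VET message flow: each module consumes and emits
# JSON-structured payloads, cycling through Modeler -> Coder -> Debugger.
import json

def call_llm(role: str, payload: dict) -> dict:
    """Placeholder for a role-prompted LLM call; returns canned JSON here."""
    if role == "Modeler":
        return {"plan": "minimize total travel cost s.t. flow balance", "cot": "..."}
    if role == "Coder":
        return {"code": "print('solve')", "reasoning": "..."}
    return {"status": "ok", "feedback": None}

def vet_pipeline(env: dict) -> dict:
    prompt = {"environment": env}                   # structured JSON input
    plan = call_llm("Modeler", prompt)              # NL mathematical-programming plan
    code = call_llm("Coder", {**prompt, **plan})    # plan -> Python/solver script
    verdict = call_llm("Debugger", code)            # static analysis + sandbox feedback
    return {"plan": plan["plan"], "code": code["code"], "verdict": verdict["status"]}

result = vet_pipeline({"network": "grid", "agvs": 4})
print(json.dumps(result))
```

In the real system each `call_llm` invocation carries an explicit role definition in the prompt, and the Debugger's feedback is routed back into the pipeline rather than returned once.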
3. Retrieval-Augmented Generation and Few-Shot Learning
The core data efficiency of PortAgent derives from its knowledge base $\mathcal{K}$, which aggregates tuples $(E_i, P_i)$, pairs of environment descriptions and corresponding optimal VDS programs, alongside dictionaries of modeling primitives and full code exemplars. For a target environment $E^\ast$, a similarity function $\mathrm{sim}(E^\ast, E_i)$ ranks past instances, and the top-$k$ examples (typically $k = 1$) are incorporated into the input prompt as few-shot demonstrations. This RAG-driven process grounds the LLM's responses in domain-relevant precedents, minimizing terminal-specific data requirements.
Empirical results demonstrate that single-example (1-shot) prompting achieves optimal knowledge transfer, outperforming both zero- and three-shot approaches by maximizing solver success rates while minimizing output noise (Hu et al., 16 Dec 2025).
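The retrieval step can be sketched with cosine similarity over toy embedding vectors; the vectors, file names, and knowledge-base contents below are illustrative assumptions, not data from the paper:

```python
# Minimal sketch of RAG-style exemplar retrieval: rank knowledge-base
# entries by cosine similarity to the target environment's embedding.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# knowledge base: (environment-embedding, VDS program) pairs (toy values)
kb = [
    ([1.0, 0.0, 0.2], "program_loop_topology.py"),
    ([0.1, 1.0, 0.0], "program_grid_topology.py"),
    ([0.4, 0.4, 0.9], "program_mixed_topology.py"),
]

def retrieve(query_vec, k=1):
    ranked = sorted(kb, key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [prog for _, prog in ranked[:k]]

shots = retrieve([0.0, 0.9, 0.1], k=1)   # target resembles a grid topology
print(shots)
```

With $k = 1$, only the single closest precedent is injected into the prompt, matching the one-shot setting the paper finds optimal.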
4. Automated Self-Correction Workflow
The VET modules interoperate via a stepwise, self-correcting workflow inspired by the LLM Reflexion framework. The pipeline proceeds as follows:
- Input Handling: Parse environment, configuration, and requirements as structured JSON.
- Knowledge Augmentation: Retrieve relevant exemplars from the knowledge base using RAG.
- Modeling: Modeler generates a natural-language mathematical programming plan using chain-of-thought (CoT) reasoning.
- Coding: Coder converts the plan into executable Python invoking Gurobi (via Pyomo), also employing CoT.
- Debugging: Debugger performs static analysis (AST), sandboxed code execution, and error reflection, producing correction instructions if needed.
- Self-Correction Loop: If errors persist and the iteration count remains below the maximum $T_{\max}$, corrected prompts are dispatched back to the Modeler and Coder.
Convergence is reached when the code passes both static and dynamic validation and the produced Gurobi solution matches the ground truth within an acceptable tolerance $\epsilon$:

$$|z_{\text{agent}} - z^{\ast}| \le \epsilon$$

Default settings cap the loop at $T_{\max}$ iterations, with mean convergence achieved in 1.3–1.8 steps (Hu et al., 16 Dec 2025).
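The control flow of this loop can be sketched as below; `run_and_check` is a hypothetical stand-in for the sandboxed execution and objective-value comparison, and the iteration cap and tolerance values are illustrative:

```python
# Skeleton of the closed-loop self-correction workflow (Reflexion-style):
# iterate over successive LLM drafts until one validates or the cap is hit.
T_MAX = 3          # maximum correction iterations (illustrative value)
EPSILON = 1e-6     # objective-value tolerance against ground truth

def run_and_check(code: str, ground_truth: float):
    """Sandbox stand-in: 'solves' and returns (objective, error message)."""
    try:
        obj = float(code)                 # pretend the script yields its objective
    except ValueError:
        return None, "SyntaxError: invalid model"
    err = None if abs(obj - ground_truth) <= EPSILON else "objective mismatch"
    return obj, err

def self_correct(drafts, ground_truth):
    for t, code in enumerate(drafts[:T_MAX], start=1):
        obj, err = run_and_check(code, ground_truth)
        if err is None:                   # static + dynamic validation passed
            return {"iterations": t, "objective": obj}
        # in the real system, `err` is reflected back into the Modeler/Coder prompts
    return {"iterations": T_MAX, "objective": None}   # failed to converge

# draft 1 is malformed; draft 2 matches the ground-truth objective 42.0
result = self_correct(["not-a-number", "42.0"], ground_truth=42.0)
print(result)
```

The key design choice is that the error message itself becomes part of the next prompt, so each iteration is conditioned on the previous failure rather than sampled fresh.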
5. Implementation Specifics
PortAgent is instantiated with the following hardware and software configuration:
- LLM: Google Gemini 2.5 Flash, with temperature fixed low for deterministic output, accessed via the standard API.
- Role Prompting: Explicit role definitions (“You are ... Modeler/Coder/Debugger”).
- Retrieval: Embedding-based similarity for in-context example selection (top-$k$ with $k = 1$ by default).
- I/O Schema: All module communication via JSON-structured prompts and responses.
- Code Execution: Static analysis via Python AST; isolated Python 3.9 interpreter with Gurobi 12.0.3 and Pyomo for optimization.
- Hardware: Intel i5-13500H CPU, 16 GB RAM.
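The Debugger's static pass can be illustrated with Python's standard `ast` module, parsing candidate code and flagging syntax errors or disallowed calls before any sandboxed execution. The specific policy list below is an assumption for illustration:

```python
# Sketch of an AST-based static check: reject code that fails to parse
# or that invokes calls the sandbox policy forbids.
import ast

FORBIDDEN_CALLS = {"eval", "exec", "open"}   # illustrative sandbox policy

def static_check(source: str):
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    issues = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                issues.append(f"forbidden call: {node.func.id}")
    return issues

print(static_check("x = eval('1+1')"))   # flags the eval call
print(static_check("def f(:"))           # reports the syntax error
```

Only code that returns an empty issue list would proceed to the isolated interpreter for dynamic validation against the solver.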
6. Empirical Validation and Results
Evaluation uses Multi-AGV Path Planning (MAPP) as the reference VDS class on directed graphs $G = (V, E)$, with an integer programming objective to minimize total vehicle travel cost subject to flow-balance and discrete movement constraints:

$$\min \sum_{v \in \mathcal{V}} \sum_{(i,j) \in E} c_{ij}\, x^{v}_{ij}$$

subject to flow conservation at each node and $x^{v}_{ij} \in \{0, 1\}$ for all vehicles $v$ and arcs $(i, j)$.
Test scenarios include (i) road closures, (ii) roads forbidden to specific vehicles, and (iii) designated routes for dangerous goods, spanning 45 cases in total, benchmarked against manually coded Gurobi solutions.
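A tiny brute-force analogue of scenario (i) shows what the generated solver must respect: find the minimum-cost route on a directed graph while honoring a road closure. The graph, costs, and closure below are illustrative, not taken from the paper's test cases:

```python
# Brute-force minimum-cost path on a tiny digraph with a closed arc:
# enumerates simple paths, which is feasible only at this toy scale.
from itertools import permutations

edges = {("A", "B"): 2, ("B", "D"): 2, ("A", "C"): 1, ("C", "D"): 1}
closed = {("A", "C")}   # road closure removes this (otherwise cheapest) arc

def best_path(src, dst):
    nodes = {n for e in edges for n in e}
    inner = [n for n in nodes if n not in (src, dst)]
    candidates = [(src, dst)] + [
        (src, *mid, dst)
        for r in range(1, len(inner) + 1)
        for mid in permutations(inner, r)
    ]
    best = (float("inf"), None)
    for path in candidates:
        arcs = list(zip(path, path[1:]))
        if all(a in edges and a not in closed for a in arcs):
            cost = sum(edges[a] for a in arcs)
            if cost < best[0]:
                best = (cost, path)
    return best

cost, path = best_path("A", "D")
print(cost, path)   # closure forces A->B->D at cost 4 instead of A->C->D at cost 2
```

In the actual benchmark the equivalent constraint is encoded in the integer program (the closed arc's variables are fixed to zero), and Gurobi handles graphs far beyond enumeration scale.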
Quantitative results:
| Metric | Value |
|---|---|
| CER | 100% |
| SSR | 93.33% (86.67–100%) |
| Avg. deploy time | 83.23 s |
| Difference vs. specialist code | Not statistically significant |
Ablation studies indicate a drastic SSR/CER decline without RAG (SSR: 26.7%, CER: 40%) or without self-correction (SSR/CER: 33.3%). One-shot prompting consistently outperforms both zero- and three-shot settings (Hu et al., 16 Dec 2025).
7. Limitations and Prospects
The principal failure mode for PortAgent is semantic misinterpretation: ambiguous natural-language requirements or LLM stochasticity can produce logically invalid, but syntactically correct, code. Future enhancements proposed include:
- Disambiguation dialogue engines or constraints-elicitation submodules to clarify user requirements,
- Consistency-enforcing approaches such as fine-tuning or retrieval-based reranking for deterministic output,
- Extension to more complex VDS classes (e.g., dynamic rerouting, joint quay-yard coordination) and continuous adaptation in streaming environments.
The VET modularization and minimal-example RAG are essential for PortAgent’s reliability and data efficiency in commercial-scale terminal deployments (Hu et al., 16 Dec 2025).