DScheLLM: Dual-System Dynamic Scheduling
- DScheLLM is a dynamic job shop scheduling framework that uses a dual-system LLM with fast and slow reasoning modes to manage production disruptions.
- It integrates LoRA fine-tuning on a Huawei OpenPangu Embedded-7B backbone and supports both natural-language and standardized schedule representations for OR solvers.
- Experimental evaluations show robust feasibility, rapid schedule generation, and effective handling of disturbances such as machine failures and job cancellations.
DScheLLM is a dynamic job shop scheduling framework that employs a fine-tuned LLM within a dual-system (fast-slow) reasoning architecture to tackle dynamic disruptions in production environments. It is built on the Huawei OpenPangu Embedded-7B backbone, augmented with Low-Rank Adaptation (LoRA) for specialization, and leverages both natural-language and standardized schedule representations. DScheLLM represents one of the earliest applications of LLMs to adaptive scheduling under dynamic disturbances, demonstrating robust feasibility, rapid schedule generation, and compatibility with operations research (OR) solvers (Zhang et al., 14 Jan 2026).
1. Formal Problem Setup: Dynamic Job Shop Scheduling
DScheLLM is designed to resolve the dynamic job-shop scheduling problem subject to multiple disturbance types, including processing-time variation, machine assignment change, machine maintenance/failure, job insertion, and job cancellation. The mathematical formulation employs the following sets, parameters, and decision variables:
- $\mathcal{J}$ denotes the job set; $\mathcal{M}$ the machine set; $\mathcal{O}_j$ the ordered operations of job $j \in \mathcal{J}$; $\mathcal{E}$ the dynamic event set.
- Parameters comprise nominal and event-modified processing times, designated and event-altered machine assignments, and time intervals for machine maintenance.
- Decision variables include operation start/finish times $s_{j,k}, c_{j,k}$ and sequencing binaries encoding operation order on each machine.
Dynamic event handling updates processing times and machine assignments per event. The makespan objective is formulated as:

$$\min C_{\max}, \qquad C_{\max} = \max_{j \in \mathcal{J}} c_{j,|\mathcal{O}_j|},$$

where $c_{j,|\mathcal{O}_j|}$ is the completion time of the final operation of job $j$.
Constraints cover job precedence, machine capacity, and dynamic event exclusions such as maintenance intervals.
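These constraints can be illustrated with a minimal schedule checker. The data layout below (operations as `(job, op_index, machine, start, end)` tuples) is a hypothetical convention for illustration, not the paper's implementation:

```python
# Minimal feasibility check for a job-shop schedule under maintenance windows.
# Illustrative data layout: each operation is (job, op_index, machine, start, end).

def feasible(ops, maintenance):
    """ops: list of (job, k, machine, start, end); maintenance: {machine: [(a, b), ...]}."""
    # Job precedence: operation k of a job must finish before operation k+1 starts.
    by_job = {}
    for job, k, m, s, e in ops:
        by_job.setdefault(job, []).append((k, s, e))
    for seq in by_job.values():
        seq.sort()
        for (_, _, e_prev), (_, s_next, _) in zip(seq, seq[1:]):
            if s_next < e_prev:
                return False
    # Machine capacity: at most one operation at a time per machine,
    # and no overlap with maintenance intervals on that machine.
    by_machine = {}
    for job, k, m, s, e in ops:
        by_machine.setdefault(m, []).append((s, e))
    for m, ivals in by_machine.items():
        ivals += maintenance.get(m, [])
        ivals.sort()
        for (_, e_prev), (s_next, _) in zip(ivals, ivals[1:]):
            if s_next < e_prev:
                return False
    return True

def makespan(ops):
    """C_max: latest completion time over all operations."""
    return max(e for *_, e in ops)
```

A maintenance interval is handled simply as an extra blocked interval on the affected machine, which mirrors how the dynamic-event exclusions enter the formal model.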
2. Dual-System LLM Reasoning Architecture
DScheLLM extends its pretrained model by introducing two LoRA-fine-tuned subspaces to emulate “fast” and “slow” reasoning modes:
- Fast-Thinking ("quick think"): Optimized for single, minor disturbances. Inputs consist of the current schedule, one event, minimal constraints, and a special `/no_think` tag. Outputs are immediate, locally adjusted schedules formatted as per-machine operation lists. Intended latency is sub-second.
- Slow-Thinking ("stepwise think"): Engineered for multi-event, major disruptions. Inputs aggregate the schedule, multiple events, and optionally a `/auto_think` tag (or no tag). Outputs include chain-of-thought traces and a standardized job shop problem (JSP) representation delimited by `[unused17]`, suitable for direct ingestion by an OR-Tools solver. Typical latency is several seconds plus solver computation.
Mode selection is user-controlled (via tag) or performed by an internal classifier. FAST mode is characterized by low latency and local adjustment; SLOW mode guarantees solver compatibility and complete schedule recomputation.
3. Dataset Generation and Fine-Tuning Protocol
Training data is synthesized using exact solutions from an OR-Tools CP-solver across randomly instantiated JSPs, perturbed with dynamic events for comprehensive coverage:
- Minor event data: Solver computes new schedule; recorded for FAST fine-tuning.
- Major event data: Human processor standardizes JSP and event description, solver generates optimal schedule, with chain-of-thought trace and standardized input/output stored for SLOW fine-tuning.
Each reasoning mode receives its own set of training samples. LoRA adaptation is applied at a fixed rank and scaling factor, with 0.05 dropout, across all key projection layers. Training uses cross-entropy loss (computed on response tokens only, excluding prompt tokens), the Adam optimizer, 3 epochs, and FP16 precision on 8 Ascend NPUs.
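The LoRA update underlying this fine-tuning step can be sketched in a few lines. The rank, scaling factor, and toy matrix shapes below are illustrative placeholders, not the paper's reported hyperparameters:

```python
# LoRA: a frozen weight W is augmented with a low-rank delta (alpha / r) * B @ A,
# where only A (r x d_in) and B (d_out x r) are trained. Shapes here are toy-sized.

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_weight(W, A, B, r, alpha):
    """Effective weight W + (alpha / r) * (B @ A)."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
```

Because the delta is rank-$r$, each adapted projection layer trains only $r \cdot (d_{\text{in}} + d_{\text{out}})$ parameters, which is what makes maintaining two separate fast/slow subspaces on one backbone cheap.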
4. Inference Algorithm and Dynamic Mode Switching
DScheLLM inference involves building an NL prompt encoding the schedule and events, appending a mode tag, and routing through the appropriate LoRA pathway:
```
function DScheLLM_Infer(current_schedule, events, mode_tag=None):
    prompt ← format_schedule_and_events(current_schedule, events)
    if mode_tag is "no_think":
        reasoning_mode ← FAST
    elif mode_tag is "auto_think":
        mode_decision ← DScheLLM.classify_mode(prompt)
        reasoning_mode ← mode_decision
    else:
        reasoning_mode ← SLOW

    if reasoning_mode == FAST:
        prompt ← prompt + " /no_think"
        output ← DScheLLM.generate(prompt, LoRA_subspace="fast")
        return parse_schedule(output)
    else:  # SLOW
        prompt ← prompt + " /auto_think"
        chain, std_jsp ← DScheLLM.generate_chain_and_std(prompt, LoRA_subspace="slow")
        solver_input ← parse_standardized_jsp(std_jsp)
        new_schedule ← ORToolsSolve(solver_input)
        return chain, new_schedule
```
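The `parse_standardized_jsp` step might look like the following sketch. The line format (one job per line, alternating machine-id/processing-time integers, as in classic JSP benchmark files) is an assumption; the paper's exact `[unused17]`-delimited schema is not reproduced here:

```python
def parse_standardized_jsp(text):
    """Parse an assumed standardized JSP block: one job per line,
    alternating machine-id / processing-time integers."""
    jobs = []
    for line in text.strip().splitlines():
        nums = [int(tok) for tok in line.split()]
        # Pair up (machine, duration) for each operation of this job, in order.
        jobs.append(list(zip(nums[0::2], nums[1::2])))
    return jobs
```

For example, the two-line block `"0 3 1 2"` / `"1 2 0 4"` would parse to two jobs of two operations each, ready to be handed to an OR-Tools model builder.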
5. Experimental Evaluation: Benchmarks and Results
Empirical assessment utilizes the FT06 (Fisher–Thompson JSP) benchmark with 6 machines and 6 jobs. Each scenario is tested with 30 event-driven instances. Evaluation metrics include feasibility rate, optimality rate (vs. OR-Tools optimum), makespan gap, and inference time.
| Mode | Feasibility Rate (%) | Optimality Rate (%) | Mean Inference Time |
|---|---|---|---|
| Fast-Thinking | 73.33 | 46.67 | ≈ 120 ms |
| Slow-Thinking | 100 | 100 | ≈ 2 s (LLM) + 50 ms (solver) |
Automatic mode selection achieves 100% accuracy on fast (minor) problems but only 33.3% on slow (complex) ones, frequently misclassifying the latter. Compared with cold-start OR-Tools invocation (≈50 ms per instance, but requiring manual problem parsing), the DScheLLM+OR-Tools pipeline streamlines natural-language-driven invocation and minimizes human intervention.
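The reported metrics are straightforward to compute from per-instance outcomes. A minimal sketch, with an assumed result-tuple layout (feasibility flag, achieved makespan, solver optimum):

```python
def evaluate(results):
    """results: list of (feasible: bool, makespan: int or None, optimum: int)."""
    n = len(results)
    feas = sum(1 for f, _, _ in results if f)
    opt = sum(1 for f, c, c_star in results if f and c == c_star)
    # Mean relative makespan gap over feasible instances: (C - C*) / C*.
    gaps = [(c - c_star) / c_star for f, c, c_star in results if f]
    return {
        "feasibility_rate": 100.0 * feas / n,
        "optimality_rate": 100.0 * opt / n,
        "mean_gap": sum(gaps) / len(gaps) if gaps else None,
    }
```

Restricting the gap average to feasible instances is one possible convention; infeasible outputs could alternatively be penalized, which would change the reported mean gap.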
6. Adaptability, Limitations, and Future Extensions
DScheLLM accommodates all five disturbance types and unseen combinations without event-specific code changes. The chain-of-thought methodology supports interpretability and collaborative decision-making. Limitations include unreliable automatic mode selection for complex scenarios, no guarantee of global optimality in FAST mode, and scalability restricted to small and medium JSP instances.
Future pathway suggestions include extending the system to flexible/flow-shop and multi-objective scheduling, calibration or reinforcement learning for improved mode discrimination, hierarchical decomposition to handle larger schedules, and further LoRA or RL-driven refinement for the FAST reasoning mode.
DScheLLM demonstrates that dual-system LLMs, fine-tuned on exact solver data, offer a unified, interpretable, and adaptive scheduling assistant paradigm, positioning LLMs as viable tools for dynamic, intelligent scheduling optimization in manufacturing environments (Zhang et al., 14 Jan 2026).