Tropical: Enhancing SLO Attainment in Disaggregated LLM Serving via SLO-Aware Multiplexing

Published 15 Jun 2026 in cs.DC | (2606.16264v1)

Abstract: To guarantee service quality in transformer-based LLM serving, it is essential to meet the latency constraints of both the prefill phase (measured by Time-to-First-Token, TTFT) and the decode phase (measured by Time-per-Output-Token, TPOT). Non-disaggregated serving places prefill and decode on the same worker, while disaggregated serving places prefill and decode on isolated workers. However, neither architecture excels in both TTFT and TPOT. A root cause analysis shows that in disaggregated LLM serving, prefill execution interferes minimally with decode execution but results in high queuing times. In contrast, non-disaggregated LLM serving effectively reduces queuing times but introduces significant interference between prefills and decodes. To combine the strengths of non-disaggregated and disaggregated LLM serving, we have designed and implemented Tropical. Tropical introduces a service-level objective (SLO)-aware multiplexing strategy that balances queuing time against interference, enabling LLM serving to achieve high TTFT and TPOT SLO attainment simultaneously. Our evaluation on real-world datasets shows that Tropical outperforms both state-of-the-art non-disaggregated and disaggregated LLM serving systems, serving up to 2.09× more requests at 90% SLO attainment. Specifically, compared to the disaggregated LLM serving system, Tropical improves P90 TTFT by 9× with only a 15% reduction in P90 TPOT performance. Against non-disaggregated LLM serving systems, Tropical delivers a 2.8× improvement in P90 TPOT while maintaining the same P90 TTFT.
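
To make the metrics in the abstract concrete, the following is a minimal sketch of how TTFT, TPOT, and SLO attainment might be computed from per-request timestamps. It is not taken from the paper: the `RequestTrace` type, the function names, and the choice to count a request as attaining its SLO only when both the TTFT and TPOT limits are met are illustrative assumptions, and the threshold values in the usage note are arbitrary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical per-request timing record (all times in seconds)."""
    arrival: float       # when the request reached the serving system
    first_token: float   # when the first output token was emitted
    finish: float        # when the last output token was emitted
    output_tokens: int   # number of generated tokens (>= 1)


def ttft(r: RequestTrace) -> float:
    # Time-to-First-Token: includes queuing time plus prefill execution.
    return r.first_token - r.arrival


def tpot(r: RequestTrace) -> float:
    # Time-per-Output-Token: average gap between tokens after the first one.
    if r.output_tokens <= 1:
        return 0.0
    return (r.finish - r.first_token) / (r.output_tokens - 1)


def slo_attainment(traces: List[RequestTrace],
                   ttft_slo: float,
                   tpot_slo: float) -> float:
    # Assumed definition: fraction of requests meeting BOTH latency limits.
    if not traces:
        return 0.0
    met = sum(1 for r in traces
              if ttft(r) <= ttft_slo and tpot(r) <= tpot_slo)
    return met / len(traces)
```

For example, with a TTFT limit of 1.0 s and a TPOT limit of 0.25 s, a request that waits 3 s in the prefill queue fails its SLO even if its decode is fast, and a request whose decode is slowed by interference to 0.5 s per token fails even if its first token arrives quickly. These are the two failure modes the abstract attributes to disaggregated and non-disaggregated serving, respectively.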
