Two-Type Server Queueing Systems

Updated 27 January 2026

Two-Type Server Queueing Systems are models where jobs and servers are partitioned into two distinct types, each following specialized service disciplines like retrial, tandem, collaborative, polling, and FCFS-ALIS.
The analysis employs multidimensional probability generating functions, boundary-value problems, and Laplace–Stieltjes transforms to derive steady-state distributions and characterize delay asymptotics.
Optimal control policies typically take the form of threshold rules and deterministic assignment rules that trade off service priorities against delay in heterogeneous networks.

A two-type server queueing system consists of jobs and servers partitioned into two distinct types or functional roles. Systems of this class arise in retrial queues, tandem queues with server-type options, collaborative service networks, priority polling systems, and FCFS-compatibility graphs. These models encompass both single-server priority mechanisms with retrials and parallel many-server architectures with dynamic matching under compatibility graphs. The core analytical challenge is to characterize steady-state distributions, delay asymptotics, and optimal control policies, accounting for interactions between the two server/job types under diverse service disciplines and arrival processes.

1. System Architectures and Model Formulations

Two-type server queueing system architectures can be broadly categorized into retrial systems, tandem (clearing) systems with assignment control, collaborative-service networks, polling models with priorities, and parallel compatibility networks.

Retrial queues with priority: The $M_1,M_2/G_1,G_2/1$ retrial queue (Liu et al., 2019) has Poisson arrivals of type-1 and type-2 jobs with non-preemptive priority for type-1. Blocked type-2 customers join an infinite-capacity orbit and retry for service via a Poisson process.
Tandem/clearing systems with two-stage or collaborative assignment: In "Control policies for a two-stage queueing system with parallel and single server options" (Lu et al., 20 Jan 2026), two servers provide phase I service; after each phase I completion, the controller must choose between parallel and single-server phase II service.
Collaborative network models: Type-I "flexible" servers can process jobs independently or in collaboration with dedicated Type-II servers; optimal assignment follows a dynamic threshold policy (Lu et al., 20 Jan 2026).
Polling models with priority: Systems consist of two queues with two-level priorities in Q₁, cyclic or Markovian server routing, and exhaustive/gated service disciplines (Boon et al., 2014, Chu et al., 2014).
Compatibility graph FCFS-ALIS models: Multiple server/customer types are matched according to a bipartite compatibility graph under FCFS-ALIS assignment (Adan et al., 2016).

Table 1 summarizes key model mechanisms:

System Type	Server Types / Roles	Job/Customer Types	Queueing Mechanism
Priority Retrial	Single server	Type-1, Type-2	FCFS, orbit retrials
Tandem/clearing	Flexible, dedicated	Homogeneous	Two-stage assignment
Collaborative	Type-I, Type-II	Homogeneous	Independent/collab.
Polling (priority)	Single, cyclic routing	High, Low, Type-2	Exhaustive/gated
Parallel FCFS-ALIS	Multiple, compatibility	Multiple	FCFS-ALIS matching

2. Queueing Disciplines, Service Assignment, and Priority Structures

Service assignment and priority discipline play fundamental roles in two-type systems:

Non-preemptive priority: In retrial queues (Liu et al., 2019), type-1 customers are always chosen over type-2 if waiting, but ongoing service is never interrupted.
Dynamic server assignment: Post phase-I completion, a control policy determines whether a job proceeds to parallel service or waits for a single-server higher-rate facility (Lu et al., 20 Jan 2026). The optimal policy exhibits threshold behavior based on system state and service rates.
Collaborative decisions: Flexible servers choose between immediate independent service or waiting for dedicated collaboration. The optimal assignment uses a single threshold in queue length before first interaction, informed by cost/service rate ratios (Lu et al., 20 Jan 2026).
Polling service disciplines: Gated, globally gated, and exhaustive service affect waiting-time distributions, cycle times, and queue-length statistics for high- vs. low-priority customers (Boon et al., 2014, Chu et al., 2014).
Compatibility-based assignment: In FCFS-ALIS models, assignment depends on both server and customer type compatibility. Matching rates are determined by system capacity fractions and arrival rates (Adan et al., 2016).
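The non-preemptive priority rule with orbit retrials can be illustrated by a small event-driven simulation. The sketch below is not the model of Liu et al. (2019): it assumes exponential service times and independent retrials at rate `theta` per orbiting customer, and the function name and parameters are hypothetical. When the system is stable, the idle fraction should be close to $1-\lambda_1/\mu_1-\lambda_2/\mu_2$, which gives a simple sanity check.

```python
import random


def simulate_priority_retrial(lam1, lam2, mu1, mu2, theta, n_events, seed=0):
    """Simulate a single-server retrial queue with non-preemptive priority.

    Illustrative assumptions: exponential service (rates mu1, mu2) and
    independent retrials at rate theta per orbiting type-2 customer.
    Returns (idle fraction, mean type-1 queue length, mean orbit size).
    """
    rng = random.Random(seed)
    in_service = 0  # 0 = idle, 1 or 2 = type of the customer in service
    queue1 = 0      # type-1 customers waiting in the priority queue
    orbit = 0       # blocked type-2 customers in the orbit
    total_time = idle_time = queue1_area = orbit_area = 0.0

    for _ in range(n_events):
        service_rate = (0.0, mu1, mu2)[in_service]
        # Retrials only change the state when the server is idle.
        retrial_rate = theta * orbit if in_service == 0 else 0.0
        total_rate = lam1 + lam2 + service_rate + retrial_rate

        # Hold the current state for an exponential time, then accumulate.
        dt = rng.expovariate(total_rate)
        total_time += dt
        queue1_area += queue1 * dt
        orbit_area += orbit * dt
        if in_service == 0:
            idle_time += dt

        u = rng.random() * total_rate
        if u < lam1:                  # type-1 arrival
            if in_service == 0:
                in_service = 1
            else:
                queue1 += 1
        elif u < lam1 + lam2:         # type-2 arrival
            if in_service == 0:
                in_service = 2
            else:
                orbit += 1            # blocked: join the orbit
        elif in_service != 0:         # service completion, never preempted
            if queue1 > 0:            # a waiting type-1 customer goes first
                queue1 -= 1
                in_service = 1
            else:
                in_service = 0
        elif orbit > 0:               # successful retrial from the orbit
            orbit -= 1
            in_service = 2

    return idle_time / total_time, queue1_area / total_time, orbit_area / total_time
```

For example, `simulate_priority_retrial(0.3, 0.2, 1.0, 1.0, 1.0, 200000)` corresponds to a total load of 0.5, so roughly half of the simulated time should be idle.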

3. Performance Metrics, Generating Functions, and Analytical Tools

Two-type systems require multi-dimensional probability generating functions (PGFs), boundary-value problems, and explicit asymptotic analysis:

Steady-state distributions: In retrial models, joint PGFs for queue and orbit sizes conditional on server status are derived. Stochastic decomposition expresses these as sums of independent simpler blocks, aiding in tail asymptotic analysis (Liu et al., 2019).
Functional equations and boundary-value reduction: Two-input/stream retrial systems and two-class retrials are analyzed using multi-dimensional PGFs that satisfy functional equations reduced to Riemann boundary-value problems (Avrachenkov et al., 2012, Dimitriou, 2018).
Cycle and waiting time LSTs: Polling systems use Laplace–Stieltjes transforms of cycle/intervisit/waiting times, with formulas varying by service discipline (Boon et al., 2014, Chu et al., 2014).
Matching rates: In FCFS-ALIS, the matching rate $r_{i,j}$ between customer type $i$ and server type $j$ is given by the bipartite-matching model; the design heuristic uses these rates for staffing (Adan et al., 2016).
Heavy-traffic and tail asymptotics: Systems are analyzed in heavy-traffic ($\rho\to1$) or large switch-over regimes. For retrial priority queues, the queue-length/orbit-size tail probability decays as $j^{-\alpha_{1}}L(j)$, where $L$ is a slowly varying function and $\alpha_1$ depends on the type-1 service tail (Liu et al., 2019); for polling, limiting delay distributions are Gamma mixtures, or uniform in the large switch-over limit (Chu et al., 2014).

4. Control Policies and Structure of Optimal Assignments

Optimal server/job assignment frequently exhibits threshold structures and strong monotonicity properties:

Threshold-based control: In tandem and collaborative systems, for each fixed downstream state, there exists a unique threshold in upstream queue length above/below which assignment switches from parallel/independent to single server/collaborative (Lu et al., 20 Jan 2026, Lu et al., 20 Jan 2026).
Monotonicity in states and parameters: In the collaborative network, thresholds are increasing functions of in-progress job counts and holding costs. Comparative statics show threshold shifts with upstream rate, holding costs, and downstream blocking (Lu et al., 20 Jan 2026, Papachristos et al., 2019).
Partitioning and assignment: For optimal partition of server capacity and customer assignment, only one type (or none) is ever split probabilistically, and in the joint partition-assignment problem, the optimal solution is always deterministic (Cao et al., 2021).
Polling system priorities: Service of high-priority jobs always takes precedence within queue 1, and waiting-time formulas reveal the impact of priority structures in comparison to non-priority baselines (Boon et al., 2014, Chu et al., 2014).
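A state-dependent threshold rule of the kind described above can be sketched as a lookup followed by a comparison. The threshold values, the action labels, and the choice of which action applies at or above the threshold are all hypothetical here; the cited works derive the actual thresholds from costs and service rates.

```python
def threshold_action(queue_length, downstream_state, thresholds,
                     low_action="independent", high_action="collaborative"):
    """Pick an action by comparing the upstream queue length with the
    threshold attached to the current downstream state.

    Which action sits on which side of the threshold is an assumption.
    """
    threshold = thresholds[downstream_state]
    return high_action if queue_length >= threshold else low_action


def is_nondecreasing(thresholds):
    """Check that thresholds do not decrease with the downstream state."""
    return all(a <= b for a, b in zip(thresholds, thresholds[1:]))


# Hypothetical thresholds indexed by the number of in-progress downstream jobs.
example_thresholds = [2, 3, 5]
```

With these illustrative values, an upstream queue of length 4 triggers the high-side action in downstream states 0 and 1 but not in state 2, and `is_nondecreasing` checks the monotonicity property stated for the collaborative network.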

5. Tail Asymptotics, Stability, and Extreme Regimes

Tail asymptotics and stability properties are critical for evaluating congestion probabilities and system robustness:

Heavy-tail service distributions: If type-1 service times have regularly varying tails, the tail probabilities of the queue and orbit sizes decay polynomially; type-2 dominates only if its service is even heavier-tailed (Liu et al., 2019).
Stability conditions: Systems with server capacity-sharing, retrials, or buffer limitations have stability conditions involving arrival rates and retrial rates, e.g., $\lambda/\mu(1+\lambda_{i}/\mu_{i})<1$ for each stream to guarantee positive recurrence (Avrachenkov et al., 2012).
Extreme congestion probabilities: The asymptotic $P\{R_{orb}>B\}\sim B^{-\alpha_{1}}L(B)$ approximates the probability that the orbit size exceeds a level $B$, which is the overflow probability used in system dimensioning (Liu et al., 2019).
Large switch-over and heavy-traffic: For polling systems, delays can be uniformly distributed or Gamma-mixtures in the limit of large switch-over times or as $\rho\uparrow 1$ (Chu et al., 2014).
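As a rough illustration of how a power-law tail translates into a buffer or truncation level, the sketch below finds the smallest level $B$ at which the approximation $c\,B^{-\alpha}$ falls below a target $\varepsilon$. Replacing the slowly varying factor $L(B)$ by a constant `c` is a simplifying assumption, and the function names are hypothetical.

```python
import math


def tail_estimate(level, alpha, c=1.0):
    """Power-law tail approximation c * level**(-alpha), for level >= 1."""
    return c * level ** (-alpha)


def truncation_level(eps, alpha, c=1.0):
    """Smallest integer B >= 1 with tail_estimate(B, alpha, c) <= eps."""
    guess = max(1, math.ceil((c / eps) ** (1.0 / alpha)))
    # Correct for floating-point rounding in the closed-form guess.
    while guess > 1 and tail_estimate(guess - 1, alpha, c) <= eps:
        guess -= 1
    while tail_estimate(guess, alpha, c) > eps:
        guess += 1
    return guess
```

Because the decay is only polynomial, the required level grows like $\varepsilon^{-1/\alpha}$, so a smaller tail index $\alpha$ (a heavier tail) calls for a much larger truncation level at the same target.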

6. Applications, Computational Algorithms, and Comparative Insights

Two-type server queueing systems find applications in telecommunications, collaborative computing, healthcare, multiserver call centers, and manufacturing.

Algorithmic implications: Stochastic decomposition, boundary-value techniques, and threshold heuristics yield efficient numerical algorithms for rare-event and steady-state computation (Liu et al., 2019, Dimitriou, 2018, Cao et al., 2021).
Dimensioning and design: Analytical tail formulas inform state-space truncation in implementation. FCFS-ALIS matching rates underpin staffing and resource allocation in large-scale parallel systems (Adan et al., 2016).
Heuristic policy comparison: Simple fixed-threshold heuristics often incur vastly greater cost than optimal threshold policies; adaptive blocking-aware rules outperform naive benchmarks by substantial margins (Lu et al., 20 Jan 2026, Lu et al., 20 Jan 2026).
Priority benefits and trade-offs: Introducing priorities yields lower mean delays for high-priority traffic at the expense of increased delay variability for low-priority customers (Boon et al., 2014).
Collaborative/partition optimization: Capacity partition and job assignment should account for job-type heterogeneity and the interplay of waiting costs and queueing delays; optimal deterministic routing emerges when server partition and assignment are jointly optimized (Cao et al., 2021).

7. Key Equations and Representative Results

Priority retrial orbit tail: $P\{R^{0}>j\}\sim j^{-\alpha_{1}}L(j)$ (Liu et al., 2019).
Polling model mean waiting times:

$E[W_H]=(1+\rho_H)E[C_{res}],\quad E[W_L]=(1+2\rho_H+\rho_L)E[C_{res}]$

(Boon et al., 2014).
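The two expressions above can be evaluated directly. In the sketch below, `rho_high` and `rho_low` are read as the loads of the high- and low-priority classes and `mean_residual_cycle` as $E[C_{res}]$; this reading of the symbols and the function name are assumptions, since the text does not define them.

```python
def polling_mean_waits(rho_high, rho_low, mean_residual_cycle):
    """Evaluate the two mean waiting-time expressions quoted above.

    Returns (E[W_H], E[W_L]) for given loads and mean residual cycle time.
    """
    wait_high = (1.0 + rho_high) * mean_residual_cycle
    wait_low = (1.0 + 2.0 * rho_high + rho_low) * mean_residual_cycle
    return wait_high, wait_low
```

The difference between the two values is $(\rho_H+\rho_L)E[C_{res}]$, so under these formulas the extra mean delay of low-priority customers grows with the total load of queue 1.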

FCFS-ALIS matching rates, two-type graph:

$r_{1,1}=\beta_1,\quad r_{1,2}=\alpha_1-\beta_1,\quad r_{2,2}=\alpha_2,\quad r_{2,1}=0$

for case $\beta_1\le\alpha_1$ (Adan et al., 2016).
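A short check of these rates is that each row sums to the corresponding customer fraction and each column to the corresponding server fraction. The sketch below assumes that $\alpha_1+\alpha_2=1$ and $\beta_1+\beta_2=1$, with $\alpha$ indexing customer types and $\beta$ server types; the function name is hypothetical.

```python
def two_type_matching_rates(alpha1, beta1):
    """Matching rates r[i][j] for customer type i+1 and server type j+1.

    Covers only the case beta1 <= alpha1 quoted above; alpha2 = 1 - alpha1
    is an assumption (fractions summing to one).
    """
    if not (0.0 <= beta1 <= alpha1 <= 1.0):
        raise ValueError("expected 0 <= beta1 <= alpha1 <= 1")
    alpha2 = 1.0 - alpha1
    return [[beta1, alpha1 - beta1],
            [0.0, alpha2]]
```

For `alpha1 = 0.6` and `beta1 = 0.4`, the rows sum to 0.6 and 0.4 and the columns to 0.4 and 0.6, as the marginal constraints require.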

Collaborative/partition optimal capacity split:

$\mu_A^* = \frac{1}{2}\left[ \mu + \frac{\lambda_1+\lambda_2}{\lambda_1/\mu_1+\lambda_2/\mu_2} \right]$

(Cao et al., 2021).

Together, these models cover retrials, assignment, polling, collaboration, and parallel matching in two-type server queueing systems. Their characterizations of performance metrics and optimal control rules support the design and operation of heterogeneous service networks.