Multi-Secretary Problem
- The multi-secretary problem extends the classical secretary problem by permitting up to k irrevocable selections from sequentially arriving candidates, with the goal of maximizing total value or success probability.
- Adaptive online policies like the Budget-Ratio and simulated-greedy algorithms dynamically adjust thresholds and achieve near-optimal performance with uniformly bounded regret compared to the offline optimum.
- Extensions of this framework apply to matroid selection, multi-winner elections, and online fairness, influencing practical applications such as hiring, resource allocation, and participatory budgeting.
The multi-secretary problem, also known as the k-choice secretary problem, extends the classical optimal stopping setting introduced by Cayley and Moser to the case where a decision maker can select up to $k$ of $n$ items that arrive sequentially in random or controlled order. Each item is revealed irrevocably, and only relative or realized quality information up to the present time is available. The objective is to maximize a set function—typically total value or the probability of capturing the best elements—subject to an online, capacity-constrained selection process.
1. Formal Models and Fundamental Variants
The canonical multi-secretary problem involves $n$ independent random variables $X_1,\dots,X_n$ observed sequentially, each drawn from a known common distribution with finite support $\{a_1,\dots,a_m\}$. At each time $t$, the algorithm may irrevocably select the current value $X_t$, subject to a total budget of $b$ allowed selections. The expected online reward is
$\mathrm{ALG}_n(\pi) = \mathbb{E}\Bigl[\sum_{t=1}^n X_t \sigma_t \Bigr]$
for a policy $\pi$ with selection indicators $\sigma_t \in \{0,1\}$ and $\sum_{t=1}^n \sigma_t \le b$. The comparison benchmark is the offline optimum
$\mathrm{OPT}_n = \mathbb{E}\Bigl[\sum_{j=1}^m a_j S_j^n \Bigr],$
where $S_j^n$ counts how many of the $b$ largest realized values equal $a_j$ (Arlotto et al., 2017).
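To make the benchmark concrete, the following sketch (not drawn from the cited papers; the finite support, budget, and threshold value are illustrative choices) computes the offline optimum for realized values and compares it with a naive static-threshold policy:

```python
import random

def offline_opt(values, b):
    """Offline optimum: sum of the b largest realized values."""
    return sum(sorted(values, reverse=True)[:b])

def static_threshold_policy(values, b, threshold):
    """Accept any arrival above a fixed threshold until the budget is spent."""
    total, remaining = 0.0, b
    for x in values:
        if remaining > 0 and x > threshold:
            total += x
            remaining -= 1
    return total

random.seed(0)
support = [1.0, 2.0, 3.0]   # finite support {a_1, ..., a_m}, chosen for illustration
n, b, trials = 1000, 100, 200
regret = 0.0
for _ in range(trials):
    xs = [random.choice(support) for _ in range(n)]
    regret += offline_opt(xs, b) - static_threshold_policy(xs, b, 2.5)
print(regret / trials)       # average regret of the static policy
```

Averaging the gap over many sample paths estimates $\mathrm{OPT}_n - \mathrm{ALG}_n(\pi)$, the regret quantity that the policies below aim to control.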
Variant formulations include:
- The uniform matroid or linear $k$-secretary problem, where the values are arbitrary weights revealed in adversarial or random order (Ma et al., 2011).
- The $(J,K)$-secretary problem (multi-choice, multi-best), with the objective of selecting $K$ elements to maximize the count or value captured among the true best $J$ (Chan et al., 2013).
- Models with multiple items per rank or parallel (multi-queue) settings (Pinsky, 2022, Sun et al., 2014).
- Non-linear objectives and fairness constraints, notably in online social choice (Papasotiropoulos et al., 28 Nov 2025).
2. Optimal and Near-Optimal Online Policies
A foundational result is that, in the setting with independent values drawn from a known common finite-support law, adaptivity in selection is critical for minimizing regret relative to the offline optimum. The Budget-Ratio (BR) policy of Arlotto & Gurvich achieves regret that is uniformly bounded across all horizons $n$ and budgets $b$ by dynamically tuning selection thresholds based on the current residual budget per remaining period. Specifically, at time $t$ the policy computes the ratio of remaining budget to remaining periods and derives an acceptance threshold from the cumulative distribution; the next arrival is accepted if its value is at least the type dictated by this ratio (Arlotto et al., 2017).
For non-adaptive policies, the best achievable regret is of order $\sqrt{n}$, fundamentally due to concentration bounds on binomial selection variability. Adaptive schemes such as BR, or multiple-phase thresholding in the $k$-secretary context, track the offline allocation nearly perfectly, incurring only $O(1)$ expected mistakes (Arlotto et al., 2017).
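One plausible rendering of a budget-ratio-style rule is sketched below (the exact BR policy also handles the boundary type with randomization, which this simplified version omits): at each step it accepts the current value when the upper-tail mass at or above that value fits within the residual budget-per-period ratio.

```python
def budget_ratio_policy(values, b, support, probs):
    """Adaptive thresholding sketch: accept the current value when types at
    least that large have total probability mass within the residual
    budget-per-period ratio."""
    total, remaining = 0.0, b
    n = len(values)
    for t, x in enumerate(values):
        if remaining == 0:
            break
        ratio = remaining / (n - t)          # residual budget per remaining period
        # Find the smallest support value whose upper-tail mass fits under ratio.
        tail, threshold = 0.0, float('inf')
        for a, p in sorted(zip(support, probs), reverse=True):
            if tail + p <= ratio:
                tail += p
                threshold = a
            else:
                break
        if x >= threshold:
            total += x
            remaining -= 1
    return total

# Example: support {1, 2} with equal mass; the policy captures both 2s.
print(budget_ratio_policy([2, 1, 2, 1], 2, [1, 2], [0.5, 0.5]))
```

Because the threshold tightens automatically when the budget is spent too fast and loosens when it lags, the policy self-corrects along the sample path—the mechanism behind the uniformly bounded regret of the true BR policy.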
For random-order $k$-secretary (linear weight) problems, the simulated-greedy algorithm achieves a constant (specifically, 9.6) competitive ratio, using an initial sample to set a threshold and then accepting any subsequent value exceeding it until the quota is filled (Ma et al., 2011). Asymptotically, the optimal policy for $k \to \infty$ approaches the offline optimum, with relative error decaying on the order of $1/\sqrt{k}$ (Sun et al., 2014).
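The sample-then-threshold skeleton underlying such algorithms can be sketched as follows (a generic illustration, not the cited 9.6-competitive algorithm itself; the half-stream sample size is one common choice):

```python
def sample_then_threshold(stream, k):
    """Observe the first half as a sample, set the threshold at the k-th
    largest sampled value, then greedily accept values above it."""
    n = len(stream)
    m = n // 2                                     # sample size (illustrative choice)
    threshold = sorted(stream[:m], reverse=True)[min(k, m) - 1]
    picked = []
    for x in stream[m:]:
        if len(picked) < k and x > threshold:
            picked.append(x)
    return picked
```

In random order, the sampled quantile concentrates near the true $k$-th largest value, so the selection phase accepts mostly top-$k$ items—the mechanism behind constant competitive ratios in this family of algorithms.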
Parallel and free-order settings utilize multiple-threshold structures or dimensionally reduced orderings; entropy-optimal order selection can yield near-optimal competitive ratios using only $O(\log\log n)$ bits of randomness (Hajiaghayi et al., 2022).
3. Structural and Asymptotic Characterizations
The continuous LP framework provides an exact characterization of the asymptotics of the $J$-choice $K$-best secretary problem. In the infinite-$n$ limit, optimal policies correspond to multi-threshold algorithms: each quota unlocks at critical times, determined by complementary slackness of the LP, at which items that can still rank among the best become eligible for selection. In specific parameter regimes the thresholds are explicitly rational, enabling precise calculation of the limiting probability of successfully capturing the top items (Chan et al., 2013).
For $k$-multiplicity per rank, skipping an initial fraction of the items and then selecting the first item at least as good as the best seen so far gives, in the limit, success probabilities that converge to 1 rapidly as the multiplicity $k$ increases, starting from the classical $1/e \approx 0.368$ at $k=1$ (Pinsky, 2022).
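A Monte Carlo sketch of this rule (assuming an $n/e$ observation phase and ties counted as "at least as good"; the parameter choices are illustrative) shows the success probability climbing with the multiplicity $k$:

```python
import math
import random

def success_prob(num_ranks, k, trials=5000, seed=1):
    """Estimate the success probability with k items of each rank: skip the
    first n/e arrivals, then accept the first item at least as good as the
    best seen; success means the accepted item has the top rank (0)."""
    rng = random.Random(seed)
    items = [r for r in range(num_ranks) for _ in range(k)]   # 0 = best rank
    n = len(items)
    skip = int(n / math.e)
    wins = 0
    for _ in range(trials):
        rng.shuffle(items)
        best_seen = min(items[:skip], default=num_ranks)
        for x in items[skip:]:
            if x <= best_seen:            # "at least as good" allows ties
                wins += (x == 0)
                break
    return wins / trials

print(success_prob(20, 1), success_prob(20, 3))
```

With $k$ copies of the top rank in the stream, ties with the best-seen item become winning selections, which is why the estimate rises quickly above the classical $1/e$ baseline.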
In multi-threshold positional strategies for multiple selections, successive selection thresholds are defined recursively, and in the limit the optimal thresholds coincide with the classical Gilbert–Mosteller (Dowry problem) values; they exhibit the "right-hand-based" property that the later-stage thresholds are independent of the total initial selection allowance (Liu et al., 2023).
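The flavor of such backward recursions can be illustrated with the simpler classical full-information recursion of Moser (a relative of the Gilbert–Mosteller thresholds, not the Liu et al. construction itself) for maximizing the expected value of a single selection from Uniform(0,1) arrivals:

```python
def moser_values(n):
    """v[k]: optimal expected reward with k Uniform(0,1) observations left.
    Accept the current arrival iff it exceeds v[k] for the k observations
    still to come; recursion: v_{k+1} = E[max(X, v_k)] = (1 + v_k^2) / 2."""
    v = [0.0, 0.5]                        # v_0 unused; v_1 = 1/2 (forced stop)
    for _ in range(n - 1):
        v.append((1 + v[-1] ** 2) / 2)
    return v

print(moser_values(10))   # thresholds increase toward 1 as more arrivals remain
```

As in the multi-selection case, each stage's threshold is the continuation value of the remaining subproblem, computed backward from the forced final acceptance.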
4. Extensions: Parallels, Generalizations, and Online Fairness
Parallel and shared-quota multi-secretary problems involve the distribution of candidates among queues or the division of the selection budget across multiple simultaneous streams. Linear-programming duality characterizes optimal threshold structures, and the Adaptive Observation–Selection Protocol achieves tight or near-tight competitive ratios, with closed-form expressions available for special cases (Sun et al., 2014).
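A toy version of a shared-quota rule (a simplified stand-in for the cited protocol, with round-robin interleaving and a single common threshold as illustrative assumptions) can be sketched as:

```python
def shared_quota(streams, b, threshold):
    """Interleave parallel streams round-robin and accept any arrival above
    a common threshold while the shared budget remains."""
    total, remaining = 0.0, b
    for round_vals in zip(*streams):      # one arrival per stream per round
        for x in round_vals:
            if remaining > 0 and x > threshold:
                total += x
                remaining -= 1
    return total

# Two streams of two arrivals each, shared budget b = 2.
print(shared_quota([[1, 5], [4, 2]], 2, 3))
```

The interesting design questions in the literature concern how the threshold should differ across streams and adapt to the remaining shared budget, which this flat version deliberately ignores.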
The Stable Secretaries variant models online matching with stability constraints, quantifying the fraction of agents not involved in blocking pairs. Under random arrival, a constant fraction of agents can avoid blocking pairs, whereas under adversarial input the achievable fraction of stable pairs degrades and can be vanishingly small (Babichenko et al., 2017).
Recent work explicitly links the multi-secretary problem to online multi-winner elections under cardinal preferences and fairness (e.g., Extended Justified Representation). Impossibility theorems establish that EJR and its relaxations are unattainable deterministically online, but sample-then-select hybrid algorithms (Online MES, BOS) achieve high-probability EJR-like guarantees under random order, and Greedy Budgeting ensures basic justified representation (Papasotiropoulos et al., 28 Nov 2025).
5. Principal Results and Algorithmic Insights
The following table summarizes key competitive guarantees and regret rates for main models:
| Model / Policy | Achievable Regret / Competitive Ratio | Reference |
|---|---|---|
| Adaptive Budget-Ratio (BR) | Regret uniformly bounded in $n$ and $b$ | (Arlotto et al., 2017) |
| Non-adaptive (static) policy | Regret $\Theta(\sqrt{n})$ | (Arlotto et al., 2017) |
| Simulated-greedy (random order) | 9.6-competitive, uniform matroid | (Ma et al., 2011) |
| LP-threshold (infinite items) | Optimal rational thresholds in limiting regimes | (Chan et al., 2013) |
| Parallel deterministic protocol | Tight or near-tight ratio; closed form in special cases | (Sun et al., 2014) |
| Free-order, randomness-efficient | Near-optimal ratio with entropy-optimal randomness | (Hajiaghayi et al., 2022) |
| $k$-multiplicity per rank | Success probability $\to 1$ rapidly as $k$ grows | (Pinsky, 2022) |
| Online fairness (random order) | High-probability EJR-style guarantees | (Papasotiropoulos et al., 28 Nov 2025) |
Algorithmically, all near-optimal policies—across classical, parallel, free-order, and fairness-augmented domains—share a thresholding structure, dynamically or statically tuned via offline sample statistics or adaptive control over budget exhaustion.
6. Open Directions and Challenges
Despite substantial progress, several research directions remain:
- Online fairness and proportionality constraints in adversarial or unknown-horizon settings appear inherently limited (no deterministic EJR possible online) (Papasotiropoulos et al., 28 Nov 2025).
- Extensions to general submodular objectives within matroid constraints retain only constant-factor guarantees, and closing these gaps for broad classes of valuation functions remains unresolved (Ma et al., 2011).
- Asymptotic and finite-n analyses for more general arrival processes (non-uniform, adversarial with limited power) and richer informational settings (feedback, delayed response) are underdeveloped.
- The optimal use of randomness (entropy-minimized strategies) and deterministic derandomizations for multi-secretary and related prophet models continue to be active research areas (Hajiaghayi et al., 2022).
7. Connections to Broader Theoretical and Applied Domains
The multi-secretary problem is deeply intertwined with the theory of matroid secretary problems, prophet inequalities, and online allocation under feasibility and fairness constraints. It provides a tractable mathematical foundation for real-world online selection tasks, such as online hiring, resource allocation in participatory budgeting, public goods selection under initiative processes, and more generally, the study of irrevocable online decision-making under uncertainty with combinatorial structure.
References: (Arlotto et al., 2017, Ma et al., 2011, Chan et al., 2013, Sun et al., 2014, Pinsky, 2022, Liu et al., 2023, Hajiaghayi et al., 2022, Papasotiropoulos et al., 28 Nov 2025, Babichenko et al., 2017)