Demand Balancing in Primal-Dual Optimization for Blind Network Revenue Management (2404.04467v1)

Published 6 Apr 2024 in stat.ML and cs.LG

Abstract: This paper proposes a practically efficient algorithm with optimal theoretical regret which solves the classical network revenue management (NRM) problem with unknown, nonparametric demand. Over a time horizon of length $T$, in each time period the retailer needs to decide prices of $N$ types of products which are produced based on $M$ types of resources with unreplenishable initial inventory. When demand is nonparametric with some mild assumptions, Miao and Wang (2021) is the first paper which proposes an algorithm with $O(\text{poly}(N,M,\ln(T))\sqrt{T})$ type of regret (in particular, $\tilde O(N^{3.5}\sqrt{T})$ plus additional high-order terms that are $o(\sqrt{T})$ with sufficiently large $T\gg N$). In this paper, we improve the previous result by proposing a primal-dual optimization algorithm which is not only more practical, but also with an improved regret of $\tilde O(N^{3.25}\sqrt{T})$ free from additional high-order terms. A key technical contribution of the proposed algorithm is the so-called demand balancing, which pairs the primal solution (i.e., the price) in each time period with another price to offset the violation of complementary slackness on resource inventory constraints. Numerical experiments compared with several benchmark algorithms further illustrate the effectiveness of our algorithm.

References (55)
  1. Linear contextual bandits with knapsacks. Advances in Neural Information Processing Systems, 29.
  2. Bandits with global convex constraints and objective. Operations Research, 67(5), 1486–1502.
  3. Dynamic pricing for nonperishable products with demand learning. Operations Research, 57(5), 1169–1188.
  4. Bandits with knapsacks. Journal of the ACM, 65(3), 1–55.
  5. The best of many worlds: Dual mirror descent for online allocation problems. Operations Research (to appear).
  6. Collaborative hyperparameter tuning. In International Conference on Machine Learning (pp. 199–207). PMLR.
  7. Bertsekas, D. P. (1997). Nonlinear programming. Journal of the Operational Research Society, 48(3), 334–334.
  8. Non-stationary stochastic optimization. Operations Research, 63(5), 1227–1244.
  9. Blind network revenue management. Operations Research, 60(6), 1537–1550.
  10. On the (Surprising) Sufficiency of Linear Models for Dynamic Pricing with Demand Learning. Management Science, 61(4), 723–739.
  11. An overview of pricing models for revenue management. Manufacturing & Service Operations Management, 5(3), 203–229.
  12. Dynamic pricing under a general parametric choice model. Operations Research, 60(4), 965–980.
  13. Context-based dynamic pricing with partially linear demand model. Advances in Neural Information Processing Systems, 35, 23780–23791.
  14. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1), 1–122.
  15. Kernel-based methods for bandit convex optimization. Journal of the ACM, 68(4), 1–35.
  16. A re-solving heuristic with uniformly bounded loss for network revenue management. Management Science, 66(7), 2993–3009.
  17. Recent developments in dynamic pricing research: multiple products, competition, and limited demand information. Production and Operations Management, 24(5), 704–731.
  18. Nonparametric pricing analytics with customer covariates. Operations Research, 69(3), 974–984.
  19. Real-time dynamic pricing with minimal and flexible price adjustment. Management Science, 62(8), 2437–2455.
  20. Nonparametric self-adjusting control for joint learning and optimization of multiproduct pricing with finite resource capacity. Mathematics of Operations Research, 44(2), 601–631.
  21. Nonstationary stochastic optimization under $l_{p,q}$-variation measures. Operations Research, 67(6), 1752–1765.
  22. Network revenue management with online inverse batch gradient descent method. Production and Operations Management (to appear).
  23. Cooper, W. L. (2002). Asymptotic behavior of an allocation policy for revenue management. Operations Research, 50(4), 720–727.
  24. Danskin, J. M. (2012). The theory of max-min and its application to weapons allocation problems, vol. 5. Springer Science & Business Media.
  25. den Boer, A. V. (2015). Dynamic pricing and learning: historical origins, current research, and new directions. Surveys in Operations Research and Management Science, 20(1), 1–18.
  26. Dynamic pricing and learning with finite inventories. Operations Research, 63(4), 965–978.
  27. Dynamic pricing with a prior on market response. Operations Research, 58(1), 16–29.
  28. Online network revenue management using Thompson sampling. Operations Research, 66(6), 1586–1602.
  29. Online convex optimization in the bandit setting: gradient descent without a gradient. In Proceedings of the 16th annual ACM-SIAM Symposium on Discrete Algorithms (SODA), (pp. 385–394).
  30. Frazier, P. I. (2018). A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811.
  31. Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science, 40(8), 999–1020.
  32. A multiproduct dynamic pricing problem and its applications to network yield management. Operations Research, 45(1), 24–41.
  33. Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Science, 58(3), 570–586.
  34. Bandit convex optimization: Towards tight bounds. Advances in Neural Information Processing Systems (NeurIPS), 27, 784–792.
  35. Jasin, S. (2014). Reoptimization and self-adjusting price control for network revenue management. Operations Research, 62(5), 1168–1178.
  36. A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Mathematics of Operations Research, 37(2), 313–345.
  37. Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), (pp. 795–811). Springer.
  38. Khachiyan, L. G. (1979). A polynomial algorithm in linear programming. In Doklady Akademii Nauk, vol. 244, (pp. 1093–1096). Russian Academy of Sciences.
  39. Multidimensional bisection search for constrained optimization with noisy observations. Working Paper.
  40. Pricing multiple products with the multinomial logit and nested logit models: Concavity and implications. Manufacturing & Service Operations Management, 13(4), 549–563.
  41. Network revenue management with nonparametric demand learning: $\sqrt{T}$-regret and polynomial dimension dependency. Available at SSRN 3948140.
  42. A general framework for resource constrained revenue management with demand learning and large action space. Available at SSRN 3841273.
  43. Dynamic learning and pricing with model misspecification. Management Science, 65(11), 4980–5000.
  44. Approximate primal solutions and rate analysis for dual subgradient methods. SIAM Journal on Optimization, 19(4), 1757–1780.
  45. Problem complexity and method efficiency in optimization.
  46. Nesterov, Y. (2009). Primal-dual subgradient methods for convex problems. Mathematical Programming, 120(1), 221–259.
  47. An asymptotically optimal policy for a quantity-based network revenue management problem. Mathematics of Operations Research, 33(2), 257–282.
  48. Convergence rates of inexact proximal-gradient methods for convex optimization. Advances in Neural Information Processing Systems, 24.
  49. Secomandi, N. (2008). An analysis of the control-algorithm re-solving issue in inventory and revenue management. Manufacturing & Service Operations Management, 10(3), 468–483.
  50. Tropp, J. A. (2012). User-friendly tail bounds for sums of random matrices. Foundations of Computational Mathematics, 12(4), 389–434.
  51. Vaidya, P. M. (1996). A new algorithm for minimizing convex functions over convex sets. Mathematical Programming, 73(3), 291–341.
  52. Multi-modal dynamic pricing. Management Science, 67(10), 6136–6152.
  53. Constant regret re-solving heuristics for price-based revenue management. arXiv preprint arXiv:2009.02861.
  54. Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Operations Research, 62(2), 219–482.
  55. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 415, 295–316.
