A Restless Bandit Model for Energy-Efficient Job Assignments in Server Farms (2112.06275v3)
Abstract: We aim to maximize the energy efficiency, gauged as average energy cost per job, in a large-scale server farm with various storage and/or computing components modeled as parallel abstracted servers. Each server operates in multiple power modes characterized by potentially different service and energy consumption rates. The heterogeneity of the servers and their multiple power modes complicate the maximization problem, for which optimal solutions are generally intractable. Relying on the Whittle relaxation technique, we resort to a near-optimal, scalable job-assignment policy. Under a mild condition on the service and energy consumption rates of the servers, we prove that the proposed policy approaches optimality as the size of the entire system tends to infinity; that is, it is asymptotically optimal. For the non-asymptotic regime, we show the effectiveness of the proposed policy through numerical simulations, in which it outperforms all the tested baselines, and we numerically demonstrate its robustness against heavy-tailed job-size distributions.