Medium Access Control protocol for Collaborative Spectrum Learning in Wireless Networks (2111.12581v2)
Abstract: In recent years there has been a growing effort to develop learning algorithms for spectrum collaboration. In this paper we present a fully distributed algorithm for spectrum collaboration in congested ad-hoc networks. The algorithm jointly solves the channel allocation and access scheduling problems, and we prove that it achieves optimal logarithmic regret. Based on the algorithm, we provide a medium access control protocol that allows its distributed implementation in ad-hoc networks, enabling spectrum collaboration with minimal regret and high spectral efficiency in highly loaded networks. The protocol uses single-channel opportunistic carrier sensing to carry out a low-complexity distributed auction in time and frequency. We also discuss practical implementation issues such as bounded frame size and speed of convergence. Computer simulations comparing the algorithm with state-of-the-art distributed medium access control protocols show a significant advantage for the proposed scheme.
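The opportunistic carrier sensing mentioned in the abstract can be illustrated with a small sketch. It is not the paper's protocol: it shows only the general idea, in a single round on a single channel, that users can find the highest bidder without exchanging bid values by mapping each bid to a backoff delay. The function names `backoff` and `carrier_sense_winner`, the linear bid-to-delay map, and the example bids are all assumptions made for illustration.

```python
def backoff(bid, max_bid, window):
    """Map a bid to a backoff delay (assumed linear map, not from the paper).

    Higher bids yield shorter delays; a bid equal to max_bid waits 0.
    """
    return window * (1.0 - bid / max_bid)


def carrier_sense_winner(bids, max_bid=1.0, window=1.0):
    """Return the index of the user whose backoff timer expires first.

    Each user starts a timer set by its own bid. The first timer to expire
    triggers a transmission; the other users sense the busy channel and stay
    silent. The transmitting user is therefore the highest bidder. Ties are
    broken here by lowest index, which is a simplification.
    """
    delays = [backoff(b, max_bid, window) for b in bids]
    return min(range(len(bids)), key=lambda i: delays[i])


# Hypothetical bids of four users for one channel.
bids = [0.42, 0.91, 0.17, 0.66]
winner = carrier_sense_winner(bids)
print(f"user {winner} transmits first")
```

In a real network the delays would be quantized and subject to propagation and sensing latency, so two close bids could collide; the sketch ignores these effects.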