Periodic Bandits and Wireless Network Selection

Oh, Shunhao; Appavoo, Anuja Meetoo; Gilbert, Seth

doi:10.4230/LIPIcs.ICALP.2019.149

File

Author Details

Shunhao Oh

Department of Computer Science, National University of Singapore

Anuja Meetoo Appavoo

Department of Computer Science, National University of Singapore

Seth Gilbert

Department of Computer Science, National University of Singapore

Cite AsGet BibTex

Shunhao Oh, Anuja Meetoo Appavoo, and Seth Gilbert. Periodic Bandits and Wireless Network Selection. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 149:1-149:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ICALP.2019.149

Abstract

Bandit-style algorithms have been studied extensively in stochastic and adversarial settings. Such algorithms have been shown to be useful in multiplayer settings, e.g. to solve the wireless network selection problem, which can be formulated as an adversarial bandit problem. A leading bandit algorithm for the adversarial setting is EXP3. However, network behavior is often repetitive, where user density and network behavior follow regular patterns. Bandit algorithms, like EXP3, fail to provide good guarantees for periodic behaviors. A major reason is that these algorithms compete against fixed-action policies, which is ineffective in a periodic setting. In this paper, we define a periodic bandit setting, and periodic regret as a better performance measure for this type of setting. Instead of comparing an algorithm’s performance to fixed-action policies, we aim to be competitive with policies that play arms under some set of possible periodic patterns F (for example, all possible periodic functions with periods 1,2,*s,P). We propose Periodic EXP4, a computationally efficient variant of the EXP4 algorithm for periodic settings. With K arms, T time steps, and where each periodic pattern in F is of length at most P, we show that the periodic regret obtained by Periodic EXP4 is at most O(sqrt{PKT log K + KT log |F|}). We also prove a lower bound of Omega (sqrt{PKT + KT {log |F|}/{log K}}) for the periodic setting, showing that this is optimal within log-factors. As an example, we focus on the wireless network selection problem. Through simulation, we show that Periodic EXP4 learns the periodic pattern over time, adapts to changes in a dynamic environment, and far outperforms EXP3.

Subject Classification

ACM Subject Classification

Theory of computation → Online learning algorithms

Keywords

multi-armed bandits
wireless network selection
periodicity in environment

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Anuja Meetoo Appavoo, Seth Gilbert, and Kian-Lee Tan. Shrewd Selection Speeds Surfing: Use Smart EXP3! In 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pages 188-199. IEEE, 2018.
Anuja Meetoo Appavoo, Seth Gilbert, and Kian-Lee Tan. Cooperation Speeds Surfing: Use Co-Bandit! arXiv preprint, 2019. URL: http://arxiv.org/abs/1901.07768.
E. Aryafar, A. Keshavarz-Haddad, C. Joe-Wong, and M. Chiang. Max-Min Fair Resource Allocation in HetNets: Distributed Algorithms and Hybrid Architecture. In ICDCS, 2017, pages 857-869. IEEE, 2017.
E. Aryafar, A. Keshavarz-Haddad, M.l Wang, and M. Chiang. RAT selection games in HetNets. In INFOCOM, pages 998-1006. IEEE, 2013.
Jean-Yves Audibert and Sébastien Bubeck. Minimax policies for adversarial and stochastic bandits. In COLT, pages 217-226, 2009.
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.
O. Avner and S. Mannor. Multi-user lax communications: A multi-armed bandit approach. In IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, pages 1-9, April 2016. URL: http://dx.doi.org/10.1109/INFOCOM.2016.7524557.
Y. Bejerano, S-J. Han, and L. E. Li. Fairness and load balancing in wireless LANs using association control. In MobiCom, pages 315-329. ACM, 2004.
Omar Besbes, Yonatan Gur, and Assaf Zeevi. Stochastic Multi-Armed-Bandit Problem with Non-stationary Rewards. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 199-207. Curran Associates, Inc., 2014. URL: http://papers.nips.cc/paper/5378-stochastic-multi-armed-bandit-problem-with-non-stationary-rewards.pdf.
Sébastien Bubeck, Nicolo Cesa-Bianchi, et al. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trendsregistered in Machine Learning, 5(1):1-122, 2012.
M. H. Cheung, F. Hou, J. Huang, and R. Southwell. Congestion-Aware Distributed Network Selection for Integrated Cellular and Wi-Fi Networks. arXiv preprint, 2017. URL: http://arxiv.org/abs/1703.00216.
S. Deng, A. Sivaraman, and H. Balakrishnan. All your network are belong to us: A transport framework for mobile network selection. In HotMobile. ACM, 2014.
Yi Gai, Bhaskar Krishnamachari, and Rahul Jain. Learning multiuser channel allocations in cognitive radio networks: A combinatorial multi-armed bandit formulation. In New Frontiers in Dynamic Spectrum, 2010 IEEE Symposium on, pages 1-9. IEEE, 2010.
D. Golovin, M. Faulkner, and A. Krause. Online distributed sensor selection. In Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks, pages 220-231. ACM, 2010.
B. Kauffmann, F. Baccelli, A. Chaintreau, V. Mhatre, K. Papagiannaki, and C. Diot. Measurement-based self organization of interfering 802.11 wireless access networks. In INFOCOM 2007, pages 1451-1459. IEEE, 2007.
R. Kleinberg, G. Piliouras, and E. Tardos. Multiplicative updates outperform generic no-regret learning in congestion games. In ACM STOC, pages 533-542. ACM, 2009.
S. Maghsudi and S. Stanczak. Relay selection with no side information: An adversarial bandit approach. In WCNC, pages 715-720. IEEE, 2013.
A. Mishra, V. Brik, S. Banerjee, A. Srinivasan, and W. A. Arbaugh. A Client-Driven Approach for Channel Management in Wireless LANs. In Infocom, 2006.
E Monsef, A. Keshavarz-Haddad, E. Aryafar, J. Saniie, and M. Chiang. Convergence properties of general network selection games. In INFOCOM, pages 1445-1453. IEEE, 2015.
D. Niyato and E. Hossain. Dynamics of network selection in heterogeneous wireless networks: An evolutionary game approach. TVT, 58(4):2008-2017, 2009.
Shunhao Oh, Anuja Meetoo Appavoo, and Seth Gilbert. Periodic Bandits and Wireless Network Selection [full version of paper]. arXiv preprint, 2019. URL: http://arxiv.org/abs/1904.12355.
Allesiardo Robin, Raphaël Feraud, and Odalric-Ambrym Maillard. The Non-stationary Stochastic Multi-armed Bandit Problem. International Journal of Data Science and Analytics, March 2017. URL: http://dx.doi.org/10.1007/s41060-017-0050-5.
Yevgeny Seldin and Gábor Lugosi. A lower bound for multi-armed bandits with expert advice. In 13th European Workshop on Reinforcement Learning (EWRL), 2016.
SimPy. SimPy - Event discrete simulation for Python, 2016. , accessed 2018-19-12. URL: https://simpy.readthedocs.io/.
K. Sui, M. Zhou, D. Liu, M. Ma, D. Pei, Y. Zhao, Z. Li, and T. Moscibroda. Characterizing and improving wifi latency in large-scale operational networks. In MobiSys, pages 347-360. ACM, 2016.
Vasilis Syrgkanis, Akshay Krishnamurthy, and Robert Schapire. Efficient algorithms for adversarial contextual learning. In International Conference on Machine Learning, pages 2159-2168, 2016.
C. Tekin and M. Liu. Performance and Convergence of Multi-user Online Learning. In GAMENETS, pages 321-336. Springer, 2011.
H. A. Tran, S. Hoceini, A. Mellouk, J. Perez, and S. Zeadally. QoE-based server selection for content distribution networks. IEEE Transactions on Computers, 63(11):2803-2815, 2014.
Q. Wu, Z. Du, P. Yang, Y.-D. Yao, and J. Wang. Traffic-aware online network selection in heterogeneous wireless networks. TVT, 65(1):381-397, 2016.
K. Zhu, D. Niyato, and P. Wang. Network selection in heterogeneous wireless networks: Evolution with incomplete information. In WCNC, pages 1-6. IEEE, 2010.

Periodic Bandits and Wireless Network Selection

Authors Shunhao Oh, Anuja Meetoo Appavoo, Seth Gilbert

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Periodic Bandits and Wireless Network Selection

Authors Shunhao Oh, Anuja Meetoo Appavoo, Seth Gilbert

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

Supplementary Materials

References