ABSTRACT
Recently, much attention has been devoted to the question of whether/when traditional network protocol design, which relies on the application of algorithmic insights by human experts, can be replaced by a data-driven (i.e., machine learning) approach. We explore this question in the context of the arguably most fundamental networking task: routing. Can ideas and techniques from machine learning (ML) be leveraged to automatically generate "good" routing configurations? We focus on the classical setting of intradomain traffic engineering. We observe that this context poses significant challenges for data-driven protocol design. Our preliminary results regarding the power of data-driven routing suggest that applying ML (specifically, deep reinforcement learning) to this context yields high performance and is a promising direction for further research. We outline a research agenda for ML-guided routing.
Supplemental Material
- DeepMind AI Reduces Google Data Centre Cooling Bill by 40%. https://goo.gl/QTdU2T.Google Scholar
- IBM ILOG CPLEX Optimizer. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/index.html.Google Scholar
- M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: Dynamic flow scheduling for data center networks. In NSDI, 2010. Google ScholarDigital Library
- M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. Data Center TCP (DCTCP). ACM SIGCOMM, 2010. Google ScholarDigital Library
- M. Allalouf and Y. Shavitt. Maximum flow routing with weighted max-min fairness. In Quality of Service in the Emerging Networking Panorama.Google Scholar
- D. Andersen, H. Balakrishnan, F. Kaashoek, and R. Morris. Resilient overlay networks. 2001.Google ScholarDigital Library
- Y. Azar, E. Cohen, A. Fiat, H. Kaplan, and H. Racke. Optimal oblivious routing in polynomial time. STOC, 2003.Google ScholarDigital Library
- D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv, 2014.Google Scholar
- C. M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, 2006. Google ScholarDigital Library
- M. Chiesa, G. Rétvári, and M. Schapira. Lying your way to better traffic engineering. CoNEXT, 2016. Google ScholarDigital Library
- T. W. S. Chow and C. T. Leung. Nonlinear autoregressive integrated neural network model for short-term load forecasting. IEE Proceedings - Generation, Transmission and Distribution, 1996.Google ScholarCross Ref
- M. Dong, Q. Li, D. Zarchy, P. B. Godfrey, and M. Schapira. PCC: Re-architecting congestion control for consistent high performance. NSDI, 2015. Google ScholarDigital Library
- Y. Duan, X. Chen, R. Houthooft, J. Schulman, and P. Abbeel. Benchmarking deep reinforcement learning for continuous control. ICML, 2016. Google ScholarDigital Library
- P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec. The many faces of publish/subscribe. ACM Comput. Surv., 2003. Google ScholarDigital Library
- B. Fortz, J. Rexford, and M. Thorup. Traffic engineering with traditional ip routing protocols. Comm. Mag., 2002. Google ScholarDigital Library
- B. Fortz and M. Thorup. Increasing internet capacity using local search. Computational Optimization and Applications, 2004. Google ScholarDigital Library
- B. Fortz and M. Thorup. Optimizing ospf/is-is weights in a changing world. IEEE J.Sel. A. Commun., 2006. Google ScholarDigital Library
- M. Ghodbadi, R. Mahajan, A. Phanishayee, H. Rastegarfar, P.-A. Blanche, M. Glick, D. Kilper, J. Kulkarni, G. Ranade, and N. Devanur. ProjecToR: Agile Reconfigurable Datacenter Interconnect. SIGCOMM, 2016. Google ScholarDigital Library
- D. Halperin, S. Kandula, J. Padhye, P. Bahl, and D. Wetherall. Augmenting data center networks with multi-gigabit wireless links. SIGCOMM, 2011. Google ScholarDigital Library
- N. Hamedazimi, Z. Qazi, H. Gupta, V. Sekar, S. R. Das, J. P. Longtin, H. Shah, and A. Tanwer. Firefly: A reconfigurable wireless data center fabric using free-space optics. SIGCOMM, 2014. Google ScholarDigital Library
- K. Janocha and W. M. Czarnecki. On loss functions for deep neural networks in classification. CoRR, 2017.Google ScholarCross Ref
- W. S. Jewell. Multi-commodity Network Solutions. 1966.Google Scholar
- J. Jiang, V. Sekar, I. Stoica, and H. Zhang. Unleashing the potential of data-driven networking. COMSNETS, 2017.Google ScholarCross Ref
- S. Kandula, D. Katabi, B. Davie, and A. Charny. Walking the tightrope: Responsive yet stable traffic engineering. 2005.Google Scholar
- S. Kassing, A. Valadarsky, G. Shahaf, M. Schapira, and A. Singla. Augmenting data center networks with multi-gigabit wireless links. SIGCOMM, 2011. Google ScholarDigital Library
- S. Knight, H. Nguyen, N. Falkner, R. Bowden, and M. Roughan. The internet topology zoo. IEEE Journal on Selected Areas in Communications, 2011.Google Scholar
- J. Kober, J. A. Bagnell, and J. Peters. Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 2013. Google ScholarDigital Library
- A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. Google ScholarDigital Library
- P. Kumar, Y. Yuan, C. Yu, N. Foster, R. D. Kleinberg, and R. Soulé. Kulfi: Robust traffic engineering using semi-oblivious routing. CoRR, 2016.Google Scholar
- Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.Google ScholarCross Ref
- M. Majer, C. Bobda, A. Ahmadinia, and J. Teich. Packet routing in dynamically changing networks on chip. In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04, IPDPS '05, pages 154.2--, Washington, DC, USA, 2005. IEEE Computer Society. Google ScholarDigital Library
- H. Mao, M. Alizadeh, I. Menache, and S. Kandula. Resource management with deep reinforcement learning. HotNets, 2016. Google ScholarDigital Library
- H. Mao, R. Netravali, and M. Alizadeh. Neural adaptive bitrate streaming with pensive. SIGCOMM, 2017. Google ScholarDigital Library
- A. Medina, N. Taft, K. Salamatian, S. Bhattacharyya, and C. Diot. Traffic matrix estimation: Existing techniques and new directions. 2002.Google ScholarDigital Library
- N. Michael and A. Tang. Halo: Hop-by-hop adaptive link-state optimal routing. IEEE/ACM Transactions on Networking, 2015. Google ScholarDigital Library
- V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu. Asynchronous methods for deep reinforcement learning. ICML, 2016. Google ScholarDigital Library
- V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. A. Riedmiller. Playing atari with deep reinforcement learning. CoRR, 2013.Google Scholar
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 2015.Google Scholar
- M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. The MIT Press, 2012. Google ScholarDigital Library
- M. Roughan, A. Greenberg, C. Kalmanek, M. Rumsewicz, J. Yates, and Y. Zhang. Experience in measuring backbone traffic variability: Models, metrics, measurements and meaning. IMW, 2002. Google ScholarDigital Library
- J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz. Trust region policy optimization. ICML, 2015. Google ScholarDigital Library
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal Policy Optimization Algorithms. ArXiv e-prints, 2017.Google Scholar
- F. Shahrokhi and D. W. Matula. The maximum concurrent flow problem. J. ACM, 1990. Google ScholarDigital Library
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.Google Scholar
- R. S. Sutton and A. G. Barto. Reinforcement learning: An introduction. MIT press Cambridge, 1998. Google ScholarDigital Library
- H. Wang and B. Li. Lube: Mitigating bottlenecks in wide area data analytics. HotCloud, 2017. Google ScholarDigital Library
- C. J. Watkins and P. Dayan. Q-learning. Machine Learning, 1992.Google Scholar
- K. Winstein and H. Balakrishnan. Tcp ex machina: Computer-generated congestion control. SIGCOMM, 2013. Google ScholarDigital Library
- Y. Wu, E. Mansimov, S. Liao, R. B. Grosse, and J. Ba. Scalable trust-region method for deep reinforcement learning using kronecker-factored approximation. ArXiv e-prints, 2017.Google Scholar
- D. Xu, M. Chiang, and J. Rexford. Link-state routing with hop-by-hop forwarding can achieve optimal traffic engineering. INFOCOM, 2008.Google ScholarCross Ref
- X. Zhou, Z. Zhang, Y. Zhu, Y. Li, S. Kumar, A. Vahdat, B. Y. Zhao, and H. Zheng. Mirror mirror on the ceiling: Flexible wireless links for data centers. SIGCOMM, 2012. Google ScholarDigital Library
Index Terms
- Learning to Route
Recommendations
Optimal Route Reflection Topology Design
LANC '18: Proceedings of the 10th Latin America Networking ConferenceAutonomous Systems (ASes) exchange routing information about networks they can reach in the Internet, and the most widely extended way to connect them is by means of Border Gateway Protocol (BGP) sessions. ASes set up external BGP (eBGP) sessions ...
Designing optimal iBGP route-reflection topologies
NETWORKING'08: Proceedings of the 7th international IFIP-TC6 networking conference on AdHoc and sensor networks, wireless networks, next generation internetThe Border Gateway Protocol (BGP) is used today by all Autonomous Systems (AS) in the Internet. Inside each AS, iBGP sessions distribute the external routes among the routers. In large ASs, relying on a full-mesh of iBGP sessions between routers is not ...
Generating adaptive route instructions using hierarchical reinforcement learning
SC'10: Proceedings of the 7th international conference on Spatial cognitionWe present a learning approach for efficiently inducing adaptive behaviour of route instructions. For such a purpose we propose a two-stage approach to learn a hierarchy of wayfinding strategies using hierarchical reinforcement learning. Whilst the ...
Comments