DOI: 10.1145/3627703.3629591

Transparent Multicore Scaling of Single-Threaded Network Functions

Published: 22 April 2024

ABSTRACT

This paper presents NFOS, a programming model, runtime, and profiler for productively developing software network functions (NFs) that scale on multicore machines. Writing shared-state concurrent systems that are both correct and scalable is still a serious challenge, which is why NFOS insulates developers from writing concurrent code.

In the NFOS programming model, developers write their NF as a sequential program, concerning themselves with the NF logic instead of parallelism and shared-state synchronization. The NFOS abstractions are both familiar to the NF programmer and convey to the NFOS runtime crucial information that enables it to correctly execute the NF's packet processing in parallel on multiple cores. Paired with NFOS's domain-specific concurrent data structures, this parallelism scales the NF transparently, obviating the need for developers to write concurrent code. We show that serial, stateful NFs run atop NFOS achieve scalability on par with their concurrent, hand-optimized counterparts in Cisco VPP [8].

Some scalability bottlenecks are inherent to the NF's semantics, and thus cannot be resolved while preserving those semantics. NFOS identifies the root causes of such bottlenecks and provides scalability recipes that guide developers in relaxing the NF's semantics to eliminate these bottlenecks. We present examples where such NFOS-guided relaxation of NF semantics further improves scalability by 2x to 91x.

References

  1. The CAIDA UCSD Anonymized Internet Traces - 2016. https://www.caida.org/catalog/datasets/passive_dataset. [Last accessed on 2023-10-29].
  2. DPDK Release 20.11. https://doc.dpdk.org/guides-20.11/rel_notes/release_20_11.html. [Last accessed on 2023-10-29].
  3. Fix of VPP NAT Race Condition on Address Mappings. https://gerrit.fd.io/r/c/vpp/+/31174. [Last accessed on 2023-10-29].
  4. HTTP Caching. https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching. [Last accessed on 2023-10-29].
  5. Juniper Networks vSRX Virtual Firewall Datasheet. https://www.juniper.net/us/en/products/security/srx-series/vsrx-virtual-firewall-datasheet.html. [Last accessed on 2023-10-29].
  6. netElastic Systems Carrier Grade NAT (CGNAT). https://netelastic.com/products/carrier-grade-nat-cgnat/. [Last accessed on 2023-10-29].
  7. NFF-Go. https://github.com/aregm/nff-go. [Last accessed on 2023-10-29].
  8. Vector Packet Processing (VPP). https://github.com/FDio/vpp/tree/v21.01. [Last accessed on 2023-10-29].
  9. The Year of 100GbE in Data Center Networks. https://www.datacenterknowledge.com/networks/year-100gbe-data-center-networks. [Last accessed on 2023-10-29].
  10. Utpal Banerjee, Rudolf Eigenmann, Alexandra Nicolau, and David A. Padua. Automatic Program Parallelization. Proceedings of the IEEE, 81(2), 1993.
  11. Tom Barbette, Georgios P Katsikas, Gerald Q Maguire Jr, and Dejan Kostić. RSS++: Load and State-Aware Receive Side Scaling. In Intl. Conf. on Emerging Networking Experiments and Technologies (CoNEXT), 2019.
  12. Theophilus Benson, Aditya Akella, and David A. Maltz. Network Traffic Characteristics of Data Centers in the Wild. In ACM Internet Measurement Conf. (IMC), 2010.
  13. Bo Han, Vijay Gopalakrishnan, Lusheng Ji, and Seungjoon Lee. Network Function Virtualization: Challenges and Opportunities for Innovations. IEEE Communications Magazine, 53, 2015.
  14. Michael D. Bond, Katherine E. Coons, and Kathryn S. McKinley. PACER: Proportional Detection of Data Races. In Intl. Conf. on Programming Language Design and Implementation (PLDI), 2010.
  15. Kevin Borders, Jonathan Springer, and Matthew Burnside. Chimera: A Declarative Language for Streaming Network Traffic Analysis. In USENIX Security Symp., 2012.
  16. Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, and Marcos K Aguilera. Black-Box Concurrent Data Structures for NUMA Architectures. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
  17. LAN/MAN Standards Committee. IEEE Standard for Local and Metropolitan Area Network-Bridges and Bridged Networks. IEEE Std 802.1Q-2018 (Revision of IEEE Std 802.1Q-2014), 2018.
  18. Charlie Curtsinger and Emery D Berger. Coz: Finding Code that Counts with Causal Profiling. In ACM Symp. on Operating Systems Principles (SOSP), 2015.
  19. Arnaldo Carvalho de Melo. The New Linux Perf Tools. http://vger.kernel.org/~acme/perf/lk2010-perf-paper.pdf. [Last accessed on 2023-10-29].
  20. Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, and Sylvia Ratnasamy. RouteBricks: Exploiting Parallelism To Scale Software Routers. In ACM Symp. on Operating Systems Principles (SOSP), 2009.
  21. DPDK: Data Plane Development Kit. https://dpdk.org. [Last accessed on 2023-10-29].
  22. Daniel E. Eisenbud, Cheng Yi, Carlo Contavalli, Cody Smith, Roman Kononov, Eric Mann-Hielscher, Ardas Cilingiroglu, Bin Cheyney, Wentao Shang, and Jinnah Dylan Hosein. Maglev: A Fast and Reliable Software Network Load Balancer. In Symp. on Networked Systems Design and Implementation (NSDI), 2016.
  23. Paul Emmerich, Sebastian Gallenmüller, Daniel Raumer, Florian Wohlfart, and Georg Carle. MoonGen: A Scriptable High-Speed Packet Generator. In ACM Internet Measurement Conf. (IMC), 2015.
  24. Aaron Gember-Jacobson, Raajay Viswanathan, Chaithan Prakash, Robert Grandl, Junaid Khalid, Sourav Das, and Aditya Akella. OpenNF: Enabling Innovation in Network Function Control. ACM SIGCOMM Computer Communication Review, 44(4), 2014.
  25. Cary G. Gray and David R. Cheriton. Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency. In ACM Symp. on Operating Systems Principles (SOSP), 1989.
  26. Manish Gupta, Sayak Mukhopadhyay, and Navin Sinha. Automatic Parallelization of Recursive Procedures. Intl. Journal of Parallel Programming, 28, 2000.
  27. Sangjin Han, Keon Jang, Aurojit Panda, Shoumik Palkar, Dongsu Han, and Sylvia Ratnasamy. SoftNIC: A Software NIC to Augment Hardware. Technical Report UCB/EECS-2015-155, 2015.
  28. Maurice Herlihy and J. Eliot B. Moss. Transactional Memory: Architectural Support for Lock-Free Data Structures. In Intl. Symp. on Computer Architecture (ISCA), 1993.
  29. Evolved Packet Core (EPC) for Communications Service Providers. https://networkbuilders.intel.com/docs/networkbuilders/Evolved-packet-core-EPC-for-communications-service-providers-ra.pdf. [Last accessed on 2023-10-29].
  30. Muhammad Asim Jamshed, Jihyung Lee, Sangwoo Moon, Insu Yun, Deokjin Kim, Sungryoul Lee, Yung Yi, and KyoungSoo Park. Kargus: A Highly-Scalable Software-Based Intrusion Detection System. In ACM Conf. on Computer and Communications Security (CCS), 2012.
  31. Muhammad Asim Jamshed, YoungGyoun Moon, Donghwi Kim, Dongsu Han, and KyoungSoo Park. mOS: A Reusable Networking Stack for Flow Monitoring Middleboxes. In Symp. on Networked Systems Design and Implementation (NSDI), 2017.
  32. Cullen Jennings and Francois Audet. Network Address Translation (NAT) Behavioral Requirements for Unicast UDP. RFC 4787, Internet Engineering Task Force, 2007.
  33. Murad Kablan, Blake Caldwell, Richard Han, Hani Jamjoom, and Eric Keller. Stateless Network Functions. In ACM SIGCOMM Workshop on Hot Topics in Middleboxes and Network Function Virtualization, 2015.
  34. Charlie Kaufman, Paul Hoffman, Yoav Nir, Pasi Eronen, and Tero Kivinen. Internet Key Exchange Protocol Version 2 (IKEv2). RFC 7296, Internet Engineering Task Force, 2014.
  35. Junaid Khalid, Aaron Gember-Jacobson, Roney Michael, Anubhavnidhi Abhashkumar, and Aditya Akella. Paving the Way for NFV: Simplifying Middlebox Modifications Using StateAlyzr. In Symp. on Networked Systems Design and Implementation (NSDI), 2016.
  36. Jaeho Kim, Ajit Mathew, Sanidhya Kashyap, Madhava Krishnan Ramanathan, and Changwoo Min. MV-RLU: Scaling Read-Log-Update with Multi-Versioning. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019.
  37. Eddie Kohler, Robert Morris, Benjie Chen, John Jannotti, and M. Frans Kaashoek. The Click Modular Router. ACM Transactions on Computer Systems (TOCS), 18(3), 2000.
  38. Bohuslav Krena, Zdenek Letko, Rachel Tzoref, Shmuel Ur, and Tomás Vojnar. Healing Data Races On-the-Fly. In Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging (PADTAD), 2007.
  39. Zdenek Letko, Tomás Vojnar, and Bohuslav Krena. AtomRace: Data Race and Atomicity Violation Detector and Healer. In Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging, 2008.
  40. Guangpu Li, Dongjie Chen, Shan Lu, Madanlal Musuvathi, and Suman Nath. SherLock: Unsupervised Synchronization-Operation Inference. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021.
  41. Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from Mistakes - A Comprehensive Study on Real World Concurrency Bug Characteristics. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2008.
  42. Brandon Lucia, Joseph Devietti, Karin Strauss, and Luis Ceze. Atom-Aid: Detecting and Surviving Atomicity Violations. In Intl. Symp. on Computer Architecture (ISCA), 2008.
  43. Joao Martins, Mohamed Ahmed, Costin Raiciu, Vladimir Olteanu, Michio Honda, Roberto Bifulco, and Felipe Huici. ClickOS and the Art of Network Function Virtualization. In Symp. on Networked Systems Design and Implementation (NSDI), 2014.
  44. Paul E McKenney and John D Slingwine. Read-Copy Update: Using Execution History to Solve Concurrency Problems. In Parallel and Distributed Computing and Systems, 1998.
  45. Moonpol. https://github.com/erkinkirdan/moonpol. [Last accessed on 2023-10-29].
  46. Satish Narayanasamy, Zhenghao Wang, Jordan Tigani, Andrew Edwards, and Brad Calder. Automatically Classifying Benign and Harmful Data Races Using Replay Analysis. In Intl. Conf. on Programming Language Design and Implementation (PLDI), 2007.
  47. Aurojit Panda, Sangjin Han, Keon Jang, Melvin Walls, Sylvia Ratnasamy, and Scott Shenker. NetBricks: Taking the V out of NFV. In Symp. on Operating Systems Design and Implementation (OSDI), 2016.
  48. Francisco Pereira, Fernando M. V. Ramos, and Luis Pedrosa. Automatic Parallelization of Software Network Functions. In Symp. on Networked Systems Design and Implementation (NSDI), 2024.
  49. Shriram Rajagopalan, Dan Williams, Hani Jamjoom, and Andrew Warfield. Split/Merge: System Support for Elastic Execution in Virtual Middleboxes. In Symp. on Networked Systems Design and Implementation (NSDI), 2013.
  50. Introduction to Receive Side Scaling. https://docs.microsoft.com/en-us/windows-hardware/drivers/network/introduction-to-receive-side-scaling. [Last accessed on 2023-10-29].
  51. Stuart E Schechter, Jaeyeon Jung, and Arthur W Berger. Fast Detection of Scanning Worm Infections. In Recent Advances in Intrusion Detection, 2004.
  52. Tomer Shanny and Adam Morrison. Occualizer: Optimistic Concurrent Search Trees From Sequential Code. In Symp. on Operating Systems Design and Implementation (OSDI), 2022.
  53. Nir Shavit and Dan Touitou. Software Transactional Memory. In Symp. on Principles of Distributed Computing, 1995.
  54. Pyda Srisuresh and Kjeld B. Egevang. Traditional IP Network Address Translator. RFC 3022, Internet Engineering Task Force, 2001.
  55. Mohammad Mejbah ul Alam, Tongping Liu, Guangming Zeng, and Abdullah Muzahid. SyncPerf: Categorizing, Detecting, and Diagnosing Synchronization Performance Bugs. In ACM EuroSys European Conf. on Computer Systems (EUROSYS), 2017.
  56. Hans Vandierendonck, Sean Rul, and Koen De Bosschere. The Paralax Infrastructure: Automatic Parallelization with a Helping Hand. In Intl. Conf. on Parallel Architectures and Compilation Techniques, 2010.
  57. Kaushik Veeraraghavan, Peter M. Chen, Jason Flinn, and Satish Narayanasamy. Detecting and Surviving Data Races Using Complementary Schedules. In ACM Symp. on Operating Systems Principles (SOSP), 2011.
  58. Haris Volos, Andres Jaan Tack, Michael M. Swift, and Shan Lu. Applying Transactional Memory to Concurrency Bugs. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2012.
  59. The Vector Packet Processing (VPP) Platform. https://wiki.fd.io/view/VPP/What_is_VPP%3f. [Last accessed on 2023-10-29].
  60. Intel VTune Performance Analyzer. https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html. [Last accessed on 2023-10-29].
  61. Shinae Woo, Justine Sherry, Sangjin Han, Sue Moon, Sylvia Ratnasamy, and Scott Shenker. Elastic Scaling of Stateful Network Functions. In Symp. on Networked Systems Design and Implementation (NSDI), 2018.
  62. Zhengming Yi, Yiping Yao, and Kai Chen. A Universal Construction to Implement Concurrent Data Structure for NUMA-Muticore. In Intl. Conf. on Parallel Processing, 2021.
  63. Tingting Yu and Michael Pradel. SyncProf: Detecting, Localizing, and Optimizing Synchronization Bottlenecks. In Intl. Symp. on Software Testing and Analysis (ISSTA), 2016.
  64. Arseniy Zaostrovnykh, Solal Pirelli, Rishabh R. Iyer, Matteo Rizzo, Luis Pedrosa, Katerina J. Argyraki, and George Candea. Verifying Software Network Functions with No Verification Expertise. In ACM Symp. on Operating Systems Principles (SOSP), 2019.
  65. Minjia Zhang, Jipeng Huang, Man Cao, and Michael D Bond. Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics. In Symp. on Principles and Practice of Parallel Computing (PPoPP), 2015.
  66. Zhipeng Zhao, Hugo Sadok, Nirav Atre, James C Hoe, Vyas Sekar, and Justine Sherry. Achieving 100Gbps Intrusion Prevention on a Single Server. In Symp. on Operating Systems Design and Implementation (OSDI), 2020.

Published in

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
April 2024
1245 pages
ISBN: 9798400704376
DOI: 10.1145/3627703

Copyright © 2024 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 241 of 1,308 submissions, 18%