ABSTRACT
Modern deep packet inspection systems use regular expressions to define various patterns of interest in network data streams. Deterministic Finite Automata (DFA) are commonly used to parse regular expressions. DFAs are fast, but can require prohibitively large amounts of memory for patterns arising in network applications. Traditional DFA table compression only slightly reduces the memory required and requires an additional memory access per input character. Alternative representations of regular expressions, such as NFAs and Delayed Input DFAs (D2FA) require less memory but sacrifice throughput. In this paper we introduce the Content Addressed Delayed Input DFA (CD2FA), which provides a compact representation of regular expressions that match the throughput of traditional uncompressed DFAs. A CD2FA addresses successive states of a D2FA using their content, rather than a "content-less" identifier. This makes selected information available earlier in the state traversal process, which makes it possible to avoid unnecessary memory accesses. We demonstrate that such content-addressing can be effectively used to obtain automata that are very compact and can achieve high throughput. Specifically, we show that for an application using thousands of patterns defined by regular expressions, CD2FAs use as little as 10% of the space required by a conventional compressed DFA, and match the throughput of an uncompressed DFA.
- R. Sommer, V. Paxson, "Enhancing Byte-Level Network Intrusion Detection Signatures with Context," ACM conf. on Computer and Communication Security, 2003, pp. 262--271. Google ScholarDigital Library
- J. E. Hopcroft and J. D. Ullman, "Introduction to Automata Theory, Languages, and Computation," Addison Wesley, 1979. Google ScholarDigital Library
- S. Kumar et al, "Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection", in ACM SIGCOMM'06, Pisa, Italy, September 12-15, 2006. Google ScholarDigital Library
- Bro Intrusion detection system, Vern Paxson, http://www.icir.org/vern/bro-info.htmlGoogle Scholar
- M. Roesch, "Snort: Lightweight intrusion detection for networks," Systems Administration Conference (LISA), November 1999. Google ScholarDigital Library
- S. Antonatos, et al, "Generating realistic workloads for network intrusion detection systems," In ACM Workshop on Software and Performance, 2004. Google ScholarDigital Library
- A. V. Aho, M. J. Corasick, "Efficient string matching: An aid to bibliographic search," Comm. of ACM, 18(6):333--340, 1975. Google ScholarDigital Library
- B. Commentz-Walter, "A string matching algorithm fast on the average," Proceedings of ICALP, pages 118--132, July 1979. Google ScholarDigital Library
- S. Wu, U. Manber," A fast algorithm for multi-pattern searching," Tech. R. TR-94-17, Univ of Arizona, 1994.Google Scholar
- Fang Yu, et al., "Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection", UCB tech. report, EECS-2005-8.Google Scholar
- N. Tuck, T. Sherwood, B. Calder, and G. Varghese, "Deterministic memory-efficient string matching algorithms for intrusion detection," IEEE Infocom 2004, pages 333--340.Google Scholar
- L. Tan, and T. Sherwood, "A High Throughput String Matching Architecture for Intrusion Detection and Prevention," ISCA'05. Google ScholarDigital Library
- I. Sourdis et al, "Pre-decoded CAMs for Efficient and High-Speed NIDS Pattern Matching," FCCM, 2004, pp. 258--267. Google ScholarDigital Library
- S. Yusuf and W. Luk, "Bitwise Optimised CAM for Network Intrusion Detection Systems," IEEE FPL 2005.Google Scholar
- R. Sidhu and V. K. Prasanna, "Fast regular expression matching using FPGAs," IEEE FCCM, Rohnert Park, CA, April 2001. Google ScholarDigital Library
- C. R. Clark and D. E. Schimmel, "Efficient reconfigurable logic circuit for matching complex network intrusion detection patterns," In 13th FCCM conference.Google Scholar
- J. Moscola, et al, "Implementation of a content-scanning module for an internet firewall," IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA, USA, April 2003. Google ScholarDigital Library
- Phillip Hall, "On representatives of subsets," J. London Math. Soc.,vol. 10, pp. 26--30, 1936.Google Scholar
- Will Eatherton, John Williams, "An encoded version of reg-ex database from cisco systems provided for research purposes".Google Scholar
- TippingPoint X505, www.tippingpoint.com/products_ips.htmlGoogle Scholar
- Cisco IOS IPS Deployment Guide, www.cisco.comGoogle Scholar
- Tarari RegEx, www. tarari.com/PDF/RegEx_FACT_SHEET.pdfGoogle Scholar
- R. Motwani, "Average-case analysis of algorithms for matching and related problems," J. of the ACM, 41:1329--1356, 1994. Google ScholarDigital Library
- J. B. Kruskal, "On the shortest spanning subtree of a graph and the traveling salesman problem," Proc. of the American Mathematical Society, 7:48--50, 1956.Google ScholarCross Ref
- N.J. Larsson, "Structures of string matching and data compression," PhD thesis, Lund University, 1999.Google Scholar
- S. Dharmapurikar, et al, "Deep Packet Inspection using Parallel Bloom Filters," IEEE Hot Interconnects 12, August 2003. Google ScholarDigital Library
- Z. K. Baker, V. K. Prasanna, "Automatic Synthesis of Efficient Intrusion Detection Systems on FPGAs," in Field Programmable Logic and Applications, Aug. 2004, pp. 311--321.Google ScholarCross Ref
- Y. H. Cho, W. H. Mangione-Smith, "Deep Packet Filter with Dedicated Logic and Read Only Memories," Field Programmable Logic and Applications, Aug. 2004, pp. 125--134. Google ScholarDigital Library
- M. Gokhale, et al., "Granidt: Towards Gigabit Rate Network Intrusion Detection Technology," Field Programmable Logic and Applications, Sept. 2002, pp. 404--413. Google ScholarDigital Library
- J. Levandoski, E. Sommer, and M. Strait, "Application Layer Packet Classifier for Linux". http://l7-filter.sourceforge.net/.Google Scholar
- M. Hill and J. Elder, "DineroIV tracedriven uniprocessor cache simulator," http://www.cs.wisc.edu/markhill/DineroIV, 1998.Google Scholar
- SafeXcel Content Inspection Engine, regex acceleration IP.Google Scholar
- Network Services Processor, OCTEON CN30XX Family.Google Scholar
- D. Fotakis, et. al, "Space efficient hash tables with worst case constant access time," In STACS, 2003. Google ScholarDigital Library
Index Terms
- Advanced algorithms for fast and scalable deep packet inspection
Recommendations
Algorithms to accelerate multiple regular expressions matching for deep packet inspection
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communicationsThere is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high performance systems that perform deep packet inspection implement ...
A hybrid finite automaton for practical deep packet inspection
CoNEXT '07: Proceedings of the 2007 ACM CoNEXT conferenceDeterministic finite automata (DFAs) are widely used to perform regular expression matching in linear time. Several techniques have been proposed to compress DFAs in order to reduce memory requirements. Unfortunately, many real-world IDS regular ...
Algorithms to accelerate multiple regular expressions matching for deep packet inspection
SIGCOMM '06: Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communicationsThere is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high performance systems that perform deep packet inspection implement ...
Comments