Abstract
Computer network protocols define the rules by which two entities communicate over a network of unique hosts. Many protocol specifications are unknown, unavailable, or minimally documented, which prevents thorough analysis of those protocols for security purposes. For example, modern botnets often use undocumented and unique application-layer communication protocols to maintain command and control over numerous distributed hosts. Inferring the specification of a closed protocol has numerous advantages, such as enabling intelligent deep packet inspection, enhancing intrusion detection system algorithms for communications, and supporting integration with legacy software packages. The multitude of closed protocols, coupled with the time-intensive nature of existing reverse engineering methodologies, has spawned investigation into automated approaches for reverse engineering closed protocols. This article summarizes previously presented automatic protocol reverse engineering tools and organizes them by approach. Approaches that focus on reverse engineering the finite state machine of a target protocol are separated from those that focus on reverse engineering the protocol format.
Index Terms
- A Survey of Automatic Protocol Reverse Engineering Tools