research-article

Public Access

Automata Processing in Reconfigurable Architectures: In-the-Cloud Deployment, Cross-Platform Evaluation, and Fast Symbol-Only Reconfiguration

Authors:
Chunkun Bo

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

,
Vinh Dang

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

,
Ted Xie

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

,
Jack Wadden

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

,
Mircea Stan

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

,
Kevin Skadron

University of Virginia, VA, USA

University of Virginia, VA, USA
View Profile

ACM Transactions on Reconfigurable Technology and Systems Volume 12 Issue 2Article No.: 9pp 1–25https://doi.org/10.1145/3314576

Published:17 May 2019Publication History

ACM Transactions on Reconfigurable Technology and Systems

Abstract

We present a general automata processing framework on FPGAs, which generates an RTL kernel for automata processing together with an AXI and PCIe based I/O circuitry. We implement the framework on both local nodes and cloud platforms (Amazon AWS and Nimbix) with novel features. A full performance comparison of the proposed framework is conducted against state-of-the-art automata processing engines on CPUs, GPUs, and Micron’s Automata Processor using the ANMLZoo benchmark suite and some real-world datasets. Results show that FPGAs enable extremely high-throughput automata processing compared to von Neumann architectures. We also collect the resource utilization and power consumption on the two cloud platforms, and find that the I/O circuitry consumes most of the hardware resources and power. Furthermore, we propose a fast, symbol-only reconfiguration mechanism based on the framework for large pattern sets that cannot fit on a single device and need to be partitioned. The proposed method supports multiple passes of the input stream and reduces the re-compilation cost from hours to seconds.

References

Kevin Angstadt, Jack Wadden, Vinh Dang, Ted Xie, Dan Kramp, Westley Weimer, Mircea Stan, and Kevin Skadron. 2018. MNCaRT: An open-source, multi-architecture automata-processing research and execution ecosystem. IEEE Computer Architecture Letters 17, 1 (2018), 84--87. Google ScholarDigital Library
Kevin Angstadt, Westley Weimer, and Kevin Skadron. 2016. RAPID programming of pattern-recognition processors. ACM SIGOPS Operating Systems Review 50, 2 (2016), 593--605. Google ScholarDigital Library
AWS. 2017. Amazon EC2 F1 Instances. Retrieved October 2017 from https://aws.amazon.com/ec2/instance-types/f1/.Google Scholar
AWS-FPGA. 2017. Official Repository of the AWS EC2 FPGA Hardware and Software Development Kit. Retrieved October 2017 from https://github.com/aws/aws-fpga.Google Scholar
Michela Becci and Patrick Crowley. 2007. A hybrid finite automaton for practical deep packet inspection. In ACM CoNEXT Conference. Google ScholarDigital Library
Chunkun Bo. 2018a. Identify gRNA Off-Targets for CRISPR/Cas9 using Automata Processing. Retrieved February 2018a from https://github.com/chunkunbo/CRISPR.Google Scholar
Chunkun Bo. 2018b. REAPR on F1. Retrieved January 2018b from https://github.com/chunkunbo/REARP-on-Amazon-F1.Google Scholar
Chunkun Bo, Dang Vinh, Elaheh Sadredini, and Kevin Skadron. 2018. Searching for potential gRNA off-target sites for CRISPR/Cas9 using automata processing across different platforms. In Proceedings of IEEE International Symposium on High Performance Computer Architecture (HPCA’18).Google ScholarCross Ref
Chunkun Bo, Ke Wang, Jefferey J. Fox, and Kevin Skadron. 2015a. Entity resolution acceleration using Micron’s automata processor. In Proceedings of Architectures and Systems for Big Data (ASBD’15), in conjunction with ISCA (2015).Google Scholar
Chunkun Bo, Ke Wang, Jefferey J. Fox, and Kevin Skadron. 2016. Entity resolution acceleration using the automata processor. In Proceedings of IEEE International Conference on Big Data (Big Data’16).Google ScholarCross Ref
Chunkun Bo, Ke Wang, Yanjun Qi, and Kevin Skadron. 2015b. String kernel testing acceleration using the Micron automata processor. In Proceedings of Workshop on Computer Architecture for Machine Learning.Google Scholar
Pascal Caron and Djelloul Ziadi. 2000. Characterization of Glushkov automata. Theoretical Computer Science 233, 1--2 (2000), 75--90. Google ScholarDigital Library
Niccolo Cascarano, Pierluigi Rolando, Fulvio Risso, and Riccardo Sisto. 2010. iNFAnt: NFA pattern matching on GPGPU devices. ACM SIGCOMM Computer Communication Review 40, 5 (2010), 20--26. Google ScholarDigital Library
Vinh Dang. 2017a. A Deterministic Finite Automata GPU-Based Engine. Retrieved May 2017a from https://github.com/vqd8a/DFAGE.Google Scholar
Vinh Dang. 2017b. iNFAnt2. Retrieved May 2017b from https://github.com/vqd8a/iNFAnt2.Google Scholar
Paul Dlugosch, Dave Brown, Paul Glendenning, Michael Leventhal, and Harold Noyes. 2014. An efficient and scalable semiconductor architecture for parallel automata processing. In Proceedings of IEEE Transactions on Parallel and Distributed Systems.Google ScholarCross Ref
Yuanwei Fang, Tung T. Hoang, Michela Becci, and Andrew A. Chien. 2015. HARE: Hardware accelerator for regular expressions. In Proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’15). Google ScholarDigital Library
Vaibhav Gogte, Aasheesh Kolli, Michael J. Cafarella, Loris D’Antoni, and Thomas F. Wenisch. 2016. HARE: Hardware accelerator for regular expressions. In Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’16). Google ScholarDigital Library
Tran Trung Hieu and Ngoc Thinh Tran. 2013. A memory efficient FPGA-based pattern matching engine for stateful NIDS. In Proceedings of the 5th Annual IEEE Conference Ubiquitous and Future Networks (ICUFN’13).Google Scholar
Xilinx Inc. 2018. SDAccel Development Environment. Retrieved January 2018 from https://www.xilinx.com/products/design-tools/softwarezone/sdaccel.html.Google Scholar
Intel. 2018. High-Performance Regular Expression Matching Library. Retrieved January 2018 from https://github.com/intel/hyperscan.Google Scholar
Rasha Karakchi, Lothrop O. Richards, and Jason D. Bakos. 2017. A dynamically reconfigurable automata processor overlay. In Proceedings of International Conference on ReConFigurable Computing and FPGAs (ReConFig’17).Google Scholar
Micron. 2015. ANML Documentation. Retrieved August 2015 from http://www.micronautomata.com/documentation/anml_documentation/c_intro.html.Google Scholar
Roger Moussalli, Mariam Salloum, Robert Halstead, Walid Najjar, and Vassilis J. Tsotras. 2014. A study on parallelizing XML path filtering using accelerators. ACM Transactions on Embedded Computing Systems (TECS) 13, 4 (2014), 93. Google ScholarDigital Library
Nimbix. 2017. Xilinx FPGAs on the Nimbix Cloud. Retrieved October 2017 from https://www.nimbix.net/xilinx/.Google Scholar
Marziyeh Nourian, Xiang Wang, Xiaodong Yu, Wu chun Feng, and Michela Becchi. 2017. Demystifying automata processing: GPUs, FPGAs or Micron’s AP? In Proceedings of the International Conference on Supercomputing. ACM, 1. Google ScholarDigital Library
Elaheh Sadredini, Deyuan Guo, Chunkun Bo, Reza Rahimi, and Kevin Skadron. 2018. A scalable solution for rule-based part-of-speech tagging on novel hardware accelerators. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery 8 Data Mining (KDD’18). Google ScholarDigital Library
Reetinder Sidhu and Viktor Prosanna. 2001. Fast regular expression matching using FPGAs. In Proceedings of the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’01). Google ScholarDigital Library
Jens Teubner, Louis Woods, and Chongling Nie. 2012. Skeleton automata for FPGAs: Reconfiguring without reconstructing. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 229--240. Google ScholarDigital Library
Jack Wadden, Kevin Angstadt, and Kevin Skadron. 2018a. Characterizing and mitigating output reporting bottlenecks in spatial automata processing architectures. In Proceedings of IEEE International Symposium on High Performance Computer Architecture (HPCA’18).Google ScholarCross Ref
Jack Wadden, Tommy Tracy II, Elaheh Sadredini, Lingxi Wu, Chunkun Bo, Jesse Du, Yizhou Wei, Matthew Wallace, Jeffrey Udall, Mircea Stan, and Kevin Skadron. 2018b. AutomataZoo: A modern automata processing benchmark suite. In Proceedings of IEEE International Symposium on Workload Characterization (IISWC’18).Google ScholarCross Ref
Jack Wadden and Kevin Skadron. 2016. VASim: An Open Virtual Automata Simulator for Automata Processing Application and Architecture Research. Technical Report CS2016--03, University of Virginia.Google Scholar
Jack Wadden, Dang Vinh, Nathan Brunelle, Tommy Tracy II, Deyuan Guo, Elaheh Sadredini, Ke Wang, Chunkun Bo, Robins Gabriel, Mircea Stan, and Kevin Skadron. 2017. ANMLzoo: A benchmark suite for exploring bottlenecks in automata processing engines and architectures. In Proceedings of IEEE International Symposium on Workload Characterization (IISWC’17).Google Scholar
Ke Wang, Kevin Angstadt, Chunkun Bo, Nathan Brunelle, Elaheh Sadredini, Tommy Tracy, Jack Wadden, Mircea Stan, and Kevin Skadron. 2016a. An overview of Micron’s automata processor. In Proceedings of Workshop on Hardware/Software Codesign and System Synthesis (CODES+ ISSS’16). Google ScholarDigital Library
Ke Wang, Yanjun Qi, Jefferey J. Fox, Mircea R. Stan, and Kevin Skadron. 2015. Association rule mining with the Micron automata processor. In Proceedings of IEEE International Conference on Parallel and Distributed Processing Symposium (IPDPS’15). Google ScholarDigital Library
Ke Wang, Elaheh Sadredini, and Kevin Skadron. 2016b. Sequential pattern mining with the Micron automata processor. In Proceedings of the ACM International Conference on Computing Frontiers. ACM, 135--144. Google ScholarDigital Library
Ted Xie, Vinh Dang, Chunkun Bo, Jack Wadden, Kevin Skadron, and Mircea Stan. 2018. REAPR: Reconfigurable Engine for Automata Processing. Retrieved August 2018 from https://github.com/ted-xie/REAPR.Google Scholar
Ted Xie, Dang Vinh, Jack Wadden, Mircea Stan, and Kevin Skadron. 2017. REAPR: Reconfigurable engine for automata processing. In Proceedings of IEEE International Conference on Field Programmable Logic and Applications (FPL’17).Google ScholarCross Ref
Yi-Hua Yang and Viktor Prasanna. 2012. High-performance and compact architecture for regular expression matching on FPGA. IEEE Transactions on Computers 61, 7 (2012), 1013--1025. Google ScholarDigital Library
Xiaodong Yu, Kaixi Hou, Hao Wang, and Wu chun Feng. 2017. Robotomata: A framework for approximate pattern matching of big data on an automata processor. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data’17). IEEE, 283--292.Google ScholarCross Ref
Keira Zhou, Jeffrey J. Fox, Ke Wang, Donald E. Brown, and Kevin Skadron. 2015. Brill tagging on the Micron automata processor. In Proceedings of IEEE International Conference on Semantic Computing (ICSC’15).Google ScholarCross Ref

Index Terms

Automata Processing in Reconfigurable Architectures: In-the-Cloud Deployment, Cross-Platform Evaluation, and Fast Symbol-Only Reconfiguration
1. Computer systems organization
  1. Architectures

Recommendations

Demystifying automata processing: GPUs, FPGAs or Micron's AP?
ICS '17: Proceedings of the International Conference on Supercomputing

Many established and emerging applications perform at their core some form of pattern matching, a computation that maps naturally onto finite automata abstractions. As a consequence, in recent years there has been a substantial amount of work on high-...
Read More
Partitioning signal processing applications to different granularity reconfigurable logic
SSIP'05: Proceedings of the 5th WSEAS international conference on Signal, speech and image processing

In this paper, we propose a methodology for partitioning DSP applications between the fine and coarse-grain reconfigurable hardware for improving performance. The fine-grain logic is implemented by an embedded FPGA unit, while for the coarse-grain ...
Read More
A method for partitioning applications in hybrid reconfigurable architectures

In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are speeded-up on the coarse-grain reconfigurable hardware for meeting the ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Reconfigurable Technology and Systems Volume 12, Issue 2
June 2019
117 pages
ISSN:1936-7406
EISSN:1936-7414
DOI:10.1145/3322884
Editor:
Deming Chen
University of Illinois, Urbana-Champaign Urbana
Issue’s Table of Contents
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 May 2019
- Accepted: 1 February 2019
- Revised: 1 December 2018
- Received: 1 July 2018
Published in trets Volume 12, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Automata processing
FPGA
acceleration
cloud-computing
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 8
  Total Citations
  View Citations
- 703
  Total Downloads
- Downloads (Last 12 months)91
- Downloads (Last 6 weeks)9
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

Automata Processing in Reconfigurable Architectures: In-the-Cloud Deployment, Cross-Platform Evaluation, and Fast Symbol-Only Reconfiguration

ACM Transactions on Reconfigurable Technology and Systems

Abstract

References

Cited By

Index Terms

Recommendations

Demystifying automata processing: GPUs, FPGAs or Micron's AP?

Partitioning signal processing applications to different granularity reconfigurable logic

A method for partitioning applications in hybrid reconfigurable architectures

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

HTML Format

Caption

Automata Processing in Reconfigurable Architectures: In-the-Cloud Deployment, Cross-Platform Evaluation, and Fast Symbol-Only Reconfiguration

ACM Transactions on Reconfigurable Technology and Systems

Abstract

References

Cited By

Index Terms

Recommendations

Demystifying automata processing: GPUs, FPGAs or Micron's AP?

Partitioning signal processing applications to different granularity reconfigurable logic

A method for partitioning applications in hybrid reconfigurable architectures

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

HTML Format

Share this Publication link

Share on Social Media