Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning

Authors:
Rajesh Ganesan, George Mason University, Fairfax, VA
Sushil Jajodia, George Mason University, Fairfax, VA
Ankit Shah, George Mason University, Fairfax, VA
Hasan Cam, Army Research Laboratory, Adelphi, Maryland

ACM Transactions on Intelligent Systems and Technology, Volume 8, Issue 1, Article No. 4, pp. 1–21
https://doi.org/10.1145/2882969

Published: 25 July 2016

Abstract

An important component of a cyber-defense mechanism is an adequately staffed cybersecurity analyst workforce that is optimally assigned to sensors to investigate dynamic alert traffic. The ever-increasing cybersecurity threats faced by today’s digital systems require a strong cyber-defense mechanism that is both reactive, mitigating known risks, and proactive, preparing for unknown risks. To be proactive, the workforce must be scheduled dynamically so that the system adapts to the day-to-day stochastic demands on its workforce (both size and expertise mix). These demands stem from variation in the alert generation rate and in the fraction of alerts that are significant, which creates uncertainty for the scheduler that must schedule analysts for work and allocate sensors to them. Sensor data are analyzed by automatic processing systems, which generate alerts. A portion of these alerts is categorized as significant and requires thorough examination by a cybersecurity analyst. Risk, in this article, is defined as the percentage of significant alerts that are not thoroughly analyzed by analysts. To minimize risk, the cyber-defense system must accurately estimate the future significant alert generation rate and dynamically schedule its workforce to meet the resulting stochastic workload. The article presents a reinforcement learning-based stochastic dynamic programming optimization model that incorporates these estimates of future alert rates and responds by dynamically scheduling cybersecurity analysts to minimize risk (i.e., maximize significant alert coverage by analysts) and keep risk under a pre-determined upper bound.
The article tests the dynamic optimization model and compares its results to those of an integer programming model that optimizes static staffing needs based on a daily-average alert generation rate, with no estimation of future alert rates (static workforce model). Results indicate that over a finite planning horizon, the learning-based optimization model, which uses a dynamic (on-call) workforce in addition to the static workforce, (a) balances risk between days and reduces overall risk better than the static model, (b) is scalable and can identify the quantity and the right mix of analyst expertise in an organization, and (c) can determine the dynamic (on-call) schedule and the sensor-to-analyst allocation needed to maintain risk below a given upper bound. Several meta-principles derived from the optimization model are presented; they serve as guiding principles for hiring and scheduling cybersecurity analysts. Days-off scheduling was performed to determine analyst weekly work schedules that met the cybersecurity system’s workforce constraints and requirements.
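
As a rough illustration of the risk definition above, the following Python sketch computes risk for a single day and the smallest on-call top-up that keeps risk under a bound. It is not the article's reinforcement learning model. The function names, the fixed per-analyst daily throughput, and all numbers are assumptions made for this example only.

```python
def risk_percent(significant_alerts: int, analyzed: int) -> float:
    """Percentage of significant alerts that were not thoroughly analyzed."""
    if significant_alerts <= 0:
        return 0.0
    covered = min(analyzed, significant_alerts)
    return 100.0 * (significant_alerts - covered) / significant_alerts


def on_call_needed(significant_alerts: int, static_analysts: int,
                   alerts_per_analyst: int, max_risk_percent: int) -> int:
    """Smallest on-call headcount that keeps risk at or below the bound.

    Assumes every analyst handles the same number of alerts per day
    (hypothetical simplification; the article also models expertise mix).
    """
    # Integer arithmetic avoids floating-point rounding near the bound.
    allowed_missed = (significant_alerts * max_risk_percent) // 100
    required = significant_alerts - allowed_missed
    shortfall = required - static_analysts * alerts_per_analyst
    if shortfall <= 0:
        return 0
    return -(-shortfall // alerts_per_analyst)  # ceiling division


# Hypothetical week fragment: the static staff is sized for the average day.
daily_significant = [900, 1000, 1300]
STATIC_ANALYSTS = 10
ALERTS_PER_ANALYST = 100
MAX_RISK_PERCENT = 5

for alerts in daily_significant:
    capacity = STATIC_ANALYSTS * ALERTS_PER_ANALYST
    static_risk = risk_percent(alerts, capacity)
    extra = on_call_needed(alerts, STATIC_ANALYSTS,
                           ALERTS_PER_ANALYST, MAX_RISK_PERCENT)
    print(f"alerts={alerts} static_risk={static_risk:.2f}% on_call={extra}")
```

In this toy setting the static workforce covers the average day but leaves a high-alert day well above the bound, which is the gap the on-call workforce is meant to close. The article's model additionally estimates future alert rates and learns the scheduling policy, neither of which is attempted here.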

Supplemental Material

ganesan.zip (204.7 KB): Supplemental movie, appendix, image, and software files for Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning

Published in
ACM Transactions on Intelligent Systems and Technology Volume 8, Issue 1
January 2017
363 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2973184
Editor: Yu Zheng, Microsoft Research, China
Copyright © 2016 ACM
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 July 2016
- Accepted: 1 January 2016
- Revised: 1 November 2015
- Received: 1 September 2015
Author Tags
Cybersecurity analysts
dynamic scheduling
genetic algorithm
integer programming
optimization
reinforcement learning
resource allocation
risk mitigation
Qualifiers
- research-article
- Research
- Refereed