research-article
Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning

Published: 25 July 2016

Abstract

An important component of a cyber-defense mechanism is an adequately staffed cybersecurity analyst workforce and the optimal assignment of those analysts to sensors for investigating the dynamic alert traffic. The ever-increasing cybersecurity threats faced by today’s digital systems require a strong cyber-defense mechanism that is both reactive, in mitigating known risks, and proactive, in being prepared to handle unknown risks. To be proactive, the workforce must be scheduled dynamically so that the system adapts to the day-to-day stochastic demands on its workforce (both size and expertise mix). These demands stem from variation in the alert generation rate and in the rate at which alerts are significant, which creates uncertainty for the scheduler that must schedule analysts for work and allocate sensors to analysts. Sensor data are analyzed by automatic processing systems, and alerts are generated. A portion of these alerts is categorized as significant and requires thorough examination by a cybersecurity analyst. Risk, in this article, is defined as the percentage of significant alerts that are not thoroughly analyzed by analysts. To minimize risk, the cyber-defense system must accurately estimate the future significant alert generation rate and dynamically schedule its workforce to meet the stochastic workload demand. The article presents a reinforcement learning-based stochastic dynamic programming optimization model that incorporates these estimates of future alert rates and responds by dynamically scheduling cybersecurity analysts to minimize risk (i.e., maximize significant alert coverage by analysts) and to maintain the risk under a pre-determined upper bound.
The article tests the dynamic optimization model and compares its results with those of an integer programming model that optimizes static staffing needs based on a daily-average alert generation rate, with no estimation of future alert rates (the static workforce model). Results indicate that over a finite planning horizon, the learning-based optimization model, by using a dynamic (on-call) workforce in addition to the static workforce, (a) balances risk between days and reduces overall risk better than the static model, (b) is scalable and can identify the quantity and the right mix of analyst expertise in an organization, and (c) can determine the analysts' dynamic (on-call) schedule and the sensor-to-analyst allocation needed to maintain risk below a given upper bound. Several meta-principles derived from the optimization model are presented; they serve as guiding principles for hiring and scheduling cybersecurity analysts. Days-off scheduling was performed to determine analyst weekly work schedules that met the cybersecurity system’s workforce constraints and requirements.
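
The risk definition used in the abstract (the percentage of significant alerts that are not thoroughly analyzed) can be illustrated with a short calculation. The sketch below is not the article's optimization model; it only evaluates the risk metric and checks it against an upper bound. The function names `risk_percent` and `days_over_bound`, the alert counts, and the 10% bound are all hypothetical values chosen for illustration.

```python
def risk_percent(significant_alerts: int, analyzed_alerts: int) -> float:
    """Percentage of significant alerts left without thorough analysis."""
    if significant_alerts < 0 or analyzed_alerts < 0:
        raise ValueError("alert counts must be non-negative")
    if significant_alerts == 0:
        # Assumption for this sketch: no significant alerts means zero risk.
        return 0.0
    unanalyzed = max(significant_alerts - analyzed_alerts, 0)
    return 100.0 * unanalyzed / significant_alerts


def days_over_bound(daily_counts, upper_bound):
    """Return the indices of days whose risk exceeds upper_bound (in percent)."""
    return [
        day
        for day, (significant, analyzed) in enumerate(daily_counts)
        if risk_percent(significant, analyzed) > upper_bound
    ]


# Hypothetical (significant, analyzed) alert counts for three days.
days = [(200, 190), (350, 280), (150, 150)]
print([risk_percent(s, a) for s, a in days])
print(days_over_bound(days, upper_bound=10.0))
```

In this made-up data, only the second day exceeds the bound. A day like that is the kind of case where, per the abstract, a dynamic (on-call) workforce would be added to the static one.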

Published in

ACM Transactions on Intelligent Systems and Technology, Volume 8, Issue 1
January 2017
363 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2973184
Editor: Yu Zheng

Copyright © 2016 ACM

© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 25 July 2016
• Accepted: 1 January 2016
• Revised: 1 November 2015
• Received: 1 September 2015

Qualifiers

• research-article
• Research
• Refereed