Published in: Automatic Control and Computer Sciences 8/2022

01-12-2022

Using Information about Influencing Factors to Split Data Samples in Machine Learning Methods for the Purposes of Assessing Information Security

Authors: I. S. Lebedev, M. E. Sukhoparov

Abstract

Improving the quality of determining the information security states of individual segments of cyber-physical systems involves processing large data arrays. This article proposes a method of splitting data samples to improve the quality of algorithms that classify information security states. Classification models are trained on sets of examples that may contain outliers, noisy data, and imbalances among the observed objects, all of which affect the quality indicators of the results. At certain points in time, the influence of the external environment may change the frequency of observed events or the ranges of logged values, which significantly affects the quality indicators. The article shows that a number of events in the samples are caused by the influence of internal and external factors.
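
The abstract does not spell out the splitting procedure, so the following is only a minimal sketch of the general idea under stated assumptions: each observation carries a known influencing-factor value, the sample is split by that value, and a separate classifier is fitted to each subsample. The `NearestCentroid` stand-in classifier, the helper names `fit_per_factor` and `predict_with_factor`, the "day"/"night" factor, and the toy data are all hypothetical and are not taken from the article.

```python
from collections import defaultdict
from statistics import mean


class NearestCentroid:
    """Stand-in classifier (assumption): predicts the class with the closest feature mean."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for features, label in zip(X, y):
            by_class[label].append(features)
        # One centroid (per-feature mean) for each class label.
        self.centroids = {
            label: [mean(column) for column in zip(*rows)]
            for label, rows in by_class.items()
        }
        return self

    def predict_one(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def fit_per_factor(X, y, factors):
    """Split the sample by factor value and fit one model per subsample."""
    groups = defaultdict(lambda: ([], []))
    for features, label, factor in zip(X, y, factors):
        groups[factor][0].append(features)
        groups[factor][1].append(label)
    return {factor: NearestCentroid().fit(gx, gy) for factor, (gx, gy) in groups.items()}


def predict_with_factor(models, features, factor):
    """Route an observation to the model trained for its factor value."""
    return models[factor].predict_one(features)


# Hypothetical toy data: one logged value whose normal range shifts
# between "day" and "night" (the assumed external factor).
X = [[10.0], [12.0], [30.0], [32.0], [1.0], [2.0], [10.0], [12.0]]
y = ["normal", "normal", "attack", "attack", "normal", "normal", "attack", "attack"]
factors = ["day", "day", "day", "day", "night", "night", "night", "night"]

models = fit_per_factor(X, y, factors)
single_model = NearestCentroid().fit(X, y)  # baseline trained on the unsplit sample

print(predict_with_factor(models, [11.0], "day"))
print(predict_with_factor(models, [11.0], "night"))
print(single_model.predict_one([11.0]))
```

In this toy setup the same logged value is labeled differently depending on the factor, which a single model trained on the unsplit sample cannot express. Whether such a split helps on real data depends on how strongly the factor actually shifts event frequencies or value ranges.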
Metadata
Title
Using Information about Influencing Factors to Split Data Samples in Machine Learning Methods for the Purposes of Assessing Information Security
Authors
I. S. Lebedev
M. E. Sukhoparov
Publication date
01-12-2022
Publisher
Pleiades Publishing
Published in
Automatic Control and Computer Sciences / Issue 8/2022
Print ISSN: 0146-4116
Electronic ISSN: 1558-108X
DOI
https://doi.org/10.3103/S0146411622080119
