Published in: International Journal of Machine Learning and Cybernetics 6/2017

27.06.2016 | Original Article

Toward an efficient fuzziness based instance selection methodology for intrusion detection system

Authors: Rana Aamir Raza Ashfaq, Yu-lin He, De-gang Chen



Abstract

Building a high-quality classifier is one of the key problems in machine learning (ML) and pattern recognition. Many ML algorithms incur a high computational cost on large-scale data sets. This paper proposes a fuzziness-based instance selection technique for large data sets that increases the efficiency of supervised learning algorithms and addresses shortcomings in designing an effective intrusion detection system (IDS). The proposed methodology relies on a kind of single layer feed-forward neural network (SLFN) called the random weight neural network (RWNN). First, a membership vector for every training instance is obtained with the RWNN and used to compute that instance's fuzziness. Second, the training instances (along with their fuzziness values) are grouped by their actual class labels. Third, the instances with low fuzziness values in each group are extracted to build a reduced data set. The selected instances are then used as input to ML classifiers, which reduces learning time and also improves learning capability. The results indicate that the boundaries between class labels can be learned readily from the reduced data set. The most notable finding is a considerable increase in accuracy on unseen examples compared with another instance selection method, IB2. The proposed method provides better generalization and faster learning. The rationale for the methodology is explained theoretically, and experiments on well-known intrusion detection (ID) data sets support its usefulness.
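The selection stage described in the abstract (compute fuzziness per instance, group by class label, keep the low-fuzziness instances of each group) can be illustrated with a minimal sketch. Several things here are assumptions, not details from the abstract: the fuzziness measure is taken to be a De Luca-Termini style entropy of the membership vector, the membership vectors are supplied directly instead of being produced by an RWNN, and the names `fuzziness`, `select_low_fuzziness`, and `keep_fraction` are hypothetical.

```python
import math
from collections import defaultdict


def fuzziness(membership):
    """Fuzziness of one membership vector (assumed De Luca-Termini style measure)."""
    total = 0.0
    for mu in membership:
        # Components equal to 0 or 1 contribute nothing (x*log(x) -> 0 as x -> 0).
        if 0.0 < mu < 1.0:
            total -= mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu)
    return total / len(membership)


def select_low_fuzziness(memberships, labels, keep_fraction=0.5):
    """Return sorted indices of the lowest-fuzziness instances within each class."""
    # Group (fuzziness, index) pairs by the actual class label.
    groups = defaultdict(list)
    for idx, (mu, label) in enumerate(zip(memberships, labels)):
        groups[label].append((fuzziness(mu), idx))

    selected = []
    for scored in groups.values():
        scored.sort()  # ascending fuzziness within the group
        # keep_fraction is an assumed knob; the abstract does not give a threshold.
        n_keep = max(1, math.ceil(keep_fraction * len(scored)))
        selected.extend(idx for _, idx in scored[:n_keep])
    return sorted(selected)


# Hypothetical membership vectors for six training instances (two classes).
memberships = [
    [0.95, 0.05],  # confident
    [0.55, 0.45],  # ambiguous
    [0.80, 0.20],
    [0.10, 0.90],
    [0.48, 0.52],  # ambiguous
    [0.30, 0.70],
]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]

reduced = select_low_fuzziness(memberships, labels, keep_fraction=0.5)
print(reduced)  # [0, 2, 3, 5]
```

In this toy run the two most ambiguous instances (indices 1 and 4) are dropped, and the remaining indices would form the reduced training set passed to a downstream classifier. Selecting within each class group, not globally, keeps every class represented in the reduced set.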

Metadata
Title
Toward an efficient fuzziness based instance selection methodology for intrusion detection system
Authors
Rana Aamir Raza Ashfaq
Yu-lin He
De-gang Chen
Publication date
27.06.2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0557-4
