2022 | OriginalPaper | Book chapter

Selected Aspects of Interactive Feature Extraction

Written by: Marek Grzegorowski

Published in: Transactions on Rough Sets XXIII

Publisher: Springer Berlin Heidelberg


Abstract

In the presented study, the problem of interactive feature extraction, i.e., supported by interaction with users, is discussed, and several innovative approaches to automating feature creation and selection are proposed. The current state of knowledge on feature extraction processes in commercial applications is shown. The problems associated with processing big data sets as well as approaches to process high-dimensional time series are discussed. The introduced feature extraction methods were subjected to experimental verification on real-life problems and data. Besides the experimentation, the practical case studies and applications of developed techniques in selected scientific projects are shown.
Feature extraction addresses the problem of finding the most compact and informative data representation resulting in improved efficiency of data storage and processing, facilitating the subsequent learning and generalization steps. Feature extraction not only simplifies the data representation but also enables the acquisition of features that can be further easily utilized by both analysts and learning algorithms. In its most common flow, the process starts from an initial set of measured data and builds derived features intended to be informative and non-redundant. Logically, there are two phases of this process: the first is the construction of the new attributes based on original data (sometimes referred to as feature engineering), the second is a selection of the most important among the attributes (sometimes referred to as feature selection). There are many approaches to feature creation and selection that are well-described in the literature. Still, it is hard to find methods facilitating interaction with users, which would take into consideration users’ knowledge about the domain, their experience, and preferences.
The study of interactive feature extraction addresses two problems: deriving useful and understandable attributes from raw sensor readings, and reducing the number of those attributes to obtain the simplest possible, yet accurate, models. The proposed methods go beyond the current standards by enabling a more efficient way to express the domain knowledge associated with the most important subsets of attributes. The proposed algorithms for the construction and selection of features can use various forms of information granulation, problem decomposition, and parallelization. They can also tackle large spaces of derivable features and ensure a satisfactory (according to a given criterion) level of information about the target variable (decision), even after removing a substantial number of features.
The proposed approaches have been developed based on the experience gained in the course of several research projects in the fields of data analysis and processing multi-sensor data streams. The methods have been validated in terms of the quality of the extracted features, as well as throughput, scalability, and robustness of their operation. The discussed methodology has been verified in open data mining competitions to confirm its usefulness.

Footnotes
1
Later in this section, we discuss real matrices, as they are more relevant for real-life data sets.
 
2
We call a matrix \(X \in \mathbb {C}^{n \times n}\) unitary iff \(X X^{H} = X^{H} X = \mathbb {I}\). For a real matrix \(X \in \mathbb {R}^{n \times n}\), we have \(X^{H} = X^{T}\), and we say that the matrix is orthogonal, i.e., \(X X^{T} = X^{T} X = \mathbb {I}\).
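The orthogonality condition for a real matrix can be checked numerically. The following is only a minimal sketch using the Python standard library; the helper names (transpose, matmul, is_orthogonal), the tolerance, and the two example matrices are assumptions made for illustration and do not come from the chapter.

```python
from math import cos, sin, pi

def transpose(x):
    """Return X^T for a matrix stored as a list of rows."""
    return [list(row) for row in zip(*x)]

def matmul(a, b):
    """Plain row-by-column matrix product."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def is_orthogonal(x, tol=1e-9):
    """Check X X^T = X^T X = I up to a numerical tolerance (tol is an assumed value)."""
    n = len(x)
    xt = transpose(x)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for product in (matmul(x, xt), matmul(xt, x)):
        for i in range(n):
            for j in range(n):
                if abs(product[i][j] - identity[i][j]) > tol:
                    return False
    return True

# Illustrative inputs: a rotation is orthogonal, a shear is not.
theta = pi / 6
rotation = [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]
shear = [[1.0, 1.0], [0.0, 1.0]]

print(is_orthogonal(rotation), is_orthogonal(shear))
```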
 
3
\(Cov[X,Y] = E[XY] - E[X]E[Y]\), or \(Cov[X,Y] = \frac{1}{m} \sum _{1 \le i \le m} (x_i - E[X])(y_i - E[Y])\).
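The two expressions for the covariance agree when expectations are taken as sample means over the same m observations. A small sketch of that check follows, assuming Python's standard library only; the function names and the sample data are hypothetical.

```python
from statistics import fmean

def cov_product_form(xs, ys):
    """Cov[X,Y] = E[XY] - E[X]E[Y], with expectations taken as sample means."""
    return fmean(x * y for x, y in zip(xs, ys)) - fmean(xs) * fmean(ys)

def cov_centered_form(xs, ys):
    """Cov[X,Y] = (1/m) * sum_i (x_i - E[X]) * (y_i - E[Y])."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical sample: ys is exactly twice xs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(cov_product_form(xs, ys), cov_centered_form(xs, ys))
```

Both forms divide by m (not m - 1), so they describe the population covariance of the sample rather than the unbiased estimator.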
 
4
In the literature, matrix \(\textbf{V}\) is often denoted as \(\textbf{W}\), and \(\mathbf {\Sigma }\) as \(\mathbf {\Lambda }\). We, however, continue with the notation introduced in the SVD example above.
 
5
For a given corpus, the co-occurrence of two words is the number of times they appear together in documents and are close enough to each other (e.g., no more than 30 words separate them in the text).
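One possible reading of this definition can be sketched in a few lines of Python. The sketch assumes whitespace tokenization, lower-casing, and that every pair of occurrences within the allowed gap is counted once; the chapter does not fix these details, and the function name and sample documents are hypothetical.

```python
def cooccurrence(docs, w1, w2, max_gap=30):
    """Count pairs of occurrences of w1 and w2 separated by at most max_gap words."""
    total = 0
    for doc in docs:
        tokens = doc.lower().split()  # assumed tokenization: split on whitespace
        pos1 = [i for i, t in enumerate(tokens) if t == w1]
        pos2 = [i for i, t in enumerate(tokens) if t == w2]
        # abs(i - j) - 1 is the number of words standing between the two positions
        total += sum(1 for i in pos1 for j in pos2
                     if i != j and abs(i - j) - 1 <= max_gap)
    return total

# Hypothetical mini-corpus.
docs = [
    "feature selection improves models",
    "selection of each feature matters",
]

print(cooccurrence(docs, "feature", "selection"))
print(cooccurrence(docs, "feature", "selection", max_gap=0))
```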
 
6
The main topic for documents \(\{D_1, D_2, D_3\}\) could be related to “feature selection”.
 
7
It is worth mentioning that the neural network input is a numeric vector embedding for each word (typically, word vectorization is performed after the initial preprocessing).
 
8
Entropy is one of the basic measures of information contained in data. For a discrete random variable X with possible values \(\{x_1, \ldots , x_m\}\), it is defined as: \(H(X) = -\sum _{i=1}^{m} p(x_i) \log p(x_i)\).
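As a brief illustration, the entropy of a sample can be estimated by plugging empirical frequencies into the formula. The sketch below assumes a base-2 logarithm (the footnote does not state the base) and uses a hypothetical function name and toy inputs.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Estimate H(X) = -sum_i p(x_i) * log2(p(x_i)) from observed values.

    p(x_i) is the empirical frequency of x_i; base 2 is an assumption,
    so the result is expressed in bits.
    """
    counts = Counter(values)
    m = len(values)
    return -sum((c / m) * log2(c / m) for c in counts.values())

print(entropy(["a", "b", "a", "b"]))   # two equally likely values
print(entropy(["a", "a", "a", "a"]))   # a constant carries no information
```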
 
9
Typically, the set of all features/attributes is denoted by A [291].
 
10
If attribute domains are overlapping, i.e., there exist \(a_i,a_j \in A\) for which \(V_{a_i} \cap V_{a_j} \ne \emptyset \), then concatenation may include a delimiter \(\mid _A\) such that for each \(a \in A\) we have \(\mid _A \notin V_a\).
 
13
The description for classification tasks would differ mainly in the training algorithms and the model evaluation criteria used.
 
17
See AWS global cloud infrastructure at aws.amazon.com/about-aws/global-infrastructure.
 
18
AWS code-names for regions in brackets.
 
20
The video recording is available online on the KnowledgePit platform.
 
21
Numbers in brackets indicate the number of unique bids in the data for each region.
 
References
3.
Abeel, T., Helleputte, T., de Peer, Y.V., Dupont, P., Saeys, Y.: Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics 26(3), 392–398 (2010)
8.
Ahmadi, E., Jasemi, M., Monplaisir, L., Nabavi, M.A., Mahmoodi, A., Jam, P.A.: New efficient hybrid candlestick technical analysis model for stock market timing on the basis of the support vector machine and heuristic algorithms of imperialist competition and genetic. Expert Syst. Appl. 94, 21–31 (2018). https://doi.org/10.1016/j.eswa.2017.10.023
9.
Ahmed, F., Samorani, M., Bellinger, C., Zaïane, O.R.: Advantage of integration in big data: feature generation in multi-relational databases for imbalanced learning. In: Proceedings of IEEE Big Data, pp. 532–539 (2016)
11.
Al-Ali, H., Cuzzocrea, A., Damiani, E., Mizouni, R., Tello, G.: A composite machine-learning-based framework for supporting low-level event logs to high-level business process model activities mappings enhanced by flexible BPMN model translation. Soft. Comput. 24(10), 7557–7578 (2019). https://doi.org/10.1007/s00500-019-04385-6
12.
Alelyani, S., Tang, J., Liu, H.: Feature selection for clustering: a review. In: Aggarwal, C.C., Reddy, C.K. (eds.) Data Clustering: Algorithms and Applications, pp. 29–60. CRC Press, Boca Raton (2013)
14.
Altidor, W., Khoshgoftaar, T.M., Napolitano, A.: Measuring stability of feature ranking techniques: a noise-based approach. Int. J. Bus. Intell. Data Min. 7(1–2), 80–115 (2012)
15.
Appice, A., Guccione, P., Malerba, D., Ciampi, A.: Dealing with temporal and spatial correlations to classify outliers in geophysical data streams. Inf. Sci. 285, 162–180 (2014)
16.
Assunção, M.D., Calheiros, R.N., Bianchi, S., Netto, M.A., Buyya, R.: Big data computing and clouds: trends and future directions. J. Parallel Distrib. Comput. 79, 3–15 (2015)
19.
Azad, M., Moshkov, M.: Minimization of decision tree average depth for decision tables with many-valued decisions. Procedia Comput. Sci. 35, 368–377 (2014). https://doi.org/10.1016/j.procs.2014.08.117. Knowledge-Based and Intelligent Information & Engineering Systems 18th Annual Conference, KES-2014 Gdynia, Poland, September 2014 Proceedings
20.
Bahmani, B., Moseley, B., Vattani, A., Kumar, R., Vassilvitskii, S.: Scalable K-Means++. Proc. VLDB Endow. 5(7), 622–633 (2012)
21.
Bałazińska, M., Zdonik, S.: Databases meet the stream processing era, pp. 225–234. Association for Computing Machinery and Morgan and Claypool (2018)
23.
Bargiela, A., Pedrycz, W.: The roots of granular computing. In: 2006 IEEE International Conference on Granular Computing, pp. 806–809. IEEE (2006)
25.
26.
Baughman, M., Haas, C., Wolski, R., Foster, I., Chard, K.: Predicting amazon spot prices with LSTM networks. In: Proceedings of the 9th Workshop on Scientific Cloud Computing, ScienceCloud 2018, p. 7. Association for Computing Machinery, New York (2018). https://doi.org/10.1145/3217880.3217881
37.
Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., Gavaldà, R.: New ensemble methods for evolving data streams. In: IV, J.F.E., Fogelman-Soulié, F., Flach, P.A., Zaki, M.J. (eds.) Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009, pp. 139–148. ACM (2009). https://doi.org/10.1145/1557019.1557041
41.
Bondell, H.D., Reich, B.J.: Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics 64(1), 115–123 (2008)
45.
Boullé, M.: Predicting dangerous seismic events in coal mines under distribution drift. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of FedCSIS 2016, pp. 227–230. IEEE (2016)
46.
Brahim, A.B., Limam, M.: Robust ensemble feature selection for high dimensional data sets. In: Proceedings of HPCS 2013, pp. 151–157 (2013)
59.
Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: a survey. CoRR abs/1901.03407 (2019)
60.
Chalapathy, R., Khoa, N.L.D., Chawla, S.: Robust deep learning methods for anomaly detection. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2020, pp. 3507–3508. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3394486.3406704
61.
Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. 40(1), 16–28 (2014)
62.
Chądzyńska-Krasowska, A., Betliński, P., Ślęzak, D.: Scalable machine learning with granulated data summaries: a case of feature selection. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z.W. (eds.) ISMIS 2017. LNCS (LNAI), vol. 10352, pp. 519–529. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60438-1_51
66.
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. Association for Computing Machinery, New York (2016). https://doi.org/10.1145/2939672.2939785
67.
Cheng, W., Dembczyński, K., Hüllermeier, E.: Graded multilabel classification: the ordinal case. In: Fürnkranz, J., Joachims, T. (eds.) Proceedings of the 27th International Conference on Machine Learning (ICML-10), 21–24 June 2010, Haifa, Israel, pp. 223–230. Omnipress (2010)
69.
Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Moschitti, A., Pang, B., Daelemans, W. (eds.) Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, 25–29 October 2014, Doha, Qatar, A meeting of SIGDAT, A Special Interest Group of the ACL, pp. 1724–1734. ACL (2014)
70.
Chu, C.T., et al.: Map-reduce for machine learning on multicore. In: Proceedings of NIPS, pp. 281–288 (2006)
72.
Clark, P.G., Grzymała-Busse, J.W., Hippe, Z.S., Mroczek, T., Niemiec, R.: Complexity of rule sets mined from incomplete data using probabilistic approximations based on generalized maximal consistent blocks. Procedia Comput. Sci. 176, 1803–1812 (2020). https://doi.org/10.1016/j.procs.2020.09.219. Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 24th International Conference KES2020
75.
Cornelis, C., Jensen, R., Martín, G.H., Ślęzak, D.: Attribute selection with fuzzy decision reducts. Inf. Sci. 180(2), 209–224 (2010)
79.
81.
Das, A.S., Datar, M., Garg, A., Rajaram, S.: Google news personalization: scalable online collaborative filtering. In: Proceedings of WWW, pp. 271–280 (2007)
82.
Das, S.: Filters, wrappers and a boosting-based hybrid for feature selection. In: Proceedings of ICML 2001, pp. 74–81 (2001)
85.
Datar, M., Gionis, A., Indyk, P., Motwani, R.: Maintaining stream statistics over sliding windows. SIAM J. Comput. 31(6), 1794–1813 (2002)
87.
Dayal, U., Castellanos, M., Simitsis, A., Wilkinson, K.: Data integration flows for business intelligence. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT 2009, pp. 1–11. ACM, New York (2009). https://doi.org/10.1145/1516360.1516362
89.
Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018)
94.
Domingos, P.: A few useful things to know about machine learning. Commun. ACM 55(10), 78–87 (2012)
97.
Dougherty, J., Kohavi, R., Sahami, M.: Supervised and unsupervised discretization of continuous features. In: Proceedings of the Twelfth International Conference on International Conference on Machine Learning, ICML 1995, pp. 194–202. Morgan Kaufmann Publishers Inc., San Francisco (1995)
98.
Dramiński, M., Rada-Iglesias, A., Enroth, S., Wadelius, C., Koronacki, J., Komorowski, H.J.: Monte Carlo feature selection for supervised classification. Bioinformatics 24(1), 110–117 (2008)
104.
Eiras-Franco, C., Bolón-Canedo, V., Ramos, S., González-Domínguez, J., Alonso-Betanzos, A., Touriño, J.: Multithreaded and Spark Parallelization of Feature Selection Filters. J. Comput. Sci. 17, 609–619 (2016)
105.
Ekanayake, J., et al.: Twister: a runtime for iterative mapreduce. In: Proceedings of HPDC, pp. 810–818 (2010)
106.
Elmeleegy, K.: Piranha: optimizing short jobs in Hadoop. Proc. VLDB Endow. 6(11), 985–996 (2013)
107.
Fan, J., Lv, J.: A selective overview of variable selection in high dimensional feature space. Stat. Sin. 20(1), 101–148 (2010)
108.
Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: IJCAI, pp. 1022–1029 (1993)
111.
Fisher, A., Rudin, C., Dominici, F.: All models are wrong, but many are useful: learning a variable’s importance by studying an entire class of prediction models simultaneously. J. Mach. Learn. Res. 20(177), 1–81 (2019). http://jmlr.org/papers/v20/18-760.html
114.
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2000)
115.
Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)
121.
García-Torres, M., Gómez-Vela, F., Melián-Batista, B., Moreno-Vega, J.M.: High-dimensional feature selection via feature grouping. Inf. Sci. 326, 102–118 (2016)
124.
Gibowicz, S.J., Lasocki, S.: Seismicity induced by mining: 10 years later. In: Advances in Geophysics, pp. 81–164 (2001)
130.
Govindan, P., Chen, R., Scheinberg, K., Srinivasan, S.: A scalable solution for group feature selection. In: Proceedings of IEEE Big Data 2015, pp. 2846–2848 (2015)
132.
Grochala, D., Kajor, M., Kucharski, D., Iwaniec, M., Kańtoch, E.: A novel approach in auscultation technology - new sensors and algorithms. In: Bujnowski, A., Kaczmarek, M., Ruminski, J. (eds.) 11th International Conference on Human System Interaction, HSI 2018, Gdansk, Poland, 4–6 July 2018, pp. 240–244. IEEE (2018). https://doi.org/10.1109/HSI.2018.8431339
133.
Grorud, L.J., Smith, D.: The national fire fighter near-miss reporting. Annual Report 2008. An Exclusive Supplement to Fire & Rescue Magazine, pp. 1–24 (2008)
135.
Grychowski, T.: Hazard assessment based on fuzzy logic. Arch. Min. Sci. 53(4), 595–602 (2008)
138.
Grzegorowski, M.: Massively parallel feature extraction framework application in predicting dangerous seismic events. In: Proceedings of FedCSIS 2016, pp. 225–229 (2016)
139.
Grzegorowski, M.: Selected aspects of interactive feature extraction. Ph.D. thesis, University of Warsaw (2021)
140.
141.
Grzegorowski, M., Janusz, A., Ślęzak, D., Szczuka, M.S.: On the role of feature space granulation in feature selection processes. In: Nie, J., et al. (eds.) 2017 IEEE International Conference on Big Data, BigData 2017, Boston, MA, USA, 11–14 December 2017, pp. 1806–1815. IEEE Computer Society (2017). https://doi.org/10.1109/BigData.2017.8258124
144.
Grzegorowski, M., Pardel, P.W., Stawicki, S., Stencel, K.: SONCA: scalable semantic processing of rapidly growing document stores. In: Pechenizkiy, M., Wojciechowski, M. (eds.) New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol. 185, pp. 89–98. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32518-2_9
147.
Grzegorowski, M., Stawicki, S.: Window-based feature extraction framework for multi-sensor data: a posture recognition case study. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) 2015 Federated Conference on Computer Science and Information Systems, FedCSIS 2015, Lódz, Poland, 13–16 September 2015, pp. 397–405. IEEE (2015). https://doi.org/10.15439/2015F425
149.
Gu, B., Liu, G., Huang, H.: Groups-keeping solution path algorithm for sparse regression with automatic feature grouping. In: Proceedings of the KDD, pp. 185–193 (2017)
151.
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
153.
Güzel, B.E.K., Karaçalı, B.: Fisher’s linear discriminant analysis based prediction using transient features of seismic events in coal mines. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Annals of Computer Science and Information Systems, vol. 8, pp. 231–234. IEEE (2016). https://doi.org/10.15439/2016F116
155.
Hall, M.: Correlation-based feature selection for machine learning. Ph.D. thesis, University of Waikato (1999)
160.
He, Y.L., Tian, Y., Xu, Y., Zhu, Q.X.: Novel soft sensor development using echo state network integrated with singular value decomposition: application to complex chemical processes. Chemometr. Intell. Lab. Syst. 200, 103981 (2020)
162.
Herodotou, H., Dong, F., Babu, S.: No one (cluster) size fits all: automatic cluster sizing for data-intensive analytics. In: Proceedings of the 2nd ACM Symposium on Cloud Computing, p. 18. ACM (2011)
166.
Zurück zum Zitat Hosseini, B., Hammer, B.: Interpretable discriminative dimensionality reduction and feature selection on the manifold. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML PKDD 2019. LNCS (LNAI), vol. 11906, pp. 310–326. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-46150-8_19CrossRef Hosseini, B., Hammer, B.: Interpretable discriminative dimensionality reduction and feature selection on the manifold. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML PKDD 2019. LNCS (LNAI), vol. 11906, pp. 310–326. Springer, Cham (2020). https://​doi.​org/​10.​1007/​978-3-030-46150-8_​19CrossRef
168.
Zurück zum Zitat Hu, X.: Ensembles of classifiers based on rough sets theory and set-oriented database operations. In: Proceedings of IEEE GrC 2006, pp. 67–73 (2006) Hu, X.: Ensembles of classifiers based on rough sets theory and set-oriented database operations. In: Proceedings of IEEE GrC 2006, pp. 67–73 (2006)
172.
175.
Zurück zum Zitat Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall Inc, New Jersey (1988)MATH Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall Inc, New Jersey (1988)MATH
178.
Zurück zum Zitat Janusz, A.: Algorithms for similarity relation learning from high dimensional data. Ph.D. thesis, University of Warsaw (2014) Janusz, A.: Algorithms for similarity relation learning from high dimensional data. Ph.D. thesis, University of Warsaw (2014)
180.
Zurück zum Zitat Janusz, A., Grad, Ł., Grzegorowski, M.: Clash Royale challenge: how to select training decks for win-rate prediction. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, FedCSIS 2019, Leipzig, Germany, 1–4 September 2019. Annals of Computer Science and Information Systems, vol. 18, pp. 3–6 (2019). https://doi.org/10.15439/2019F365 Janusz, A., Grad, Ł., Grzegorowski, M.: Clash Royale challenge: how to select training decks for win-rate prediction. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, FedCSIS 2019, Leipzig, Germany, 1–4 September 2019. Annals of Computer Science and Information Systems, vol. 18, pp. 3–6 (2019). https://​doi.​org/​10.​15439/​2019F365
182.
Zurück zum Zitat Janusz, A., Grzegorowski, M., Michalak, M., Wróbel, Ł, Sikora, M., Ślęzak, D.: Predicting seismic events in coal mines based on underground sensor measurements. Eng. Appl. Artif. Intell. 64, 83–94 (2017)CrossRef Janusz, A., Grzegorowski, M., Michalak, M., Wróbel, Ł, Sikora, M., Ślęzak, D.: Predicting seismic events in coal mines based on underground sensor measurements. Eng. Appl. Artif. Intell. 64, 83–94 (2017)CrossRef
183.
Zurück zum Zitat Janusz, A., Krasuski, A., Stawicki, S., Rosiak, M., Ślęzak, D., Nguyen, H.S.: Key risk factors for polish state fire service: a data mining competition at knowledge pit. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland, 7–10 September 2014. Annals of Computer Science and Information Systems, vol. 2, pp. 345–354 (2014). https://doi.org/10.15439/2014F507 Janusz, A., Krasuski, A., Stawicki, S., Rosiak, M., Ślęzak, D., Nguyen, H.S.: Key risk factors for polish state fire service: a data mining competition at knowledge pit. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland, 7–10 September 2014. Annals of Computer Science and Information Systems, vol. 2, pp. 345–354 (2014). https://​doi.​org/​10.​15439/​2014F507
187.
Zurück zum Zitat Janusz, A., Ślęzak, D., Sikora, M., Wróbel, Ł.: Predicting dangerous seismic events: AAIA’16 data mining challenge. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, FedCSIS 2016, Gdańsk, Poland, 11–14 September 2016. Annals of Computer Science and Information Systems, vol. 8, pp. 205–211. IEEE (2016). https://doi.org/10.15439/2016F560 Janusz, A., Ślęzak, D., Sikora, M., Wróbel, Ł.: Predicting dangerous seismic events: AAIA’16 data mining challenge. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, FedCSIS 2016, Gdańsk, Poland, 11–14 September 2016. Annals of Computer Science and Information Systems, vol. 8, pp. 205–211. IEEE (2016). https://​doi.​org/​10.​15439/​2016F560
188.
Zurück zum Zitat Janusz, A., Szczuka, M.S.: Assessment of data granulations in context of feature extraction problem. In: Proceedings of IEEE GrC, pp. 116–120 (2014) Janusz, A., Szczuka, M.S.: Assessment of data granulations in context of feature extraction problem. In: Proceedings of IEEE GrC, pp. 116–120 (2014)
189.
Zurück zum Zitat Janusz, A., Tajmajer, T., Świechowski, M.: Helping AI to play hearthstone: AAIA’17 data mining challenge. In: Proceedings of FedCSIS, pp. 121–125 (2017) Janusz, A., Tajmajer, T., Świechowski, M.: Helping AI to play hearthstone: AAIA’17 data mining challenge. In: Proceedings of FedCSIS, pp. 121–125 (2017)
192.
Zurück zum Zitat Jiménez, F., Palma, J.T., Sánchez, G., Marín, D., Ortega, F.P., López, M.D.L.: Feature selection based multivariate time series forecasting: an application to antibiotic resistance outbreaks prediction. Artif. Intell. Med. 104, 101818 (2020)CrossRef Jiménez, F., Palma, J.T., Sánchez, G., Marín, D., Ortega, F.P., López, M.D.L.: Feature selection based multivariate time series forecasting: an application to antibiotic resistance outbreaks prediction. Artif. Intell. Med. 104, 101818 (2020)CrossRef
196.
Zurück zum Zitat Jing, Y., Li, T., Luo, C., Horng, S.J., Wang, G., Yu, Z.: An incremental approach for attribute reduction based on knowledge granularity. Knowl. Based Syst. 104, 24–38 (2016)CrossRef Jing, Y., Li, T., Luo, C., Horng, S.J., Wang, G., Yu, Z.: An incremental approach for attribute reduction based on knowledge granularity. Knowl. Based Syst. 104, 24–38 (2016)CrossRef
197.
Zurück zum Zitat Jovic, A., Brkic, K., Bogunovic, N.: A review of feature selection methods with applications. In: Proceedings of MIPRO 2015, pp. 1200–1205 (2015) Jovic, A., Brkic, K., Bogunovic, N.: A review of feature selection methods with applications. In: Proceedings of MIPRO 2015, pp. 1200–1205 (2015)
199.
Zurück zum Zitat Kabiesz, J.: The justification and objective to modify methods of forecasting the potential and assess the actual state of rockburst hazard. In: Methods for Assessment of Rockburst Hazard in Coal Mines’ Excavations, vol. 44, pp. 44–48 (2010). (in Polish) Kabiesz, J.: The justification and objective to modify methods of forecasting the potential and assess the actual state of rockburst hazard. In: Methods for Assessment of Rockburst Hazard in Coal Mines’ Excavations, vol. 44, pp. 44–48 (2010). (in Polish)
200.
Zurück zum Zitat Kabiesz, J., Sikora, B., Sikora, M., Wróbel, Ł: Application of rule-based models for seismic hazard prediction in coal mines. Acta Montanistica Slovaca 18(3), 262–277 (2013) Kabiesz, J., Sikora, B., Sikora, M., Wróbel, Ł: Application of rule-based models for seismic hazard prediction in coal mines. Acta Montanistica Slovaca 18(3), 262–277 (2013)
201.
Zurück zum Zitat Kalousis, A., Prados, J., Hilario, M.: Stability of feature selection algorithms: a study on high-dimensional spaces. Knowl. Inf. Syst. 12(1), 95–116 (2007)CrossRef Kalousis, A., Prados, J., Hilario, M.: Stability of feature selection algorithms: a study on high-dimensional spaces. Knowl. Inf. Syst. 12(1), 95–116 (2007)CrossRef
203.
Zurück zum Zitat Kańtoch, E., Augustyniak, P., Markiewicz, M., Prusak, D.: Monitoring activities of daily living based on wearable wireless body sensor network. In: 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2014, Chicago, IL, USA, 26–30 August 2014, pp. 586–589. IEEE (2014). https://doi.org/10.1109/EMBC.2014.6943659 Kańtoch, E., Augustyniak, P., Markiewicz, M., Prusak, D.: Monitoring activities of daily living based on wearable wireless body sensor network. In: 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2014, Chicago, IL, USA, 26–30 August 2014, pp. 586–589. IEEE (2014). https://​doi.​org/​10.​1109/​EMBC.​2014.​6943659
205.
Zurück zum Zitat Karabatak, M., Ince, M.C.: A new feature selection method based on association rules for diagnosis of erythemato-squamous diseases. Expert Syst. Appl. 36(10), 12500–12505 (2009)CrossRef Karabatak, M., Ince, M.C.: A new feature selection method based on association rules for diagnosis of erythemato-squamous diseases. Expert Syst. Appl. 36(10), 12500–12505 (2009)CrossRef
206.
Zurück zum Zitat Kasinikota, A., Balamurugan, P., Shevade, S.: Modeling label interactions in multi-label classification: a multi-structure SVM perspective. In: Phung, D., Tseng, V.S., Webb, G.I., Ho, B., Ganji, M., Rashidi, L. (eds.) PAKDD 2018. LNCS (LNAI), vol. 10937, pp. 43–55. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93034-3_4CrossRef Kasinikota, A., Balamurugan, P., Shevade, S.: Modeling label interactions in multi-label classification: a multi-structure SVM perspective. In: Phung, D., Tseng, V.S., Webb, G.I., Ho, B., Ganji, M., Rashidi, L. (eds.) PAKDD 2018. LNCS (LNAI), vol. 10937, pp. 43–55. Springer, Cham (2018). https://​doi.​org/​10.​1007/​978-3-319-93034-3_​4CrossRef
210.
Zurück zum Zitat Keogh, E., Lin, J., Fu, A.: Hot sax: efficiently finding the most unusual time series subsequence. In: Proceedings of the Fifth IEEE International Conference on Data Mining, ICDM 2005, pp. 226–233. IEEE Computer Society, Washington, DC (2005). https://doi.org/10.1109/ICDM.2005.79 Keogh, E., Lin, J., Fu, A.: Hot sax: efficiently finding the most unusual time series subsequence. In: Proceedings of the Fifth IEEE International Conference on Data Mining, ICDM 2005, pp. 226–233. IEEE Computer Society, Washington, DC (2005). https://​doi.​org/​10.​1109/​ICDM.​2005.​79
211.
Zurück zum Zitat Keogh, E.J., Pazzani, M.J.: Scaling up dynamic time warping for datamining applications. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2000, pp. 285–289. ACM, New York (2000). https://doi.org/10.1145/347090.347153 Keogh, E.J., Pazzani, M.J.: Scaling up dynamic time warping for datamining applications. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2000, pp. 285–289. ACM, New York (2000). https://​doi.​org/​10.​1145/​347090.​347153
212.
Zurück zum Zitat Keren, G., Schuller, B.W.: Convolutional RNN: an enhanced model for extracting features from sequential data. In: 2016 International Joint Conference on Neural Networks, IJCNN 2016, Vancouver, BC, Canada, 24–29 July 2016, pp. 3412–3419. IEEE (2016). https://doi.org/10.1109/IJCNN.2016.7727636 Keren, G., Schuller, B.W.: Convolutional RNN: an enhanced model for extracting features from sequential data. In: 2016 International Joint Conference on Neural Networks, IJCNN 2016, Vancouver, BC, Canada, 24–29 July 2016, pp. 3412–3419. IEEE (2016). https://​doi.​org/​10.​1109/​IJCNN.​2016.​7727636
213.
Zurück zum Zitat Khandelwal, V., Chaturvedi, A.K., Gupta, C.P.: Amazon EC2 spot price prediction using regression random forests. IEEE Trans. Cloud Comput. 8(1), 59–72 (2020)CrossRef Khandelwal, V., Chaturvedi, A.K., Gupta, C.P.: Amazon EC2 spot price prediction using regression random forests. IEEE Trans. Cloud Comput. 8(1), 59–72 (2020)CrossRef
214.
Zurück zum Zitat Kieu, T., Yang, B., Guo, C., Jensen, C.S.: Outlier detection for time series with recurrent autoencoder ensembles. In: Kraus, S. (ed.) Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, 10–16 August 2019, pp. 2725–2732. ijcai.org (2019). https://doi.org/10.24963/ijcai.2019/378 Kieu, T., Yang, B., Guo, C., Jensen, C.S.: Outlier detection for time series with recurrent autoencoder ensembles. In: Kraus, S. (ed.) Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, 10–16 August 2019, pp. 2725–2732. ijcai.org (2019). https://​doi.​org/​10.​24963/​ijcai.​2019/​378
217.
Zurück zum Zitat Kornowski, J.: Linear prediction of aggregated seismic and seismoacoustic energy emitted from a mining longwall. Acta Montana Ser. A 22(129), 5–14 (2003) Kornowski, J.: Linear prediction of aggregated seismic and seismoacoustic energy emitted from a mining longwall. Acta Montana Ser. A 22(129), 5–14 (2003)
218.
Zurück zum Zitat Kowalski, M., Ślęzak, D., Stencel, K., Pardel, P.W., Grzegorowski, M., Kijowski, M.: RDBMS model for scientific articles analytics. In: Bembenik, R., Skonieczny, L., Rybiński, H., Niezgodka, M. (eds.) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol. 390, pp. 49–60. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-24809-2_4 Kowalski, M., Ślęzak, D., Stencel, K., Pardel, P.W., Grzegorowski, M., Kijowski, M.: RDBMS model for scientific articles analytics. In: Bembenik, R., Skonieczny, L., Rybiński, H., Niezgodka, M. (eds.) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol. 390, pp. 49–60. Springer, Heidelberg (2012). https://​doi.​org/​10.​1007/​978-3-642-24809-2_​4
219.
Zurück zum Zitat Kozielski, M., Sikora, M., Wróbel, Ł.: DISESOR - decision support system for mining industry. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) 2015 Federated Conference on Computer Science and Information Systems, FedCSIS 2015, Lódz, Poland, 13–16 September 2015. Annals of Computer Science and Information Systems, vol. 5, pp. 67–74. IEEE (2015). https://doi.org/10.15439/2015F168 Kozielski, M., Sikora, M., Wróbel, Ł.: DISESOR - decision support system for mining industry. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) 2015 Federated Conference on Computer Science and Information Systems, FedCSIS 2015, Lódz, Poland, 13–16 September 2015. Annals of Computer Science and Information Systems, vol. 5, pp. 67–74. IEEE (2015). https://​doi.​org/​10.​15439/​2015F168
221.
Zurück zum Zitat Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc. (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc. (2012)
224.
Zurück zum Zitat Kurach, K., Pawłowski, K.: Predicting dangerous seismic activity with recurrent neural networks. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Annals of Computer Science and Information Systems, vol. 8, pp. 239–243. IEEE (2016). https://doi.org/10.15439/2016F134 Kurach, K., Pawłowski, K.: Predicting dangerous seismic activity with recurrent neural networks. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Annals of Computer Science and Information Systems, vol. 8, pp. 239–243. IEEE (2016). https://​doi.​org/​10.​15439/​2016F134
225.
Zurück zum Zitat Kusuma, R.M.I., Ho, T.T., Kao, W.C., Ou, Y.Y., Hua, K.L.: Using deep learning neural networks and candlestick chart representation to predict stock market (2019) Kusuma, R.M.I., Ho, T.T., Kao, W.C., Ou, Y.Y., Hua, K.L.: Using deep learning neural networks and candlestick chart representation to predict stock market (2019)
226.
227.
Zurück zum Zitat Lan, G., Hou, C., Nie, F., Luo, T., Yi, D.: Robust feature selection via simultaneous sapped norm and sparse regularizer minimization. Neurocomputing 283, 228–240 (2018)CrossRef Lan, G., Hou, C., Nie, F., Luo, T., Yi, D.: Robust feature selection via simultaneous sapped norm and sparse regularizer minimization. Neurocomputing 283, 228–240 (2018)CrossRef
229.
Zurück zum Zitat Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 24–26 June 2008, Anchorage, Alaska, USA. IEEE Computer Society (2008). https://doi.org/10.1109/CVPR.2008.4587756 Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 24–26 June 2008, Anchorage, Alaska, USA. IEEE Computer Society (2008). https://​doi.​org/​10.​1109/​CVPR.​2008.​4587756
231.
Zurück zum Zitat Lasocki, S.: Probabilistic analysis of seismic hazard posed by mining induced events. In: Proceedings of Sixth International Symposium on Rockburst and Seismicity in Mines, pp. 151–156 (2005) Lasocki, S.: Probabilistic analysis of seismic hazard posed by mining induced events. In: Proceedings of Sixth International Symposium on Rockburst and Seismicity in Mines, pp. 151–156 (2005)
233.
Zurück zum Zitat LeCun, Y., Kavukcuoglu, K., Farabet, C.: Convolutional networks and applications in vision. In: ISCAS, pp. 253–256. IEEE (2010) LeCun, Y., Kavukcuoglu, K., Farabet, C.: Convolutional networks and applications in vision. In: ISCAS, pp. 253–256. IEEE (2010)
234.
Zurück zum Zitat Lee, K.H., Lee, Y.J., Choi, H., Chung, Y.D., Moon, B.: Parallel data processing with mapreduce: a survey. SIGMOD Rec. 40(4), 11–20 (2012)CrossRef Lee, K.H., Lee, Y.J., Choi, H., Chung, Y.D., Moon, B.: Parallel data processing with mapreduce: a survey. SIGMOD Rec. 40(4), 11–20 (2012)CrossRef
239.
Zurück zum Zitat Li, P., Wu, J., Shang, L.: Fast approximate attribute reduction with MapReduce. In: Proceedings of RSKT 2013, pp. 271–278 (2013) Li, P., Wu, J., Shang, L.: Fast approximate attribute reduction with MapReduce. In: Proceedings of RSKT 2013, pp. 271–278 (2013)
246.
Zurück zum Zitat Liu, H., Wu, X., Zhang, S.: A new supervised feature selection method for pattern classification. Comput. Intell. 30(2), 342–361 (2014)MathSciNetCrossRef Liu, H., Wu, X., Zhang, S.: A new supervised feature selection method for pattern classification. Comput. Intell. 30(2), 342–361 (2014)MathSciNetCrossRef
260.
Zurück zum Zitat Mark, C.: Coal bursts in the deep longwall mines of the United States. Int. J. Coal Sci. Technol. 3(1), 1–9 (2016)MathSciNetCrossRef Mark, C.: Coal bursts in the deep longwall mines of the United States. Int. J. Coal Sci. Technol. 3(1), 1–9 (2016)MathSciNetCrossRef
261.
Zurück zum Zitat Mason, A.J.: Bayesian methods for modelling non-random missing data mechanisms in longitudinal studies. Ph.D. thesis, Imperial College London (2009) Mason, A.J.: Bayesian methods for modelling non-random missing data mechanisms in longitudinal studies. Ph.D. thesis, Imperial College London (2009)
262.
Zurück zum Zitat Mathew, S.: Overview of Amazon Web Services, April 2017. Accessed 04 June 2019 Mathew, S.: Overview of Amazon Web Services, April 2017. Accessed 04 June 2019
263.
Zurück zum Zitat Meina, M., Janusz, A., Rykaczewski, K., Ślęzak, D., Celmer, B., Krasuski, A.: Tagging firefighter activities at the emergency scene: summary of AAIA’15 data mining competition at knowledge pit. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) 2015 Federated Conference on Computer Science and Information Systems, FedCSIS 2015, Lódz, Poland, 13–16 September 2015. Annals of Computer Science and Information Systems, vol. 5, pp. 367–373. IEEE (2015). https://doi.org/10.15439/2015F426 Meina, M., Janusz, A., Rykaczewski, K., Ślęzak, D., Celmer, B., Krasuski, A.: Tagging firefighter activities at the emergency scene: summary of AAIA’15 data mining competition at knowledge pit. In: Ganzha, M., Maciaszek, L.A., Paprzycki, M. (eds.) 2015 Federated Conference on Computer Science and Information Systems, FedCSIS 2015, Lódz, Poland, 13–16 September 2015. Annals of Computer Science and Information Systems, vol. 5, pp. 367–373. IEEE (2015). https://​doi.​org/​10.​15439/​2015F426
264.
Zurück zum Zitat Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: Bengio, Y., LeCun, Y. (eds.) 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, 2–4 May 2013, Workshop Track Proceedings (2013). http://arxiv.org/abs/1301.3781 Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: Bengio, Y., LeCun, Y. (eds.) 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, 2–4 May 2013, Workshop Track Proceedings (2013). http://​arxiv.​org/​abs/​1301.​3781
265.
Zurück zum Zitat Milczek, J.K., Bogucki, R., Lasek, J., Tadeusiak, M.: Early warning system for seismic events in coal mines using machine learning. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Annals of Computer Science and Information Systems, vol. 8, pp. 213–220. IEEE (2016). https://doi.org/10.15439/2016F420 Milczek, J.K., Bogucki, R., Lasek, J., Tadeusiak, M.: Early warning system for seismic events in coal mines using machine learning. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the 2016 Federated Conference on Computer Science and Information Systems. Annals of Computer Science and Information Systems, vol. 8, pp. 213–220. IEEE (2016). https://​doi.​org/​10.​15439/​2016F420
266.
268.
Zurück zum Zitat Moczulski, W., Przystałka, P., Sikora, M., Zimroz, R.: Modern ICT and mechatronic systems in contemporary mining industry. In: Rough Sets - International Joint Conference, IJCRS 2016, Santiago de Chile, Chile, 7–11 October 2016, Proceedings, pp. 33–42 (2016). https://doi.org/10.1007/978-3-319-47160-0_3 Moczulski, W., Przystałka, P., Sikora, M., Zimroz, R.: Modern ICT and mechatronic systems in contemporary mining industry. In: Rough Sets - International Joint Conference, IJCRS 2016, Santiago de Chile, Chile, 7–11 October 2016, Proceedings, pp. 33–42 (2016). https://​doi.​org/​10.​1007/​978-3-319-47160-0_​3
270.
Zurück zum Zitat Mönks, U., Dörksen, H., Lohweg, V., Hübner, M.: Information fusion of conflicting input data. Sensors 16(11), E1798 (2016)CrossRef Mönks, U., Dörksen, H., Lohweg, V., Hübner, M.: Information fusion of conflicting input data. Sensors 16(11), E1798 (2016)CrossRef
271.
Zurück zum Zitat Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. Society for Industrial and Applied Mathematics (2009) Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. Society for Industrial and Applied Mathematics (2009)
272.
Zurück zum Zitat Mörchen, F., Ultsch, A.: Optimizing time series discretization for knowledge discovery. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, KDD 2005, pp. 660–665. ACM, New York (2005). https://doi.org/10.1145/1081870.1081953 Mörchen, F., Ultsch, A.: Optimizing time series discretization for knowledge discovery. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, KDD 2005, pp. 660–665. ACM, New York (2005). https://​doi.​org/​10.​1145/​1081870.​1081953
273.
Zurück zum Zitat Moshkov, M.J., Piliszczuk, M., Zielosko, B.: On construction of partial reducts and irreducible partial decision rules. Fund. Inform. 75(1–4), 357–374 (2007)MathSciNetMATH Moshkov, M.J., Piliszczuk, M., Zielosko, B.: On construction of partial reducts and irreducible partial decision rules. Fund. Inform. 75(1–4), 357–374 (2007)MathSciNetMATH
276.
Zurück zum Zitat Murphy, K.P.: Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge (2012)MATH Murphy, K.P.: Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge (2012)MATH
280.
Zurück zum Zitat Nguyen, S.H., Szczuka, M.: Feature selection in decision systems with constraints. In: Flores, V., Gomide, F., Janusz, A., Meneses, C., Miao, D., Peters, G., Ślęzak, D., Wang, G., Weber, R., Yao, Y. (eds.) IJCRS 2016. LNCS (LNAI), vol. 9920, pp. 537–547. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47160-0_49CrossRef Nguyen, S.H., Szczuka, M.: Feature selection in decision systems with constraints. In: Flores, V., Gomide, F., Janusz, A., Meneses, C., Miao, D., Peters, G., Ślęzak, D., Wang, G., Weber, R., Yao, Y. (eds.) IJCRS 2016. LNCS (LNAI), vol. 9920, pp. 537–547. Springer, Cham (2016). https://​doi.​org/​10.​1007/​978-3-319-47160-0_​49CrossRef
281.
Zurück zum Zitat Nguyen, S.H., Skowron, A.: Quantization of real value attributes - rough set and boolean reasoning approach. In: Proceedings of the Second Joint Annual Conference on Information Sciences, Wrightsville Beach, North Carolina, 28 September–1 October 1995, pp. 34–37 (1995) Nguyen, S.H., Skowron, A.: Quantization of real value attributes - rough set and boolean reasoning approach. In: Proceedings of the Second Joint Annual Conference on Information Sciences, Wrightsville Beach, North Carolina, 28 September–1 October 1995, pp. 34–37 (1995)
282.
Zurück zum Zitat Nguyen, T.T., Skowron, A.: Rough-Granular Computing in Human-Centric Information Processing. In: Bargiela, A., Pedrycz, W. (eds.) Human-Centric Information Processing Through Granular Modelling. Studies in Computational Intelligence, vol. 182, pp. 1–30. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-540-92916-1_1CrossRef Nguyen, T.T., Skowron, A.: Rough-Granular Computing in Human-Centric Information Processing. In: Bargiela, A., Pedrycz, W. (eds.) Human-Centric Information Processing Through Granular Modelling. Studies in Computational Intelligence, vol. 182, pp. 1–30. Springer, Heidelberg (2009). https://​doi.​org/​10.​1007/​978-3-540-92916-1_​1CrossRef
283.
Zurück zum Zitat Nixon, M.S., Aguado, A.S.: Feature Extraction and Image Processing for Computer Vision, 4th edn. Academic Press (2020) Nixon, M.S., Aguado, A.S.: Feature Extraction and Image Processing for Computer Vision, 4th edn. Academic Press (2020)
284.
Zurück zum Zitat Nogueira, S.: Quantifying the stability of feature selection. Ph.D. thesis, University of Manchester (2018) Nogueira, S.: Quantifying the stability of feature selection. Ph.D. thesis, University of Manchester (2018)
285.
Zurück zum Zitat Nogueira, S., Sechidis, K., Brown, G.: On the stability of feature selection algorithms. J. Mach. Learn. Res. 18, 174:1–174:54 (2017) Nogueira, S., Sechidis, K., Brown, G.: On the stability of feature selection algorithms. J. Mach. Learn. Res. 18, 174:1–174:54 (2017)
287.
Zurück zum Zitat Parmar, N., Ramachandran, P., Vaswani, A., Bello, I., Levskaya, A., Shlens, J.: Stand-alone self-attention in vision models. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Canada, Vancouver, BC, pp. 68–80 (2019) Parmar, N., Ramachandran, P., Vaswani, A., Bello, I., Levskaya, A., Shlens, J.: Stand-alone self-attention in vision models. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Canada, Vancouver, BC, pp. 68–80 (2019)
288.
Zurück zum Zitat Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data, System Theory, Knowledge Engineering and Problem Solving, vol. 9. Kluwer (1991) Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data, System Theory, Knowledge Engineering and Problem Solving, vol. 9. Kluwer (1991)
289.
Metadata
Title
Selected Aspects of Interactive Feature Extraction
Author
Marek Grzegorowski
Copyright Year
2022
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-66544-2_8
