2021 | OriginalPaper | Chapter

Linguistic Summaries Using Interval-Valued Fuzzy Representation of Imprecise Information - An Innovative Tool for Detecting Outliers

Authors: Agnieszka Duraj, Piotr S. Szczepaniak

Published in: Computational Science – ICCS 2021

Publisher: Springer International Publishing

Abstract

The practice of textual and numerical information processing often involves the need to analyze and test a database for the presence of items that differ substantially from other records. Such items, referred to as outliers, can be successfully detected using linguistic summaries. In this paper, we extend this approach by the use of non-monotonic quantifiers and interval-valued fuzzy sets. The results obtained by this innovative method confirm its usefulness for outlier detection, which is of significant practical relevance for database analysis applications.
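
As a rough illustration of the idea in the abstract, the sketch below evaluates a summary of the form "Q records are S" in the classic Yager style, where the summarizer S is an interval-valued fuzzy set (a lower and an upper membership function) and Q is a non-monotonic quantifier such as "very few". All names, membership shapes, thresholds, and sample data here are hypothetical, and the way the interval is propagated through the quantifier is only one plausible choice; it should not be read as the formulas used in the paper.

```python
from typing import Callable, Sequence, Tuple

Membership = Callable[[float], float]


def ramp(a: float, b: float) -> Membership:
    """Non-decreasing membership: 0 below a, 1 above b, linear in between."""
    def mu(x: float) -> float:
        return min(1.0, max(0.0, (x - a) / (b - a)))
    return mu


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership with support [a, d] and core [b, c]."""
    def mu(x: float) -> float:
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def truth_interval(
    values: Sequence[float],
    s_lower: Membership,
    s_upper: Membership,
    quantifier: Membership,
    core: Tuple[float, float],
) -> Tuple[float, float]:
    """Interval degree of truth of 'Q records are S'.

    The proportion of records satisfying S is itself an interval
    [r_lo, r_hi]. Because the quantifier is non-monotonic (it rises and
    then falls), its range over that interval is computed explicitly
    instead of just evaluating the two endpoints in order.
    Assumes a trapezoidal quantifier whose core [b, c] is given in `core`.
    """
    n = len(values)
    r_lo = sum(s_lower(v) for v in values) / n
    r_hi = sum(s_upper(v) for v in values) / n
    q_lo, q_hi = quantifier(r_lo), quantifier(r_hi)
    b, c = core
    # The maximum is 1 if [r_lo, r_hi] overlaps the core, else an endpoint value.
    t_hi = 1.0 if (r_lo <= c and r_hi >= b) else max(q_lo, q_hi)
    # For a trapezoid the minimum over an interval is reached at an endpoint.
    return min(q_lo, q_hi), t_hi


# Hypothetical interval-valued summarizer "high value" and quantifier "very few".
high_lower = ramp(80.0, 100.0)
high_upper = ramp(70.0, 90.0)
very_few = trapezoid(0.0, 0.02, 0.08, 0.15)

records = [10.0] * 19 + [95.0]  # made-up data: one record stands apart
print(truth_interval(records, high_lower, high_upper, very_few, (0.02, 0.08)))
```

In this toy setting, a high degree of truth for "very few records have a high value" would be read as a hint that the database contains outliers with respect to that attribute, while a degree near zero (for example, when no record or a large share of records is "high") gives no such hint.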

Copyright Year: 2021

DOI: https://doi.org/10.1007/978-3-030-77980-1_38
