2019 | OriginalPaper | Book chapter

Method for the Assessment of Semantic Accuracy Using Rules Identified by Conditional Functional Dependencies

Written by: Vanusa S. Santana, Fábio S. Lopes

Published in: Metadata and Semantic Research

Publisher: Springer International Publishing

Abstract

Data is a central resource of organizations, which makes data quality essential for their intellectual growth. Quality is a multifaceted concept and, in general, refers to fitness for use. Quality evaluation therefore rests on a set of quality rules derived from business criteria. However, specifying these rules manually may be infeasible. Conditional Functional Dependencies (CFDs) make it possible to identify context-dependent quality rules automatically. This paper presents a method for assessing data quality that uses CFDs to extract quality rules and identify inconsistencies. The proposed method evaluates the quality of the database along the semantic accuracy dimension. The method consolidates the knowledge discovery process with data quality assessment, listing the respective activities that result in the quantification of semantic accuracy. An instance of the method was demonstrated by applying it to air quality monitoring data. The evaluation showed that the CFD rules were able to reflect some atmospheric phenomena, yielding interesting context-dependent rules. These transaction patterns, which may be unknown to users, can be used as input for the evaluation and monitoring of data quality.
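To make the idea of checking data against CFD-derived rules concrete, the following is a minimal sketch, not the authors' implementation. It assumes constant CFDs of the form "if the condition attributes take these values, the consequent attribute must take this value", and it assumes semantic accuracy is quantified as the share of records that violate no rule; the abstract does not state the exact metric. The attribute names, the rule, and the records are all made up for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
# Hypothetical encoding of a constant CFD:
# (condition attribute values, (consequent attribute, expected value))
CFD = Tuple[Dict[str, str], Tuple[str, str]]


def violates(row: Row, cfd: CFD) -> bool:
    """Return True if the rule's condition matches the row but the consequent does not."""
    condition, (attr, expected) = cfd
    applies = all(row.get(k) == v for k, v in condition.items())
    return applies and row.get(attr) != expected


def semantic_accuracy(rows: List[Row], cfds: List[CFD]) -> float:
    """Assumed metric: fraction of rows that violate none of the rules."""
    if not rows:
        return 1.0
    consistent = sum(1 for r in rows if not any(violates(r, c) for c in cfds))
    return consistent / len(rows)


# Made-up air quality records (illustrative only, not data from the paper).
rows = [
    {"season": "winter", "wind": "calm", "pm10_level": "high"},
    {"season": "winter", "wind": "calm", "pm10_level": "low"},
    {"season": "summer", "wind": "strong", "pm10_level": "low"},
    {"season": "summer", "wind": "calm", "pm10_level": "high"},
]

# Made-up rule: in winter with calm wind, the PM10 level is expected to be high.
cfds = [({"season": "winter", "wind": "calm"}, ("pm10_level", "high"))]

score = semantic_accuracy(rows, cfds)
print(score)
```

In this sketch only the second record matches the rule's condition while contradicting its consequent, so it is the only one flagged as inconsistent. Rules discovered from real data usually hold with some confidence below 100%, so a practical check would probably need a threshold or expert review before treating a violation as an error.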

Metadata
Title
Method for the Assessment of Semantic Accuracy Using Rules Identified by Conditional Functional Dependencies
Written by
Vanusa S. Santana
Fábio S. Lopes
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-36599-8_25
