1 March 2008

Empirical studies to assess the understandability of data warehouse schemas using structural metrics

Authors: Manuel Angel Serrano, Coral Calero, Houari A. Sahraoui, Mario Piattini

Published in: Software Quality Journal | Issue 1/2008

Abstract

Data warehouses are powerful tools for making better and faster decisions in organizations where information is an asset of primary importance. Because data warehouses are complex, metrics and procedures are required to continuously assure their quality. This article describes an empirical study and a replication that investigate the use of structural metrics as indicators of the understandability and, by extension, the cognitive complexity of data warehouse schemas. More specifically, a four-step analysis is conducted: (1) check whether the considered metrics, individually and collectively, correlate with schema understandability using classical statistical techniques; (2) evaluate whether understandability can be predicted by case similarity using the case-based reasoning technique; (3) determine, for each level of understandability, the subsets of metrics that are important by means of a classification technique; and (4) assess, by means of a probabilistic technique, the degree of participation of each metric in the understandability prediction. The results show that although a linear model is a good approximation of the relation between structure and understandability, the associated coefficients are not significant enough. Additionally, classification analyses reveal, respectively, that prediction can be achieved by considering structure similarity, that extracted classification rules can be used to estimate the magnitude of understandability, and that some metrics, such as the number of fact tables, have more impact than others.
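
To make steps (1) and (2) of the analysis more concrete, the following is a minimal sketch and not the authors' implementation. It assumes made-up toy data: four hypothetical schemas, each described by three illustrative structural metrics (the first one standing for the number of fact tables) and an invented understanding time. Plain Euclidean distance is used here as a simple stand-in for whatever similarity measure the case-based reasoning step in the study relies on.

```python
from math import sqrt

# Hypothetical toy data, NOT taken from the study. Each schema has a tuple of
# illustrative structural metrics (first entry: number of fact tables) and an
# invented understanding time in seconds.
SCHEMAS = {
    "A": {"metrics": (1, 3, 3), "time": 100.0},
    "B": {"metrics": (2, 4, 6), "time": 150.0},
    "C": {"metrics": (3, 5, 9), "time": 210.0},
    "D": {"metrics": (1, 6, 6), "time": 120.0},
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def euclidean(a, b):
    """Euclidean distance between two metric tuples (an assumed similarity)."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def predict_by_similarity(metrics, cases, k=1):
    """Average the understanding time of the k most similar stored cases."""
    ranked = sorted(cases.values(), key=lambda c: euclidean(metrics, c["metrics"]))
    nearest = ranked[:k]
    return sum(c["time"] for c in nearest) / len(nearest)


# Step (1): does a single metric correlate with understanding time?
fact_tables = [s["metrics"][0] for s in SCHEMAS.values()]
times = [s["time"] for s in SCHEMAS.values()]
r = pearson(fact_tables, times)

# Step (2): estimate the time for a new schema from its most similar case.
estimate = predict_by_similarity((2, 4, 5), SCHEMAS, k=1)

print(f"correlation with number of fact tables: {r:.2f}")
print(f"estimated understanding time: {estimate:.1f} s")
```

With real experimental data, the correlation would be computed per metric and tested for significance, and the similarity-based estimate would be validated against held-out schemas rather than read off a single query.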

Footnotes
1
Figure 6 does not show schema S07 because it was removed in the replication. See Sect. 4.3.1 for more information.
 
Metadata
Title
Empirical studies to assess the understandability of data warehouse schemas using structural metrics
Authors
Manuel Angel Serrano
Coral Calero
Houari A. Sahraoui
Mario Piattini
Publication date
1 March 2008
Publisher
Springer US
Published in
Software Quality Journal / Issue 1/2008
Print ISSN: 0963-9314
Electronic ISSN: 1573-1367
DOI
https://doi.org/10.1007/s11219-007-9030-7