Skip to main content
Erschienen in: Automated Software Engineering 3/2016

01.09.2016

Model inference for spreadsheets

verfasst von: Jácome Cunha, Martin Erwig, Jorge Mendes, João Saraiva

Erschienen in: Automated Software Engineering | Ausgabe 3/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: first, we have demonstrated its applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not fit spreadsheets well, as they do not allow expressing layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
More information about the degree can be found in http://​mei.​di.​uminho.​pt/​?​q=​en/​1213/​ssd-uk.
 
2
One spreadsheet, c6/Adbudget6.xls, was unreadable.
 
3
We omit here the column labels, whose names depend on the number of columns in the generated table.
 
Literatur
Zurück zum Zitat Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: ICSE ’06 Proceedings of the 28th International Conference on Software Engineering, pp. 182–191. ACM, New York, NY, USA (2006) Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: ICSE ’06 Proceedings of the 28th International Conference on Software Engineering, pp. 182–191. ACM, New York, NY, USA (2006)
Zurück zum Zitat Abraham, R., Erwig, M., Kollmansberger, S., Seifert, E.: Visual specifications of correct spreadsheets. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing (VLHCC ’05), IEEE Computer Society, pp. 189–196 (2005) Abraham, R., Erwig, M., Kollmansberger, S., Seifert, E.: Visual specifications of correct spreadsheets. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing (VLHCC ’05), IEEE Computer Society, pp. 189–196 (2005)
Zurück zum Zitat Alhajj, R.: Extracting the extended entity-relationship model from a legacy relational database. Inf. Syst. 28(6), 597–618 (2003)CrossRefMATH Alhajj, R.: Extracting the extended entity-relationship model from a legacy relational database. Inf. Syst. 28(6), 597–618 (2003)CrossRefMATH
Zurück zum Zitat Cheng, T., Rival, X.: An abstract domain to infer types over zones in spreadsheets. In: Miné, A., Schmidt, D. (eds.) Static Analysis. Lecture Notes in Computer Science, vol. 7460, pp. 94–110. Springer, Berlin (2012)CrossRef Cheng, T., Rival, X.: An abstract domain to infer types over zones in spreadsheets. In: Miné, A., Schmidt, D. (eds.) Static Analysis. Lecture Notes in Computer Science, vol. 7460, pp. 94–110. Springer, Berlin (2012)CrossRef
Zurück zum Zitat Codd, E.F.: A relational model of data for large shared data banks. Commun ACM 13(6), 377–387 (1970)CrossRefMATH Codd, E.F.: A relational model of data for large shared data banks. Commun ACM 13(6), 377–387 (1970)CrossRefMATH
Zurück zum Zitat Connolly, T., Begg, C.: Database Systems, A Practical Approach to Design, Implementation, and Management, 3rd edn. Addison-Wesley, Harlow (2002)MATH Connolly, T., Begg, C.: Database Systems, A Practical Approach to Design, Implementation, and Management, 3rd edn. Addison-Wesley, Harlow (2002)MATH
Zurück zum Zitat Cunha, J.: Model-based spreadsheet engineering. PhD thesis, Universidade do Minho (2011) Cunha, J.: Model-based spreadsheet engineering. PhD thesis, Universidade do Minho (2011)
Zurück zum Zitat Cunha, J., Saraiva, J., Visser, J.: Discovery-based edit assistance for spreadsheets. In: 2009 IEEE Symposium on Visual Languages and Human-Centric Computing, IEEE, pp. 233–237 (2009a) Cunha, J., Saraiva, J., Visser, J.: Discovery-based edit assistance for spreadsheets. In: 2009 IEEE Symposium on Visual Languages and Human-Centric Computing, IEEE, pp. 233–237 (2009a)
Zurück zum Zitat Cunha, J., Saraiva, J., Visser, J.: From spreadsheets to relational databases and back. In: PEPM ’09 Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation. ACM, New York, NY, USA, pp. 179–188 (2009b) Cunha, J., Saraiva, J., Visser, J.: From spreadsheets to relational databases and back. In: PEPM ’09 Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation. ACM, New York, NY, USA, pp. 179–188 (2009b)
Zurück zum Zitat Cunha, J., Erwig, M., Saraiva, J. Automatically inferring ClassSheet models from spreadsheets. In: Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing, IEEE Computer Society, VLHCC ’10, pp. 93–100 (2010) Cunha, J., Erwig, M., Saraiva, J. Automatically inferring ClassSheet models from spreadsheets. In: Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing, IEEE Computer Society, VLHCC ’10, pp. 93–100 (2010)
Zurück zum Zitat Cunha J, Mendes, J., Fernandes, J.P., Saraiva, J.: Embedding and evolution of spreadsheet models in spreadsheet systems. In: IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC’11), IEEE Computer Society, pp. 186–201 (2011a) Cunha J, Mendes, J., Fernandes, J.P., Saraiva, J.: Embedding and evolution of spreadsheet models in spreadsheet systems. In: IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC’11), IEEE Computer Society, pp. 186–201 (2011a)
Zurück zum Zitat Cunha, J., Visser, J., Alves, T., Saraiva, J.: Type-safe evolution of spreadsheets. In: Giannakopoulou, D., Orejas, F. (eds.) Fundamental Approaches to Software Engineering. Lecture Notes in Computer Science, vol. 6603, pp. 186–201. Springer, Berlin (2011b)CrossRef Cunha, J., Visser, J., Alves, T., Saraiva, J.: Type-safe evolution of spreadsheets. In: Giannakopoulou, D., Orejas, F. (eds.) Fundamental Approaches to Software Engineering. Lecture Notes in Computer Science, vol. 6603, pp. 186–201. Springer, Berlin (2011b)CrossRef
Zurück zum Zitat Cunha, J., Fernandes, J.P., Mendes, J., Pacheco, H., Saraiva, J.: Bidirectional transformation of model-driven spreadsheets. In: Hu, Z., de Lara, J. (eds.) Theory and Practice of Model Transformations. Lecture Notes in Computer Science, vol. 7307, pp. 105–120. Springer, Berlin (2012a)CrossRef Cunha, J., Fernandes, J.P., Mendes, J., Pacheco, H., Saraiva, J.: Bidirectional transformation of model-driven spreadsheets. In: Hu, Z., de Lara, J. (eds.) Theory and Practice of Model Transformations. Lecture Notes in Computer Science, vol. 7307, pp. 105–120. Springer, Berlin (2012a)CrossRef
Zurück zum Zitat Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: MDSheet: a framework for model-driven Spreadsheet Engineering. In: Proceedings of the 34rd ACM International Conference on Software Engineering (ICSE’12), pp. 1412–1415 (2012b) Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: MDSheet: a framework for model-driven Spreadsheet Engineering. In: Proceedings of the 34rd ACM International Conference on Software Engineering (ICSE’12), pp. 1412–1415 (2012b)
Zurück zum Zitat Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: Towards an evaluation of bidirectional model-driven spreadsheets. User evaluation for Software Engineering Researchers, ACM Digital Library, USER’ 12, 25–28 (2012c) Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: Towards an evaluation of bidirectional model-driven spreadsheets. User evaluation for Software Engineering Researchers, ACM Digital Library, USER’ 12, 25–28 (2012c)
Zurück zum Zitat Cunha, J., Fernandes, J.P., Saraiva, J.: From Relational ClassSheets to UML+OCL. The Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012), Riva del Garda (Trento), pp. 1151–1158. ACM, Italy (2012d) Cunha, J., Fernandes, J.P., Saraiva, J.: From Relational ClassSheets to UML+OCL. The Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012), Riva del Garda (Trento), pp. 1151–1158. ACM, Italy (2012d)
Zurück zum Zitat Engels, G., Erwig, M.: ClassSheets: automatic generation of spreadsheet applications from object-oriented specifications. In: Redmiles, D., Ellman, T., Zisman, A. (eds.) 20th IEEE/ACM International Conference on Automated Software Engineering, Long Beach, USA, pp. 124–133. ACM (2005) Engels, G., Erwig, M.: ClassSheets: automatic generation of spreadsheet applications from object-oriented specifications. In: Redmiles, D., Ellman, T., Zisman, A. (eds.) 20th IEEE/ACM International Conference on Automated Software Engineering, Long Beach, USA, pp. 124–133. ACM (2005)
Zurück zum Zitat Erwig, M.: Software engineering for spreadsheets. IEEE Softw. 29(5), 25–30 (2009)CrossRef Erwig, M.: Software engineering for spreadsheets. IEEE Softw. 29(5), 25–30 (2009)CrossRef
Zurück zum Zitat Hermans, F., Pinzger, M., van Deursen, A.: Automatically extracting class diagrams from spreadsheets. In: Proceedings of the 24th European conference on Object-oriented programming, Springer-Verlag, Berlin, Heidelberg, ECOOP’10, pp. 52–75 (2010) Hermans, F., Pinzger, M., van Deursen, A.: Automatically extracting class diagrams from spreadsheets. In: Proceedings of the 24th European conference on Object-oriented programming, Springer-Verlag, Berlin, Heidelberg, ECOOP’10, pp. 52–75 (2010)
Zurück zum Zitat Höst, M., Regnell, B., Wohlin, C.: Using students as subjects–a comparative study of students and professionals in lead-time impact assessment. Empir. Softw. Eng. 5(3), 201–214 (2000). doi:10.1023/A:1026586415054 CrossRefMATH Höst, M., Regnell, B., Wohlin, C.: Using students as subjects–a comparative study of students and professionals in lead-time impact assessment. Empir. Softw. Eng. 5(3), 201–214 (2000). doi:10.​1023/​A:​1026586415054 CrossRefMATH
Zurück zum Zitat Isakowitz, T., Schocken, S., Lucas Jr, H.C.: Toward a logical/physical theory of spreadsheet modeling. ACM Trans. Inf. Syst. 13(1), 1–37 (1995)CrossRef Isakowitz, T., Schocken, S., Lucas Jr, H.C.: Toward a logical/physical theory of spreadsheet modeling. ACM Trans. Inf. Syst. 13(1), 1–37 (1995)CrossRef
Zurück zum Zitat Lämmel, R., Visser, J. : A Strafunski application letter. In: Dahl, V., Wadler, P. (eds) Proceedings of Practical Aspects of Declarative Programming (PADL’03), LNCS, vol. 2562, pp. 357–375. Springer (2003) Lämmel, R., Visser, J. : A Strafunski application letter. In: Dahl, V., Wadler, P. (eds) Proceedings of Practical Aspects of Declarative Programming (PADL’03), LNCS, vol. 2562, pp. 357–375. Springer (2003)
Zurück zum Zitat Maier, D.: The Theory of Relational Databases. Computer Science Press, Rockville (1983)MATH Maier, D.: The Theory of Relational Databases. Computer Science Press, Rockville (1983)MATH
Zurück zum Zitat Novelli, N., Cicchetti, R.: Fun: an efficient algorithm for mining functional and embedded dependencies. In: ICDT ’01: Proceedings of the 8th International Conference on Database Theory, pp. 189–203. Springer-Verlag, London, UK (2001) Novelli, N., Cicchetti, R.: Fun: an efficient algorithm for mining functional and embedded dependencies. In: ICDT ’01: Proceedings of the 8th International Conference on Database Theory, pp. 189–203. Springer-Verlag, London, UK (2001)
Zurück zum Zitat Panko, R.: Spreadsheet errors: what we know. what we think we can do. In: Proceedings of the European Spreadsheet Risks Interest Group Annual Conference (EuSpRIG), pp. 7–17 (2000) Panko, R.: Spreadsheet errors: what we know. what we think we can do. In: Proceedings of the European Spreadsheet Risks Interest Group Annual Conference (EuSpRIG), pp. 7–17 (2000)
Zurück zum Zitat Perry, D.E., Porter, A.A., Votta, L.G.: Empirical studies of software engineering: a roadmap. In: ICSE ’00 Proceedings of the Conference on The Future of Software Engineering, pp. 345–355. ACM, New York, NY, USA (2000) Perry, D.E., Porter, A.A., Votta, L.G.: Empirical studies of software engineering: a roadmap. In: ICSE ’00 Proceedings of the Conference on The Future of Software Engineering, pp. 345–355. ACM, New York, NY, USA (2000)
Zurück zum Zitat Powell, S.G., Baker, K.R.: The Art of Modeling with Spreadsheets. Wiley, New York (2003) Powell, S.G., Baker, K.R.: The Art of Modeling with Spreadsheets. Wiley, New York (2003)
Zurück zum Zitat Rajalingham, K., Chadwick, D., Knight, B.: Classification of spreadsheet errors. In: Proceedings of the European Spreadsheet Risks Interest Group Annual Conference (EuSpRIG), pp. 23–34 (2001) Rajalingham, K., Chadwick, D., Knight, B.: Classification of spreadsheet errors. In: Proceedings of the European Spreadsheet Risks Interest Group Annual Conference (EuSpRIG), pp. 23–34 (2001)
Zurück zum Zitat Ullman, J.D., Widom, J.: A First Course in Database Systems. Prentice Hall, Upper Saddle River (1997) Ullman, J.D., Widom, J.: A First Course in Database Systems. Prentice Hall, Upper Saddle River (1997)
Zurück zum Zitat Visser, J., Saraiva, J.: Tutorial on strategic programming across programming paradigms. In: 8th Brazilian Symposium on Programming Languages, Niteroi, Brazil (2004) Visser, J., Saraiva, J.: Tutorial on strategic programming across programming paradigms. In: 8th Brazilian Symposium on Programming Languages, Niteroi, Brazil (2004)
Metadaten
Titel
Model inference for spreadsheets
verfasst von
Jácome Cunha
Martin Erwig
Jorge Mendes
João Saraiva
Publikationsdatum
01.09.2016
Verlag
Springer US
Erschienen in
Automated Software Engineering / Ausgabe 3/2016
Print ISSN: 0928-8910
Elektronische ISSN: 1573-7535
DOI
https://doi.org/10.1007/s10515-014-0167-x

Weitere Artikel der Ausgabe 3/2016

Automated Software Engineering 3/2016 Zur Ausgabe