
2015 | OriginalPaper | Book chapter

A Reverse Engineering Process for Inferring Data Models from Spreadsheet-based Information Systems: An Automotive Industrial Experience

Authors: Domenico Amalfitano, Anna Rita Fasolino, Porfirio Tramontana, Vincenzo De Simone, Giancarlo Di Mare, Stefano Scala

Published in: Data Management Technologies and Applications

Publisher: Springer International Publishing


Abstract

Spreadsheet-based Information Systems are widely used in industry to support different phases of production processes. Their intensive use is mainly due to their ease of use, which allows even inexperienced programmers to develop Information Systems. Development is further aided by integrated scripting languages (e.g. Visual Basic for Applications, LibreOffice Basic, JavaScript) that offer features for implementing Rapid Application Development processes. Although Spreadsheet-based Information Systems can be developed with a very short time to market, they are usually poorly documented, or in some cases not documented at all. As a consequence, they are very difficult to comprehend, maintain, or migrate towards other architectures, such as Database Oriented Information Systems or Web Applications. Abstracting a data model from the source spreadsheet files is a fundamental activity in migrating towards a different architecture. In this work we present a heuristic-based reverse engineering process for inferring a data model from an Excel-based information system. The process is fully automatic and consists of seven sequential steps. We assessed both its applicability and its effectiveness in an experiment conducted in an automotive industrial context. The process was successfully used to obtain the UML class diagrams representing the conceptual data models of three different Spreadsheet-based Information Systems. The paper presents the results of the experiment and the lessons we learned from it.


Metadata
Title
A Reverse Engineering Process for Inferring Data Models from Spreadsheet-based Information Systems: An Automotive Industrial Experience
Authors
Domenico Amalfitano
Anna Rita Fasolino
Porfirio Tramontana
Vincenzo De Simone
Giancarlo Di Mare
Stefano Scala
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-25936-9_9
