Benchmarking ETL Workflows

  • Conference paper
Performance Evaluation and Benchmarking (TPCTC 2009)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5895)

Abstract

Extraction–Transform–Load (ETL) processes comprise complex data workflows that are responsible for the maintenance of a Data Warehouse. A plethora of ETL tools is currently available, constituting a multi-million dollar market. Each ETL tool uses its own technique for the design and implementation of an ETL workflow, which makes the task of assessing ETL tools extremely difficult. In this paper, we identify common characteristics of ETL workflows in an effort to propose a unified evaluation method for ETL. We also identify the main points of interest in designing, implementing, and maintaining ETL workflows. Finally, we propose a principled organization of test suites, based on the TPC-H schema, for experimenting with ETL workflows.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Simitsis, A., Vassiliadis, P., Dayal, U., Karagiannis, A., Tziovara, V. (2009). Benchmarking ETL Workflows. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2009. Lecture Notes in Computer Science, vol 5895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10424-4_15

  • DOI: https://doi.org/10.1007/978-3-642-10424-4_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10423-7

  • Online ISBN: 978-3-642-10424-4

  • eBook Packages: Computer Science, Computer Science (R0)
