2015 | OriginalPaper | Chapter

1. Data Quality Monitoring of Cloud Databases Based on Data Quality SLAs

Authors: Dimas C. Nascimento, Carlos Eduardo Pires, Demetrio Mestre

Published in: Big-Data Analytics and Cloud Computing

Publisher: Springer International Publishing

Abstract

This chapter provides an overview of the tasks involved in continuously monitoring the quality of cloud databases as their content is modified over time. In the Software as a Service context, this process must be guided by data quality service level agreements (SLAs), which specify customers’ requirements for data quality monitoring. In practice, factors such as Big Data scale, lack of data structure, strict SLA requirements, and the velocity of changes to the data pose many challenges to carrying out this process effectively. In this context, we present a high-level architecture of a cloud service that employs cloud computing capabilities to tackle these challenges, and we discuss the technical and research problems that may be further explored to allow an effective deployment of the presented service.

Metadata

Title: Data Quality Monitoring of Cloud Databases Based on Data Quality SLAs
Authors: Dimas C. Nascimento, Carlos Eduardo Pires, Demetrio Mestre
Copyright Year: 2015
DOI: https://doi.org/10.1007/978-3-319-25313-8_1
