Published in: The VLDB Journal 1/2016

01 February 2016 | Special Issue Paper

VDDA: automatic visualization-driven data aggregation in relational databases

Authors: Uwe Jugel, Zbigniew Jerzak, Gregor Hackenbroich, Volker Markl


Abstract

Contemporary RDBMS-based systems for visualizing high-volume numerical data have difficulty coping with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the volume of large data sets disregard the spatial properties of visualizations, resulting in visualization errors. In this work, we introduce VDDA, a visualization-driven data aggregation that models visual aggregation at the pixel level as data aggregation at the query level. Based on the M4 aggregation for producing pixel-perfect line charts from highly reduced data subsets, we define a complete set of data reduction operators that simulate the overplotting behavior of the most frequently used chart types. Relying only on relational algebra and common data aggregation functions, our approach is generic and applicable to any visualization system that consumes data stored in relational databases. We demonstrate our visualization-driven data aggregation using real-world data sets from high-tech manufacturing, stock markets, and sports analytics, reducing data volumes by up to two orders of magnitude while preserving pixel-perfect visualizations identical to those producible from the raw data.
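
To illustrate the idea of pixel-level aggregation expressed as grouping, the following is a minimal Python sketch of an M4-style reduction for line charts. It is not the authors' implementation: the function name `m4_aggregate`, the floor-based group key, the clamping of the last column, and the synthetic data are all assumptions made for illustration. The sketch assigns each tuple to one of `width` pixel columns and keeps, per column, the tuples with the smallest and largest timestamp and the smallest and largest value, so at most four tuples per column remain.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # (timestamp t, value v)


def m4_aggregate(points: Iterable[Point], t_start: float, t_end: float,
                 width: int) -> List[Point]:
    """Keep, per pixel column, the first, last, min-value and max-value tuples.

    Assumes t_end > t_start and width >= 1.
    """
    span = t_end - t_start
    columns: Dict[int, List[Point]] = defaultdict(list)
    for t, v in points:
        if t_start <= t <= t_end:
            # Assumed group key: floor-based pixel column, clamped so that
            # t == t_end falls into the last column.
            k = min(int(width * (t - t_start) / span), width - 1)
            columns[k].append((t, v))

    selected = set()
    for pts in columns.values():
        selected.add(min(pts, key=lambda p: p[0]))  # first tuple (smallest t)
        selected.add(max(pts, key=lambda p: p[0]))  # last tuple (largest t)
        selected.add(min(pts, key=lambda p: p[1]))  # tuple with minimum value
        selected.add(max(pts, key=lambda p: p[1]))  # tuple with maximum value
    return sorted(selected)


# Hypothetical data: 1000 synthetic points reduced for a 10-pixel-wide chart.
raw = [(float(i), float((i * 37) % 101)) for i in range(1000)]
reduced = m4_aggregate(raw, t_start=0.0, t_end=1000.0, width=10)
```

In a relational database the same selection would typically be written as a grouped aggregation joined back to the base relation. The in-memory version here only shows which tuples survive the reduction.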


Footnotes
1
We use the relational algebra notations \(\pi \) for projection, \(\sigma \) for selection, and \(_{[GroupFunction]^{*}}G_{[Aggregation]^+}\) or \(G_{[GroupKey|Aggregation]^+}\) for aggregation.
 
2
Given w equally sized groups of a continuous range, we obtain \(2\cdot w\) equally sized groups by adding intersections in the center of each group. The original intersections of the value range and thus their corresponding first and last tuples are still part of the query result. Similarly, the original min and max tuples become min or max tuples in the new subgroups.
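
The containment property described in this footnote can be checked on a small example. The Python sketch below is an illustration under stated assumptions, not code from the paper: the helper `m4_tuples`, the floor-based group key, and the synthetic data are hypothetical. It selects the first, last, min, and max tuples for w groups and for 2·w groups over the same range, so the two results can be compared as sets.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float]  # (timestamp t, value v)


def m4_tuples(points: List[Point], t_start: float, t_end: float,
              groups: int) -> Set[Point]:
    """First, last, min and max tuple of each of `groups` equal-width ranges.

    Assumes every timestamp lies in [t_start, t_end] and t_end > t_start.
    """
    span = t_end - t_start
    buckets: List[List[Point]] = [[] for _ in range(groups)]
    for t, v in points:
        # Assumed group key: floor-based index, clamped for t == t_end.
        k = min(int(groups * (t - t_start) / span), groups - 1)
        buckets[k].append((t, v))
    out: Set[Point] = set()
    for pts in buckets:
        if pts:  # skip empty groups
            out.update((min(pts, key=lambda p: p[0]),   # first tuple
                        max(pts, key=lambda p: p[0]),   # last tuple
                        min(pts, key=lambda p: p[1]),   # min-value tuple
                        max(pts, key=lambda p: p[1])))  # max-value tuple
    return out


# Hypothetical data with unique timestamps and no value ties within a group.
raw = [(float(i), float((i * 37) % 101)) for i in range(1000)]
w = 10
coarse = m4_tuples(raw, 0.0, 1000.0, w)
fine = m4_tuples(raw, 0.0, 1000.0, 2 * w)
print(coarse <= fine)  # set containment: result for w within result for 2*w
```

When several tuples in a group share the same extreme value, the containment additionally depends on a consistent tie-breaking rule. The sketch relies on Python's `min` and `max` returning the first extreme element in input order.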
 
Metadata
Title
VDDA: automatic visualization-driven data aggregation in relational databases
Authors
Uwe Jugel
Zbigniew Jerzak
Gregor Hackenbroich
Volker Markl
Publication date
01 February 2016
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 1/2016
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-015-0396-z
