article

ROLAP implementations of the data cube

Authors:
Konstantinos Morfonios

University of Athens

University of Athens
View Profile

,
Stratis Konakas

University of Athens

University of Athens
View Profile

,
Yannis Ioannidis

University of Athens

University of Athens
View Profile

,
Nikolaos Kotsis

Capgemini UK

Capgemini UK
View Profile

Authors Info & Claims

ACM Computing Surveys Volume 39 Issue 4pp 12–eshttps://doi.org/10.1145/1287620.1287623

Published:02 November 2007Publication History

ACM Computing Surveys

Abstract

Implementation of the data cube is an important and scientifically interesting issue in On-Line Analytical Processing (OLAP) and has been the subject of a plethora of related publications. Naive implementation methods that compute each node separately and store the result are impractical, since they have exponential time and space complexity with respect to the cube dimensionality. To overcome this drawback, a wide range of methods that provide efficient cube implementation (with respect to both computation and storage) have been proposed, which make use of relational, multidimensional, or graph-based data structures. Furthermore, there are several other methods that compute and store approximate descriptions of data cubes, sacrificing accuracy for condensation. In this article, we focus on Relational-OLAP (ROLAP), following the majority of the efforts so far. We review existing ROLAP methods that implement the data cube and identify six orthogonal parameters/dimensions that characterize them. We place the existing techniques at the appropriate points within the problem space defined by these parameters and identify several clusters that the techniques form with various interesting properties. A careful study of these properties leads to the identification of particularly effective values for the space parameters and indicates the potential for devising new algorithms with better overall performance.

References

Acharya, S., Gibbons, P. B., and Poosala, V. 2000. Congressional samples for approximate answering of group-by queries. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 487--498. Google ScholarDigital Library
Agarwal, S., Agrawal, R., Deshpande, P., Gupta, A., Naughton, J. F., Ramakrishnan, R., and Sarawagi, S. 1996. On the computation of multidimensional aggregates. In Proceedings of Very Large Data Bases (VLDB). 506--521. Google ScholarDigital Library
Agrawal, R., Mehta, M., Shafer, J. C., Srikant, R., Arning, A., and Bollinger, T. 1996. The quest data mining system. In Proceedings of International Conference on Knowledge Discovery and Data Mining (KDD). 244--249.Google Scholar
Baralis, E., Paraboschi, S., and Teniente, E. 1997. Materialized views selection in a multidimensional database. In Proceedings of Very Large Data Bases (VLDB). 156--165. Google ScholarDigital Library
Barbará, D. and Sullivan, M. 1997. Quasi-cubes: Exploiting approximations in multidimensional databases. SIGMOD Record 26, 3, 12--17. Google ScholarDigital Library
Barbará, D. and Sullivan, M. 1998. A space-efficient way to support approximate multidimensional databases. In Tech. Rep., ISSE-TR-98-03, George Mason University.Google Scholar
Beyer, K. S. and Ramakrishnan, R. 1999. Bottom-up computation of sparse and iceberg cubes. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 359--370. Google ScholarDigital Library
Chaudhuri, S. and Dayal, U. 1997. An overview of data warehousing and OLAP technology. SIGMOD Record 26, 1, 65--74. Google ScholarDigital Library
Codd, E. F. 1970. A relational model of data for large shared data banks. Comm. ACM 13, 6, 377--387. Google ScholarDigital Library
Feng, Y., Agrawal, D., Abbadi, A. E., and Metwally, A. 2004. Range cube: Efficient cube computation by exploiting data correlation. In Proceedings of International Conference on Data Engineering (ICDE). 658--670. Google ScholarDigital Library
Fu, L. and Hammer, J. 2000. Cubist: A new algorithm for improving the performance of ad-hoc olap queries. In ACM International Workshop on Data Warehousing and OLAP (DOLAP). 72--79. Google ScholarDigital Library
Gibbons, P. B. and Matias, Y. 1998. New sampling-based summary statistics for improving approximate query answers. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 331--342. Google ScholarDigital Library
Gray, J., Bosworth, A., Layman, A., and Pirahesh, H. 1996. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-total. In Proceedings of International Conference on Data Engineering (ICDE). 152--159. Google ScholarDigital Library
Gunopulos, D., Kollios, G., Tsotras, V. J., and Domeniconi, C. 2000. Approximating multidimensional aggregate range queries over real attributes. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 463--474. Google ScholarDigital Library
Gupta, H. 1997. Selection of views to materialize in a data warehouse. In Proceedings of International Conference on Database Theory (ICDT). 98--112. Google ScholarDigital Library
Gupta, H., Harinarayan, V., Rajaraman, A., and Ullman, J. D. 1997. Index selection for olap. In Proceedings of International Conference on Data Engineering (ICDE). 208--219. Google ScholarDigital Library
Gupta, H. and Mumick, I. S. 1999. Selection of views to materialize under a maintenance cost constraint. In Proceedings of International Conference on Database Theory (ICDT). 453--470. Google ScholarDigital Library
Haas, P. J., Naughton, J. F., Seshadri, S., and Stokes, L. 1995. Sampling-based estimation of the number of distinct values of an attribute. In Proceedings of Very Large Data Bases (VLDB). 311--322. Google ScholarDigital Library
Han, J., Pei, J., Dong, G., and Wang, K. 2001. Efficient computation of iceberg cubes with complex measures. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 1--12. Google ScholarDigital Library
Harinarayan, V., Rajaraman, A., and Ullman, J. D. 1996. Implementing data cubes efficiently. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 205--216. Google ScholarDigital Library
Ioannidis, Y. E. and Kang, Y. C. 1990. Randomized algorithms for optimizing large join queries. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 312--321. Google ScholarDigital Library
Johnson, T. and Shasha, D. 1997. Some approaches to index design for cube forest. IEEE Data Eng. Bull. 20, 1, 27--35.Google Scholar
Kalnis, P., Mamoulis, N., and Papadias, D. 2002. View selection using randomized search. Data Knowl. Eng. 42, 1, 89--111. Google ScholarDigital Library
Karayannidis, N., Sellis, T. K., and Kouvaras, Y. 2004. Cube file: A file structure for hierarchically clustered olap cubes. In Proceedings of International Conference on Extending Database Technology (EDBT). 621--638.Google Scholar
Kotidis, Y. and Roussopoulos, N. 1998. An alternative storage organization for rolap aggregate views based on cubetrees. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 249--258. Google ScholarDigital Library
Kotidis, Y. and Roussopoulos, N. 1999. Dynamat: A dynamic view management system for data warehouses. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 371--382. Google ScholarDigital Library
Kotsis, N. 2000. Multidimensional aggregation in olap systems. PhD thesis. Department of Computer Science, University of Strathclyde.Google Scholar
Kotsis, N. and McGregor, D. R. 2000. Elimination of redundant views in multidimensional aggregates. In Proceedings of Data Warehousing and Knowledge Discovery (DaWaK). 146--161. Google ScholarDigital Library
Lakshmanan, L. V. S., Pei, J., and Han, J. 2002. Quotient cube: How to summarize the semantics of a data cube. In Proceedings of Very Large Data Bases (VLDB). 778--789. Google ScholarDigital Library
Lakshmanan, L. V. S., Pei, J., and Zhao, Y. 2003. Qc-trees: An efficient summary structure for semantic olap. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 64--75. Google ScholarDigital Library
Lawrence, M. 2006. Multiobjective genetic algorithms for materialized view selection in olap data warehouses. In Proceedings of Genetic and Evolutionary Computation Congress (GECCO). 699--706. Google ScholarDigital Library
Lee, M. and Hammer, J. 1999. Speeding up warehouse physical design using a randomized algorithm. In Proceedings of Design and Management of Data Warehouses (DMDW). 3.Google Scholar
Liang, W., Wang, H., and Orlowska, M. E. 2001. Materialized view selection under the maintenance time constraint. Data Knowl. Eng. 37, 2, 203--216. Google ScholarDigital Library
Morfonios, K. and Ioannidis, Y. 2006a. Cure for cubes: Cubing using a rolap engine. In Proceedings of Very Large Data Bases (VLDB). 379--390. Google ScholarDigital Library
Morfonios, K. and Ioannidis, Y. 2006b. Supporting the data cube lifecycle: The power of ROLAP. Int. J. VLDB. http//dx.doi.org/10.1007/s00778-006-0036-8. Google ScholarDigital Library
Nadeau, T. P. and Teorey, T. J. 2002. Achieving scalability in olap materialized view selection. In Proceedings of ACM International Workshop on Data Warehousing and OLAP (DOLAP). 28--34. Google ScholarDigital Library
Poosala, V. and Ganti, V. 1999. Fast approximate answers to aggregate queries on a data cube. In Proceedings of International Conference on Statistical and Scientific Database Management (SSDBM). 24--33. Google ScholarDigital Library
Ross, K. A. and Srivastava, D. 1997. Fast computation of sparse datacubes. In Proceedings of Very Large Data Bases (VLDB). 116--125. Google ScholarDigital Library
Roussopoulos, N., Kotidis, Y., and Roussopoulos, M. 1997. Cubetree: Organization of and bulk updates on the data cube. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 89--99. Google ScholarDigital Library
Shanmugasundaram, J., Fayyad, U. M., and Bradley, P. S. 1999. Compressed data cubes for OLAP aggregate query approximation on continuous dimensions. In Proceedings of International Conference on Knowledge Discovery and Data Mining (KDD). 223--232. Google ScholarDigital Library
Shao, Z., Han, J., and Xin, D. 2004. Mm-cubing: Computing iceberg cubes by factorizing the lattice space. In Proceedings of International Conference on Scientific and Statistical Database Management (SSDBM). 213--222. Google ScholarCross Ref
Shukla, A., Deshpande, P., and Naughton, J. F. 1998. Materialized view selection for multidimensional datasets. In Proceedings of Very Large Data Bases (VLDB). 488--499. Google ScholarDigital Library
Shukla, A., Deshpande, P., Naughton, J. F., and Ramasamy, K. 1996. Storage estimation for multidimensional aggregates in the presence of hierarchies. In Proceedings of Very Large Data Bases (VLDB). 522--531. Google ScholarDigital Library
Sismanis, Y., Deligiannakis, A., Roussopoulos, N., and Kotidis, Y. 2002. Dwarf: shrinking the petacube. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 464--475. Google ScholarDigital Library
Sismanis, Y. and Roussopoulos, N. 2004. The complexity of fully materialized coalesced cubes. In Proceedings of Very Large Data Bases (VLDB). 540--551. Google ScholarDigital Library
Theodoratos, D., Dalamagas, T., Simitsis, A., and Stavropoulos, M. 2001. A randomized approach for the incremental design of an evolving data warehouse. In Proceedings of the 20th International Conference on Conceptual Modeling (ER'01). 325--338. Google ScholarDigital Library
Theodoratos, D. and Sellis, T. K. 1997. Data warehouse configuration. In Proceedings of Very Large Data Bases (VLDB). 126--135. Google ScholarDigital Library
Uchiyama, H., Runapongsa, K., and Teorey, T. J. 1999. A progressive view materialization algorithm. In Proceedings of ACM International Workshop on Data Warehousing and OLAP (DOLAP). 36--41. Google ScholarDigital Library
Vitter, J. S. and Wang, M. 1999. Approximate computation of multidimensional aggregates of sparse data using wavelets. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 193--204. Google ScholarDigital Library
Vitter, J. S., Wang, M., and Iyer, B. R. 1998. Data cube approximation and histograms via wavelets. In Proceedings of International Conference on Information and Knowledge Management (CIKM). 96--104. Google ScholarDigital Library
Wang, W., Feng, J., Lu, H., and Yu, J. X. 2002. Condensed cube: An efficient approach to reducing data cube size. In Proceedings of International Conference on Data Engineering (ICDE). 155--165. Google ScholarDigital Library
Xin, D., Han, J., Li, X., and Wah, B. W. 2003. Star-cubing: Computing iceberg cubes by top-down and bottom-up integration. In Proceedings of Very Large Data Bases (VLDB). 476--487. Google ScholarDigital Library
Yu, J. X., Yao, X., Choi, C.-H., and Gou, G. 2003. Materialized view selection as constrained evolutionary optimization. IEEE Trans. Syst., Man, Cybernetics, Part C 33, 4, 458--467.Google ScholarDigital Library
Zhao, Y., Deshpande, P., and Naughton, J. F. 1997. An array-based algorithm for simultaneous multidimensional aggregates. In Proceedings of ACM Special Interest Group on Management of Data (SIGMOD). 159--170. Google ScholarDigital Library

Index Terms

ROLAP implementations of the data cube
1. Information systems
  1. Information systems applications

Recommendations

Graph cube: on warehousing and OLAP multidimensional networks
SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

We consider extending decision support facilities toward large sophisticated networks, upon which multidimensional attributes are associated with network entities, thereby forming the so-called multidimensional networks. Data warehouses and OLAP (Online ...
Read More
An efficient method for maintaining data cubes incrementally

The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data ...
Read More
Multidimensional models: constructing data CUBE
CompSysTech '04: Proceedings of the 5th international conference on Computer systems and technologies

The goal of this article is to depict algorithms and new approach of creating data cube efficiently and to set tasks for future work. In short, the paper has described it as a data abstraction that allows one to view aggregated data from a number of ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Computing Surveys Volume 39, Issue 4
2007
134 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/1287620
Issue’s Table of Contents

Copyright © 2007 ACM
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 November 2007
Published in csur Volume 39, Issue 4

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Data cube
OLAP
data warehouse
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 44
  Total Citations
  View Citations
- 2,897
  Total Downloads
- Downloads (Last 12 months)46
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

ROLAP implementations of the data cube

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

Graph cube: on warehousing and OLAP multidimensional networks

An efficient method for maintaining data cubes incrementally

Multidimensional models: constructing data CUBE

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

ROLAP implementations of the data cube

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

Graph cube: on warehousing and OLAP multidimensional networks

An efficient method for maintaining data cubes incrementally

Multidimensional models: constructing data CUBE

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media