2012 | OriginalPaper | Book chapter
TPC-H Benchmark Analytics Scenarios and Performances on Hadoop Data Clouds
Written by: Rim Moussa
Published in: Networked Digital Technologies
Publisher: Springer Berlin Heidelberg
NoSQL systems rose alongside internet companies, whose data-management challenges traditional RDBMS solutions could not cope with. Indeed, to handle the continuous growth of data, NoSQL alternatives feature dynamic horizontal scaling rather than vertical scaling. To date, few studies address OLAP benchmarking of NoSQL systems. This paper gives an overview of NoSQL and adjacent technologies and evaluates Hadoop/Pig using the TPC-H benchmark under two cloud scenarios. The first scenario assumes that data is stored on a data cloud and business questions are routed to the cloud for processing; the second assumes that pre-summarized data is computed in a first step and multidimensional analysis is performed in a second step. Finally, the paper reports thorough performance tests on Hadoop for various data volumes, workloads, and cluster sizes.
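The difference between the two scenarios can be illustrated with a minimal Python sketch. This is not the chapter's Pig Latin workload: the toy rows, the column choice (ship year, return flag, extended price), and all function names are assumptions made up for illustration, loosely modelled on the TPC-H `lineitem` table.

```python
from collections import defaultdict

# Hypothetical toy rows: (ship_year, return_flag, extended_price).
# The values are invented for illustration, not taken from TPC-H data.
ROWS = [
    (1994, "A", 100.0),
    (1994, "N", 250.0),
    (1995, "A", 75.0),
    (1995, "A", 25.0),
    (1995, "N", 300.0),
]


def answer_on_raw_data(rows, year):
    """Scenario 1: every business question scans the detailed data."""
    return sum(price for y, _, price in rows if y == year)


def build_summary(rows):
    """Scenario 2, step 1: pre-aggregate once by (year, return_flag)."""
    summary = defaultdict(float)
    for y, flag, price in rows:
        summary[(y, flag)] += price
    return dict(summary)


def answer_on_summary(summary, year):
    """Scenario 2, step 2: roll up the small summary instead of rescanning rows."""
    return sum(total for (y, _), total in summary.items() if y == year)


if __name__ == "__main__":
    summary = build_summary(ROWS)
    print(answer_on_raw_data(ROWS, 1995))
    print(answer_on_summary(summary, 1995))
```

Both paths return the same revenue per year; the second pays the full scan once when building the summary and then answers later questions from a much smaller structure, which is the trade-off the second scenario assumes.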