2012 | Original Paper | Book Chapter
HIP: Information Passing for Optimizing Join-Intensive Data Processing Workloads on Hadoop
Authors: Seokyong Hong, Kemafor Anyanwu
Published in: Database and Expert Systems Applications
Publisher: Springer Berlin Heidelberg
Hadoop-based data processing platforms translate join-intensive queries into multiple “jobs” (MapReduce cycles). Such multi-job workflows move a significant amount of data through the disk, network, and memory fabric of a Hadoop cluster, which can negatively impact performance and scalability. Consequently, techniques that minimize the size of intermediate results are useful in this context. In this paper, we present an information passing technique (HIP) that can minimize the size of intermediate data on Hadoop-based data processing platforms.