Published in: The Journal of Supercomputing, Issue 9/2021

4 March 2021

IDCOS: optimization strategy for parallel complex expression computation on big data

Authors: Yang Song, Helin Jin, Hongzhi Wang, You Liu

Abstract

Complex expressions are the basis of data analytics. To process complex expressions on big data efficiently, we developed a novel optimization strategy for parallel computation platforms such as Hadoop and Spark. The strategy aims to minimize the number of data repartition rounds, since repartitioning dominates the cost of such computations. To this end, we modeled the expression as a graph and developed a simplification algorithm for this graph. Based on the graph, we converted the round minimization problem into a graph decomposition problem and developed a linear algorithm for it. We also designed an appropriate implementation of the optimization strategy. Extensive experimental results demonstrate that the proposed approach optimizes the computation of complex expressions effectively and at low cost.
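
To make the idea of an expression graph concrete, here is a minimal Python sketch. It is not the IDCOS algorithm, whose details the abstract does not give. It rests on two assumptions made purely for illustration: identical subexpressions are merged into one shared node (one plausible form of graph simplification), and each aggregate operator (`sum`, `avg`, `count`) costs one repartition round, so the round count is the deepest nesting of aggregates on any path. The names `build_dag`, `rounds`, and `AGGREGATES` are hypothetical.

```python
from typing import Dict, Tuple, Union

# Leaf: column name or constant. Inner node: (operator, *operands).
Expr = Union[str, float, Tuple]

# Assumption: each of these operators needs one repartition round.
AGGREGATES = {"sum", "avg", "count"}


def build_dag(expr: Expr, nodes: Dict[Expr, int]) -> int:
    """Intern expr so that identical subexpressions share one node id."""
    if isinstance(expr, tuple):
        for child in expr[1:]:
            build_dag(child, nodes)
    if expr not in nodes:
        nodes[expr] = len(nodes)
    return nodes[expr]


def rounds(expr: Expr, memo: Dict[Expr, int]) -> int:
    """Largest number of aggregates nested on any path at or below expr."""
    if expr in memo:
        return memo[expr]
    if not isinstance(expr, tuple):
        memo[expr] = 0
        return 0
    op, *args = expr
    depth = max((rounds(a, memo) for a in args), default=0)
    memo[expr] = depth + (1 if op in AGGREGATES else 0)
    return memo[expr]


# Covariance-style and variance-style expressions that share avg(x).
dx = ("-", "x", ("avg", "x"))
dy = ("-", "y", ("avg", "y"))
cov_xy = ("avg", ("*", dx, dy))
var_x = ("avg", ("*", dx, dx))
expr = ("+", cov_xy, var_x)

nodes: Dict[Expr, int] = {}
build_dag(expr, nodes)
num_nodes = len(nodes)          # distinct subexpressions after merging
num_rounds = rounds(expr, {})   # aggregate nesting depth under the assumption
print(num_nodes, num_rounds)
```

In this toy case the expression tree has 21 nodes, while the merged graph has 11, and the four `avg` operators collapse into two levels of nesting. A real optimizer would also have to decide which aggregates can actually share a round on a given platform.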

Metadata
Title: IDCOS: optimization strategy for parallel complex expression computation on big data
Authors: Yang Song, Helin Jin, Hongzhi Wang, You Liu
Publication date: 4 March 2021
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 9/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-021-03674-y
