Skip to main content
Erschienen in: Cluster Computing 3/2012

01.09.2012

The impact of heterogeneous multi-core clusters on graph partitioning: an empirical study

verfasst von: Siew Yin Chan, Teck Chaw Ling, Eric Aubanel

Erschienen in: Cluster Computing | Ausgabe 3/2012

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The advent of multi-core architectures provides an opportunity for accelerating parallelism in mesh-based applications. This multi-core environment, however, imposes challenges not addressed by conventional graph-partitioning techniques that are originally designed for distributed-memory uniprocessors. As the first step to exploit the multi-core platform, this paper presents experimental evaluation to understand partitioning performance on small-scaled heterogeneous multi-core clusters. With results and analyses gathered, we propose a hierarchical framework for resource-aware graph partitioning on heterogeneous multi-core clusters. Preliminary evaluation demonstrates the potential of the framework and motivates directions for incorporating application requirements into graph partitioning.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
While the OS and other software versions were the same on the testbed, we compiled numerical libraries such as ATLAS 3.9.14 and CLAPACK 3.1.1.1 on different microprocessors to harvest maximum optimization.
 
2
Compilation of certain libraries on which the HPCC benchmarks rely was performed separately on each machine architecture, and the resulted binaries were placed in different folders on the frontend.
 
3
Poisson and SynApp are available on request to aubanel@unb.​ca.
 
4
156,061 mesh vertices and 467,315 edges.
 
5
For brevity, the partitions for Top_v1 and Top_v2 will be named henceforth after the underlying topology.
 
6
We found out that JOSTLE constantly created disconnected partitions for four-level processor graph with P=20. Thus, we opted for three levels to minimize the likelihood of producing disjointed partitions.
 
7
We have submitted the respective graphs to DIMACS10 [38] for public use.
 
Literatur
1.
Zurück zum Zitat Alam, S.R., Agarwal, P.K., Hampton, S.S., Ong, H.: Experimental evaluation of molecular dynamics simulations on multi-core systems. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V. (eds.) High Performance Computing—HiPC 2008. Lecture Notes in Computer Science, vol. 5374, pp. 131–141. Springer, Berlin (2008). doi:10.1007/978-3-540-89894-8_15 CrossRef Alam, S.R., Agarwal, P.K., Hampton, S.S., Ong, H.: Experimental evaluation of molecular dynamics simulations on multi-core systems. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V. (eds.) High Performance Computing—HiPC 2008. Lecture Notes in Computer Science, vol. 5374, pp. 131–141. Springer, Berlin (2008). doi:10.​1007/​978-3-540-89894-8_​15 CrossRef
2.
Zurück zum Zitat Asanovic, K., Bodik, R., Catanzaro, B.C., Gebis, J.J., Husbands, P., Keutzer, K., Patterson, D.A., Plishker, W.L., Shalf, J., Williams, S.W., Yelick, K.A.: The landscape of parallel computing research: A view from Berkeley. Tech. rep, EECS Department, University of California, Berkeley (2006) Asanovic, K., Bodik, R., Catanzaro, B.C., Gebis, J.J., Husbands, P., Keutzer, K., Patterson, D.A., Plishker, W.L., Shalf, J., Williams, S.W., Yelick, K.A.: The landscape of parallel computing research: A view from Berkeley. Tech. rep, EECS Department, University of California, Berkeley (2006)
3.
Zurück zum Zitat Asanovic, K., Bodik, R., Demmel, J., Keaveny, T., Keutzer, K., Kubiatowicz, J., Morgan, N., Patterson, D., Sen, K., Wawrzynek, J., Wessel, D., Yelick, K.: A view of the parallel computing landscape. Commun. ACM 52(10), 56–67 (2009). doi:10.1145/1562764.1562783 CrossRef Asanovic, K., Bodik, R., Demmel, J., Keaveny, T., Keutzer, K., Kubiatowicz, J., Morgan, N., Patterson, D., Sen, K., Wawrzynek, J., Wessel, D., Yelick, K.: A view of the parallel computing landscape. Commun. ACM 52(10), 56–67 (2009). doi:10.​1145/​1562764.​1562783 CrossRef
4.
Zurück zum Zitat Aubanel, E.: Resource-aware load balancing of parallel applications. In: Udoh, E., Wang, F.Z. (eds.) Handbook of Research on Grid Technologies and Utility Computing: Concepts for Managing Large-Scale Applications, pp. 12–21. IGI Global, Hershey (2009) CrossRef Aubanel, E.: Resource-aware load balancing of parallel applications. In: Udoh, E., Wang, F.Z. (eds.) Handbook of Research on Grid Technologies and Utility Computing: Concepts for Managing Large-Scale Applications, pp. 12–21. IGI Global, Hershey (2009) CrossRef
5.
Zurück zum Zitat Barnard, S.T., Simon, H.D.: Fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. Concurrency 6(2), 101–117 (1994). doi:10.1002/cpe.433006020 CrossRef Barnard, S.T., Simon, H.D.: Fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. Concurrency 6(2), 101–117 (1994). doi:10.​1002/​cpe.​433006020 CrossRef
8.
Zurück zum Zitat Broquedis, F., Clet-Ortega, J., Moreaud, S., Furmento, N., Goglin, B., Mercier, G., Thibault, S., Namyst, R.: hwloc: a generic framework for managing hardware affinities in hpc applications. In: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 2010), Pisa, Italy, pp. 180–186 (2010) CrossRef Broquedis, F., Clet-Ortega, J., Moreaud, S., Furmento, N., Goglin, B., Mercier, G., Thibault, S., Namyst, R.: hwloc: a generic framework for managing hardware affinities in hpc applications. In: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 2010), Pisa, Italy, pp. 180–186 (2010) CrossRef
9.
Zurück zum Zitat Bui, T., Jones, C.: A heuristic for reducing fill-in sparse matrix factorization. In: Proceedings of the 6th SIAM Conf. Parallel Processing for Scientific Computing, pp. 445–452. SIAM, Philadelphia (1993) Bui, T., Jones, C.: A heuristic for reducing fill-in sparse matrix factorization. In: Proceedings of the 6th SIAM Conf. Parallel Processing for Scientific Computing, pp. 445–452. SIAM, Philadelphia (1993)
11.
Zurück zum Zitat Chai, L., Gao, Q., Panda, D.: Understanding the impact of multi-core architecture in cluster computing: A case study with Intel dual-core system. In: Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGRID 2007), Rio De Janeiro, pp. 471–478. IEEE, New York (2007). doi:10.1109/CCGRID.2007.119 CrossRef Chai, L., Gao, Q., Panda, D.: Understanding the impact of multi-core architecture in cluster computing: A case study with Intel dual-core system. In: Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGRID 2007), Rio De Janeiro, pp. 471–478. IEEE, New York (2007). doi:10.​1109/​CCGRID.​2007.​119 CrossRef
12.
Zurück zum Zitat Chan, S.Y., Ling, T.C., Aubanel, E.: Benchmarking and profiling heterogeneous multi-core clusters using graph-partitioning workload. Tech. rep. tech. report cs-2011-01, Faculty of Computer Science and Information Technology, University of Malaya (2011) Chan, S.Y., Ling, T.C., Aubanel, E.: Benchmarking and profiling heterogeneous multi-core clusters using graph-partitioning workload. Tech. rep. tech. report cs-2011-01, Faculty of Computer Science and Information Technology, University of Malaya (2011)
13.
Zurück zum Zitat Chartrand, G., Zhang, P.: Introduction to Graph Theory. Walter Rudin Series in Advanced Mathematics. McGraw-Hill Higher Education, Singapore (2005) MATH Chartrand, G., Zhang, P.: Introduction to Graph Theory. Walter Rudin Series in Advanced Mathematics. McGraw-Hill Higher Education, Singapore (2005) MATH
14.
15.
Zurück zum Zitat Chen, H., Chen, W., Huang, J., Robert, B., Kuhn, H.: Mpipp: an automatic profile-guided parallel process placement toolset for smp clusters and multiclusters. In: Proceedings of the 20th Annual International Conference on Supercomputing, Cairns, Queensland, Australia, pp. 353–360 (2006). doi:10.1145/1183401.1183451 CrossRef Chen, H., Chen, W., Huang, J., Robert, B., Kuhn, H.: Mpipp: an automatic profile-guided parallel process placement toolset for smp clusters and multiclusters. In: Proceedings of the 20th Annual International Conference on Supercomputing, Cairns, Queensland, Australia, pp. 353–360 (2006). doi:10.​1145/​1183401.​1183451 CrossRef
16.
Zurück zum Zitat Clout, B., Aubanel, E.: Ehgrid: an emulator of heterogeneous computational grids. In: IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009), Rome, pp. 1–8 (2009). doi:10.1109/IPDPS.2009.5161167 CrossRef Clout, B., Aubanel, E.: Ehgrid: an emulator of heterogeneous computational grids. In: IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009), Rome, pp. 1–8 (2009). doi:10.​1109/​IPDPS.​2009.​5161167 CrossRef
18.
Zurück zum Zitat Devine, K., Boman, E., Heaphy, R., Hendrickson, B., Vaughan, C.: Zoltan data management services for parallel dynamic applications. Comput. Sci. Eng. 7(2), 90–97 (2002) CrossRef Devine, K., Boman, E., Heaphy, R., Hendrickson, B., Vaughan, C.: Zoltan data management services for parallel dynamic applications. Comput. Sci. Eng. 7(2), 90–97 (2002) CrossRef
19.
Zurück zum Zitat Dümmler, J., Rauber, T., Rünger, G.: Mapping algorithms for multiprocessor tasks on multi-core clusters. In: Proc. of the 37th International Conference on Parallel Processing (ICPP 2008), Portland, Oregon, USA, pp. 141–148. IEEE Computer Society, Los Alamitos (2008). doi:10.1109/ICPP.2008.42 CrossRef Dümmler, J., Rauber, T., Rünger, G.: Mapping algorithms for multiprocessor tasks on multi-core clusters. In: Proc. of the 37th International Conference on Parallel Processing (ICPP 2008), Portland, Oregon, USA, pp. 141–148. IEEE Computer Society, Los Alamitos (2008). doi:10.​1109/​ICPP.​2008.​42 CrossRef
20.
Zurück zum Zitat Faik, J., Flaherty, J.E., Gervasio, L.G., Teresco, J.D., Devine, K.D.: A model for resource aware load balancing on heterogeneous clusters. Tech. Rep. CS-05-01, Williams College Department of Computer Science (2005) Faik, J., Flaherty, J.E., Gervasio, L.G., Teresco, J.D., Devine, K.D.: A model for resource aware load balancing on heterogeneous clusters. Tech. Rep. CS-05-01, Williams College Department of Computer Science (2005)
22.
Zurück zum Zitat Hendrickson, B.: Load balancing fictions, falsehoods and fallacies. Appl. Math. Model. 25, 99–108 (2000) MATHCrossRef Hendrickson, B.: Load balancing fictions, falsehoods and fallacies. Appl. Math. Model. 25, 99–108 (2000) MATHCrossRef
24.
Zurück zum Zitat Hendrickson, B., Leland, R.: A multilevel algorithm for partitioning graphs. In: Karin, S. (ed.) Proceedings of the ACM/IEEE conference on Supercomputing, p. 28. ACM, New York (1995). doi: 10.1145/224170.224228 Hendrickson, B., Leland, R.: A multilevel algorithm for partitioning graphs. In: Karin, S. (ed.) Proceedings of the ACM/IEEE conference on Supercomputing, p. 28. ACM, New York (1995). doi: 10.​1145/​224170.​224228
25.
Zurück zum Zitat Hood, R., Jin, H., Mehrotra, P., Chang, J., Djomehri, J., Gavali, S., Jespersen, D., Taylor, K., Biswas, R.: Performance impact of resource contention in multicore systems. In: IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), Atlanta, GA, pp. 1–12. IEEE, New York (2010). doi:10.1109/IPDPS.2010.5470399 CrossRef Hood, R., Jin, H., Mehrotra, P., Chang, J., Djomehri, J., Gavali, S., Jespersen, D., Taylor, K., Biswas, R.: Performance impact of resource contention in multicore systems. In: IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), Atlanta, GA, pp. 1–12. IEEE, New York (2010). doi:10.​1109/​IPDPS.​2010.​5470399 CrossRef
26.
Zurück zum Zitat Huang, S., Aubanel, E., Bhavsar, V.: Pagrid: A mesh partitioner for computational grids. J. Grid Comput. 4(1), 71–88 (2006) CrossRef Huang, S., Aubanel, E., Bhavsar, V.: Pagrid: A mesh partitioner for computational grids. J. Grid Comput. 4(1), 71–88 (2006) CrossRef
27.
Zurück zum Zitat Jeannot, E., Mercier, G.: Near-optimal placement of mpi processes on hierarchical numa architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010—Parallel Processing. Lecture Notes in Computer Science, vol. 6272, pp. 199–210. Springer, Berlin (2010). doi:10.1007/978-3-642-15291-7_20 Jeannot, E., Mercier, G.: Near-optimal placement of mpi processes on hierarchical numa architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010—Parallel Processing. Lecture Notes in Computer Science, vol. 6272, pp. 199–210. Springer, Berlin (2010). doi:10.​1007/​978-3-642-15291-7_​20
31.
Zurück zum Zitat Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(1), 291–307 (1970) MATH Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(1), 291–307 (1970) MATH
32.
Zurück zum Zitat Koenig, G.A., Kalé, L.V.: Optimizing distributed application performance using dynamic grid topology-aware load balancing. In: International Parallel and Distributed Processing Symposium, Long Beach, CA, pp. 1–10 (2007) CrossRef Koenig, G.A., Kalé, L.V.: Optimizing distributed application performance using dynamic grid topology-aware load balancing. In: International Parallel and Distributed Processing Symposium, Long Beach, CA, pp. 1–10 (2007) CrossRef
35.
Zurück zum Zitat Leng, T., Ali, R., Hsieh, J., Mashayekhi, V., Rooholamini, R.: Performance impact of process mapping on small-scale smp clusters—a case study using high performance linpack. In: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’02), pp. 236–243. IEEE, New York (2002). doi:10.1109/IPDPS.2002.1016657 CrossRef Leng, T., Ali, R., Hsieh, J., Mashayekhi, V., Rooholamini, R.: Performance impact of process mapping on small-scale smp clusters—a case study using high performance linpack. In: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’02), pp. 236–243. IEEE, New York (2002). doi:10.​1109/​IPDPS.​2002.​1016657 CrossRef
37.
Zurück zum Zitat Luszczek, P., Dongarra, J.J., Koester, D., Rabenseifner, R., Lucas, B., Kepner, J., McCalpin, J., Bailey, D., Takahashi, D.: Introduction to the hpc challenge benchmark suite. Tech. Rep. Paper LBNL-57493, Lawrence Berkeley National Laboratory (2005) Luszczek, P., Dongarra, J.J., Koester, D., Rabenseifner, R., Lucas, B., Kepner, J., McCalpin, J., Bailey, D., Takahashi, D.: Introduction to the hpc challenge benchmark suite. Tech. Rep. Paper LBNL-57493, Lawrence Berkeley National Laboratory (2005)
40.
Zurück zum Zitat Moulitsas, I., Karypis, G.: Architecture aware partitioning algorithms. In: Algorithms and Architectures for Parallel Processing. Lecture Notes in Computer Science, vol. 5022, pp. 42–53. Springer, Berlin (2008). doi:10.1007/978-3-540-69501-1 CrossRef Moulitsas, I., Karypis, G.: Architecture aware partitioning algorithms. In: Algorithms and Architectures for Parallel Processing. Lecture Notes in Computer Science, vol. 5022, pp. 42–53. Springer, Berlin (2008). doi:10.​1007/​978-3-540-69501-1 CrossRef
41.
42.
Zurück zum Zitat Schloegel, K., Karyis, G., Kumar, V.: Graph partitioning for high-performance scientific simulations. In: Sourcebook of Parallel Computing, pp. 491–541. Morgan Kaufmann, San Francisco (2003) Schloegel, K., Karyis, G., Kumar, V.: Graph partitioning for high-performance scientific simulations. In: Sourcebook of Parallel Computing, pp. 491–541. Morgan Kaufmann, San Francisco (2003)
44.
Zurück zum Zitat Sinha, S., Parashar, M.: Adaptive system sensitive partitioning of amr applications on heterogeneous clusters. Clust. Comput. 5(4), 343–352 (2002) CrossRef Sinha, S., Parashar, M.: Adaptive system sensitive partitioning of amr applications on heterogeneous clusters. Clust. Comput. 5(4), 343–352 (2002) CrossRef
45.
Zurück zum Zitat Teresco, J.D., Faik, J., Flaherty, J.E.: Hierarchical partitioning and dynamic load balancing for scientific computation. In: Dongarra, J., Madsen, K., Wasniewski, J. (eds.) Applied Parallel Computing. Lecture Notes in Computer Science, vol. 3732, pp. 911–920. Springer, Berlin (2006). doi:10.1007/11558958 Teresco, J.D., Faik, J., Flaherty, J.E.: Hierarchical partitioning and dynamic load balancing for scientific computation. In: Dongarra, J., Madsen, K., Wasniewski, J. (eds.) Applied Parallel Computing. Lecture Notes in Computer Science, vol. 3732, pp. 911–920. Springer, Berlin (2006). doi:10.​1007/​11558958
49.
Zurück zum Zitat Walshaw, C., Cross, M.: Jostle: parallel multilevel graph-partitioning software—an overview. In: Magoules, F. (ed.) Mesh Partitioning Techniques and Domain Decomposition Techniques, pp. 27–58. Civil-Comp, Stirling (2007) Walshaw, C., Cross, M.: Jostle: parallel multilevel graph-partitioning software—an overview. In: Magoules, F. (ed.) Mesh Partitioning Techniques and Domain Decomposition Techniques, pp. 27–58. Civil-Comp, Stirling (2007)
Metadaten
Titel
The impact of heterogeneous multi-core clusters on graph partitioning: an empirical study
verfasst von
Siew Yin Chan
Teck Chaw Ling
Eric Aubanel
Publikationsdatum
01.09.2012
Verlag
Springer US
Erschienen in
Cluster Computing / Ausgabe 3/2012
Print ISSN: 1386-7857
Elektronische ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-012-0229-4

Weitere Artikel der Ausgabe 3/2012

Cluster Computing 3/2012 Zur Ausgabe