nach oben

The Journal of Supercomputing

Erschienen in:

01.02.2014

Measurement of the latency parameters of the Multi-BSP model: a multicore benchmarking approach

Erschienen in: The Journal of Supercomputing | Ausgabe 2/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Computer benchmarking is a common method for measuring the parameters of a computational model. It helps to measure the parameters of any computer. With the emergence of multicore computers, the evaluation of computers was brought under consideration. Since these types of computers can be viewed and considered as parallel computers, the evaluation methods for parallel computers may be appropriate for multicore computers. However, because multicore architectures seriously focus on cache hierarchy, there is a need for new and different benchmarks to evaluate them correctly.

To this end, this paper presents a method for measuring the parameters of one of the most famous multicore computational models, namely Multi-Bulk Synchronous Parallel (Multi-BSP). This method measures the hardware latency parameters of multicore computers, namely communication latency (g _i) and synchronization latency (L _i) for all levels of the cache memory hierarchy in a bottom-up manner. By determining the parameters, the performance of algorithms on multicore architectures can be evaluated as a sequence.

Vorheriger Artikel Recent progress and challenges in exploiting graphics processors in computational fluid dynamics

Nächster Artikel SHAPE—an approach for self-healing and self-protection in complex distributed networks

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Aggarwal A, Vitter JS (1988) The input/output complexity of sorting and related problems. Commun ACM 31:1116–1127. doi:10.1145/48529.48535 CrossRefMathSciNet

Alpern B, Carter L, Feig E, Selker T (1994) The uniform memory hierarchy model of computation. Algorithmica 12:72–109. doi:10.1007/BF01185206 CrossRefMATHMathSciNet

Arge L, Goodrich MT, Nelson M, Sitchinava N (2008) Fundamental parallel algorithms for private-cache chip multiprocessors. In: Proceedings of the twentieth annual symposium on parallelism in algorithms and architectures, SPAA ’08. ACM, New York, pp 197–206. http://doi.acm.org/10.1145/1378533.1378573 CrossRef

Arge L, Goodrich MT, Sitchinava N, Nelson M (2008) Fundamental parallel algorithms for privatecache chip multiprocessors. In: 20th ACM symposium on parallelism in algorithm and architectures (SPAA). ACM, New York, pp 197–206

Bailey D, Barszcz E, Barton J, Browning D, Carter R, Dagum L, Fatoohi R, Frederickson P, Lasinski T, Schreiber R, Simon H, Venkatakrishnan V, Weeratunga S (1991) The nas parallel benchmarks summary and preliminary results. In: Supercomputing, 1991. Proceedings of the 1991 ACM/IEEE conference on supercomputing, pp 158–165 CrossRef

Bisseling RH (2004) Parallel scientific computation: a structured approach using BSP and MPI. Oxford University Press, Oxford CrossRef

Blelloch GE, Chowdhury RA, Gibbons PB, Ramachandran V, Chen S, Kozuch M (2008) Provably good multicore cache performance for divide-and-conquer algorithms. In: Proceedings of the nineteenth annual ACM-SIAM symposium on discrete algorithms, SODA ’08. Society for Industrial and Applied Mathematics, Philadelphia, pp 501–510. http://portal.acm.org/citation.cfm?id=1347082.1347137

Blelloch GE, Fineman JT, Gibbons PB, Simhadri HV (2011) Scheduling irregular parallel computations on hierarchical caches. In: Proceedings of the 23rd ACM symposium on parallelism in algorithms and architectures, SPAA ’11. ACM, New York, pp 355–366. http://doi.acm.org/10.1145/1989493.1989553 CrossRef

Butenhof DR Programming with POSIX threads. Addison-Wesley

10.

Che S, Boyer M, Meng J, Tarjan D, Sheaffer J, Lee SH, Skadron K (2009) Servet: a benchmark suite for autotuning on multicore clusters. In: Proceedings of international symposium on workload characterization, IISWC2009, pp 44–54 CrossRef

11.

Chowdhury R, Silvestri F, Blakeley B, Ramachandran V (2010) Oblivious algorithms for multicores and network of processors. In: IEEE international symposium on parallel distributed processing (IPDPS), pp 1–12

12.

Culler DE, Karp RM, Patterson D, Sahay A, Santos EE, Schauser KE, Subramonian R, von Eicken T (1996) Logp: a practical model of parallel computation. Commun ACM 39:78–85. http://doi.acm.org/10.1145/240455.240477 CrossRef

13.

Fortune S, Wyllie J (1978) Parallelism in random access machines. In: Proceedings of the tenth annual ACM symposium on theory of computing, STOC ’78. ACM, New York, pp 114–118. http://doi.acm.org/10.1145/800133.804339 CrossRef

14.

Frigo M, Leiserson CE, Prokop H, Ramachandran S (1999) Cache-oblivious algorithms. In: Annual IEEE symposium on foundations of computer science, p 285

15.

Gal-On S, Levy M (2008) Measuring multicore performance. Computer 41:99–102 CrossRef

16.

Gerbessiotis AV, Lee SY (2004) Remote memory access: a case for portable, efficient and library independent parallel programming. Sci Program 12(3):169–183. http://dl.acm.org/citation.cfm?id=1240140.1240144

17.

Gonzalez-Dominguez J, Taboada G, Fraguela B, Martin M, Tourio J (2010) Servet: a benchmark suite for autotuning on multicore clusters. In: Proceedings of international symposium on parallel and distributed processing, IPDPS2010, pp 1–9

18.

Hill JMD, Donaldson SR, Skillicorn DB (1997) Stability of communication performance in practice: From the cray t3e to networks of workstations. Technical Report PRG-TR-33-97, Programming Research Group, Oxford University Computing Laboratory

19.

Hill JMD, Skillicorn DB (1998) Practical barrier synchronisation. In: 6th EuroMicro workshop on parallel and distributed processing (PDP’98). IEEE Computer Society Press, Los Alamitos, pp 438–444

20.

Hill JM, McColl B, Stefanescu DC, Goudreau MW, Lang K, Rao SB, Suel T, Tsantilas T, Bisseling RH (1998) Bsplib: the bsp programming library. Parallel Comput 24(14):1947–1980. http://www.sciencedirect.com/science/article/pii/S0167819198000933 CrossRef

21.

Kayi A, Yao Y, El-Ghazawi T, Newby G (2007) Experimental evaluation of emerging multi-core architectures. In: Proceedings of 21th international parallel and distributed processing symposium, IPDPS2007

22.

Linux-operating-systems. http://linux.die.net

23.

Mellor-Crummey JM, Scott ML (1991) Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans Comput Syst 9(1):21–65. http://doi.acm.org/10.1145/103727.103729 CrossRef

24.

Ramachandran V (1997) Qsm: a general purpose shared-memory model for parallel computation. In: Ramesh S, Sivakumar G (eds) Foundations of software technology and theoretical computer science. Lecture notes in computer science, vol 1346. Springer, Berlin, pp 1–5. doi:10.1007/BFb0058018 CrossRef

25.

Sampson J, González R, Collard JF, Jouppi NP, Schlansker M (2005) Fast synchronization for chip multiprocessors. SIGARCH Comput Archit News 33(4):64–69. http://doi.acm.org/10.1145/1105734.1105743 CrossRef

26.

Savage J (1995) Extending the hong-kung model to memory hierarchies. In: Du DZ, Li M (eds) Computing and combinatorics. Lecture notes in computer science, vol 959. Springer, Berlin, pp 270–281. doi:10.1007/BFb0030842 CrossRef

27.

Savage JE, Zubair M (2008) A unified model for multicore architectures. In: Proceedings of the 1st international forum on next-generation multicore/manycore technologies, IFMT ’08. ACM, New York, pp 9:1–9:12. http://doi.acm.org/10.1145/1463768.1463780

28.

Singh JP, Weber WD, Gupta A (1992) Splash: Stanford parallel applications for shared-memory. SIGARCH Comput Archit News 20(1):5–44. http://doi.acm.org/10.1145/130823.130824 CrossRef

29.

S.P.E.C.S. benchmarks http://www.spec.org/

30.

Valiant LG (1990) A bridging model for parallel computation. Commun ACM 33:103–111. http://doi.acm.org/10.1145/79173.79181 CrossRef

31.

Valiant LG (2011) A bridging model for multi-core computing. J Comput Syst Sci 77(1):154–166. http://www.sciencedirect.com/science/article/B6WJ0-508X3P4-5/2/92a9dec04839a9e93887ca26e0d9b3f2, celebrating Karp’s Kyoto Prize CrossRefMATHMathSciNet

32.

Yang TF, Lin CH, Yang CL (2010) Cache-aware task scheduling on multi-core architecture. In: 2010 international symposium on VLSI design automation and test (VLSI-DAT), pp 139–142. doi:10.1109/VDAT.2010.5496710 CrossRef

33.

Yotov K, Roeder T, Pingali K, Gunnels J, Gustavson F (2007) An experimental comparison of cache-oblivious and cache-conscious programs. In: Proceedings of the nineteenth annual ACM symposium on parallel algorithms and architectures, SPAA ’07. ACM, New York, pp 93–104. http://doi.acm.org/10.1145/1248377.1248394 CrossRef

Titel: Measurement of the latency parameters of the Multi-BSP model: a multicore benchmarking approach
Publikationsdatum: 01.02.2014
Erschienen in: The Journal of Supercomputing / Ausgabe 2/2014
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-013-1018-4

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Weitere Artikel der Ausgabe 2/2014

Parallel labeling of massive XML data with MapReduce

Direct volume rendering of unstructured tetrahedral meshes using CUDA and OpenMP

Formal process algebraic modeling, verification, and analysis of an abstract Fuzzy Inference Cloud Service

SHAPE—an approach for self-healing and self-protection in complex distributed networks

Supporting faulty banks in NUCA by NoC assisted remapping mechanisms

Parallel multigrid algorithms based on generic approximate sparse inverses: an SMP approach