1 Introduction
2 Experimental resources
2.1 PS3 cluster architecture and setting
-
The PS3 is an open platform: it can run different operating systems (e.g., Fedora Core 8 for PPC).
-
The PS3 system contains a Cell B.E. processor that differs slightly from the original (6 SPUs and 1 PPU).
-
Its low price, approximately $300, makes it attractive as a computing node in a cluster system.
-
Formatting the 9 PS3 nodes.
-
Installing the Fedora Core 8 operating system for PPC64.
-
Installing the SSH service and NFS on each station, used for MPI communication and file sharing.
-
Setting up NFS with one station as the master node and the other eight stations as clients.
-
Installing and configuring the OpenMPI library on all stations.
-
Installing Cell SDK 3.0 on each station.
-
\(t_{\mathrm{setup}}(n)\): time to set up the nodes and initiate the PPU and SPU threads.
-
\(\max(t_{\mathrm{com}}(i))\): maximum time to transmit the initial parameters to the PPU threads (MPI library).
-
\(t_{\mathrm{SPU}}(6)\): maximum time to transmit the initial parameters to the SPU threads (Cell SDK library).
-
\(\max(t_{\mathrm{calculation}}(1/n))\): maximum time to compute the π fractions.
-
\(t_{\mathrm{PPU}\pi}\): time to add the π fractions from the SPUs on the PPU.
-
\(\max(t_{\mathrm{comR}}(i))\): maximum time to send the partial results from the nodes to the master node using the MPI library.
-
\(t_{\mathrm{master\_node}\,\pi}\): time to add the π fractions from the nodes on the master node.
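Read together, the components listed above suggest a simple additive model of the total run time. The following is only a sketch: it assumes that the phases execute strictly one after another with no overlap between communication and computation, and the symbol \(T_{\mathrm{total}}(n)\) is introduced here for illustration and does not come from the measurements.

```latex
T_{\mathrm{total}}(n) \approx
    t_{\mathrm{setup}}(n)
  + \max_{i} t_{\mathrm{com}}(i)
  + t_{\mathrm{SPU}}(6)
  + \max t_{\mathrm{calculation}}(1/n)
  + t_{\mathrm{PPU}\pi}
  + \max_{i} t_{\mathrm{comR}}(i)
  + t_{\mathrm{master\_node}\,\pi}
```

Under this assumption only the calculation term shrinks as nodes are added, while the setup and communication terms do not, so they would bound the achievable speedup.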
2.2 Cell architecture
3 Experimental work
3.1 Trapezoid area calculation method using the MPI distribution
3.2 Trapezoid area calculation method using MPI-SDK distribution
-
Pi→20+LOG(Error) obtained for the classical version of MPI.
-
Err 7.3→20+LOG(Error) obtained for the optimized version of MPI-SDK unbalanced with a coefficient k=7.3.
-
Pi→SQRT(Time(s)) obtained for the classical version of MPI.
-
Time 7.3→SQRT(Time(s)) obtained for the optimized version of MPI-SDK unbalanced with a coefficient k=7.3.
-
Using the SPUs for double-precision calculation greatly degrades their performance.
-
An unbalanced PPU-SPU distribution improves computing time compared with using the PPU units only.
-
The distribution coefficient must be determined for a small number of calculation points (a few tenths of a second) and then used for the high-resolution calculation (a resolution of \(10^{12}\), taking several hours).
-
The distribution coefficient scales linearly with increasing resolution.
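The unbalanced PPU-SPU distribution described above can be illustrated with a small single-process sketch. It rests on several assumptions that the text does not confirm: π is taken as the integral of 4/(1+x²) over [0, 1], the coefficient k is read as "the PPU receives k times the work of one SPU", and the function names `trapezoid_pi`, `split_unbalanced` and `pi_unbalanced` are hypothetical. No MPI or Cell SDK calls are used; the chunks are simply evaluated one after another.

```python
import math


def trapezoid_pi(n_intervals, start=0, stop=None):
    """Trapezoid-rule contribution of intervals [start, stop) out of
    n_intervals equal intervals on [0, 1], for f(x) = 4 / (1 + x^2)."""
    if stop is None:
        stop = n_intervals
    h = 1.0 / n_intervals
    total = 0.0
    for i in range(start, stop):
        left = 4.0 / (1.0 + (i * h) ** 2)
        right = 4.0 / (1.0 + ((i + 1) * h) ** 2)
        total += 0.5 * (left + right) * h
    return total


def split_unbalanced(n_intervals, n_spu=6, k=7.3):
    """Return (start, stop) bounds: first the PPU chunk, then n_spu SPU chunks.

    Assumption: the PPU gets roughly k times the share of a single SPU;
    the PPU also absorbs the remainder left by integer rounding.
    """
    spu_share = int(n_intervals / (k + n_spu))
    ppu_count = n_intervals - n_spu * spu_share
    bounds = [(0, ppu_count)]
    start = ppu_count
    for _ in range(n_spu):
        bounds.append((start, start + spu_share))
        start += spu_share
    return bounds


def pi_unbalanced(n_intervals, n_spu=6, k=7.3):
    """Sum the partial results of all chunks, as the PPU would do."""
    return sum(trapezoid_pi(n_intervals, a, b)
               for a, b in split_unbalanced(n_intervals, n_spu, k))


if __name__ == "__main__":
    print(split_unbalanced(10000))
    print(abs(pi_unbalanced(10000) - math.pi))
```

In a cluster setting, each node would apply the same split to its own sub-range of intervals. Following the procedure in the list above, the value of k would first be calibrated on a short run and then reused for the long one.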
3.3 Calculation of the rectangle area method using MPI distribution
3.4 Rectangle area calculation method using MPI-SDK distribution
4 Conclusions
-
Introducing the SPU units into the double-precision (DP) calculation improves calculation time by about 50 %, but this is well below the potential the SPU units would have if they had a dedicated DP unit.
-
We also obtain a lower error in the calculated value of π.
-
To obtain the best computing time, the calculations must be distributed between the SPU and PPU units using an unbalanced distribution formula.
-
The distribution coefficient k takes different values depending on the number and complexity of the calculations.
-
It is preferable to use the SPU units in double-precision calculations, but to a lesser extent.