| Machine type | ccNUMA system |
| Models | NovaScale 5080, 5160 |
| Operating system | Linux, Windows Server 2003, GCOS 8 |
| Connection structure | Full crossbar |
| Compilers | Intel's Fortran 95, C(++) |
| Vendor's information Web page | http://www.bull.com/novascale/ |
| Year of introduction | 2002 |
System parameters:
| Model | NovaScale 5080 | NovaScale 5160 |
| Clock cycle | 1.5 GHz | 1.5 GHz |
| Theor. peak performance | 48 Gflop/s | 96 Gflop/s |
| No. of processors | 8 | 16 |
| Comm. bandwidth | | |
| Point-to-point | 6.4 GB/s | 6.4 GB/s |
| Aggregate | 12.8 GB/s | 25.6 GB/s |
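
The theoretical peak figures in the table can be reproduced from the clock rate and the processor count. The sketch below assumes 4 floating-point operations per cycle per Itanium 2 processor (two fused multiply-add units); this factor is not stated in the table itself, and the function name is only illustrative.

```python
def peak_gflops(clock_ghz, n_procs, flops_per_cycle=4):
    """Theoretical peak in Gflop/s: clock (GHz) x flops per cycle x processors.

    flops_per_cycle=4 is an assumption for the Itanium 2 (two FMA units).
    """
    return clock_ghz * flops_per_cycle * n_procs


# Processor counts and clock rate taken from the system parameters table.
for model, n_procs in (("NovaScale 5080", 8), ("NovaScale 5160", 16)):
    print(f"{model}: {peak_gflops(1.5, n_procs):.0f} Gflop/s")
```

Under this assumption the results agree with the 48 and 96 Gflop/s listed for the two models.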
Remarks:
The availability of the Itanium 2 has spurred some vendors that are
traditionally not active in the HPC business to try their hand in this area.
One of these is Bull, which markets its NovaScale ccNUMA SMPs with up to 16
nodes. The NovaScale systems are built from standard Intel Quad Building Blocks
(QBBs), each housing 4 Itanium 2 processors and a part of the memory. The QBBs
in turn are connected by Bull's proprietary FAME Scalability Switch (FSS),
which provides an aggregate bandwidth of 25.6 GB/s. For reliability reasons a
NovaScale 5160 is equipped with 2 FSSes. This ensures that when any link between
a QBB and a switch or between switches fails, the system remains operational,
albeit at a lower level of communication performance. As each FSS has 8 ports
and only 6 of these are occupied within a 5160 system, the remaining ports can
be used to couple two of these systems, thus making a 32-processor ccNUMA
system. Larger configurations can be made by coupling systems via QsNet II (see
section QsNet). Bull provides its own MPI implementation, which turns out to be
very efficient (see "Measured Performances" below and [42]).
A distinctive feature of the NovaScale systems is that they can be partitioned
such that different nodes can run different operating systems, and that
repartitioning can be done dynamically. Although this is not particularly
enticing for HPC users, it might be interesting for other markets, especially as
Bull still has clients that use its proprietary GCOS operating system.
A smaller system, the NovaScale 4040, with 4 processors, is also available as a
departmental server. As Bull employs the Itanium 2, the Fortran 95 and C
compilers from Intel are automatically available. Bull's documentation gives no
information about other HPC software that might be available, but the systems
should be able to run all third-party software that has been ported to the
Itanium 2 platform.
Measured Performances:
In the spring of 2004 rather extensive experiments with the EuroBen Benchmark
were performed on a 16-processor NovaScale 5160 with the 1.3 GHz variant of the
processor. The MPI version of a dense matrix-vector multiply ran at 13.3 Gflop/s
on 16 processors, while both for solving a dense linear system of size N = 1,000
and for a 1-D FFT of size N = 65,536, speeds of 3.3 to 3.4 Gflop/s were observed
(see [42]).
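
To put these measurements in perspective, they can be expressed as a fraction of the theoretical peak of the machine that was tested. The sketch below is an illustration only: it assumes 4 floating-point operations per cycle per Itanium 2 processor, which the text does not state, and the names used are hypothetical.

```python
# Assumption: 4 flops per cycle per Itanium 2 processor (two FMA units).
FLOPS_PER_CYCLE = 4


def fraction_of_peak(measured_gflops, clock_ghz, n_procs):
    """Measured speed divided by the theoretical peak (both in Gflop/s)."""
    peak = clock_ghz * FLOPS_PER_CYCLE * n_procs
    return measured_gflops / peak


# Measured speeds quoted in the text, on 16 processors at 1.3 GHz.
measurements = {
    "dense matrix-vector multiply (MPI)": 13.3,
    "linear system / 1-D FFT, lower value": 3.3,
    "linear system / 1-D FFT, upper value": 3.4,
}

for name, gflops in measurements.items():
    share = fraction_of_peak(gflops, clock_ghz=1.3, n_procs=16)
    print(f"{name}: {100 * share:.1f}% of peak")
```

Under the stated assumption the peak of the tested system is 83.2 Gflop/s, so the matrix-vector multiply reaches roughly one sixth of peak and the other two kernels about 4%.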
Aad van der Steen
Tue Mar 8 14:08:13 CET 2005