| Machine type | Distributed-memory multi-processor. |
| Models | XT3. |
| Operating system | UNICOS/lc, Cray's microkernel Unix. |
| Connection structure | 3-D Torus. |
| Compilers | Fortran 95, C, C++. |
| Vendor's information Web page | www.cray.com/products/xt3/ |
| Year of introduction | 2004. |
System parameters:
| Model | Cray XT3 |
| Clock cycle | 2.4 GHz |
| Theor. peak performance | |
| Per Processor | 4.8 Gflop/s |
| Per Cabinet | 460.8 Gflop/s |
| Max. Configuration | 147 Tflop/s |
| Memory | |
| Per Cabinet | ≤ 768 GB |
| Max. Configuration | 196 TB |
| No. of processors | |
| Per Cabinet | 96 |
| Max. Configuration | 30,508 |
| Communication bandwidth | |
| Bisectional/cabinet | 333 GB/s |
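The per-cabinet entries in the table follow from the per-processor figures. The short sketch below redoes that arithmetic. The flops-per-cycle value is derived here from the listed peak and clock and is not stated in the source, and the 8 GB per PE is the maximum quoted in the remarks.

```python
# Figures taken from the system parameters table and the remarks.
CLOCK_GHZ = 2.4            # clock frequency of one PE
PEAK_PER_PE_GFLOPS = 4.8   # theoretical peak per processor
PES_PER_CABINET = 96       # processors per cabinet
MAX_MEM_PER_PE_GB = 8      # maximum memory per PE (from the remarks)

# Derived quantity, not given in the source: results per clock cycle.
flops_per_cycle = PEAK_PER_PE_GFLOPS / CLOCK_GHZ

# Per-cabinet figures are simple multiples of the per-PE figures.
peak_per_cabinet_gflops = PES_PER_CABINET * PEAK_PER_PE_GFLOPS
mem_per_cabinet_gb = PES_PER_CABINET * MAX_MEM_PER_PE_GB

print(f"flops per cycle   : {flops_per_cycle:.1f}")
print(f"peak per cabinet  : {peak_per_cabinet_gflops:.1f} Gflop/s")
print(f"memory per cabinet: {mem_per_cabinet_gb} GB")
```

Both per-cabinet results agree with the table entries of 460.8 Gflop/s and 768 GB.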
Remarks:
The Cray XT3 is the commercial spinoff of the 10,000+ processor Red Storm
machine, built by Cray for Sandia Laboratories. The structure is similar, except
that no provisions are made for a "classified" and an "unclassified" part in
the machine. The basic processor in a node, called a PE (Processing Element) in
Cray jargon, is the AMD Opteron 100 at 2.4 GHz. Cray chose this uniprocessor
version of the chip because of its lower memory latency (about 60 ns), in
contrast to the SMP-enabled versions, whose memory latency can be up to 2 times
higher. Up to 8 GB of memory can be configured per PE, connected to the
processor by a 6.4 GB/s HyperTransport channel. For connection to the outside
world a PE harbours 2 PCI-X buses, a dual-ported Fibre Channel Host Bus Adaptor
for connecting to disk, and a 10 Gb Ethernet card.
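To put the memory latency quoted above in perspective, it can be expressed in processor clock cycles. The sketch below does this conversion. The helper name `latency_in_cycles` is made up for this illustration, and the factor of 2 is the upper bound mentioned for the SMP-enabled chips, not a measured value.

```python
CLOCK_GHZ = 2.4                  # PE clock frequency
UNIPROCESSOR_LATENCY_NS = 60.0   # "about 60 ns" for the uniprocessor Opteron
SMP_WORST_CASE_FACTOR = 2.0      # "up to 2 times higher" for SMP-enabled versions

def latency_in_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to clock cycles (GHz = cycles per ns)."""
    return latency_ns * clock_ghz

uni_cycles = latency_in_cycles(UNIPROCESSOR_LATENCY_NS, CLOCK_GHZ)
smp_cycles = latency_in_cycles(
    UNIPROCESSOR_LATENCY_NS * SMP_WORST_CASE_FACTOR, CLOCK_GHZ
)

print(f"uniprocessor Opteron: about {uni_cycles:.0f} cycles per memory access")
print(f"SMP-enabled, worst case: up to about {smp_cycles:.0f} cycles")
```

At 2.4 GHz, a 60 ns access costs roughly 144 cycles, so a doubling of the latency is a substantial penalty for memory-bound codes.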
The Opteron was also chosen because of the high bandwidth and the relative ease
of connecting the processor to the network processor, Cray's SeaStar chip.
For the physical connection another HyperTransport channel at 6.4 GB/s is used.
The SeaStar has 6 ports with a bandwidth of 7.6 GB/s each (3.8 GB/s incoming
and 3.8 GB/s outgoing). With 6 ports, the natural interconnection topology is
a 3-D torus.
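The link between 6 ports and a 3-D torus can be made concrete: in a 3-D torus every node has exactly one neighbour in each of the directions +x, -x, +y, -y, +z and -z, with wraparound at the edges. The sketch below is a generic illustration of that topology, not Cray's routing code. The function name and the 4 x 4 x 4 torus size are made up for the example.

```python
def torus_neighbours(coord, dims):
    """Return the six neighbours of `coord` in a 3-D torus of size `dims`.

    Each axis contributes two neighbours (one step back, one step forward);
    the modulo operation provides the wraparound links of the torus.
    """
    neighbours = []
    for axis in range(3):
        for step in (-1, +1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbours.append(tuple(n))
    return neighbours

# Hypothetical 4 x 4 x 4 torus; the corner node still has six neighbours.
example = torus_neighbours((0, 0, 0), (4, 4, 4))
print(example)

# Aggregate port bandwidth of one SeaStar, from the figures in the text.
PORTS = 6
PORT_BANDWIDTH_GBS = 7.6
print(f"aggregate: {PORTS * PORT_BANDWIDTH_GBS:.1f} GB/s")
```

Note that the six neighbours are only distinct when every dimension has at least 3 nodes; in a dimension of size 2 the forward and backward neighbours coincide.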
As with the earlier Cray T3E, Cray has chosen a microkernel approach for the
compute PEs. These are dedicated to computation and communication and are not
disturbed by other OS tasks that can seriously affect scalability (see [31]).
For tasks like communicating with users, networking, and I/O, special service
PEs are added that run versions of the OS that can handle these tasks.
The XT3 is obviously designed for a distributed-memory parallel model,
supporting Cray's MPI 2.0 and its one-sided communication shmem library, which
dates back to the Cray T3D/T3E systems but is still popular because of its
simplicity and efficiency. The system comes in cabinets of 96 PEs, including
service PEs. For larger configurations the ratio of service PEs to compute PEs
can generally be lowered. So, a hypothetical maximal configuration of 30,508
PEs would need only 106 service PEs.
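The service-to-compute ratio for the hypothetical maximal configuration can be worked out from the two numbers above. The sketch below does so. The cabinet count is an assumption of this example: it presumes every cabinet is fully populated with 96 PEs, which the source does not state.

```python
import math

TOTAL_PES = 30_508      # hypothetical maximal configuration
SERVICE_PES = 106       # service PEs needed in that configuration
PES_PER_CABINET = 96    # PEs per cabinet, service PEs included

compute_pes = TOTAL_PES - SERVICE_PES
service_fraction = SERVICE_PES / TOTAL_PES

# Assumption: cabinets are filled completely, so round up to whole cabinets.
cabinets = math.ceil(TOTAL_PES / PES_PER_CABINET)

print(f"compute PEs     : {compute_pes}")
print(f"service fraction: {service_fraction:.2%}")
print(f"cabinets needed : {cabinets}")
```

Under these assumptions well under 1% of the PEs in the maximal configuration are service PEs.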
Measured Performances:
The Cray XT3 is quite new and as yet no independent performance results are
available.
Aad van der Steen
Tue Mar 8 12:00:08 CET 2005