Technical FAQ

Why NVIDIA?
Do you have enough bandwidth?
Does the power supply deliver enough power?
What about cooling?
Don't you need SLI?
Would it be faster to use 9800GTX cards?
Comments from the experts

Why NVIDIA?

GPU Computing is a rapidly evolving field. NVIDIA and ATI each develop their own framework for GPU program development, and neither framework is compatible with the competitor's graphics cards. A relatively safe way to avoid vendor lock-in is to use an open standard, such as OpenGL, for GPU Computing. All computations must then be “disguised” as conventional OpenGL graphics operations. This approach has two drawbacks. First, the programming model is rather artificial and inflexible. Second, it does not allow the use of hardware-specific features of the graphics cards, such as the fast shared memory present in the processing cores. Recently, NVIDIA released the CUDA platform: a C-like language along with a compiler and several supporting libraries for GPU Computing on NVIDIA GPUs. Writing code that runs optimally on the GPU still involves a lot of manual work, but the exposure of more hardware details also gives the programmer greater control.

Do you have enough bandwidth?

FASTRA uses four 9800GX2 graphics cards, each of which contains two GPUs. Even within a single card, the two GPUs function independently. There is no direct communication between the GPUs (SLI cannot be used with CUDA). Therefore, all communication between the GPUs, as well as all communication between a GPU and the CPU, must traverse the PCI-Express bus. Moreover, the two GPUs on each card must share the bandwidth of their PCI-Express slot. For computations where communication bandwidth is a bottleneck, this would limit the performance of FASTRA. Fortunately, for our tomography computations, the ratio between computation and communication is extremely large. The basic programming model is very simple: the task of reconstructing a 3D volume is split into a large number of subtasks that are completely independent (no communication is necessary between the subtasks). Each subtask is assigned to a GPU. The time required to upload the task data to the GPU memory and to read out the result at the end is much shorter than the computation time on the GPU, so the GPUs spend almost all of their time computing, resulting in a huge speedup. Moreover, for our application the computational power of FASTRA increases linearly with the number of GPUs used.
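The programming model described above can be illustrated with a minimal Python sketch. It is not the FASTRA code: worker threads stand in for the eight GPUs, and `reconstruct_slab`, `split_volume` and `run_on_gpus` are hypothetical names introduced only for this illustration. The real work (uploading data, running a CUDA kernel, reading back the result) is replaced by a trivial placeholder computation.

```python
import queue
import threading

NUM_GPUS = 8  # four 9800GX2 cards, two GPUs each


def reconstruct_slab(gpu_id, slab):
    """Placeholder for the real GPU work (hypothetical).

    A real implementation would upload the slab's data to GPU `gpu_id`,
    run the reconstruction there, and read back the result.
    """
    return [value * 2 for value in slab]  # stand-in computation


def split_volume(volume, num_slabs):
    """Cut the volume (here a flat list) into independent slabs."""
    size = max(1, -(-len(volume) // num_slabs))  # ceiling division
    return [volume[i:i + size] for i in range(0, len(volume), size)]


def run_on_gpus(slabs, num_gpus=NUM_GPUS):
    """Start one worker per GPU; each worker pulls slabs until none remain."""
    tasks = queue.Queue()
    for index, slab in enumerate(slabs):
        tasks.put((index, slab))

    results = [None] * len(slabs)
    used_by = [None] * len(slabs)  # which GPU handled which slab

    def worker(gpu_id):
        while True:
            try:
                index, slab = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this GPU
            # Subtasks never talk to each other, so no locking is needed:
            # every worker writes to its own result slot.
            results[index] = reconstruct_slab(gpu_id, slab)
            used_by[index] = gpu_id

    threads = [threading.Thread(target=worker, args=(g,)) for g in range(num_gpus)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, used_by
```

Because the subtasks are independent, the only shared state is the task queue. Using many more slabs than GPUs keeps every GPU busy until the queue is empty, which is one simple way to obtain the near-linear scaling mentioned above.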

Does the power supply deliver enough power?

We experienced no problems at all with the Thermaltake Toughpower 1500W power supply. Even running for several hours at a 20% overclock under full load does not appear to be a problem for the PSU.

What about cooling?

Cooling is a major issue when using four 9800GX2 cards in tandem. We decided to use air cooling, as water cooling would make the system more complicated than necessary. Surprisingly, standard air cooling works very well, as long as the side panel remains open. The standard Lian-Li PC-P80 side panel has no fans or ventilation holes, so it is not suited to our setup. We are currently waiting for a windowed side panel, which can then either be fully opened (by removing the plexiglass) or partially opened to install a fan. That said, with the side panel removed, all GPUs run stably at 55°C when idle and at 86°C under full load. Even with a 20% overclock of the shader clocks on all cards, temperatures stay below 100°C under full load. Clearly, such high temperatures will shorten the lifetime of this system. However, at this price, the system can easily be replaced with a new one within one or two years.

Don't you need SLI?

SLI cannot be used within the NVIDIA CUDA programming model. Moreover, we don't even need SLI for our application, as our GPU computation involves no communication between the GPUs. Surprisingly, FASTRA does not even have an SLI motherboard! It uses a CrossFire motherboard with the AMD 780 chipset. Such a system has two basic requirements: the motherboard BIOS must be able to support eight separate GPUs, and the NVIDIA drivers must be able to detect four 9800GX2 cards and use them as eight separate GPUs. We were not sure beforehand whether this would work, but it turned out to work exactly as planned.

Would it be faster to use 9800GTX cards?

A 9800GTX card is faster than a single GPU on the 9800GX2 card. However, a 9800GX2 card contains two GPUs. For games, using two GPUs typically does not double the frame rate. For our tomography computations, doubling the number of GPUs does double the computation speed. With four cards in the system, 9800GX2 cards give us eight GPUs instead of four, and therefore much more computation power than 9800GTX cards.
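The comparison can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes linear scaling with the number of GPUs, as stated above for our tomography computations. The value `GX2_RELATIVE_SPEED` is a hypothetical placeholder for the speed of one 9800GX2 GPU relative to a 9800GTX, not a measured figure.

```python
def aggregate_throughput(num_cards, gpus_per_card, per_gpu_speed):
    """Total throughput when work scales linearly with the number of GPUs."""
    return num_cards * gpus_per_card * per_gpu_speed


NUM_CARDS = 4  # number of cards in the system, as in FASTRA

# ASSUMPTION: illustrative placeholder, not a benchmark result.
# Speed of one 9800GX2 GPU relative to one 9800GTX (9800GTX = 1.0).
GX2_RELATIVE_SPEED = 0.9

gtx_total = aggregate_throughput(NUM_CARDS, 1, 1.0)
gx2_total = aggregate_throughput(NUM_CARDS, 2, GX2_RELATIVE_SPEED)

# Ratio above 1 means the 9800GX2 configuration is faster overall.
advantage = gx2_total / gtx_total
print(f"9800GX2 vs 9800GTX configuration: {advantage:.2f}x")
```

Under these assumptions the 9800GX2 configuration comes out ahead whenever a single 9800GX2 GPU reaches more than half the speed of a 9800GTX. With the placeholder value the ratio is about 1.8.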

Comments from the experts

Designing, building and testing FASTRA was quite a risky undertaking. Therefore, we continuously consulted the expert community on several public web forums and received a lot of advice in doing so. Some of the comments we received:

  • MSI helpdesk: "In theory you can install four dual-GPU cards to use. By the way, as MSI has not tested this configuration, MSI takes no responsibility for any damage caused by improper use or lack of technical expertise."
  • A public hardware forum: "What exactly are you planning to do with 8 GPUs if you aren't doing 3D acceleration or rendering? I don't work for NASA but I seriously doubt if any application in the world would require such a high end GPU setting."
  • A public hardware forum: "Just a guess, but I'd think you are going to need water cooling for the thing not to melt."
  • A public hardware forum: "Being a computer scientist I'm sure you have done your research and realize by now that (4) 9800 GX2 graphics cards (giving you in your situation) a proposed 8 GPUs with which to do computations is a complete and utter waste of time because the software that runs these cards is not designed to have all four cards working as a team."