CUDA Performance Vs CPU

A community member has demonstrated the power of an NVIDIA CUDA-enabled GPU against an Intel Core 2 Quad CPU. The member created an application that implements matrix multiplication and can multiply matrices as large as 2048×2048 on a single NVIDIA GPU, on a single CPU thread, and on multiple CPU threads using OpenMP.
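To give an idea of what the two CPU paths of such a benchmark might look like, here is a minimal C++ sketch. It is not the code of the application described above: the names Matrix, multiply_serial and multiply_openmp are made up for illustration, and the matrices are assumed to be square and stored row by row in a flat vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: an n x n matrix stored row-major in a flat vector.
using Matrix = std::vector<float>;

// Single-threaded reference: C = A * B using the classic triple loop.
Matrix multiply_serial(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Multithreaded variant: the rows of C are divided among OpenMP threads.
// Each thread writes only to its own rows, so no locking is needed.
// Without OpenMP enabled at compile time the pragma is ignored and the
// loop simply runs on one thread.
Matrix multiply_openmp(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
    const std::ptrdiff_t rows = static_cast<std::ptrdiff_t>(n);
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < rows; ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[row * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[row * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

With GCC or Clang, OpenMP is typically switched on with the -fopenmp flag. The GPU path of the benchmark would be a CUDA kernel and is not sketched here.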

The test reports the time each configured device takes to finish the multiplication, and performance is inferred from that. You can visit here to see and compare the results.
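Measuring the time consumed comes down to reading a clock before and after the work. The following is a rough sketch of that idea using std::chrono. The helper names time_ms and speedup are hypothetical, and the original application may well measure time differently (for GPU work, CUDA events are a common choice).

```cpp
#include <cassert>
#include <chrono>

// Runs the given callable once and returns the elapsed wall-clock time
// in milliseconds. steady_clock is used because it never jumps backwards.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// How many times faster the candidate is than the baseline.
// A value above 1.0 means the candidate took less time.
double speedup(double baseline_ms, double candidate_ms) {
    assert(candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}
```

For example, if a hypothetical CPU run took 100 ms and the GPU run took 25 ms, speedup(100.0, 25.0) gives 4.0. These figures are only for illustration and are not results from the tests in this post.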

The following tests were performed on a system with the following configuration:

  • Intel Core 2 Duo E8400 @ 3.00 GHz.
  • XFX NVIDIA GeForce 8500GT with 16 CUDA cores and approx. 1 GB DDR2 memory.
  • NVIDIA 260.99 WHQL drivers.

For the first three tests, the GPU was overclocked as follows (the overclocked values were stable at 82 degrees Celsius, TDP 40 W):

Attribute       Default Value    Overclocked Value
Core Clock      450 MHz          710 MHz
Shader Clock    1024 MHz         1618 MHz
Memory Clock    400 MHz          480 MHz
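To put the overclock into perspective, the increase over the default value can be worked out as a percentage. A small sketch, with the function name overclock_percent chosen only for this example:

```cpp
#include <cassert>

// Percentage increase of an overclocked frequency over its default value.
double overclock_percent(double default_mhz, double overclocked_mhz) {
    assert(default_mhz > 0.0);
    return (overclocked_mhz - default_mhz) / default_mhz * 100.0;
}
```

Fed with the clocks listed above, this works out to roughly 57.8% for the core clock (450 to 710 MHz), roughly 58.0% for the shader clock (1024 to 1618 MHz) and 20% for the memory clock (400 to 480 MHz).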

1. First Test: Single GPU, 128 Threads for CPU, Matrix Size of 2048 x 2048.

2. Second Test: Single GPU, 128 Threads for CPU, Matrix Size of 1536 x 1536.

3. Third Test: Single GPU, 96 Threads for CPU, Matrix Size of 1536 x 1536.

As you can see, the GPU took the least time, even though the GeForce 8500GT is a low-priced, low-end mainstream GPU in NVIDIA's GeForce line. If a 16-core GeForce can outperform a powerful Intel Core 2 Duo CPU, imagine the horsepower of the GeForce 200, 300, 400 and 500 series and the Fermi GPU architecture. No wonder GPGPU computing with CUDA is regarded as high-performance supercomputing.

You might be wondering what the results look like when the GPU is not overclocked. Let's take a look at the following screenshot.

4. Fourth Test: Single GPU, 128 Threads for CPU, Matrix Size of 2048 x 2048.

Astonished? The result is still in favour of CUDA.


