Any suggestions for upgrading desktop GPU for doing CUDA computing?

29 views (last 30 days)
Thomas Barrett on 2 Feb 2023
Commented: Walter Roberson on 3 Feb 2023
Hi,
We are currently running MATLAB on a regular office desktop workstation with an NVIDIA Quadro P4000 that we regularly use for GPU computing. We are now looking to add a second workstation, and since the P4000 is almost 6 years old, I am looking for advice on which (MATLAB-compatible) GPU model would be a good buy for this type of general-purpose use these days.
Our current GPU has 1792 CUDA cores and 8 GB of RAM, so ideally the new one would have more than this. Cost should be less than around $2500 if possible.
I have read about Tesla cards, but these seem to be aimed at server racks, and some cards also seem to have additional cooling requirements (our current card does not need additional cooling as far as I'm aware). So I would like a little advice before purchasing.
(The NVIDIA T1000 looks similar, but it seems to have fewer CUDA cores than the P4000. The RTX series, e.g. the A2000, has a lot of cores, but I'm not sure whether it's complicated to install in a desktop machine.)
Thank you
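For anyone comparing candidate cards, a quick way to see what MATLAB actually reports for the GPU in a given workstation is to query it from the Parallel Computing Toolbox; this is a minimal sketch using the standard gpuDevice call and its documented properties:

% Query the CUDA device that MATLAB currently selects (requires Parallel Computing Toolbox)
g = gpuDevice;
fprintf('Name:               %s\n', g.Name);
fprintf('Compute capability: %s\n', g.ComputeCapability);
fprintf('Total memory:       %.1f GB\n', g.TotalMemory/1e9);
fprintf('Multiprocessors:    %d\n', g.MultiprocessorCount);

The multiprocessor (SM) count times the cores per SM for that architecture is roughly the advertised CUDA core count, so this also gives a sanity check against the spec sheet.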
10 Comments
Thomas Barrett on 3 Feb 2023
@Walter Roberson that's great, thanks. At least I wasn't misunderstanding something.
If I compare the Quadro GP100 (2016) to a current model like the RTX A4000 (2021), something in the specs confuses me:
  • The more modern RTX A4000 has 6144 CUDA cores and a double precision rate of 599 GFLOPS.
  • The older GP100 has fewer CUDA cores at 3584, but a much faster double precision rate of 5168 GFLOPS.
Before we started this discussion, I assumed that more CUDA cores would be better for our calculations, but now I'm not sure whether that's the important spec to look at, or whether I should be focusing on the double precision rate instead.
Thanks for your patience
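A direct way to see which spec matters for a given workload is to measure the achieved double- and single-precision throughput on the GPU already in the machine; this is a minimal benchmark sketch using gputimeit, with an arbitrary matrix size chosen only for illustration:

% Compare achieved double- vs single-precision throughput on the current GPU
N = 4096;                                  % arbitrary test size
Ad = rand(N, 'double', 'gpuArray');        % double-precision operand
As = rand(N, 'single', 'gpuArray');        % single-precision operand
tD = gputimeit(@() Ad*Ad);                 % median time for an N-by-N matrix multiply
tS = gputimeit(@() As*As);
flops = 2*N^3;                             % approximate floating-point operations per multiply
fprintf('double precision: %6.1f GFLOPS\n', flops/tD/1e9);
fprintf('single precision: %6.1f GFLOPS\n', flops/tS/1e9);

If the double-precision figure comes out far below the single-precision one, then the FP64 ratio discussed in the next comment, rather than the raw CUDA core count, is what limits this kind of work.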
Walter Roberson on 3 Feb 2023
NVIDIA implements double precision in four different ways:
  • Most of their GPUs have only a small amount of dedicated double-precision hardware relative to the single-precision cores, so the double precision rate is 1/32 of the single precision rate on those boards.
  • Some of their older boards (I think it was some of the earlier Quadro models, I'm not sure anymore) run double precision at 1/24 of the single precision rate.
  • Some boards, such as the GP100, have full double-precision hardware that runs at 1/2 of the single precision rate, which is consistent with the 5168 GFLOPS figure you quoted.
  • More recently, double precision has also been implemented in Tensor Cores; I do not know the details of that implementation. Those are for high-end systems, such as the H100 accelerator.
The list price of the RTX 3090 is pretty much the same as your budget; some places are still charging a premium for it, but others are apparently discounting it now that the RTX 4090 is out. Depending on the exact model, the RTX 3090's double precision rate is roughly 500 GFLOPS (the Wikipedia list columns show teraflops for that series, which is why it looks slower at first glance). That is still roughly 10 times slower than the GP100. So the RTX 3090 would be roughly 3 times faster than what you have now, but the GP100 (roughly twice the price, and so outside your original budget) would be more than 30 times as fast as your current card. Unless, that is, I am badly misreading the tables.
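Those ratios follow directly from the published rates; here is a small sketch of the arithmetic, where the FP32 rates and FP64:FP32 ratios are approximate spec-sheet assumptions rather than figures taken from this thread:

% Estimate FP64 throughput from the FP32 rate and the FP64:FP32 hardware ratio,
% then compare against the Quadro P4000. All figures are approximate spec-sheet values.
fp32_tflops = struct('P4000', 5.3, 'RTX3090', 35.6, 'GP100', 10.3);
fp64_ratio  = struct('P4000', 1/32, 'RTX3090', 1/64, 'GP100', 1/2);
boards = fieldnames(fp32_tflops);
base = fp32_tflops.P4000 * 1000 * fp64_ratio.P4000;   % P4000 FP64 in GFLOPS (~166)
for k = 1:numel(boards)
    b = boards{k};
    g = fp32_tflops.(b) * 1000 * fp64_ratio.(b);      % estimated FP64 GFLOPS
    fprintf('%-8s ~%5.0f GFLOPS FP64  (~%4.1fx the P4000)\n', b, g, g/base);
end

With these assumed ratios the estimates come out near 166, 556, and 5150 GFLOPS respectively, which matches the roughly 3x and roughly 30x comparisons above.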


Answers (0)
