March 15th, 2019 by Adam Armstrong
NVIDIA Quadro RTX 5000 Review
Back in August at SIGGRAPH, NVIDIA announced an all-new graphics architecture, Turing, as well as new Quadro RTX GPUs. Of the several GPUs announced then, today we will be looking specifically at the NVIDIA Quadro RTX 5000. The Quadro RTX line is billed as the first ray-tracing GPUs that also use deep learning and advanced shading. The RTX 5000 is designed for next-generation workloads with the potential of rendering photorealistic scenes in real time, a boon to video editors as well as automotive and architectural designers.
The driving force behind the new wave of GPUs is NVIDIA's new Turing architecture. The company is revered for its GPU leadership and has built upon this with its new core GPU architecture. The subject of the architecture is a bit too deep to get into here, but to sum it up: Turing uses several hardware advancements to achieve impressive new results. For ray tracing, the architecture leverages processors called RT Cores that accelerate the computation of how light and sound travel in 3D environments at up to 10 Giga Rays per second. A streaming multiprocessor improves raster performance and adds an enhanced graphics pipeline and new programmable shading technologies. Turing comes with new Tensor Cores that provide up to 500 trillion tensor operations per second. And Turing allows users to take advantage of more CUDA cores to support up to 16 trillion floating-point operations in parallel with 16 trillion integer operations per second.
The NVIDIA Quadro RTX 5000 is geared toward creative professionals who need to work on complex projects quickly and effectively. The GPU has 3,072 CUDA cores, 384 Tensor Cores, 48 RT Cores and 16GB of GDDR6 memory. This impressive amount of hardware is able to render complex models and scenes with physically accurate shadows, reflections, and refractions. The RTX 5000 supports NVIDIA NVLink, letting users scale their memory and performance with multiple-GPU configurations. Assuming there is room in their workstation, users can connect two Quadro RTX 5000 GPUs for up to 50GB/s of bandwidth and a combined 32GB of GDDR6 memory. The GPU also comes with VirtualLink, providing connectivity to the next generation of high-resolution VR head-mounted displays.
NVIDIA Quadro RTX 5000 Specifications
|GPU Memory||16GB GDDR6|
|Memory Bandwidth||Up to 448 GB/s|
|NVIDIA CUDA Cores||3,072|
|NVIDIA Tensor Cores||384|
|NVIDIA RT Cores||48|
|Single-Precision Performance||11.2 TFLOPS|
|Tensor Performance||89.2 TFLOPS|
|NVIDIA NVLink||Connects 2 Quadro RTX 5000 GPUs|
|NVIDIA NVLink bandwidth||50GB/s (bidirectional)|
|System Interface||PCI Express 3.0 x 16|
|Power Consumption||Total board power: 265W; Total graphics power: 230W|
|Thermal Solution||Active|
|Form Factor||4.4” H x 10.5” L, Dual Slot, Full Height|
|Display Connectors||4xDP 1.4, 1x USB-C|
|Max Simultaneous Displays||4x 4096x2160 @ 120 Hz; 4x 5120x2880 @ 60 Hz; 2x 7680x4320 @ 60 Hz|
|Encode/ Decode Engines||1X Encode, 2X Decode|
|Graphics APIs||DirectX 12.0; Shader Model 5.1|
To test the performance of the new architecture in the NVIDIA Quadro RTX 5000 GPU, we installed it in our Lenovo ThinkStation P920 workstation running Windows 10. For a comprehensive look at how each card performs, we leveraged multiple industry benchmarks and GPU-accelerated software that can take full advantage of the card under test. Not only will we be comparing it to the NVIDIA Quadro RTX 4000, which shares the Turing architecture, we will also be comparing it to the previous Pascal Quadro line, including the P6000, the P5000, and the P4000. The goal is less to declare which card is better and more to show what to expect from the GPU chosen.
To give a better idea of how these GPUs scale within the new architecture, we’ve included the following table that summarizes the RTX family as it sits today. The RTX 5000 sits in a middle slot, one step up from the entry RTX 4000 and below the two more powerful RTX 6000 and RTX 8000 siblings.
|NVIDIA Quadro GPUs|
|RTX 4000||RTX 5000||RTX 6000||RTX 8000|
|GPU Memory||8GB GDDR6||16GB GDDR6||24GB GDDR6||48GB GDDR6|
|Memory Bandwidth||Up to 416GB/s||Up to 448GB/s||Up to 672GB/s||Up to 672GB/s|
|NVIDIA CUDA Cores||2,304||3,072||4,608||4,608|
|NVIDIA Tensor Cores||288||384||576||576|
|NVIDIA RT Cores||36||48||72||72|
|Single-Precision Performance||7.1 TFLOPS||11.2 TFLOPS||16.3 TFLOPS||16.3 TFLOPS|
|Tensor Performance||57.0 TFLOPS||89.2 TFLOPS||130.5 TFLOPS||130.5 TFLOPS|
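As a rough way to read the table above, the sketch below derives a couple of ratios from the listed figures. It assumes the rated single-precision number follows the usual convention of two floating-point operations per CUDA core per clock (one fused multiply-add), so the "implied clock" is only a back-of-the-envelope estimate and not a figure from NVIDIA's specification sheet. The dictionary layout and function names are ours, chosen for illustration.

```python
# Figures copied from the comparison table above.
SPECS = {
    "RTX 4000": {"cuda_cores": 2304, "fp32_tflops": 7.1, "mem_gb": 8, "bandwidth_gbs": 416},
    "RTX 5000": {"cuda_cores": 3072, "fp32_tflops": 11.2, "mem_gb": 16, "bandwidth_gbs": 448},
    "RTX 6000": {"cuda_cores": 4608, "fp32_tflops": 16.3, "mem_gb": 24, "bandwidth_gbs": 672},
    "RTX 8000": {"cuda_cores": 4608, "fp32_tflops": 16.3, "mem_gb": 48, "bandwidth_gbs": 672},
}


def implied_clock_ghz(cuda_cores, fp32_tflops):
    """Clock implied by the rated FP32 figure.

    Assumption: peak FP32 = 2 FLOPs (one FMA) x cores x clock.
    """
    return fp32_tflops * 1e12 / (2 * cuda_cores) / 1e9


def ratio(card, baseline, key):
    """How many times larger `key` is on `card` than on `baseline`."""
    return SPECS[card][key] / SPECS[baseline][key]


for name, spec in SPECS.items():
    clock = implied_clock_ghz(spec["cuda_cores"], spec["fp32_tflops"])
    print(f"{name}: implied clock of about {clock:.2f} GHz")

print(f"RTX 5000 vs RTX 4000 CUDA cores: {ratio('RTX 5000', 'RTX 4000', 'cuda_cores'):.2f}x")
print(f"RTX 5000 vs RTX 4000 FP32 TFLOPS: {ratio('RTX 5000', 'RTX 4000', 'fp32_tflops'):.2f}x")
```

Read this way, the RTX 5000's rated FP32 advantage over the RTX 4000 is larger than its CUDA core advantage alone, which suggests that a higher rated clock accounts for part of the gap.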
Our first benchmark is the LuxMark cross-platform OpenCL benchmark tool. LuxMark is based on the LuxCore API, and offered as a promotional component of the LuxCoreRender suite. It uses a new micro-kernel based OpenCL path tracer as the rendering mode for its benchmark, offering a unique way to stress the GPU installed in a given workstation.
While the Pascal GPUs came off the LuxMark with good results, there is an obvious jump in performance when looking at the Turing GPUs. To no surprise, the RTX 5000 was the top performer with a score of 29,404.
Next up is Arion, a CUDA benchmarking tool developed by RandomControl that allows workstations to stress CPUs or GPUs in a rendering application. ArionBench is a software tool based on Arion 2 technology that puts CPUs and GPUs under heavy stress through the task of simulating the flow of light in a 3D scene.
Here we saw another large jump in scores going from Pascal to Turing, with the RTX 5000 leaping way out ahead of the rest and running significantly faster than the P6000.
Our next benchmark leverages SolidWorks 2019 and four 3D models covering an Audi R8, a construction digger, a jet engine, and a rally car. SolidWorks is an industry-leading GPU-accelerated 3D CAD modeling application that operates on Windows-based systems. SolidWorks is developed by Dassault Systèmes and is used by over two million engineers and more than 165,000 companies worldwide. For benchmarking purposes we leverage the new "performance pipeline" feature inside SolidWorks 2019. This architecture provides a more responsive, real-time display, especially for large models. It takes advantage of modern OpenGL (4.5) and hardware-accelerated rendering to maintain a high level of detail and frame rate when you pan, zoom, or rotate large models.
After each model is rendered, our script rotates each model five times and measures the time required to complete this task. It then divides the number of frames rendered by that time to calculate the average frames per second (FPS) score.
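The average FPS figure is simply frames rendered divided by elapsed seconds. The sketch below shows that arithmetic along with a generic timing loop. It is not our SolidWorks script: `render_frame` is a hypothetical callback standing in for whatever draws one frame at a given rotation angle, and the frame count per rotation is an arbitrary placeholder.

```python
import time


def average_fps(frames_rendered, elapsed_seconds):
    """Average frames per second: frames divided by elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames_rendered / elapsed_seconds


def time_rotations(render_frame, rotations=5, frames_per_rotation=360):
    """Rotate a model `rotations` times, returning (frames, elapsed seconds).

    `render_frame(angle_degrees)` is a hypothetical callback that draws one
    frame; frames_per_rotation is a placeholder, not a SolidWorks setting.
    """
    frames = 0
    start = time.perf_counter()
    for _ in range(rotations):
        for step in range(frames_per_rotation):
            render_frame(step * 360.0 / frames_per_rotation)
            frames += 1
    elapsed = time.perf_counter() - start
    return frames, elapsed


# Made-up numbers purely to show the arithmetic: 1,500 frames in 12.5 seconds.
print(average_fps(1500, 12.5))
```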
|Solidworks R8||Average FPS|
|Solidworks Digger||Average FPS|
|Solidworks Jet Engine||Average FPS|
|Solidworks Rally Car||Average FPS|
It is interesting to see a slight downturn in performance for the Turing GPUs in SolidWorks R8 and Rally Car compared to the Pascal cards, though those files may not fully leverage the newer GPUs. The RTX 5000 did provide superior performance in Digger and Jet Engine, outperforming the others by a wide margin. In our SolidWorks testing we were using the beta display mode, which may be the cause of the unusual scaling seen on the Audi R8 and Rally Car assemblies.
Next up is the Environmental Systems Research Institute (Esri) benchmark. Esri is a supplier of Geographic Information System (GIS) software. Esri’s Performance Team designed their PerfTool add-in scripts to automatically launch ArcGIS Pro. This application uses a “ZoomToBookmarks” function to browse various pre-defined bookmarks and create a log file with all the key data points required to predict the user experience. The script automatically loops the bookmarks three times to account for caching (memory and disk cache). In other words, this benchmark simulates heavy graphical use that one might see through Esri’s ArcGIS Pro 2.3 software.
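To make the looping idea concrete, here is a minimal sketch of a benchmark that visits each bookmark three times and logs draw time, average FPS, and minimum FPS per visit. This is not Esri's PerfTool and does not call any ArcGIS API. `draw_bookmark` is a hypothetical callback assumed to return the per-frame render times in seconds, and the way the three metrics are derived from those times is our assumption.

```python
def run_bookmark_benchmark(bookmarks, draw_bookmark, loops=3):
    """Visit every bookmark `loops` times and log basic timing metrics.

    `draw_bookmark(name)` is a hypothetical callback that renders one
    bookmark and returns a list of per-frame times in seconds.
    """
    log = []
    for loop in range(1, loops + 1):
        for name in bookmarks:
            frame_times = draw_bookmark(name)
            total = sum(frame_times)
            log.append({
                "loop": loop,
                "bookmark": name,
                "draw_time_s": total,
                # Average FPS: frames drawn divided by total draw time.
                "avg_fps": len(frame_times) / total,
                # Minimum FPS: set by the slowest single frame.
                "min_fps": 1.0 / max(frame_times),
            })
    return log


def fake_draw(name):
    """Stand-in renderer with fixed frame times, for demonstration only."""
    return [0.25, 0.5, 0.25]


for entry in run_bookmark_benchmark(["city_center", "harbor"], fake_draw):
    print(entry)
```

Repeating the loop matters because the first pass typically pays for loading data into memory and disk caches, so later passes are closer to what a user sees in steady use.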
The tests consist of three main datasets. Two are 3-D city views of Philadelphia, PA and Montreal, QC. These city views contain textured 3-D multipatch buildings draped on a terrain model and draped aerial images. The third dataset is a 2-D map view of the Portland, OR region. This data contains detailed information for roads, landuse parcels, parks and schools, rivers, lakes, and hillshaded terrain.
Looking at drawtime of the Montreal model, the NVIDIA Quadro RTX 5000 showed an average drawtime of 00:01:31.067, while average and minimum FPS showed 527.636 and 190.775, respectively.
|ESRI ArcGIS Pro 2.3 Montreal|
|Average Drawtime|
|Quadro RTX 4000||00:01:31.284|
|Quadro RTX 5000||00:01:31.067|
|Average FPS|
|Quadro RTX 4000||502.395|
|Quadro RTX 5000||527.636|
|Minimum FPS|
|Quadro RTX 4000||180.699|
|Quadro RTX 5000||190.775|
Next up is our Philly model, where the RTX 5000 showed an average drawtime of 00:01:01.111, while average and minimum FPS showed 531.315 and 224.341, respectively.
|ESRI ArcGIS Pro 2.3 Philly|
|Average Drawtime|
|Quadro RTX 4000||00:01:00.231|
|Quadro RTX 5000||00:01:01.111|
|Average FPS|
|Quadro RTX 4000||434.170|
|Quadro RTX 5000||531.315|
|Minimum FPS|
|Quadro RTX 4000||196.825|
|Quadro RTX 5000||224.341|
Our last model is of Portland. Here, the RTX 5000 had an average drawtime of 00:00:32.541. Average FPS showed 2,783.547 while Minimum FPS showed 1,007.309.
|ESRI ArcGIS Pro 2.3 Portland|
|Average Drawtime|
|Quadro RTX 4000||00:00:32.646|
|Quadro RTX 5000||00:00:32.541|
|Average FPS|
|Quadro RTX 4000||2,821.928|
|Quadro RTX 5000||2,783.547|
|Minimum FPS|
|Quadro RTX 4000||1,083.260|
|Quadro RTX 5000||1,007.309|
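To put the three ArcGIS Pro results side by side, the short sketch below computes the percentage change in average FPS from the RTX 4000 to the RTX 5000. The RTX 5000 values match the figures quoted in the text; the RTX 4000 values are taken from the second pair of rows in each table, on the assumption that those rows report average FPS. The function names are ours.

```python
# Average FPS figures copied from the ArcGIS Pro 2.3 tables above.
ESRI_AVG_FPS = {
    "Montreal": {"RTX 4000": 502.395, "RTX 5000": 527.636},
    "Philly": {"RTX 4000": 434.170, "RTX 5000": 531.315},
    "Portland": {"RTX 4000": 2821.928, "RTX 5000": 2783.547},
}


def percent_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


def fps_deltas(results, baseline="RTX 4000", card="RTX 5000"):
    """Per-scene percentage change in average FPS, card vs baseline."""
    return {
        scene: percent_change(scores[baseline], scores[card])
        for scene, scores in results.items()
    }


for scene, delta in fps_deltas(ESRI_AVG_FPS).items():
    print(f"{scene}: {delta:+.1f}% average FPS")
```

On these figures the RTX 5000's lead is largest on the Philly dataset at roughly 22%, about 5% on Montreal, and it trails the RTX 4000 slightly on the 2-D Portland map.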
The NVIDIA Quadro RTX 5000 is one of the company’s newer GPUs based on its Turing architecture. Turing is a new take on GPU architecture, with NVIDIA looking both to change things up now and to keep an eye on future developments. Aimed at creative professionals with complex projects that demand quick, efficient work, the RTX 5000 has impressive hardware under its hood, including 3,072 CUDA cores, 384 Tensor Cores, 48 RT Cores and 16GB of GDDR6 memory. For those needing even more GPU performance, the RTX 5000 can scale with a second GPU through NVIDIA NVLink.
With all of that hardware pointing toward superior performance, we put the card through a barrage of tests, new and old, to see what it can do. To no one's surprise, the NVIDIA Quadro RTX 5000 was the top performer in most of our tests. In LuxMark and Arion the RTX 5000 more than doubled the scores of the P5000. The RTX 5000 also had strong performance in the SolidWorks Digger and Jet Engine benchmarks. It should be kept in mind that the RTX 5000, as powerful as it is, is not the top of the line in Turing GPUs.
If a creative professional is looking for a larger performance leap in most areas, the NVIDIA Quadro RTX 5000 will fit the bill. Our performance results above highlight the areas where the RTX 5000 shines and a few spots where a Pascal-based GPU performs well enough. Overall, with the RTX family NVIDIA has done an excellent job of continuing to push the boundaries of what's available to creatives within a desktop. For its part, the RTX 5000 fills out the midrange well, offering a good balance of performance and price.