NVIDIA Quadro RTX 4000 Review

by Marshall Gunnell

The Quadro RTX 4000, announced in November of last year, is part of NVIDIA’s professional GPU family. The RTX 4000 is specifically designed for the CAD software professional, providing intense realism and immersive interaction with their designs. Consequently, this allows them to run advanced simulations and analyses on their local workstation.


Like other NVIDIA GPUs, the RTX 4000 utilizes NVIDIA Quadro Scalable Visual Solutions (SVS). As a result, each RTX 4000 card can support up to four 5K monitors at 60Hz, or dual 8K displays. With two Quadro Sync II boards, one system can support up to eight RTX 4000 GPUs, synchronizing 32 separate displays. NVIDIA also claims a 40% improvement in memory bandwidth over the previous-generation Quadro P4000, thanks in part to 8GB of GDDR6 graphics memory.

The RTX 4000 sports a 4.4” H x 9.5” L single-slot form factor, allowing the GPU to fit in a variety of workstation chassis. In this slim form factor NVIDIA was able to fit 2,304 CUDA cores, 288 Tensor Cores, 36 RT Cores and 8GB of GDDR6 memory. This hardware is designed for intense AEC, DCC, AI, VR and graphics workloads. The RTX 4000 also comes with VirtualLink to simplify connectivity to next-generation, high-resolution VR head-mounted displays.

Quadro RTX 4000 Specifications

Architecture                   NVIDIA Turing
GPU Memory                     8GB GDDR6
Memory Interface               256-bit
Memory Bandwidth               Up to 416GB/s
NVIDIA CUDA Cores              2,304
NVIDIA Tensor Cores            288
NVIDIA RT Cores                36
Single-Precision Performance   7.1 TFLOPS
Tensor Performance             57.0 TFLOPS
System Interface               PCI Express 3.0 x16
Power Consumption              Total board power: 160W
                               Total graphics power: 125W
Thermal Solution               Active
Form Factor                    4.4” H x 9.5” L, Single Slot
Max Simultaneous Displays      4x 3840×2160 @ 120 Hz
                               4x 5120×2880 @ 60 Hz
                               2x 7680×4320 @ 60 Hz
VR Ready                       Yes
Graphics APIs                  Shader Model 5.1
                               OpenGL 4.5
                               DirectX 12.0
                               Vulkan 1.0
Compute APIs                   CUDA
                               DirectCompute
                               OpenCL
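
As a rough sanity check on the table above, the headline bandwidth and single-precision figures can be worked back to the memory data rate and GPU clock they imply. The sketch below is ours rather than NVIDIA's: it assumes bandwidth equals the bus width in bytes times the per-pin data rate, and that each CUDA core retires one fused multiply-add (two floating-point operations) per clock. The function names are illustrative only.

```python
# Figures copied from the specification table above.
MEMORY_BANDWIDTH_GB_S = 416.0   # "up to" value, in GB/s
MEMORY_BUS_WIDTH_BITS = 256
CUDA_CORES = 2304
FP32_TFLOPS = 7.1


def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate in Gbit/s, assuming bandwidth = (bus width / 8) * data rate."""
    return bandwidth_gb_s / (bus_width_bits / 8)


def implied_clock_ghz(tflops: float, cores: int, flops_per_clock: int = 2) -> float:
    """Clock in GHz implied by peak FLOPS, assuming one FMA (2 FLOPs) per core per clock."""
    return tflops * 1e12 / (cores * flops_per_clock) / 1e9


data_rate = implied_data_rate_gbps(MEMORY_BANDWIDTH_GB_S, MEMORY_BUS_WIDTH_BITS)
clock = implied_clock_ghz(FP32_TFLOPS, CUDA_CORES)

print(f"Implied memory data rate: {data_rate:.1f} Gbps per pin")
print(f"Implied GPU clock: {clock:.2f} GHz")
```

Under those assumptions the listed figures work out to roughly 13 Gbps per pin and a clock of about 1.54GHz.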

Performance

To test the performance of the new architecture in the NVIDIA Quadro RTX 4000 GPU, we installed it in our Lenovo ThinkStation P920 workstation running Windows 10. For a comprehensive look at how each card performs, we leveraged multiple industry benchmarks and GPU-accelerated software that can take full advantage of the card under test. Not only will we be comparing it to the NVIDIA Quadro RTX 5000, which shares the Turing architecture, we will also be comparing it to the previous Pascal Quadro line, including the P6000, the P5000, and the P4000. The goal is less to decide which card is better and more to show what to expect from the GPU chosen.

To give a better idea of how these GPUs scale within the new architecture, we’ve included the following table for the RTX family as it stands today. The RTX 4000 is clearly the entry-level card in the family, whereas the P-series cards started out with the P1000. Of course, the RTX family scales all the way up to the RTX 8000, bringing more graphics memory, bandwidth and cores along the way.

NVIDIA Quadro GPUs
Specification                  RTX 4000        RTX 5000        RTX 6000        RTX 8000
GPU Memory                     8GB GDDR6       16GB GDDR6      24GB GDDR6      48GB GDDR6
Memory Interface               256-bit         256-bit         384-bit         384-bit
Memory Bandwidth               Up to 416GB/s   Up to 448GB/s   Up to 672GB/s   Up to 672GB/s
NVIDIA CUDA Cores              2,304           3,072           4,608           4,608
NVIDIA Tensor Cores            288             384             576             576
NVIDIA RT Cores                36              48              72              72
Single-Precision Performance   7.1 TFLOPS      11.2 TFLOPS     16.3 TFLOPS     16.3 TFLOPS
Tensor Performance             57.0 TFLOPS     89.2 TFLOPS     130.5 TFLOPS    130.5 TFLOPS

Our first benchmark is the LuxMark cross-platform OpenCL benchmark tool. LuxMark is based on the LuxCore API and is offered as a promotional component of the LuxCoreRender suite. It uses a new micro-kernel based OpenCL path tracer as the rendering mode for its benchmark, offering a unique way to stress the GPU installed in a given workstation.

LuxMark
GPU         Result
P4000       15,303
P5000       13,170
P6000       21,297
RTX 4000    28,338
RTX 5000    29,404

While the Pascal GPUs came off LuxMark with good results, there is an obvious jump in performance when looking at the Turing GPUs. The RTX 4000 came in second to the RTX 5000 with a score of 28,338.

Next up is Arion, a CUDA benchmarking tool developed by RandomControl that allows workstations to stress CPUs or GPUs in a rendering application. ArionBench is a software tool based on Arion 2 technology that puts CPUs and GPUs under heavy stress through the task of simulating the flow of light in a 3D scene.

Arion
GPU         Result
P4000       1,865
P5000       2,738
P6000       3,731
RTX 4000    4,484
RTX 5000    6,193

Here we see another large jump in scores going from Pascal to Turing, with the RTX 4000 scoring well above the P6000.
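
To put numbers on those jumps, the scores in the LuxMark and Arion tables above can be expressed as simple ratios. This is a minimal sketch using only the published results; the helper name relative_score is our own.

```python
# Scores copied from the LuxMark and Arion tables above.
luxmark = {"P4000": 15303, "P5000": 13170, "P6000": 21297,
           "RTX 4000": 28338, "RTX 5000": 29404}
arion = {"P4000": 1865, "P5000": 2738, "P6000": 3731,
         "RTX 4000": 4484, "RTX 5000": 6193}


def relative_score(scores: dict, card: str, baseline: str) -> float:
    """Return how many times higher `card` scored than `baseline`."""
    return scores[card] / scores[baseline]


for name, scores in (("LuxMark", luxmark), ("Arion", arion)):
    for baseline in ("P4000", "P6000"):
        ratio = relative_score(scores, "RTX 4000", baseline)
        print(f"{name}: RTX 4000 scored {ratio:.2f}x the {baseline}")
```

By this measure the RTX 4000 lands at roughly 1.85x the P4000 in LuxMark and about 2.40x in Arion, and stays ahead of the P6000 in both.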

Our next benchmark leverages SolidWorks 2019 and four 3D models covering an Audi R8, a construction digger, a jet engine and a rally car. SolidWorks is an industry-leading GPU-accelerated 3D CAD modeling application that operates on Windows-based systems. SolidWorks is developed by Dassault Systèmes and is used by over two million engineers and more than 165,000 companies worldwide. For benchmarking purposes we leverage the new “performance pipeline” feature inside SolidWorks 2019. This architecture provides a more responsive, real-time display, especially for large models. It takes advantage of modern OpenGL (4.5) and hardware-accelerated rendering to maintain a high level of detail and frame rate when you pan, zoom, or rotate large models.

After each model is rendered, our script rotates it five times and measures the time required to complete this task. It then divides the number of frames rendered by that time to calculate the average frames per second (FPS) score.
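
The scoring step boils down to frames rendered divided by elapsed time. The sketch below illustrates that arithmetic only; it is not our actual test script, and the frame count and timing used in the example are hypothetical values chosen for illustration.

```python
def average_fps(frames_rendered: int, elapsed_seconds: float) -> float:
    """Average frames per second over a timed rotation run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames_rendered / elapsed_seconds


# Hypothetical run: 2,112 frames rendered across five rotations in 10 seconds.
example_fps = average_fps(frames_rendered=2112, elapsed_seconds=10.0)
print(f"Average FPS: {example_fps:.4f}")
```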

SolidWorks
SolidWorks R8            Average
P4000                    198.0232
P5000                    214.9254
P6000                    217.9745
RTX 4000                 211.1824
RTX 5000                 208.8849
SolidWorks Digger        Average
P4000                    186.4832
P5000                    211.9595
P6000                    230.9774
RTX 4000                 259.6056
RTX 5000                 294.2529
SolidWorks Jet Engine    Average
P4000                    163.0573
P5000                    198.5351
P6000                    210.411
RTX 4000                 220.6897
RTX 5000                 283.2206
SolidWorks Rally Car     Average
P4000                    205.6225
P5000                    219.0114
P6000                    218.4922
RTX 4000                 214.4253
RTX 5000                 217.256

With the SolidWorks R8 and Rally Car models there is a slight dip in performance for the Turing cards compared with the P5000 and P6000; however, there is a large jump in Digger and Jet Engine. In our SolidWorks testing we were using the beta display mode, which may be the cause of the unusual scaling seen on the Audi R8 and Rally Car assemblies.

Next up is the Environmental Systems Research Institute (Esri) benchmark. Esri is a supplier of Geographic Information System (GIS) software. Esri’s Performance Team designed their PerfTool add-in scripts to automatically launch ArcGIS Pro. The application uses a “ZoomToBookmarks” function to browse various pre-defined bookmarks and create a log file with all the key data points required to predict the user experience. The script automatically loops the bookmarks three times to account for caching (memory and disk cache). In other words, this benchmark simulates heavy graphical use that one might see through Esri’s ArcGIS Pro 2.3 software.

The tests consist of three main datasets. Two are 3-D city views of Philadelphia, PA and Montreal, QC. These city views contain textured 3-D multipatch buildings draped on a terrain model and draped aerial images. The third dataset is a 2-D map view of the Portland, OR region. This data contains detailed information for roads, landuse parcels, parks and schools, rivers, lakes, and hillshaded terrain.

Looking at drawtime of the Montreal model, the NVIDIA Quadro RTX 4000 showed an average drawtime of 00:01:31.284, while average and minimum FPS showed 502.395 and 180.699, respectively.

ESRI ArcGIS Pro 2.3 Montreal
Drawtime           Average
Quadro P4000       00:01:31.084
Quadro P5000       00:01:31.082
Quadro P6000       00:01:31.081
Quadro RTX 4000    00:01:31.284
Quadro RTX 5000    00:01:31.067
Average FPS        Average
Quadro P4000       432.327
Quadro P5000       489.889
Quadro P6000       521.551
Quadro RTX 4000    502.395
Quadro RTX 5000    527.636
Minimum FPS        Average
Quadro P4000       164.546
Quadro P5000       194.218
Quadro P6000       190.336
Quadro RTX 4000    180.699
Quadro RTX 5000    190.775

Next up is our Philly model, where the RTX 4000 showed an average drawtime of 00:01:00.231, while average and minimum FPS showed 434.170 and 196.825, respectively.

ESRI ArcGIS Pro 2.3 Philly
Drawtime           Average
Quadro P4000       00:02:53.928
Quadro P5000       00:01:01.109
Quadro P6000       00:01:01.245
Quadro RTX 4000    00:01:00.231
Quadro RTX 5000    00:01:01.111
Average FPS        Average
Quadro P4000       304.340
Quadro P5000       451.826
Quadro P6000       469.879
Quadro RTX 4000    434.170
Quadro RTX 5000    531.315
Minimum FPS        Average
Quadro P4000       160.152
Quadro P5000       212.910
Quadro P6000       207.879
Quadro RTX 4000    196.825
Quadro RTX 5000    224.341
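
Draw times are reported as timestamp strings, which makes them awkward to compare at a glance. Below is a minimal sketch for converting them to seconds, using the Philly draw times above. It assumes the format is hours:minutes:seconds with fractional seconds; the helper name parse_drawtime is our own and is not part of Esri's PerfTool.

```python
def parse_drawtime(text: str) -> float:
    """Convert an 'HH:MM:SS.mmm' draw time string into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Draw times copied from the Philly table above.
philly_drawtimes = {
    "Quadro P4000": "00:02:53.928",
    "Quadro P5000": "00:01:01.109",
    "Quadro P6000": "00:01:01.245",
    "Quadro RTX 4000": "00:01:00.231",
    "Quadro RTX 5000": "00:01:01.111",
}

drawtime_seconds = {gpu: parse_drawtime(t) for gpu, t in philly_drawtimes.items()}
fastest = min(drawtime_seconds, key=drawtime_seconds.get)

for gpu, secs in drawtime_seconds.items():
    print(f"{gpu}: {secs:.3f} s")
print(f"Shortest draw time: {fastest}")
```

Expressed in seconds, the P4000 stands out at nearly three minutes while the other four cards finish within about a second of each other.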

Our last model is of Portland. Here, the RTX 4000 had an average drawtime of 00:00:32.646. Average FPS showed 2,821.928 while Minimum FPS showed 1,083.260.

ESRI ArcGIS Pro 2.3 Portland
Drawtime           Average
Quadro P4000       00:00:32.426
Quadro P5000       00:00:32.310
Quadro P6000       00:00:32.552
Quadro RTX 4000    00:00:32.646
Quadro RTX 5000    00:00:32.541
Average FPS        Average
Quadro P4000       2,051.053
Quadro P5000       2,057.395
Quadro P6000       2,343.948
Quadro RTX 4000    2,821.928
Quadro RTX 5000    2,783.547
Minimum FPS        Average
Quadro P4000       1,179.974
Quadro P5000       1,189.524
Quadro P6000       1,282.045
Quadro RTX 4000    1,083.260
Quadro RTX 5000    1,007.309

Conclusion

The NVIDIA Quadro RTX 4000 is the lower-end Turing architecture GPU, but that doesn’t mean it isn’t powerful. The RTX 4000 comes equipped with 2,304 CUDA cores and 8GB of GDDR6 memory. Like all of the Quadro RTX cards, the 4000 is able to deliver accelerated ray tracing, deep learning, and advanced shading in its accessible single-slot form factor. This can give creative professionals faster time to insight while allowing them to accelerate their creative efforts. The RTX 4000 also comes with VirtualLink, which simplifies connectivity to next-generation, high-resolution VR head-mounted displays.

In terms of performance, the RTX 4000 performed very well, especially considering it is on the lower end of the new GPUs. In our LuxMark benchmark it nearly doubled its Pascal counterpart and even surpassed the P6000. In Arion the RTX 4000 more than doubled the P4000 and again easily surpassed the P6000. In our SolidWorks benchmarks the RTX 4000 surpassed the P4000 across the board and shone brightest in the Digger and Jet Engine models. In Esri the RTX 4000 had much better performance than the P4000 (and the P5000 in some cases), but there are workloads where the P6000 was the better performer. It should be kept in mind that the RTX 4000 is on the low end of the Turing architecture and the P6000 is at the highest end of the Pascal architecture.

All in all, the Quadro RTX 4000 is a much-welcomed addition to NVIDIA’s large line of impressive GPUs and offers very impressive performance numbers for an entry-level card while carrying a price tag of only around $900.
