NVIDIA’s big announcement was long awaited, and not only by the gaming community. Filmmakers were just as intrigued by the possibilities the new generation of GPUs could bring them. The day has finally come, so let’s dig into the goodies it has brought.
NVIDIA’s previous legend was the RTX 2080 Ti, which reached 45.4 FPS on Full HD frames in our NeatBench tests. The new RTX 30 Series is set to change the game! At the same settings, the RTX 3080 outperforms its predecessor by 31% with 59.7 FPS, and its 4K results at default settings are 41% better.
So what makes RTX 3080 a better GPU for denoising and rendering in general?
First, the RTX 30 Series is powered by the Ampere architecture announced earlier in the year. However, we were surprised to see that the number of CUDA cores (aka shaders) packed into the RTX 30 GPUs is much higher than we anticipated. Specifically, the RTX 3080 has 8704 CUDA cores, exactly twice as many as the RTX 2080 Ti (4352), NVIDIA’s previous consumer-class flagship. Combined with somewhat higher clock speeds, this gives the RTX 3080 more than double the raw processing power of the RTX 2080 Ti.
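As a rough illustration of the "more than double" estimate, peak FP32 throughput is commonly approximated as two operations (one fused multiply-add) per CUDA core per clock. The sketch below uses the core counts quoted above; the boost clocks (1545 MHz and 1710 MHz) are assumed reference values that do not come from this article, and the function name is purely illustrative.

```python
def fp32_tflops(cuda_cores: int, boost_clock_mhz: float) -> float:
    """Approximate peak FP32 throughput in TFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per clock.
    """
    return 2 * cuda_cores * boost_clock_mhz * 1e6 / 1e12


# Core counts are from the article; boost clocks are ASSUMED reference values.
tflops_2080_ti = fp32_tflops(cuda_cores=4352, boost_clock_mhz=1545)
tflops_3080 = fp32_tflops(cuda_cores=8704, boost_clock_mhz=1710)

compute_ratio = tflops_3080 / tflops_2080_ti
print(f"RTX 2080 Ti: {tflops_2080_ti:.1f} TFLOPS")
print(f"RTX 3080:    {tflops_3080:.1f} TFLOPS")
print(f"Ratio:       {compute_ratio:.2f}x")
```

Under these assumptions the ratio comes out a little above 2, because the doubled core count is multiplied by the modest clock increase. Real workloads rarely reach the theoretical peak.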
But real-life video denoising performance does not depend on raw computing horsepower alone. In many cases, the actual performance of a GPU is dictated not only by the number of cores but also by the internal memory bandwidth, because cores may sit idle while data is read from GPU memory into registers or written from registers back to GPU memory. Higher internal memory bandwidth therefore has a big impact on calculation speed: data reaches the GPU computing cores sooner and results are written out faster. For this reason we really welcomed the higher internal memory bandwidth of the new GPUs: the RTX 3080 offers 760 GB/s compared to 616 GB/s for the RTX 2080 Ti.
The improvement in internal memory bandwidth seems to have been achieved by increasing the memory clock speed from 7000 MHz on the RTX 2080 Ti to 9500 MHz. Although the reduction in bus width from 352 to 320 bits takes away some of the gains, the resulting memory bandwidth of the RTX 3080 is still significantly better than that of its predecessor.
Our direct tests indeed show a 24% improvement in the internal transfer speed of the RTX 3080 over the RTX 2080 Ti: the speeds actually observed on the test computers were 649,087 MB/s and 521,318 MB/s respectively.
The RTX 30 Series GPUs are also NVIDIA’s first consumer-grade GPUs to support PCI Express 4.0, the new generation of the serial bus protocol.
PCI Express 4.0 is twice as fast as PCI Express 3.0 when the same number of lanes is used. Our direct tests confirm that the RTX 3080 indeed delivers the expected gains: we observed 120% and 103% better transfer speeds to and from the GPU, respectively, compared to another system with a PCIe 3.0-based RTX 2080 Ti installed.
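The trade-off between memory clock and bus width can be checked with a small calculation. This is a minimal sketch: it assumes two transfers per memory clock, a factor chosen because it reproduces the 616 GB/s and 760 GB/s figures quoted above, and the function and variable names are illustrative only.

```python
def peak_bandwidth_gb_s(memory_clock_mhz: float, bus_width_bits: int,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 is an ASSUMPTION that matches the article's figures.
    """
    bytes_per_transfer = bus_width_bits / 8
    return memory_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


bw_2080_ti = peak_bandwidth_gb_s(memory_clock_mhz=7000, bus_width_bits=352)
bw_3080 = peak_bandwidth_gb_s(memory_clock_mhz=9500, bus_width_bits=320)

clock_factor = 9500 / 7000   # gain from the faster memory clock
bus_factor = 320 / 352       # loss from the narrower bus
net_factor = clock_factor * bus_factor

print(f"RTX 2080 Ti: {bw_2080_ti:.0f} GB/s")
print(f"RTX 3080:    {bw_3080:.0f} GB/s")
print(f"Clock x{clock_factor:.3f}, bus x{bus_factor:.3f}, net x{net_factor:.3f}")
```

The faster clock alone would give roughly a 36% gain. The narrower bus trims that to about 23% on paper.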
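The measured figures can be compared with each other and with the theoretical peaks in a few lines. This sketch uses only numbers quoted in this article and assumes 1 GB = 1000 MB; the helper name `percent_gain` is illustrative.

```python
def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1) * 100


# Observed internal transfer speeds from the article, in MB/s.
measured_mb_s = {"RTX 2080 Ti": 521_318, "RTX 3080": 649_087}
# Theoretical peak bandwidth from the article, in GB/s.
peak_gb_s = {"RTX 2080 Ti": 616, "RTX 3080": 760}

measured_gain = percent_gain(measured_mb_s["RTX 3080"], measured_mb_s["RTX 2080 Ti"])
print(f"Measured gain: {measured_gain:.1f}%")

# Share of the theoretical peak actually reached in the test.
efficiency = {
    gpu: measured_mb_s[gpu] / 1000 / peak_gb_s[gpu]
    for gpu in measured_mb_s
}
for gpu, share in efficiency.items():
    print(f"{gpu}: {share:.1%} of theoretical peak")
```

Both cards reach roughly 85% of their theoretical peak in this test, so the measured gain tracks the on-paper bandwidth difference closely.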
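The "twice as fast" figure follows from the per-lane signalling rates. The sketch below uses the published rates of 8 GT/s for PCIe 3.0 and 16 GT/s for PCIe 4.0 with 128b/130b encoding; it gives theoretical one-direction bandwidth only and ignores protocol overhead, and the names are illustrative.

```python
# Per-lane signalling rate in gigatransfers per second.
PCIE_GT_PER_S = {"3.0": 8.0, "4.0": 16.0}
# Both generations use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: str, lanes: int = 16) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s (1 GB = 10**9 bytes)."""
    raw_bits_per_s = PCIE_GT_PER_S[generation] * 1e9 * lanes
    return raw_bits_per_s * ENCODING_EFFICIENCY / 8 / 1e9


gen3_x16 = pcie_bandwidth_gb_s("3.0")
gen4_x16 = pcie_bandwidth_gb_s("4.0")

print(f"PCIe 3.0 x16: {gen3_x16:.2f} GB/s")
print(f"PCIe 4.0 x16: {gen4_x16:.2f} GB/s")
print(f"Ratio: {gen4_x16 / gen3_x16:.1f}x")
```

The theoretical ratio is exactly 2, that is, a 100% gain. An observed gain above that, such as the 120% figure, may partly reflect other differences between the two test systems rather than the bus alone.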
However, the improvement in data transfer speeds between the GPU and the main memory may not significantly affect the performance of Neat Video, as in many cases PCI Express 3.0 has already been fast enough.
Having said that, you should be able to notice the benefit from PCI Express 4.0 if you are using:
- a single GPU and not enough GPU memory is available to Neat Video (check the green triangle in Neat Video Preferences > Performance > GPU) OR
- more than one PCI Express 4.0 GPU (provided that both the CPU and the motherboard are able to deliver 16 PCIe 4.0 lanes to each GPU) OR
- a GPU-accelerated video editing application (such as DaVinci Resolve)
The exact gains in these scenarios are yet to be measured; that is something we look forward to doing once the new GPUs reach our labs.
Anyway… without further ado, let’s jump straight into the NeatBench test results:
| Frame size | Temporal Radius | RTX 2080 Ti (frames per second) | RTX 3080 (frames per second) |
As always, when looking at NeatBench’s results it’s important to remember that the figures show the performance of Neat Video alone without the overhead of a video editing application, input and output codecs and possibly other effects. The actual render speed may be lower and will vary from project to project and application to application.
So what’s next? We are working on a Neat Video update that will allow the first happy owners of the RTX 3080 to unleash the performance gains offered by these devices. Hopefully, it will be ready by the time the RTX 3080 becomes available in stores. The work on optimizing Neat Video will not stop there: once we get the new GPUs into our computers, we will take a closer look and should hopefully be able to squeeze even more frames per second out of the new hardware.
We are also looking forward to the release of an even larger beast, the RTX 3090. With 10,496 CUDA cores onboard and 24 GB of memory offering an internal bandwidth of more than 935 GB/s, we expect this GPU to be much more impressive than its smaller sibling, the RTX 3080.
We would like to thank NVIDIA for their collaboration in running our tests on the new GPUs. Without their support, we would not have been able to access the RTX 3080 so quickly.