Benchmarked: Crysis 3 Running on GPU VRAM and System RAM Disks
What else are you going to do with 24GB of VRAM?
A couple of days ago, word got out that a gamer was running Crysis from a 3090 GPU RAM disk. It worked, but of course, we had questions. Specifically, how well does Crysis 3 run from a RAM disk compared to boring old SSD storage? That's easy enough to test, so we set about downloading Crysis 3, VRAM Drive, and ImDisk Toolkit—those last two were required for the GPU and system RAM disk testing.
We hoped to see some actual benefit from using a RAM disk, but let's set expectations first. A RAM disk—GPU or system RAM, it doesn't matter—isn't for everyone. In fact, there are numerous reasons not to even bother. The biggest drawback is that RAM is volatile: Shut down the application providing the disk, or power off your PC, and whatever was stored on the RAM disk goes poof. That often means you have to engage in workarounds, like installing apps and games to the RAM drive, but then keeping a copy on non-volatile storage. Each time you reboot, you have to restore the files to the RAM drive.
There are other concerns with RAM disk storage as well. The main one is that it only really helps when your storage drive's performance specifically bottlenecks an application. Many, even most, applications and games simply don't fall into that category, particularly if you're comparing RAM disk storage to a fast SSD. The entirety of Crysis 3 amounts to just 14.2GB, which means a fast NVMe SSD can read the whole game into memory in about five seconds. It's not a particularly taxing storage application, in other words.
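That five-second figure is just size divided by throughput. Here's a rough sketch of the arithmetic. The read rates are assumed round numbers for illustration, not measurements from our test drives, and `seconds_to_read` is a made-up helper name.

```python
GAME_SIZE_GB = 14.2  # Crysis 3 install size quoted in the article

# Assumed sustained sequential read rates in GB/s.
# These are illustrative round numbers, not benchmark results.
ASSUMED_READ_GBPS = {
    "SATA SSD": 0.55,
    "NVMe SSD (PCIe Gen3)": 3.0,
}


def seconds_to_read(size_gb: float, rate_gbps: float) -> float:
    """Best-case time to stream size_gb at a sustained rate_gbps."""
    if rate_gbps <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gbps


for name, rate in ASSUMED_READ_GBPS.items():
    print(f"{name}: {seconds_to_read(GAME_SIZE_GB, rate):.1f} s")
```

This is a best case that ignores file system overhead and small-file access patterns, and a game never needs to read its entire install just to launch.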
But reading the data into memory is only a small part of what a game does. There's a lot of processing of data, which can take quite a bit longer than the loading of data into memory. Actually, many games store data in a compressed format, which has to be decompressed to be useful. DirectStorage and RTX IO may reduce such bottlenecks on future games, but Crysis 3 came out in 2013 and obviously isn't using anything like that.
Finally, before we get to the results, we encountered some anomalies with running Crysis 3 on the latest Windows 10 (release 2004) platform. Specifically, we ended up with a 65 fps performance limit that we couldn't get around at 1080p and 1440p in a consistent manner. Turning off vsync in the game and Nvidia drivers didn't help, but 4K and maxed out settings seem to have mostly gotten around it, with performance in the 70-80 fps range. And with that out of the way, let's look at the results.
Yeah, that earlier bit about Crysis 3 not being very storage limited? This is the result. Launch times varied by about 0.2 seconds in our testing, but some of that might be human error. No one is likely to notice a 0.2 second difference in load times, and the SATA SSD actually outperformed the NVMe SSD. The GPU RAM Disk ended up coming in last, perhaps just due to software overhead. The VRAM Drive ought to perform as well as other storage options, but again, a few tenths of a second aren't particularly meaningful. The time to load a save game was effectively tied.
As for actual in-game performance, there's a bit more variability between runs, with the VRAM Drive coming out just a hair ahead of the two SSDs. Running the game off of system RAM ended up being the slowest, which again doesn't make much sense, but it was consistently nearly 1 fps slower than the other storage options. There were also still occasional stutters on all of the test options (particularly on the first run, where minimum fps dropped into the single digits), so extreme RAM drive storage didn't fix that.
Why Aren't RAM Drives Faster?
The issue with RAM drives is that the applications have no idea they're residing on blazingly fast storage. This is why letting the application or game or even OS manage memory is usually a better overall solution. Think about what we've done here in our testing of Crysis 3.
For the RAM drive, we've allocated a chunk of system memory as storage. When we launch the game, the CPU reads data out of that portion of memory, copies it into another section of RAM, then processes the data in various ways and loads some portions of the data into the GPU memory. That's a lot of wasted resources and effort.
For the VRAM drive, it's even worse. Data gets copied over the PCIe bus to the system RAM when the game launches, then it gets processed there, and eventually, textures and other portions of the game get copied back to the VRAM over the PCIe bus. We're using DDR4-3600 memory that provides 57.6 GBps of bandwidth, but the PCIe Gen3 bus only manages 16 GBps. PCIe Gen4 might help in this case, but even so, we're still generally going to end up being limited by other elements rather than storage throughput.
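For the curious, those bandwidth figures fall out of some simple multiplication. The sketch below assumes a dual-channel memory configuration with a 64-bit (8-byte) bus per channel, which is what the 57.6 GBps number implies, and an x16 PCIe link using 128b/130b encoding. The function names are our own invention.

```python
def ddr_bandwidth_gbps(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Peak theoretical DDR bandwidth in GB/s.

    Assumes dual-channel memory and a 64-bit (8-byte) bus per channel.
    """
    return mt_per_s * bus_bytes * channels / 1000


def pcie_bandwidth_gbps(gt_per_s: float, lanes: int = 16) -> float:
    """Peak theoretical one-way PCIe bandwidth in GB/s.

    Uses the 128b/130b line encoding of PCIe Gen3 and Gen4.
    """
    return gt_per_s * lanes * (128 / 130) / 8


print(f"DDR4-3600 dual channel: {ddr_bandwidth_gbps(3600):.1f} GB/s")
print(f"PCIe Gen3 x16: {pcie_bandwidth_gbps(8.0):.2f} GB/s")
print(f"PCIe Gen4 x16: {pcie_bandwidth_gbps(16.0):.2f} GB/s")
```

After encoding overhead, Gen3 x16 lands slightly under the commonly quoted 16 GBps, and Gen4 doubles it. Either way, the PCIe link is well short of what the system memory can deliver.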
The takeaway is that, besides being an extremely expensive storage solution, VRAM and RAM disks typically aren't necessary for games. Games would be better served by optimizing their own use of memory than by having users pre-allocate a fixed portion of memory (system or GPU) as storage, copy files over to that drive, and maybe gain some benefit. There are situations where a RAM disk can be more beneficial, particularly in the server realm. But for games like Crysis 3? Not so much.
Jarred Walton is a senior editor at Tom's Hardware focusing on everything GPU. He has been working as a tech journalist since 2004, writing for AnandTech, Maximum PC, and PC Gamer. From the first S3 Virge '3D decelerators' to today's GPUs, Jarred keeps up with all the latest graphics trends and is the one to ask about game performance.
Giroro
warezme said: So don't waste your time. It's not like you can even get a 3090 at this point anyway.
Nor is a ~10% performance gain in gaming worth paying over double the price of a 3080.
NP
Giroro said: Nor is a ~10% performance gain in gaming worth paying over double the price of a 3080
That value proposition is 100% subjective, and whether "it is worth it" depends largely on what your priorities are and how much money you have.
I guess the overwhelming majority would agree with you. Then again, the overwhelming majority is not who the card is designed for, and everyone who knows their stuff knew it well before the release of the 3090.
jasonf2
It is really impressive how far the industry has come in such a short period of time with SSD technology. Back in the day I used RAM drives all of the time, when spinning drives topped out at 30 MBps and probably 100 IOPS and constituted a major bottleneck. The fact that it made no statistical difference is really saying something as we continue to move forward in areas other than CPU/GPU development.
vinay2070
It's surprising how the SSD manages to keep up with NVMe. I have an NVMe drive and a SATA SSD, and the NVMe sure feels a lot faster, especially when copying files and during compression/decompression. For game loading it made no noticeable difference.
DZIrl
vinay2070 said: It's surprising how the SSD manages to keep up with NVMe. I have an NVMe drive and a SATA SSD, and the NVMe sure feels a lot faster, especially when copying files and during compression/decompression. For game loading it made no noticeable difference.
Aha, try to copy GB of small files and tell me.
vinay2070
DZIrl said: Aha, try to copy GB of small files and tell me
The NVMe is still ~50% faster than the SSD, depending on how small/big the files are.
jasonf2
vinay2070 said: It's surprising how the SSD manages to keep up with NVMe. I have an NVMe drive and a SATA SSD, and the NVMe sure feels a lot faster, especially when copying files and during compression/decompression. For game loading it made no noticeable difference.
The article, from what I could see, didn't go into much detail on what the drives were or, more importantly, what assistive technologies were used. If they were Samsung drives with Magician, there is a RAM caching routine that could have been turned on that pretty significantly boosts performance on both SATA and NVMe in small repetitive loads. This might also account for the benchmarks being so close, if everything was running pretty close to RAM level before the whole thing started. Crysis isn't that small, but the idea that the boost software prefetched most of it isn't implausible. This is pretty speculative, but it is also coupled with the fact that generic RAM disk software has rarely given me all of the performance you would think you should get from RAM when you run the numbers. That would probably take a hardware solution that, to my knowledge, is not available in the consumer sector. If it is, you certainly aren't going to pick it up at Best Buy.
knowom
Bandwidth saturation is something that needs to be considered. The VRAM drive came out slightly ahead in the real-world 4K performance for average and 99th percentile results. I think that could be due to it being faster than SSD/NVMe while at the same time being as fast as the RAM disk at read speeds, which is what games mostly do: read data. Most of the written data isn't large, bandwidth-intensive files anyway. Normally a GPU sits in an x16 slot, so it has higher bandwidth than a standard PCIe x4 NVMe device, and I'll assume it also has less overhead than an x16 quad M.2 device, plus no real performance throttling concerns due to cooling. The lack of PCIe 4.0 testing is rather unfortunate. I'd suspect improved performance, but that holds true of NVMe on PCIe 4.0 as well. I also think an AMD Ryzen 3950X or 5950X would be a great test; its L2 cache is more substantial than Intel's, with fewer cores and in turn less combined L2 cache. I think what may have happened with the SSD/NVMe/VRAM over the RAM disk is that you've got the system memory bandwidth plus the SSD/NVMe/VRAM bandwidth combined, as opposed to only system memory bandwidth to saturate fully, so it scores a bit higher in practice. Though I think it's also limited by how quickly and how much you can fill the L2 cache's bandwidth structure. In general the L2 cache isn't enormous, though these more multi-core heavy CPUs are effectively widening it further even at the same KB sizes, which means fewer L3 latency penalties imposed under stress. I think it really takes something like a VRAM drive, PCIe 4.0, and more multi-core heavy CPUs to start to show the maximum upside in practice.
That's not something a user in general would test for. In reality it's a very obscure and rare use case with high-end hardware; compared to older hardware, you'd never normally see that difference and might even rule it out as CPU variance when that might not be the case. I'm curious whether Microsoft will raise the NTFS allocation unit size from 4096 bytes to a higher threshold while still enabling compression. That's a limitation whose removal could potentially benefit newer CPUs. NTFS itself is pretty old, so when that was put in place, quad cores were the max core count for consumers, and that could have been a reason they didn't enable compression beyond that allocation unit size setting. I'm not sure; it's just my speculation. It may have been tested and shown no upside, so they didn't bother making it an option, unless there is another reason I'm unaware of that limits it to that point and can't be resolved, like with 4GB on x86.
Something I'd like to see is the ATTO disk benchmark run for the following scenarios on the Ryzen 3950X/5950X and Intel i9-9900K, with NTFS format, an allocation unit size of 4096, and compression enabled.
Ryzen 3950X/5950X.
256KB to 8MB I/O size/file size
256KB to 16MB I/O size/file size
256KB to 64MB I/O size/file size
256KB to 128MB I/O size/file size
Intel's i9-9900K
256KB to 2MB I/O size/file size
256KB to 8MB I/O size/file size
256KB to 16MB I/O size/file size
256KB to 128MB I/O size/file size