Microsoft aims to boost ray-tracing performance in VRAM-constrained scenarios — patent describes a new level of detail system for RT effects
8GB GPUs could be saved.
Microsoft published a new patent that describes a method to reduce the memory footprint of ray-traced graphics, directly addressing concerns about the increasingly large memory requirements of ray tracing (and path tracing). The technology described in the patent uses an aggressive level of detail (LOD) philosophy to raise or lower ray-tracing quality as needed.
The patent describes the ray-tracing pipeline in terms of an acceleration structure that can be optimized with a level of detail system. This is achieved through a residency map corresponding to a bounding volume hierarchy of objects. The graphics processing system can then use this map to determine what quality level each object needs to be at a given time.
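To make the residency-map idea more concrete, here is a minimal Python sketch. It is not the patent's method: the `BVHNode` class, the `build_residency_map` function, the byte sizes, and the greedy upgrade policy are all assumptions made up for illustration. It only shows the general shape of the idea, which is a map that records which level of detail of each node in a bounding volume hierarchy is resident under a memory budget.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class BVHNode:
    """Hypothetical BVH node; lod_sizes[0] is the most detailed level."""
    name: str
    lod_sizes: list[int]  # bytes needed per LOD, most detailed first
    children: list[BVHNode] = field(default_factory=list)


def flatten(node: BVHNode):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from flatten(child)


def build_residency_map(root: BVHNode, budget_bytes: int):
    """Return ({node name: resident LOD index}, bytes used).

    Assumed policy: start every node at its coarsest LOD, then upgrade
    nodes one level at a time, round-robin, while the budget allows.
    """
    nodes = list(flatten(root))
    residency = {n.name: len(n.lod_sizes) - 1 for n in nodes}
    used = sum(n.lod_sizes[-1] for n in nodes)
    if used > budget_bytes:
        raise ValueError("budget too small even for the coarsest LODs")

    upgraded = True
    while upgraded:
        upgraded = False
        for n in nodes:
            lod = residency[n.name]
            if lod == 0:
                continue  # already at full detail
            extra = n.lod_sizes[lod - 1] - n.lod_sizes[lod]
            if used + extra <= budget_bytes:
                residency[n.name] = lod - 1
                used += extra
                upgraded = True
    return residency, used


if __name__ == "__main__":
    scene = BVHNode("scene", [10], [
        BVHNode("building", [400, 200, 100]),
        BVHNode("tree", [80, 40, 20]),
    ])
    print(build_residency_map(scene, budget_bytes=300))
```

A real implementation would presumably weigh upgrades by on-screen importance rather than round-robin order; the point here is only that a smaller budget yields coarser resident levels instead of a failure to fit.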
To simplify, Microsoft's patent proposes applying a design philosophy that video games already use in 3D environments. If you play first-person or third-person games, you will know that terrain and texture quality diminish the farther you look ahead of your character. This is a cost-saving measure that boosts performance by cutting back image quality in areas where it is less noticeable or not needed.
In the context of the patent, this LOD system would be applied to the ray-tracing pipeline instead and reduce its overall memory footprint.
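As a rough illustration of the distance-based LOD idea games already use, the following Python sketch picks a detail level from the camera-to-object distance. The function names and the threshold values are hypothetical and chosen only for the example; real engines typically also factor in screen-space size and hysteresis.

```python
import math


def select_lod(distance, thresholds):
    """Return a LOD index: 0 is full detail, higher values are coarser.

    thresholds is an ascending sequence of distances; an object closer
    than thresholds[i] gets LOD i, and anything farther than the last
    threshold gets the coarsest level, len(thresholds).
    """
    for lod, limit in enumerate(thresholds):
        if distance < limit:
            return lod
    return len(thresholds)


def lod_for_object(camera_pos, object_pos, thresholds=(10.0, 30.0, 80.0)):
    """Pick a LOD from the Euclidean distance between camera and object."""
    return select_lod(math.dist(camera_pos, object_pos), thresholds)
```

Applied to ray tracing, the same kind of selection would decide how detailed a version of an object's acceleration-structure data needs to be kept in memory, rather than which mesh or texture to draw.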
This is potentially a fantastic optimization that could have serious memory-saving (and performance-enhancing) implications in the future. Apparently, today's ray-tracing pipelines lack a level of detail system that can raise or lower ray-tracing quality on the fly, making scenes expensive to render on the memory side of things. (We already know that ray tracing is seriously demanding on the compute side of the GPU.)
Technically, current RT implementations work around this by leaning heavily on upscaling technologies such as DLSS, FSR, XeSS, or checkerboarding (console upscaling) to hide ray tracing's performance penalty. However, a dedicated LOD system for the ray-tracing pipeline, independent of render resolution, would give developers a lot more flexibility in how performance can be optimized.
In real-world terms, this system should make VRAM-limited graphics cards more viable in modern ray-tracing games. In particular, it could help 8GB and 10GB GPUs achieve smoother frame pacing with RT enabled in games where those VRAM capacities bottleneck performance. Potentially, Microsoft's patented solution could also make ray tracing more playable on 6GB and even 4GB Nvidia GPUs (e.g., the mobile RTX 3050) that sport hardware-accelerated RT capabilities. This system could also help consoles like the PlayStation 5 achieve more playable ray-tracing performance in memory-constrained environments. Although the PS5 has 16GB of memory, only around 12GB is accessible to games.
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.
-
JarredWaltonGPU I have to say, stuff like this always makes me highly skeptical. Yes, there are ray tracing games that exceed 8GB of VRAM use, in part because of the RT BVH structures. But there are also a lot of games that now exceed 8GB of VRAM use without any ray tracing. And the reality is that complex RT games are quickly moving beyond older hardware that would have less than 12GB of VRAM. Right now, only the 4060 and 4060 Ti 8GB are in the "problem" category for Nvidia, and next generation I wouldn't be surprised to see 5060 and above have 12GB.
The other issue is that even if this potentially works, then game developers actually need to support it. That can often take a year or two after an API update is available before it gets commonly supported. I hope this will help, of course, but I'm doubtful it will be all that meaningful in practice. (Side note: Nvidia RTX 40-series has some stuff to drastically reduce the memory footprint of the BVH already. I wonder if this is at all related?) -
This is not a new patent. It was filed back in 2022, and its status is still pending. Also, the article doesn't fully explain the concept behind this technique.
It's not only about reducing the GPU's VRAM usage directly, though. VRAM usage is not directly mentioned either.
Don't get me wrong. If you read the patent carefully, it states that MS plans to leverage storage such as NVMe drives to reduce VRAM usage for ray tracing in games, or to use alternative methods of managing the data pool storage.
They seem to be focusing more on the need for new methods of storing the different types of data needed for ray tracing. It's a bit more complicated, though.
One solution is to create more manageable data pools that could be controlled by software to point to these "LOD" pools. It also states that Geometry pools could be loaded from bulk storage into host memory as set up via a command list by the GPU for ray tracing processing.
These can be saved either within the memory or storage devices such as SSDs which offer the fastest processing speeds outside of non-persistent storage options. As per MS, ray tracing & its associated acceleration structures are edited/regenerated by software, so due to this they are competing for storage solutions for faster data transfer and detail processing.
One more thing to note.
The patent also states that the systems and methods described help minimize the space required for ray tracing acceleration structures. Accordingly, there is a need for systems and methods for better handling of the data associated with the acceleration structures.
So it appears that MS could be leveraging faster storage media to limit VRAM usage for ray tracing to some extent as well.
Via Google Patents:
"Increasingly, as part of video games and other such applications, the acceleration structures for ray tracing are explicitly edited or regenerated by the software to reflect the current set of potentially visible geometry. Such acceleration structures are now competing for storage (both persistent (e.g., flash memory) and non-persistent (e.g., RAM)) with other data, such as geometry and texture data.”
"This growth in the share of the memory by the acceleration structures has resulted in systems with significantly large memory requirements. Moreover, the bandwidth required to fetch the large amount of data for acceleration structures has also proportionally gotten bigger.”
Also taken from the same patent:
"FIG. 1 shows a diagram of a system environment 100 including a central processing unit (CPU) 102 and a graphics processing unit (GPU) 104 with ray tracing acceleration structure level of detail processing in accordance with one example. System environment 100 may further include memory 106, presentation component(s) 108, application engine 110, graphics libraries 112, networking interfaces 114, and I/O port(s) 116, which may be interconnected via one or more busses (e.g., bus 120) to each other and to CPU 102 and GPU 104. CPU 102 may execute instructions stored in memory 106.
"Memory 106 may be any combination of non-volatile storage or volatile storage (e.g., flash memory, DRAM, SRAM, or other types of memories). GPU 104 may read/write to memory 106 either directly or via a direct memory access (DMA) process.”
https://i.imgur.com/H9KLo81.jpeg -
Alvar "Miles" Udell Are we sure TomsHardware isn't using bots to write articles now? A patent application filed in 2022 and still not approved is being written about as if it's new... -
JarredWaltonGPU
Alvar Miles Udell said: Are we sure TomsHardware isn't using bots to write articles now? A patent application filed in 2022 and still not approved is being written about as if it's new...
It was discussed (presumably again) at GDC. And I believe MS is working to make this a more integral part of DX12.
@Metal Messiah.
The use of NVMe is garbage red herring nonsense IMO. You could try to use RAM instead, but even that is going to be like 40X slower than VRAM (because of the extremely limited 16 GB/s of PCIe bandwidth). MS is the same company that talked about virtual memory and using system RAM with DX12 eight or so years ago. It has never happened, for the same reason: orders of magnitude slower than VRAM. Less controllable as well.
Nvidia has the better idea with DMM (displacement micro-meshes), where the GPU does the work to generate a more complex BVH and frees the CPU from doing that, at the same time reducing the BVH memory footprint by 90%. Or alternative texture compression schemes to greatly reduce the memory requirements. I’m still waiting for that to happen, as the AI texture compression demos looked very promising (and are probably coming in the post-Blackwell GPUs is my bet.) MS is the one that needs to do that as a standard, or Vulkan for non-Windows. Instead we’re getting pie in the sky patents about using NVMe storage to help with RT in DX12.
This is stupid because RT is so demanding. We need more RT computational power more than we need ways of reducing the BVH memory storage requirements. And with the future Blackwell and RDNA 4 GPUs, it likely becomes a moot point. 128-bit interface with 512 GB/s of bandwidth and 12GB of capacity will hopefully become the new minimum. No one is really having issues with RT on a laptop RTX 3050 because of the lack of VRAM, as the real issue is the laptop 3050 is just woefully inadequate for RT from the start. Double the VRAM would have helped with the capacity aspect, but it would still suck from the performance perspective.
Really, RT just needs faster hardware with significantly more memory bandwidth and capacity — 12GB minimum, with compute at least on the level of the 4060. And a DX12 tweak won’t provide either of those, or even the approximate equivalent. -
SethNW I think someone looked at the news today, saw a slow day, and wrote a pointless article about something they hadn't written about before because it is kind of useless. Or someone just found out about it and got overly excited.
Sure, there may be edge cases where all this helps ever so slightly on cards that are just a tiny bit over the edge. But there are plenty of cases where 8GB of VRAM isn't enough even before you turn on ray tracing; there, this will do nothing, because 8GB will remain the issue. Also, low-end cards just don't have enough GPU performance even for titles that were built for the 8GB era. Unless Tom's Hardware considers 720p to be the new amazing high-definition standard... :-D Sorry, had to go for a bit of sarcasm there.
Also, while the idea of using RAM or an SSD via DirectStorage sounds good on paper, in practice it has limited bandwidth and adds delay, which was already proven not to solve the issue with the GTX 970. Remember that slow last 0.5GB of VRAM? As soon as the card had to use it, performance tanked, so it effectively only had 3.5GB of VRAM, as drivers had to make sure it didn't use the slow 0.5GB.
Not to mention that lower-end cards and even laptop chips don't have a full 16 PCIe lanes, which would hurt performance even further in VRAM-constrained scenarios. We already know that cards with x8 or x4 links tank even harder when out of VRAM than x16 cards. And the best we can do with current stuff is PCIe 4.0; if this needs 5.0, it won't help any modern cards. Hence why it just feels like snake oil merchant talk: sounds great, but only if you don't ask too many questions. -
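The bandwidth figures quoted in the comments above can be sanity-checked with a short Python sketch. The per-lane transfer rates and the 128b/130b encoding are standard for PCIe 3.0 through 5.0; the 640 GB/s VRAM figure is a hypothetical value picked only to show how a ratio of roughly 40X against a 16 GB/s link comes about, and the results are theoretical one-direction peaks before protocol overhead.

```python
# Per-lane transfer rate in GT/s. PCIe 3.0 and later use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_bandwidth_gbs(gen, lanes):
    """Theoretical one-direction PCIe bandwidth in GB/s."""
    usable_gbit_per_lane = GT_PER_S[gen] * 128 / 130  # strip encoding overhead
    return usable_gbit_per_lane / 8 * lanes            # bits to bytes, times lanes


def slowdown_vs_vram(vram_gbs, gen, lanes):
    """How many times slower the PCIe link is than the given VRAM bandwidth."""
    return vram_gbs / pcie_bandwidth_gbs(gen, lanes)


if __name__ == "__main__":
    for gen, lanes in [(3, 16), (4, 16), (4, 8), (4, 4)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gbs(gen, lanes):.2f} GB/s")
    # 640 GB/s is an assumed VRAM bandwidth, not a figure from the thread.
    print(f"ratio: {slowdown_vs_vram(640.0, 3, 16):.1f}x")
```

Under these assumptions, a PCIe 4.0 x8 link has the same peak bandwidth as PCIe 3.0 x16, and an x4 link has half of that, which is consistent with the point that cards with fewer lanes suffer more when they spill out of VRAM.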
JarredWaltonGPU said: @Metal Messiah.
The use of NVMe is garbage red herring nonsense IMO. You could try to use RAM instead, but even that is going to be like 40X slower than VRAM (because of the extremely limited 16 GB/s of PCIe bandwidth). MS is the same company that talked about virtual memory and using system RAM with DX12 eight or so years ago. It has never happened, for the same reason: orders of magnitude slower than VRAM. Less controllable as well.
Yes indeed. I agree the idea of using storage for this RT purpose sounds kind of dumb (but that's what the patent is trying to convey). Not entirely sure why MS wants to use this tech, though I've contacted a few of the senior developers working for MS to shed some light on this patent.
I will report back with proper details, if they give some more insight, because right now further discussion on this patent creates more questions than answers. lol.
But like you said, we seriously need faster "hardware" for RT effects for sure. -
TechyIT223 If you guys want the fastest hardware then you better wait for the next 2-3 generations for Nvidia to come out with an RTX 8090 Ti or 9090 class card. 😂
1024-bit bus width as a minimum.
Even then who knows how much power they can sport to max out anything ray tracing thrown at them. Need something on API side as well. -
AkroZ There is the GigaVoxels library, based on a publication from 2013, which seems similar.
The ray tracing is done by an OpenCL or CUDA application on the GPU. After an image is generated, it provides a command list requesting additional data with priorities, and discards unused data depending on the memory capacity. The CPU tries to provide the data as fast as it can; you don't need to generate fewer frames, the quality is just lower on the first frames of a scene.
This approach uses the hardware to its maximum capacity for max quality and has low minimum requirements. It requires very detailed 3D models (meaning heavy use of hard drive space), and in 2013 there weren't many design tools to generate them. -
TechyIT223 Was GigaVoxels an actual ray tracing implementation, unlike modern methods? I doubt it though.