New memory tech unveiled that reduces AI processing energy requirements by 1,000 times or more
New CRAM technology gives RAM chips the power to process data, not just store it.
Artificial intelligence (AI) computing requires tremendous amounts of electricity, but targeted research might hold the key to greatly reducing that. A team of researchers in the U.S. has developed technology that could reduce the energy consumed by AI processing by a factor of at least a thousand.
A group of engineering researchers at the University of Minnesota Twin Cities have demonstrated an AI efficiency-boosting technology and published a peer-reviewed paper outlining their work and findings. The paper was published in npj Unconventional Computing, a peer-reviewed journal by Nature. In essence, they’ve created a shortcut in the normal practice of AI computations that greatly reduces the energy requirement for the task.
In current AI computing, data is transferred between the components processing it (logic) and where data is stored (memory/storage). This constant shuttling of information back and forth is responsible for consuming as much as 200 times the energy used in the computation, according to this research.
Thus the researchers have turned to Computational Random-Access Memory (CRAM) to address this. The CRAM the research team has developed places a high-density, reconfigurable spintronic in-memory compute substrate within the memory cells themselves.
This differs from existing processing-in-memory solutions such as Samsung's PIM technology, which places a programmable computing unit (PCU) within the memory core. Data still has to travel from memory cells to the PCU and back, just not nearly as far.
Using CRAM, the data never leaves memory, instead undergoing processing entirely within the computer’s memory array. According to the research team, this allows the system running the AI computing application an energy consumption improvement “on the order of 1,000x over a state-of-the-art solution.”
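To put the data-movement figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simple model in which total energy is just compute energy plus data-movement energy, and it uses the "as much as 200 times" upper bound quoted above. The function names are illustrative and do not come from the paper.

```python
def data_movement_share(movement_to_compute_ratio: float) -> float:
    """Fraction of total energy spent moving data.

    Assumes total energy = compute energy + movement energy, with movement
    energy given as a multiple of compute energy (a toy model, not the
    researchers' methodology).
    """
    if movement_to_compute_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return movement_to_compute_ratio / (1.0 + movement_to_compute_ratio)


def max_saving_from_removing_movement(movement_to_compute_ratio: float) -> float:
    """Factor by which total energy drops if data movement cost fell to zero
    and compute energy stayed unchanged (an idealized bound under this model)."""
    if movement_to_compute_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 + movement_to_compute_ratio


if __name__ == "__main__":
    ratio = 200.0  # upper bound quoted in the article
    print(f"Share of energy spent on data movement: {data_movement_share(ratio):.1%}")
    print(f"Idealized saving factor: {max_saving_from_removing_movement(ratio):.0f}x")
```

Under this toy model, a 200-to-1 ratio means nearly all of the energy goes to shuttling data rather than computing on it. The model says nothing about how the reported 1,000x figure was measured; that comparison is against a specific state-of-the-art system and may reflect other differences as well.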
Other examples suggest the potential for even greater energy savings and faster processing. In one test, performing an MNIST handwritten digit classifier task, the CRAM proved 2,500 times more energy-efficient and 1,700 times as fast as a near-memory processing system using the 16nm technology node. This task is used to train AI systems to recognize handwriting.
The importance of this kind of work cannot be overstated. Recent reports suggest AI workloads already consume almost as much electricity as the entire nation of Cyprus did in 2021. Total power demand came in at 4.3 GW in 2023 and is expected to grow at a rate of 26% to 36% in the coming years. Arm’s CEO recently suggested that by 2030, AI may consume a quarter of all energy produced in the U.S.
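For a sense of what that growth range implies, the sketch below compounds the 4.3 GW figure forward. It assumes the 26% to 36% rate is annual and compounds yearly, which the reports cited here do not spell out, so treat the output as illustrative arithmetic rather than a forecast. The function name is hypothetical.

```python
def project_power_gw(base_gw: float, annual_growth: float, years: int) -> float:
    """Compound a starting power figure forward by a fixed annual growth rate.

    Assumption: growth is annual and compounds once per year.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return base_gw * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    BASE_GW = 4.3      # 2023 figure quoted in the article
    LOW, HIGH = 0.26, 0.36  # growth range quoted in the article
    for years in (1, 3, 5):
        low = project_power_gw(BASE_GW, LOW, years)
        high = project_power_gw(BASE_GW, HIGH, years)
        print(f"{2023 + years}: {low:.1f} to {high:.1f} GW")
```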
The paper's first author, Yang Lv, a postdoctoral researcher in the University of Minnesota Department of Electrical and Computer Engineering, and the rest of the research team have already applied for several patents based on the new technology. They plan to work with leaders in the semiconductor industry, including those in Minnesota, to provide large-scale demonstrations and produce the hardware to help advance AI functionality while also making it more efficient.
Jeff Butts has been covering tech news for more than a decade, and his IT experience predates the internet. Yes, he remembers when 9600 baud was “fast.” He especially enjoys covering DIY and Maker topics, along with anything on the bleeding edge of technology.
ekio It’s such a known fact among engineers that 99 percent of the energy is lost not in computation but in data carrying that I wonder why not every company tries to tackle similar solutions…
Groq did it, and their products are very promising as a result, but why do Nvidia, AMD, Intel, etc. keep the inefficient good old von Neumann engineering in place instead of merging RAM and logic? Big question.
DS426 I would say the holy grail of microprocessors is integrated logic and memory. We see the results from HBM and V-Cache, but indeed having memory and logic processed together seamlessly would be a heck of a quantum leap in computing.
Before that happens, I assume optics will be a necessary step forward. This article was published in 2015 by MIT. Although it was a much smaller/simpler chip at 70 million transistors, one would think we'd have made more progress on this by now.
https://news.mit.edu/2015/optoelectronic-microprocessors-chip-manufacturing-1223
As for the big tech industry, AI hardware has been over-invested in -- specifically, powerful but incredibly inefficient hardware for the task at hand.
ttquantia The issue is that these specialized architectures apply only to a minuscule fraction of all computation. Even inside AI. This has been tried for ages, multiple times, but coming up with a general-purpose architecture better than existing ones has so far turned out too difficult. I doubt there is anything here that benefits computation more generally.
slightnitpick How extensible is this? Can you plug in another module and increase the RAM/logic, the way you currently can by plugging in another RAM module?
bit_user
ekio said: It’s such a known fact among engineers that 99 percent of the energy is lost not in computation but in data carrying that I wonder why not every company tries to tackle similar solutions…
Groq did it, and their products are very promising as a result, but why do Nvidia, AMD, Intel, etc. keep the inefficient good old von Neumann engineering in place instead of merging RAM and logic? Big question.
Things like that have already been announced and are in the works.
https://www.tomshardware.com/news/sk-hynix-plans-to-stack-hbm4-directly-on-logic-processors
As mentioned in the article, this approach is far more incremental than what the researchers did. The researchers' approach would involve a fundamental shift in the memory technology used. They didn't say how well that memory technology is expected to scale, but it sounds like there's a lot to tackle before such spintronics-based compute-in-memory technology is ready for production use.
JRStern https://en.wikipedia.org/wiki/Systolic_array
People have been kicking this stuff around for a long, LONG time. Micron put some serious effort into it, oh, about ten years ago I think, maybe a bit longer. It might work for the LLM training task, why not. Think of it as an ASIC with combined processing and memory.
Someone at NVDA is probably messing with it right now.
husker
ttquantia said: The issue is that these specialized architectures apply only to a minuscule fraction of all computation. Even inside AI. This has been tried for ages, multiple times, but coming up with a general-purpose architecture better than existing ones has so far turned out too difficult. I doubt there is anything here that benefits computation more generally.
Exactly. I imagine that something like the “Fast, good or cheap: pick two” iron triangle scenario applies.