First truly 3D chip fabbed at US foundry, features carbon nanotube transistors and RAM on a single die — future devices could have up to 1000x improvement in energy-delay product
Prototype manufactured at SkyWater shows dense vertical memory-logic integration with measured and simulated AI speedups.
A collaborative research team has demonstrated what it says is the first monolithic 3D integrated circuit manufactured at a commercial U.S. foundry, reporting substantial performance gains over conventional flat chip designs. The prototype was developed by engineers from Stanford, Carnegie Mellon, the University of Pennsylvania, and MIT, and fabricated in partnership with SkyWater Technology.
The chip departs from conventional two-dimensional layouts by stacking memory and logic directly on top of one another in a single, continuous process. Instead of assembling multiple finished dies into a package, the researchers built each device layer sequentially on the same wafer using a low-temperature process designed not to damage underlying circuitry, resulting in a dense network of vertical interconnects that shortens data paths between memory cells and compute units.
The prototype was manufactured on SkyWater’s 200mm production line, using a mature 90nm to 130nm process. The stack integrates conventional silicon CMOS logic with resistive RAM layers and carbon nanotube field-effect transistors, all fabricated under a thermal budget of about 415°C. According to the team, early hardware tests show roughly a four-fold improvement in throughput compared with a comparable 2D implementation operating at similar latency and footprint.
Twelve-fold performance improvements seen in simulations
Beyond the measured hardware results, the researchers also evaluated taller stacks through simulation. Designs with additional tiers of memory and compute showed up to a twelve-fold performance improvement on AI-style workloads, including models derived from Meta’s LLaMA architecture. The group further argues that the architecture could eventually deliver 100-fold to 1,000-fold improvements in energy-delay product, a combined metric of speed and efficiency, by continuing to scale vertical integration rather than shrinking transistors.
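For readers unfamiliar with the metric, energy-delay product (EDP) is the energy a task consumes multiplied by the time it takes, so lower is better. The minimal sketch below uses made-up numbers in arbitrary units, not figures from the research, and its function names are purely illustrative. It shows how a 100-fold EDP improvement can arise from quite different mixes of speed and energy savings.

```python
# Illustrative only: every number here is hypothetical, not taken from the paper.

def energy_delay_product(energy: float, delay: float) -> float:
    """Return energy multiplied by delay (lower is better)."""
    return energy * delay


def edp_improvement(baseline: tuple[float, float],
                    candidate: tuple[float, float]) -> float:
    """Return how many times lower the candidate's EDP is than the baseline's."""
    return energy_delay_product(*baseline) / energy_delay_product(*candidate)


# (energy, delay) pairs in arbitrary units, invented for illustration.
baseline_2d = (100.0, 100.0)
faster_and_leaner = (10.0, 10.0)  # 10x less energy and 10x less delay
faster_only = (100.0, 1.0)        # same energy, 100x less delay

print(edp_improvement(baseline_2d, faster_and_leaner))  # 100.0
print(edp_improvement(baseline_2d, faster_only))        # 100.0
```

Both hypothetical designs score the same 100-fold gain, which is why an EDP figure on its own does not reveal how much of the improvement comes from speed and how much from energy savings.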
While academic labs have previously demonstrated experimental 3D chips, the team emphasizes that this work differs in being built in a commercial foundry environment rather than a bespoke research line. SkyWater executives involved in the project described the effort as proof that monolithic 3D architectures can be transferred into domestic manufacturing flows, rather than remaining confined to university cleanrooms.
"Turning a cutting-edge academic concept into something a commercial fab can build is an enormous challenge," said co-author Mark Nelson, vice president of technology development operations at SkyWater Technology.
The team presented their research at the IEEE International Electron Devices Meeting (IEDM 2025) between December 6 and 10.

Luke James is a freelance writer and journalist. Although his background is in legal, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
thestryker: While this seems to be slightly more practical than prior designs, I'm not sure it's viable. The compute density isn't viable at all, and since they're temperature constrained it doesn't seem like something that can rapidly be shrunk. It is possible that some sort of purpose-built systems could leverage this, but I have a hard time imagining it would be at scale.
Stacking is very likely to be the way to go for the foreseeable future since mixing manufacturing processes can be done there. Transistor design seems to be a far more pressing issue on the compute side and memory makers seem to be looking at leveraging some level of compute within their DRAM.
An interesting video that talks about the future of compute:
https://youtu.be/IS5FovPfvf0?si=tVIlK7ODG-7qEgFR
edzieba: The first bulk silicon IC with stacked memory-over-logic, maybe. But a partially non-silicon process has already been announced, gone into mass production, and then been shuttered again: the chalcogenide process used to make 3D XPoint/Optane. This had CMOS compute at the bottom, followed by alternating metal layers and chalcogenide PCM layers stacked on top, within the same die.
usertests: "The group further argues that the architecture could eventually deliver 100-fold to 1,000-fold improvements in energy-delay product, a combined metric of speed and efficiency, by continuing to scale vertical integration rather than shrinking transistors."
How about using metrics that make sense, or multiple metrics for a better comparison?
I've followed SkyWater for a long time. If they can legitimately create something with 10-100x better performance per watt than a TSMC N2 chip, then computing could become exciting again, even if there are a few asterisks.