AI agent designs a complete RISC-V CPU from a 219-word spec sheet in just 12 hours — comparably simple design required 'many tens of billions of tokens'

QiMeng AI-based chip design system — (Image credit: Chinese Academy of Sciences)

AI chip design startup Verkor.io claims in a research paper published in March that its agentic AI system, Design Conductor, autonomously produced a complete RISC-V CPU core. The system took a 219-word requirements document and generated a verified, layout-ready design in 12 hours, orders of magnitude faster than the 18- to 36-month timelines typical of commercial chip design.

This is the first time an autonomous agent has built a working CPU from spec to GDSII layout file, according to Verkor. The resulting processor — VerCore — is a five-stage pipelined, in-order, single-issue core that met timing at 1.48 GHz on the ASAP7 7nm process design kit, scoring 3,261 on the CoreMark benchmark.
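For a per-clock view of those figures, CoreMark scores are often normalized by clock frequency. The sketch below uses only the two numbers reported above; the function name is illustrative and does not come from Verkor's paper.

```python
def coremark_per_mhz(score: float, clock_ghz: float) -> float:
    """Normalize a CoreMark score by clock frequency expressed in MHz."""
    return score / (clock_ghz * 1000.0)


# Figures reported for VerCore: 3,261 CoreMark at 1.48 GHz.
vercore = coremark_per_mhz(score=3261, clock_ghz=1.48)
print(f"VerCore: {vercore:.2f} CoreMark/MHz")
```

That works out to roughly 2.2 CoreMark/MHz, which is the number to use when comparing against cores running at different clock speeds.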

Verkor’s paper goes into detail on the pipeline architecture, which includes instruction fetch, decode, execute, memory, and writeback stages with early branch resolution and operand forwarding.
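To make those pipeline terms concrete, here is a rough cycle-count model for an in-order, single-issue pipeline. It is an illustrative sketch rather than Verkor's model: the instruction and taken-branch counts are made-up inputs, and it assumes operand forwarding removes every data-hazard stall, which a real core would not fully achieve (load-use stalls, for example). Resolving branches earlier in the pipeline shortens the penalty paid on each taken branch, so the sketch compares penalties of one and two cycles.

```python
def pipeline_cycles(n_instr: int, taken_branches: int,
                    branch_penalty: int, stages: int = 5) -> int:
    """Cycles to run n_instr instructions on an idealized in-order pipeline.

    The first instruction needs `stages` cycles to reach writeback; each later
    instruction retires one cycle after the previous one. Every taken branch
    adds `branch_penalty` bubble cycles.
    """
    fill = stages - 1  # cycles spent filling the pipeline before first retire
    return fill + n_instr + taken_branches * branch_penalty


# Hypothetical workload: 1,000 instructions, 150 of them taken branches.
for penalty in (1, 2):
    cycles = pipeline_cycles(1000, 150, penalty)
    print(f"penalty={penalty}: {cycles} cycles, CPI={cycles / 1000:.3f}")
```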

During optimization, the system independently implemented a fast Booth-Wallace multiplier that clocked at 2.57 GHz and a one-cycle branch penalty design that the agent selected after implementing and testing both one-cycle and two-cycle variants. Verkor compares VerCore's CoreMark performance to Intel's Celeron SU2300, a 2011 mobile chip based on the Penryn architecture.
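For readers unfamiliar with the multiplier named here: Booth recoding reduces the number of partial products a multiplier has to add, and a Wallace tree then sums them using carry-save adders. The Python sketch below models only a common radix-4 form of Booth recoding in software, with an ordinary sum standing in for the Wallace tree. It is a textbook illustration, not the RTL the agent generated, and the function names and 16-bit width are assumptions.

```python
def booth_radix4_digits(y: int, bits: int = 16) -> list[int]:
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits (-2..2)."""
    assert bits % 2 == 0 and -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    u = y & ((1 << bits) - 1)  # two's-complement bit pattern of y

    def bit(i: int) -> int:
        return 0 if i < 0 else (u >> i) & 1

    digits = []
    for i in range(0, bits, 2):
        # Overlapping 3-bit window: y[i+1], y[i], y[i-1]
        digits.append(-2 * bit(i + 1) + bit(i) + bit(i - 1))
    return digits


def booth_multiply(x: int, y: int, bits: int = 16) -> int:
    """Multiply by summing one partial product per Booth digit."""
    partials = [d * x * 4**k for k, d in enumerate(booth_radix4_digits(y, bits))]
    return sum(partials)  # hardware would use a Wallace tree for this sum


print(len(booth_radix4_digits(7)), booth_multiply(123, -45) == 123 * -45)
```

A 16-bit multiplier yields 8 Booth digits, so the adder tree sums 8 partial products instead of 16.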

A five-stage in-order core with no caches and no out-of-order execution is a fairly straightforward design by industry standards. Verkor's own paper notes that leading-edge chips cost north of $400 million and take 18 to 36 months with engineering teams in the hundreds, but VerCore is far simpler than those designs. That said, the 12-hour timeline is still notable for a fully autonomous run from spec to layout, even if it did require "many tens of billions of tokens" at this comparatively modest level of complexity.

VerCore pipeline diagram — (Image credit: Verkor.io)

VerCore hasn’t been physically fabricated. It was instead verified in simulation using Spike, a reference RISC-V ISA simulator, and ASAP7 is an academic process design kit, not a production 7nm node. Verkor says the core can run a uCLinux variant in simulation.
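Verification against a reference simulator typically means running the same program on both the design and the reference model, then comparing what each instruction commits. The sketch below shows that idea in Python. The trace format, a list of (pc, destination register, value) records, is hypothetical; it is not Spike's actual log format and not Verkor's flow.

```python
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    pc: int      # address of the retired instruction
    rd: int      # destination register index
    value: int   # value written to rd


def first_mismatch(dut: list[Commit], ref: list[Commit]) -> Optional[int]:
    """Return the index of the first diverging commit, or None if traces match."""
    for i, (d, r) in enumerate(zip(dut, ref)):
        if d != r:
            return i
    if len(dut) != len(ref):
        return min(len(dut), len(ref))  # one trace ended early
    return None


# Made-up traces: the design under test writes the wrong value at index 1.
ref = [Commit(0x80000000, 5, 1), Commit(0x80000004, 6, 2)]
dut = [Commit(0x80000000, 5, 1), Commit(0x80000004, 6, 3)]
print(first_mismatch(dut, ref))  # prints 1
```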

Verkor's paper is candid about the limitations of the underlying language models, admitting that the agent sometimes “underestimates the complexity of work that is required to address certain issues.” For example, in one case, when failing to meet timing, Design Conductor tried to make major changes to "deepen the pipeline, instead of looking for simpler explanations."

In another case, the researchers observed the model reasoning about Verilog, an event-driven language, as if it were sequential code. "While we found that this did not impact DC's ability to achieve functional correctness, it made it more challenging for DC to debug timing issues," the researchers explained.
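The distinction the researchers describe can be illustrated with the classic register swap. In Verilog, the nonblocking assignments `a <= b; b <= a;` inside one clocked block sample both right-hand sides before either register updates, so the values swap. Read as sequential code, the same two lines would copy `b` into both variables. The Python below is a simplified model of that one case, not a Verilog simulator.

```python
def sequential(a: int, b: int) -> tuple[int, int]:
    """Treat the two assignments as ordinary sequential statements."""
    a = b
    b = a  # sees the already-updated a
    return a, b


def nonblocking(a: int, b: int) -> tuple[int, int]:
    """Model one clock edge: evaluate every right-hand side, then update."""
    next_a, next_b = b, a   # sample phase: both reads use the old values
    return next_a, next_b   # update phase


print(sequential(1, 2))   # (2, 2): the original value of a is lost
print(nonblocking(1, 2))  # (2, 1): the registers swap
```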

The researchers estimate that five to 10 human experts will still be required to guide the system toward a production-ready chip. In addition, compute requirements grow non-linearly as design complexity increases, which makes the whole process less practical on a commercial scale. Verkor said it plans to release VerCore's RTL source and build scripts by the end of April, and the company also intends to showcase an FPGA implementation at DAC (the annual Design Automation Conference).

Previous AI chip design efforts, such as the RISC-V CPU that Chinese researchers produced in under five hours in 2023 and the more recent QiMeng project, used different methodologies and architectures. Verkor’s Design Conductor handles the full design process from spec to GDSII autonomously, though it shares the same limitation that all other AI-designed chips have: no physical silicon.

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.

9 Comments from the forums

IntelUser2000

I doubt this 1.48GHz processor is actually comparable to 1.2GHz Penryn. That's about the performance of an original in-order Atom and that Atom was comparable to earlier out-of-order ARM processors like Cortex A9 per clock.

Physical design also doesn't map well to simulation. Even if it's functional, performance of an in-order 1-issue chip with no caches will be terrible. In reality I think it'll be hard pressed to beat a 486 per clock, which is also a 1-issue in-order design. The 486 does have L1 caches though.
bit_user

What's the going rate for "many tens of billions of tokens" in $USD?
hwertz

Don't know if this CPU is particularly useful (maybe it is, a simple low-cost RISC-V may have its use; or maybe there are already simple but better designs out there). But impressive nevertheless.
roba67

The AI is creating RTL and running EDA tools. I'd like to see the AI actually synthesize and do layout. Also a chip this size, given the existing specs, is a 1 or 2 man design up to layout which is likely done by the fab. I do chip design.
nanoflooder

bit_user said:
What's the going rate for "many tens of billions of tokens" in $USD?
Depending on the model; the highest price I could find is $600 per 1M output, so if we take 100bn tokens, that would be $60m. On the other hand, the current frontier Chinese models cost around $5 per 1M, so about $500k total. Exactly how much they spent is anyone's guess.
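A quick sanity check of that arithmetic, as a sketch: the 100bn token count and both per-million prices are the assumptions stated in the comment, not figures from Verkor. At $5 per 1M tokens, 100bn tokens comes to $500,000.

```python
def token_cost_usd(tokens: float, usd_per_million: float) -> float:
    """Total cost in USD for a token count at a given price per 1M tokens."""
    return tokens / 1_000_000 * usd_per_million


TOKENS = 100e9  # the commenter's assumed 100bn tokens, not a number from the paper
for price in (600, 5):
    print(f"${price}/1M tokens -> ${token_cost_usd(TOKENS, price):,.0f}")
```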
Sam Hobbs

I think that 219-word spec sheet is not an advantage. The results are likely correspondingly impractical. In reality what they should do is to first request a review of the specifications. We should not make specifications bigger just for the sake of making them bigger but making them shorter can also be impractical.
palladin9479

Or hear me out, we find an existing design from 20 years ago, press control-c, then control-v.

Bam beer me.
usertests

palladin9479 said:
Or hear me out, we find an existing design from 20 years ago, press control-c, then control-v.

Bam beer me.
That will be 1 billion tokens, sir.
IntelUser2000

palladin9479 said:
Or hear me out, we find an existing design from 20 years ago, press control-c, then control-v.

Bam beer me.
The only possible advantage I see is potentially allowing for a future where it brings new CPU designing to more people, whereas right now it's only available to select people and organizations with heavy funding and decades of training and experience.

Moore's Law always benefits lower power, lower cost devices more than the high end. This is yet another example of it. It doesn't really bring significant advantages on the high end, and may be nothing more than nice-to-have. But on the low end it becomes a boon.