Tachyum releases a 1,600-page performance optimization manual despite continued tape-out delays and no actual silicon

(Image credit: Tachyum)

Tachyum has released a 1,600-page manual for optimizing software performance on the FPGA-based prototype of its Prodigy Universal Processor. Even though the company has yet to tape out Prodigy after years of delays, it has published detailed optimization guidance for the chip's unique instruction set architecture and optimization strategies well before actual products start sampling or hit the market.

The Prodigy universal processor has faced repeated delays. Originally slated for a 2019 tape-out and a 2020 launch, the schedule slipped from 2021 to 2022, then to 2023, and then to 2024. Earlier this year, Tachyum once again updated its plans, saying it would tape out the chip in 2025, which also pushes back the sampling of reference servers previously set for the first quarter of next year. Formally, the company still plans to begin mass production of its Prodigy processors in 2025, but it remains to be seen whether it can complete all the necessary milestones (tape-out, debugging, sampling, start of mass production) in just one year.

Tachyum's Prodigy design features 192 custom 64-bit compute cores based on an all-new microarchitecture that is claimed to be equally well suited to general-purpose computing and to highly parallel AI and HPC workloads. In particular, the ISA incorporates extensive vector and matrix instructions to address artificial intelligence and supercomputing applications, and the new performance optimization guide includes design guidelines for developing AI and HPC software.

The Prodigy instruction set architecture (ISA) combines elements of both RISC and CISC designs; according to Tachyum, it avoids the complex, lengthy, and inefficient variable-length instructions common in traditional CISC processors. All instructions are a fixed 32 or 64 bits wide, with some incorporating memory-access operations to further boost performance.

Tachyum's Prodigy FPGA features built-in performance counters that enable real-time monitoring and analysis of runtime events. The company says these tools allow programmers and engineers to identify bottlenecks and optimize code for greater efficiency, making the processor ideal for demanding computational tasks.

The manual provides specific optimization techniques, including managing dispatch limitations, improving memory routines, aligning branches and instructions, and mitigating register forwarding challenges. In addition, it offers guidance for handling cache operations, load/store alignment, and accessing special registers, ensuring developers can fine-tune software for peak performance. 

"Software programmers, test engineers, compiler developers, and systems and solutions engineers will appreciate the opportunity to take this deep dive into how Prodigy offers inherent performance benefits for efficient processing of AI, cloud, and HPC workloads," said Dr. Radoslav Danilak, founder and CEO of Tachyum. "Prodigy's integrated features will help users achieve industry-leading compute efficiency to derive insights faster, to perform research faster, to generate results faster."

As always, the proof is in the shipping silicon, and Tachyum has yet to even tape out a chip.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • bit_user
    The article said:
Tachyum's Prodigy FPGA features built-in performance counters that enable real-time monitoring and analysis of runtime events.
    Intel and AMD have these baked into their production silicon.
  • Findecanor
    If they have released it, then where is it?
    I can't find it on their web site.
  • bit_user
    Findecanor said:
    If they have released it, then where is it?
    I can't find it on their web site.
    The article links to a press release dated 2 days ago.
    https://www.tachyum.com/media/press-releases/2024/12/17/tachyum-publishes-prodigy-performance-optimization-manual/?s=31
That doesn't seem to link directly to the manual, however. Perhaps they expect you to contact them and sign a partner agreement/NDA, before you can get access to it.
  • P.Amini
    bit_user said:
    The article links to a press release dated 2 days ago.
    https://www.tachyum.com/media/press-releases/2024/12/17/tachyum-publishes-prodigy-performance-optimization-manual/?s=31
That doesn't seem to link directly to the manual, however. Perhaps they expect you to contact them and sign a partner agreement/NDA, before you can get access to it.
    Any thoughts on this processor?
  • bit_user
    P.Amini said:
    Any thoughts on this processor?
    It's a new ISA because... why?? Their original announcement sounded much more like VLIW or EPIC, but then they pivoted towards something more like a traditional RISC.

    As far as I can tell, what supposedly makes it good for AI and HPC is that each core has vector extensions and a matrix-multiply engine. You could do that with RISC-V, so why again is it a custom ISA? Even ARM now has SME (Scalable Matrix Extensions), though I'm not aware of any cores that have implemented it yet.

    It feels like every few years that I hear about a promising HPC processor, but it seems like they chronically underestimate the time to bring something like that to market, and once they do, the mainstream guys have caught up & usually even passed them.

    The main thing they seem to have going for them is being an indigenous European project and backing from the Slovakian (?) government. However, the competition they're up against includes the European Processor Initiative, which is currently using ARM and has already planned to switch over to RISC-V.
    https://www.european-processor-initiative.eu/general-purpose-processor/
    Also, for those who don't mind buying a non-European ARM CPU, Fujitsu's Monaka should provide stiff competition in the HPC sector:
    https://www.tomshardware.com/desktops/servers/fujitsu-supermicro-working-on-arm-based-liquid-cooler-servers-for-2027
    Lastly, with power consumption of up to 950 W, I'm skeptical how appealing they're going to be for general-purpose cloud workloads, which they also claim to be targeting:
    https://www.tomshardware.com/news/tachyum-reveals-20-exaflops-supercomputer-design
    Don't get me wrong, though. I'm not wishing to see them fail. I wouldn't even mind being proven wrong, because I think having a greater range of computing platforms is generally a good thing. I'm just trying to be realistic about their prospects. I appreciate the monumental amount of work they've put in thus far, and it saddens me to think it might all be for nought.

    Anyway, what's much more intriguing to me is NextSilicon's Maverick-2:
    https://www.nextplatform.com/2024/10/29/hpc-gets-a-reconfigurable-dataflow-engine-to-take-on-cpus-and-gpus/
    The idea of dynamically building dataflow pipelines sounds like it has a lot more potential to maximize silicon utilization. That should ultimately lead to better perf/W and better perf/$ than Tachyum's approach, which is basically just following what ARM, Intel, and AMD are all doing, but with slightly wider vector pipelines and more general matrix extensions than AMX currently has.
  • froggx
    P.Amini said:
    Any thoughts on this processor?
    i'm extremely impressed cooking up this wall spaghetti processor has gone on this long and they still can't show us anything that stuck.

    it's straight vaporware. they've set a target beyond the most remote of fantasies and aren't afraid to drop a lawsuit on whoever they get to help them as soon as things inevitably fall flat. props with how they're willing to take their already silly specs and add more cores into it every couple years, guess that's the only way to beat 3 different flavors of more specialized chips. i do admit though, i admire how they're talking about combining risc and cisc features only at least a decade or 2 (or more?) behind x86 and arm having designs leveraging the merits of both these types of design in production, cause they gotta talk about something to keep from crying. part of me actually suspects they think they struck on something new with those perf counters they mentioned. imagine if those had slipped through the cracks without anyone on contract they could extor... ahem... take to court.

    i really want to see if they can end the decade neither taping out nor tapping out. i think they've got what it takes.
  • P.Amini
    bit_user said:
    It's a new ISA because... why?? Their original announcement sounded much more like VLIW or EPIC, but then they pivoted towards something more like a traditional RISC.

    Thanks for the answer, helpful as always. Respect.
  • P.Amini
    froggx said:
    i'm extremely impressed cooking up this wall spaghetti processor has gone on this long and they still can't show us anything that stuck.

    LOL! thanks!
  • bit_user
    froggx said:
    it's straight vaporware. they've set a target beyond the most remote of fantasies and aren't afraid to drop a lawsuit on whoever they get to help them as soon as things inevitably fall flat.
    Yeah, like the lawsuit against Cadence, from which they licensed some IP blocks (stuff like DDR5 and PCIe controllers, I think). They claimed the only reason their processor would miss its market window was their partner's fault, but I'm nearly certain they'd have faced other setbacks even if that IP had arrived on time. In other words, I suspect it was just a convenient excuse:
    https://www.tomshardware.com/news/tachyum-to-cadence-our-prodigy-doesnt-meet-prodigious-goals-sue-you
    Were there other such lawsuits by Tachyum?

    froggx said:
    i really want to see if they can end the decade neither taping out nor tapping out. i think they've got what it takes.
    Yeah, sometimes the sunk cost fallacy can keep companies or ventures funded long past the point when they should've been cut off. That, or if the operation were allowed to fail, it would reflect poorly on certain powerful & influential people who've been backing them so far. I don't know for a fact that's what's happening here, but I wouldn't be the slightest bit surprised.

    Early in my career, I worked for a startup company that reinvented itself a few times and went after wildly different markets, mainly because one of its big, early investors didn't want to admit that it was a bad investment. As far as I know, they weren't a traditional venture capital firm, BTW.

    The normal thing for tech industry venture capital firms is to expect only 10% of their investments to hit it big. Being so accustomed to backing failures removes a lot of the stigma from admitting when you've got one in your fund. Then, they're pretty ruthless about cutting off and even clawing back funds from companies, basically as soon as they show signs of being non-viable.
  • jalyst
    I feel like, every time I read all of bituser's posts (& several other posters here);
    I get, like, a 'tincy bit' smarter & better-informed... :grinning:

    I don't always fully appreciate/understand what's conveyed, but some stuff sinks in.

    It's also what I love about older-school interweb mediums, like this forum.
    Of which are increasingly rarer gems, nowadays... :rolleyes: