Tachyum Details 2nm Prodigy Lineup, Scaling to 1,024 Cores and 400 PFLOPs

Tachyum has released the full specifications of its 2nm Prodigy Universal Processor family, presenting a CPU-centric architecture aimed directly at workloads that today rely on GPU clusters. At the top of the lineup, the T241024 SKU integrates 1,024 custom 64-bit cores running at up to 6 GHz, paired with 24 channels of DDR5-17600 memory and 128 lanes of PCIe 7.0. Tachyum claims the Prodigy Ultimate configuration delivers 21.3x higher AI rack performance than Nvidia’s Rubin Ultra NVL576 and more than 1,000 PFLOPs of inference throughput across a full rack, far above the 50 PFLOPs the company attributes to Rubin. According to the company, the new 2nm generation also brings major improvements in bandwidth, memory scale, and power efficiency to support multi-chiplet packaging.

The architecture is built around chiplets containing 256 out-of-order 64-bit cores, each rated at eight instructions per cycle and backed by matrix and vector engines for AI and HPC. The coherent cache hierarchy includes a 128 KB I-cache, a 64 KB D-cache, and a combined 1 GB of L2/L3 cache per socket. Systems scale up to 16 sockets and as much as 48 TB of unified memory per processor, positioning Prodigy for large-model AI, exascale supercomputing, analytics, in-memory databases, edge/telco workloads, and digital currency processing. The company’s latest funding round of $220 million is targeted at completing the 2nm tape-out and moving the architecture into production. Tachyum also emphasized its software strategy: the processors run native code as well as unmodified x86, Arm, and RISC-V binaries, supported by a full stack of OS kernels, compilers, AI frameworks, debugging tools, and application libraries.

The expanded SKU table now ranges from 32 to 1,024 cores, 4 to 24 DDR5 controllers, and TDPs from 30 W to 1,600 W. Mid-range products support 16-socket scalability and DDR5-12800 memory, while lower-power parts target cloud and database deployments with PCIe 7.0 lane counts between 24 and 128. Across the family, Tachyum cites up to 5x higher integer performance, 16x higher AI throughput, 8x more DRAM bandwidth, and 4x the I/O bandwidth compared to prior designs, alongside energy reductions from the 2nm node. The company is also offering its DIMM-based bandwidth-expansion technology—advertised as a 10x improvement—along with its TPU core, ISA, and TAI data types for industry licensing.
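To put the PCIe 7.0 lane counts above in perspective, the short Python sketch below converts lanes into raw one-direction bandwidth. It assumes the PCIe 7.0 headline signalling rate of 128 GT/s per lane and ignores encoding and protocol overhead, so the results are theoretical ceilings rather than Tachyum-published figures. The function name raw_pcie_gb_per_s is purely illustrative.

```python
# Raw PCIe 7.0 signalling rate per lane, in gigatransfers per second.
# Assumption: this is the headline rate of the PCIe 7.0 standard,
# not a number taken from Tachyum's specification sheet.
PCIE7_GT_PER_S = 128


def raw_pcie_gb_per_s(lanes: int, gt_per_s: float = PCIE7_GT_PER_S) -> float:
    """Raw one-direction bandwidth in GB/s, before any protocol overhead."""
    # Each transfer carries one bit per lane; divide by 8 to get bytes.
    return lanes * gt_per_s / 8


# Lane counts at the low and high end of the range quoted in the article.
for lanes in (24, 128):
    print(f"{lanes:>3} lanes: ~{raw_pcie_gb_per_s(lanes):,.0f} GB/s per direction (raw)")
```

Under these assumptions, the 24-lane parts top out at roughly 384 GB/s per direction and the 128-lane parts at roughly 2,048 GB/s per direction. Delivered throughput would be lower once overheads are included.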

• Top-end T241024: 1,024 cores at 6 GHz, 24× DDR5-17600 channels, 128 PCIe 7.0 lanes, 48 TB memory per socket, 1,600 W TDP

• Performance engines: 400 PFLOPs for AI using Tachyum’s TAI data types, 400 TFLOPs double-precision (DP) for HPC

• Scalability: Up to 16 sockets, coherent multiprocessor fabric

• Compatibility: Runs x86, Arm, and RISC-V binaries in addition to the native ISA

• SKU range: 32–1,024 cores, 30–1,600 W, DDR5-6400 to DDR5-17600
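The memory figures in the list above translate into peak theoretical DRAM bandwidth with simple arithmetic. The Python sketch below is a rough estimate under one stated assumption that the article does not confirm: each DDR5 channel is 64 bits (8 bytes) wide. The helper name peak_dram_gb_per_s is hypothetical, and sustained bandwidth in practice would be lower.

```python
# Assumption: 64 data bits (8 bytes) per DDR5 channel. The article does not
# state the channel width, so treat the results as rough upper bounds.
BYTES_PER_TRANSFER = 8


def peak_dram_gb_per_s(mt_per_s: int, channels: int = 1) -> float:
    """Peak theoretical DRAM bandwidth in GB/s (decimal, 1 GB = 10**9 bytes)."""
    # MT/s * bytes per transfer = MB/s per channel; divide by 1000 for GB/s.
    return mt_per_s * BYTES_PER_TRANSFER * channels / 1000


# Per-channel peak for the DDR5 speed grades mentioned in the article.
for speed in (6400, 12800, 17600):
    print(f"DDR5-{speed}: {peak_dram_gb_per_s(speed):.1f} GB/s per channel")

# Top-end T241024: 24 channels of DDR5-17600.
top = peak_dram_gb_per_s(17600, channels=24)
print(f"T241024 peak: {top:,.1f} GB/s (~{top / 1000:.2f} TB/s)")
```

On that assumption, a DDR5-17600 channel peaks at about 140.8 GB/s, which would put the 24-channel T241024 at roughly 3.38 TB/s per socket.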

“With tape-out funding now secured after a long wait, the world’s first Universal Processor can proceed to production, designed to overcome the inherent limitations of today’s data centers,” said Dr. Radoslav Danilak, founder and CEO of Tachyum.

🌐 Analysis

Tachyum’s detailed spec sheet positions Prodigy as a CPU-dominant alternative to GPU-first AI architectures at the moment when Nvidia, AMD, and hyperscalers are ramping next-generation accelerators such as Rubin, MI450, and custom AI ASICs. Its combination of 16-socket shared memory, 48 TB-per-socket scaling, and 6 GHz clocks at 2nm targets workloads where large memory coherence and uniform programming models matter more than peak tensor throughput. The competitive question now shifts to real-world benchmarks, compiler maturity, and whether OEMs and cloud operators are willing to adopt a universal CPU architecture for large-model AI instead of the accelerator stacks that dominate today’s deployments.

Tachyum is a privately held semiconductor company headquartered in Sunnyvale, California, founded in 2016 with the mission of developing a universal processor architecture that merges CPU, GPU, and AI-accelerator functions into a single, high-efficiency platform for hyperscale, HPC, and AI data centers. Its core technology, the Prodigy processor family, is designed to deliver large gains in rack-level performance and power efficiency, supported by more than 250 issued and pending patents and a growing software ecosystem. The company is led by co-founder and CEO Radoslav Danilak, a veteran chip architect who previously co-founded Skyera (acquired by Western Digital) and served as CTO of SandForce (acquired by LSI), with earlier engineering roles at NVIDIA and Toshiba—experience that underpins Tachyum’s focus on memory, storage, and compute-architecture innovation. Tachyum has raised significant private capital, including a reported $220 million Series C round and a $500 million purchase order from a European investor, and is a member of the I4DI consortium, which plans to deploy Prodigy-powered systems in Slovakia.

🌐 We’re tracking the latest developments in networking silicon. Follow our ongoing coverage at: https://convergedigest.com/category/semiconductors/

Tags: Tachyum


Jim Carroll
