d-Matrix, a start-up based in Santa Clara, California, has launched “Corsair”, a compute platform designed specifically for AI inference in modern datacenters. Corsair is built on d-Matrix’s proprietary Digital In-Memory Compute (DIMC) architecture, which integrates memory and compute for high-performance generative AI applications. The company claims faster token generation, improved energy efficiency, and lower total cost of ownership compared to GPUs and other systems, addressing growing enterprise demand for scalable, cost-effective AI infrastructure.
Corsair supports the increasing computational needs of advanced AI models, such as reasoning agents and interactive video generation.
The company says its DIMC architecture overcomes the memory bandwidth limitations of traditional inference systems by tightly coupling memory and compute within each chip. The platform scales using DMX Link for high-speed chiplet connectivity and DMX Bridge for inter-package communication. These capabilities, combined with native support for the Micro-scaling (MX) block floating point standard, enable Corsair to achieve ultra-fast processing speeds, making generative AI applications more practical for enterprise use.
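The MX standard mentioned above is a family of block floating point formats in which a group of elements shares one power-of-two scale while each element keeps only a few bits of its own. The toy quantizer below illustrates that idea; it is a simplified sketch of block floating point in general, not an implementation of the MX specification (the real formats fix block sizes and element encodings).

```python
import numpy as np

def mx_quantize(block, mantissa_bits=8):
    """Toy block floating point quantizer: one shared power-of-two
    scale per block, small signed-integer mantissas per element.
    Illustrative only; not the actual MX format encoding."""
    block = np.asarray(block, dtype=np.float64)
    max_abs = np.max(np.abs(block))
    if max_abs == 0.0:
        return np.zeros_like(block), 0
    # Shared exponent derived from the largest magnitude in the block.
    shared_exp = int(np.floor(np.log2(max_abs)))
    # Scale chosen so the largest element fits in the mantissa range.
    scale = 2.0 ** (shared_exp + 1 - (mantissa_bits - 1))
    lim = 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(block / scale), -lim, lim)
    return mantissas * scale, shared_exp

weights = [0.75, 0.5, -0.25, 0.125]
deq, shared_exp = mx_quantize(weights)
print(deq, shared_exp)  # these dyadic values round-trip exactly
```

Because only one scale is stored per block, such formats keep memory traffic close to that of plain integer data while preserving much of floating point's dynamic range, which is why they suit bandwidth-sensitive inference hardware.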
Each Corsair PCIe Gen5 card provides 2,400 TFLOPS of 8-bit compute, 2 GB of integrated performance memory, and up to 256 GB of off-chip capacity memory. The company quotes a memory bandwidth of 150 Tbps, far beyond what HBM-based systems deliver, and claims up to 10x faster token generation and 3x better cost and energy efficiency for enterprises. Sampling for early-access customers has begun, with general availability expected in Q2 2025.
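Memory bandwidth matters for token generation because autoregressive decode is typically memory-bound: each generated token streams the model weights once, so bandwidth divided by bytes per token gives a rough upper limit on tokens per second. The back-of-envelope calculation below takes the article's 150 Tbps figure at face value; the model size and byte width are illustrative assumptions, not d-Matrix measurements.

```python
def max_tokens_per_sec(bandwidth_tbps, params_billions, bytes_per_param):
    """Rough roofline bound for memory-bound autoregressive decode:
    every generated token reads the full weight set once."""
    bytes_per_sec = bandwidth_tbps * 1e12 / 8          # Tbps -> bytes/s
    bytes_per_token = params_billions * 1e9 * bytes_per_param
    return bytes_per_sec / bytes_per_token

# Hypothetical 70B-parameter model at 1 byte/param (8-bit weights),
# fed from 150 Tbps of memory bandwidth.
print(round(max_tokens_per_sec(150, 70, 1)))  # ~268 tokens/s upper bound
```

Real throughput falls below this bound once compute, KV-cache traffic, and batching effects are accounted for, but the formula shows why inference accelerators compete on bandwidth rather than raw FLOPS alone.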
Key Points
• Technology: DIMC architecture integrates compute and memory for ultra-high bandwidth and low latency.
• Performance: 150 Tbps memory bandwidth, 2,400 TFLOPS of compute per card, 10x faster token generation.
• Scalability: DMX Link™ for chiplet interconnect; DMX Bridge™ for multi-card scaling.
• Efficiency: 3x better TCO and energy efficiency than GPUs.
• Form Factor: Standard PCIe Gen5 full-height full-length cards.
Sid Sheth, CEO of d-Matrix, stated: “Corsair redefines AI inference with blazing-fast token generation and unparalleled scalability, making generative AI viable for enterprises worldwide.”
- Earlier this year, d-Matrix introduced Jayhawk II, a next-generation generative AI compute platform designed to tackle the cost, latency, and scalability issues of deploying large language models (LLMs) in data centers. The silicon features an enhanced Digital In-Memory Compute (DIMC) engine paired with chiplet-based interconnect technology, using the Open Compute Project’s Bunch of Wires (BoW) PHY interconnect standard. Jayhawk II delivers a 40x improvement in memory bandwidth over high-end GPUs, significantly boosting throughput and reducing latency for applications such as ChatGPT, Meta’s Llama 2, and Falcon. Optimized for LLMs ranging from 3 billion to 40 billion parameters, Jayhawk II supports floating point and block floating point numerics, compression, and sparsity, achieving 10–20x better total cost of ownership (TCO) and inference performance versus GPU-based solutions. The platform builds on the original Jayhawk release, scaling from 30 TOPS/W to 150 TOPS/W on a 6nm process while enabling prompt caching for efficient generative AI workflows.
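Prompt caching, mentioned above, amounts to memoizing the expensive prefill work for a prompt prefix that many requests share (for example, a fixed system prompt), so repeated requests skip recomputation. The sketch below shows the caching pattern with a stand-in for the prefill pass; the class and function names are hypothetical and do not reflect d-Matrix's software stack.

```python
class PromptCache:
    """Memoize the per-prefix state so identical prompt prefixes
    are processed only once. Toy illustration of prompt caching."""

    def __init__(self):
        self._cache = {}   # prefix text -> precomputed state
        self.hits = 0
        self.misses = 0

    def get_state(self, prefix, compute_fn):
        if prefix in self._cache:
            self.hits += 1
        else:
            self.misses += 1
            self._cache[prefix] = compute_fn(prefix)
        return self._cache[prefix]

def fake_prefill(prefix):
    # Stand-in for the expensive prefill pass over the prompt tokens.
    return {"tokens_processed": len(prefix.split())}

cache = PromptCache()
system_prompt = "You are a helpful assistant."
for _ in range(3):
    state = cache.get_state(system_prompt, fake_prefill)
print(cache.hits, cache.misses)  # prints: 2 1
```

Production systems cache richer state (such as attention KV tensors keyed by token prefixes), but the win is the same: repeated prefixes cost one prefill instead of many.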
- d-Matrix, established in 2019 by CEO Sid Sheth and CTO Sudeep Bhoja, focuses on high-efficiency AI compute solutions for data centers. Both founders bring extensive experience from their previous roles at Inphi and Broadcom, where they developed power-efficient compute and interconnect solutions for data centers over the past two decades. In September 2023, d-Matrix secured $110 million in Series B funding led by Temasek, with participation from investors including M12, Microsoft’s venture fund, and Playground Global. This funding supports the commercialization of d-Matrix’s Digital In-Memory Compute (DIMC) technology, aiming to enhance AI inference performance and efficiency.
