Converge Digest

Intel Expands Gaudi 3 with PCIe Card and Rack-Scale Systems

At Computex 2025 in Taipei, Intel unveiled major updates to its AI accelerator and GPU portfolio, placing a strong focus on scalable deployment options for its Gaudi 3 architecture. Designed to address AI workloads ranging from enterprise inference to cloud-scale training, Intel Gaudi 3 is now available in both PCIe and rack-scale configurations, enabling a wider range of customers to develop and deploy AI models. These offerings aim to meet growing demand for flexible, high-performance AI infrastructure without locking customers into proprietary systems.

The PCIe version of Gaudi 3 is optimized for integration into existing server environments, supporting a wide spectrum of AI models including Llama 3.1 and Llama 4. Scalable configurations offer flexibility for organizations of all sizes, while maintaining compatibility with industry-standard platforms. Intel also introduced rack-scale reference designs supporting up to 64 Gaudi 3 accelerators per rack and 8.2 TB of high-bandwidth memory. These systems feature liquid cooling and an open, modular architecture aligned with Open Compute Project (OCP) principles—targeting cloud service providers seeking low-latency, high-throughput inference solutions. Both PCIe and rack-scale variants are expected in the second half of 2025.

In parallel, Intel expanded its GPU portfolio with the introduction of the Arc Pro B60 and B50 GPUs. Based on the Xe2 architecture, the new Arc Pro models are tailored for workstation and AI inference use cases, offering up to 24GB of memory, multi-GPU support, and software compatibility across Windows and Linux. Aimed at AEC, engineering, and AI development workflows, the GPUs will be available from partners and resellers starting mid-2025. Intel also released the AI Assistant Builder to the public as an open framework on GitHub, giving developers tools to build lightweight, local AI agents optimized for Intel platforms.

• Intel Gaudi 3 AI accelerators now available in PCIe and rack-scale configurations

• Rack-scale systems support up to 64 accelerators per rack and 8.2 TB of HBM, with liquid cooling

• Modular architecture supports custom and OCP-based deployments for cloud-scale AI

• PCIe cards offer scalable inference from small to large language models

• Intel Arc Pro B60 and B50 GPUs feature Xe2 architecture, XMX AI cores, and up to 24GB memory

• New GPUs target AEC, inference, and workstation workloads with ISV certification

• Intel AI Assistant Builder now publicly available on GitHub for local AI agent development

• Product launches coincide with Intel’s 40th anniversary of operations in Taiwan

“For the past 40 years, the power of our partnership with the Taiwan ecosystem has fueled innovation that has changed our world for the better,” said Intel CEO Lip-Bu Tan. “This week, we are renewing our commitment to our partners as we work to build a new Intel for the future.”
