Qualcomm Technologies expanded its data center portfolio with the introduction of the AI200 and AI250 accelerators, signaling a major step in its effort to compete in large-scale generative AI infrastructure. The new systems extend Qualcomm’s AI inference lineup from accelerator cards to full rack-level platforms designed to maximize performance per watt and total cost of ownership (TCO). Drawing on its Hexagon NPU architecture, Qualcomm aims to deliver efficient, scalable AI inference for cloud and enterprise data centers while maintaining compatibility with existing systems and leading software frameworks. The launch marks the start of a multi-year data center roadmap, with new inference products planned annually.
The AI200 introduces a purpose-built rack-level inference architecture optimized for large language and multimodal models, featuring up to 768 GB of LPDDR memory per card and direct liquid cooling to deliver efficient, high-density compute within a 160 kW rack envelope. It integrates PCIe for scale-up, Ethernet for scale-out, and confidential computing for secure enterprise workloads. The AI250, slated for release in early 2027, debuts Qualcomm’s near-memory computing architecture, achieving over 10× higher effective memory bandwidth and significantly reduced power draw. Both platforms are designed for rapid, cost-efficient deployment of generative AI models with high performance per dollar per watt.
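To see why a 10× jump in effective memory bandwidth matters for generative AI, consider the standard back-of-envelope bound on autoregressive decode: each generated token must stream the full weight set from memory, so single-stream throughput is capped at bandwidth divided by bytes per token. The sketch below uses purely illustrative numbers (a hypothetical 70B-parameter INT8 model and a 500 GB/s baseline), not Qualcomm figures:

```python
# Back-of-envelope sketch of memory-bandwidth-bound LLM decode.
# All numbers here are illustrative assumptions, not vendor data.

def decode_tokens_per_second(params_billion: float,
                             bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode rate: every token requires
    streaming all model weights once (KV-cache traffic ignored)."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical 70B-parameter model quantized to INT8 (1 byte/param):
base = decode_tokens_per_second(70, 1.0, 500)      # assumed 500 GB/s
boosted = decode_tokens_per_second(70, 1.0, 5000)  # 10x effective bandwidth

print(round(base, 1), "->", round(boosted, 1), "tokens/s")
```

Because the bound scales linearly with bandwidth, a 10× effective-bandwidth gain translates directly into up to 10× decode headroom for bandwidth-bound models, which is precisely the LLM bottleneck near-memory computing targets.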
Qualcomm’s hyperscaler-grade AI software stack underpins the new hardware, spanning application to system layers and optimized for inference workloads. The open ecosystem supports PyTorch, ONNX, vLLM, and generative AI frameworks such as LangChain and CrewAI, with one-click Hugging Face model deployment via the Efficient Transformers Library and AI Inference Suite. The company also reaffirmed its commitment to an annual data center cadence, expanding beyond inference accelerators into server-class CPUs now under development. Existing cards such as the AI100 Ultra, which supports models of up to 100 billion parameters on a 150 W PCIe card, highlight Qualcomm’s growing performance portfolio for AI inference.
- AI200: Rack-level AI inference system with 768 GB LPDDR memory per card, 160 kW rack power, PCIe scale-up and Ethernet scale-out, direct liquid cooling; available 2026.
- AI250: Next-generation near-memory compute architecture with >10× effective memory bandwidth and lower power; available early 2027.
- AI100 Ultra: PCIe card offering up to 870 TOPS (INT8) and 288 TFLOPS (FP16) at 150 W TDP.
- AI Inference Suite: End-to-end software tools and agents for deploying AI inference on-prem or in cloud environments.
- Server CPU Roadmap: New data center CPU program in development to complement Qualcomm’s inference accelerators.
- Security: Built-in confidential computing for enterprise-grade AI protection.
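The efficiency pitch in the spec list above can be made concrete with simple arithmetic: the AI100 Ultra’s quoted 870 TOPS (INT8) at 150 W works out to roughly 5.8 TOPS per watt. A minimal sketch of that calculation:

```python
# Compute efficiency (TOPS/W) from the AI100 Ultra figures quoted above:
# 870 TOPS at INT8 precision, 150 W TDP.

def tops_per_watt(tops: float, watts: float) -> float:
    """Peak throughput per watt, the basic performance-per-watt metric."""
    return tops / watts

print(round(tops_per_watt(870, 150), 2), "TOPS/W (INT8)")  # 5.8
```

This per-watt figure, rather than raw peak TOPS, is the metric Qualcomm emphasizes when framing TCO for inference-dense racks.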
“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference,” said Durga Malladi, SVP & GM, Technology Planning, Edge Solutions & Data Center, Qualcomm Technologies. “These innovative new AI infrastructure solutions empower customers to deploy generative AI at unprecedented TCO, while maintaining the flexibility and security modern data centers demand.”
🌐 Analysis: Qualcomm’s expanded data center portfolio signals its intent to challenge incumbents like NVIDIA, AMD, and Intel in inference-optimized silicon. The Hexagon NPU’s efficiency advantage, coupled with LPDDR-based memory systems and near-memory computing, positions Qualcomm to deliver high throughput at far lower power budgets than GPU-centric designs. The AI250’s 10× effective bandwidth increase directly targets LLM bottlenecks in memory access, an area where emerging architectures such as AMD’s MI350X and Intel’s Gaudi3 are also competing. Qualcomm’s annual product cadence and investment in server CPUs suggest a broader pivot toward hyperscaler and enterprise AI workloads, establishing it as a credible new player in the AI data center market.
🌐 We’re tracking the latest developments in semiconductors and AI infrastructure. Follow our ongoing coverage at: https://convergedigest.com/category/semiconductors/







