Qualcomm has revealed some details of its tilt at the AI datacenter market by announcing a pair of accelerators, plus rack-scale systems to house them, all focused on inferencing workloads.
The company offered scant technical details about its new AI200 and AI250 “chip-based accelerator cards”, saying only that the AI200 supports 768 GB of LPDDR memory per card, and the AI250 will offer an “innovative memory architecture based on near-memory computing” and represent “a generational leap in efficiency and performance for AI inference workloads by delivering greater than 10x higher effective memory bandwidth and much lower power consumption.”
Qualcomm will ship the cards in pre-configured racks that use “direct liquid cooling for thermal efficiency, PCIe for scale up, Ethernet for scale out, confidential computing for secure AI workloads, and a rack-level power consumption of 160 kW.”
In May, Qualcomm CEO Cristiano Amon offered somewhat cryptic statements that the company would only enter the AI datacenter market with “something unique and disruptive” and would use its expertise building CPUs to “think about clusters of inference that’s about high performance at very low power.”
However, the house of Snapdragon’s announcement makes no mention of CPUs. It does say its accelerators build on Qualcomm’s “NPU technology leadership” – surely a nod to the Hexagon-branded neural processing units it builds into processors for laptops and mobile devices.
Qualcomm’s most recent Hexagon NPU, which it baked into the Snapdragon 8 Elite SoC, includes 12 scalar accelerators and eight vector accelerators, and supports INT2, INT4, INT8, INT16, FP8, and FP16 precisions.
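Why those precisions matter comes down to simple arithmetic: the fewer bits per weight, the more model fits in a given amount of memory. A minimal sketch of the weight footprint at each precision the Hexagon NPU supports – the 8-billion-parameter model size here is our illustrative assumption, not anything Qualcomm has cited:

```python
# Illustrative only: bytes per parameter at each precision the Hexagon NPU
# supports, and what a hypothetical 8B-parameter model's weights would
# occupy at each. The 8B figure is an assumption, not a Qualcomm number.
PRECISION_BYTES = {
    "INT2": 0.25,
    "INT4": 0.5,
    "INT8": 1.0,
    "FP8":  1.0,
    "INT16": 2.0,
    "FP16": 2.0,
}

params = 8e9  # assumed 8-billion-parameter model

for name, nbytes in PRECISION_BYTES.items():
    gb = params * nbytes / 1e9  # weight footprint in GB (decimal)
    print(f"{name:>5}: {gb:6.1f} GB of weights")
```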
Perhaps the most eloquent clue in Qualcomm’s announcement is that its new AI products “offer rack-scale performance and superior memory capacity for fast generative AI inference at high performance per dollar per watt” and have a “low total cost of ownership.”
That verbiage addresses three pain points for AI operators.
One is the cost of the energy needed to power AI applications. Another is that high energy consumption produces a lot of heat, meaning datacenters need more cooling infrastructure – which also consumes energy and adds cost.
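The 160 kW rack figure makes those first two pain points easy to quantify. A back-of-envelope sketch, in which the electricity price and cooling overhead (PUE) are our own assumptions since Qualcomm offered neither:

```python
# Rough annual energy cost for one rack running flat out.
# Only the 160 kW figure comes from Qualcomm's announcement;
# the electricity price and PUE below are illustrative assumptions.
RACK_KW = 160          # from Qualcomm's announcement
HOURS_PER_YEAR = 8760
PRICE_PER_KWH = 0.08   # assumed industrial rate, $/kWh
PUE = 1.3              # assumed overhead for cooling and power delivery

kwh_per_year = RACK_KW * HOURS_PER_YEAR * PUE
cost_per_year = kwh_per_year * PRICE_PER_KWH
print(f"~{kwh_per_year / 1e6:.2f} GWh/year, ~${cost_per_year:,.0f}/year per rack")
```

Even on those modest assumptions, that’s roughly $145,000 a year per rack in electricity alone, before anyone pays for the hardware.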
The third is the quantity of memory available to accelerators, a factor that determines which models they can run – or how many models can run on a single accelerator.
The 768 GB of memory Qualcomm says it has packed into the AI200 is comfortably more than Nvidia or AMD offer in their flagship accelerators.
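That capacity matters because a model’s weights have to live somewhere. As a rough illustration, here’s the largest dense model whose weights alone would fit in 768 GB at a few precisions, ignoring KV cache and activations – the precision choices are ours, and only the 768 GB figure is Qualcomm’s:

```python
# Largest dense model whose weights alone fit in 768 GB, by precision.
# Ignores KV cache and activations, so real-world headroom is smaller.
CARD_GB = 768  # from Qualcomm's announcement

for name, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    max_params_b = CARD_GB / bytes_per_param  # billions of parameters
    print(f"{name:>4}: up to ~{max_params_b:,.0f}B parameters of weights")
```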
Qualcomm therefore appears to be suggesting its AI products can do more inferencing with fewer resources, a combination that will appeal to plenty of operators as (or if) adoption of AI workloads expands.
The house of Snapdragon also announced a customer for its new kit, namely Saudi AI outfit Humain, which “is targeting 200 megawatts starting in 2026 of Qualcomm AI200 and AI250 rack solutions to deliver high-performance AI inference services in the Kingdom of Saudi Arabia and globally.”
But Qualcomm says it expects the AI250 won’t be available until 2027. Humain’s announcement, like the rest of this news, is therefore hard to assess because it omits important details about exactly what Qualcomm has created and whether it will be truly competitive with other accelerators.
Also absent from Qualcomm’s announcement is any word on whether major hyperscalers have expressed interest in its kit, or whether it will be viable to run on-prem.
The announcement does, however, mark Qualcomm’s return to the datacenter after past forays focused on CPUs flopped. Investors clearly like the new move: the company’s share price popped 11 percent on Monday. ®