GPU makers Nvidia and AMD may dominate the AI infrastructure market, but there are still plenty of AI chip startups knocking around.
One of them is Rebellions, which, after establishing a foothold on its home turf in South Korea, aims to bring its tech to the rest of the world, beginning with a new rack-scale compute platform that won't require enterprises to adopt liquid cooling or ultra-power-dense racks.
Founded in late 2020, the startup produces AI accelerators that have been deployed across a number of applications in the South Korean domestic market.
Initially, "we focused a great deal on telcos, service providers, and enterprise end users within the Korean market," Rebellions chief business officer Marshall Choy told El Reg. "We built up use cases around everything from call centers and customer service to CCTV surveillance for the national highway system."
"We're in a really strong position to take those learnings, capabilities, and improvements we've made over the years and bring that out to other regions, outside of Korea, as less of a fresh start and more of a rinse-and-repeat type of motion," he added.
Following the introduction of its Rebel Quad accelerators, since rebranded as the Rebel100, the company has turned its attention to the rest of the world. Over the past few months, Rebellions has opened offices in Japan, Saudi Arabia, Taiwan, and the US, where it hopes to win over enterprises with its new RebelRack and RebelPods.
Before looking at the racks, let's talk about the chips themselves. Our sibling site The Next Platform dug into the Rebel100 last winter, but at a high level, the chip looks quite similar to Nvidia's H200 accelerators from late 2023.
According to Rebellions, the processor is capable of a petaFLOP of dense 16-bit floating point math, or double that at FP8. However, unlike the H200, which used a monolithic compute die fabbed at TSMC, Rebellions' latest processor uses a chiplet architecture with four compute dies manufactured and packaged by Samsung.
That processor is fed by four HBM3e stacks totaling 144 GB of capacity and 4.8 TB/s of aggregate bandwidth.
While the smaller compute dies and reliance on Samsung should not only help with yields but also avoid competing for TSMC's limited fab and packaging capacity, Rebellions still needs to source HBM from somewhere. Memory is already in short supply, and HBM is among the scarcest.
This is where being a South Korean company with close ties to both the SK chaebol and Samsung comes in handy. SK Hynix and Samsung are the largest suppliers of HBM in the world. Last we heard, Rebellions was sourcing its HBM from Samsung, but in a pinch it shouldn't have to fight that hard to get SK Hynix to kick in some capacity.
The chip itself is currently packaged as a PCIe card with a 600-watt TDP, rather than the OAM or SXM modules we've become accustomed to.
Rebellions' reference design calls for eight of these cards to be crammed into a single air-cooled node.
Standard form factors, such as a 19-inch chassis, and air cooling were key design points for Rebellions, since they mean the system can be deployed into existing enterprise datacenters, something that can't be said of Nvidia's latest generation of liquid-cooled Rubin GPUs.
The RebelRack will feature four of these nodes, each linked via quad 400 Gbps networking, for a total of 32 accelerators and 64 petaFLOPS of FP8 compute, 4.6 TB of HBM3e, and 153.6 TB/s of aggregate memory bandwidth.
For larger deployments, Rebellions is also developing what it calls the RebelPod, which can scale from eight to 128 nodes, each with eight Rebel100 accelerators, interconnected using 800 Gbps Ethernet.
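Those figures hang together arithmetically. Here's a quick back-of-the-envelope check in Python, using only the per-card specs quoted above; note the power figure counts accelerators only and ignores host CPUs, memory, and fans.

```python
# Back-of-the-envelope check of Rebellions' published figures,
# derived from the per-card specs quoted above. Illustrative only.

CARD_FP8_PFLOPS = 2.0   # 1 petaFLOP dense FP16, doubled at FP8
CARD_HBM_GB = 144       # four HBM3e stacks per card
CARD_HBM_TBPS = 4.8     # aggregate memory bandwidth per card
CARD_TDP_W = 600        # PCIe card TDP

CARDS_PER_NODE = 8      # eight cards per air-cooled node
NODES_PER_RACK = 4      # four nodes per RebelRack

accelerators = CARDS_PER_NODE * NODES_PER_RACK        # 32
fp8_pflops = accelerators * CARD_FP8_PFLOPS           # 64 petaFLOPS
hbm_tb = accelerators * CARD_HBM_GB / 1000            # ~4.6 TB
hbm_bw_tbps = accelerators * CARD_HBM_TBPS            # 153.6 TB/s
node_power_kw = CARDS_PER_NODE * CARD_TDP_W / 1000    # 4.8 kW, accelerators only

print(f"accelerators per rack: {accelerators}")
print(f"FP8 compute:           {fp8_pflops:.0f} petaFLOPS")
print(f"HBM3e capacity:        {hbm_tb:.1f} TB")
print(f"memory bandwidth:      {hbm_bw_tbps:.1f} TB/s")
print(f"accelerator power:     {node_power_kw:.1f} kW per node")

# At the RebelPod's top end, 128 such nodes would house 1,024 accelerators.
print(f"max pod accelerators:  {128 * CARDS_PER_NODE}")
```

At roughly 4.8 kW of accelerator power per node, a four-node rack stays in territory a conventional air-cooled enterprise datacenter can plausibly handle, which is the whole point of the design.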
"Right now, people think rack level. I think we'll be thinking, a few days from now, about row level and datacenter level," Choy said.
Compared to GPU systems, this isn't a lot of networking. Quad 400 Gbps links per eight-accelerator node work out to roughly 200 Gbps per accelerator, while most HGX systems now feature at least one 800 Gbps NIC per GPU. Choy tells us that, going forward, the network fabric is going to be a major focus for the company.
As we've seen with other rack-scale systems from AMD and Nvidia, compute and networking are only two pieces of the puzzle; you also need software that can stitch everything together cohesively.
Rebellions' software stack is nothing exotic. We're told the platform runs on open source frameworks like vLLM, PyTorch, and Triton. For disaggregated inference, it's using llm-d, another open source framework that allows compute-heavy prefill operations to run on one set of accelerators and memory-bandwidth-heavy decode operations on another.
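The prefill/decode split is easier to see in code. The sketch below is purely conceptual: the worker classes, method names, and toy "model" are hypothetical stand-ins, not llm-d's or vLLM's actual APIs. It only shows the shape of the routing: one pool of accelerators builds the KV cache, another streams tokens from it.

```python
# Conceptual sketch of disaggregated inference. All names here are
# hypothetical illustrations, not llm-d's real API.

from dataclasses import dataclass


@dataclass
class KVCache:
    """Stand-in for the key/value cache that prefill hands off to decode."""
    tokens: list[int]


class PrefillWorker:
    # Prefill processes the entire prompt in one parallel pass, which is
    # why it suits compute-heavy accelerators.
    def prefill(self, prompt_tokens: list[int]) -> KVCache:
        return KVCache(tokens=list(prompt_tokens))


class DecodeWorker:
    # Decode emits one token per step, re-reading the growing KV cache
    # each time, which is why it suits memory-bandwidth-heavy accelerators.
    def decode(self, cache: KVCache, max_new_tokens: int) -> list[int]:
        output = []
        for _ in range(max_new_tokens):
            next_token = sum(cache.tokens) % 50_000  # toy stand-in for a forward pass
            cache.tokens.append(next_token)
            output.append(next_token)
        return output


# Route a request across the two pools: prefill on one set of
# accelerators, decode on another.
prompt = [101, 2023, 2003, 1037, 3231, 102]
cache = PrefillWorker().prefill(prompt)
print(DecodeWorker().decode(cache, max_new_tokens=4))
```

In a real deployment, the framework also has to ship that KV cache between the two accelerator pools, which is exactly why the network fabric Choy mentions matters so much for disaggregated serving.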
"Everything's open source, from the vLLM compiler all the way up to the very highest level of the stack, Red Hat, OpenShift, and everything in between," Choy said. "If you've used any of these technologies in any other context, you already know how to use Rebellions."
We've heard similar claims from chipmakers before that haven't ended up being quite so easy in practice. However, Rebellions is a member of the PyTorch Foundation, something that can't be said of many AI chip startups.
Of course, none of this is cheap, but Rebellions isn't hurting for cash. On Monday, the startup raised $400 million in a pre-IPO funding round led by Mirae Asset Financial Group and the Korea National Growth Fund, both to support its expansion westward and to fund the development of more capable and efficient AI accelerators and systems.
According to recent reports, the company could file for an IPO as soon as this year or early next year. ®