Alternative clouds are booming as companies seek cheaper access to GPUs

The appetite for alternative clouds has never been bigger.

Case in point: CoreWeave, the GPU infrastructure provider that began life as a cryptocurrency mining operation, this week raised $1.1 billion in new funding from investors including Coatue, Fidelity and Altimeter Capital. The round brings its valuation to $19 billion post-money, and its total raised to $5 billion in debt and equity — a remarkable figure for a company that’s less than ten years old.

It’s not just CoreWeave.

Lambda Labs, which also offers an array of cloud-hosted GPU instances, in early April secured a “special purpose financing vehicle” of up to $500 million, months after closing a $320 million Series C round. The nonprofit Voltage Park, backed by crypto billionaire Jed McCaleb, last October announced that it’s investing $500 million in GPU-backed data centers. And Together AI, a cloud GPU host that also conducts generative AI research, in March landed $106 million in a Salesforce-led round.

So why all the enthusiasm for — and cash pouring into — the alternative cloud space?

The answer, as you might expect, is generative AI.

As the generative AI boom continues, so does demand for the hardware needed to train and run generative AI models at scale. GPUs are architecturally the logical choice for training, fine-tuning and running models because they contain thousands of cores that can work in parallel on the linear algebra operations, chiefly matrix multiplications, that make up generative models.

But installing GPUs is expensive. So most devs and organizations turn to the cloud instead.

Incumbents in the cloud computing space — Amazon Web Services (AWS), Google Cloud and Microsoft Azure — offer no shortage of GPU and specialty hardware instances optimized for generative AI workloads. But for at least some models and projects, alternative clouds can end up being cheaper — and delivering better availability.

On CoreWeave, renting an Nvidia A100 40GB, one popular choice for model training and inference, costs $2.39 per hour, which works out to roughly $1,745 per month at about 730 hours of continuous use. On Azure, the same GPU costs $3.40 per hour, or $2,482 per month; on Google Cloud, it’s $3.67 per hour, or $2,682 per month.

Given that generative AI workloads usually run on clusters of GPUs, the cost deltas grow quickly.
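To make the scaling concrete, here is a minimal Python sketch that converts the hourly rates quoted above into monthly costs for clusters of different sizes. The 730-hour month is an assumption (8,760 hours per year divided by 12; it reproduces the Azure figure), as are the helper names `monthly_cost` and `monthly_delta` and the cluster sizes. Quoted monthly prices may reflect billing terms this sketch ignores, such as reserved or committed-use discounts.

```python
# Assumption: a month of continuous use is 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730

# Hourly A100 40GB rates in USD, as quoted in the article.
HOURLY_RATES = {
    "CoreWeave": 2.39,
    "Azure": 3.40,
    "Google Cloud": 3.67,
}


def monthly_cost(hourly_rate, gpus=1, hours=HOURS_PER_MONTH):
    """Cost in USD of running `gpus` GPUs continuously for `hours` hours."""
    return hourly_rate * gpus * hours


def monthly_delta(provider, baseline, gpus=1):
    """How much more `provider` costs per month than `baseline`, in USD."""
    return (monthly_cost(HOURLY_RATES[provider], gpus)
            - monthly_cost(HOURLY_RATES[baseline], gpus))


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only for illustration.
    for gpus in (1, 8, 64):
        for name, rate in HOURLY_RATES.items():
            cost = monthly_cost(rate, gpus)
            print(f"{gpus:>3} GPUs on {name:<12}: ${cost:>12,.2f} per month")
```

Under these assumptions, the gap between Azure and CoreWeave is about $737 per month for one GPU and about $5,900 per month for an eight-GPU node, since the difference scales linearly with the number of GPUs.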

“Companies like CoreWeave participate in a market we call specialty ‘GPU as a service’ cloud providers,” Sid Nag, VP of cloud services and technologies at Gartner, told TechCrunch. “Given the high demand for GPUs, they offer an alternate to the hyperscalers, where they’ve taken Nvidia GPUs and provided another route to market and access to those GPUs.”

Nag points out that even some big tech firms have begun to lean on alternative cloud providers as they run up against compute capacity challenges.

Last June, CNBC reported that Microsoft had signed a multi-billion-dollar deal with CoreWeave to ensure that OpenAI, the maker of ChatGPT and a close Microsoft partner, would have adequate compute power to train its generative AI models. Nvidia, the furnisher of the bulk of CoreWeave’s chips, sees this as a desirable trend, perhaps for leverage reasons; it’s said to have given some alternative cloud providers preferential access to its GPUs.

Lee Sustar, principal analyst at Forrester, sees cloud vendors like CoreWeave succeeding in part because they don’t have the infrastructure “baggage” that incumbent providers have to deal with.

“Given hyperscaler dominance of the overall public cloud market, which demands vast investments in infrastructure and range of services that make little or no revenue, challengers like CoreWeave have an opportunity to succeed with a focus on premium AI services without the burden of hyperscaler-level investments overall,” he said.

But is this growth sustainable?

Sustar has his doubts. He believes that alternative cloud providers’ expansion will be conditioned by whether they can continue to bring GPUs online in high volume, and offer them at competitively low prices.

Competing on pricing might become challenging down the line as incumbents like Google, Microsoft and AWS ramp up investments in custom hardware to run and train models. Google offers its TPUs; Microsoft recently unveiled two custom chips, Azure Maia and Azure Cobalt; and AWS has Trainium, Inferentia and Graviton.

“Hyperscalers will leverage their custom silicon to mitigate their dependencies on Nvidia, while Nvidia will look to CoreWeave and other GPU-centric AI clouds,” Sustar said.

Then there’s the fact that, while many generative AI workloads run best on GPUs, not all workloads need them, particularly if they aren’t time-sensitive. CPUs can run the necessary calculations, but typically more slowly than GPUs and custom hardware.

More existentially, there’s a threat that the generative AI bubble will burst, which would leave providers with mounds of GPUs and not nearly enough customers demanding them. But the future looks rosy in the short term, say Sustar and Nag, both of whom are expecting a steady stream of upstart clouds.

“GPU-oriented cloud startups will give [incumbents] plenty of competition, especially among customers who are already multi-cloud and can handle the complexity of management, security, risk and compliance across multiple clouds,” Sustar said. “Those sorts of cloud customers are comfortable trying out a new AI cloud if it has credible leadership, solid financial backing and GPUs with no wait times.”