AI Neocloud refers to a new breed of cloud compute provider focused on offering GPU compute rental.
The rise of the AI Neoclouds has captured the attention of the entire computing industry. Everyone, from startups to large enterprises, is using them for access to GPU compute. Even Microsoft is spending $200M+ a month on GPU compute through AI Neoclouds despite operating its own datacenters.
This GPU rental market stands as the most significant incremental driver of GPU demand. It is served primarily by two types of providers: hyperscalers and AI Neoclouds.
Traditional hyperscalers offering AI cloud services include Google Cloud, Microsoft Azure, AWS and Oracle. In contrast, Meta, xAI and Tesla, despite also having formidable GPU fleets and considerable capacity expansion plans, do not currently rent out GPU capacity as a cloud service, and thus do not fall into this group.
Hyperscalers enjoy the lowest cost of capital, and their integrated ecosystems, vast data lakes, and established enterprise customer bases allow them to charge premium prices and capture higher margins.
Interestingly, most listed data center companies, like Equinix, are structured as REITs (Real Estate Investment Trusts), meaning they own, operate and finance “income-generating real estate”. Thus, they are effectively real estate companies (leasing) rather than technology companies.
AI Neoclouds, unlike traditional hyperscalers, focus almost exclusively on GPU cloud services. They, of course, have a higher cost of capital than the hyperscalers. The largest AI Neocloud is CoreWeave (Nvidia is one of its investors), with a last valuation of $19B, $1.8B in equity, $10.6B in debt, and $2.0B in revenues (not bad for a 3-year-old company!).
Another interesting emerging business model that sits outside the above categories is VC Clusters, whereby a VC or a VC-like entity sets up clusters for the exclusive use of portfolio companies. Notable examples include Andromeda, AI2, Computefund.ai, and A16z. With in-house clusters, these VCs can provide very flexible options for compute rental in exchange for equity.
Overall, the economics powering the AI Neoclouds are still evolving as the market learns how their business models work, particularly when it comes to the expected depreciation and obsolescence lifetime of AI chips. Today, most AI Neoclouds rely heavily on Nvidia's H100 chips, with many having started operations in 2024. However, with Nvidia's upcoming release of the faster and more efficient Blackwell generation, the potential impact on rental pricing, and consequently on the IRR of the current H100 chip base, remains to be seen.
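To make the IRR sensitivity concrete, here is a minimal back-of-the-envelope sketch. All inputs (capex per GPU, rental rate, utilization, price-decline schedule, opex share) are hypothetical assumptions for illustration, not figures from this article; the point is only to show how quickly a steep price decline erodes returns on an installed H100 base.

```python
# Illustrative IRR for a GPU rental fleet under an assumed rental-price
# decline. All numeric inputs below are hypothetical assumptions.

def irr(cashflows, lo=-0.99, hi=10.0):
    """Discount rate where NPV = 0, found by bisection.
    Assumes one sign change (initial outflow, then inflows)."""
    def npv(r):
        return sum(cf / (1 + r) ** t for t, cf in enumerate(cashflows))
    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(mid) > 0:
            lo = mid  # NPV still positive: rate can go higher
        else:
            hi = mid
    return (lo + hi) / 2

CAPEX_PER_GPU = 35_000        # assumed all-in cost per H100 (server, networking)
HOURS_PER_YEAR = 8_760
UTILIZATION = 0.80            # assumed fleet utilization
START_RATE = 2.50             # assumed $/GPU-hour in year 1
ANNUAL_PRICE_DECLINE = 0.30   # assumed 30%/yr erosion as Blackwell ramps
OPEX_SHARE = 0.25             # assumed power/colo/ops as a share of revenue
LIFETIME_YEARS = 5

cashflows = [-CAPEX_PER_GPU]
for year in range(LIFETIME_YEARS):
    rate = START_RATE * (1 - ANNUAL_PRICE_DECLINE) ** year
    revenue = rate * HOURS_PER_YEAR * UTILIZATION
    cashflows.append(revenue * (1 - OPEX_SHARE))

print(f"Per-GPU IRR under these assumptions: {irr(cashflows):.1%}")
```

Under these particular assumptions the IRR lands in the low single digits; holding prices flat instead produces a far healthier return, which is exactly why the pace of Blackwell-driven price erosion matters so much to the sector.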
Conclusion
Most AI Neoclouds’ clients are currently focused on training, with typical contracts lasting 2-3 years. But the inference market is projected to be ten times larger. Token as a Service (TaaS) is likely to emerge as the winning model for commercializing inference, so AI Neoclouds will soon need to reinvent themselves. Exciting times ahead for the sector (not to mention energy supply)!