New virtual machines for Microsoft Azure allow developers to create generative AI apps that can be scaled to work with thousands of Nvidia H100 GPUs.
The ND H100 v5 VM series on Azure pairs Nvidia's chips with Quantum-2 InfiniBand networking to boost the performance of large-scale AI deployments by companies such as OpenAI, creator of the much-talked-about ChatGPT.
The new supercomputing system in the cloud provides the type of infrastructure required to handle the latest large-scale AI training models, according to Matt Vegas, principal product manager for Azure high-performance computing (HPC) and AI at Microsoft.
“Generative AI applications are rapidly evolving and adding unique value across nearly every industry,” Vegas wrote in a blog post this week. “From reinventing search with a new AI-powered Microsoft Bing and Edge to AI-powered assistance in Microsoft Dynamics 365, AI is rapidly becoming a pervasive component of software and how we interact with it, and our AI Infrastructure will be there to pave the way.”
Version 5 of the ND H100, now in preview, includes eight H100 Tensor Core GPUs interconnected through NVSwitch and NVLink 4.0, with 3.6 TBps of bisectional bandwidth among the eight local GPUs; 400 Gbps of Nvidia Quantum-2 CX7 InfiniBand per GPU; 4th Gen Intel Xeon Scalable processors; and a PCIe Gen5 host-to-GPU interconnect at 64 GBps per GPU.
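Some quick arithmetic puts those figures in perspective. The sketch below is back-of-the-envelope only, assuming the per-GPU InfiniBand figure is quoted in gigabits per second (Gb/s), as Nvidia typically quotes network links, while the NVLink and PCIe figures are in bytes per second; none of these numbers are measured results.

```python
# Back-of-the-envelope arithmetic from the ND H100 v5 spec list above.
# Assumption: InfiniBand is quoted in gigabits/s; NVLink and PCIe in bytes/s.

GPUS_PER_VM = 8
IB_PER_GPU_GBITS = 400        # Quantum-2 CX7 InfiniBand, gigabits/s per GPU
NVLINK_TBYTES = 3.6           # TB/s shared among the eight local GPUs
PCIE_PER_GPU_GBYTES = 64      # PCIe Gen5 host-to-GPU, gigabytes/s per GPU

# Aggregate cross-node InfiniBand bandwidth per VM, in terabits/s
ib_per_vm_tbits = GPUS_PER_VM * IB_PER_GPU_GBITS / 1000
print(f"InfiniBand per VM: {ib_per_vm_tbits} Tb/s")

# NVLink share per GPU vs. the PCIe host link, both in gigabytes/s
nvlink_per_gpu_gbytes = NVLINK_TBYTES * 1000 / GPUS_PER_VM
print(f"NVLink per GPU: {nvlink_per_gpu_gbytes:.0f} GB/s "
      f"vs. PCIe: {PCIE_PER_GPU_GBYTES} GB/s")
```

In other words, each VM can push roughly 3.2 Tb/s across nodes, while GPU-to-GPU traffic inside the box enjoys about seven times the bandwidth of the host's PCIe link per GPU.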
While the system’s specs are impressive, the cost could put it within reach of only the largest enterprises.
“This is Nvidia’s highest-end GPU/AI configuration and will run things like ChatGPT well, or for banks dealing with thousands of customers they watch to make sure they are not committing fraud or are a bad risk,” said Jack Gold, president and principal analyst at J.Gold Associates. “But here’s the gotcha — the H100 is a very expensive system. If you want it to handle complex environments, it might cost you a million or more.”
Organizations most likely to purchase a fully loaded Nvidia system would be third-party developers and service providers.
Microsoft has not disclosed pricing for its new Azure offering. Other AI cloud services that provide large generative AI models, such as Azure OpenAI Service, offer pay-as-you-go consumption models, and users pay per unit for each model.
Nvidia has taken an active role in helping not just Microsoft but all of its hyperscale partners build their data centers for AI, said Ian Buck, vice president of hyperscale and HPC at Nvidia. “What we inevitably end up with is the partner’s data center with Nvidia’s brains.”
But while Nvidia partners with Microsoft, the chip giant also partners with Microsoft’s largest competitors, including AWS and Google, in putting together their respective AI supercomputers. Microsoft might be in a better competitive position with its recent AI-focused acquisitions and services.
“Microsoft has arrived at an advantageous position,” said Dan Newman, chief analyst at Futurum Research and CEO of The Futurum Group. “With new AI services, the implementation of ChatGPT [and] a sizeable investment in OpenAI, the company is moving fast — and with likely more announcements next week [at Nvidia’s GTC conference].”
A variety of Azure services will be available with the new VMs, including Azure Machine Learning, which makes Microsoft’s AI supercomputer available to users for model training, and the Azure OpenAI Service, which provides users with the capabilities of large-scale generative AI models, the company said.
Microsoft might have a technology edge now, according to some, but this latest partnership is likely to inspire the delivery of even more capable systems by both well-known and little-known competitors alike as the AI market continues to heat up.
“The competition has been roaring the last year or two, but with technology being offered like this, competitive offerings will go up to the next level,” Newman said.
As Editor at Large in TechTarget Editorial’s News Group, Ed Scannell is responsible for writing and reporting breaking news, news analysis and features focused on technology issues and trends affecting corporate IT professionals.