At Microsoft Build, AMD and Microsoft detailed how AMD Instinct MI300X accelerators and the ROCm open software stack are reaching Azure customers through the now generally available Azure ND MI300X virtual machines, alongside AMD Ryzen AI processors for on-device AI deployments and Alveo MA35D media accelerators for live streaming video workloads.
- Impressive performance and efficiency for demanding AI workloads
- General availability of Microsoft Azure ND MI300X virtual machines
- Collaboration between Microsoft, AMD, and Hugging Face on the ROCm open software ecosystem
At the Microsoft Build event, AMD showcased its latest compute and software capabilities for Microsoft customers and developers. The company highlighted the use of AMD solutions such as Instinct MI300X accelerators, ROCm open software, Ryzen AI processors and software, and Alveo MA35D media accelerators to provide powerful tools for AI-based deployments across various markets.
One notable announcement was the general availability of Microsoft Azure ND MI300X virtual machines (VMs), which deliver strong performance and efficiency for demanding AI workloads. Hugging Face, a Microsoft customer, now has access to these VMs to run its AI workloads.
“The AMD Instinct MI300X and ROCm software stack is powering the Azure OpenAI Chat GPT 3.5 and 4 services, which are some of the world’s most demanding AI workloads,” said Victor Peng, president of AMD. “With the general availability of the new VMs from Azure, AI customers have broader access to MI300X to deliver high-performance and efficient solutions for AI applications.”
Microsoft’s Chief Technology Officer and Executive Vice President of AI, Kevin Scott, expressed excitement about the collaboration between Microsoft and AMD. He emphasized the importance of coupling powerful compute hardware with system and software optimization to deliver exceptional AI performance and value. Through the use of ROCm and MI300X, Microsoft aims to empower its AI customers and developers to achieve excellent price-performance results for advanced and compute-intensive models.
The Azure ND MI300X v5 VM series, previously announced in preview in November 2023, is now available in the Canada Central region for running AI workloads. These VMs offer industry-leading HBM capacity and memory bandwidth, letting customers fit larger models in GPU memory or use fewer GPUs, which saves power, cost, and time.
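As a rough illustration of why HBM capacity matters for model placement, the sketch below estimates how many GPUs are needed just to hold a model's weights at a given precision. The 192 GB figure reflects the MI300X's published HBM3 capacity per GPU; the overhead factor and example model sizes are illustrative assumptions, and real deployments also need memory for the KV cache, activations, and runtime buffers.

```python
import math

def gpus_needed(params_billion: float, bytes_per_param: float,
                hbm_gb: float = 192.0, overhead: float = 1.2) -> int:
    """Rough count of GPUs required just to hold the model weights.

    params_billion : model size in billions of parameters
    bytes_per_param: 2 for FP16/BF16, 1 for INT8/FP8
    hbm_gb         : per-GPU HBM capacity (192 GB is the MI300X spec)
    overhead       : illustrative fudge factor for runtime buffers
    """
    weight_gb = params_billion * bytes_per_param  # 1e9 params * N bytes is roughly N GB
    return max(1, math.ceil(weight_gb * overhead / hbm_gb))

# Illustrative model sizes, not benchmark results
for size in (70, 180, 540):
    print(f"{size}B params @ FP16 -> ~{gpus_needed(size, 2)} GPU(s)")
```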
Hugging Face, one of the first customers to use these VMs, ported its models to the ND MI300X VMs within a month and achieved strong performance and price/performance, demonstrating how readily NLP applications built on Hugging Face models can be deployed on the new VMs.
Julien Simon, Chief Evangelist Officer at Hugging Face, praised the collaboration between Microsoft, AMD, and Hugging Face on the ROCm open software ecosystem. He highlighted how this collaboration enables Hugging Face users to run a vast number of AI models available on the Hugging Face Hub on Azure with AMD Instinct GPUs without code changes. This makes it easier for Azure customers to build AI using open models and open source.
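To illustrate the "no code changes" point: ROCm builds of PyTorch expose AMD Instinct GPUs through the same device interface used elsewhere, so a standard Hugging Face pipeline runs unmodified. This is a minimal sketch, not the exact setup Hugging Face or Azure uses, and the model name is just a small example from the Hugging Face Hub.

```python
# Minimal sketch: the same code runs on a ROCm build of PyTorch
# (AMD Instinct GPUs appear as "cuda" devices) and on CUDA systems.
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1  # first GPU if present, else CPU

generator = pipeline(
    "text-generation",
    model="gpt2",          # example model from the Hugging Face Hub
    device=device,
)

print(generator("Azure ND MI300X VMs are", max_new_tokens=30)[0]["generated_text"])
```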
Developers can also leverage AMD Ryzen AI software to optimize and deploy AI inference on AMD Ryzen AI-powered PCs. The Ryzen AI software lets applications run on the neural processing unit (NPU) built on the AMD XDNA architecture, the first dedicated AI processing silicon on a Windows x86 processor. Offloading AI models to the embedded NPU frees CPU and GPU resources for other compute tasks, extends battery life, and lets developers run on-device LLM workloads and concurrent applications efficiently and locally.
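As a rough sketch of what targeting the NPU looks like in practice: the Ryzen AI software stack builds on ONNX Runtime, where work is dispatched through an execution provider. The provider option key and file names below are assumptions drawn from that pattern rather than a verified recipe, and the CPU fallback keeps the sketch runnable on machines without an NPU.

```python
# Hedged sketch: asking ONNX Runtime for the Vitis AI execution provider that
# the Ryzen AI software stack builds on. The option key and file paths are
# assumptions; on a machine without the Ryzen AI build of ONNX Runtime, keep
# only "CPUExecutionProvider" in the providers list.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # hypothetical quantized model exported for the NPU
    providers=[
        ("VitisAIExecutionProvider", {"config_file": "vaip_config.json"}),  # NPU path (assumed option key)
        "CPUExecutionProvider",  # fallback when the NPU path is unavailable
    ],
)

inp = session.get_inputs()[0]
# Replace dynamic dimensions with 1 to build a dummy input for a smoke test.
dummy = np.zeros([d if isinstance(d, int) else 1 for d in inp.shape], dtype=np.float32)
print(session.run(None, {inp.name: dummy})[0].shape)
```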
In addition to AI advancements, Microsoft has chosen the AMD Alveo MA35D media accelerator to power its live streaming video workloads, including Microsoft Teams and SharePoint video. The Alveo MA35D is purpose-built to handle live interactive streaming services at scale, streamlining video processing tasks such as transcoding, decoding, encoding, and adaptive bitrate streaming. By using the Alveo MA35D accelerator in servers powered by 4th Gen AMD EPYC processors, Microsoft can consolidate servers and cloud infrastructure while delivering impressive performance and future-ready AV1 technology.
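For readers unfamiliar with adaptive bitrate streaming, the sketch below shows the general idea: one source is transcoded into several renditions at different resolutions and bitrates so players can switch based on network conditions. It uses stock ffmpeg software encoders and is not the MA35D toolchain, which has its own SDK; the file names, resolutions, and bitrates are illustrative assumptions.

```python
# Generic ABR-ladder sketch using stock ffmpeg software encoders (libx264/aac).
# This is NOT the Alveo MA35D workflow; it only illustrates the concept.
import subprocess

LADDER = [  # (height, video bitrate, audio bitrate) - illustrative values
    (1080, "6000k", "192k"),
    (720,  "3000k", "128k"),
    (480,  "1500k", "96k"),
]

for height, v_bitrate, a_bitrate in LADDER:
    subprocess.run(
        [
            "ffmpeg", "-y", "-i", "source.mp4",   # hypothetical input file
            "-vf", f"scale=-2:{height}",          # scale to target height, keep aspect ratio
            "-c:v", "libx264", "-b:v", v_bitrate,
            "-c:a", "aac", "-b:a", a_bitrate,
            f"rendition_{height}p.mp4",
        ],
        check=True,
    )
```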
The 4th Gen AMD EPYC processors currently power a range of VMs at Azure, including general-purpose, memory-intensive, compute-optimized, and accelerated compute VMs. These VMs offer up to 20% better performance for general-purpose and memory-intensive workloads, improved price/performance, and up to 2x the CPU performance for compute-optimized workloads compared to the previous generation of AMD EPYC processor-powered VMs at Azure. The Dalsv6, Dasv6, Easv6, Falsv6, and Famsv6 VM-series are currently in preview and will become generally available in the coming months.
Background Information
About AMD:
AMD, a major player in the semiconductor industry, is known for its powerful processors and graphics solutions, and has consistently pushed the boundaries of performance, efficiency, and user experience. With a customer-centric approach, the company has cultivated a reputation for delivering high-performance solutions that cater to the needs of gamers, professionals, and general users. AMD's Ryzen series of processors has redefined the landscape of desktop and laptop computing, offering impressive multi-core performance and competitive pricing that has challenged the dominance of its competitors. Complementing its processor expertise, AMD's Radeon graphics cards have earned accolades for their efficiency and exceptional graphical capabilities, making them a favored choice among gamers and content creators. The company's commitment to innovation continues to shape the client computing landscape, providing users with powerful tools to fuel their digital endeavors.
About Microsoft:
Microsoft, founded by Bill Gates and Paul Allen in 1975 and now headquartered in Redmond, Washington, USA, is a technology giant known for its wide range of software products, including the Windows operating system, Office productivity suite, and cloud services like Azure. Microsoft also manufactures hardware, such as the Surface line of laptops and tablets, Xbox gaming consoles, and accessories.
Technology Explained
CPU: The Central Processing Unit (CPU) is the brain of a computer, responsible for executing instructions, performing calculations, and controlling all other components of the system. CPUs appear in a wide range of devices, from desktop computers and mobile devices to gaming consoles and supercomputers. They process data, control the flow of information within a system, manage input and output, and store and retrieve data from memory, making them essential to the functioning of any computer system.
EPYC: EPYC is a processor family designed by AMD for the server and data center industry. Introduced in June 2017, the first generation was built on a 14nm process and offered up to 32 high-performance cores in a single socket; subsequent generations have pushed core counts, memory bandwidth, and compute density considerably further. EPYC is now widely used in data centers and cloud computing, providing greater scalability, increased resource efficiency, and advanced virtualization capabilities. It also powers data-intensive deployments such as server farms, gaming, and virtualization platforms, keeping power consumption and performance optimized even in large multi-processor environments.
GPU: GPU stands for Graphics Processing Unit and is a specialized type of processor designed to handle graphics-intensive tasks. It is used in the computer industry to render images, videos, and 3D graphics. GPUs are used in gaming consoles, PCs, and mobile devices to provide a smooth and immersive gaming experience. They are also used in the medical field to create 3D models of organs and tissues, and in the automotive industry to create virtual prototypes of cars. GPUs are also used in the field of artificial intelligence to process large amounts of data and create complex models. GPUs are becoming increasingly important in the computer industry as they are able to process large amounts of data quickly and efficiently.
LLM: A Large Language Model (LLM) is a highly advanced artificial intelligence system, typically built on transformer architectures such as the one behind GPT-3.5, designed to comprehend and produce human-like text at scale. LLMs are capable across a wide range of natural language understanding and generation tasks, including answering questions, generating creative content, and delivering context-aware responses to textual inputs. These models are trained on vast datasets to grasp the nuances of language, making them valuable tools for applications like chatbots, content generation, and language translation.
NPU: NPU, or Neural Processing Unit, is a type of specialized processor that is designed to handle complex artificial intelligence tasks. It is inspired by the structure and function of the human brain, with the ability to process and analyze large amounts of data simultaneously. In the computer industry, NPU technology is being used in various applications such as speech recognition, image and video processing, and natural language processing. This allows computers to perform tasks that were previously only possible for humans, making them more efficient and intelligent. NPU technology is also being integrated into smartphones, self-driving cars, and other devices, making them smarter and more responsive to user needs. With the increasing demand for AI-driven technology, the use of NPU is expected to grow and revolutionize the way we interact with computers in the future.