NVIDIA Amplifies Generative AI Training in MLPerf Benchmarks with Turbocharged Performance

November 9, 2023 by our News Team

NVIDIA has once again raised the bar for AI training and high-performance computing in the latest MLPerf industry benchmarks, setting new records in generative AI, text-to-image models, recommender models, computer vision models, and AI-assisted simulations on supercomputers.

  • NVIDIA Eos completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes.
  • NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion model in just 2.5 minutes.
  • The NVIDIA AI platform delivered gains of up to 16 times since the first MLPerf HPC round in 2019.

NVIDIA’s AI platform has once again raised the bar for AI training and high-performance computing in the latest MLPerf industry benchmarks. The standout achievement is in generative AI, where NVIDIA Eos, an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking, completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. This is a significant improvement on the previous record of 10.9 minutes, set less than six months ago.

The benchmark used a portion of the full GPT-3 data set behind the popular ChatGPT service. By extrapolation, Eos could now train on the entire data set in just eight days, 73 times faster than a previous state-of-the-art system using 512 A100 GPUs. This dramatic reduction in training time not only saves costs and energy but also speeds up time-to-market, making large language models more accessible for businesses.
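To put the figures above in perspective, here is a minimal sketch of the arithmetic behind them. It uses only the numbers quoted in this article; the function and variable names are illustrative, and the baseline duration is simply what the quoted "73 times faster" implies, not a measured result.

```python
def speedup(old_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return old_minutes / new_minutes


# GPT-3 benchmark: previous record vs. the new Eos result (minutes).
gpt3_speedup = speedup(10.9, 3.9)

# Extrapolated full training run: 8 days on Eos, quoted as 73 times
# faster than the 512-GPU A100 system, so the implied baseline is:
eos_days = 8
quoted_factor = 73
implied_baseline_days = eos_days * quoted_factor

print(f"Benchmark speedup over previous record: {gpt3_speedup:.2f}x")
print(f"Implied A100 baseline: {implied_baseline_days} days")
```

In other words, the new record is a little under three times faster than the previous one, and the quoted 73 times factor implies a baseline of well over a year for the older 512-GPU system.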

NVIDIA’s success in generative AI extends to other tests as well. In a new benchmark for text-to-image models, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion model in just 2.5 minutes, setting a high bar for this workload.

These achievements were made possible by the largest number of accelerators ever applied to an MLPerf benchmark. The 10,752 H100 GPUs delivered 2.8 times the performance of the previous round's submission, thanks in part to software optimizations. Efficient scaling is crucial in generative AI as language models continue to grow exponentially each year.
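Scaling efficiency is the ratio of the performance gain to the increase in GPU count. The sketch below shows the calculation; note that the previous round's GPU count (3,584) is an assumption that does not appear in this article, and the function name is illustrative.

```python
def scaling_efficiency(perf_gain: float, gpu_gain: float) -> float:
    """Fraction of ideal (linear) scaling achieved."""
    return perf_gain / gpu_gain


# Assumption: GPU count of the previous round's submission.
# This figure is NOT stated in the article.
PREVIOUS_ROUND_GPUS = 3584
CURRENT_GPUS = 10752  # from the article

gpu_gain = CURRENT_GPUS / PREVIOUS_ROUND_GPUS
efficiency = scaling_efficiency(2.8, gpu_gain)  # 2.8x is from the article

print(f"GPU count increase: {gpu_gain:.1f}x")
print(f"Scaling efficiency: {efficiency:.0%}")
```

Under that assumption, tripling the GPU count yielded 2.8 times the performance, which is close to linear scaling.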

Both Eos and Microsoft Azure employed 10,752 H100 GPUs in separate submissions, and their results were within 2% of each other. This demonstrates the efficiency of NVIDIA AI in both data center and public-cloud deployments.

NVIDIA’s advancements are not limited to generative AI. The company set new records in other benchmarks as well, such as recommender models and computer vision models. These improvements are a result of both software and hardware advancements.

NVIDIA remains the only company to run all MLPerf tests, showcasing the fastest performance and greatest scaling in each of the nine benchmarks. This translates to faster time-to-market, lower costs, and energy savings for users training massive language models or customizing them for their specific business needs.

Eleven system makers, including ASUS, Dell Technologies, Fujitsu, Gigabyte, Lenovo, QCT, and Supermicro, used the NVIDIA AI platform in their submissions this round. Their participation in MLPerf highlights the value of the benchmark for customers evaluating AI platforms and vendors.

In the MLPerf HPC benchmark for AI-assisted simulations on supercomputers, the H100 GPUs delivered up to twice the performance of NVIDIA A100 Tensor Core GPUs in the last round. This represents gains of up to 16 times since the first MLPerf HPC round in 2019. One notable test trained OpenFold, a model that predicts the 3D structure of a protein from its amino acid sequence. The H100 GPUs trained OpenFold in just 7.5 minutes, significantly faster than previous methods that took weeks or months.

The MLPerf benchmarks have gained widespread support from industry and academia since their inception in 2018. Organizations such as Amazon, ARM, Baidu, Google, Harvard, HPE, Intel, Lenovo, Meta, Microsoft, NVIDIA, Stanford University, and the University of Toronto back these tests. The transparency and objectivity of MLPerf make it a reliable resource for users to make informed buying decisions.

All the software used by NVIDIA in these benchmarks is available from the MLPerf repository, ensuring that developers can achieve similar results. NVIDIA continuously optimizes these software advancements, making them available on NGC, the company’s software hub for GPU applications.

About Our Team

Our team comprises industry insiders with extensive experience in computers, semiconductors, games, and consumer electronics. With decades of collective experience, we’re committed to delivering timely, accurate, and engaging news content to our readers.

Background Information

About ARM: ARM, originally known as Acorn RISC Machine, is a British semiconductor and software design company that specializes in creating energy-efficient microprocessors, system-on-chip (SoC) designs, and related technologies. Founded in 1990, ARM has become an important player in the global semiconductor industry and is widely recognized for its contributions to mobile computing, embedded systems, and Internet of Things (IoT) devices. ARM's microprocessor designs are based on the Reduced Instruction Set Computing (RISC) architecture, which prioritizes simplicity and efficiency in instruction execution. This approach has enabled ARM to produce highly efficient and power-saving processors that are used in a vast array of devices, ranging from smartphones and tablets to IoT devices, smart TVs, and more. The company does not manufacture its own chips but licenses its processor designs and intellectual property to a wide range of manufacturers, including Qualcomm, Apple, Samsung, and NVIDIA, who then integrate ARM's technology into their own SoCs. This licensing model has contributed to ARM's widespread adoption and influence across various industries.

About ASUS: ASUS, founded in 1989 by Ted Hsu, M.T. Liao, Wayne Hsieh, and T.H. Tung, has become a multinational tech giant known for its diverse hardware products. Spanning laptops, motherboards, graphics cards, and more, ASUS has gained recognition for its innovation and commitment to high-performance computing solutions. The company has a significant presence in gaming technology, producing popular products that cater to enthusiasts and professionals alike. With a focus on delivering reliable technology, ASUS maintains its position as an important player in the industry.

About Dell: Dell is a global technology leader providing comprehensive solutions in hardware, software, and services. Known for its customizable computers and enterprise solutions, Dell offers a diverse range of laptops, desktops, servers, and networking equipment. With a commitment to innovation and customer satisfaction, Dell caters to a wide range of consumer and business needs, making it an important player in the tech industry.

About Fujitsu: Fujitsu is an important Japanese technology company known for its wide array of computing solutions. With a history dating back to 1935, Fujitsu excels in producing personal computers, laptops, and tablets that combine innovation and reliability. In addition to consumer-focused products, Fujitsu is a key player in enterprise solutions, offering servers, storage systems, and data center services. The company's emphasis on quality, advanced features, and IT services has solidified its position as a significant player in the global computing industry.

About Gigabyte: Gigabyte Technology, an important player in the computer hardware industry, has established itself as a leading provider of innovative solutions and products catering to the ever-evolving needs of modern computing. With a strong emphasis on quality, performance, and technology, Gigabyte has gained recognition for its wide array of computer products. These encompass motherboards, graphics cards, laptops, desktop PCs, monitors, and other components that are integral to building high-performance systems. Known for their reliability and advanced features, Gigabyte's motherboards and graphics cards have become staples in the gaming and enthusiast communities, delivering the power and capabilities required for immersive gaming experiences and resource-intensive applications.

About Google: Google, founded by Larry Page and Sergey Brin in 1998, is a multinational technology company known for its internet-related services and products. Initially known for its search engine, Google has since expanded into various domains including online advertising, cloud computing, software development, and hardware devices. With its innovative approach, Google has introduced influential products such as Google Search, Android OS, Google Maps, and Google Drive. The company's commitment to research and development has led to advancements in artificial intelligence and machine learning.

About Intel: Intel Corporation, a global technology leader, is known for its semiconductor innovations that power computing and communication devices worldwide. As a pioneer in microprocessor technology, Intel has left an indelible mark on the evolution of computing with its processors that drive everything from PCs to data centers and beyond. With a history of advancements, Intel's relentless pursuit of innovation continues to shape the digital landscape, offering solutions that empower businesses and individuals to achieve new levels of productivity and connectivity.

About Lenovo: Lenovo, formerly known as "Legend," is an important global technology company that offers an extensive portfolio of computers, smartphones, servers, and electronic devices. Notably, Lenovo acquired IBM's personal computer division, including the ThinkPad line of laptops, in 2005. With a strong presence in laptops and PCs, Lenovo's products cater to a wide range of consumer and business needs. Committed to innovation and quality, Lenovo delivers reliable and high-performance solutions, making it a significant player in the tech industry.

About Microsoft: Microsoft, founded by Bill Gates and Paul Allen in 1975 in Albuquerque, New Mexico, and now headquartered in Redmond, Washington, USA, is a technology giant known for its wide range of software products, including the Windows operating system, Office productivity suite, and cloud services like Azure. Microsoft also manufactures hardware, such as the Surface line of laptops and tablets, Xbox gaming consoles, and accessories.

About NVIDIA: NVIDIA has firmly established itself as a leader in the realm of client computing, continuously pushing the boundaries of innovation in graphics and AI technologies. With a deep commitment to enhancing user experiences, NVIDIA's client computing business focuses on delivering solutions that power everything from gaming and creative workloads to enterprise applications. Known for its GeForce graphics cards, the company has redefined high-performance gaming, setting industry standards for realistic visuals, fluid frame rates, and immersive experiences. Complementing its gaming expertise, NVIDIA's Quadro and NVIDIA RTX graphics cards cater to professionals in design, content creation, and scientific fields, enabling real-time ray tracing and AI-driven workflows that elevate productivity and creativity. By integrating graphics, AI, and software, NVIDIA continues to shape the landscape of client computing in a rapidly evolving digital world.

About Supermicro: Supermicro is a reputable American technology company founded in 1993 and headquartered in San Jose, California. Specializing in high-performance server and storage solutions, Supermicro has become a trusted name in the data center industry. The company offers a wide range of innovative and customizable server hardware, including motherboards, servers, storage systems, and networking equipment, catering to the needs of enterprise clients, cloud service providers, and businesses seeking reliable infrastructure solutions.

Technology Explained

GPU: GPU stands for Graphics Processing Unit, a specialized type of processor designed to handle graphics-intensive tasks such as rendering images, videos, and 3D graphics. GPUs are used in gaming consoles, PCs, and mobile devices to provide a smooth and immersive gaming experience. They are also used in the medical field to create 3D models of organs and tissues, and in the automotive industry to create virtual prototypes of cars. Because they can process large amounts of data quickly and in parallel, GPUs have also become central to artificial intelligence, where they are used to train and run complex models.

Stable Diffusion: Stable Diffusion is a text-to-image generative AI model, first released in 2022. It is a latent diffusion model: it starts from random noise in a compressed (latent) representation and removes that noise step by step, guided by a text prompt, until an image emerges. Because its model weights were publicly released, it is widely used for image generation, image editing, and related creative tasks, and training it is one of the MLPerf benchmarks discussed above.
