NVIDIA GH200 Superchip Impresses with MLPerf Inference Benchmarks Achievement

NVIDIA's GH200 Grace Hopper Superchip and H100 GPUs have delivered impressive performance on the MLPerf industry benchmarks, showcasing their exceptional versatility and power in data center inference tests. Additionally, NVIDIA's TensorRT-LLM software and L4 GPUs have demonstrated their superiority across a range of AI workloads, while the Jetson Orin system-on-module has exhibited up to an 84% performance increase in object detection. With support from over 70 organizations, users can trust that they will receive dependable and flexible performance from NVIDIA's full-stack AI platform.

NVIDIA's GH200 Grace Hopper Superchip has made an impressive debut on the MLPerf industry benchmarks, showcasing its exceptional performance and versatility in data center inference tests.
NVIDIA's HGX H100 systems, equipped with eight H100 GPUs, delivered the highest throughput on every MLPerf Inference test in this round.
NVIDIA has developed TensorRT-LLM, generative AI software that optimizes inference, allowing customers to more than double the inference performance of their H100 GPUs at no additional cost.

NVIDIA’s GH200 Grace Hopper Superchip has made an impressive debut on the MLPerf industry benchmarks, showcasing its exceptional performance and versatility in data center inference tests. The superchip combines a Hopper GPU with a Grace CPU, providing increased memory, bandwidth, and the ability to optimize performance by automatically shifting power between the CPU and GPU. Additionally, NVIDIA’s HGX H100 systems, equipped with eight H100 GPUs, delivered the highest throughput on every MLPerf Inference test in this round.

The GH200 Superchips and H100 GPUs demonstrated their superiority across a range of MLPerf’s data center tests, including computer vision, speech recognition, medical imaging, recommendation systems, and large language models (LLMs) used in generative AI. These results continue to solidify NVIDIA’s position as a leader in AI training and inference since the launch of the MLPerf benchmarks in 2018.

To further enhance inference performance, NVIDIA has developed TensorRT-LLM, generative AI software that optimizes inference. Although it was not ready in time for the August MLPerf submission, this open-source library allows customers to more than double the inference performance of their H100 GPUs at no additional cost. NVIDIA's internal tests have shown up to an 8x speedup when running GPT-J 6B on H100 GPUs with TensorRT-LLM, compared to prior-generation GPUs running the same model without the software.

NVIDIA’s continuous innovation on its full-stack AI platform is exemplified by TensorRT-LLM and other software advancements that provide users with growing performance over time without extra costs. These advances are versatile across diverse AI workloads and have been integrated by leading companies such as Meta, AnyScale, Cohere, Deci, Grammarly, Mistral AI, MosaicML (now part of Databricks), OctoML, Tabnine, and Together AI.

In the latest MLPerf benchmarks, NVIDIA’s L4 GPUs demonstrated excellent performance across various workloads. L4 GPUs running in compact 72 W PCIe accelerators delivered up to 6x more performance than CPUs rated for nearly 5x higher power consumption. The dedicated media engines in L4 GPUs, combined with CUDA software, provided up to 120x speedups for computer vision in NVIDIA’s tests. These GPUs are available from Google Cloud and other system builders, catering to customers in industries ranging from consumer internet services to drug discovery.
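The two L4 ratios quoted above can be combined into a rough performance-per-watt comparison. The sketch below is illustrative arithmetic only: it takes the "up to 6x" and "nearly 5x" figures at face value as upper bounds, and the function name is hypothetical rather than part of any NVIDIA or MLPerf tool.

```python
def perf_per_watt_ratio(perf_ratio: float, power_ratio: float) -> float:
    """Relative performance per watt of device A versus device B.

    perf_ratio:  A's throughput divided by B's throughput.
    power_ratio: B's rated power divided by A's rated power.
    """
    if perf_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio * power_ratio


# Figures quoted in the article, treated as upper bounds (assumption).
l4_vs_cpu = perf_per_watt_ratio(perf_ratio=6.0, power_ratio=5.0)
print(f"Up to ~{l4_vs_cpu:.0f}x performance per watt")
```

Because both inputs are "up to" or "nearly" figures, the result is a ceiling rather than a measured efficiency number.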

NVIDIA also showcased its performance boosts at the edge with a new model compression technology that achieved up to a 4.7x performance boost when running the BERT LLM on an L4 GPU. This technique, demonstrated in MLPerf’s “open division,” is expected to benefit all AI workloads, particularly those running on edge devices with size and power consumption constraints. Additionally, the NVIDIA Jetson Orin system-on-module exhibited up to an 84% performance increase in object detection compared to the previous round, thanks to software optimizations leveraging the latest version of the chip’s cores.
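The results above mix two ways of reporting gains: a multiplier (4.7x) and a percentage increase (84%). A minimal sketch for converting between them, using hypothetical helper names, makes the two figures directly comparable.

```python
def percent_increase_to_multiplier(percent: float) -> float:
    """An N% increase means the new value is (1 + N/100) times the old one."""
    return 1.0 + percent / 100.0


def multiplier_to_percent_increase(multiplier: float) -> float:
    """An Nx result means the new value is (N - 1) * 100 percent higher."""
    return (multiplier - 1.0) * 100.0


# Figures quoted in the article.
print(f"84% increase = {percent_increase_to_multiplier(84):.2f}x")
print(f"4.7x boost   = {multiplier_to_percent_increase(4.7):.0f}% increase")
```

Expressed the same way, the Jetson Orin gain is a 1.84x multiplier and the L4 compression result is a 370% increase.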

The MLPerf benchmarks provide transparent and objective results, enabling users to make informed buying decisions. With participation from cloud service providers like Microsoft Azure and Oracle Cloud Infrastructure, as well as system manufacturers such as ASUS, Dell Technologies, Lenovo, and Supermicro, users can trust that they will receive dependable and flexible performance. MLPerf is supported by over 70 organizations, including industry giants like Alibaba, ARM, Cisco, Google, Intel, and Microsoft.

All the software used in NVIDIA’s benchmarks is available from the MLPerf repository, so anyone can reproduce the same results. These optimizations are continuously integrated into containers available on the NVIDIA NGC software hub for GPU applications. For more detailed information on NVIDIA’s latest results, readers can refer to NVIDIA’s technical blog.

Background Information

About ARM: ARM, originally known as Acorn RISC Machine, is a British semiconductor and software design company that specializes in creating energy-efficient microprocessors, system-on-chip (SoC) designs, and related technologies. Founded in 1990, ARM has become an important player in the global semiconductor industry and is widely recognized for its contributions to mobile computing, embedded systems, and Internet of Things (IoT) devices. ARM's microprocessor designs are based on the Reduced Instruction Set Computing (RISC) architecture, which prioritizes simplicity and efficiency in instruction execution. This approach has enabled ARM to produce highly efficient and power-saving processors that are used in a vast array of devices, ranging from smartphones and tablets to IoT devices, smart TVs, and more. The company does not manufacture its own chips but licenses its processor designs and intellectual property to a wide range of manufacturers, including Qualcomm, Apple, Samsung, and NVIDIA, who then integrate ARM's technology into their own SoCs. This licensing model has contributed to ARM's widespread adoption and influence across various industries.

About ASUS: ASUS, founded in 1989 by Ted Hsu, M.T. Liao, Wayne Hsieh, and T.H. Tung, has become a multinational tech giant known for its diverse hardware products. Spanning laptops, motherboards, graphics cards, and more, ASUS has gained recognition for its innovation and commitment to high-performance computing solutions. The company has a significant presence in gaming technology, producing popular products that cater to enthusiasts and professionals alike. With a focus on delivering reliable technology, ASUS maintains its position as an important player in the industry.

About Dell: Dell is a global technology leader providing comprehensive solutions in hardware, software, and services. Known for its customizable computers and enterprise solutions, Dell offers a diverse range of laptops, desktops, servers, and networking equipment. With a commitment to innovation and customer satisfaction, Dell caters to a wide range of consumer and business needs, making it an important player in the tech industry.

About Google: Google, founded by Larry Page and Sergey Brin in 1998, is a multinational technology company known for its internet-related services and products. Initially known for its search engine, Google has since expanded into various domains including online advertising, cloud computing, software development, and hardware devices. With its innovative approach, Google has introduced influential products such as Google Search, Android OS, Google Maps, and Google Drive. The company's commitment to research and development has led to advancements in artificial intelligence and machine learning.

About Intel: Intel Corporation, a global technology leader, is known for its semiconductor innovations that power computing and communication devices worldwide. As a pioneer in microprocessor technology, Intel has left an indelible mark on the evolution of computing with its processors that drive everything from PCs to data centers and beyond. With a history of advancements, Intel's relentless pursuit of innovation continues to shape the digital landscape, offering solutions that empower businesses and individuals to achieve new levels of productivity and connectivity.

About Lenovo: Lenovo, formerly known as "Legend Holdings," is an important global technology company that offers an extensive portfolio of computers, smartphones, servers, and electronic devices. Notably, Lenovo acquired IBM's personal computer division, including the ThinkPad line of laptops, in 2005. With a strong presence in laptops and PCs, Lenovo's products cater to a wide range of consumer and business needs. Committed to innovation and quality, Lenovo delivers reliable and high-performance solutions, making it a significant player in the tech industry.

About Microsoft: Microsoft, founded by Bill Gates and Paul Allen in 1975 and headquartered in Redmond, Washington, USA, is a technology giant known for its wide range of software products, including the Windows operating system, Office productivity suite, and cloud services like Azure. Microsoft also manufactures hardware, such as the Surface line of laptops and tablets, Xbox gaming consoles, and accessories.

About NVIDIA: NVIDIA has firmly established itself as a leader in the realm of client computing, continuously pushing the boundaries of innovation in graphics and AI technologies. With a deep commitment to enhancing user experiences, NVIDIA's client computing business focuses on delivering solutions that power everything from gaming and creative workloads to enterprise applications. Known for its GeForce graphics cards, the company has redefined high-performance gaming, setting industry standards for realistic visuals, fluid frame rates, and immersive experiences. Complementing its gaming expertise, NVIDIA's Quadro and NVIDIA RTX graphics cards cater to professionals in design, content creation, and scientific fields, enabling real-time ray tracing and AI-driven workflows that elevate productivity and creativity. By integrating graphics, AI, and software, NVIDIA continues to shape the landscape of client computing, fostering innovation and immersive interactions in a rapidly evolving digital world.

About Oracle: Oracle Corporation is an important American multinational technology company founded in 1977 and headquartered in Austin, Texas. It's one of the world's largest software and cloud computing companies, known for its enterprise software products and services. Oracle specializes in developing and providing database management systems, cloud solutions, software applications, and hardware infrastructure. Its flagship product, the Oracle Database, is widely used in businesses and organizations worldwide. Oracle also offers a range of cloud services, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).

About Supermicro: Supermicro is a reputable American technology company founded in 1993 and headquartered in San Jose, California. Specializing in high-performance server and storage solutions, Supermicro has become a trusted name in the data center industry. The company offers a wide range of innovative and customizable server hardware, including motherboards, servers, storage systems, and networking equipment, catering to the needs of enterprise clients, cloud service providers, and businesses seeking reliable infrastructure solutions.

Technology Explained

CPU: The Central Processing Unit (CPU) is the brain of a computer, responsible for executing instructions and performing calculations. It is the most important component of a computer system, as it is responsible for controlling all other components. CPUs are used in a wide range of applications, from desktop computers to mobile devices, gaming consoles, and even supercomputers. CPUs are used to process data, execute instructions, and control the flow of information within a computer system. They are also used to control the input and output of data, as well as to store and retrieve data from memory. CPUs are essential for the functioning of any computer system, and their applications in the computer industry are vast.

GPU: GPU stands for Graphics Processing Unit and is a specialized type of processor designed to handle graphics-intensive tasks. It is used in the computer industry to render images, videos, and 3D graphics. GPUs are used in gaming consoles, PCs, and mobile devices to provide a smooth and immersive gaming experience. They are also used in the medical field to create 3D models of organs and tissues, and in the automotive industry to create virtual prototypes of cars. GPUs are also used in the field of artificial intelligence to process large amounts of data and create complex models. GPUs are becoming increasingly important in the computer industry as they are able to process large amounts of data quickly and efficiently.

PCIe: PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard for connecting components such as graphics cards, sound cards, and network cards to a motherboard. It is the most widely used expansion interface in the computer industry today, and is used in both desktop and laptop computers. PCIe links are commonly built from one to 16 lanes, and per-lane speed has roughly doubled with each generation, so a PCIe slot provides far more bandwidth than the older PCI standard, allowing for faster data transfer speeds and improved performance. It is also used in a variety of other applications, such as storage, networking, and communications. PCIe is an essential component of modern computing, and its applications are only expected to grow in the future.
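As a rough illustration of how PCIe bandwidth scales with generation and lane count, the sketch below computes approximate one-direction throughput from the published per-lane transfer rates and line encodings. The function and table names are hypothetical, and the result ignores packet and protocol overhead, so real-world throughput is lower.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency by generation.
# Generations 1 and 2 use 8b/10b encoding; 3 through 5 use 128b/130b.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def pcie_bandwidth_gb_per_s(generation: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    if generation not in PCIE_GENERATIONS:
        raise ValueError("generation must be 1 through 5")
    if lanes not in (1, 2, 4, 8, 16):
        raise ValueError("lanes must be one of the common widths: 1, 2, 4, 8, 16")
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # Each transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    return rate_gt_s * efficiency * lanes / 8


for gen in sorted(PCIE_GENERATIONS):
    print(f"PCIe {gen}.0 x16: {pcie_bandwidth_gb_per_s(gen, 16):.1f} GB/s")
```

A first-generation x16 link works out to about 4 GB/s in each direction, and the figure roughly doubles with every generation after that.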