Intel Unveils Impressive AI Inference Capabilities, Showcasing Powerful Performance

Intel's submission of MLPerf Inference v3.1 results highlights its competitive performance in AI inference, offering customers flexibility and choice when selecting an optimal AI solution based on their specific performance, efficiency, and cost targets.


MLCommons, a leading organization in machine learning, has released the results of its MLPerf Inference v3.1 performance benchmark, which includes evaluations of GPT-J, a large language model with 6 billion parameters, as well as computer vision and natural language processing models. Intel, a major player in the AI industry, submitted its results for various hardware components, including the Habana Gaudi 2 accelerators, 4th Gen Intel Xeon Scalable processors, and Intel Xeon CPU Max Series.

The results highlight Intel’s competitive performance in AI inference and reaffirm the company’s dedication to making artificial intelligence more accessible across a wide range of AI workloads. Sandra Rivera, Intel’s executive vice president and general manager of the Data Center and AI Group, emphasized the company’s commitment to meeting customers’ needs for high-performance and efficient deep learning inference and training.

Intel’s submission builds upon previous updates from MLCommons and Hugging Face, which demonstrated that the Gaudi 2 accelerators can outperform Nvidia’s H100 on a state-of-the-art vision language model. These latest results further solidify Intel as a viable alternative to Nvidia’s H100 and A100 for AI compute requirements. Intel’s AI products offer customers flexibility and choice when selecting an optimal AI solution based on their specific performance, efficiency, and cost targets, while also providing an escape from closed ecosystems.

The Habana Gaudi 2 inference results for GPT-J show that the accelerator is competitive. Gaudi 2 reached 78.58 queries per second in the server scenario and 84.08 samples per second in the offline scenario. Nvidia’s H100 holds a modest advantage over Gaudi 2 (1.09x in server mode and 1.28x in offline mode), while Gaudi 2 outperforms Nvidia’s A100 by a significant margin (2.4x in server mode and 2x in offline mode). The Gaudi 2 submission used the FP8 data type and achieved 99.9% accuracy with it. Intel plans to continue delivering performance improvements and expanded model coverage through software updates released every six to eight weeks.
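To put the quoted ratios in perspective, the short Python sketch below backs out the throughput they imply for the H100 and A100. It uses only the Gaudi 2 figures and relative factors stated above; the derived H100 and A100 numbers are rough estimates from rounded ratios, not official MLPerf results, and the names (`GAUDI2`, `implied_throughput`) are illustrative only.

```python
# Gaudi 2 GPT-J throughput as reported in the article
# (queries/s for the server scenario, samples/s for the offline scenario).
GAUDI2 = {"server": 78.58, "offline": 84.08}

# Relative factors quoted in the article (rounded by the source).
H100_OVER_GAUDI2 = {"server": 1.09, "offline": 1.28}
GAUDI2_OVER_A100 = {"server": 2.4, "offline": 2.0}


def implied_throughput(scenario: str) -> dict[str, float]:
    """Estimate H100 and A100 throughput from the quoted ratios.

    These are back-of-the-envelope values, not published benchmark results.
    """
    gaudi2 = GAUDI2[scenario]
    return {
        "gaudi2": gaudi2,
        "h100_implied": round(gaudi2 * H100_OVER_GAUDI2[scenario], 2),
        "a100_implied": round(gaudi2 / GAUDI2_OVER_A100[scenario], 2),
    }


for scenario in ("server", "offline"):
    print(scenario, implied_throughput(scenario))
```

Because the source ratios are rounded to two or three significant figures, the implied values should be read as approximate; the official MLPerf results tables remain the authoritative source for per-vendor numbers.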

Intel also submitted results for all seven inference benchmarks, including GPT-J, on its 4th Gen Intel Xeon Scalable processors. These results demonstrate strong performance across general-purpose AI workloads, such as vision, language processing, and speech and audio translation models, as well as larger models like the DLRM v2 recommendation model and GPT-J. Additionally, Intel remains the only vendor to provide public CPU results using industry-standard deep learning ecosystem software.

The 4th Gen Intel Xeon Scalable processor is an ideal choice for building and deploying general-purpose AI workloads with popular AI frameworks and libraries. In the GPT-J 100-word summarization task, which involves summarizing a news article of approximately 1,000 to 1,500 words, the 4th Gen Intel Xeon processors achieved impressive results, summarizing two paragraphs per second in offline mode and one paragraph per second in real-time server mode.

For the first time, Intel submitted MLPerf results for its Intel Xeon CPU Max Series, which offers up to 64 gigabytes of high-bandwidth memory. The Intel Xeon CPU Max Series was the only CPU capable of achieving 99.9% accuracy in the GPT-J benchmark, making it crucial for applications that require the highest level of accuracy.

Intel collaborated with its original equipment manufacturer (OEM) customers to showcase the scalability and wide availability of general-purpose servers powered by Intel Xeon processors. These collaborations further demonstrate Intel’s commitment to delivering AI performance that meets customer service level agreements (SLAs).

MLPerf is widely recognized as the most reputable benchmark for AI performance, providing a fair and repeatable platform for performance comparisons. Intel plans to submit new AI training performance results for the upcoming MLPerf benchmark, reaffirming its commitment to supporting customers and addressing every aspect of the AI continuum, from low-cost AI processors to high-performing hardware accelerators and GPUs for network, cloud, and enterprise customers.

About Our Team

Our team comprises industry insiders with extensive experience in computers, semiconductors, games, and consumer electronics. With decades of collective experience, we’re committed to delivering timely, accurate, and engaging news content to our readers.

Background Information

About Intel:

Intel Corporation, a global technology leader, is known for its semiconductor innovations that power computing and communication devices worldwide. As a pioneer in microprocessor technology, Intel has left an indelible mark on the evolution of computing with its processors that drive everything from PCs to data centers and beyond. With a history of advancements, Intel's relentless pursuit of innovation continues to shape the digital landscape, offering solutions that empower businesses and individuals to achieve new levels of productivity and connectivity.

About NVIDIA:

NVIDIA has firmly established itself as a leader in the realm of client computing, continuously pushing the boundaries of innovation in graphics and AI technologies. With a deep commitment to enhancing user experiences, NVIDIA's client computing business focuses on delivering solutions that power everything from gaming and creative workloads to enterprise applications. Known for its GeForce graphics cards, the company has redefined high-performance gaming, setting industry standards for realistic visuals, fluid frame rates, and immersive experiences. Complementing its gaming expertise, NVIDIA's Quadro and NVIDIA RTX graphics cards cater to professionals in design, content creation, and scientific fields, enabling real-time ray tracing and AI-driven workflows that improve productivity and creativity. By integrating graphics, AI, and software, NVIDIA continues to shape the landscape of client computing in a rapidly evolving digital world.

Technology Explained

CPU: The central processing unit (CPU) is often described as the brain of a computer: it executes program instructions, performs calculations, and coordinates the other components of the system. CPUs are found in everything from desktop computers and mobile devices to gaming consoles and supercomputers. Within a system, the CPU processes data, controls the flow of information, manages input and output, and reads from and writes to memory.

Xeon: The Intel Xeon processor is a powerful and reliable processor used in many computer systems. It is a multi-core processor that is designed to handle multiple tasks simultaneously. It is used in servers, workstations, and high-end desktop computers. It is also used in many embedded systems, such as routers and switches. The Xeon processor is known for its high performance and scalability, making it a popular choice for many computer applications. It is also used in many cloud computing applications, as it is capable of handling large amounts of data and providing high levels of performance. The Xeon processor is also used in many scientific and engineering applications, as it is capable of handling complex calculations and simulations.
