Supermicro begins shipping servers powered by NVIDIA GH200 Grace Hopper superchip

October 19, 2023 by our News Team

Supermicro has launched a range of GPU systems based on NVIDIA's reference architecture, featuring the latest NVIDIA GH200 Grace Hopper and NVIDIA Grace CPU Superchip, for standardized AI infrastructure and accelerated computing.

  • Compact 1U and 2U form factors offer flexibility and expansion options
  • Advanced liquid-cooling technology enables high-density configurations
  • Modular architecture supports multiple PCIe 5.0 x16 FHFL slots for DPUs, additional GPUs, networking, and storage

Supermicro, known for its comprehensive IT solutions, has launched a wide range of GPU systems based on NVIDIA's reference architecture. These systems feature the latest NVIDIA GH200 Grace Hopper and NVIDIA Grace CPU Superchip, making Supermicro one of the industry leaders in standardized AI infrastructure and accelerated computing.

The new modular architecture, available in compact 1U and 2U form factors, offers flexibility and expansion options for current and future GPUs, DPUs, and CPUs. Supermicro's advanced liquid-cooling technology enables high-density configurations, such as a 1U 2-node setup with 2 integrated NVIDIA GH200 Grace Hopper Superchips connected by a high-speed interconnect. With facilities worldwide, Supermicro can deliver thousands of rack-scale AI servers per month, ensuring plug-and-play compatibility.

Charles Liang, president and CEO of Supermicro, emphasized the company’s commitment to modular, scalable, and universal systems that cater to the evolving AI landscape. By collaborating with NVIDIA, Supermicro aims to accelerate time to market for enterprises developing AI-enabled applications while simplifying deployment and reducing environmental impact.

Ian Buck, vice president of hyperscale and HPC at NVIDIA, acknowledged the longstanding collaboration between NVIDIA and Supermicro in delivering high-performance AI systems. The combination of NVIDIA’s MGX modular reference design and Supermicro’s server expertise will pave the way for new generations of AI systems featuring Grace and Grace Hopper Superchips.

Supermicro’s NVIDIA MGX platforms offer a diverse range of servers designed to accommodate future AI technologies. This product line addresses the unique thermal, power, and mechanical challenges associated with AI-based servers.

The lineup includes models such as the ARS-111GL-NHR and ARS-111GL-NHR-LCC, each featuring 1 NVIDIA GH200 Grace Hopper Superchip with air or liquid cooling, respectively. The ARS-111GL-DNHR-LCC goes a step further with 2 NVIDIA GH200 Grace Hopper Superchips across 2 nodes, all liquid-cooled. The ARS-121L-DNR packs 2 NVIDIA Grace CPU Superchips into each of its 2 nodes, for 288 cores per node. The ARS-221GL-NR and SYS-221GE-NR models offer NVIDIA Grace CPU Superchips in different form factors.

Each MGX platform can be enhanced with NVIDIA BlueField-3 DPU and/or NVIDIA ConnectX-7 interconnects for high-performance networking.

Supermicro's 1U NVIDIA MGX systems come with up to 2 NVIDIA GH200 Grace Hopper Superchips, featuring 2 NVIDIA H100 GPUs and 2 NVIDIA Grace CPUs. These systems offer ample memory, with 480 GB of LPDDR5X for each CPU and either 96 GB of HBM3 or 144 GB of HBM3e for each GPU. The memory-coherent NVLink-C2C interconnect provides high-bandwidth, low-latency communication between the CPU, GPU, and memory at 900 GB/s, seven times the bandwidth of PCIe 5.0. The modular architecture supports multiple PCIe 5.0 x16 FHFL slots for DPUs, additional GPUs, networking, and storage.
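As a rough sanity check on that comparison: a PCIe 5.0 x16 link tops out at about 64 GB/s per direction (roughly 128 GB/s bidirectional), which is where the "seven times" figure comes from. The sketch below simply restates that arithmetic; the constants are nominal peak rates, not measured throughput.

```python
# Rough sanity check of the bandwidth comparison above.
# Constants are nominal peak rates, not measured throughput.

NVLINK_C2C_GBPS = 900   # NVLink-C2C chip-to-chip bandwidth, GB/s
PCIE5_X16_GBPS = 128    # PCIe 5.0 x16: ~64 GB/s per direction, bidirectional

ratio = NVLINK_C2C_GBPS / PCIE5_X16_GBPS
print(f"NVLink-C2C is roughly {ratio:.0f}x a PCIe 5.0 x16 link")  # roughly 7x
```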

Supermicro's proven direct-to-chip liquid cooling solutions are particularly beneficial for the 1U 2-node design featuring 2 NVIDIA GH200 Grace Hopper Superchips. These solutions reduce operational expenses by over 40% while increasing computing density and simplifying rack-scale deployment for large language model (LLM) clusters and HPC applications.
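To illustrate why the GH200's large coherent GPU memory matters for LLM deployments, here is a back-of-envelope weight-footprint estimate. The model size and precisions are illustrative assumptions, not figures from Supermicro or NVIDIA, and the estimate ignores activations and KV-cache overhead.

```python
def model_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate weight-only memory footprint of a model, in GB."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# A hypothetical 70B-parameter model held in FP16 (2 bytes per parameter):
print(model_memory_gb(70))     # 140.0 GB -> just fits in 144 GB of HBM3e
# The same model quantized to 8-bit:
print(model_memory_gb(70, 1))  # 70.0 GB
```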

The 2U Supermicro NVIDIA MGX platform supports both NVIDIA Grace and x86 CPUs, accommodating up to 4 full-size data center GPUs such as the NVIDIA H100 PCIe, H100 NVL, or L40S. It also provides ample I/O connectivity with three additional PCIe 5.0 x16 slots and eight hot-swap EDSFF storage bays.

Supermicro offers NVIDIA networking solutions to secure and accelerate AI workloads on the MGX platform. This includes NVIDIA BlueField-3 DPUs for accelerated user-to-cloud and data storage access, as well as ConnectX-7 adapters for high-speed InfiniBand or Ethernet connectivity between GPU servers.

Developers can leverage these new systems and NVIDIA’s software products for various industry workloads. NVIDIA AI Enterprise, an enterprise-grade software, streamlines the development and deployment of generative AI, computer vision, speech AI, and more. The NVIDIA HPC software development kit provides essential tools for scientific computing.

Supermicro has meticulously designed every aspect of the NVIDIA MGX systems to improve efficiency, from intelligent thermal design to component selection. NVIDIA Grace Superchip CPUs, with their 144 cores, deliver up to twice the performance per watt of traditional x86 CPUs. Certain Supermicro NVIDIA MGX systems can be configured with two nodes in just 1U, offering high compute density and energy efficiency for hyperscale and edge data centers.

About Our Team

Our team comprises industry insiders with extensive experience in computers, semiconductors, games, and consumer electronics. With decades of collective experience, we’re committed to delivering timely, accurate, and engaging news content to our readers.

Background Information

About NVIDIA: NVIDIA has firmly established itself as a leader in the realm of client computing, continuously pushing the boundaries of innovation in graphics and AI technologies. With a deep commitment to enhancing user experiences, NVIDIA's client computing business focuses on delivering solutions that power everything from gaming and creative workloads to enterprise applications. Known for its GeForce graphics cards, the company has redefined high-performance gaming, setting industry standards for realistic visuals, fluid frame rates, and immersive experiences. Complementing its gaming expertise, NVIDIA's Quadro and NVIDIA RTX graphics cards cater to professionals in design, content creation, and scientific fields, enabling real-time ray tracing and AI-driven workflows that elevate productivity and creativity to unprecedented heights. By seamlessly integrating graphics, AI, and software, NVIDIA continues to shape the landscape of client computing, fostering innovation and immersive interactions in a rapidly evolving digital world.

NVIDIA website  NVIDIA LinkedIn

About Supermicro: Supermicro is a reputable American technology company founded in 1993 and headquartered in San Jose, California. Specializing in high-performance server and storage solutions, Supermicro has become a trusted name in the data center industry. The company offers a wide range of innovative and customizable server hardware, including motherboards, servers, storage systems, and networking equipment, catering to the needs of enterprise clients, cloud service providers, and businesses seeking reliable infrastructure solutions.

Supermicro website  Supermicro LinkedIn

Technology Explained

CPU: The Central Processing Unit (CPU) is the brain of a computer, responsible for executing instructions and performing calculations. It is the most important component of a computer system, as it is responsible for controlling all other components. CPUs are used in a wide range of applications, from desktop computers to mobile devices, gaming consoles, and even supercomputers. CPUs are used to process data, execute instructions, and control the flow of information within a computer system. They are also used to control the input and output of data, as well as to store and retrieve data from memory. CPUs are essential for the functioning of any computer system, and their applications in the computer industry are vast.

GPU: GPU stands for Graphics Processing Unit and is a specialized type of processor designed to handle graphics-intensive tasks. It is used in the computer industry to render images, videos, and 3D graphics. GPUs are used in gaming consoles, PCs, and mobile devices to provide a smooth and immersive gaming experience. They are also used in the medical field to create 3D models of organs and tissues, and in the automotive industry to create virtual prototypes of cars. GPUs are also used in the field of artificial intelligence to process large amounts of data and create complex models. GPUs are becoming increasingly important in the computer industry as they are able to process large amounts of data quickly and efficiently.

HBM3E: HBM3E is the latest generation of high-bandwidth memory (HBM), a type of DRAM designed for artificial intelligence (AI) applications. HBM3E offers faster data transfer rates, higher density, and lower power consumption than previous HBM versions. HBM3E is being developed by memory makers including SK Hynix, Samsung, and Micron, with mass production beginning in 2024. HBM3E can achieve per-stack bandwidth of around 1.15 TB/s and capacities of 24 GB or more per stack. HBM3E is suitable for AI systems that require large amounts of data processing, such as deep learning, machine learning, and computer vision.

Latency: Latency is the time it takes for a computer system to respond to a request. It is an important factor in the performance of computer systems, as it affects the speed and efficiency of data processing. In the computer industry, latency is a major factor in the performance of computer networks, storage systems, and other computer systems. Low latency is essential for applications that require fast response times, such as online gaming, streaming media, and real-time data processing. High latency causes delays in data processing, resulting in slow response times and poor performance. To reduce latency, computer systems use techniques such as caching, load balancing, and parallel processing, providing faster response times and improved performance.

Liquid Cooling: Liquid cooling is a technology used to cool down computer components, such as processors, graphics cards, and other components that generate a lot of heat. It works by circulating a liquid coolant, such as water or a special coolant, through a series of pipes and radiators. The liquid absorbs the heat from the components and then dissipates it into the air. This technology is becoming increasingly popular in the computer industry due to its ability to provide more efficient cooling than traditional air cooling methods. Liquid cooling can also be used to overclock components, allowing them to run at higher speeds than their rated speeds. This technology is becoming increasingly popular in the gaming industry, as it allows gamers to get the most out of their hardware.

LLM: A Large Language Model (LLM) is a highly advanced artificial intelligence system, typically built on the transformer architecture (as in models such as GPT-3.5), designed to comprehend and produce human-like text on a massive scale. LLMs possess exceptional capabilities in various natural language understanding and generation tasks, including answering questions, generating creative content, and delivering context-aware responses to textual inputs. These models undergo extensive training on vast datasets to grasp the nuances of language, making them invaluable tools for applications like chatbots, content generation, and language translation.

LPDDR5X: LPDDR5X is a type of computer memory technology used in many modern computers. It stands for Low Power Double Data Rate 5X and is a type of Random Access Memory (RAM). It is designed to be more efficient than its predecessors, allowing for faster data transfer speeds and lower power consumption. This makes it ideal for use in laptops, tablets, and other mobile devices. It is also used in gaming consoles and other high-end computers. LPDDR5X is capable of transferring data at up to 8533 megabits per second per pin, making it one of the fastest types of RAM available. This makes it ideal for applications that require high performance, such as gaming, video editing, and 3D rendering.

PCIe: PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard for connecting components such as graphics cards, sound cards, and network cards to a motherboard. It is the most widely used interface in the computer industry today, and is used in both desktop and laptop computers. PCIe is capable of providing up to 16 times the bandwidth of the older PCI standard, allowing for faster data transfer speeds and improved performance. It is also used in a variety of other applications, such as storage, networking, and communications. PCIe is an essential component of modern computing, and its applications are only expected to grow in the future.
