Supermicro introduces advanced NVIDIA-powered SuperClusters for seamless AI deployment.


March 19, 2024 by our News Team

Supermicro has introduced new SuperCluster solutions for accelerated generative AI deployment. The portfolio features high-performance 4U liquid-cooled and 8U air-cooled systems, plus a 1U air-cooled NVIDIA MGX system optimized for cloud-scale inference, all aimed at faster time to results for large language models and deep learning.

  • Supermicro's SuperCluster solutions are purpose-built for generative AI workloads.
  • The liquid-cooled systems improve energy efficiency and lower total cost of ownership, making them a cost-effective option for data centers.
  • The SuperCluster solutions scale easily, allowing seamless expansion as AI workloads grow and evolve.


Supermicro, a leading IT solution provider, has launched its latest portfolio of SuperCluster solutions designed to accelerate the deployment of generative AI. These solutions serve as foundational building blocks for large language model (LLM) infrastructure, catering to the demanding needs of AI workloads.

The Supermicro SuperCluster solutions consist of three powerful options specifically built for generative AI workloads. The lineup includes 4U liquid-cooled systems, 8U air-cooled systems, and a 1U air-cooled Supermicro NVIDIA MGX system optimized for cloud-scale inference.

According to Charles Liang, President and CEO of Supermicro, the era of AI has shifted the focus from individual servers to compute clusters. With their expanded manufacturing capacity, Supermicro can now deliver complete generative AI clusters to customers faster than ever before. These clusters boast impressive specifications, such as 512 NVIDIA HGX H200 GPUs with 72 TB of HBM3E, making them ideal for training today’s LLMs with trillions of parameters.
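The 72 TB figure is consistent with the H200's per-GPU memory; as a quick back-of-the-envelope check (the 141 GB per GPU comes from NVIDIA's H200 specifications, not from this article):

```python
# Sanity check: aggregate HBM3E across a 512-GPU HGX H200 cluster.
GPUS = 512
HBM_PER_GPU_GB = 141  # NVIDIA H200: 141 GB of HBM3e per GPU

total_tb = GPUS * HBM_PER_GPU_GB / 1000  # decimal terabytes
print(f"{total_tb:.1f} TB")  # ~72.2 TB, matching the quoted 72 TB
```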

Supermicro’s SuperCluster solutions, combined with NVIDIA AI Enterprise software, are well-suited for enterprise and cloud infrastructures. The interconnected GPUs, CPUs, memory, storage, and networking form the foundation of modern AI systems. These solutions provide the necessary building blocks for the rapidly evolving field of generative AI and LLMs.

Kaustubh Sanghani, Vice President of GPU Product Management at NVIDIA, commended Supermicro for leveraging NVIDIA’s latest technologies to accelerate next-generation AI workloads. By incorporating Blackwell architecture-based products into their server systems, Supermicro is meeting the demands of data centers and providing solutions to customers worldwide.

One notable offering from Supermicro is the 4U NVIDIA HGX H100/H200 8-GPU system, which uses liquid cooling to double the density of the 8U air-cooled system, reducing energy consumption and total cost of ownership for data centers. A cooling distribution unit and manifold keep GPUs and CPUs at optimal temperatures for maximum performance. This cooling technology can cut electricity costs by up to 40% while saving valuable data center space.
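To put the up-to-40% figure in concrete terms, here is an illustrative calculation; the node power draw and electricity price below are assumptions for the sake of the example, not numbers from Supermicro:

```python
# Illustrative only: what a 40% cut in electricity costs could mean
# for one assumed (not from the article) 8-GPU node.
NODE_POWER_KW = 10.0   # assumed average draw of an air-cooled node
PRICE_PER_KWH = 0.10   # assumed electricity price in $/kWh
HOURS_PER_YEAR = 8760
SAVINGS = 0.40         # the up-to-40% figure cited in the article

annual_cost = NODE_POWER_KW * HOURS_PER_YEAR * PRICE_PER_KWH
print(f"${annual_cost:,.0f}/yr -> saves ${annual_cost * SAVINGS:,.0f}/yr")
```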

The Supermicro SuperCluster solutions are designed to scale seamlessly, whether for training massive foundation models or building cloud-scale LLM inference infrastructures. The spine-leaf network topology, coupled with non-blocking 400 Gb/s fabrics, allows easy scalability from 32 nodes to thousands of nodes. Supermicro's rigorous testing processes ensure operational effectiveness and efficiency before shipping, and the integration of liquid cooling further enhances performance.
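As a rough sketch of why a spine-leaf topology scales this way, a non-blocking two-tier fabric built from radix-R switches supports up to R²/2 node ports; the radix of 64 below is an assumption for illustration, as the article does not name the switch hardware:

```python
# Maximum node count of a non-blocking two-tier spine-leaf fabric.
RADIX = 64  # assumed ports per switch; not specified in the article

hosts_per_leaf = RADIX // 2    # half of each leaf's ports face nodes
uplinks_per_leaf = RADIX // 2  # the other half go up to spine switches
max_leaves = RADIX             # each spine switch can reach RADIX leaves
max_nodes = hosts_per_leaf * max_leaves

print(max_nodes)  # 2048 node ports at full 400 Gb/s bisection bandwidth
```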

The NVIDIA MGX system designs featured in Supermicro's solutions address a crucial bottleneck in generative AI: GPU memory bandwidth and capacity. These designs, equipped with NVIDIA GH200 Grace Hopper Superchips, enable cloud-scale high-volume inference with high batch sizes, lowering operational costs. The SuperCluster solutions provide a blueprint for future AI clusters and offer a 256-node cluster that is easily deployable and scalable.

Supermicro’s SuperCluster solutions come in various configurations, including the 4U liquid-cooled system in 5 racks, the 8U air-cooled system in 9 racks, and the 1U air-cooled NVIDIA MGX system in 9 racks. These systems boast impressive specifications, such as up to 512 GPUs, 64 nodes, and up to 36 TB of HBM3e memory. They also feature high-speed networking options, customizable storage fabrics, and support for NVIDIA AI Enterprise software.

With their superior network performance and optimization for LLM training, deep learning, and high-volume inference, Supermicro’s SuperCluster solutions are a top choice for AI deployments. The company’s rigorous validation testing and on-site deployment service ensure a seamless experience for customers. These plug-and-play scalable units are designed for easy deployment in data centers, allowing for faster time to results.


About Our Team

Our team comprises industry insiders with extensive experience in computers, semiconductors, games, and consumer electronics. With decades of collective experience, we’re committed to delivering timely, accurate, and engaging news content to our readers.

Background Information


About NVIDIA: NVIDIA has firmly established itself as a leader in the realm of client computing, continuously pushing the boundaries of innovation in graphics and AI technologies. With a deep commitment to enhancing user experiences, NVIDIA's client computing business focuses on delivering solutions that power everything from gaming and creative workloads to enterprise applications. Best known for its GeForce graphics cards, the company has redefined high-performance gaming, setting industry standards for realistic visuals, fluid frame rates, and immersive experiences. Complementing its gaming expertise, NVIDIA's Quadro and NVIDIA RTX graphics cards cater to professionals in design, content creation, and scientific fields, enabling real-time ray tracing and AI-driven workflows that elevate productivity and creativity to unprecedented heights. By seamlessly integrating graphics, AI, and software, NVIDIA continues to shape the landscape of client computing, fostering innovation and immersive interactions in a rapidly evolving digital world.

NVIDIA website  NVIDIA LinkedIn

About Supermicro: Supermicro is a reputable American technology company founded in 1993 and headquartered in San Jose, California. Specializing in high-performance server and storage solutions, Supermicro has become a trusted name in the data center industry. The company offers a wide range of innovative and customizable server hardware, including motherboards, servers, storage systems, and networking equipment, catering to the needs of enterprise clients, cloud service providers, and businesses seeking reliable infrastructure solutions.

Supermicro website  Supermicro LinkedIn

Technology Explained


GPU: GPU stands for Graphics Processing Unit, a specialized processor designed for graphics-intensive and highly parallel tasks. GPUs render images, video, and 3D graphics in gaming consoles, PCs, and mobile devices, and are also used in fields such as medical imaging, where they build 3D models of organs and tissues, and automotive design, where they power virtual prototyping. Because they can process large amounts of data quickly and in parallel, GPUs have become increasingly important in artificial intelligence, where they are used to train and run complex models.


HBM3E: HBM3E is the latest generation of high-bandwidth memory (HBM), a stacked DRAM technology widely used in artificial intelligence (AI) accelerators. HBM3E offers faster data transfer rates, higher density, and better power efficiency than previous HBM versions, reaching per-stack bandwidth of roughly 1.15 TB/s. SK hynix, a South Korean chipmaker, announced HBM3E ahead of mass production in 2024, and other memory vendors also produce it. HBM3E suits AI systems that process large amounts of data, such as deep learning, machine learning, and computer vision workloads.


Liquid Cooling: Liquid cooling is a technology used to cool computer components that generate significant heat, such as processors and graphics cards. It works by circulating a liquid coolant, such as water or a specialized fluid, through a series of pipes and cold plates; the liquid absorbs heat from the components and dissipates it into the air via radiators. Because it cools more efficiently than traditional air cooling, the technology is increasingly popular both in data centers and among gamers and overclockers, who use it to run components beyond their rated speeds and get the most out of their hardware.


LLM: A Large Language Model (LLM) is a highly advanced artificial intelligence system, typically built on transformer architectures such as the one underlying GPT-3.5, designed to comprehend and produce human-like text on a massive scale. LLMs possess exceptional capabilities in natural language understanding and generation tasks, including answering questions, generating creative content, and delivering context-aware responses to textual inputs. These models undergo extensive training on vast datasets to grasp the nuances of language, making them invaluable tools for applications like chatbots, content generation, and language translation.




