Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, has introduced a tech guide for GIGAPOD, a scalable AI supercomputing infrastructure solution designed to enhance modern AI applications such as large language model (LLM) training and real-time inference. Built around powerful GPU servers, it incorporates accelerators such as NVIDIA HGX™ H100 and NVIDIA HGX™ H200. It uses NVIDIA® NVLink™ GPU interconnects and high-speed networking to combine the nodes in a cluster into a single computing unit, significantly boosting the high-speed parallel computing that AI applications demand.

From design and production to deployment, GIGABYTE can manage every stage with GIGAPOD because of its flexible and scalable architecture. GIGAPOD is designed to accommodate the explosive growth in AI training models, providing a one-stop solution for transforming traditional data centers into large-scale AI cloud service providers. GIGABYTE’s expertise in hardware and its strong partnerships with leading upstream GPU manufacturers not only ensure a smooth AI supercomputer deployment, but also provide users with reliable AI productivity.

Challenges in Modern Computing Architectures
In the early days of GPU applications and AI development, when the computing requirements were relatively low and the interconnect technology was not yet mature, GPU computing primarily ran on a simple single-server architecture. However, as the scale of training models increased, the importance of multi-GPU and multi-node architectures became more apparent, especially for training LLMs with hundreds of billions of parameters. The GPU is important, but the cluster computing interconnect cannot be overlooked as it can significantly reduce AI training times and has become a vital component for large-scale computing centers.

When advanced enterprises build ideal AI application solutions, they typically face three primary requirements during the initial hardware deployment:

  1. Powerful Computing: GPU nodes can compute in tandem, enabling them to efficiently perform parallel processing tasks such as matrix operations during AI training and simulations.
  2. Systematic Hardware Deployment: Data center deployment requires meticulous planning for key aspects such as data center power, floor layout, rack configuration, and thermal management, ensuring complete system hardware integration.
  3. Uninterrupted High-Speed Network Architecture: A high-speed network topology provides high bandwidth, low-latency network interconnections to speed up data transfer and enhance system performance.

While discussions about data center construction often focus on the number of GPUs and computing power, without a well-established power supply and cooling system, the GPUs in the server room cannot realize their potential. Additionally, a high-speed networking architecture is a must as it plays a crucial role in ensuring that each computing node can communicate in real time, enabling fast GPU-to-GPU communication to handle the exponential growth in data.

To overcome the challenges faced by modern data centers, the following sections will detail why GIGAPOD is the best solution for building AI data centers today.

Optimized Hardware Configuration
A basic GIGAPOD configuration consists of 32x GPU servers, each equipped with 8x GPUs, providing a total of 256x interconnected GPUs. Additionally, a dedicated rack is required to house network switches and storage servers.
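
The sizing above is simple arithmetic, and a short sketch makes it easy to adapt to other cluster sizes. The Python below is illustrative only: the `PodConfig` class and its method names are hypothetical, and only the figures 32 and 8 come from the configuration described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodConfig:
    """Hypothetical container for the basic GIGAPOD sizing figures."""
    gpu_servers: int = 32      # from the text: 32 GPU servers per basic configuration
    gpus_per_server: int = 8   # from the text: 8 GPUs per server

    @property
    def total_gpus(self) -> int:
        # Total interconnected GPUs in one basic configuration
        return self.gpu_servers * self.gpus_per_server

    def servers_needed(self, gpus: int) -> int:
        # Ceiling division: servers required to host a given GPU count
        return -(-gpus // self.gpus_per_server)


pod = PodConfig()
print(pod.total_gpus)  # 256
```

The same ceiling division gives the server count for any other GPU target, for example 13 servers for 100 GPUs.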

GIGAPOD configuration/specification with the GIGABYTE G593 series server:

  • CPU: Dual 4th/5th Generation Intel® Xeon® Scalable Processors or
    AMD EPYC™ 9005/9004 Series Processors
  • GPU: NVIDIA HGX™ H100 / H200 Tensor Core GPU
  • Memory: 24x DIMMs (AMD EPYC) or 32x DIMMs (Intel Xeon)
  • Storage: 8x 2.5” Gen5 NVMe/SATA/SAS-4 hot-swap drives
  • PCIe slots: 4x FHHL and 8x low-profile PCIe Gen5 x16 slots
  • Power: 4+2 3000W 80 PLUS Titanium redundant power supplies

All server models in the G593 series support 8-GPU baseboards and dual CPUs. In parallel computing workloads, the server relies primarily on the GPUs, while complex serial processing tasks are handled by the CPUs. This workload distribution is ideal for AI training applications, and users can choose their preferred CPU platform from either AMD or Intel.

Unique Advantages of the GIGABYTE G593 Series:

  • Industry-leading high-density design: The G593 series offers the highest density 8-GPU air-cooled server on the market. Compared to the larger, industry-standard 7U/8U designs, GIGABYTE achieves the same compute performance in a more compact 5U chassis.
  • Front-mounted GPU tray: The removable front GPU tray allows for easier maintenance and access of the GPU modules.
  • Advanced cooling technology: Supports Direct Liquid Cooling (DLC) for CPU, GPU, and NVSwitch to reduce energy consumption and achieve a lower PUE (Power Usage Effectiveness).
  • 1-to-1 balanced design: Each PCIe switch connects to the same number of GPUs, storage devices, and PCIe slots, making it ideal for GPU RDMA and direct data access from NVMe drives.
  • Six CRPS redundant power supplies: Features a redundant power design, with a 3600W PSU option to achieve N+N redundancy.
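
The two power options can be compared with simple arithmetic. The following Python sketch is an illustration under stated assumptions: it reads "4+2" as four supplies carrying the load with two spares, it assumes the N+N option splits the six supplies 3+3, and it ignores derating, input voltage, and load sharing. The function and variable names are hypothetical.

```python
def psu_budget(active: int, spare: int, watts_each: int) -> dict:
    """Return the load budget and failure tolerance for an active+spare PSU layout."""
    return {
        "installed": active + spare,
        "usable_watts": active * watts_each,  # load the active supplies can carry
        "tolerated_failures": spare,          # supplies that may fail within that budget
    }


# 4+2 layout with 3000W supplies, as listed in the specification
four_plus_two = psu_budget(active=4, spare=2, watts_each=3000)

# Assumed 3+3 split for the N+N option with 3600W supplies
n_plus_n = psu_budget(active=3, spare=3, watts_each=3600)

print(four_plus_two)
print(n_plus_n)
```

Under these assumptions, the 4+2 layout offers a 12,000W budget and tolerates two supply failures, while the 3+3 layout offers 10,800W and tolerates three.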

When building a performance-optimized AI computing solution, avoiding bandwidth bottlenecks is crucial. In high-performance AI systems or clusters, the ideal scenario is for all data transfers to use the GPU’s high-bandwidth memory and avoid passing through the processor’s PCIe lanes. To remove this bandwidth bottleneck, GIGABYTE integrates four Broadcom PCIe switches on the system board so that GPUs can access data through Remote Direct Memory Access (RDMA) without routing through the CPU. For accelerated networking, each GPU connects to an NVIDIA® ConnectX®-7 network adapter, which supports InfiniBand or Ethernet networking at up to 400Gb/s.
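
To put the networking figure in context, the sketch below estimates peak network bandwidth per server and per basic GIGAPOD. It is a rough calculation, not a measured result: it assumes one 400Gb/s port per GPU running at full line rate, and the constant and function names are hypothetical.

```python
NIC_GBPS = 400         # from the text: up to 400Gb/s per ConnectX-7
GPUS_PER_SERVER = 8    # from the text: 8 GPUs per server
SERVERS_PER_POD = 32   # from the text: 32 GPU servers per basic GIGAPOD


def peak_server_gbps(nic_gbps: int = NIC_GBPS, gpus: int = GPUS_PER_SERVER) -> int:
    # Assumption: one network port per GPU, all running at line rate
    return nic_gbps * gpus


def peak_pod_gbps(servers: int = SERVERS_PER_POD) -> int:
    # Aggregate injection bandwidth across all GPU servers
    return peak_server_gbps() * servers


print(peak_server_gbps())        # 3200 Gb/s per server
print(peak_server_gbps() / 8)    # 400.0 GB/s per server
print(peak_pod_gbps() / 1000)    # 102.4 Tb/s per basic GIGAPOD
```

Real throughput depends on the fabric topology, protocol overhead, and traffic pattern, so these values are upper bounds.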

Additionally, PCIe switches help with signal expansion, allowing for greater I/O connectivity by efficiently sharing PCIe lanes beyond those devoted to the GPU modules. GIGABYTE’s design includes four additional PCIe x16 slots, often used with NVIDIA BlueField®-3 DPUs for networking, security, and data processing in high-performance clusters.
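
The 1-to-1 balanced design can be illustrated by dividing the device counts from the specification list evenly across the four PCIe switches. This Python sketch only checks that the published counts divide evenly; the actual board wiring is not described in the text, and the function name and dictionary keys are hypothetical.

```python
def per_switch(devices: dict, switches: int) -> dict:
    """Split each device count evenly across the PCIe switches."""
    shares = {}
    for name, count in devices.items():
        if count % switches != 0:
            # An uneven split would mean the layout is not balanced
            raise ValueError(f"{name}: {count} does not divide across {switches} switches")
        shares[name] = count // switches
    return shares


# Device counts taken from the G593 specification list above
node = {"gpus": 8, "nvme_drives": 8, "fhhl_slots": 4, "low_profile_slots": 8}

print(per_switch(node, switches=4))
# {'gpus': 2, 'nvme_drives': 2, 'fhhl_slots': 1, 'low_profile_slots': 2}
```

With equal shares behind every switch, a GPU, a drive, and a network slot can exchange data under the same switch without crossing the CPU, which is the property the balanced design is meant to provide.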

GIGABYTE’s AI data center supercomputing solution, GIGAPOD, not only excels in reliability, availability, and maintainability but also offers unparalleled flexibility. Whether it’s the choice of GPU, rack size, cooling solutions, or custom planning, GIGABYTE adapts to diverse IT infrastructure, hardware requirements, and data center sizes. With services ranging from L6 to L12, covering everything from power and cooling infrastructure design to hardware deployment, system optimization, and after-sales support, we ensure that our customers receive an end-to-end solution that fully meets their operational requirements and performance goals.

For queries or more information, please contact sales.
Stay updated with our latest news and announcements by following Giga Computing:
X: https://x.com/GigaComputing
LinkedIn: https://linkedin.com/company/giga-computing
Facebook: https://facebook.com/gigabyteserver

About Giga Computing
Giga Computing Technology is an industry innovator and leader in the enterprise computing market. Having spun off from GIGABYTE, we maintain hardware expertise in manufacturing and product design, while operating as a standalone business that can drive more investment into core competencies. We offer a complete product portfolio that addresses all workloads from the data center to edge including traditional and emerging workloads in HPC and AI to data analytics, 5G/edge, cloud computing, and more. Our longstanding partnerships with key technology leaders ensure that our new products will be the most advanced and launch with new partner platforms. Our systems embody performance, security, scalability, and sustainability. To find out more, visit https://www.gigacomputing.com/…… and join our newsletter.

About GIGABYTE
GIGABYTE is an engineer, visionary, and leader in the world of tech that uses its hardware expertise, patented innovations, and industry leadership to create, inspire, and advance. Renowned for over 30 years of award-winning excellence in motherboards and graphics cards, GIGABYTE is a cornerstone in the HPC community, providing businesses with server and data center expertise to accelerate their success. At the forefront of evolving technology, GIGABYTE is devoted to inventing smart solutions that enable digitalization from edge to cloud, and allow customers to capture, analyze, and transform digital information into economic data that can benefit humanity and "Upgrade Your Life". Please visit https://www.gigabyte.com/ for more information.

Company contact and publisher of this release:

Giga Computing Technology Co., Ltd.
7F, 6 Baoqiang Rd., Xindian Dist.
231 New Taipei City
Phone: +31 40 290 2071
Fax: +49 (40) 253304-45
https://www.gigabyte.com/

Contact person:
GIGA Computing EMEA
Phone: +31 40 290 2071
Email: bernice@giga-byte.nl