The NVIDIA Blackwell GPU architecture introduces six technologies for accelerated computing, aimed at driving advances in fields such as data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing, and generative AI.
Jensen Huang, NVIDIA’s founder and CEO, emphasized the company’s three-decade pursuit of accelerated computing, particularly in enabling transformative technologies like deep learning and AI. He highlighted generative AI as the defining technology of the era, with Blackwell positioned as the engine driving this new industrial revolution. Collaborating with dynamic companies worldwide, NVIDIA aims to democratize AI across all sectors.
Major industry players expected to adopt Blackwell include Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI.
Key innovations of the Blackwell architecture include:
World’s Most Powerful Chip: Blackwell GPUs pack 208 billion transistors and are manufactured on a custom-built TSMC 4NP process. Two reticle-limit GPU dies are connected by a high-speed chip-to-chip link and operate as a single, unified GPU.
Second-Generation Transformer Engine: Backed by micro-tensor scaling support and advanced dynamic-range management algorithms, Blackwell doubles the compute and model sizes of the previous generation with new 4-bit floating-point (FP4) AI inference capabilities.
Fifth-Generation NVLink: The latest iteration of NVIDIA NVLink delivers 1.8 TB/s of bidirectional throughput per GPU, facilitating seamless high-speed communication among multiple GPUs for complex AI models.
RAS Engine: Dedicated to reliability, availability, and serviceability, Blackwell-powered GPUs include AI-based preventative maintenance capabilities, maximizing system uptime and resilience for large-scale AI deployments.
Secure AI: Advanced confidential computing capabilities safeguard AI models and customer data without compromising performance, crucial for privacy-sensitive industries like healthcare and finance.
Decompression Engine: A dedicated decompression engine accelerates database queries with support for the latest compression formats, delivering high performance for data analytics and data science as these workloads become increasingly GPU-accelerated.
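The FP4 inference described above depends on micro-tensor scaling: instead of one scale factor per tensor, small blocks of values share a scale so that 4-bit numbers retain useful dynamic range. NVIDIA has not published the exact scheme, so the sketch below is only illustrative; it assumes an E2M1-style FP4 value grid and one shared scale per 32-element block (both assumptions).

```python
import numpy as np

# Representable magnitudes of an assumed E2M1-style FP4 format (sign handled
# separately). This grid is an illustrative assumption, not NVIDIA's spec.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_blocks(x, block=32):
    """Quantize a 1-D float array to FP4 with one scale per `block` values."""
    pad = (-len(x)) % block
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    # One shared scale per micro-block maps its largest magnitude onto 6.0,
    # the top of the FP4 grid.
    scales = np.abs(xp).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scales[scales == 0] = 1.0           # avoid dividing an all-zero block
    normed = xp / scales
    # Round each magnitude to the nearest representable FP4 value.
    idx = np.abs(np.abs(normed)[..., None] - FP4_GRID).argmin(axis=-1)
    deq = np.sign(normed) * FP4_GRID[idx] * scales
    return deq.reshape(-1)[: len(x)]

x = np.random.randn(64).astype(np.float32)
xq = quantize_fp4_blocks(x)
```

Because each block is rescaled independently, a block of small weights is not crushed to zero by one large outlier elsewhere in the tensor, which is the practical point of micro-tensor (block) scaling.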
The NVIDIA GB200 Grace Blackwell Superchip, a central component of the Blackwell platform, connects two NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU over an ultra-low-power NVLink chip-to-chip (C2C) interconnect. For optimal AI performance, GB200-powered systems can pair with the NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet networking platforms.
The GB200 NVL72, a multi-node, liquid-cooled, rack-scale system, combines 36 Grace Blackwell Superchips interconnected by fifth-generation NVLink with NVIDIA BlueField-3 data processing units. The configuration delivers up to a 30x performance increase over the previous generation for LLM inference workloads while reducing cost and energy consumption.
Additionally, NVIDIA offers the HGX B200 server board, which links eight B200 GPUs through NVLink to support x86-based generative AI platforms, with networking speeds of up to 400 Gb/s through the same networking platforms.
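The figures above mix units (TB/s of NVLink bandwidth versus Gb/s of network bandwidth), so a quick back-of-envelope comparison helps. The 1.8 TB/s per-GPU figure is NVIDIA's published NVLink 5 specification, taken here as an assumption; the 400 Gb/s figure is the network speed quoted above.

```python
# Compare per-GPU intra-node NVLink bandwidth with one 400 Gb/s network port.
nvlink_gb_s = 1.8 * 1000   # NVLink 5: 1.8 TB/s bidirectional per GPU (assumed spec)
port_gb_s = 400 / 8        # 400 Gb/s -> 50 GB/s (8 bits per byte)
ratio = nvlink_gb_s / port_gb_s
print(f"Per-GPU NVLink bandwidth is about {ratio:.0f}x one 400 Gb/s network port")
```

The roughly 36x gap is why multi-GPU model parallelism favors keeping tightly coupled GPUs inside one NVLink domain, with the InfiniBand or Ethernet fabric reserved for scale-out traffic between nodes.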
In summary, the Blackwell GPU architecture represents a significant leap forward in accelerated computing, poised to drive innovation and propel various industries into the future.