Intel Goes Chiplet-Crazy With MAX CPU & GPU Series: Xeon CPU MAX ‘Sapphire Rapids HBM’ & Xeon GPU MAX ‘Ponte Vecchio’ For Servers
Today, Intel finally enters the data center and enterprise segment with its own multi-tile, aka chiplet, products under the Max branding. The first two products in this brand-new family are the Xeon CPU Max series, Sapphire Rapids processors equipped with HBM, and the Xeon GPU Max series, based on the Ponte Vecchio design, which goes all-in on the chiplet-era architecture.
Intel Xeon MAX CPUs “Sapphire Rapids With HBM”
Starting with the Intel Xeon CPU Max series, these chips house up to four HBM packages, all offering significantly higher DRAM bandwidth versus a baseline Sapphire Rapids-SP Xeon CPU with 8-channel DDR5 memory. This allows Intel to offer a chip with both increased capacity and bandwidth for customers that demand it. The HBM SKUs can be used in two modes, an HBM Flat mode and an HBM Caching mode.

The Xeon CPU Max Series offers 64 gigabytes of high bandwidth memory (HBM2e) on the package, significantly increasing data throughput for HPC and AI workloads. Compared with top-end 3rd Gen Intel Xeon Scalable processors, the Xeon CPU Max Series provides up to 3.7 times more performance on a range of real-world applications like energy and earth systems modeling. Further, the Data Center GPU Max Series packs over 100 billion transistors into a 47-tile package, bringing new levels of throughput to challenging workloads like physics, financial services and life sciences. When paired with the Xeon CPU Max Series, the combined platform achieves up to 12.8 times greater performance than the prior generation when running the LAMMPS molecular dynamics simulator.

via Intel

The standard Sapphire Rapids-SP Xeon chip features 10 EMIB interconnects and the entire package measures a mighty 4446mm2. Moving over to the HBM variant, the number of interconnects rises to 14, which are needed to connect the HBM2E memory to the cores. The four HBM2E packages use 8-Hi stacks, so Intel is going for at least 16 GB of HBM2E memory per stack for a total of 64 GB across the Sapphire Rapids-SP package. As for the package itself, the HBM variant measures an insane 5700mm2, or 28% larger than the standard variant. Compared to the recently leaked EPYC Genoa numbers, the HBM2E package for Sapphire Rapids-SP ends up around 5% larger, while Genoa's package is in turn around 22% larger than the standard Sapphire Rapids-SP package.
Intel Sapphire Rapids-SP Xeon (Standard Package) - 4446mm2
Intel Sapphire Rapids-SP Xeon (HBM2E Package) - 5700mm2
AMD EPYC Genoa (12 CCD Package) - 5428mm2
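For anyone who wants to double-check those percentages, here is a minimal Python sketch that reproduces the comparison from the package areas listed above (the areas are the reported figures and should be treated as approximate):

```python
# Quick sanity check of the package-area comparisons quoted above.
# Areas in mm^2, as reported; treat them as approximate.
areas = {
    "SPR-SP standard": 4446,
    "SPR-SP HBM2E": 5700,
    "EPYC Genoa (12 CCD)": 5428,
}

def pct_larger(a: float, b: float) -> float:
    """Return how much larger package a is than package b, in percent."""
    return (a / b - 1) * 100

print(f"HBM2E vs standard SPR-SP: +{pct_larger(areas['SPR-SP HBM2E'], areas['SPR-SP standard']):.0f}%")      # ~+28%
print(f"HBM2E vs EPYC Genoa:      +{pct_larger(areas['SPR-SP HBM2E'], areas['EPYC Genoa (12 CCD)']):.0f}%")   # ~+5%
print(f"Genoa vs standard SPR-SP: +{pct_larger(areas['EPYC Genoa (12 CCD)'], areas['SPR-SP standard']):.0f}%") # ~+22%
```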
It is also mentioned that the Intel Sapphire Rapids Xeon Max CPUs will feature up to 80 PCIe Gen 5.0 / 4.0 lanes, support for 8-channel DDR5-4800 memory, up to 4 UPI links running at 16 GT/s, and an x8 DMI (PCIe 3.0) interface. There are at least five SKUs launching today:
Xeon CPU Max 9480 - 56 Core (1.9 / 2.6 GHz) - $12,980 US
Xeon CPU Max 9470 - 52 Core (2.0 / 2.7 GHz) - $11,590 US
Xeon CPU Max 9468 - 48 Core (2.1 / 2.6 GHz) - $9,900 US
Xeon CPU Max 9460 - 40 Core (2.2 / 2.7 GHz) - $8,750 US
Xeon CPU Max 9462 - 32 Core (2.7 / 3.1 GHz) - $7,995 US
Intel 4th Gen Sapphire Rapids Xeon Max Series CPUs
Intel Xeon MAX GPUs “Ponte Vecchio” In Various Configs
The Intel 'Ponte Vecchio' GPU, or the 'Intel Data Center GPU Max Series' as the company now likes to call it, is a major product that packs 128 Xe cores, 128 ray tracing units (making it the only HPC / AI GPU with a native raytracing core), up to 64 MB of L1 cache and up to 408 MB of L2 cache. Up to 128 GB of HBM2e sits on the package, and the I/O connects up to 8 discrete dies. PCIe Gen 5 is used along with Xe Link to deliver a tremendous amount of processing power. The GPU is built using a mix of Intel 7, TSMC N5, and TSMC N7 process nodes, packaged through Intel's EMIB and Foveros approaches. Max Series GPUs will be available in several form factors to address different customer needs:
Max Series 1100 GPU: A 300-watt double-wide PCIe card with 56 Xe cores and 48GB of HBM2e memory. Multiple cards can be connected via Intel Xe Link bridges.
Max Series 1350 GPU: A 450-watt OAM module with 112 Xe cores and 96GB of HBM.
Max Series 1550 GPU: Intel's maximum performance 600-watt OAM module with 128 Xe cores and 128GB of HBM.
Intel says the architecture will allow up to 8 OAMs to be connected for absolute beast mode performance, and based on the numbers the company gave for 4 OAMs we can calculate the following (the scaling is simply linear, as sketched after this list):
1 OAM: 128GB HBM2e, 128 Xe Cores, 600W TDP, 52 TFLOPs, 3.2 TB/s memory bandwidth
2 OAM: 256GB HBM2e, 256 Xe Cores, 1200W TDP, 104 TFLOPs, 6.4 TB/s memory bandwidth
4 OAM: 512GB HBM2e, 512 Xe Cores, 2400W TDP, 208 TFLOPs, 12.8 TB/s memory bandwidth
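A minimal Python sketch of that linear scaling, using the single-OAM figures above as the baseline (real multi-GPU scaling will depend on the workload and the Xe Link topology, so these are theoretical aggregates, not measured results):

```python
# Naive linear scaling of the per-OAM figures quoted above.
base = {
    "HBM2e (GB)": 128,
    "Xe cores": 128,
    "TDP (W)": 600,
    "TFLOPs": 52,
    "Memory bandwidth (TB/s)": 3.2,
}

# 8 OAMs is the maximum Intel says can be linked together.
for n_oam in (1, 2, 4, 8):
    scaled = {metric: value * n_oam for metric, value in base.items()}
    print(f"{n_oam} OAM:", scaled)
```

At 8 OAMs, that works out to 1024 GB of HBM2e, 1024 Xe cores, 416 TFLOPs and 25.6 TB/s of aggregate memory bandwidth at a 4800W combined TDP, assuming perfectly linear scaling.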
Intel Xeon MAX GPU Specs
Now let's talk about performance. Intel's Max Series GPUs deliver up to 128 Xe-HPC cores, built on the new foundational architecture targeted at the most demanding computing workloads. Intel is claiming that each OAM is 2x faster than an NVIDIA A100 in OpenMC and miniBUDE. Intel also states the Data Center GPU Max Series has an aggregate 1.5x performance lead in ExaSMR - NekRS virtual nuclear reactor simulation workloads such as AdvSub, FDM (FP32), AxHelm (FP32), and AxHelm (FP64). Finally, the company is also claiming the performance crown (again compared with the NVIDIA A100) in financial workloads like Riskfuel, which is used to train credit option pricing models.