If you’re considering upgrading your X86 server environment – and there’s been a lot of discussion about the intent of both enterprises and cloud providers to do just that – the exciting news is that both Intel and AMD have unveiled their most advanced serial compute engines to date.
For Intel, which still holds about two-thirds of the X86 server CPU market, achieving near parity with AMD despite a slight setback in manufacturing processes is quite impressive. On the other hand, AMD’s recently introduced “Turin” Zen 5 and Zen 5c processors, along with their performance and price benefits, suggest that the company will continue to capture market share, regardless of Intel’s attempts to close the gap in X86 server CPUs. This could indicate that, in the near future, once both companies achieve similar manufacturing and performance levels, one of them may initiate a price war.
However, that moment is not upon us just yet. Currently, as major hyperscalers focus on developing their own Arm-based server CPU architectures, both Intel and AMD are content to compete with one another while ignoring the rise of Arm technology. Lowering prices for X86 chips would mean surrendering considerable revenue and profit margins, a luxury neither firm can afford. Consequently, X86 server CPUs are becoming a new legacy tier, while custom Arm chips are pushing the price/performance ratio down. We also anticipate that RISC-V will eventually challenge Arm in this regard.
As is customary, we will begin our analysis of the Turin CPUs by providing you with the essential specifications, performance metrics, and pricing of the processors. We will subsequently delve into a detailed architectural study and a competitive assessment from AMD’s viewpoint. (We were in the process of wrapping up these pieces for the Intel “Granite Rapids” Xeon 6 processors when Hurricane Helene decided to change our plans.)
AMD has made significant strides in its journey with the Epyc processor, a necessary evolution to reclaim its standing after nearly abandoning the datacenter in the early 2010s. During that period, Intel surged ahead with its revamped 64-bit Xeon series, which borrowed various concepts from AMD’s earlier Opteron designs but executed them more successfully. Now the tables have turned, with Intel’s foundry lagging behind AMD’s foundry partner, Taiwan Semiconductor Manufacturing Co. Intel’s delays in moving to more advanced manufacturing processes have hobbled its server CPU designers. Since 2019, Intel has struggled to secure “design wins” against AMD and has relied instead on “supply wins,” shipping chips in volumes that AMD cannot match.
Over the generations of Epyc processors, the chiplet architecture has matured and been refined, resulting in Epyc CPUs made up of nine, thirteen, or seventeen chiplets that are interconnected on an organic substrate so that the package looks and behaves like an older monolithic CPU. Consequently, Epyc chips have gained substantial traction, especially among hyperscalers and cloud builders keen on maximizing core density to optimize their critical metric of price/performance per watt and per unit volume, a concept reminiscent of what was termed SWaP (Space, Watts, and Performance) in the early 2000s.
With each refinement of its Epyc chiplet designs, AMD’s standing as a server CPU supplier has become virtually unquestioned. It is clear now that AMD is in the server CPU arena for the long haul and can produce processors for one-socket and two-socket servers that hold their own against any competitor in the market.
However, X86 processors are likely to always carry a higher price tag than the custom-built Arm server chips favored by hyperscalers and cloud providers. This disparity arises from the additional overhead that Intel and AMD must build into their pricing. Consequently, any buyer outside of the hyperscaler or cloud builder space will pay a premium for server compute, a situation that is unavoidable and built into the system.
A significant portion of the global infrastructure continues to utilize X86 applications operating on Windows Server, which are not straightforward to migrate to Arm architecture, so there’s no need to panic just yet. However, it’s important to note that a majority of contemporary applications are being developed for Linux rather than Windows Server, making their transition to Arm architectures more feasible. Hence, a persistent state of cautious awareness seems to be the right approach.
Considering the present landscape of the X86 server market, we are curious to see just how much AMD’s share can potentially increase.
A lot hinges on the trends within hyperscalers and cloud service providers, who collectively account for more than fifty percent of server CPU shipments. If these entities transition half of their server fleets to Arm while maintaining the other half on X86 to accommodate legacy applications—namely, Windows Server—then ultimately, three-quarters of the market will remain on X86, which presents a substantial opportunity. Conversely, if hyperscalers and cloud builders control three-quarters of the server CPU market and only expand their X86 fleets gradually to support Windows Server alongside some Linux workloads that customers prefer on X86 (for valid reasons), then the competition will intensify for both Intel and AMD. In such a scenario, their market shares could fluctuate based on the intensity of their pricing strategies. This situation is dependent on achieving parity in design and manufacturing processes, which, as we and Intel have discovered, is not guaranteed.
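The share arithmetic in the first scenario can be sketched in a few lines. This is only an illustration: the function name `x86_share` is ours, and it assumes that every buyer outside the hyperscalers and cloud builders stays entirely on X86. The Arm fraction used for the second scenario is a hypothetical value, since no figure is given for it.

```python
def x86_share(hyperscale_share: float, arm_fraction: float) -> float:
    """Fraction of server CPU shipments that remain on X86.

    hyperscale_share: portion of shipments bought by hyperscalers and clouds.
    arm_fraction: portion of that fleet moved to Arm.
    Assumption: all other buyers stay 100 percent on X86.
    """
    if not (0.0 <= hyperscale_share <= 1.0 and 0.0 <= arm_fraction <= 1.0):
        raise ValueError("shares must be between 0 and 1")
    return 1.0 - hyperscale_share * arm_fraction


# Scenario from the text: half the market, half of it moved to Arm.
scenario_a = x86_share(0.50, 0.50)
# Hypothetical variant: three-quarters of the market, same Arm fraction.
scenario_b = x86_share(0.75, 0.50)

print(f"Scenario A: {scenario_a:.1%} of shipments stay on X86")
print(f"Scenario B: {scenario_b:.1%} of shipments stay on X86")
```

The point of the sketch is that the X86 opportunity shrinks quickly as either input grows, which is why the hyperscaler mix matters so much to both Intel and AMD.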
The new Turin Zen 5 and Zen 5c cores incorporate numerous microarchitectural enhancements, resulting in a 17 percent increase in integer instructions per clock (IPC) performance compared to the Zen 4 and Zen 4c cores, with floating point IPC showing an impressive increase of 37 percent.
Note: In our tables, when assessing relative performance in comparison to our baseline four-core “Shanghai” Opteron 2387 processor running at 2.8 GHz, we focus solely on integer workloads for the time being. However, we plan to retroactively include comparisons for floating-point performance in the future.
The advancements in integer IPC for the core design align with historical improvements, which include a 15 percent increase from the “Rome” Epyc 7002s compared to the “Naples” Epyc 7001s, a 19 percent boost from Rome to “Milan” Epyc 7003s, and a 14 percent rise from Milan to “Genoa” Epyc 9004s. Enhancements such as process shrinks, modifications to L3 cache per core (with “c” cores featuring half the cache at 2 MB compared to the standard cores’ 4 MB), along with chiplet features and layouts, allow AMD to expand their SKU offerings significantly. Currently, AMD boasts a more extensive array with Turin, comprising 27 unique chips, unlike Intel’s combined Granite Rapids P-core and “Sierra Forest” E-core Xeon 6 line, which only features around a dozen SKUs.
This is a different Intel than we once knew, isn’t it? The landscape has certainly shifted. By the first quarter of 2025, Intel plans to introduce more low-end SKUs for both Granite Rapids and Sierra Forest, while AMD is expected to roll out additional telco and edge variants for Turin, alongside 3D V-Cache Turin-X chips, potentially balancing the scales.
The Turin processors are an evolution of Genoa, and necessarily so, since both chips must fit into the same SP5 server socket. Significant upgrades typically require a new socket, but server buyers and designers prefer a socket to last at least two generations.
AMD is making significant advancements with its Turin chips by utilizing 3 nanometer manufacturing processes from TSMC for the cores, while the I/O and memory die are produced using 4 nanometer processes. This marks a considerable reduction from the previous 5 nanometer processes used for the Genoa cores and the 6 nanometer processes for the Genoa I/O and memory die.
The following table outlines the evolution across five generations of products incorporating the standard Zen cores, excluding the “c” variants:
In the standard Turin products, the core complex dies (CCDs) consist of eight cores accompanied by 32 MB of L3 cache shared among these cores, similar to the designs seen in Milan and Genoa. Thanks to the advances in chiplet manufacturing processes (from 7 nanometers with Milan to 5 nanometers with Genoa, and now 3 nanometers with Turin), AMD can integrate 16 chiplets along with the I/O die into a single package, doubling the top core count from 64 in Milan to 128 in Turin.
Accompanying this increase, the L3 cache has expanded to 512 MB with Turin. The device supports twelve DDR5 memory channels, like Genoa. Turin memory runs at 6.4 GHz, which is 33 percent faster than Genoa’s 4.8 GHz memory and yields 33 percent more memory bandwidth per socket. That matches the 33 percent rise in top core count, from 96 to 128, compared to Genoa. Both Genoa and Turin offer either 128 or 160 lanes of PCI-Express 5.0 I/O, as dictated by the SP5 socket.
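For readers who want to check the memory math, here is a rough sketch of peak theoretical bandwidth per socket. It assumes 8 bytes (64 data bits) per transfer per DDR5 channel with ECC bits excluded, and it uses DDR5-4800 for Genoa, a figure taken from outside this article. Real sustained bandwidth will be lower.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64 data bits per channel, ECC excluded


def peak_bandwidth_gbs(channels: int, gigatransfers_per_sec: float) -> float:
    """Peak theoretical memory bandwidth per socket in GB/sec."""
    return channels * gigatransfers_per_sec * BYTES_PER_TRANSFER


genoa_bw = peak_bandwidth_gbs(12, 4.8)  # assumption: Genoa runs DDR5-4800
turin_bw = peak_bandwidth_gbs(12, 6.4)  # twelve channels at 6.4 GT/sec

bandwidth_uplift = turin_bw / genoa_bw - 1
core_uplift = 128 / 96 - 1  # top-bin Turin vs. top-bin Genoa core counts

print(f"Genoa: {genoa_bw:.1f} GB/sec, Turin: {turin_bw:.1f} GB/sec")
print(f"Bandwidth uplift: {bandwidth_uplift:.1%}, core uplift: {core_uplift:.1%}")
```

Because the two uplifts come out the same, peak bandwidth per core at the top of the standard stack is roughly unchanged from Genoa to Turin.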
Today, two new variants of the Turin CPUs have been introduced, showcasing not only distinct cores but also different CCD configurations tailored for specific data center workloads.
The “scale up” Turins, which use Zen 5 CCDs, are shown on the left side of the accompanying chart. These models have sixteen CCDs, each with eight Zen 5 cores, for a total of 128 cores and 256 threads. The “scale out” Turins, by contrast, resemble the “Bergamo” series that ran parallel to the standard Genoa parts. These variants have twelve Zen 5c CCDs with sixteen cores per CCD, for 192 cores in total. That density is made possible by halving the L3 cache to 2 MB per core and reorganizing the CCD layout. Despite the different arrangements, the Zen 5 and Zen 5c cores are functionally identical. This approach differs from Intel’s strategy with Granite Rapids and Sierra Forest, where the former uses a conventional Xeon core known as a P-core and the latter uses a distinct, Atom-derived core called the E-core. Ultimately, the market will decide how much these distinctions matter.
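The two package layouts reduce to simple multiplication, sketched below. The `Package` class and its field names are ours, purely for illustration; the per-core L3 figures (4 MB for standard cores, 2 MB for “c” cores) are the ones given earlier in this article, and two threads per core is assumed for both variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical description of a top-bin Turin package."""
    name: str
    ccds: int
    cores_per_ccd: int
    l3_mb_per_core: int
    threads_per_core: int = 2  # assumption: SMT enabled on both variants

    @property
    def cores(self) -> int:
        return self.ccds * self.cores_per_ccd

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l3_mb(self) -> int:
        return self.cores * self.l3_mb_per_core

    @property
    def chiplets(self) -> int:
        return self.ccds + 1  # CCDs plus the single I/O die


scale_up = Package("Turin Zen 5", ccds=16, cores_per_ccd=8, l3_mb_per_core=4)
scale_out = Package("Turin Zen 5c", ccds=12, cores_per_ccd=16, l3_mb_per_core=2)

for pkg in (scale_up, scale_out):
    print(f"{pkg.name}: {pkg.chiplets} chiplets, {pkg.cores} cores, "
          f"{pkg.threads} threads, {pkg.l3_mb} MB L3")
```

Note that each CCD carries 32 MB of L3 in either layout; the Zen 5c package simply spreads it over twice as many cores per CCD.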
Following the pattern established with previous Epyc CPU ranges, AMD has developed standard Turin models for dual-socket servers, along with specialized versions identified by a P designation, designed for single-socket servers, which come with reasonable price cuts owing to their crimped NUMA circuits. There are also F variants of the Turins meant for high-performance applications (the F stands for frequency enhanced). It is anticipated that X variants will follow, likely around Q1 2025, coinciding with Intel’s CPU announcements. These X models are expected to include additional L3 cache to enhance performance in HPC and specific AI workloads that are sensitive to cache availability.
With that, here are the Zen 5 SKUs of Turin that have been released to date:
Introducing the Zen 5c SKUs of Turin, which boast an increased core count, enhanced throughput, and a more competitive price/performance ratio:
The advancements achieved by AMD since the introduction of the 45 nanometer Shanghai Opterons in April 2009, right in the heart of the Great Recession, are truly impressive and deserve recognition.
The Opteron 2387 represented a solid choice within the Shanghai lineup, which included only four SKUs. This processor featured four Shanghai cores running at a frequency of 2.8 GHz, without any boost clock speeds, and was equipped with 6 MB of L3 cache, all within a compact 75-watt thermal design point. When purchased in 1,000-unit trays, which is customary in the server market, its price was set at $873 each. (No, don’t expect a tray of them for $873. They’re not exactly a bag of chips…)
To assess relative performance, we calculate the chip’s clock speed, multiply it by its core count, and factor in its overall IPC improvement when compared to the Shanghai core.
The highest-performing processor in the Naples lineup, the Epyc 7601 featuring 32 cores operating at 2.2 GHz, exhibited an impressive performance improvement of 10.37 times. It achieved this at a cost of $405 for each unit of relative performance, with a total price tag of $4,200. The subsequent model in the Rome series, the Epyc 7742, which was tailored for high-performance computing workloads, offered even greater relative performance. This 64-core processor, functioning at 2.25 GHz, achieved a remarkable score of 24.40 and reduced the cost per unit of performance to $285. Following this, the 64-core Milan Epyc 7763, clocked at 2.45 GHz, reached a rating of 31.61, benefiting from microarchitecture and clock speed advancements rather than an expansion of core count, with the price per performance slightly declining to $250. The 96-core Epyc 9654, operating at 2.4 GHz, recorded a relative performance rating of 52.94, resulting in a cost of $223 per unit at a total price of $11,805.
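The relative performance method can be sketched as follows. The 1.65X IPC factor for Naples over Shanghai is our assumption, back-derived from the 10.37 rating above rather than stated anywhere; the later steps are the 15, 19, and 14 percent generational uplifts already cited. With those inputs the sketch lands within rounding of the ratings quoted here, but it should be read as an approximation of the method, not the exact spreadsheet.

```python
import math

# Baseline from the article: four-core "Shanghai" Opteron 2387 at 2.8 GHz.
BASE_CORES, BASE_GHZ = 4, 2.8

GENERATIONS = ["Naples", "Rome", "Milan", "Genoa"]
# Per-generation integer IPC steps. The first entry (Naples vs. Shanghai) is
# an ASSUMPTION back-derived from the 10.37 rating; the rest are the
# generational uplifts quoted in the article.
STEP_UPLIFT = [1.65, 1.15, 1.19, 1.14]


def ipc_vs_shanghai(generation: str) -> float:
    """Compound the IPC steps up to and including the given generation."""
    idx = GENERATIONS.index(generation)
    return math.prod(STEP_UPLIFT[: idx + 1])


def relative_performance(cores: int, ghz: float, generation: str) -> float:
    """Clock x cores x IPC factor, normalized to the Opteron 2387."""
    return (cores * ghz) / (BASE_CORES * BASE_GHZ) * ipc_vs_shanghai(generation)


# (name, generation, cores, base GHz, list price in USD or None if not quoted)
CHIPS = [
    ("Epyc 7601", "Naples", 32, 2.2, 4200),
    ("Epyc 7742", "Rome", 64, 2.25, None),
    ("Epyc 7763", "Milan", 64, 2.45, None),
    ("Epyc 9654", "Genoa", 96, 2.4, 11805),
]

for name, gen, cores, ghz, price in CHIPS:
    perf = relative_performance(cores, ghz, gen)
    cost = f"${price / perf:,.0f} per unit" if price is not None else "price not quoted"
    print(f"{name}: {perf:.2f}x the Opteron 2387, {cost}")
```

Only base clocks are used, so the rating is a throughput proxy that ignores boost behavior, cache sizes, and memory bandwidth.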
It’s important to understand that while enhancing performance is a relatively straightforward task, reducing the price-to-performance ratio proves to be more challenging. Moreover, increasing core count tends to facilitate performance boosts more effectively than enhancing clock speed, largely due to thermal limitations.
With advancements in the Turin architecture, the latest top-tier Epyc model, the Epyc 9755, boasts an impressive 128 cores running at 2.7 GHz. This processor delivers a striking relative performance rating of 92.93, priced at $12,984, translating to $140 per unit of performance. AMD has notably made significant progress in enhancing value for performance with this release.
To put this into perspective, the Epyc 9755 delivers 92.93 times the performance of the reference Shanghai Opteron 2387 at 14.9 times the price and 6.7 times the power consumption, which works out to 6.25 times better value over a span of more than fifteen years.
The Zen 5c variants of Turin elevate the performance and value proposition even further. The Epyc 9965 features 192 cores running at 2.25 GHz, boasting a relative performance of 116.17 and a price of $14,813, resulting in a price/performance ratio of $128 per unit of performance. This configuration achieves 25 percent more peak theoretical integer throughput performance compared to the Epyc 9755, while also delivering 8.7 percent better value for money.
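The comparison between the two top-bin parts is easy to reproduce from the ratings and prices quoted above. The helper name `dollars_per_perf` is ours; the sketch simply divides list price by relative performance and compares the results.

```python
def dollars_per_perf(price_usd: float, rel_perf: float) -> float:
    """List price divided by relative performance rating."""
    return price_usd / rel_perf


# Ratings and 1,000-unit prices as quoted in the article.
zen5 = {"name": "Epyc 9755", "rel_perf": 92.93, "price": 12984}
zen5c = {"name": "Epyc 9965", "rel_perf": 116.17, "price": 14813}

zen5_cost = dollars_per_perf(zen5["price"], zen5["rel_perf"])
zen5c_cost = dollars_per_perf(zen5c["price"], zen5c["rel_perf"])

throughput_gain = zen5c["rel_perf"] / zen5["rel_perf"] - 1
value_gain = 1 - zen5c_cost / zen5_cost

print(f"{zen5['name']}: ${zen5_cost:.0f} per unit of performance")
print(f"{zen5c['name']}: ${zen5c_cost:.0f} per unit of performance")
print(f"Throughput gain: {throughput_gain:.1%}, value gain: {value_gain:.1%}")
```

These are peak theoretical integer figures, so a cache-sensitive workload may see a smaller gain from the Zen 5c part than the ratio suggests.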
Before opting for a Zen 5c variant over a Zen 5, it’s crucial to understand how cache-sensitive your workloads are. A thorough examination of the entire SKU lineup is necessary to align the workload with the appropriate SKU. Premium pricing is applicable for high serial performance, as demonstrated in the tables. Similarly, there is a premium associated with higher throughput, which can be explained by the yield on chiplets and is justifiable.
This discussion will not delve into direct comparisons of AMD’s Zen 5 and Zen 5c Turins against Intel’s Granite Rapids and Sierra Forest. However, a relative comparison within each vendor’s lineup is relevant at this moment.
First, note that while Intel’s Sierra Forest models have a higher core count, they deliver significantly lower performance at lower prices, with better price/performance than the Granite Rapids chips. Specifically, the 144-core Xeon 6780E has 24 percent lower throughput than the 128-core Xeon 6980P, yet the price/performance improves by 16 percent for these top-tier models. As previously mentioned, the Zen 5c Epyc 9965, with its 192 cores, does 25 percent more work at an 8.7 percent lower cost per unit of performance than the Zen 5 Epyc 9755, which has 128 cores.
That is a substantial difference in strategy.
Next, let’s examine the comparative performance advancements for Intel during the period from 2009 to 2024. The benchmark server CPU we consider for assessing relative performance is the 45-nanometer “Nehalem” Xeon E5540, released in March 2009 at the height of the Great Recession. This is a quad-core processor operating at 2.53 GHz, featuring 8 MB of L3 cache, consuming 80 watts, and priced at $744 for 1,000-unit quantities. Intel has achieved a remarkable 62-fold increase in performance from the reference Xeon E5540 to the leading Xeon 6 6980P, while power consumption has risen by 6.25 times to 500 watts, the cost has escalated by 23.9 times to $17,800, and the price/performance ratio has improved by 2.6 times. In contrast, AMD has amplified performance for standard Turin parts by a staggering 92.93 times, with power consumption increasing by 6.7 times, prices climbing by 14.9 times, and price/performance enhancing by 6.25 times.
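The long-run multiples for both vendors follow from the baseline and top-bin figures quoted in this article, as the sketch below shows. The 500 watt rating for the Epyc 9755 is our assumption, inferred from the 6.7X power multiple over a 75 watt baseline; the other inputs are taken directly from the text. The function and key names are ours.

```python
def long_run_multiples(perf_gain: float, base_price: float, top_price: float,
                       base_watts: float, top_watts: float) -> dict:
    """Price, power, and value multiples of a top-bin part over its baseline."""
    price_multiple = top_price / base_price
    return {
        "performance": perf_gain,
        "price": price_multiple,
        "power": top_watts / base_watts,
        # Value gain: how much more performance each dollar buys.
        "value": perf_gain / price_multiple,
    }


# AMD: Opteron 2387 ($873, 75 W) to Epyc 9755 ($12,984; 500 W is an assumption).
amd = long_run_multiples(92.93, 873, 12984, 75, 500)
# Intel: Xeon E5540 ($744, 80 W) to Xeon 6 6980P ($17,800, 500 W).
intel = long_run_multiples(62.0, 744, 17800, 80, 500)

for vendor, m in (("AMD", amd), ("Intel", intel)):
    print(f"{vendor}: {m['performance']:.2f}x perf, {m['price']:.1f}x price, "
          f"{m['power']:.2f}x power, {m['value']:.2f}x value")
```

The two baselines are different chips with different ratings, so the multiples describe each vendor’s own trajectory rather than a head-to-head comparison.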
Keep an eye out for an in-depth exploration of the Turin architecture and the competitive landscape.
Timothy, you overlooked a key opportunity: “AMD TURINS THE SCREWS…”
No, that’s where I began! However, I opted for the rhymes instead.
Excellent analysis! These EPYC Zen Torinos indeed establish a new benchmark when it comes to price/performance value. If the relative pricing had remained consistent at the Rome-Milan-Genoa level (approximately $250 / Rel Perf), the narrative wouldn’t be as compelling, as it would show a 3.5X price/performance improvement compared to Intel’s 2.6X. However, Turin truly advances that threshold to an astonishing 6.25X, marking what is now an impressive 15-year journey (a significant shift!).
It’s also remarkable how they elevated the base clock speed from Milan to Genoa (for example, from 2.4 GHz to 3.1 GHz in 64C), and then improved the boost clock from Genoa to Turin (for instance, from 3.7 GHz to 5.0 GHz). Frontier’s CPUs currently employ the Milan 7713 64C 3rd generation running at 2.0 GHz, and I can only speculate how much better performance could be achieved with Turins instead, along with matching GPUs that are faster, more power-efficient, and less costly.
Next on the horizon should be MRDIMM, PCIe 6.0, and CXL 3.0!