AMD EPYC 9754S Review – A CPU With a Very Particular Set of Skills

The AMD EPYC 9754S is designed for HPC workloads with SMT disabled, delivering 128 cores and 128 threads with a default TDP of 360W.

Last year, AMD expanded its server CPU line with 4th Gen EPYC. While the 128-core, 256-thread EPYC 9754 took top billing, just under it on the SKU matrix is the AMD EPYC 9754S. The difference between the two chips is simple yet dramatic: the 9754S has Simultaneous Multithreading (SMT) disabled. It delivers the same 128 cores as the 9754 but exposes 128 threads instead of 256. In exchange, customers who already disable SMT get a nice discount.

| Model | Cores | Max Threads | Default TDP | Base Freq. (GHz) | Boost Freq. (GHz) | L3 Cache (MB) |
|---|---|---|---|---|---|---|
| 9754 | 128 | 256 | 360W | 2.25 | 3.10 | 256 |
| 9754S | 128 | 128 | 360W | 2.25 | 3.10 | 256 |
| 9734 | 112 | 224 | 320W | 2.2 | 3.0 | 256 |

What Is AMD SMT and Why Does the 9754S Exist?

With SMT, a single EPYC CPU core can process two threads simultaneously, which can lead to more efficient use of the processor’s resources. When one thread is waiting for data to be loaded from memory or is otherwise stalled, the other thread can be executing instructions. This means the core spends less time idle, potentially improving performance. This is especially true in use cases like virtualization and rendering.
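To make the idle-time argument concrete, here is a deliberately simplified sketch. It assumes each hardware thread stalls independently for a fixed fraction of the time and that threads never compete for cache or execution units. Neither assumption holds on real silicon, and the function name and stall fractions are illustrative, not measured.

```python
def core_busy_fraction(stall_fraction: float, threads_per_core: int) -> float:
    """Fraction of time at least one hardware thread has work ready.

    Toy assumption: every thread stalls independently with probability
    `stall_fraction`, and threads do not slow each other down.
    """
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    # The core idles only when all of its threads are stalled at once.
    return 1.0 - stall_fraction ** threads_per_core


# Hypothetical stall fractions, chosen only to show the trend.
for stall in (0.1, 0.3, 0.5):
    one = core_busy_fraction(stall, 1)
    two = core_busy_fraction(stall, 2)
    print(f"stall={stall:.0%}  1 thread: {one:.0%} busy  2 threads: {two:.0%} busy")
```

In this model, the more often a thread stalls, the more a second thread helps. A workload that rarely stalls leaves little idle time for SMT to reclaim.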

Selling a part with SMT disabled lets a manufacturer position it as a lower-tier product that still meets specific performance and stability criteria. Such SKUs can be the result of binning processes, market segmentation strategies, and the desire to cater to specific performance or efficiency needs, which shows the nuanced approach manufacturers take in product planning and positioning.

That said, not every workload benefits from SMT, and AMD servers often run with SMT disabled in the BIOS. While that can be an effective tweak, it brings up another important point: the 9754S, which ships with SMT disabled, is a little less expensive than the 9754. Either way, single-threaded applications, computational workloads, and any use cases where CPU latency is critically important can benefit from having SMT disabled.
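For readers who want to confirm how a Linux host is currently configured, the kernel exposes SMT state under `/sys/devices/system/cpu/smt`. The sketch below reads the `active` and `control` files from that directory. The helper name `read_smt_state` is our own, and what these files report on a 9754S specifically is not something this review measured.

```python
from pathlib import Path

# Standard sysfs location on recent Linux kernels.
SMT_SYSFS_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(sysfs_dir: Path = SMT_SYSFS_DIR) -> dict:
    """Return the kernel's SMT 'active' and 'control' values, or None if absent."""
    state = {}
    for name in ("active", "control"):
        try:
            state[name] = (sysfs_dir / name).read_text().strip()
        except OSError:
            # File missing or unreadable: non-Linux host or older kernel.
            state[name] = None
    return state


if __name__ == "__main__":
    print(read_smt_state())
```

On a typical Linux server, `active` reads `1` when sibling threads are online and `0` otherwise, while `control` holds a value such as `on`, `off`, or `notsupported`.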

AMD EPYC 9754S vs EPYC 9754 Performance

We pulled two of our regular tests, y-cruncher and Cinebench 2024, to see what performance differences we get with and without SMT. We ran the 9754S against the 9754, testing the 9754 with SMT both on and off, to see what advantages the 9754S has without SMT at all.

Test Platform and Specs:

Cinebench 2024

First up is Cinebench 2024, with SMT enabled on our non-S model. Here, the two chips land within run-to-run variation of each other.

| Cinebench 2024 CPU | 2x EPYC 9754S | 2x EPYC 9754 |
|---|---|---|
| CPU Multi-Core | 2,682 | 2,587 |
| CPU Single-Core | 68 | 69 |
| MP Ratio | 39.19x | 37.64x |
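As a quick sanity check on these numbers, the sketch below computes the multi-core gap between the two configurations and recomputes the multi-to-single ratio from the scores as printed. The recomputed ratios land close to, but not exactly on, the reported MP Ratio values. One plausible explanation is that Cinebench derives its ratio from unrounded scores, but that is an assumption on our part.

```python
# Scores as printed in the Cinebench 2024 table above.
scores = {
    "2x EPYC 9754S": {"multi": 2682, "single": 68},
    "2x EPYC 9754": {"multi": 2587, "single": 69},
}


def percent_gap(a: float, b: float) -> float:
    """Percentage by which a exceeds b."""
    return (a - b) / b * 100.0


multi_gap = percent_gap(
    scores["2x EPYC 9754S"]["multi"], scores["2x EPYC 9754"]["multi"]
)
print(f"Multi-core gap: {multi_gap:.2f}%")

for name, s in scores.items():
    # Ratio rebuilt from the rounded scores; may differ from the reported MP Ratio.
    print(f"{name}: multi/single = {s['multi'] / s['single']:.2f}x")
```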

We selected y-cruncher specifically because of the program’s architecture, which positions it as a total system test. By performing as large a Pi calculation as will fit into system memory, we aimed to prove our long-standing intuition that SMT can negatively impact CPU- and memory-bound workloads. Let’s take a look at the results first before diving into what it all means.

y-cruncher 0.8.3

y-cruncher 0.8.3 Total Computation Time in seconds (lower is better)

| Digits | 2x EPYC 9754S | 2x EPYC 9754 (SMT Off) | 2x EPYC 9754 (SMT On) | 9754S Time Reduction vs. 9754 (SMT On) |
|---|---|---|---|---|
| 1 Billion | 13.481 | 13.546 | 14.139 | 4.65% |
| 2.5 Billion | 23.818 | 24.144 | 28.111 | 15.27% |
| 5 Billion | 40.760 | 40.797 | 49.271 | 17.27% |
| 10 Billion | 77.409 | 77.959 | 95.420 | 18.88% |
| 25 Billion | 203.303 | 202.124 | 233.629 | 12.98% |
| 50 Billion | 475.557 | 476.949 | 520.349 | 8.61% |
| 100 Billion | 1,248.458 | 1,251.36 | 1,242.419 | -0.49% |

y-cruncher 0.8.4

y-cruncher 0.8.4 Total Computation Time in seconds (lower is better)

| Digits | 2x EPYC 9754S | 2x EPYC 9754 (SMT Off) | 2x EPYC 9754 (SMT On) | 9754S Time Reduction vs. 9754 (SMT On) |
|---|---|---|---|---|
| 1 Billion | 13.480 | 13.56 | 14.573 | 7.50% |
| 2.5 Billion | 23.680 | 23.501 | 28.649 | 17.34% |
| 5 Billion | 40.819 | 40.547 | 50.082 | 18.50% |
| 10 Billion | 78.523 | 77.466 | 93.842 | 16.32% |
| 25 Billion | 206.399 | 206.078 | 236.070 | 12.57% |
| 50 Billion | 483.797 | 482.79 | 521.867 | 7.29% |
| 100 Billion | 1,269.484 | 1,266.83 | 1,253.446 | -1.28% |
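The final column in both y-cruncher tables matches the reduction in run time of the 9754S relative to the 9754 with SMT on, expressed as a share of the SMT-on time. That reading is inferred from the published figures rather than stated outright. The sketch below recomputes it for the 0.8.4 run, and the function name is ours.

```python
# (digits, 9754S seconds, 9754 SMT-on seconds) from the y-cruncher 0.8.4 table.
rows = [
    ("1 Billion", 13.480, 14.573),
    ("2.5 Billion", 23.680, 28.649),
    ("5 Billion", 40.819, 50.082),
    ("10 Billion", 78.523, 93.842),
    ("25 Billion", 206.399, 236.070),
    ("50 Billion", 483.797, 521.867),
    ("100 Billion", 1269.484, 1253.446),
]


def time_reduction_pct(smt_off_s: float, smt_on_s: float) -> float:
    """Run-time reduction as a percentage of the SMT-on time.

    Positive means the SMT-off part finished sooner; negative means it was slower.
    """
    return (smt_on_s - smt_off_s) / smt_on_s * 100.0


for digits, off_s, on_s in rows:
    print(f"{digits:>12}: {time_reduction_pct(off_s, on_s):6.2f}%")
```

Because the denominator is the SMT-on time, the figure reads as "how much shorter the run is without SMT," not as a speedup factor.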

Results Analysis

There is an ongoing debate within the tech community about what AMD SMT means for system performance. On the surface, SMT appears to be a straightforward choice for those in pursuit of enhanced performance. The theory goes: if enabling SMT can lead to ideal scaling, then why not embrace it as a beneficial architectural choice?

The relationship between SMT efficiency and core architecture isn’t black and white. Lackluster SMT scaling doesn’t necessarily point to a flaw in its implementation. In fact, it could hint at a robust core design that hardly leaves room for SMT to make a noticeable difference. This paradox underscores a crucial industry insight: processor manufacturers can’t claim a one-size-fits-all benefit with SMT or similar technologies. They acknowledge that while SMT can squeeze out additional performance in certain use cases, it’s not without its shortcomings in other scenarios.

Through the lens of high-performance computing and supercomputing tasks, the limitations of SMT become more apparent. While the idea of doubling the thread count per core might sound promising, the reality is not akin to having double the cores. In extreme cases, this can lead to performance dips as threads vie for cache resources. Nonetheless, for the majority of multi-threaded applications, especially those devoid of cache competition, SMT lifts performance, primarily shining in tasks that can fully leverage its potential.
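To give a rough sense of the cache pressure involved, the sketch below divides the per-socket L3 capacity from the spec table evenly across hardware threads. This is a naive average: real L3 is shared among groups of cores rather than partitioned per thread, so treat the output as an illustration only.

```python
L3_CACHE_MB = 256  # per socket, from the spec table
CORES = 128        # per socket, from the spec table


def avg_l3_per_thread_mb(l3_mb: float, cores: int, threads_per_core: int) -> float:
    """Naive average L3 share per hardware thread, assuming an even split."""
    return l3_mb / (cores * threads_per_core)


for threads_per_core, label in ((1, "SMT off"), (2, "SMT on")):
    share = avg_l3_per_thread_mb(L3_CACHE_MB, CORES, threads_per_core)
    print(f"{label}: {share:.1f} MB of L3 per thread")
```

Doubling the thread count halves the average share, which is one way to picture why cache-sensitive workloads can lose ground with SMT enabled.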

Closing Thoughts

AMD SMT is incredibly useful for a wide variety of workloads that are common in the enterprise, but not every workload needs or benefits from it. Through our testing, we have shown how AMD is able to take advantage of variations in manufacturing to deliver a solid product with a unique value proposition. Organizations designing platforms for workloads that need pure cores without SMT can save a little money by buying the AMD EPYC 9754S, which has SMT permanently disabled at the factory.

Jordan Ranous

AI Specialist; navigating you through the world of Enterprise AI. Writer and Analyst for Storage Review, coming from a background of Financial Big Data Analytics, Datacenter Ops/DevOps, and CX Analytics. Pilot, Astrophotographer, LTO Tape Guru, and Battery/Solar Enthusiast.
