AMD Advances Performance-Per-Watt to Benefit Gamers
By: Sam Naffziger, Senior Vice President, Corporate Fellow, and Product Technology Architect, AMD
The demand for immersive, realistic gaming experiences is constantly pushing the boundaries of technology, driving enhancements to support features like variable rate shading, raytracing, and advanced upscaling technologies. These improved experiences require relentless performance improvements from continuous advancements in silicon design and architecture, which in turn drive higher power consumption in the absence of reductions from Moore’s Law or other approaches.
Specifically, high-end graphics card power has quickly climbed to 400 watts and beyond. At the same time, power consumption is rapidly becoming a major concern for gamers: energy prices are skyrocketing worldwide, and higher power means users must contend with ever-increasing heat and with louder systems, since more heat calls for larger cooling solutions.
The basics of silicon design are such that driving up performance through additional features and higher frequency can increase power super-linearly with performance. But with thoughtful and innovative engineering and refinement, we can be more performant without sky-high power budgets – keeping overall costs, heat, and noise lower while still delivering breakthrough experiences and performance.
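One way to see why power can grow faster than performance is the standard first-order model of dynamic switching power, P ≈ C·V²·f. The sketch below is illustrative only: it assumes voltage has to rise in proportion to clock frequency, which is a simplification and not a description of any AMD product, and the function name is hypothetical.

```python
def relative_dynamic_power(freq_scale: float, voltage_tracks_frequency: bool = True) -> float:
    """Relative dynamic power for a relative clock change, using P ~ C * V^2 * f.

    Assumption (not from the article): when voltage_tracks_frequency is True,
    supply voltage scales linearly with frequency. Capacitance C is held constant.
    """
    voltage_scale = freq_scale if voltage_tracks_frequency else 1.0
    return voltage_scale ** 2 * freq_scale


# Compare a few clock increases against the baseline (1.0x).
for scale in (1.0, 1.1, 1.2):
    print(f"{scale:.1f}x clock -> {relative_dynamic_power(scale):.2f}x power")
```

Under that assumption, a 20 percent higher clock costs roughly 73 percent more dynamic power, which is why holding voltage down matters so much for efficiency.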
A while back at AMD, we rethought and transformed our core architectures from the ground up, and in doing so, we made several large bets to improve design and energy efficiency. These bets are now paying off in meaningful ways across all our product lines in the form of leadership performance per watt, quieter, lighter systems, potentially lower energy bills, and smaller physical footprints. For example, AMD now leads efficiency in the supercomputer space with four out of the top five most efficient supercomputers.
Leadership Performance-Per-Watt in Gaming
AMD continues to prioritize the most efficient and powerful silicon design to deliver leading performance in graphics and gaming.
The last three generations of AMD Radeon™ graphics cards have seen incredible improvements in performance per watt. In 2019, when the AMD RDNA™ architecture was introduced with the 7nm-based Radeon™ RX 5000 Series GPUs, AMD delivered up to a 50 percent performance-per-watt improvement over the prior GCN architecture. As one example, this equates to up to 50 percent higher framerates in Division 2 at the same power[1].
In 2020, when the AMD RDNA™ 2 architecture was released to power Radeon™ RX 6000 Series graphics cards, we delivered up to a whopping 65 percent better performance per watt than the Radeon™ RX 5000 Series graphics cards on the same 7nm technology[2], by innovating in architecture and silicon design. This placed the Radeon™ RX 6000 Series in a highly competitive performance-per-watt position across the stack, and once again showed AMD’s commitment to powerful yet efficient computing.
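For readers who want to reproduce this kind of comparison, performance per watt is simply average frames per second divided by board power. The sketch below uses made-up numbers for two hypothetical cards, not AMD measurements, and the function names are illustrative.

```python
def perf_per_watt(avg_fps: float, board_power_watts: float) -> float:
    """Average frames per second delivered per watt of board power."""
    if board_power_watts <= 0:
        raise ValueError("board power must be positive")
    return avg_fps / board_power_watts


def efficiency_gain(old_fps: float, old_watts: float, new_fps: float, new_watts: float) -> float:
    """Ratio of new to old performance per watt (1.0 means no change)."""
    return perf_per_watt(new_fps, new_watts) / perf_per_watt(old_fps, old_watts)


# Hypothetical cards: same 200 W budget, 60 fps versus 99 fps on average.
gain = efficiency_gain(old_fps=60, old_watts=200, new_fps=99, new_watts=200)
print(f"Performance-per-watt gain: {(gain - 1) * 100:.0f} percent")
```

At equal power, the gain in performance per watt is just the gain in framerate. When the power budgets differ, both terms matter.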
The Importance of Performance Per Watt
What does the best performance per watt mean for gamers? A more efficient card produces less heat and consumes less power while delivering high performance over long periods of time. There can be an operating cost benefit as well, since consuming less power for the same performance reduces one’s electricity bill and carbon footprint.
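As a rough illustration of the cost side, the sketch below estimates yearly savings from a lower power draw. Every input is a placeholder to be replaced with your own figures: the wattage difference, the hours played, and the electricity price are all assumptions, not values from AMD.

```python
def annual_energy_savings(watts_saved: float, hours_per_day: float, price_per_kwh: float) -> float:
    """Estimated yearly savings, in the currency of price_per_kwh."""
    kwh_per_year = watts_saved / 1000 * hours_per_day * 365
    return kwh_per_year * price_per_kwh


# Hypothetical inputs: 100 W lower draw, 3 hours of gaming a day, 0.30 per kWh.
savings = annual_energy_savings(watts_saved=100, hours_per_day=3, price_per_kwh=0.30)
print(f"Estimated savings per year: {savings:.2f}")
```

With these placeholder inputs the estimate is 109.5 kWh, or about 32.85 per year. The result scales linearly with each input.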
Building Power-Efficient Designs Is in Our DNA
As the only company developing high-performance CPUs and GPUs today, AMD is uniquely positioned to share learnings across our central engineering teams and to apply the best IP across our product portfolio. For example, following the success of the AMD Ryzen™ desktop and mobile processors, which offer incredible performance and efficiency, engineering teams collaborated to apply key learnings from our “Zen” CPU development to our Radeon graphics architecture, making RDNA™ 2 an extremely efficient GPU architecture.
We incorporated some of the “Zen” CPU micro-architecture approaches and design methodologies into our graphics pipeline, streamlining the physical makeup of the die to make higher frequencies possible. For example, we leveraged the dense CPU L3 memories to implement AMD Infinity Cache™, a high-density, low-power cache, to make frequently used data in gaming workloads more easily accessible, dramatically increasing bandwidth while reducing the power needed for memory and cutting latency[3].
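The bandwidth figure behind footnote [3] can be reconstructed from the numbers given there: an average hit rate multiplied by the theoretical peak bandwidth of the fabric channels. The sketch below follows that recipe and assumes each channel moves one 64-byte transfer per clock, which is implied by the footnote but not stated outright. The function name is hypothetical.

```python
def effective_cache_bandwidth_gbps(
    hit_rate: float, channels: int, bytes_per_channel: int, clock_ghz: float
) -> float:
    """Hit rate times theoretical peak bandwidth, in GB/s (1 GB = 1e9 bytes).

    Assumption: every channel transfers bytes_per_channel bytes each clock cycle.
    """
    peak_gbps = channels * bytes_per_channel * clock_ghz
    return hit_rate * peak_gbps


# Figures taken from footnote [3]: 58% hit rate, 16 channels of 64 B, 1.94 GHz boost.
bandwidth = effective_cache_bandwidth_gbps(0.58, 16, 64, 1.94)
print(f"Effective Infinity Cache bandwidth: about {bandwidth:.0f} GB/s")
```

With the footnote's inputs this works out to roughly 1,152 GB/s, against a theoretical peak of about 1,987 GB/s.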
To further refine and improve the AMD RDNA™ 2 graphics architecture and deliver greater efficiency and performance gains, the team implemented several other key changes, including:
- Optimized Switching – Improving the fundamental design of the architecture to ensure every gate switched and every clock toggled directly contributed to performance, thus removing any wasted activity and excess routing to optimize the graphics pipeline.
- High-Frequency Design – Tuning the design for high clock speeds, pushing AMD RDNA™ 2 frequencies up to 30 percent beyond AMD RDNA™. The same tuning lets the GPU hold a given clock speed at a lower voltage, and therefore at lower power[4].
- Smarter Power Management – Implementing intelligent power management within the GPU, which identifies the best opportunities to exploit higher frequencies, raises clocks only when doing so directly improves performance, and then scales back down, thus eliminating excess energy use.
What’s Next?
Looking ahead, we’re continuing our push for more efficient gaming with AMD RDNA™ 3 architecture. As the first AMD graphics architecture to leverage the 5nm process and advanced chiplet packaging technology, AMD RDNA™ 3 delivers up to 50 percent better performance per watt than AMD RDNA™ 2 architecture[5] – truly bringing top-of-the-line gaming performance to gamers in cool, quiet, and energy-conscious designs.
Contributing to this energy-conscious design, AMD RDNA™ 3 refines the AMD RDNA™ 2 adaptive power management technology to set workload-specific operating points, ensuring each component of the GPU uses only the power it requires for optimal performance. The new architecture also introduces a new generation of AMD Infinity Cache™, projected to offer even higher-density, lower-power caches to reduce the power needs of graphics memory, helping to cement AMD RDNA™ 3 and Radeon™ graphics as a true leader in efficiency.
We’re thrilled with the improvements we’re making with AMD RDNA™ 3 and its predecessors, and we believe there’s even more to be pulled from our architectures and advanced process technologies, delivering unmatched performance per watt across the stack as we continue our push for better gaming.
Footnotes
AMD is not responsible for the contents of third-party sites and no endorsement is implied. GD-5
[1] RDNA provides an average of 1.54x the performance per watt over GCN in the game, Division 2. Testing done by AMD performance labs 5/23/19, using the Division 2 @ 25x14 Ultra settings. System configuration: GIGABYTE Z390 AORUS ELITE, Intel Core i7-9700K CPU, 16GB DDR4, Win 10 Pro. Performance may vary based on use of latest drivers. Laptop manufacturers may vary configurations, yielding different results. RX-325
[2] Testing done by AMD performance labs 10/21/20, measuring the individual FPS scores and calculating an average FPS score across the following titles: Assassin's Creed Odyssey (DX11, Ultra), Battlefield V (DX12, Ultra), Borderlands 3 (DX12, Ultra), Control (DX12, High), Death Stranding (DX12, Ultra), Division 2 (DX12, Ultra), F1 2020 (DX12, Ultra), Far Cry 5 (DX11, Ultra), Gears of War 5 (DX12, Ultra), Hitman 2 (DX12, Ultra), Horizon Zero Dawn (DX12, Ultra), Metro Exodus (DX12, Ultra), Resident Evil 3 (DX12, Ultra), Shadow of the Tomb Raider (DX12, Highest), Strange Brigade (DX12, Ultra), Total War Three Kingdoms (DX11, Ultra), Witcher 3 (DX11, Ultra no HairWorks) at 4K. Test systems configured with a Core i9-9900K CPU, Radeon(TM) RX 6900 XT GPU with AMD Radeon(TM) Graphics driver 27.20.12031.1000, 32GB memory, and Win 10 vs. a similarly configured system with a Radeon(TM) RX 5700 XT GPU and AMD Radeon(TM) Graphics driver 26.20.13001.9005. Performance per watt calculated by dividing the average FPS score taken across all titles by the TBP of each GPU. Laptop manufacturers may vary configurations, yielding different results. Performance may vary. RX-554.
[3] Measurements calculated by AMD engineering, on a Radeon RX 6000 Series graphics card with 128 MB AMD Infinity Cache and 256-bit GDDR6. Measuring 4K gaming average AMD Infinity Cache hit rates of 58% across top gaming titles, multiplied by theoretical peak bandwidth from the 16 64B AMD Infinity Fabric channels connecting the Cache to the Graphics Engine at boost frequency of up to 1.94 GHz. RX-535.
[4] Based on October 2020 AMD engineering internal modeling of graphics-engine-only average 3DMark 11 power consumption vs. frequency of the Radeon RX 5700 XT and Radeon RX 6900 XT GPUs, divided by the number of compute units (40 and 80 respectively). RX-536.