DDR5 Overclocking – Latency and Performance Deep Dive
On the previous page I undervolted the CPU to 1.17v, but from here on all results show the CPU using only 1.11v. The Efficient Cores continue to run at 4GHz, but now with the DDR5 RAM overclocked. A few people have called me out for jumping on the DDR5 train so early, but coming from DDR3 I had absolutely nothing to lose. I am not 100% sure what the safe voltages are for DDR5, so I am being conservative with the voltage and with how far I push the frequencies. My stock DDR5 frequency is 4800MHz (40-39-39-76). I kept the timings the same (40-39-39-76) across all frequencies and managed to overclock the DRAM to 5000MHz, 5400MHz and 5600MHz. DDR5 4800MHz, 5000MHz and 5600MHz are the main results that I will be focusing on, which will show the performance scaling across multiple DDR5 frequencies. The goal is to lower the latency, undervolt the CPU, lower temperatures, lower wattage and overclock the Efficient Cores to 4.0GHz; from that point I will take another deep dive into the micro-architecture. I also included my results with the stock voltages and stock frequencies for comparison. The ultimate goal remains the same: to make Intel’s 12th Gen Alder Lake as efficient as possible.
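Holding the timings at 40-39-39-76 while raising the frequency shortens each clock cycle, so the same CAS value takes less real time. The minimal sketch below works that out. It assumes the quoted frequencies are data rates in MT/s (so the memory clock is half that figure), and the function name is my own for illustration. It only covers the CAS portion of a memory access, not the full latency that the benchmarks measure.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the clock in MHz is
    data_rate / 2 and one cycle lasts 2000 / data_rate nanoseconds.
    ASSUMPTION: the frequencies quoted in the article are data rates (MT/s).
    """
    return cl_cycles * 2000.0 / data_rate_mts


# Same CL40 timing at each frequency tested in the article.
for rate in (4800, 5000, 5400, 5600):
    print(f"DDR5-{rate} CL40: {cas_latency_ns(40, rate):.2f} ns")
```

With CL fixed at 40, the CAS time falls as the data rate rises, which is one reason latency can improve even though the timings were never tightened.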
DDR5 Overclocking - Latency Deep Dive
The Latency Chart above shows 3 different results for the DDR5 4800MHz, 5000MHz and 5600MHz benchmarks. We see that the measured DRAM Bandwidth scales nearly perfectly with the theoretical DRAM bandwidth. From 4800MHz to 5600MHz I was able to increase my DRAM Bandwidth by 18%, from 75.7 GB/s to 89.4 GB/s. The small-dataset benchmarks show a decrease from 66.43ns to 57.07ns, which is a 14% decrease and probably explains why everything feels “snappier” as I am navigating around the operating system. Most applications won’t keep a ton of data in DRAM, but this depends entirely on your workloads and the types of apps you need to run. Moving down to the larger datasets shows a very nice decrease from 75.40ns to 68.45ns (the lowest test recorded showed 66.2ns), which comes out to 9% on average. That is with the same default timings, “40-39-39-76”. When the processor needs to access larger files in the DRAM we should see some sizable increases in performance. I will be testing this later in this article during my second round of benchmark testing and uarch deep-dive.
Going deeper into the micro-architecture, I decided to include the Performance Core Latency and the Efficient Core Latency. Instead of showing each individual P and E Core for each DDR5 frequency, I decided to combine them into two groups and show the overall latency average for each. Starting with the Performance Core Latency, the average of all individual P-Cores came out to 59.10ns with DDR5-5600MHz. That is a drop of 7.09ns, which is a sizable drop for per-core latency to DRAM. The Efficient Core Latency to DRAM only dropped by 3.55ns, which is still a pretty good drop. In my first Alder Lake-S Review it was revealed that the Efficient Cores could access cache memory and DRAM much quicker than the Performance Cores. The lower voltage, 4GHz overclock and DDR5 overclocked to 5600MHz make the Atom Cores even more efficient.
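The percentages quoted for the bandwidth and latency results can be re-derived from the raw numbers, and the measured bandwidth can be compared against a theoretical peak. The sketch below does both. The 16-byte bus width is my assumption of a dual-channel setup (2 x 64-bit), which the article does not state, and the function names are my own for illustration.

```python
def peak_bandwidth_gbs(data_rate_mts: float, bus_bytes: int = 16) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    bus_bytes=16 ASSUMES a dual-channel setup (2 x 64-bit = 128 bits per
    transfer); the article does not state the memory configuration.
    """
    return data_rate_mts * 1e6 * bus_bytes / 1e9


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means a decrease)."""
    return (new - old) / old * 100.0


# Measured bandwidth figures quoted in the article (GB/s).
measured = {4800: 75.7, 5600: 89.4}

for rate, bw in measured.items():
    peak = peak_bandwidth_gbs(rate)
    print(f"DDR5-{rate}: measured {bw} GB/s, peak {peak:.1f} GB/s, "
          f"{bw / peak:.1%} of peak")

print(f"Bandwidth gain:        {pct_change(75.7, 89.4):+.1f}%")
print(f"Small-dataset latency: {pct_change(66.43, 57.07):+.1f}%")
print(f"Large-dataset latency: {pct_change(75.40, 68.45):+.1f}%")
```

Under the dual-channel assumption the theoretical peaks are 76.8 GB/s and 89.6 GB/s, so both measured figures sit within about 2% of peak.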
All 'undervolt (1.11v)' results below are using E-Cores @ 4GHz + DDR5-5600MHz
DDR5 Overclocking – All P & E Cores Deep Dive
Earlier in this article we focused on the Efficient Cores @ 4GHz, since I did not overclock the Performance Cores. Now that I have overclocked my DDR5 DRAM to 5600 MT/s, we will focus on both the Performance Cores and the Efficient Cores (4GHz). This benchmark shows all of the cores working together on various workloads. The Stock (1.27v) information comes from my initial Alder Lake-S Review. I have increased my total bandwidth by 27 GB/s.
DDR5 Overclocking – 8 Performance Cores
I have separated the Performance Core and Efficient Core results to show my performance increases with the DDR5-5600MHz overclock. Above we see an increase of roughly 10 GB/s with all 8 Performance Cores working together. Alder Lake continues to impress me, with only 1.11v (vCore) being used during 100% CPU utilization.
DDR5 Overclocking – 8 Efficient Cores
Now we will take a look at both Efficient Clusters working together on the same workloads. Each Cluster contains 4 Atom Cores, so this benchmark shows 8 Atom Cores working together. Stock voltages and stock frequencies (3.7GHz) showed 175.55 (or 176) GB/s in my previous Alder Lake-S Review, while the undervolt (1.11v) and overclocked frequencies (4GHz) show 187.75 (or 188) GB/s. That’s a very nice increase of 12 GB/s that was basically free. The Efficient Cores run very cool, with one Cluster showing 4 E-Cores at 56c and the other Cluster showing 4 E-Cores at 52c. I hope that I can push the E-Cores to 5GHz one day. It is also possible that the temperatures could be lower, since I am still waiting on my LGA1700 brackets to be delivered for my AIO.
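To put the 12 GB/s gain in context, it helps to set the relative bandwidth increase next to the relative E-Core clock increase. This is only a rough sketch using the figures quoted above. The helper name is my own, and the comparison does not isolate the cause, since the DDR5 overclock changed between the two runs as well.

```python
def pct_gain(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures quoted in the article for all 8 Efficient Cores.
stock_bw, tuned_bw = 175.55, 187.75   # GB/s
stock_clk, tuned_clk = 3.7, 4.0       # GHz

bw_gain = pct_gain(stock_bw, tuned_bw)
clk_gain = pct_gain(stock_clk, tuned_clk)

print(f"Bandwidth gain: {bw_gain:.1f}%")
print(f"Clock gain:     {clk_gain:.1f}%")
```

The bandwidth gain comes out slightly below the clock gain, so the result is broadly in line with the E-Core frequency bump while drawing less voltage.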
DDR5 Overclocking – Single Efficient Core MAX Performance
Time to go a little deeper and take a look at the absolute best-case scenarios for the Efficient Cores. I have selected the best ‘Gracemont’ Efficient Core to perform this test, just as I did in my last article. This test will show the top performance for one Efficient Core. It is a “true” single-core test that should tell me exactly how quickly and how much data the Efficient Core can compute. Earlier in this article I revealed the latency for each Efficient Core, and we saw an improvement in latency at nearly every level. At the top of this page, under “DDR5 Overclocking - Latency Deep Dive”, we saw the decrease in overall system latency and in the Efficient Core Latency. With the E-Core frequency overclocked to 4GHz, the DRAM overclocked to 5600MHz and the latency lower, we see that the performance of a single Atom ‘Gracemont’ Core has increased by 14% over the stock settings. The chart also reveals a few more interesting details. Undervolting (1.17v + DDR5-4800MHz) and overclocking (4GHz) shows a minor increase, which is still great because it’s basically free performance. However, when overclocking the DDR5 to 5600MHz we see a large increase in performance. The additional DRAM bandwidth and quicker access to that data allow the Efficient Cores to work more effectively. It is going to be interesting to see how much more performance is waiting within these Atom Cores.