Alder Lake Features

Intel Thread Director

Now that we have learned about the Alder Lake micro-architecture, one has to wonder how the Performance-Cores and the Efficient-Cores complete workloads efficiently. That’s where the “Thread Director” comes into play. Intel, along with Microsoft, has created a solution that spans software and hardware to determine whether the Performance-Cores or the Efficient-Cores will complete a given workload. This is accomplished by Intel’s Thread Director monitoring the instructions running on each core. On the software side, Windows 11 is needed to take full advantage of this technology. Intel has been working closely with Microsoft to ensure that the Alder Lake micro-architecture can communicate with the Windows 11 OS effectively.

Alder Lake sends feedback to the Windows 11 OS in real time so that the OS task scheduler sends threads to the cores that can complete them most efficiently. The state of each core is reported to the OS with nanosecond precision so that workloads are sent to specific cores based on several criteria. Normally background tasks will be sent to the Efficient-Cores and priority tasks will be sent to the Performance-Cores. As an example, if you are playing a video game, the game threads will be sent to the Performance-Cores, while the Windows 11 OS background tasks and streaming software threads will be sent to the Efficient-Cores.

It appears that Intel’s Thread Director will be even smarter than we think at the hardware level, since it will be able to move tasks from the Performance-Cores over to the Efficient-Cores in real time when needed. If you work with several applications at once and need to multitask, this could be a great thing. Alder Lake should make multitasking quicker. Let’s say you were working in Blender on a scene, but you realized that you needed to make some last-minute changes to an object in ZBrush. Initially Blender would be the main window that you are using (“Foreground” App :: Performance-Cores). When you switch over to ZBrush, Intel’s “Thread Director” would move the Blender threads to the “background” (Efficient-Cores), and ZBrush, now the Foreground App, would run on the Performance-Cores. Let’s say you then decide to export the model in ZBrush and switch back to Blender. Blender would be the Foreground App again and its threads would move from the Efficient-Cores to the Performance-Cores, while the ZBrush threads would move to the background Efficient-Cores during the export. Supposedly this type of transition between the P-Cores and E-Cores will be seamless.
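The Blender and ZBrush scenario above can be written down as a toy model. This is only a sketch of the foreground/background rule of thumb described in this article, not how the Thread Director or the Windows 11 scheduler is actually implemented; the `App` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    foreground: bool = False


def preferred_core_type(app: App) -> str:
    """Toy rule from the article: foreground apps get P-Cores, the rest get E-Cores."""
    return "P-Core" if app.foreground else "E-Core"


def switch_foreground(apps: list[App], name: str) -> dict[str, str]:
    """Make one app the foreground app and report where each app's threads would go."""
    for app in apps:
        app.foreground = (app.name == name)
    return {app.name: preferred_core_type(app) for app in apps}


apps = [App("Blender", foreground=True), App("ZBrush")]

# Switch to ZBrush to edit the object, then back to Blender during the export.
after_switch_to_zbrush = switch_foreground(apps, "ZBrush")
after_switch_to_blender = switch_foreground(apps, "Blender")

print(after_switch_to_zbrush)
print(after_switch_to_blender)
```

The real hardware looks at much more than which window is in front, so treat this purely as an illustration of the example.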

When it comes to video game engines, things might not be as simple as they are with typical Windows OS scheduling. Certain gaming tasks need to run on the Performance-Cores for the best results, while other less important tasks should run on the Efficient-Cores. Traditionally developers could just count how many processors (physical and logical cores) were in a PC and issue threads to those CPU cores. That is no longer the case, and Intel recommends that video game developers differentiate between the Performance-Cores and Efficient-Cores for maximum performance. Games can still rely on the “Intel Thread Director” alone, but it is recommended that developers optimize the game engine’s own scheduling.

Intel is working to give developers more control over Alder Lake’s P-Cores and E-Cores through the Windows “Power Throttling” API, which lets developers indicate which threads are more important than others. This can greatly help the Thread Director. Even if developers do not use the Power Throttling API, the Windows 11 OS scheduler (software) and Intel’s Thread Director (hardware) can still ensure that current and future software runs efficiently across the Alder Lake P-Cores and E-Cores without any intervention from the developers. If Intel and Microsoft can pull this off, it will be a game changer for sure. It is worth noting that Windows 10 does run on Alder Lake, but its older scheduler does not allow the Alder Lake architecture to provide Thread Director feedback to the OS. This means that threads won’t be scheduled to the correct cores in real time.

Alder Lake Memory Technologies

Intel is once again leading the charge by bringing new technology to the public and will be the first to bring DDR5 to the market. Not only will Intel be supporting DDR5 (4800 MT/s), but Alder Lake will also be supporting LP5 (Low-Power DDR5, 5200 MT/s), DDR4 (3200 MT/s) and LP4 (Low-Power DDR4, 4266 MT/s). So enthusiasts will have the choice between the current DDR4 DRAM modules and the latest and greatest DDR5 DRAM modules. Low-Power DDR4 and DDR5 memory will more than likely be aimed at mobile users and other small devices, where it is typically soldered to the board rather than installed as modules. That’s not to say that LPDDR4/5 won’t be available for desktop users, but normally Low-Power DDRx memory is found in the Mobile market. Intel will also support overclocking the modules in many ways (software and hardware).

Based on early reviews, we noticed that Alder Lake has higher latencies when using DDR5, but more throughput and bandwidth. The opposite was true for DDR4, which had lower latencies but also less throughput and bandwidth. As DDR5 matures we expect this to change in the near future. So there is a balance that needs to be struck, but it’s great to have more than one option. Alder Lake will support power management features that allow the CPU to disable specific areas of the memory subsystem and/or the Integrated Memory Controller (IMC) in order to save power and prevent interference.
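To put the bandwidth side of that trade-off into numbers, here is a minimal sketch of the theoretical peak bandwidth at the stock speeds mentioned above. It assumes a typical dual-channel desktop configuration with a 128-bit total data bus; the function name is made up for this example, and real-world benchmark results will land below these peaks.

```python
def peak_bandwidth_gbs(data_rate_mts: int, bus_width_bits: int = 128) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    data_rate_mts:  transfers per second in millions (e.g. 4800 for DDR5-4800)
    bus_width_bits: total data bus width; 128 bits is an assumed dual-channel setup
    """
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mts * bytes_per_transfer / 1000


ddr4_3200 = peak_bandwidth_gbs(3200)
ddr5_4800 = peak_bandwidth_gbs(4800)

print(f"DDR4-3200: {ddr4_3200:.1f} GB/s")
print(f"DDR5-4800: {ddr5_4800:.1f} GB/s")
print(f"DDR5 advantage: {ddr5_4800 / ddr4_3200:.2f}x")
```

On paper DDR5-4800 has 1.5 times the peak bandwidth of DDR4-3200, which fits the higher throughput seen in early reviews. This calculation says nothing about latency.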

As the CPU enters lower power states (C-states), these features will be enabled. From my understanding, if you do not occupy all of the DRAM slots on the motherboard, the Alder Lake CPU will disable specific areas of the memory subsystem/IMC, as I stated earlier, to save power and prevent possible interference between the signals. The IMC is used to transfer data between the system DRAM and the Alder Lake CPU. Since Intel is supporting both DDR4/LP4x and DDR5/LP5x, Alder Lake implements two different memory controllers, and both support dual channels. They cannot be used interchangeably. I suppose that it would be possible to mix DDR4 and DDR5 on a motherboard, but the IMC would only see half of the total memory on the motherboard. However, Intel only allows one DDR type in a system, so mixing DDR4 and DDR5 won’t be allowed. You will have to choose between DDR4 and DDR5 when choosing a motherboard.

In addition to being able to lower (or raise) its power usage and capacity, Alder Lake’s memory controller will be able to scale based on various criteria. Some workloads are time-critical and require low latency, while others might require high bandwidth. For workloads that are neither time-critical nor bandwidth-heavy, the IMC can run at low power and low frequency (Gear-4). Most enthusiasts are familiar with the term “Gears” that Alder Lake’s IMC can use. Alder Lake has several Gears that correspond to real-time workloads and various scenarios. These Gears are based on the ratio between the memory controller frequency and the DRAM frequency, with the number following the term “Gear” indicating the IMC ratio (a ‘1:2’ ratio = “Gear-2”).

The goal for most enthusiasts, overclockers and gamers will typically be Gear 1 (1:1 ratio), which allows the IMC to run at the same speed as the DRAM. Of course this causes the CPU to use more power, but it allows the lowest latency and highest bandwidth between the IMC and the DRAM. Alder Lake’s IMC is rated for DDR4-3200 and DDR5-4800. With default settings DDR4 will run at Gear 1 and DDR5 will run at Gear 2. With DDR5 already reaching some pretty high speeds, such as DDR5-6800, there is no telling how far we will be able to push Alder Lake’s memory controller frequency to match the DDR5 frequency. It’s safe to say that enthusiasts and overclockers will be trying their best to use the lowest Gear possible (Gear 1 = 1:1) with high DRAM overclocks.
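The Gear ratios can be sketched as simple arithmetic. This example assumes the usual DDR convention that the DRAM clock is half the advertised data rate (DDR4-3200 runs a 1600MHz memory clock); the function name and the set of allowed Gears are illustrative, taken from the Gears mentioned in this article.

```python
def imc_clock_mhz(data_rate_mts: int, gear: int) -> float:
    """Memory controller clock for a given DRAM data rate and Gear (IMC:DRAM = 1:gear)."""
    if gear not in (1, 2, 4):
        raise ValueError("this sketch only models Gear 1, Gear 2 and Gear 4")
    dram_clock_mhz = data_rate_mts / 2  # double data rate: two transfers per clock
    return dram_clock_mhz / gear


# Default settings described above: DDR4 at Gear 1, DDR5 at Gear 2.
ddr4_default = imc_clock_mhz(3200, gear=1)
ddr5_default = imc_clock_mhz(4800, gear=2)

# What the IMC would have to sustain to run DDR5-4800 at Gear 1.
ddr5_gear1 = imc_clock_mhz(4800, gear=1)

print(ddr4_default, ddr5_default, ddr5_gear1)
```

The numbers show why Gear 1 gets harder as DRAM speeds climb: the controller has to keep pace with the full DRAM clock instead of half of it.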

PCIe Technologies
PCIe 5.0 x16

Intel is the first company to release PCIe 5.0 to the consumer market with Alder Lake. Alder Lake will support a direct connection to a single PCIe 5.0 slot (x16 mode) or two PCIe 5.0 slots (x8 mode each). PCIe 4.0 (x4 mode) will be directly connected to the CPU as well for an SSD or Intel “Optane” Memory/Storage. Regardless of how many PCIe 5.0 slots the motherboard manufacturers add (one or two), the CPU will support a single PCIe 4.0 link running in x4 mode (8GB/s). PCIe 5.0 (x16) can theoretically reach up to 64GB/s. PCIe is backwards compatible, so any PCIe device or SSD (M.2 in this case) should work with no problems. The Z690 chipset will be capable of running PCIe 4.0 (x12) and PCIe 3.0 (x16), but I will speak more about the Z690 chipset on the next page. For users, such as myself, who can easily run a desktop for well over half a decade, there is plenty of upgrade potential if needed. The main reason that Intel decided to use PCIe 4.0 for the x4 link was the increased processor die size and logic requirements. It was easier to follow the already established 20 lanes from the 11th Gen. PCIe 5.0 (x4) would have taken up more space, and PCIe 5.0 (x16) was already much larger than the PCIe 4.0 I/O on the 11th Gen CPUs. Using PCIe 4.0 also leaves more flexibility in how the CPU lanes are configured (1x4 mode, 2x8 mode, etc.).
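The 64GB/s and 8GB/s figures above are the rounded nominal rates. As a rough sketch, per-direction PCIe bandwidth can be estimated from the per-lane transfer rate and the 128b/130b line encoding used by PCIe 3.0 through 5.0. The function and table names are made up for this example, and other protocol overhead is ignored.

```python
# Per-lane raw transfer rate in GT/s for each PCIe generation.
GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0, 4.0 and 5.0 all use 128b/130b encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gbs(gen: int, lanes: int, encoded: bool = True) -> float:
    """Approximate one-direction bandwidth in GB/s for a PCIe link."""
    raw_gbs = GT_PER_LANE[gen] * lanes / 8  # 8 bits per byte
    return raw_gbs * ENCODING_EFFICIENCY if encoded else raw_gbs


gen5_x16 = pcie_bandwidth_gbs(5, 16)
gen5_x8 = pcie_bandwidth_gbs(5, 8)
gen4_x4 = pcie_bandwidth_gbs(4, 4)

print(f"PCIe 5.0 x16: {gen5_x16:.2f} GB/s (nominal {pcie_bandwidth_gbs(5, 16, encoded=False):.0f})")
print(f"PCIe 5.0 x8:  {gen5_x8:.2f} GB/s")
print(f"PCIe 4.0 x4:  {gen4_x4:.2f} GB/s (nominal {pcie_bandwidth_gbs(4, 4, encoded=False):.0f})")
```

A PCIe 5.0 x8 link works out to the same bandwidth as a PCIe 4.0 x16 link, so splitting the slot into two x8 links still gives each card a full PCIe 4.0 x16 worth of bandwidth.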

Alder Lake's Blazing Speed & Power Usage

Intel’s Alder Lake is capable of some impressive speeds within the CPU micro-architecture. Intel created three separate interconnect fabrics to ensure that the micro-architecture is scalable. This means that Alder Lake’s bandwidth and performance should not be compromised when it is scaled across multiple market segments. These interconnects are also able to report information such as the current load on the fabric, which allows data to be directed to the most effective core for completion. The same is true for the I/O and memory subsystems, which can scale their speeds and/or bus width for lower latency, more performance or power savings as needed.

Intel states that the fabric connecting the cores (P-Cores and E-Cores) can reach up to 1000GB/s, or 1TB/s, with dynamic latency optimization. That would be 100GB/s for each Golden Cove Performance-Core and for each Gracemont cluster (four E-Cores). Alder Lake’s CPU I/O interconnect can reach up to 64GB/s in real time, which should be expected since PCIe 5.0 (x16) is the standard for Alder Lake I/O. However, one must wonder if Intel made a mistake, since the CPU I/O can support up to 3 devices if fully utilized, which means the total bandwidth should be around 71.8GB/s. If the I/O interconnect follows the PCIe specifications, then the Alder Lake CPU should be able to use nearly 72GB/s if needed. Otherwise there could be bottlenecks, but I honestly do not believe that the PCIe 5.0 slot(s) will be used to their full potential (64GB/s).
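The arithmetic behind those two figures can be checked quickly. The sketch below assumes the Core i9-12900K layout of 8 P-Cores plus 2 four-core E-Core clusters, and it uses nominal PCIe rates with no encoding overhead, so the I/O total comes out a little above the 71.8GB/s quoted above. All variable names are made up for this example.

```python
# Compute fabric: 100 GB/s per P-Core and per four-core E-Core cluster.
P_CORES = 8          # assumed Core i9-12900K configuration
E_CLUSTERS = 2       # 8 E-Cores in clusters of four
GBS_PER_FABRIC_STOP = 100

compute_fabric_gbs = (P_CORES + E_CLUSTERS) * GBS_PER_FABRIC_STOP


def nominal_pcie_gbs(gt_per_lane: float, lanes: int) -> float:
    """Nominal one-direction PCIe bandwidth in GB/s, ignoring encoding overhead."""
    return gt_per_lane * lanes / 8


# CPU-attached I/O: PCIe 5.0 x16 (or 2 x8) plus PCIe 4.0 x4 for storage.
gen5_lanes_gbs = nominal_pcie_gbs(32.0, 16)
gen4_lanes_gbs = nominal_pcie_gbs(16.0, 4)
io_demand_gbs = gen5_lanes_gbs + gen4_lanes_gbs

IO_FABRIC_CLAIM_GBS = 64
shortfall_gbs = io_demand_gbs - IO_FABRIC_CLAIM_GBS

print(f"Compute fabric: {compute_fabric_gbs} GB/s")
print(f"CPU I/O demand: {io_demand_gbs:.0f} GB/s, {shortfall_gbs:.0f} GB/s over the 64 GB/s claim")
```

The gap only matters if the graphics slot and the CPU-attached SSD are both saturated at the same moment, which is unlikely in everyday use.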

As far as Alder Lake’s memory subsystem goes, Intel claims up to 204GB/s of bandwidth. That seems very impressive. I’m sure there will be synthetic benchmarks available to give more insight into these claims. Alder Lake will also support hardware acceleration for Microsoft’s DirectX 12 pipeline. Intel has decided to use the same Xe-LP integrated graphics micro-architecture that was used in the 11th Gen; however, we should see some improvements due to architectural changes. Alder Lake supports Intel’s Turbo Boost Max Technology 3.0. This version allows cores to change their frequency individually. The operating system needs to support dynamic per-core frequencies, and it can use them to assign workloads to cores based on several criteria for low-power and high-performance scenarios.

Intel Turbo Boost Technology 2.0 is still supported as well. Turbo Boost 2.0 is what allows a core to run above its rated base frequency. For example, the Core i9-12900K Performance-Cores have a base frequency of 3.20GHz, but when Turbo Boost 2.0 is enabled certain P-Cores can run at up to 5.10GHz, while the E-Cores can run at up to 3.90GHz (base frequency 2.40GHz). Turbo Boost Max 3.0 can push cores up by an extra 100MHz to 5.20GHz and is only supported on the Performance-Cores. This is decided by several factors such as the CPU temperature, power usage/TDP and so on. Turbo Boost 2.0 can also allow the CPU to run at up to the PL2 power limit, which is far above the power used at the base frequency. In some cases this is outside of the rated TDP of a cooling solution, so people should take that into consideration when buying a cooler for Alder Lake. Intel has moved away from the term “TDP” to reflect its new architecture (i.e., a big.LITTLE-style hybrid core architecture). TDP is a term that I am sure most enthusiasts and CPU cooling manufacturers will continue to use.

Speaking of Power Limits (PL), Alder Lake supports 4 Power Limits (PL1, 2, 3 & 4). PL1 will be the typical usage wattage (TDP) and will be the same for every CPU model (i5-12xxx up to the i9-12xxx: PL1 = 125 Watts). PL1 can also be lowered to target 56 Watts across all CPU models when the CPU isn’t performing heavy workloads. However, Power Limit 2 (PL2) will vary between the CPU models, depending on the total number of P-Cores and E-Cores, the frequency and the power threshold that Intel wants to hit. The Core i5-12600K @ 4.9GHz Turbo Boost (PL2) would target 150 Watts, while the higher clocked i9-12900K @ 5.20GHz Turbo Boost (PL2) would target 241 Watts. Obviously the i9-12900K (16 cores) has more cores than the lower-end i5-12600K (10 cores).
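For anyone sizing a cooler, the PL1 and PL2 figures quoted above can be put into a small lookup. This is only a sketch using the two models and wattages mentioned in this article; the function names and the cooler ratings in the example are hypothetical, and real cooler behavior depends on far more than a single wattage number.

```python
# PL1/PL2 targets in watts, as quoted in this article.
POWER_LIMITS_W = {
    "i5-12600K": {"PL1": 125, "PL2": 150},
    "i9-12900K": {"PL1": 125, "PL2": 241},
}


def turbo_headroom_w(model: str) -> int:
    """Extra watts the CPU may draw at PL2 compared with PL1."""
    limits = POWER_LIMITS_W[model]
    return limits["PL2"] - limits["PL1"]


def cooler_covers_pl2(model: str, cooler_rating_w: int) -> bool:
    """True if a cooler's rated wattage is at least the CPU's PL2 target."""
    return cooler_rating_w >= POWER_LIMITS_W[model]["PL2"]


for model in POWER_LIMITS_W:
    print(model, "headroom:", turbo_headroom_w(model), "W")

# A hypothetical 200 W cooler covers the i5 at PL2 but not the i9.
print(cooler_covers_pl2("i5-12600K", 200), cooler_covers_pl2("i9-12900K", 200))
```

The takeaway is that buying a cooler for the 125 Watt PL1 figure alone would leave the i9-12900K well short of its turbo target.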