AMD has come a long way since 2002, but the Linux kernel still treats modern Threadrippers like Athlon-era systems in at least one respect, and it is one that can add latency.
AMD developer Prateek Nayak recently submitted a patch to Linux’s processor idle drivers that would skip the dummy wait on processors based on the Zen microarchitecture.
When ACPI support was added to the Linux kernel in 2002, it included a “dummy wait op.” The support was written by Andy Grover and committed by Linus Torvalds. The op read data, but for almost no reason other than to delay the next instruction until the STPCLK# signal had brought the CPU to a complete stop.
In the early days of ACPI, when some chipsets did not enter an idle state when expected, this allowed for some power savings and better compatibility.
However, modern AMD processors based on the Zen architecture do not need this workaround, and according to Nayak it actually works against them, at least for certain workloads under Linux. Testing with instruction-based sampling (IBS) shows that a sizable portion of time is spent in the dummy operation, and that time ends up being recorded as C-state residency.
Seeing all of this low-effort dummy work, the CPU might move to a deeper, slower C-state, which makes it take longer to “wake up,” particularly on workloads that switch frequently between busy and idle phases.
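The effect can be illustrated with a toy model. The sketch below is not how the Linux idle governors actually work; it assumes a simplified rule (pick the deepest state whose target residency fits the predicted idle time), and every state name, latency figure, and function name in it is hypothetical. It only shows how time wrongly counted as idle can tip the choice toward a deeper state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    target_residency_us: float  # idle time needed for the state to pay off (made-up value)
    exit_latency_us: float      # time needed to wake up again (made-up value)


# Hypothetical states, ordered from shallow to deep. Numbers are illustrative only.
STATES = [
    IdleState("C1", target_residency_us=2.0, exit_latency_us=1.0),
    IdleState("C2", target_residency_us=100.0, exit_latency_us=50.0),
]


def predict_idle(observed_idle_us):
    """Predict the next idle period as the mean of recent ones (a simplifying assumption)."""
    return sum(observed_idle_us) / len(observed_idle_us)


def pick_state(predicted_idle_us):
    """Pick the deepest state whose target residency fits the prediction."""
    chosen = STATES[0]
    for state in STATES:
        if state.target_residency_us <= predicted_idle_us:
            chosen = state
    return chosen


true_idle_us = [40.0, 60.0, 50.0]  # short idle gaps of a busy/idle workload
dummy_wait_us = 80.0               # hypothetical time spent in the dummy op
inflated_idle_us = [t + dummy_wait_us for t in true_idle_us]

honest = pick_state(predict_idle(true_idle_us))
skewed = pick_state(predict_idle(inflated_idle_us))

print(f"accurate accounting -> {honest.name}, wake-up cost {honest.exit_latency_us} us")
print(f"inflated accounting -> {skewed.name}, wake-up cost {skewed.exit_latency_us} us")
```

In this model the same workload lands in the deeper state only because the dummy wait was added to its idle time, and each wake-up then costs more.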
On a dual-socket Zen3 system, Nayak ran tbench against the baseline Linux kernel, a kernel with the C2 state disabled entirely, and a kernel with the dummy wait operation patched out.
His patched kernel showed a 1,390 percent improvement in minimum MB/s throughput and a 51 percent gain in mean MB/s throughput over the baseline kernel. In many cases, it was only slightly behind the kernel with C2 disabled entirely.
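Percentage gains this large are easier to read as multipliers. The short sketch below converts the two reported figures, assuming "improvement" means the increase relative to the baseline value; the function name is ours, not something from the patch or the benchmark.

```python
def improvement_to_multiplier(percent_gain: float) -> float:
    """Convert a percent improvement over baseline into a throughput multiplier."""
    return 1.0 + percent_gain / 100.0


# The two gains reported for the patched kernel versus the baseline kernel.
for label, gain in (("minimum MB/s", 1390.0), ("mean MB/s", 51.0)):
    multiplier = improvement_to_multiplier(gain)
    print(f"{label}: +{gain:g}% is {multiplier:.2f}x the baseline")
```

Under that reading, the worst-case throughput is roughly 15 times the baseline, while the average is about one and a half times the baseline.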
As the Phoronix blog notes, Intel systems have avoided AMD’s legacy curse because they have used an MWAIT-based idle mechanism for at least a decade. Nayak’s report prompted Dave Hansen of Intel to submit an urgent patch for the issue.
His fix restricts the “dummy wait” to Intel systems only, where it would not affect “remotely modern Intel systems.” He also added comments to the kernel’s idle driver that explain what is going on and encourage anyone reading them to “consider moving your system to a more modern idle mechanism.”
An urgent patch that removes or restricts the “dummy wait” could make it into the Linux 6.0 kernel, which Torvalds plans to release next week, provided it is submitted this week.