Mon Oct 10, 2011 11:23 am by Knightwolf654
Yeah, I've been watching a thread on another website about that poor excuse for a preview.
Actually, there is already a known issue like this for Bulldozer, and NO benchmarked system has the patch installed!
The shared L1 cache is causing cross invalidations across threads, so the prefetched data is incorrect in too many cases and must be fetched again. The fix is a "simple" memory alignment and (possibly) tagging system in the Windows/Linux kernel.
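For anyone wondering how alignment can matter this much, here is a rough toy sketch in Python. The cache geometry (line size, number of sets, 2 ways) and the addresses are made-up assumptions for illustration, and `set_index` / `conflicts` are hypothetical helpers, not anything from the actual Windows or Linux patch. It only shows the general idea: addresses spaced by a multiple of the cache's way size land in the same set and evict each other, while staggering them by a page spreads them out.

```python
# Toy model of set-index aliasing in a set-associative cache.
# All geometry numbers are assumptions for illustration only.
LINE_SIZE = 64     # bytes per cache line (assumed)
NUM_SETS = 512     # number of sets (assumed)
PAGE_SIZE = 4096   # bytes per page (assumed)


def set_index(addr: int) -> int:
    """Return the cache set an address maps to."""
    return (addr // LINE_SIZE) % NUM_SETS


def conflicts(addrs, ways):
    """Return {set: count} for sets with more addresses than ways."""
    counts = {}
    for a in addrs:
        s = set_index(a)
        counts[s] = counts.get(s, 0) + 1
    return {s: n for s, n in counts.items() if n > ways}


# Three hot addresses spaced exactly LINE_SIZE * NUM_SETS bytes apart:
# they all map to the same set and fight over its 2 ways.
hot = [0x400000 + i * LINE_SIZE * NUM_SETS for i in range(3)]

# Same addresses, each shifted by a different number of pages:
# now every one maps to a different set.
staggered = [a + i * PAGE_SIZE for i, a in enumerate(hot)]

print(conflicts(hot, ways=2))
print(conflicts(staggered, ways=2))
```

In this toy model the first list overloads a single set and the staggered list does not, which is roughly the kind of effect an alignment change in the kernel's memory mapping would be aiming for.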
I reviewed the code for the Linux patch and was astonished by just how little I know of the Linux kernel... lol! In any event, the issue could easily cost 10% in single-threaded performance, and possibly more than double that in multi-threaded loads on the same module, due to the increased contention and randomness of accesses.
Not sure if ordained reviewers have been given access to the MS patch, but I'd imagine (and hope) so! Last I saw, the Linux kernel patch was still being worked on by AMD (publicly), and Linus was showing some distaste for the method used to address the issue. One comment questioned the performance cost but had received no replies... but you don't go reworking kernel memory mapping for a gain of anything less than 5-10%... just not worth it!
The preview was thrown together in 30 min. Half the graphs have the names switched and they did not install the patch, so I'm treating it as a grain of poo.
i7 3930K @ 4.2 GHz
Gigabyte R9 Fury X
32GB DDR3-1866
3x 27" LG IPS displays + 1 23" IPS display
Samsung 850 pro 512GB
Samsung 830 pro 256GB
WD SE 2TB
1200 watt Corsair PSU
H100 water cooling
8TB FreeNAS Server