The Truth: Problems of Ryzen
After AMD released Ryzen, reviewers and users alike were very quick to throw around theories, and this has been going on with no clear answer. Many people blamed the hardware, others blamed SMT. But wait... thanks to two unknown theory crafters, and with the help of nwgat, we can now take a more in-depth look at the real cause. Let's take a look, shall we?
The Ryzen Problem
All reviews have shown that AMD Ryzen under-performs once all cores and SMT are active, but no one is certain why. There are plenty of theories, and AMD has addressed a number of them as wrong. Others have held up in testing, such as Ryzen performing better with only the first CCX active and SMT disabled.
But recently two theories have sounded really plausible: slower MOV instructions, and the second CCX having to go through the first CCX for memory access.
Testing The Theories
To test these theories I used a self-written tool that measures memory bandwidth. It tests single-threaded and multi-threaded performance and (on Windows) sets the proper thread affinity mask. The tool was run with identical settings on every machine and was compiled with maximum compiler optimisations enabled.
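The tool itself is not shown here, so the following is only a minimal sketch of the measurement idea in Python: copy a buffer several times, time each copy, and divide the bytes copied by the elapsed seconds. The function name `measure_copy_bandwidth`, the buffer size and the repeat count are all illustrative assumptions, and absolute numbers from an interpreted language will not match those of a native tool.

```python
import time


def measure_copy_bandwidth(size_bytes: int = 64 * 1024 * 1024, repeats: int = 5) -> float:
    """Return the best observed copy bandwidth in MB/s (1 MB = 10**6 bytes).

    Hypothetical helper, not the author's tool: it times a plain
    buffer-to-buffer copy and reports bytes copied per second.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best_seconds = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best_seconds = min(best_seconds, elapsed)
    # Bytes per second, scaled to megabytes per second.
    return size_bytes / best_seconds / 1e6


if __name__ == "__main__":
    print(f"Single-thread copy: {measure_copy_bandwidth():.2f} MB/s")
```

Taking the best of several runs is one common way to reduce noise from other processes; averaging the runs would be an equally valid choice.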
Task          Intel i5-4690     Ryzen R7 1700X    Ryzen R7 1800X
Memory        DDR3 1333 MHz     DDR4 2100 MHz     DDR4 2400 MHz
MOV Copy      6148.85 MB/s      5373.25 MB/s      5595.71 MB/s
Normal Copy   6154.97 MB/s      8158.41 MB/s      7962.22 MB/s
2 Threads     6654.01 MB/s      12399.49 MB/s     12207.71 MB/s
3 Threads     6716.55 MB/s      13092.77 MB/s     14448.28 MB/s
4 Threads     7004.97 MB/s      13433.51 MB/s     14430.77 MB/s
5 Threads     6828.04 MB/s      13271.88 MB/s     13769.70 MB/s
6 Threads     6962.61 MB/s      13160.45 MB/s     14092.03 MB/s
7 Threads     7018.98 MB/s      13044.91 MB/s     14123.68 MB/s
8 Threads     7026.20 MB/s      12993.32 MB/s     14200.40 MB/s
9 Threads     6990.10 MB/s      12969.62 MB/s     14096.46 MB/s
10 Threads    7049.04 MB/s      12956.59 MB/s     14005.07 MB/s
11 Threads    6973.88 MB/s      12765.16 MB/s     13917.83 MB/s
12 Threads    7012.68 MB/s      12745.24 MB/s     13770.88 MB/s
13 Threads    6983.30 MB/s      12495.60 MB/s     13609.24 MB/s
14 Threads    7038.75 MB/s      12386.70 MB/s     13612.83 MB/s
15 Threads    7086.29 MB/s      12276.65 MB/s     13307.97 MB/s
16 Threads    7084.65 MB/s      12563.24 MB/s     13577.06 MB/s
(Leaving the differences in the memory used aside, we can see some problems with AMD Ryzen memory bandwidth.)
1. MOV Copy is considerably slower than Normal Copy on AMD Ryzen (by roughly 30 to 35%)
MOV (and all instructions in this set) are typically used to move or copy memory. In this case it is a REP MOVSB that is being used, which is usually the fastest way to copy memory, that is, if the CPU was actually optimised for it. Intel CPUs at one point performed similarly, so seeing this is not a surprise. It is, however, a big performance hit for any games that aren't aware of which CPU they are running on.
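As a quick check of that gap, the slowdown can be recomputed from the single-threaded rows of the table above. This is only a small arithmetic sketch; the helper name `mov_slowdown_percent` is made up for illustration. It works out to roughly 34% on the 1700X and roughly 30% on the 1800X, while the Intel chip shows almost no difference.

```python
# Single-threaded figures copied from the table above, in MB/s.
results = {
    "Intel i5-4690": {"mov_copy": 6148.85, "normal_copy": 6154.97},
    "Ryzen R7 1700X": {"mov_copy": 5373.25, "normal_copy": 8158.41},
    "Ryzen R7 1800X": {"mov_copy": 5595.71, "normal_copy": 7962.22},
}


def mov_slowdown_percent(mov_copy: float, normal_copy: float) -> float:
    """How much slower MOV Copy is than Normal Copy, in percent."""
    return (1.0 - mov_copy / normal_copy) * 100.0


for cpu, row in results.items():
    print(f"{cpu}: MOV Copy is {mov_slowdown_percent(**row):.1f}% slower")
```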
2. Ryzen performance peaks at four Threads
Even though the CPU has eight physical cores, the maximum bandwidth was reached at about 4 threads (4 on the 1700X, 3 on the 1800X), which suggests that the memory controller on the CPU itself can only handle 4 cores at the same time; after that it has to balance the necessary work over all cores. This is a step back from the behaviour observed in Piledriver and Bulldozer, which (after reaching the physical core count) kept about the same memory bandwidth instead of degrading.
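To see where each CPU tops out, the multi-threaded rows of the table above can be scanned for their maximum. The sketch below does only that; `peak_threads` is a hypothetical helper name and the data is copied straight from the table. On these numbers the 1700X peaks at 4 threads, and the 1800X peaks at 3 threads with 4 threads only about 0.1% behind.

```python
# Multi-threaded results from the table above, in MB/s, for 2..16 threads.
THREADS = list(range(2, 17))
bandwidth = {
    "Intel i5-4690": [
        6654.01, 6716.55, 7004.97, 6828.04, 6962.61, 7018.98, 7026.20, 6990.10,
        7049.04, 6973.88, 7012.68, 6983.30, 7038.75, 7086.29, 7084.65,
    ],
    "Ryzen R7 1700X": [
        12399.49, 13092.77, 13433.51, 13271.88, 13160.45, 13044.91, 12993.32, 12969.62,
        12956.59, 12765.16, 12745.24, 12495.60, 12386.70, 12276.65, 12563.24,
    ],
    "Ryzen R7 1800X": [
        12207.71, 14448.28, 14430.77, 13769.70, 14092.03, 14123.68, 14200.40, 14096.46,
        14005.07, 13917.83, 13770.88, 13609.24, 13612.83, 13307.97, 13577.06,
    ],
}


def peak_threads(values):
    """Return (thread_count, MB/s) for the highest bandwidth in a series."""
    best_index = max(range(len(values)), key=lambda i: values[i])
    return THREADS[best_index], values[best_index]


for cpu, values in bandwidth.items():
    threads, mbps = peak_threads(values)
    # Relative loss between the peak and the 16-thread run.
    drop = (1.0 - values[-1] / mbps) * 100.0
    print(f"{cpu}: peak {mbps:.2f} MB/s at {threads} threads, "
          f"{drop:.1f}% lower at 16 threads")
```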
3. Windows Scheduler Issues
You didn't think I would include this here, did you? It turns out that people are right: the Windows 10 scheduler (Pro, Ultimate, whatever edition) has a bug that makes its thread placement on Ryzen a bit wrong. I have not included the data for that in the table you can see above, but essentially, when no thread affinity is set, performance drops back to single-thread levels.
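For readers unfamiliar with affinity masks: on Windows, a thread affinity mask (as taken by `SetThreadAffinityMask`) is a bitmask in which bit n stands for logical processor n. The sketch below only builds such masks and does not call any Windows API. The helper names are made up, and the idea that SMT siblings are numbered adjacently (0/1, 2/3, ...) is an assumption for illustration, not something confirmed by the tool or the source.

```python
from typing import Iterable, List


def affinity_mask(logical_cpus: Iterable[int]) -> int:
    """Build a bitmask with bit n set for each logical processor n."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical processor index must be non-negative")
        mask |= 1 << cpu
    return mask


def one_thread_per_core(physical_cores: int, smt_ways: int = 2) -> List[int]:
    """Pick the first logical processor of every physical core.

    ASSUMPTION: SMT siblings are numbered adjacently (0/1, 2/3, ...).
    """
    return [core * smt_ways for core in range(physical_cores)]


if __name__ == "__main__":
    # Eight physical cores with 2-way SMT, one thread pinned per core.
    print(hex(affinity_mask(one_thread_per_core(8))))
```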
Here's the deal: a user on Reddit apparently got a response from AMD confirming that there is indeed only one memory controller on Ryzen (Infinity Fabric). If accurate, this would confirm that there is indeed a bottleneck on the CPU itself.
The Potential Future
The question now is, will any of these be fixed? For the first and the last one, the answer is that it depends on the company that makes the software in question. For the other, we will probably have to wait for Zen 2 to fix this performance problem.
All we can do now is wait and find out more.