AMD Threadripper 3990x Issues


23d1

  • Supervisor

@23d1 you get more performance if you use real cores up to the OSX limit,

but then you are on your own, with nothing to compare against from the rest of us.

So, to see whether TDP and frequency change the way they do for us, try to simulate a real 3970x.

For now, OSX does not support more than 64 cores/threads;

we will see in the near future.

MMIO could be with 4G on or off,

but usually it is related to the first two in the list.

 

If you can open a thread in this forum, I will move all these tests there, so we can continue there and keep this thread more generic.

I would be very pleased to help, also for my own personal goal of understanding this problem.

 

Edited by fabiosun

Starting this thread from a discussion in the Bare Metal Vanilla thread.

 

The TL;DR:

There seems to be a lack of power management, or rather a lack of frequency management, in macOS when it comes to the 3990x 64-core (128-thread) processor. Since macOS can only handle 64 threads, one has to either disable cores or disable multithreading. The CPU clock remains static at the 2.9 GHz base clock, never moving up or down (the stock boost frequency is 4.3 GHz).

 

After exhausting many options with kexts, tweaks in config.plist, and BIOS settings, my theory is that either macOS has no mechanism for managing CPU frequencies on processors running with multithreading disabled, or there's just not been enough research, as it's an overkill CPU for macOS in general.

 

The ideal situation would be macOS supporting 64-core/128-thread processors, but I doubt that will happen for anything other than ARM/Apple Silicon...
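
As a rough sanity check on the TL;DR above, the two workarounds can be written out as simple arithmetic. This is only a sketch: the 64-thread limit is the figure quoted in this thread, and the constant and function names are made up for illustration.

```python
# Thread limit as quoted in this thread (not read from macOS itself).
MACOS_THREAD_LIMIT = 64


def logical_threads(cores: int, smt: bool) -> int:
    """Logical threads exposed by `cores` physical cores with 2-way SMT on or off."""
    return cores * (2 if smt else 1)


def fits_macos(cores: int, smt: bool) -> bool:
    """True if the configuration stays within the quoted macOS thread limit."""
    return logical_threads(cores, smt) <= MACOS_THREAD_LIMIT


# (label, physical cores, SMT enabled)
configs = [
    ("stock 3990x", 64, True),
    ("multithreading disabled", 64, False),
    ("half the cores disabled", 32, True),
]

for label, cores, smt in configs:
    threads = logical_threads(cores, smt)
    verdict = "fits" if fits_macos(cores, smt) else "too many threads"
    print(f"{label}: {threads} threads, {verdict}")
```

Both workarounds land on exactly 64 threads, so which one performs better depends on how well 64 real cores compare with 32 cores plus SMT for a given workload.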


28 minutes ago, fabiosun said:

If you can open a thread in this forum, I will move all these tests there, so we can continue there and keep this thread more generic.

Done;

 


9 hours ago, fabiosun said:

@23d1 you get more performance if you use real cores up to the OSX limit,

but then you are on your own, with nothing to compare against from the rest of us.

So, to see whether TDP and frequency change the way they do for us, try to simulate a real 3970x.

Next step I'll try disabling cores and report back, to see if it's a matter of how macOS handles single-threaded vs multi-threaded, or if it's just a lack of support for 3990x in general.


Reporting back: disabling half of the cores and leaving multithreading on didn't change anything; there is still no base/boost frequency control on the macOS side. Cinebench R23 came back at about 30,000 points, which makes sense considering we're losing 32 physical cores.

 

CC: @fabiosun

 

Might be something that is patchable through kernel or otherwise, but hard to say, obviously.

 

Current conclusion: everything is super smooth and works great, despite the fairly massive performance drop compared to fully threaded in Win/Linux. To tap into some of that extra juice, KVM is the way to go with the 3990x at this point. Unless you're doing highly threaded workflows (CPU-based 3D rendering, simulation, and so forth), bare metal is the smoother experience compared to KVM, which has its quirks to iron out.

Edited by 23d1
Added conclusion.

  • 2 weeks later...

While bare metal is slow on Big Sur compared to KVM passthrough (where voltages and clocks are handled by Linux as opposed to macOS), it's almost equal in Monterey, so there seem to be some underlying changes in macOS.

Will do a more in-depth comparison when I have a moment, but I think it's safe to assume that the same issue persists, where the 3990x clocks remain flat at 2.9 GHz instead of boosting up to the factory boost clock of 4.3 GHz.

Funnily enough, it's still about twice as fast as the top-tier Mac Pro from Apple, but it could theoretically be more than 3 times as fast, as it is in Linux/Windows.


Simply changing the topology in Kernel > Patch from BA20... to BA40... (hexadecimal for 64 is 40) only cosmetically changed the reading.
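
The hex conversion mentioned above is easy to double-check. The sketch below only converts a count to its two-digit hex form. It does not reproduce the full patch bytes, which are elided in the post, and the helper name is hypothetical.

```python
def count_to_hex_byte(count: int) -> str:
    """Return `count` as a two-digit uppercase hex byte, e.g. for a byte-level patch."""
    if not 0 <= count <= 0xFF:
        raise ValueError("count must fit in a single byte")
    return f"{count:02X}"


# The two values discussed in the post: 32 and 64.
for count in (32, 64):
    print(f"{count} -> {count_to_hex_byte(count)}")
```

So the 20 in the original patch corresponds to 32, and 40 corresponds to 64, matching the note in the post.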

 

I had a hunch that NUMA/virtualization might be the issue, as PowerManagement couldn't detect the CPU states, more or less. Said and done: I disabled those in BIOS (I had wanted them left on for the times I boot Linux or Windows to virtualize macOS), and now there is only about a 10% performance drop on bare metal. Amazing. The AMD Power Gadget app now shows the frequencies changing on the fly. Sick! About 62,000 points in Cinebench (78,000 in Windows, though), but in general a lot closer to "native" performance.
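
For anyone comparing these scores, the size of the drop depends heavily on which Windows run is used as the baseline. The sketch below uses only scores quoted in this thread: 62,000 on macOS bare metal, 78,000 in Windows (described elsewhere in the thread as a PBO result), and about 64,000 in Windows at stock with multithreading on. The function name is made up, and the scores are rounded forum figures rather than controlled measurements.

```python
def relative_drop(baseline: float, measured: float) -> float:
    """Return how far `measured` falls below `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline * 100.0


MACOS_BARE_METAL = 62_000  # Cinebench score quoted for macOS
WINDOWS_PBO = 78_000       # Windows score quoted with PBO enabled
WINDOWS_STOCK = 64_000     # Windows score quoted at stock, multithreading on

print(f"vs Windows with PBO: {relative_drop(WINDOWS_PBO, MACOS_BARE_METAL):.1f}% lower")
print(f"vs Windows at stock: {relative_drop(WINDOWS_STOCK, MACOS_BARE_METAL):.1f}% lower")
```

Against the PBO score the gap works out to roughly a fifth, while against the stock score it is only a few percent, so any single percentage figure should be read with its baseline in mind.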


After further testing, the true performance falloff seems to be about 15%... It's a lot better than 25-30%, and in line with turning off SMT (Hyper-Threading) in Windows and Linux. It's a shame macOS can't take more than 64 threads, because my machine would be crushing it at the top with PBO (AMD Precision Boost Overdrive) enabled, as I'm getting close to 80k points in Cinebench in Windows.

Edited by 23d1

As a side note, the AMD RX 6800 XT is on par with 2 x 1080 Ti overclocked. It finishes the Redshift Benchmark in about 4 minutes, which is a HUGE improvement over an AMD RX 6800 XT used as an eGPU with a MacBook Pro, which clocks in at over 5 minutes. Add one or two more 6800 XT or 6900 XT cards and overall, this is the fastest Mac known to man. Boom!

Edited by 23d1

  • Moderators
4 hours ago, 23d1 said:

After further testing, the true performance falloff seems to be about 15%... It's a lot better than 25-30%, and in line with turning off SMT (Hyper-Threading) in Windows and Linux. It's a shame macOS can't take more than 64 threads, because my machine would be crushing it at the top with PBO (AMD Precision Boost Overdrive) enabled, as I'm getting close to 80k points in Cinebench in Windows.

 

The kernel can easily be patched to work with more than 64 threads:

 

https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/i386/mp.h#L69

 

Just change the value of that constant to 128 (or more) and recompile the kernel, or apply a patch in OC / Clover. The issue then is AppleACPIPlatform.kext, which cannot handle more than 64 threads and has to be modded; otherwise it throws an early kernel panic. The only person known to be able to solve this "issue" is PikerAlpha, but he left the Hackintosh scene after some misfortunes befell him and his family. So we can only hope that Apple raises the maximum number of threads in Darwin.

 

In the meantime, you could overclock your 3990x, raising the base clock by 15% to try to compensate for the lack of SMT, if your cooler can handle all the heat produced!
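
One possible reason a CPU limit like this is more than a number to bump: if CPUs are tracked in a fixed-width bitmask, the mask width caps how many CPUs can be represented at all. The sketch below models that idea in Python. It assumes a 64-bit mask, which this thread does not confirm, and every name in it is hypothetical; it is not XNU code.

```python
MASK_BITS = 64  # assumed mask width, not taken from the kernel source


def cpu_bit(cpu: int, mask_bits: int = MASK_BITS) -> int:
    """Return a mask with only bit `cpu` set, truncated to `mask_bits` bits.

    A CPU number that does not fit in the mask yields 0, modelling a CPU
    that the mask cannot represent.
    """
    if cpu < 0:
        raise ValueError("cpu must be non-negative")
    return (1 << cpu) & ((1 << mask_bits) - 1)


def representable(cpu: int, mask_bits: int = MASK_BITS) -> bool:
    """True if `cpu` has a bit of its own in a mask of `mask_bits` bits."""
    return cpu_bit(cpu, mask_bits) != 0


# With a 64-bit mask, CPUs 0..63 fit and CPU 64 onwards do not.
print(representable(63), representable(64))
# A 128-bit mask would cover all 128 threads of a 3990x.
print(all(representable(cpu, mask_bits=128) for cpu in range(128)))
```

If the real data structures work along these lines, raising the limit would mean widening every such mask as well, in the kernel and in any kext that shares the assumption.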


6 hours ago, tomnic said:

Just change the value of that constant to 128 (or more) and recompile the kernel, or apply a patch in OC / Clover. The issue then is AppleACPIPlatform.kext, which cannot handle more than 64 threads and has to be modded; otherwise it throws an early kernel panic.

In the meantime, you could overclock your 3990x, raising the base clock by 15% to try to compensate for the lack of SMT, if your cooler can handle all the heat produced!

The kernel part seems straightforward enough, but yeah, I wouldn't even know where to start with modding a kext and so forth. My cooler can take it, but I care about longevity as well, so I typically run at base clock/boost unless I need the extra power for a sim or CPU rendering. It would be fantastic if Apple raised the core count on their end in general, but I'm assuming that's unlikely for x86 and more likely for ARM.


18 hours ago, fabiosun said:

@23d1 when you have time, could you post some benchmark screenshots for the CPU/GPU of your "fat" system? 🙂

 

thank you

Sure thing!

These benchmarks were all done at base clock/boost, without PBO enabled. Pretty much on par with Windows with SMT (multithreading) off. With multithreading on, Cinebench in Windows nets about 64,000 points, and with PBO that shoots off the chart to beyond 78,000 points. Now if I can figure out how to recompile the kernel and get some help modding AppleACPIPlatform.kext to support 128 threads, I think we may be able to see the same results as in Windows/Linux.

 

As for the 6800 XT, the Redshift benchmark scene finishes in 4m6s, which is about the same as using two 1080 Ti cards in Windows/Linux. Amazing performance compared to an eGPU on a MacBook Pro, where Apple Metal seems to add some overhead for some reason, resulting in about a 30% performance loss...

Screen Shot 2021-11-01 at 11.41.41.png

Screen Shot 2021-11-01 at 11.48.22.png

Edited by 23d1

  • Supervisor
9 minutes ago, 23d1 said:

Now if I can figure out a way to recompile the kernel and get some help with the AppleACPIPlatform.kext to support 128 threads, I think we may be able to see the same results as in Windows/Linux

it is an impossible mission 🙂

 
