On an HPE DL380 G10 server with 2 Xeon(R) Gold 6246R CPUs (32 physical cores, 64 logical with HT), the BIOS offers this setting:

System Configuration > BIOS/Platform Configuration (RBSU) > Performance Options > Advanced Performance Tuning Options > NUMA Group Size Optimization

We can choose between these 2 options:

  1. Clustered - Optimizes groups along NUMA boundaries, providing better performance.
  2. Flat - Enables applications that are not optimized to take advantage of processors spanning multiple groups to utilize more logical processors.

The "Clustered" option forces Windows to report only ONE NUMA node BUT 2 processor groups, each with 32 cores. The "Flat" option, on the other hand, shows 2 NUMA nodes and only 1 processor group with 64 logical cores.

I already know a lot about NUMA and processor groups, so I'm not asking for recommendations about the best setting for my use case. Instead, the question is: why would it make sense for Windows to go with 2 processor groups when there is only one NUMA node, and vice versa? Also, I never found any documentation that Windows (Server 2019) splits 64 cores into 2 groups. Is this expected behaviour at all?

Harry

1 Answer

The article you found dates from 2008. At that time Windows was confronted with NUMA computers that had more than 64 processors, while its implementation of processor groups was limited to 64 processors per group. The solution then was to automatically create more than one such group, each holding no more than 64 processors.

A more flexible solution was introduced at the end of 2014. The earliest reference I have found is an HP Advisory note regarding HP Gen9 servers, dated 2015-04-24:

In the Revision 1.30 (12/24/2014) and later versions of the System ROMs for Gen9 servers, a new ROM-Based Setup Utility, "NUMA Group Size Optimization," has been added that allows the user to change the behavior of reporting processors to the Operating System. This option will allow the OS to put all logical processors into a single group if there are 64 logical processors or fewer ("Flat".)

The Clustered option creates one processor group per CPU, containing that CPU's cores. The Flat option is intended to let applications use all of the computer's cores: by default, a Windows application is limited to a single processor group, so it sees and uses only the processors in its own group.
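As a rough illustration of the two modes described here, the following Python sketch models how the logical processors end up grouped. It is a simplified model built only from the description in this answer, not the firmware's or Windows' actual algorithm; the function name `group_sizes` and the mode strings are hypothetical.

```python
def group_sizes(sockets, logical_per_socket, mode):
    """Return the size of each processor group under a simplified model.

    Hypothetical helper: it mirrors the description in this answer,
    not any real firmware or Windows API.
    """
    total = sockets * logical_per_socket
    if mode == "clustered":
        # One processor group per physical CPU.
        return [logical_per_socket] * sockets
    if mode == "flat":
        if total <= 64:
            # Everything fits into a single group of at most 64 processors.
            return [total]
        # Assumption: this sketch does not model machines above 64
        # logical processors, where a single group is not possible.
        raise ValueError("more than 64 logical processors cannot share one group")
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # The server from the question: 2 CPUs, 32 logical processors each.
    for mode in ("clustered", "flat"):
        sizes = group_sizes(2, 32, mode)
        # An application confined to one group only sees that group's processors.
        print(mode, sizes, "-> a single-group application sees", sizes[0])
```

Under this model, an application that is not group-aware sees 32 logical processors in Clustered mode and all 64 in Flat mode on the server from the question.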

Regarding your questions:

Why would it make sense for windows to decide to go with 2 Processor Groups when there is only one NUMA node and vice-versa?

This is dictated by the Clustered option. It makes sense when an application should keep all of its processes and threads executing close to the memory they use, for performance.

If an application in your case needs more than 32 cores, you should use the Flat option to make all cores available to it.

I never found any documentation that Windows (Server 2019) Splits 64 Cores into 2 Groups, is this expected behaviour at all?

Yes, this is the expected behavior. It's not very well documented, but there are references for it, for example the article "Exchange performance: HP NUMA BIOS settings".
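To check what Windows actually reports under either BIOS setting, a minimal sketch along these lines could be used. It calls the Win32 functions `GetActiveProcessorGroupCount`, `GetActiveProcessorCount` and `GetNumaHighestNodeNumber` from kernel32 through `ctypes`; the wrapper name `query_processor_topology` is hypothetical, and the function simply returns `None` when not run on Windows.

```python
import ctypes
import sys


def query_processor_topology():
    """Return (group_sizes, highest_numa_node) on Windows, else None.

    Hypothetical wrapper around three kernel32 functions.
    """
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    # WORD GetActiveProcessorGroupCount(void)
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
    # DWORD GetActiveProcessorCount(WORD GroupNumber)
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32
    # BOOL GetNumaHighestNodeNumber(PULONG HighestNodeNumber)
    kernel32.GetNumaHighestNodeNumber.argtypes = [ctypes.POINTER(ctypes.c_ulong)]
    kernel32.GetNumaHighestNodeNumber.restype = ctypes.c_int

    group_count = kernel32.GetActiveProcessorGroupCount()
    sizes = [kernel32.GetActiveProcessorCount(g) for g in range(group_count)]

    highest = ctypes.c_ulong(0)
    if not kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest)):
        raise ctypes.WinError(ctypes.get_last_error())
    # Node numbers start at 0; the highest number is not guaranteed to
    # equal the node count minus one, so it is returned unchanged.
    return sizes, highest.value


if __name__ == "__main__":
    print(query_processor_topology())
```

Running it once with each BIOS setting shows whether the group and NUMA node layout matches what is described in the question.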

harrymc