
I was interested in how Hyper-Threading awareness impacts the scheduling of threads onto logical and physical cores, e.g., does it co-locate threads from the same process to benefit from cache sharing, does it separate threads that it somehow knows will contend a lot for core resources, does it combine computationally-intense threads with I/O-intense threads, etc.

I googled how-does-scheduler-treat-logical-core. I didn't ask for an AI answer, but the Google results led with the AI answer that a Hyper-Threading-unaware scheduler treats logical cores the same as physical cores. In a multi-core CPU, if there are hypothetically only two threads to run, the lack of Hyper-Threading awareness could cause the scheduling of both threads onto the same physical core. This makes the one core too busy and delays the completion of the thread tasks.

However, that's just Google's AI. If it is to have any credibility, then it needs corroboration. The AI response seems to imply that a Hyper-Threading-aware scheduler would prefer to spread threads out onto different physical cores and only double them up when there are no free physical cores left. Is this actually true? Where can I find this information?
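For what it's worth, on Linux the sibling relationships are visible directly: `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` lists which logical CPUs share a physical core. That sysfs path is real; the small parsing helper below is just an illustrative sketch of how one might read it:

```python
# Sketch: read which logical CPUs share each physical core on Linux.
# The sysfs path is a real kernel interface; parse_cpu_list is an
# illustrative helper for the "0,4" / "0-3" list syntax it uses.
import glob

def parse_cpu_list(text):
    """Parse sysfs CPU-list syntax like '0,4' or '0-3' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def sibling_groups():
    """Return the distinct sets of logical CPUs that share a physical core."""
    groups = set()
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    for path in glob.glob(pattern):
        with open(path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)

if __name__ == "__main__":
    for group in sibling_groups():
        print("logical CPUs sharing one physical core:", group)
```

If each group prints a single CPU number, Hyper-Threading is off or absent; pairs like `(0, 4)` indicate two logical CPUs on one physical core.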

There is a reason why I wonder about the veracity of this Hyper-Threading-aware preference to spread threads out onto different physical cores. On a typical computer, there are thousands of threads waiting to run -- many fold more than the number of logical or physical processors, if not orders of magnitude more. If every physical core ends up doubled up anyway, then outside of very specific scientific computing applications, it seems quite plausible that there is no advantage to avoiding doubling up threads onto physical cores. Is this correct, or am I missing something?

Ultimately, I'm trying to get an idea of how Hyper-Threading awareness improves scheduling. Based on what I have been able to find thus far, it's quite possible that it doesn't. However, I am not a computer scientist -- I merely had to learn about multithreading to make my software component thread-safe since the host application does use hyperthreading.

P.S. I've been perusing Hyper-Threading pages online for days, so I'm not really asking about the basics of multithreading, SMT, what Hyper-Threading is, superscalar, or that kind of background.

user2153235
  • 1,543

1 Answer


Hyperthreading shares a CPU core across two threads, which means both threads must share that core's caches, execution units, and other resources. When only one of the two logical (hyperthreaded) cores is in use, the cache and execution resources can be devoted entirely to the one thread that is running.

Put simply, if you have 4 things to do and 4 hyperthreaded cores, then best performance is achieved by spreading the tasks across the 4 "full" (physical) cores and avoiding the second (hyper) thread on each core.

As a clarification, hyperthreading does not deliver a 100% per-core performance boost. An oft-quoted figure is that hyperthreading can add roughly an extra 30% throughput, depending on the task: two tasks running on a single hyperthreaded core would be equivalent to only about 1.3 cores.

Alternatively, running the two tasks on two "full" cores, assuming no memory or other constraints, achieves the full performance of both cores. The two tasks truly run simultaneously rather than sharing execution resources.

So to clarify:

  • 2 tasks on 1 core with hyperthreading ≈ 130% of single-core performance
  • 2 tasks on 2 separate cores = 200% of single-core performance
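The arithmetic above can be sketched directly. Note that the 1.3x figure is this answer's rough example number, not a measured constant, and the model is deliberately simplistic:

```python
# Sketch of the throughput arithmetic above.
# HT_FACTOR is the rough ~30% example figure, not a measured constant.
HT_FACTOR = 1.3  # two threads sharing one hyperthreaded core ~ 1.3 cores' worth

def throughput(tasks, physical_cores):
    """Aggregate throughput (in units of one full core) for a toy model in
    which tasks double up on a core only after every core has one task."""
    doubled = max(0, tasks - physical_cores)       # cores running two tasks
    single = min(tasks, physical_cores) - doubled  # cores running one task
    return single * 1.0 + doubled * HT_FACTOR

print(throughput(2, 1))  # 2 tasks forced onto 1 core  -> 1.3
print(throughput(2, 2))  # 2 tasks spread over 2 cores -> 2.0
```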

So you absolutely would prefer to have individual tasks running on separate "full" cores where possible.

That's where the hyperthreading aware scheduler comes in.

It would preferentially schedule one task per physical core, and only make use of the remaining capacity via hyperthreading when necessary.
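That preference can be illustrated with a hypothetical placement policy (real schedulers are far more sophisticated; this sketch only captures the "one task per physical core first" idea, and the topology in the example is made up):

```python
# Sketch of an HT-aware placement preference: fill one logical CPU per
# physical core first, and only then use the sibling logical CPUs.

def ht_aware_order(sibling_groups):
    """Given groups of logical CPUs that share a physical core, return the
    order in which this toy HT-aware policy would prefer to fill them."""
    order = []
    # Round-robin: first sibling of every core, then second sibling, etc.
    max_len = max(len(g) for g in sibling_groups)
    for i in range(max_len):
        for group in sibling_groups:
            if i < len(group):
                order.append(group[i])
    return order

# Example: 2 physical cores, 2 logical CPUs each (0/2 and 1/3 are siblings).
print(ht_aware_order([[0, 2], [1, 3]]))  # -> [0, 1, 2, 3]
```

With only two tasks to place, this policy lands them on logical CPUs 0 and 1, i.e., on different physical cores; a third task would be the first to share a core via hyperthreading.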

Ignore "I/O" and other features: as far as processes are concerned, each core is basically identical and has the same access speed to memory and disk. There is no benefit to "co-locating" processes, because the moment they make a memory or other hardware request they hit the same limits regardless of which core they are on.

Pai-to
  • 413