Why does a Python script using PyTorch only use 6 out of 12 cores?

Question

I run Whisper on an Intel Mac with an Intel Core i7 CPU (Whisper doesn't seem to support AMD Radeon GPUs at the moment, so I use the CPU). When I run Whisper while the computer is otherwise mostly idle, it takes around 500-550% CPU (the maximum is 1200%: 6+6 cores), so around half of my CPU capacity is used. If I launch a second Whisper process, it also takes 500-550%, which means that my CPU is more or less fully used.

In other words, two files are each processed at around half speed. I would prefer to process one file at full speed, that is, to have the first Whisper process use maybe 1100%.

Why does Python (?) only use half of the available CPU capacity in a situation like this? Can this be controlled via some setting, flag or similar?

Trevor Boyd Smith · Answer 1 · 2024-10-23T21:14:03.480

I tested Whisper on my Intel 12700 CPU with `torch.set_num_threads(n)` for n in [4, 8, 12, 20].

Fastest was 8 threads (this Intel CPU has 8 performance cores and 4 efficiency cores): 9 seconds
Next fastest was 4 threads: 11.8 seconds
Next was 12 threads: 11.9 seconds
Slowest was 20 threads: 23 seconds
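
A comparison like the one above can be reproduced with a small timing harness. This is only a sketch: `time_thread_counts` and its parameters are hypothetical names, not part of Whisper or PyTorch. The thread setter is passed in as an argument so the harness itself does not depend on any library.

```python
import time
from typing import Callable, Dict, Iterable


def time_thread_counts(
    workload: Callable[[], object],
    set_threads: Callable[[int], object],
    thread_counts: Iterable[int],
    repeats: int = 1,
) -> Dict[int, float]:
    """Run `workload` once per thread count and return the best time in seconds.

    `set_threads` is whatever function changes the thread count
    (hypothetical parameter; with PyTorch it would be torch.set_num_threads).
    """
    results: Dict[int, float] = {}
    for n in thread_counts:
        set_threads(n)  # apply the thread count before timing
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results
```

With PyTorch installed, one would presumably pass `torch.set_num_threads` as `set_threads`, a function that transcribes one fixed audio file as `workload`, and `[4, 8, 12, 20]` as `thread_counts`.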

The 20 is the total count of logical cores, as reported by `import multiprocessing; print(multiprocessing.cpu_count())`.
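
To see both numbers on your own machine, something like the following may help. `os.cpu_count()` is in the standard library and reports logical cores. The physical count relies on the third-party `psutil` package, which is an assumption about your environment, so the sketch falls back to `None` when it is not installed. `core_counts` is a hypothetical helper name.

```python
import os
from typing import Optional, Tuple


def core_counts() -> Tuple[Optional[int], Optional[int]]:
    """Return (logical, physical) core counts; either may be None if unknown."""
    logical = os.cpu_count()  # logical cores, i.e. including hyper-threads
    try:
        import psutil  # third-party; assumed to be installed, hence the fallback
        physical = psutil.cpu_count(logical=False)
    except ImportError:
        physical = None
    return logical, physical


logical, physical = core_counts()
print(f"logical cores: {logical}, physical cores: {physical}")
```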

So if you want the shortest processing time, you should trust `whisper` (i.e. `PyTorch`), because it is being smart and using all of your physical cores. That also matches the question, where roughly 6 of the 12 logical cores are busy.

p.s. For "FASTEST" speed read further:

If you know you have an Intel CPU and it has both performance and efficiency cores, you can speed things up further by counting the performance cores and passing that number in; Whisper processing is then fastest by a significant margin, about 30% faster in my example above. However, I do not know how to get the number of performance cores programmatically, other than looking up the Intel CPU model number, reading the specs on the Intel ARK website, and then hardcoding that number in the code.

So IMO using the total number of physical cores is easier and gets you almost the fastest speed without painful hardcoding or a giant lookup table of Intel CPU models vs. performance core counts.

p.p.s. For more information on logical vs. physical cores on an x86_64/AMD64 CPU, please see https://en.wikipedia.org/wiki/Hyper-threading . Some future Intel CPU products will not have Hyper-Threading, so there will no longer be a difference between logical and physical cores. In general, applications that keep the CPU at 100% do not benefit from Hyper-Threading and can be hurt by it. Whisper/PyTorch is a perfect example: Hyper-Threading does not help at all, and if you try to use all the logical cores, processing is about 2x slower.

score -1 · Answer 2 · answered Feb 23 '24 at 01:55

whisper has a parameter for the number of threads.

--threads THREADS number of threads used by torch for CPU inference;
supercedes MKL_NUM_THREADS/OMP_NUM_THREADS (default: 0)

So to get full utilization on a CPU with 12 cores, run:

whisper --threads 12
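
The help text above mentions MKL_NUM_THREADS and OMP_NUM_THREADS. If you call Whisper from your own Python script rather than from the command line, those variables can be set from Python too. As far as I know they need to be set before torch is imported for the OpenMP/MKL runtimes to pick them up. A minimal sketch, where `set_thread_env` is a hypothetical helper name and 6 is just an example value:

```python
import os


def set_thread_env(n: int) -> None:
    """Set the thread-count environment variables named in the help text."""
    if n < 1:
        raise ValueError("thread count must be at least 1")
    for name in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[name] = str(n)


# Call this before `import torch` or `import whisper` in the same script.
set_thread_env(6)
```

Alternatively, `torch.set_num_threads(n)`, as used in the other answer, changes the thread count after torch has been imported.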
