Since upgrading to Linux kernel 4.7 (Debian Stretch), my system (Aurora-R4, i7-3820) seems a bit slower, with occasional short hangs. The strangest thing is that I can no longer see the CPU usage percentages in top, KSysGuard, etc.
top (sort by %CPU):
Tasks: 263 total, 1 running, 262 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.0 us, 1.9 sy, 8.9 ni, 81.5 id, 0.6 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 8095452 total, 4514552 free, 1361576 used, 2219324 buff/cache
KiB Swap: 8301564 total, 8301564 free, 0 used. 6390680 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 203000 7504 5156 S 0.0 0.1 0:00.11 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 31:41.17 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
But I can see something with mpstat:
$ mpstat -P ALL
Linux 4.7.0-1-amd64 (alienium) 23. 10. 16 _x86_64_ (8 CPU)
14:37:02 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
14:37:02 all 5.58 5.02 1.48 0.40 0.00 0.06 0.00 0.00 0.00 87.46
14:37:02 0 6.73 5.03 1.71 0.12 0.00 0.00 0.00 0.00 0.00 86.41
14:37:02 1 6.35 5.15 1.69 0.37 0.00 0.05 0.00 0.00 0.00 86.39
14:37:02 2 4.73 4.98 1.68 0.22 0.00 0.15 0.00 0.00 0.00 88.24
14:37:02 3 6.78 5.18 1.62 0.12 0.00 0.00 0.00 0.00 0.00 86.31
14:37:02 4 7.43 4.96 1.75 2.26 0.00 0.19 0.00 0.00 0.00 83.41
14:37:02 5 3.61 4.83 1.22 0.06 0.00 0.02 0.00 0.00 0.00 90.26
14:37:02 6 5.07 5.06 1.16 0.04 0.00 0.03 0.00 0.00 0.00 88.63
14:37:02 7 3.96 5.01 0.97 0.04 0.00 0.00 0.00 0.00 0.00 90.03
cpuinfo (just the first core)
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
stepping : 7
microcode : 0x710
cpu MHz : 3600.045
cache size : 10240 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7200.09
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
uname:
Linux alienium 4.7.0-1-amd64 #1 SMP Debian 4.7.8-1 (2016-10-19) x86_64 GNU/Linux
My only workaround is to boot kernel 4.6; then everything is fine.
Any ideas?
Thank you
Edit 1
The problem really seems to be intel_pstate.
cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1.20 GHz - 4.00 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 1.20 GHz and 4.00 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: Unable to call to kernel
boost state support:
Supported: yes
Active: yes
4000 MHz max turbo 4 active cores
4000 MHz max turbo 3 active cores
4000 MHz max turbo 2 active cores
4000 MHz max turbo 1 active cores
Here we can see that it is unable to call the hardware or the kernel. But sometimes (or after a very long time, I'm not sure) this command does return the CPU frequency correctly.
Edit 2
Still not working with kernel 4.8.5; the frequency is stuck at 3.6 GHz according to /proc/cpuinfo.
The frequency is unknown:
sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
<unknown>
Note that my system seems to be running slower, so I think it's using the minimum CPU frequency (1.2 GHz). My water cooling doesn't seem to be working hard.
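For comparing kernels, it can help to read the frequency of every logical CPU from sysfs in one go. The following is only a sketch: print_cpu_freqs is a made-up helper name, and the optional directory argument is an assumption added so the function can be pointed at a test directory. It reads scaling_cur_freq (in kHz, normally readable without root) rather than cpuinfo_cur_freq, and passes through non-numeric values such as <unknown> unchanged.

```shell
# Print the frequency the kernel reports for each logical CPU, in MHz.
# print_cpu_freqs is a hypothetical helper; the optional first argument
# (an alternative root directory) is an assumption made for testing.
print_cpu_freqs() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue          # skip CPUs without a cpufreq entry
        cpu="${f#"$root"/}"              # strip the root prefix
        cpu="${cpu%%/*}"                 # keep only "cpuN"
        khz=$(cat "$f")
        case "$khz" in
            ''|*[!0-9]*) echo "$cpu: $khz" ;;             # e.g. "<unknown>"
            *) echo "$cpu: $((khz / 1000)) MHz" ;;        # kHz -> MHz
        esac
    done
}
```

Calling print_cpu_freqs with no argument reads the real /sys/devices/system/cpu tree; running it under watch while stress is active shows whether any core stays at a fixed value.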
Edit 3
I've tried kernel 4.9-rc5. It behaves like 4.6, but there is a problem: I noticed that on both 4.9 and 4.6 it's not really working properly. The first core is always stuck at the same frequency (and with 4.9, so is one thread of the second core):
With 4.6 and stress
$ cat /proc/cpuinfo | grep MHz
cpu MHz : 3600.045
cpu MHz : 3600.045
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
And 4.9 and stress
$ cat /proc/cpuinfo | grep MHz
cpu MHz : 3600.045
cpu MHz : 3600.045
cpu MHz : 3600.045
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
cpu MHz : 3899.953
If I disable pstate, the problem persists, but CPU0, 1 and 2 are stuck at different frequencies. Only CPUs 3-7 work correctly. I will try Linux 4.4.
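The grep output above does not say which logical CPU each line belongs to. A small sketch that numbers the lines in the order they appear may make the stuck CPUs easier to spot; cpu_mhz_by_core is a made-up name, and the optional file argument is an assumption added so that a saved copy of /proc/cpuinfo can be used.

```shell
# Label each "cpu MHz" line with its logical CPU index (order of appearance).
# cpu_mhz_by_core is a hypothetical helper; the optional file argument
# defaults to /proc/cpuinfo.
cpu_mhz_by_core() {
    awk -F: '/^cpu MHz/ {
        gsub(/[ \t]/, "", $2)            # drop padding around the value
        printf "cpu%d %s\n", n++, $2     # n starts at 0
    }' "${1:-/proc/cpuinfo}"
}
```

Saving the output once per kernel (for example cpu_mhz_by_core > freq-4.6.txt) makes it possible to diff the results afterwards.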
Edit 4
I built Linux 4.4.33 (LTS) and everything works perfectly. All cores change frequency as expected. I think it was fine with Linux 4.5 too (but I'm a bit discouraged after building the kernel 20 times in one day). I should find which commit breaks my system, but it's tedious to build, install, reboot and test for each commit between Linux 4.4 and 4.6 that could be responsible. It takes too much time.
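One way to reduce the number of build/reboot cycles is git bisect, which does a binary search over the commit range, so the number of kernels to test grows roughly with log2 of the number of commits rather than with the number of commits itself. The sketch below only starts the search; start_kernel_bisect is a made-up helper name, and it assumes a git checkout of the kernel source in which the tags v4.4 and v4.6 exist.

```shell
# Start a bisect between a known-good and a known-bad revision.
# start_kernel_bisect is a hypothetical helper; arguments are:
#   $1 = path to the kernel git checkout
#   $2 = known-good revision (e.g. v4.4)
#   $3 = known-bad revision  (e.g. v4.6)
start_kernel_bisect() {
    repo="$1"; good="$2"; bad="$3"
    # "git bisect start <bad> <good>" marks both ends and checks out
    # a commit roughly in the middle of the range.
    git -C "$repo" bisect start "$bad" "$good"
}

# After each build + reboot + test, tell git the result, e.g.:
#   git -C "$repo" bisect good    # frequencies scale as expected
#   git -C "$repo" bisect bad     # a core is stuck
# and finish with:
#   git -C "$repo" bisect reset
```

For example, start_kernel_bisect ~/src/linux v4.4 v4.6 (the path is a placeholder) checks out the first candidate commit to build.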
Edit 5
I've upgraded from Debian Stretch to Debian Buster, which uses kernel 4.13. Everything seems to be working fine now.