
Since upgrading to Linux kernel 4.7 (Debian Stretch), my system (Aurora-R4, i7-3820) feels a bit slower, with occasional short hangs. Stranger still, I can no longer see CPU usage percentages in top, KSysGuard, and similar tools.

top (sorted by %CPU):

Tasks: 263 total,   1 running, 262 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.0 us,  1.9 sy,  8.9 ni, 81.5 id,  0.6 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem :  8095452 total,  4514552 free,  1361576 used,  2219324 buff/cache
KiB Swap:  8301564 total,  8301564 free,        0 used.  6390680 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                        
    1 root      20   0  203000   7504   5156 S   0.0  0.1   0:00.11 systemd                                                                                                                                                                        
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd                                                                                                                                                                       
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0                                                                                                                                                                    
    4 root      20   0       0      0      0 S   0.0  0.0  31:41.17 kworker/0:0                                                                                                                                                                    
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H                                                                                                                                                                   
    7 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_sched                                                                                                                                                                      
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh                                                                                                                                                                         
    9 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 migration/0                                                                                                                                                                    
   10 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 lru-add-drain                                                                                                                                                                  
   11 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 watchdog/0                                                                                                                                                                     
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/0                                                                                                                                                                        
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/1

mpstat, however, still reports per-CPU usage:

$ mpstat -P ALL
Linux 4.7.0-1-amd64 (alienium)  23. 10. 16      _x86_64_        (8 CPU)

14:37:02     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
14:37:02     all    5.58    5.02    1.48    0.40    0.00    0.06    0.00    0.00    0.00   87.46
14:37:02       0    6.73    5.03    1.71    0.12    0.00    0.00    0.00    0.00    0.00   86.41
14:37:02       1    6.35    5.15    1.69    0.37    0.00    0.05    0.00    0.00    0.00   86.39
14:37:02       2    4.73    4.98    1.68    0.22    0.00    0.15    0.00    0.00    0.00   88.24
14:37:02       3    6.78    5.18    1.62    0.12    0.00    0.00    0.00    0.00    0.00   86.31
14:37:02       4    7.43    4.96    1.75    2.26    0.00    0.19    0.00    0.00    0.00   83.41
14:37:02       5    3.61    4.83    1.22    0.06    0.00    0.02    0.00    0.00    0.00   90.26
14:37:02       6    5.07    5.06    1.16    0.04    0.00    0.03    0.00    0.00    0.00   88.63
14:37:02       7    3.96    5.01    0.97    0.04    0.00    0.00    0.00    0.00    0.00   90.03
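As a cross-check that does not depend on top or KSysGuard, per-CPU usage can also be derived directly from /proc/stat, which holds the counters that tools such as mpstat build their figures from. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal) on the per-CPU lines.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_label: (busy_jiffies, total_jiffies)} from /proc/stat text."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only per-CPU lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Use the first eight counters only: guest and guest_nice are already
        # included in user and nice, so adding them would count them twice.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + values[4]  # idle + iowait
        total = sum(values)
        result[fields[0]] = (total - idle, total)
    return result


def busy_percent(before, after):
    """Per-CPU busy percentage between two /proc/stat snapshots (as text)."""
    first = parse_proc_stat(before)
    second = parse_proc_stat(after)
    usage = {}
    for cpu, (busy_now, total_now) in second.items():
        if cpu not in first:
            continue
        busy_then, total_then = first[cpu]
        delta = total_now - total_then
        usage[cpu] = 100.0 * (busy_now - busy_then) / delta if delta > 0 else 0.0
    return usage


def sample_busy_percent(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the usage."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return busy_percent(before, after)
```

Calling sample_busy_percent() while a load is running should show non-zero values for the busy CPUs. If it does while top still shows 0.0 everywhere, the kernel's per-CPU accounting is probably fine and the problem lies in the per-process figures.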

/proc/cpuinfo (first core only):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
stepping        : 7
microcode       : 0x710
cpu MHz         : 3600.045
cache size      : 10240 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
bugs            :
bogomips        : 7200.09
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

uname:

Linux alienium 4.7.0-1-amd64 #1 SMP Debian 4.7.8-1 (2016-10-19) x86_64 GNU/Linux

My only workaround is to boot kernel 4.6, where everything works fine.

Any ideas?

Thank you.

Edit 1

The problem really seems to be intel_pstate.

cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1.20 GHz - 4.00 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 1.20 GHz and 4.00 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency:  Unable to call to kernel
  boost state support:
    Supported: yes
    Active: yes
    4000 MHz max turbo 4 active cores
    4000 MHz max turbo 3 active cores
    4000 MHz max turbo 2 active cores
    4000 MHz max turbo 1 active cores

Here we can see that it is unable to call either the hardware or the kernel. Sometimes, though (perhaps only after a very long uptime, I'm not sure), this command does report the CPU frequency correctly.

Edit 2

Still not working with kernel 4.8.5: the frequency stays stuck at 3.6 GHz according to /proc/cpuinfo.

The current frequency is reported as unknown:

$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
<unknown>
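To see whether every core is affected, the same kind of sysfs read can be repeated for each CPU. The sketch below is hypothetical (the function name is invented) and reads scaling_cur_freq rather than cpuinfo_cur_freq, on the assumption that it exists on this system; it is normally readable without root. Unreadable or non-numeric values such as <unknown> are reported as None.

```python
from pathlib import Path


def read_cur_freqs(base="/sys/devices/system/cpu"):
    """Return {cpu_index: frequency in kHz, or None when it cannot be read}."""
    freqs = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq/ or cpuidle/.
    for cpu_dir in Path(base).glob("cpu[0-9]*"):
        index = int(cpu_dir.name[3:])
        try:
            raw = (cpu_dir / "cpufreq" / "scaling_cur_freq").read_text().strip()
            freqs[index] = int(raw)
        except (OSError, ValueError):
            # Missing file, permission problem, or a value like "<unknown>".
            freqs[index] = None
    return dict(sorted(freqs.items()))
```

Printing read_cur_freqs() a few times under load makes it easy to spot cores whose value never changes or cannot be read at all.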

Note that the system feels slower, so I suspect it is actually running at the minimum CPU frequency (1.2 GHz). My water cooling does not seem to be working hard either.

Edit 3

I've tried kernel 4.9-rc5. It behaves like 4.6, but I noticed that neither 4.9 nor 4.6 actually works properly: the first core always stays stuck at the same frequency (and with 4.9, so does one thread of the second core):

With 4.6 under stress:

$ cat /proc/cpuinfo | grep MHz
cpu MHz         : 3600.045
cpu MHz         : 3600.045
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953

And with 4.9 under stress:

$ cat /proc/cpuinfo | grep MHz
cpu MHz         : 3600.045
cpu MHz         : 3600.045
cpu MHz         : 3600.045
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953
cpu MHz         : 3899.953

If I disable intel_pstate, the problem persists, but CPUs 0, 1 and 2 get stuck at different frequencies; only CPUs 3-7 scale correctly. I will try Linux 4.4 next.
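Comparing these listings by eye gets tedious across several kernels, so a small helper that groups logical CPUs by their reported frequency may be useful. This is only an illustrative sketch with made-up function names; it assumes the "cpu MHz" lines appear in CPU order, as they do in /proc/cpuinfo.

```python
import re
from collections import defaultdict


def cpu_mhz(cpuinfo_text):
    """Extract the 'cpu MHz' values, in order of appearance, as floats."""
    return [float(v) for v in re.findall(r"cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text)]


def group_by_frequency(mhz_values):
    """Map each reported frequency to the list of logical CPUs showing it."""
    groups = defaultdict(list)
    for cpu, mhz in enumerate(mhz_values):
        groups[mhz].append(cpu)
    return dict(groups)


def current_groups(path="/proc/cpuinfo"):
    """Read the live cpuinfo file and group its CPUs by frequency."""
    with open(path) as handle:
        return group_by_frequency(cpu_mhz(handle.read()))
```

Run under stress, a healthy system should put all CPUs into one group near the turbo frequency. CPUs left in a separate group at exactly the base frequency (3600.045 here) are the ones that look stuck.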

Edit 4

I built Linux 4.4.33 (LTS) and everything works perfectly: all cores change frequency as expected. I think it was fine with Linux 4.5 too (but I'm a bit discouraged after building the kernel 20 times in one day). I should track down which commit breaks things on my system, but it is tedious to build, install, reboot and test for each potential culprit between Linux 4.4 and 4.6. It takes too much time.
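For a sense of scale, a binary search over the history (for example with git bisect) needs roughly log2(N) build-and-test rounds for N candidate commits rather than N. The sketch below just does that arithmetic. The commit count and minutes per round used in the example are placeholders, not measured values; the real count can be obtained with git rev-list --count v4.4..v4.6.

```python
import math


def bisect_steps(commit_count):
    """Approximate number of build/test rounds a binary search needs."""
    if commit_count < 1:
        raise ValueError("commit_count must be at least 1")
    return math.ceil(math.log2(commit_count))


def bisect_hours(commit_count, minutes_per_round):
    """Rough total time, given the time for one build + reboot + test."""
    return bisect_steps(commit_count) * minutes_per_round / 60.0


# Hypothetical numbers for illustration only.
example_commits = 25000
example_minutes = 40
print(bisect_steps(example_commits), "rounds,",
      bisect_hours(example_commits, example_minutes), "hours")
```

Under those placeholder numbers the search takes 15 rounds, which is still a long day of rebuilding but far fewer builds than testing commits one at a time.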

Edit 5

I've upgraded from Debian Stretch to Debian Buster, which now uses kernel 4.13. Everything seems to work fine now.
