
The shell keyword time can be used to time how long a process takes, but how much overhead does time itself actually add to the process?

In other words, how much slower does a command run when being timed with time:

time grep "a string"

...than when it isn't being timed:

grep "a string"

3 Answers


You can test how much overhead time adds by wrapping it in another time:

time time true

Note: I'm adding true, which is a simple command that 'does nothing'.

true - do nothing, successfully - man true

On my machine, it shows:

real:0m0.000s user:0m0.000s sys:0m0.000s

Even with 20 occurrences:

$ time time time time time time time time time time time time time time time time time time time time true
real:0m0.000s user:0m0.000s sys:0m0.000s

Here are 1000:

$ eval $(for i in $(seq 1 1000); do printf "time "; done) true
real:0m0.000s user:0m0.000s sys:0m0.000s

man time gives some clues about how the time is calculated:

Most information shown by time is derived from the wait3(2) system call. The numbers are only as good as those returned by wait3(2). On systems that do not have a wait3(2) call that returns status information, the times(2) system call is used instead. However, it provides much less information than wait3(2), so on those systems time reports the majority of the resources as zero.

For further technical details, check man 2 wait3 or man 2 wait4.


To debug time on Linux, you can use strace, e.g.

$ strace -f time true
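
To narrow the trace to just the call the man page mentions, you can filter for wait4 with strace's standard -e trace= option; as a sketch:

# Follow children (-f) and show only wait4 calls; strace should decode
# the struct rusage argument, which carries the user/sys times reported.
$ strace -f -e trace=wait4 time true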

To produce a summary counting calls and time per syscall (-c), with the time measured as the wall-clock difference between the beginning and end of each call (-w), run:

$ strace -fwc time true
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 30.91    0.007723          69       112           write
 27.19    0.006794        6794         1           wait4
 20.44    0.005107         340        15        13 execve
  7.43    0.001855        1855         1           clone
  2.94    0.000734         122         6         6 access
...

Here is how to trace 10 nested time invocations:

$ strace -fwc $(echo $(for i in $(seq 1 10); do printf "time "; done)) true | head
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 87.35    1.057704      105770        10           wait4
  7.33    0.088807          79      1120           write
  2.25    0.027275         222       123       112 execve
  1.37    0.016571        1657        10           clone
  0.32    0.003913          71        55           mmap
  0.32    0.003844         116        33        33 access
  ...

And with 100:

$ strace -fwc $(echo $(for i in $(seq 1 100); do printf "time "; done)) true | head
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.66  112.115311     1121153       100           wait4
  0.77    0.874091          78     11200           write
  0.24    0.271382         226      1203      1102 execve
  0.16    0.179477        1795       100           clone
  0.03    0.037229         123       303       303 access
  0.03    0.036702          73       505           mmap
...

Obviously the actual command runs slower under strace, but the trace gives a general idea of which syscalls are in use and how much time they take compared to the others.


Preliminary note

Sometimes it's easy to confuse time utilities. You explicitly asked about "the shell keyword time".
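
To check which time you would actually invoke, you can ask Bash; the second line of output appears only if an external time binary is installed:

# The keyword shadows any external binary of the same name.
type -a time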


Code

The following code runs the measured true (which is a shell builtin in Bash) n times, and then the unmeasured true the same number of times. Subtract the respective results and divide by n to get the overhead of a single time.

export n=10000000
time bash -c 'for ((i=1;i<=n;i++)); do time true; done' 2>/dev/null
time bash -c 'for ((i=1;i<=n;i++)); do      true; done' 2>/dev/null
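
If you would rather have the subtraction and division done for you, here is a minimal sketch; the real_secs helper and the awk parsing are my own hypothetical additions, and the parsing assumes the default English output format of the time keyword:

#!/usr/bin/env bash
export n=1000000

# Run the loop body given in $1 n times and print the elapsed "real" seconds.
# The time keyword reports to the current shell's stderr, so the brace
# group's stderr is redirected into the pipe.
real_secs() {
    { time bash -c "for ((i=1;i<=n;i++)); do $1; done" 2>/dev/null ; } 2>&1 |
        awk '/^real/ { split($2, t, /[ms]/); print t[1] * 60 + t[2] }'
}

timed=$(real_secs 'time true')
plain=$(real_secs 'true')

# Per-invocation overhead in microseconds.
awk -v a="$timed" -v b="$plain" -v n="$n" \
    'BEGIN { printf "overhead per time: %.3f us\n", (a - b) / n * 1e6 }'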

Example

For n=10000000 I got respectively:

real    1m32.693s
user    1m13.679s
sys     0m18.998s

real    0m39.344s
user    0m39.343s
sys     0m0.000s

Respective differences:

real    53.349s
user    34.336s
sys     18.998s

Per one time:

real    5.3349μs
user    3.4336μs
sys     1.8998μs

Analysis

These results depend on the overall and momentary load of the system, so your results will vary. My testbed was a laptop with an i7 CPU, Kubuntu 18.04.3 LTS, and Bash 4.4.20. I also have a much slower computer serving as a home router/server; there the results were roughly 36μs, 21μs and 15μs. Now you know what order of magnitude to expect.

I also tested with different values of n. The total times showed good linearity; in other words, the results "per one time" don't depend on n. Given my code, this was expected.
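
If you want to check the linearity yourself, rerunning the measured loop for a few values of n is enough; a sketch:

for n in 100000 1000000 10000000; do
    export n
    echo "n=$n"
    time bash -c 'for ((i=1;i<=n;i++)); do time true; done' 2>/dev/null
done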


Conclusions

time in my Bash provides measurements with a resolution of 1 ms. Even on the relatively slow computer the overhead is over 20 times smaller than that. I would say it's negligible.
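
The 1 ms figure matches Bash's documented behaviour: the TIMEFORMAT variable controls the report of the time keyword, and its precision field is capped at three decimal places. A quick sketch:

# %R, %U and %S print real/user/sys time; the precision digit maxes out at 3,
# and l selects the long MMmSS.FFs format used by the default report.
TIMEFORMAT='real %3lR  user %3lU  sys %3lS'
time true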


Bonus

How about the other time?

On my laptop I replaced time in the loop with /usr/bin/time (and true with /bin/true in both lines, because /usr/bin/time true cannot use the true builtin; the substituted loops are sketched after the results). These are the (somewhat rounded) results per one /usr/bin/time, obtained with n=100000:

real    600μs
user    530μs
sys      80μs
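
For reference, the substituted loops looked roughly like this (a sketch reconstructing the change described above; /usr/bin/time writes its own report to stderr, which the loop discards):

export n=100000
time bash -c 'for ((i=1;i<=n;i++)); do /usr/bin/time /bin/true; done' 2>/dev/null
time bash -c 'for ((i=1;i<=n;i++)); do               /bin/true; done' 2>/dev/null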

The external executable is about 100 times slower than the builtin. In reality a single invocation of /usr/bin/time may take much longer, because the OS may need to read the executable file from disk. The file then gets cached, so in my artificial test the rest of the loop benefits, and this initial delay becomes insignificant if n is large enough.

In comparison, the time keyword provided by Bash should be equally fast even on its first run, because bash is already loaded (though I haven't tested this).

Back to /usr/bin/time. I expect the overhead to easily exceed 1 ms on slower machines.


In principle, time does nothing more than wait for the child process to finish (literally: it calls wait3(), which completely suspends time's own process) and then ask the kernel for the child's resource-usage statistics. So time has practically no impact on the child process's run time.

These statistics seem to always be collected; there seems to be no "start monitoring" mode or anything else that would slow down the child process. I say "seem" because I did not look into the code and could not find a source that explicitly confirms this.
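
One shell-level hint that this accounting is unconditional: Bash's times builtin prints the CPU times accumulated by the shell and by its already-reaped children after the fact, without any monitoring having been switched on beforehand. A sketch:

# Run and reap an arbitrary child, then ask for the accumulated totals.
# The first output line is the shell's own user/sys times; the second
# line is the totals for its children.
bash -c ':'
times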

Source / Entrance to this rabbit hole: http://man7.org/linux/man-pages/man2/wait4.2.html
