I have been studying Julia's abstract types, mainly the difference between signed and unsigned integer types. I was curious about how they perform in Julia, so I measured the difference. Since Julia's core runtime is partly implemented in C, I would expect these types to behave much as they do in C. This question, performance of unsigned vs signed integers, explains that unsigned integers give the same or better performance than signed ones. But in my benchmarks, some signed integer types are faster than their unsigned counterparts. Here is some reproducible code:
julia> using BenchmarkTools
julia> Int_8 = rand(Int8, 1000)
julia> Int_16 = rand(Int16, 1000)
julia> Int_32 = rand(Int32, 1000)
julia> Int_64 = rand(Int64, 1000)
julia> Int_128 = rand(Int128, 1000)
julia> UInt_8 = rand(UInt8, 1000)
julia> UInt_16 = rand(UInt16, 1000)
julia> UInt_32 = rand(UInt32, 1000)
julia> UInt_64 = rand(UInt64, 1000)
julia> UInt_128 = rand(UInt128, 1000)
julia> @benchmark Int_8.^2
BenchmarkTools.Trial: 10000 samples with 147 evaluations.
 Range (min … max):  698.333 ns … 54.836 μs  ┊ GC (min … max):  0.00% … 98.34%
 Time  (median):     733.088 ns              ┊ GC (median):     0.00%
 Time  (mean ± σ):   870.727 ns ±  2.209 μs  ┊ GC (mean ± σ):  11.56% ±  4.49%
  ▇█▅▄▂▁ ▇▇▄▂▁       ▂▁ ▁   ▁▁                                 ▂
  ██████████████▇▆▆▆██████▇██████▇▆▅▆▆▄▄▅▆▆▆▄▄▅▆▆▆▆▆▅▅▄▅▄▄▄▅▃▄ █
  698 ns        Histogram: log(frequency) by time      1.28 μs <
 Memory estimate: 1.14 KiB, allocs estimate: 5.
julia> @benchmark Int_16.^2
BenchmarkTools.Trial: 10000 samples with 16 evaluations.
 Range (min … max):  986.625 ns …  3.895 ms  ┊ GC (min … max):  0.00% … 99.93%
 Time  (median):       1.077 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):     3.069 μs ± 77.347 μs  ┊ GC (mean ± σ):  50.40% ±  2.00%
  ██▃              ▂▅▆▆▅▄▄▃▂▂▁                                 ▂
  ████▆▇▆▅▅▅▆▅▅▃▆▆▄████████████▇▅▅▅▅▄▄▆▇▆▆▇▇▇█▇▇▆▆▅▆▄▅▅▄▅▅▆▄▅▅ █
  987 ns        Histogram: log(frequency) by time      3.68 μs <
 Memory estimate: 2.14 KiB, allocs estimate: 5.
julia> @benchmark Int_32.^2
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.327 μs …  4.701 ms  ┊ GC (min … max):  0.00% … 99.89%
 Time  (median):     1.474 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   3.481 μs ± 80.410 μs  ┊ GC (mean ± σ):  39.99% ±  1.73%
  ▆█▆▃               ▁▄▄▄▄▃▃▂▁▁                              ▁
  █████▇▆▆▆▆▅▄▅▅▅▄▄▃▅███████████▇▇▅▅▅▆▆▆▄▅▅▆▅▆▄▆▅▅▇▆▅▅▅▆▅▅▆▆ █
  1.33 μs      Histogram: log(frequency) by time     5.82 μs <
 Memory estimate: 4.14 KiB, allocs estimate: 5.
julia> @benchmark Int_64.^2
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.749 μs …  2.949 ms  ┊ GC (min … max):  0.00% … 99.73%
 Time  (median):     1.887 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   5.069 μs ± 84.920 μs  ┊ GC (mean ± σ):  52.89% ±  3.15%
  ▇█▆▃▁                       ▂▃▃▃▂▂▂▁▁                      ▂
  █████▇▇▇█▇▆▅▅▅▅▅▄▅▄▄▅▁▃▅▃▅▅███████████▇▇▆▅▃▅▄▁▄▅▄▅▆▅▇▆▅▅▆▆ █
  1.75 μs      Histogram: log(frequency) by time     7.31 μs <
 Memory estimate: 8.02 KiB, allocs estimate: 5.
julia> @benchmark Int_128.^2
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):   8.516 μs …  5.690 ms  ┊ GC (min … max):  0.00% … 99.72%
 Time  (median):      8.778 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   10.962 μs ± 96.809 μs  ┊ GC (mean ± σ):  15.28% ±  1.73%
  ▄█▅▃▁                                                       ▁
  ██████▇▇▆▆▆▆▇▆▆▆▆▅▄▃▃▄▃▂▃▄▄▄▄▃▄▄▅▆▆▆▆▆▆▆▄▅▄▄▃▃▄▆▆▆▆▆▅▄▄▅▆▆▆ █
  8.52 μs      Histogram: log(frequency) by time      19.2 μs <
 Memory estimate: 15.83 KiB, allocs estimate: 5.
julia> @benchmark UInt_8.^2
BenchmarkTools.Trial: 10000 samples with 104 evaluations.
 Range (min … max):  845.058 ns … 76.695 μs  ┊ GC (min … max): 0.00% … 98.21%
 Time  (median):     897.909 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.027 μs ±  2.643 μs  ┊ GC (mean ± σ):  9.58% ±  3.68%
  ██▇▆▃▆█▇▅▄▃▂▁       ▁▁▁▁▁▂▁▁ ▁                               ▃
  ███████████████▇▇▆▇█████████████▇▆▆▆▆▅▅▆▆▆▆▆▆▅▆▆▆▅▅▆▅▅▄▅▄▅▁▄ █
  845 ns        Histogram: log(frequency) by time      1.59 μs <
 Memory estimate: 1.14 KiB, allocs estimate: 5.
julia> @benchmark UInt_16.^2
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.032 μs …  6.110 ms  ┊ GC (min … max):  0.00% … 99.96%
 Time  (median):     1.139 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   2.458 μs ± 83.275 μs  ┊ GC (mean ± σ):  47.86% ±  1.41%
   ██▅▂▁            ▁▂▃▂▂▁▁                                  ▂
  ███████▆▆▆▆▆▄▇▆▆▇███████████▆▄▅▆▆▄▄▄▃▃▃▄▃▄▃▅▁▃▄▅▃▃▄▄▅▄▅▆▆▆ █
  1.03 μs      Histogram: log(frequency) by time     3.83 μs <
 Memory estimate: 2.14 KiB, allocs estimate: 5.
julia> @benchmark UInt_32.^2
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.205 μs …   4.694 ms  ┊ GC (min … max):  0.00% … 99.91%
 Time  (median):     1.332 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   3.821 μs ± 101.562 μs  ┊ GC (mean ± σ):  59.39% ±  2.23%
   ▇█▆▃                      ▁▂▂▂▂▂▁                          ▂
  ▇██████▆▇▇▆▆▆▇▇▅▅▄▄▁▃▃▄▄▃▄▇█████████▇▇▇▆▇▅▅▅▄▁▄▄▄▆▇▆▆▆▄▅▆▆▆ █
  1.2 μs       Histogram: log(frequency) by time      4.42 μs <
 Memory estimate: 4.14 KiB, allocs estimate: 5.
julia> @benchmark UInt_64.^2
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.695 μs …  2.845 ms  ┊ GC (min … max):  0.00% … 99.81%
 Time  (median):     1.842 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   4.734 μs ± 84.073 μs  ┊ GC (mean ± σ):  56.11% ±  3.16%
  ▅██▆▃▁                                 ▁▁▁▁                ▂
  ███████▇▆▆▇▇▆▆▄▄▅▄▃▄▅▆▆▅▅▁▃▃▃▄▄▄▁▅▃▅▅▇█████████▇▇▇▆▆▄▆▆▄▄▃ █
  1.7 μs       Histogram: log(frequency) by time     5.71 μs <
 Memory estimate: 8.02 KiB, allocs estimate: 5.
julia> @benchmark UInt_128.^2
BenchmarkTools.Trial: 10000 samples with 4 evaluations.
 Range (min … max):   8.474 μs …  4.597 ms  ┊ GC (min … max):  0.00% … 99.69%
 Time  (median):      8.691 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   10.980 μs ± 87.765 μs  ┊ GC (mean ± σ):  15.96% ±  1.99%
  ▇█▆▄▃▁▁▁                                                    ▁
  █████████████▇█▇▇▇▆▆▆▅▅▄▆▄▄▄▄▅▄▄▂▄▅▄▄▄▅▆▇▇▇▆▇▇█▇▇▇▆▆▆▆▆▇▅▅▄ █
  8.47 μs      Histogram: log(frequency) by time        17 μs <
 Memory estimate: 15.83 KiB, allocs estimate: 5.
julia> @btime $Int_8.^2
  91.008 ns (1 allocation: 1.06 KiB)
julia> @btime $Int_16.^2
  319.261 ns (1 allocation: 2.06 KiB)
julia> @btime $Int_32.^2
  535.193 ns (1 allocation: 4.06 KiB)
julia> @btime $Int_64.^2
  923.778 ns (1 allocation: 7.94 KiB)
julia> @btime $Int_128.^2
  7.706 μs (1 allocation: 15.75 KiB)
julia> @btime $UInt_8.^2
  89.870 ns (1 allocation: 1.06 KiB)
julia> @btime $UInt_16.^2
  316.495 ns (1 allocation: 2.06 KiB)
julia> @btime $UInt_32.^2
  539.642 ns (1 allocation: 4.06 KiB)
julia> @btime $UInt_64.^2
  872.708 ns (1 allocation: 7.94 KiB)
julia> @btime $UInt_128.^2
  7.061 μs (1 allocation: 15.75 KiB)
As you can see from the @btime results, UInt8 and UInt16 are slightly faster than Int8 and Int16, but Int32 is faster than UInt32. Is there a reason for that? For the 64-bit and 128-bit types, the signed versions are slower again. So the ranking between signed and unsigned integers flips at one width and then flips back.
So I was wondering whether anyone could explain why some signed integer types are faster than their unsigned counterparts and vice versa. Shouldn't the difference be more consistent across widths, rather than the ranking crossing over between types?
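One way to check whether signedness matters at the instruction level would be to look at the code Julia generates for the scalar operation. The following is only a sketch: `sq` and `llvm_ir` are hypothetical helpers introduced here (they are not part of the benchmarks above), and it assumes that `x .^ 2` applies the same scalar `x^2` to each element. If the signed and unsigned versions print the same multiply instruction, the per-element work is the same and the timing gaps would more likely come from noise or allocation than from the type itself.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical helper: the scalar operation that `x .^ 2` applies per element.
sq(x) = x^2

# Return the optimized LLVM IR of `sq` for argument type T as a string.
llvm_ir(T) = sprint(io -> code_llvm(io, sq, (T,); debuginfo = :none))

# Print the IR for a few signed/unsigned pairs of the same width.
for (S, U) in ((Int8, UInt8), (Int32, UInt32), (Int64, UInt64))
    println("== ", S, " ==")
    print(llvm_ir(S))
    println("== ", U, " ==")
    print(llvm_ir(U))
end
```

The function names and argument attributes in the printed IR may differ between the two types, so it is the body of each function that is worth comparing.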
Edit: added the benchmarks that @Shayan suggested in the comments.
Here are the @benchmark results with the arrays interpolated with $. Going by mean time, UInt8 and UInt32 are faster than their signed counterparts, while Int64 and Int128 are faster than the unsigned ones, so there is still some difference between these types.
julia> @benchmark $Int_8.^2
BenchmarkTools.Trial: 10000 samples with 958 evaluations.
 Range (min … max):   81.209 ns …   9.620 μs  ┊ GC (min … max):  0.00% … 97.35%
 Time  (median):     115.392 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   221.416 ns ± 855.670 ns  ┊ GC (mean ± σ):  43.50% ± 10.98%
  █▃                                                            ▁
  ██▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▃▄▃▄▅▆ █
  81.2 ns       Histogram: log(frequency) by time       7.39 μs <
 Memory estimate: 1.06 KiB, allocs estimate: 1.
julia> @benchmark $Int_16.^2
BenchmarkTools.Trial: 10000 samples with 241 evaluations.
 Range (min … max):  312.037 ns … 259.117 μs  ┊ GC (min … max):  0.00% … 99.79%
 Time  (median):     363.174 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):     1.899 μs ±  18.716 μs  ┊ GC (mean ± σ):  77.72% ±  7.83%
  ▄▆██▃    ▂▂▁                                    ▁             ▂
  █████████████▇▆▆▆█▇▇▆▆▅▆▆▅▄▅▆▅▅▄▄▅▄▃▄▄▄▄▆▅▆▆▆▇▇██████▇▇▇▇▆▅▅▄ █
  312 ns        Histogram: log(frequency) by time       1.35 μs <
 Memory estimate: 2.06 KiB, allocs estimate: 1.
julia> @benchmark $Int_32.^2
BenchmarkTools.Trial: 9034 samples with 194 evaluations.
 Range (min … max):  520.572 ns … 251.764 μs  ┊ GC (min … max):  0.00% … 99.71%
 Time  (median):     583.026 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):     2.824 μs ±  22.005 μs  ┊ GC (mean ± σ):  77.60% ±  9.84%
  ▇█▄▁▂▃▁ ▁▁                                                    ▁
  ████████████▇▆▇▆▅▃▅▄▅▄▃▄▄▄▄▄▁▄▅▆▆▆▆▇▇▇▆▆▅▄▅▅▅▃▄▄▁▁▃▁▄▁▁▁▁▁▁▁▃ █
  521 ns        Histogram: log(frequency) by time       3.06 μs <
 Memory estimate: 4.06 KiB, allocs estimate: 1.
julia> @benchmark $Int_64.^2
BenchmarkTools.Trial: 10000 samples with 44 evaluations.
 Range (min … max):  867.795 ns … 653.472 μs  ┊ GC (min … max):  0.00% … 99.82%
 Time  (median):       1.009 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):     3.341 μs ±  36.073 μs  ┊ GC (mean ± σ):  68.34% ±  6.30%
  ▃▆▇██▆▆▄▂▁              ▁                                     ▂
  ██████████▇▆▆▇▇▆▇▇▆▅▅▇▇█████▇▇▅▆▄▅▄▃▄▄▃▃▄▁▅▆▇▅▄▅▄▄▅▅▁▄▅▃▃▃▄▄▃ █
  868 ns        Histogram: log(frequency) by time       2.94 μs <
 Memory estimate: 7.94 KiB, allocs estimate: 1.
julia> @benchmark $Int_128.^2
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):   6.683 μs …  3.352 ms  ┊ GC (min … max):  0.00% … 99.74%
 Time  (median):      7.721 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   10.562 μs ± 96.224 μs  ┊ GC (mean ± σ):  27.31% ±  2.99%
  ▇▂▅▅   █▅                                                   ▁
  ████▇▄▆███▇▅▆▆▅▅▆▃▄▂▄▅▄▄▄▄▄▄▄▅▅▃▄▃▄▄▂▄▂▃▂▄▃▃▆▆▆▆▄▄▆▆▇▇▆▆▆▅▇ █
  6.68 μs      Histogram: log(frequency) by time      14.9 μs <
 Memory estimate: 15.75 KiB, allocs estimate: 1.
julia> @benchmark $UInt_8.^2
BenchmarkTools.Trial: 10000 samples with 967 evaluations.
 Range (min … max):   80.428 ns …  10.368 μs  ┊ GC (min … max):  0.00% … 98.18%
 Time  (median):     108.816 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   217.285 ns ± 835.138 ns  ┊ GC (mean ± σ):  43.71% ± 11.08%
  █▃                                                            ▁
  ██▅▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▄▅▅▅▅ █
  80.4 ns       Histogram: log(frequency) by time       7.23 μs <
 Memory estimate: 1.06 KiB, allocs estimate: 1.
julia> @benchmark $UInt_16.^2
BenchmarkTools.Trial: 8270 samples with 320 evaluations.
 Range (min … max):  315.113 ns … 202.421 μs  ┊ GC (min … max):  0.00% … 99.80%
 Time  (median):     361.734 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):     1.899 μs ±  16.395 μs  ┊ GC (mean ± σ):  79.09% ±  9.07%
  ▅▆█▆▂▃▁▂▂▁                                                    ▁
  ██████████▇▆▆███▇▆▆▆▆▅▅▅▅▅▄▁▅▄▄▄▅▃▄▃▃▄▄▅▆▅▆▆▇▇▇▇▇▇▇█▆▇▆▆▄▆▅▄▆ █
  315 ns        Histogram: log(frequency) by time       1.35 μs <
 Memory estimate: 2.06 KiB, allocs estimate: 1.
julia> @benchmark $UInt_32.^2
BenchmarkTools.Trial: 9148 samples with 194 evaluations.
 Range (min … max):  525.278 ns … 236.308 μs  ┊ GC (min … max):  0.00% … 99.74%
 Time  (median):     582.335 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):     2.782 μs ±  21.821 μs  ┊ GC (mean ± σ):  78.12% ±  9.84%
  ▇█▄▂▂▃▁  ▁                                                    ▁
  ███████▇████▇▆▆▅▆▆▄▄▄▄▄▄▃▃▃▄▃▃▄▅▅▆▅▅▅▆▆▅▆▅▁▃▄▁▁▁▁▁▁▄▁▁▃▁▁▁▁▁▃ █
  525 ns        Histogram: log(frequency) by time       2.92 μs <
 Memory estimate: 4.06 KiB, allocs estimate: 1.
julia> @benchmark $UInt_64.^2
BenchmarkTools.Trial: 10000 samples with 45 evaluations.
 Range (min … max):  859.467 ns … 592.092 μs  ┊ GC (min … max):  0.00% … 99.80%
 Time  (median):     988.933 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):     3.448 μs ±  36.452 μs  ┊ GC (mean ± σ):  70.21% ±  6.60%
    ▁▃▄█▃                                                        
  ▃▅█████▇▅█▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▂▁▂ ▃
  859 ns           Histogram: frequency by time          2.3 μs <
 Memory estimate: 7.94 KiB, allocs estimate: 1.
julia> @benchmark $UInt_128.^2
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):   7.691 μs …   4.544 ms  ┊ GC (min … max):  0.00% … 99.77%
 Time  (median):      7.794 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   11.989 μs ± 115.585 μs  ┊ GC (mean ± σ):  27.18% ±  2.82%
  █▄▃▃▂▂▁                  ▁                                   ▁
  █████████████▇▇▇▇▇▆▇▆▇▆▇██▇▇██▇▆▇▅▆▅▅▅▆▅▅▅▅▅▄▅▅▃▄▅▅▅▅▅▅▅▄▅▄▅ █
  7.69 μs       Histogram: log(frequency) by time      20.6 μs <
 Memory estimate: 15.75 KiB, allocs estimate: 1.
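The comparison above can also be written more compactly, which makes it easier to rerun several times and see how much the ranking moves between runs. This is a minimal sketch, not the code used for the numbers above: `compare_signedness` is a hypothetical helper, and it assumes that giving both vectors the same bit patterns (via `reinterpret`) is a fair way to isolate the element type. It reports the minimum time from `@belapsed`, which is less affected by GC pauses than the mean.

```julia
using BenchmarkTools

# Hypothetical helper (not from the benchmarks above): time `x .^ 2` for a
# signed vector and for an unsigned vector holding the same bit patterns.
function compare_signedness(n::Integer = 1000)
    pairs = ((Int8, UInt8), (Int16, UInt16), (Int32, UInt32),
             (Int64, UInt64), (Int128, UInt128))
    results = []
    for (S, U) in pairs
        s = rand(S, n)
        u = collect(reinterpret(U, s))   # same bits, unsigned element type
        t_s = @belapsed $s .^ 2          # minimum time in seconds
        t_u = @belapsed $u .^ 2
        push!(results, (stype = S, utype = U, t_signed = t_s, t_unsigned = t_u))
    end
    return results
end

results = compare_signedness()

# Print one line per width, with times converted to nanoseconds.
for r in results
    println(rpad(r.stype, 8), round(r.t_signed * 1e9, digits = 1), " ns   ",
            rpad(r.utype, 8), round(r.t_unsigned * 1e9, digits = 1), " ns")
end
```

Running this a few times in a row should show whether the signed/unsigned ordering at a given width is stable or changes from run to run.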