TL;DR: You are observing what happens when the JIT compiler cannot trust that values are not changing inside the loop. Additionally, in a tiny benchmark like this, Blackhole.consume costs dominate and obscure the results.
Simplifying the test:
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class JdkBenchmarks {
    public int[] values;
    @Setup
    public void setupArray() {
        int count = 1000;
        values = new int[count];
        for(int i = 0; i < count; i++) {
            values[i] = i;
        }
    }
    @Benchmark
    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    public void indexed(Blackhole bh) {
        for(int i = 0; i < values.length; i++) {
            bh.consume(values[i]);
        }
    }
    @Benchmark
    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    public void indexed_cached(Blackhole bh) {
        int[] vs = values;
        int length = vs.length;
        for(int i = 0; i < length; i++) {
            bh.consume(vs[i]);
        }
    }
    @Benchmark
    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    public void enhanced(Blackhole bh) {
        for (int value : values) {
            bh.consume(value);
        }
    }
}
Running both enhanced and indexed_cached under -prof perfasm reveals this hot loop (I specifically used @CompilerControl(DONT_INLINE) to let the @Benchmark method be compiled alone, which makes for an easier-to-digest perfasm output):
         ↗  0x...4240: mov  0x10(%r8,%rsi,4),%r10d  ; load values[i], blackhole it
 22.68%  │  0x...4245: mov  0x14(%r8,%rsi,4),%r11d  ; ... repeat 7 more times...
         │  0x...424a: mov  0x18(%r8,%rsi,4),%r10d  ;
 20.95%  │  0x...424f: mov  0x1c(%r8,%rsi,4),%r10d  ;
  0.02%  │  0x...4254: mov  0x20(%r8,%rsi,4),%r11d  ;
 24.73%  │  0x...4259: mov  0x24(%r8,%rsi,4),%r10d  ;
  0.24%  │  0x...425e: mov  0x28(%r8,%rsi,4),%r11d  ;
 20.04%  │  0x...4263: mov  0x2c(%r8,%rsi,4),%r10d  ; 
  0.22%  │  0x...4268: add  $0x8,%esi               ; i += 8
         │  0x...426b: cmp  %ebp,%esi               ; compare i with length (in %ebp)
  0.26%  ╰  0x...426d: jl   0x...4240               ; circle back if 8 more elements available
Very efficient!
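For reference, one way to get such an output (a sketch; target/benchmarks.jar is the usual name from the Maven JMH archetype, and -prof perfasm additionally needs Linux perf and the hsdis disassembler available to the JDK):

```shell
# Run only these benchmarks with the perfasm profiler attached.
# Assumes a JMH project built with the standard Maven archetype.
java -jar target/benchmarks.jar JdkBenchmarks -prof perfasm
```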
Running indexed with -prof perfasm reveals:
         ↗  0x...4170: mov  0xc(%r12,%r8,8),%r9d    ; array bounds check, load values.length
  3.42%  │  0x...4175: cmp  %r9d,%r10d              ; array bounds check, compare i
 16.02%  │  0x...4178: jae  0x...41b1               ;  failed? jump to exception handling
         │  0x...417a: lea  (%r12,%r8,8),%r11       ; load values[i], part 1
  0.04%  │  0x...417e: mov  0x10(%r11,%r10,4),%r11d ; load values[i], part 2
         │                                          ; %r11d is blackholed
 35.69%  │  0x...4183: mov  0xc(%rsi),%r8d          ; get "values"
  0.71%  │  0x...4187: mov  0x348(%r15),%r11        ; safepoint poll, part 1 (JVM infra)
  4.03%  │  0x...418e: inc  %r10d                   ; i++
  0.12%  │  0x...4191: test %eax,(%r11)             ; safepoint poll, part 2 (JVM infra)
 27.74%  │  0x...4194: mov  0xc(%r12,%r8,8),%r9d    ; load values.length
  8.53%  │  0x...4199: cmp  %r9d,%r10d              ; check i < values.length
  0.24%  ╰  0x...419c: jl   0x...4170               ; circle back if more 
This is because the Blackhole.consume call is opaque to the compiler (like many other non-inlined calls), so it has to conservatively presume that values can change in the middle of the loop!
Which means the compiler cannot stash values in a register, it cannot prove the array bounds check always succeeds, it cannot even guarantee the loop terminates (hence the safepoint polls), and on top of that, loop unrolling declines to multiply that per-element mess even further.
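To make that concrete, here is a source-level sketch (hypothetical names, plain Java, no JMH) of the two shapes: the pessimistic version spells out what the compiled indexed() effectively does, re-reading the field around every opaque call, while the hoisted version reads it once:

```java
public class ReloadSketch {
    int[] values = {1, 2, 3};
    int sink;

    // Stand-in for a non-inlined Blackhole.consume: opaque to the caller.
    void consume(int v) { sink += v; }

    // What indexed() effectively compiles to: since consume() might replace
    // this.values, the field and its length are re-read on every iteration,
    // and the bounds check on vs[i] cannot be eliminated.
    void indexedPessimistic() {
        for (int i = 0; i < this.values.length; i++) {
            int[] vs = this.values;     // reloaded after every opaque call
            consume(vs[i]);             // bounds check stays per element
        }
    }

    // What enhanced()/indexed_cached() permit: one read of the field, then
    // the array reference and its length stay in registers for the whole loop.
    void indexedHoisted() {
        int[] vs = this.values;
        for (int i = 0; i < vs.length; i++) {
            consume(vs[i]);
        }
    }
}
```

Both versions compute the same result; the difference is only in what the optimizer is allowed to assume between iterations.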
So you get a penalty like this (TR 3970X, JDK 17.0.2 EA, Linux x86_64):
Benchmark                     Mode  Cnt     Score   Error  Units
JdkBenchmarks.enhanced        avgt    5   144.962 ± 0.918  ns/op
JdkBenchmarks.indexed         avgt    5  1030.981 ± 3.775  ns/op ; + 880 ns/op!
JdkBenchmarks.indexed_cached  avgt    5   144.799 ± 0.643  ns/op ; same as enhanced
Additional fun part:
On most JDKs, the dominating cost in this test is the cost of calling Blackhole.consume itself. Compared to the cost of an array access, the cost of a Java-style Blackhole is quite bad. With JDK 17+ and JMH 1.34, compiler blackholes are used instead, and thus provide much more fidelity for the test.
Without compiler blackholes, the effect hides almost completely in the Blackhole overhead (>25x overhead means we can execute a lot of bad code preceding the Blackhole call!):
Benchmark                     Mode  Cnt     Score   Error  Units
JdkBenchmarks.enhanced        avgt    5  4062.866 ± 4.736  ns/op
JdkBenchmarks.indexed         avgt    5  4071.620 ± 1.057  ns/op ; + 10 ns/op [whoops]
JdkBenchmarks.indexed_cached  avgt    5  4061.390 ± 0.692  ns/op ; same as enhanced
The effect re-manifests if we drop @CompilerControl(DONT_INLINE), because the resulting generated code is much messier:
Benchmark                     Mode  Cnt     Score    Error  Units
JdkBenchmarks.enhanced        avgt    5  4067.118 ± 40.699  ns/op
JdkBenchmarks.indexed         avgt    5  4601.370 ±  0.632  ns/op ; + 530 ns/op
JdkBenchmarks.indexed_cached  avgt    5  4064.455 ±  1.554  ns/op ; same as enhanced