I figure most of you know that goto is a reserved keyword in the Java language but is not actually used. And you probably also know that goto is a Java Virtual Machine (JVM) opcode. I reckon all the sophisticated control flow structures of Java, Scala and Kotlin are, at the JVM level, implemented using some combination of goto and ifeq, ifle, iflt, etc.
Looking at the JVM spec https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.goto_w I see there's also a goto_w opcode. Whereas goto takes a signed 2-byte branch offset, goto_w takes a signed 4-byte branch offset. The spec states:
"Although the goto_w instruction takes a 4-byte branch offset, other factors limit the size of a method to 65535 bytes (§4.11). This limit may be raised in a future release of the Java Virtual Machine."
It sounds to me like goto_w is future-proofing, like some of the other *_w opcodes. But it also occurs to me that goto_w could be used today with the two more significant bytes of the offset zeroed out (or set to FF FF for a backward branch, since the offset is signed) and the two less significant bytes the same as for goto, with adjustments as needed.
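To make the encoding idea concrete, here is a minimal sketch of how the two instructions would be laid out as bytes. The class and method names (BranchEncoding, encodeGoto, encodeGotoW, hex) are made up for illustration; the opcode values 0xA7 and 0xC8 and the big-endian, instruction-relative offset are what the spec describes.

```java
import java.nio.ByteBuffer;

public class BranchEncoding {
    static final int GOTO = 0xA7;    // opcode 167
    static final int GOTO_W = 0xC8;  // opcode 200

    // goto: opcode followed by a signed 16-bit offset relative to the goto itself.
    static byte[] encodeGoto(int from, int target) {
        int offset = target - from;
        if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
            throw new IllegalArgumentException("offset does not fit in 16 bits: " + offset);
        }
        // ByteBuffer is big-endian by default, matching the class file format.
        return ByteBuffer.allocate(3).put((byte) GOTO).putShort((short) offset).array();
    }

    // goto_w: opcode followed by a signed 32-bit offset relative to the goto_w itself.
    static byte[] encodeGotoW(int from, int target) {
        return ByteBuffer.allocate(5).put((byte) GOTO_W).putInt(target - from).array();
    }

    // Hypothetical helper: render bytes as space-separated hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeGoto(59, 87)));   // A7 00 1C
        System.out.println(hex(encodeGotoW(59, 91)));  // C8 00 00 00 20
        System.out.println(hex(encodeGotoW(59, 10)));  // C8 FF FF FF CF (backward branch)
    }
}
```

Note that the bytes hold the distance from the branch instruction to its target, not the absolute target that javap prints, and that a backward branch sign-extends to FF FF in the upper half rather than 00 00.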
For example, given the bytecode for this Java switch-case (or Scala match-case) on a string:
12: lookupswitch {
112785: 48 // case "red"
3027034: 76 // case "blue"
98619139: 62 // case "green"
default: 87
}
48: aload_2
49: ldc #17 // String red
51: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
54: ifeq 87
57: iconst_0
58: istore_3
59: goto 87
62: aload_2
63: ldc #19 // String green
65: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
68: ifeq 87
71: iconst_1
72: istore_3
73: goto 87
76: aload_2
77: ldc #20 // String blue
79: invokevirtual #18
// etc.
we could rewrite it as
12: lookupswitch {
112785: 48
3027034: 80
98619139: 64
default: 91
}
48: aload_2
49: ldc #17 // String red
51: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
54: ifeq 91 // encoded offset 00 25 (91 - 54 = 37)
57: iconst_0
58: istore_3
59: goto_w 91 // encoded offset 00 00 00 20 (91 - 59 = 32)
64: aload_2
65: ldc #19 // String green
67: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifeq 91
73: iconst_1
74: istore_3
75: goto_w 91
80: aload_2
81: ldc #20 // String blue
83: invokevirtual #18
// etc.
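Since shifting the offsets by hand is error-prone, here is a rough sketch that recomputes them from the instruction sizes (goto is 3 bytes, goto_w is 5). The OffsetRecalc class and its methods are hypothetical, and it assumes the elided third case ends with ifeq, iconst_2, istore_3 and no trailing jump, which is what the original default target of 87 implies.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetRecalc {
    // Size in bytes of each instruction used in the listing.
    static int size(String mnemonic) {
        switch (mnemonic) {
            case "aload_2": case "iconst": case "istore_3": return 1;
            case "ldc": return 2;
            case "invokevirtual": case "ifeq": case "goto": return 3;
            case "goto_w": return 5;
            default: throw new IllegalArgumentException(mnemonic);
        }
    }

    // Offset of every instruction, plus a final entry just past the last one.
    static List<Integer> layout(int start, List<String> instructions) {
        List<Integer> offsets = new ArrayList<>();
        int pc = start;
        for (String insn : instructions) {
            offsets.add(pc);
            pc += size(insn);
        }
        offsets.add(pc);
        return offsets;
    }

    // Three case bodies; the last one falls through instead of jumping (assumption).
    static List<String> caseBodies(String jump) {
        List<String> code = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            code.add("aload_2");
            code.add("ldc");
            code.add("invokevirtual");
            code.add("ifeq");
            code.add("iconst");
            code.add("istore_3");
            if (i < 2) code.add(jump);
        }
        return code;
    }

    public static void main(String[] args) {
        for (String jump : new String[] {"goto", "goto_w"}) {
            List<String> code = caseBodies(jump);
            List<Integer> offsets = layout(48, code);
            StringBuilder sb = new StringBuilder(jump + ":");
            for (int i = 0; i < code.size(); i++) {
                if (code.get(i).equals("aload_2")) sb.append(" case@").append(offsets.get(i));
            }
            sb.append(" default@").append(offsets.get(offsets.size() - 1));
            System.out.println(sb);
        }
        // goto: case@48 case@62 case@76 default@87
        // goto_w: case@48 case@64 case@80 default@91
    }
}
```

With goto the case bodies start at 48, 62 and 76 with the default at 87; with goto_w they start at 48, 64 and 80 with the default at 91.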
I haven't actually tried this, and I may well have made a mistake adjusting the "line numbers" (really bytecode offsets) to accommodate the goto_ws. But since it's in the spec, it should be possible to do it.
My question: is there any reason a compiler or other bytecode generator might use goto_w under the current 65535-byte method size limit, other than to show that it can be done?