TLDR: I want to give runtime branch prediction hints on x86-64, ideally from code compiled by MSVC without asm, for a branch that depends on random data, by peeking into that data. Is it possible?
Assume sequentially interpreting a byte stream, where variable-sized StructureA and StructureB occur, distinguished by some bit patterns, and they occur randomly with approximately equal probability.
They have different functions to interpret them, so there's a branch dependent on data.
As the CPU will not be able to predict a random pattern, mispredictions are expected to delay execution.
I see at least two ways I could provide information for branching in advance:
- By peeking forward into the stream
- By capturing the bits, then starting to process both StructureA and StructureB independently, interleaving the code and hoping that out-of-order superscalar execution runs the StructureA and StructureB processing simultaneously, then branching and discarding the results for the wrong structure.
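To make the first option concrete, here is a minimal sketch of what I mean by peeking forward. The record layout (one header byte holding a kind bit and a 7-bit payload length), the `Totals` type and the `processA`/`processB` bodies are all hypothetical, invented only for illustration. The sketch only shows that the kind of the next record can be known well before its branch is reached; whether the hardware can use that is exactly my question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record layout (an assumption for illustration only):
//   byte 0      : bit 7 = kind (0 -> StructureA, 1 -> StructureB),
//                 bits 0..6 = payload length in bytes
//   bytes 1..len: payload
struct Totals {
    std::uint32_t a = 0;  // accumulated result of all StructureA records
    std::uint32_t b = 0;  // accumulated result of all StructureB records
};

// Placeholder "interpreters": A sums its payload, B XORs it.
static std::uint32_t processA(const std::uint8_t* p, std::size_t n) {
    std::uint32_t r = 0;
    for (std::size_t i = 0; i < n; ++i) r += p[i];
    return r;
}

static std::uint32_t processB(const std::uint8_t* p, std::size_t n) {
    std::uint32_t r = 0;
    for (std::size_t i = 0; i < n; ++i) r ^= p[i];
    return r;
}

Totals interpret(const std::vector<std::uint8_t>& s) {
    Totals t;
    if (s.empty()) return t;

    std::size_t pos = 0;
    std::uint8_t header = s[0];
    while (pos < s.size()) {
        const bool isB = (header & 0x80) != 0;
        const std::size_t len = header & 0x7F;
        const std::size_t next = pos + 1 + len;
        assert(next <= s.size());  // the sketch assumes a well-formed stream

        // Peek: read the next record's header before processing the
        // current payload, so its kind is known long before its branch.
        const std::uint8_t nextHeader = next < s.size() ? s[next] : 0;

        const std::uint8_t* payload = s.data() + pos + 1;
        if (isB) {
            t.b += processB(payload, len);
        } else {
            t.a += processA(payload, len);
        }

        header = nextHeader;
        pos = next;
    }
    return t;
}
```

Calling `interpret` on a stream with two StructureA records and one StructureB record fills `Totals::a` and `Totals::b` separately; the branch on `isB` is the one I would like to hint.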
I know there are (exotic) architectures which always employ delayed branching instead of branch prediction, so at least the second option would have worked.
But is there a way to give such a branch prediction hint on ordinary x86-64?
I'm looking for a machine instruction or something like that.
Although if such a thing exists, I would ideally want it to compile as C++ code. MSVC, x64, Windows 10, if these details matter.
As it is apparently not possible, do the following workarounds make sense:
- Always alternate branches, when there are consecutive StructureA, still process fake StructureB, so that the pattern is predictable.
- Always take both branches, discard results of wrong branch by cmov
- Make some static hint with some sort of [[likely]] towards longer branch, assuming they are not equally fast.
By "make sense" I mean: can they be beneficial, i.e. produce better throughput than control flow with unpredictable branches?
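The "always take both branches" workaround could look roughly like the sketch below. It assumes a hypothetical fixed 4-byte payload after a header byte whose top bit is the kind, and placeholder `interpretA`/`interpretB` functions; none of these names come from real code. The selection is written with a mask rather than a ternary, because I cannot be sure the compiler will emit cmov for a ternary (and it remains free to compile the mask form however it likes).

```cpp
#include <cassert>
#include <cstdint>

// Placeholder interpreters over a hypothetical 4-byte payload.
static std::uint32_t interpretA(const std::uint8_t* p) {
    // StructureA: sum of the payload bytes
    return static_cast<std::uint32_t>(p[0]) + p[1] + p[2] + p[3];
}

static std::uint32_t interpretB(const std::uint8_t* p) {
    // StructureB: XOR of the payload bytes
    return static_cast<std::uint32_t>(p[0] ^ p[1] ^ p[2] ^ p[3]);
}

// Computes both interpretations unconditionally, then keeps one of them
// without a data-dependent branch in the source.
std::uint32_t interpretBoth(const std::uint8_t* rec) {
    assert(rec != nullptr);

    // Hypothetical header: bit 7 set means StructureB.
    const std::uint32_t isB = static_cast<std::uint32_t>(rec[0] >> 7);  // 0 or 1

    const std::uint32_t ra = interpretA(rec + 1);
    const std::uint32_t rb = interpretB(rec + 1);

    // mask is all ones when isB == 1 and all zeros when isB == 0.
    const std::uint32_t mask = 0u - isB;
    return (rb & mask) | (ra & ~mask);
}
```

The obvious cost is that the work of the discarded interpretation is always paid, so this can only win if both interpretations are cheap compared to a misprediction.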