The compiler should produce the same assembly for 1 and 2 and it may unroll the loop in option 3.  When faced with questions like this, a useful tool you can use to empirically test what's going on is to look at the assembly produced by the compiler. In g++ this can be achieved using the -S switch.
For example, both options 1 and 2 produce this assembler when compiled with the command g++ -S inc.cpp (using g++ 4.5.2)
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    addl    $5, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
g++ produces significantly less efficient assembler for option 3:
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    jmp .L2
.L3:
    addl    $1, -4(%rbp)
    addl    $1, -8(%rbp)
.L2:
    cmpl    $4, -8(%rbp)
    setle   %al
    testb   %al, %al
    jne .L3
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
But with optimisation on (even -O1) g++ produces this for all 3 options:
main:
.LFB0:
    .cfi_startproc
    leal    5(%rdi), %eax
    ret
    .cfi_endproc
g++ not only unrolls the loop in option 3, but it also uses the lea instruction to do the addition in a single instruction instead of faffing about with mov.
So g++ will always produce the same assembly for options 1 and 2. g++ will produce the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you'd probably expect).  
(and it looks like you should be able to inspect the assembly produced by C# too, although I've never tried that)