SLAE64 - Egg Hunter
The third assignment of the SLAE64 exam states:
- Study about the Egg Hunter shellcode
- Create a working demo of the Egg Hunter
- It should be configurable for different payloads
I for one had not heard before of the concept of an egg hunter so a little searching around led me to a (the?) paper by skape called Safely Searching Process Virtual Address Space published in 2004.
In a nutshell an egg hunter is a piece of code that searches the virtual address space (VAS) of a process looking for a predefined marker, called an egg. Once this marker has been found it jumps to it and the code that was marked by the egg is executed in the context of the application.
The situations where one would use an egg hunter are situations where an attacker is able to inject code of arbitrary length into the application’s memory but cannot predict its final location. An overflow elsewhere which would otherwise not allow uploading the larger piece of code could then be utilized to insert the egg hunter. It then hunts for the egg in memory to find the rest of the exploit.
The paper defines three criteria an egg hunter should satisfy:
- It should be robust
- It must be small
- It should be fast
The paper then makes a number of observations which I’ll summarize here:
- the egg should be 8 bytes; this is due to the fact the egg hunter would typically store the egg as a 4 byte key and it may otherwise end up finding it’s own key rather than the actual egg when searching memory;
- the pattern of the egg should be easy to recognize by the hunter without requiring a lot of code to verify the “something” is actually the egg (criterium 2);
- the key could be valid code; for example:
0x50905090
(thus the egg is the dword0x50905090
placed in memory twice in a row) which decodes to:nop;push eax;nop;push eax
. When the hunter jumps to it the key mustn’t contain invalid instructions or the application would crash, or, the hunter should account for this and jump beyond the marker to first bytes of actual code. - iterating through memory addresses and potentially dereferencing uninitialized/unavailabale memory could lead to a segmentation fault. Registering a “fake”
SIGSEGV
handler requires too much code according to the author so it is argued it’s better to rely on standard syscalls such asaccess(2)
andsigaction(2)
to catch theEFAULT
error upon touching uninitialized memory.
Note however that while the paper focusses on Linux/x86, the described concepts do generally apply to Linux/x86_64 too. However it should be noted that the 64-bit shellcode cannot really come close to their 32-bit equivalents of 30-odd bytes with a runtime below 10 seconds. To me this effectively renders this (interesting) technique not feasible for practical usage on Linux/x86_64 as the haystack to search for our needle is so much bigger.
The second implementation described by the author eliminates the need for an executable egg by using the scasd
instruction which actually does the incrementing of EDI by four bytes per comparison. Thus bypassing the need to jump to the beginning of the egg or explicitly accounting for 8 bytes of non-executable egg.
For this assignment I have based my code on skape’s access(2) revisited implementation. I thought this implementation was very clear in what it’s doing without making sacrifices in size. In the end it comes in at a total of 45 bytes.
I have used the egg value of 0x90509050
which is actually valid code for:
nop
push eax
nop
push eax
but due to using the scasd
call we don’t actually end up jumping to the very beginning of our egg. RDI has been advanced by scasd
until right where our actual payload would begin.
One way we could speed this up can be learned from pmap $PID
where $PID
is the process ID of our shellcode wrapper. For example:
root@slae64:~# pmap 20182
20182: ./egghunter
00005635d4eb4000 4K r-x-- egghunter
00005635d50b4000 4K r-x-- egghunter
00005635d50b5000 4K rwx-- egghunter
00005635d5c24000 132K rwx-- [ anon ]
00007f40cc854000 1948K r-x-- libc-2.27.so
00007f40cca3b000 2048K ----- libc-2.27.so
00007f40ccc3b000 16K r-x-- libc-2.27.so
00007f40ccc3f000 8K rwx-- libc-2.27.so
00007f40ccc41000 16K rwx-- [ anon ]
00007f40ccc45000 156K r-x-- ld-2.27.so
00007f40cce63000 8K rwx-- [ anon ]
00007f40cce6c000 4K r-x-- ld-2.27.so
00007f40cce6d000 4K rwx-- ld-2.27.so
00007f40cce6e000 4K rwx-- [ anon ]
00007ffc5e1bc000 132K rwx-- [ stack ]
00007ffc5e1f2000 12K r---- [ anon ]
00007ffc5e1f5000 8K r-x-- [ anon ]
ffffffffff600000 4K r-x-- [ anon ]
total 4512K
As you can see the program is mapped into memory starting at 0x5635d4eb4000
yet our egg hunter will spend a considerable amount of time iterating through lower memory before getting to that address. One optimization at the cost of code size might be to determine the base address and start searching from there.
In order to verify my code is correct I forced my shellcode wrapper to place the .text
section at a relatively low address so the egg would be found quite quickly. I used the -Wl,-Ttext-segment,0x40000000
compiler flag for that.
Wrapping up⌗
I have uploaded my code to jasperla/slae64 on GitHub:
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification. Student ID: SLAE64-1614