Strcpy Using Cdecl And Sysenter Calling Conventions
Intro:
I will be making a binary that
- Implements strcpy and strlen without a libc
- Allocates a single page of dynamic memory
- Copies a string into that memory
See [Part 1]() and [Quick and Dirty Assembly]()
CDECL Calling conventions
The calling convention for cdecl is:
- Arguments are pushed onto the stack right to left (i.e., the first argument is pushed last)
- The caller cleans the stack
- The callee preserves ebx, esi, edi, ebp, and esp
sysenter and sysexit calling conventions
https://en.wikibooks.org/wiki/X86_Assembly/Interfacing_with_Linux
syscall parameter | register |
---|---|
argument 1 | ebx |
argument 2 | ecx |
argument 3 | edx |
argument 4 | esi |
argument 5 | edi |
argument 6 | ebp |
syscall number | eax |
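Linux exposes sysenter through the vDSO function __kernel_vsyscall, and after a sysenter the kernel returns into the tail of that function instead of to the instruction following sysenter. The end of this function contains the following instructions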
pop %ebp
pop %edx
pop %ecx
ret
Therefore we need to prep the stack with the appropriate values before actually executing the sysenter.
- Push the address to return to (i.e., the saved EIP)
- Push ecx, edx, and ebp, in that order, so that the pops above restore them
https://reverseengineering.stackexchange.com/questions/2869/how-to-use-sysenter-under-linux
Getting memory with mmap2 syscall
void *mmap2(unsigned long addr, unsigned long length,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long pgoffset);
Both the flags and prot parameters take named constants in C to make our lives easier. In assembly, we'll have to look through the Linux source code to find their values.
; https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/mman-common.h#L23
%define PROT_READ 0x1
%define PROT_WRITE 0x2
%define MAP_ANONYMOUS 0x20
; https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/mman.h#L17
%define MAP_PRIVATE 0x2
Now we plug our parameters into the appropriate registers for the system call
mov ebx, 0
mov ecx, 4096 ; length < page length (4k) results in a page being allocated anyway
mov edx, PROT_READ
or edx, PROT_WRITE ; R/W permissions
mov esi, MAP_ANONYMOUS
or esi, MAP_PRIVATE ; private and not file backed (just allocate memory, don't make it point to a file)
mov edi, -1 ; no fd
mov ebp, 0x0 ; no offset
mov eax, 192 ; syscall 192 is mmap2, 90 is mmap but fails b/c it wants an argument struct
; https://stackoverflow.com/questions/59923709/problem-trying-to-call-mmap-in-32-bit-i386-assembly#comment105970761_59923709
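For comparison, here is a rough C sketch of the same request made through the libc mmap wrapper, with each argument annotated with the register it occupies above. The helper names alloc_page and alloc_page_demo are made up for illustration. Note that the wrapper takes its offset in bytes while mmap2 takes it in 4096-byte units; with an offset of 0 the two agree.

```c
#define _DEFAULT_SOURCE        /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map one anonymous, private, read/write page.
 * Returns NULL on failure. */
void *alloc_page(void)
{
    void *p = mmap(NULL,                         /* ebx: let the kernel pick the address */
                   4096,                         /* ecx: length */
                   PROT_READ | PROT_WRITE,       /* edx: prot */
                   MAP_ANONYMOUS | MAP_PRIVATE,  /* esi: flags */
                   -1,                           /* edi: no fd */
                   0);                           /* ebp: no offset */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical self-check: the whole page is writable, then unmap it. */
int alloc_page_demo(void)
{
    char *p = alloc_page();
    assert(p != NULL);
    p[0] = 'x';
    p[4095] = 'y';
    assert(p[0] == 'x' && p[4095] == 'y');
    return munmap(p, 4096);    /* 0 on success */
}
```

The assembly version has to check eax itself: on failure the raw syscall returns a small negative errno value rather than MAP_FAILED.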
Calling strcpy
The man page for strcpy gives us the following function definition
char *strcpy(char *restrict dest, const char *src);
The destination will be the freshly mapped memory. After the call to mmap, the address of that page of memory is in eax.
To make life easier, I defined a string in the data section for the src parameter.
section .data
str1: db 'this is only a test', 0
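This time the parameters go on the stack rather than in registers, pushed right to left, and we use call. strcpy takes only 2 parameters, so only two values are pushed. Because cdecl makes the caller clean the stack, we add 8 to esp afterward.
push str1 ; argument 2: src
push eax ; argument 1: dest, eax is address from mmap
call strcpy
add esp, 0x8 ; caller removes both 4-byte arguments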
Detour into 32 bit function basics
Prologue
The beginning of a function contains a prologue that does the following
- Save off the previous base pointer (required)
- Set the base pointer equal to the current stack pointer (required)
- Subtract N bytes from the stack pointer for any local variables (if needed)
- Save any preserved registers that this function clobbers (if needed)
; prologue example
push ebp
mov ebp, esp
sub esp, 0xC
push ebx
The stack after the prologue
Using ebp as a reference, we can access the arguments, saved EIP, and any local variables. Remember the stack grows downward (subtract = using stack space)
; more args here if needed
ebp + 0xC <- argument 2
ebp + 0x8 <- argument 1
ebp + 0x4 <- Saved EIP
ebp + 0x0 <- Current function's base pointer
ebp - 0x4 <- local var 1
ebp - 0x8 <- local var 2
; more local vars here if needed
ebp - N*(0x4) <- local var N & <-- ESP
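The layout above can be checked with a toy model. The following C sketch is only a simulation: the stack is an array of 4-byte slots, esp and ebp are slot indices, and every name in it (toy_stack, push, frame_read, toy_call) and both placeholder values are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_SLOTS = 16 };

/* Toy 32-bit stack: esp and ebp are indices into slot[]. */
struct toy_stack {
    uint32_t slot[STACK_SLOTS];
    int esp;   /* index of the top of stack; STACK_SLOTS means empty */
    int ebp;
};

static void push(struct toy_stack *s, uint32_t v)
{
    assert(s->esp > 0);
    s->slot[--s->esp] = v;     /* the stack grows toward lower addresses */
}

/* Read the 4-byte slot at ebp + byte_off, e.g. frame_read(s, 0x8). */
static uint32_t frame_read(const struct toy_stack *s, int byte_off)
{
    return s->slot[s->ebp + byte_off / 4];
}

/* Simulate a cdecl call f(arg1, arg2) plus the example prologue, then
 * return whatever sits at ebp + byte_off inside the callee. */
uint32_t toy_call(uint32_t arg1, uint32_t arg2, int byte_off)
{
    struct toy_stack s = { .esp = STACK_SLOTS, .ebp = 0 };
    push(&s, arg2);            /* caller: arguments pushed right to left */
    push(&s, arg1);
    push(&s, 0xAAAAAAAAu);     /* call: saved EIP (placeholder value) */
    push(&s, 0xBBBBBBBBu);     /* prologue: push ebp (placeholder old ebp) */
    s.ebp = s.esp;             /* prologue: mov ebp, esp */
    s.esp -= 0xC / 4;          /* prologue: sub esp, 0xC (three locals) */
    return frame_read(&s, byte_off);
}
```

With this model, toy_call(1, 2, 0x8) yields the first argument and toy_call(1, 2, 0xC) the second, matching the table.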
Epilogue
The end of a function contains an epilogue that reverses the prologue
- Restore any preserved registers (if needed)
- Add N bytes back to the stack pointer to “clean” local vars (if needed)
- Set the stack pointer back to the base pointer (required)
- Restore the previous base pointer (required)
; epilogue example from above example
pop ebx
add esp, 0xC
mov esp, ebp
pop ebp
strcpy part 1 - Determining how many bytes to copy
To copy the string, we need to know exactly how many bytes long the source string is. Calling strlen will give us that length. We also have to clean the stack afterward.
strcpy:
push ebp
mov ebp, esp ; End of prologue, no local vars
xor eax,eax
mov edx, [ebp + 0xC] ; param 2 src
push edx ; the only argument to strlen
call strlen
add esp, 0x4 ; caller cleans up the one argument
strlen
The man page for strlen gives us the following function definition
size_t strlen(const char *s);
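My implementation for strlen does the following:
- Clear the eax register, to represent the null character \0
- Copy the maximum number of bytes to search into edx and ecx (4k, and this is cheating)
- Search until we find a byte in the string that matches al
- Subtract ecx from edx to get the number of bytes scanned. This count includes the terminating \0, so it is one more than what C's strlen returns, and it is exactly the number of bytes strcpy has to copy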
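strlen:
push ebp
mov ebp, esp
push ecx
push edi ; edi is callee-saved
cld ; scan forward (DF = 0)
xor eax,eax ; al = 0, the byte to search for
mov ecx, 0x1000 ; maximum number of bytes to scan
mov edx, ecx ; keep the starting count
mov edi, [ebp + 0x8] ; param 1 s
b:
repne scasb
je strlen_done
strlen_done: ; DF stays clear, the i386 ABI expects DF = 0 on return
sub edx, ecx ; bytes scanned, terminator included
mov eax, edx
pop edi
pop ecx
mov esp, ebp
pop ebp
ret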
The funky line repne scasb is shorthand for:
- Compare the contents of the al register with the byte pointed at by edi, then increment or decrement edi depending on the direction flag (scasb)
- Decrement ecx and repeat while those bytes are not equal (repne) and ecx is not 0
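As a cross-check, here is a small C model of what the repne scasb loop computes, assuming al = 0 and a forward scan. The function name scan_count is made up, and the local variables are named after the registers they stand in for.

```c
#include <assert.h>
#include <stddef.h>

/* Model of `repne scasb` with al = 0: examine at most `limit` bytes starting
 * at s and return how many were examined (edx - ecx in the assembly).
 * When a NUL is found, the count includes the NUL itself. */
size_t scan_count(const char *s, size_t limit)
{
    size_t ecx = limit;          /* mov ecx, 0x1000 */
    const size_t edx = limit;    /* mov edx, ecx */
    const char *edi = s;         /* mov edi, [ebp + 0x8] */

    while (ecx != 0) {           /* rep prefix: stop when ecx reaches 0 */
        char byte = *edi++;      /* scasb: compare al with [edi], advance edi */
        ecx--;                   /* rep prefix: one count per byte examined */
        if (byte == '\0')        /* repne: stop once the bytes are equal */
            break;
    }
    return edx - ecx;            /* sub edx, ecx */
}
```

For the 19-character string 'this is only a test' the model returns 20, because the terminator is counted as well.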
strcpy part 2 - The copy loop
Now that we have the length, we can make a loop that copies the required bytes over.
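push esi ; esi and edi are callee-saved in cdecl
push edi
mov ecx, eax ; byte count from strlen, \0 included
mov edx, ecx
mov edi, [ebp + 0x8] ; param 1 dst
mov esi, [ebp + 0xC] ; param 2 src
cld ; copy forward (DF = 0)
rep movsb
mov eax, [ebp + 0x8] ; return dst, as the C strcpy does
pop edi
pop esi
mov esp, ebp
pop ebp
ret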
The line rep movsb is shorthand for:
- Move the byte pointed at by esi into the byte pointed at by edi (movsb)
- Repeat while ecx is not 0 (rep)
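The copy step can be modeled in C the same way. This is a sketch under assumptions, not the code from the repository: rep_movsb and my_strcpy are made-up names, and libc's strlen plus one stands in for the assembly strlen above, whose count already includes the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of `rep movsb` with DF = 0: copy ecx bytes from esi to edi,
 * advancing both pointers after every byte. */
void rep_movsb(char *edi, const char *esi, size_t ecx)
{
    while (ecx != 0) {           /* rep: repeat while ecx is not 0 */
        *edi++ = *esi++;         /* movsb: copy one byte, advance both pointers */
        ecx--;
    }
}

/* Hypothetical strcpy built like the assembly: count, then copy. */
char *my_strcpy(char *dest, const char *src)
{
    size_t n = strlen(src) + 1;  /* stand-in for the NUL-inclusive count */
    rep_movsb(dest, src, n);
    return dest;                 /* C's strcpy returns dest */
}
```

Because the count includes the terminator, the destination ends up NUL-terminated without a separate store.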
The final code can be seen at https://github.com/mulpdev/practice/blob/master/asm-strcpy-calling-conventions/strcpy-cdecl.asm