I am posting a message here because I am new to assembly programming. My goal today was to re-code strdup
in assembly, so to save my first parameter which is a string (const char*
), I have doubts when handling RSP to reserve space for the string and how I reset it at the end.
However I'm not sure I did well, and I would like to have a review, and advice on what I could improve.
extern malloc extern _ft_strlen extern _ft_strcpy
section .text
global _ft_strdup
_ft_strdup:
push rbp ; prologue
mov rbp, rsp ; prologue
sub rsp, 8 ; reserve space for string
mov qword [rbp - 8], rdi ; put first param on the stack
call _ft_strlen ; strlen the first param
mov rdi, rax ; put return value on the rdi register
call malloc ; malloc rdi bytes
cmp rax, 0 ; check if malloc failed
je return
mov rdi, rax ; put malloc address in rdi (DEST)
mov rsi, qword [rbp - 8] ; put source address in rsi (SRC)
call _ft_strcpy ; copy
add rsp, 8 ; reset
pop rbp ; epilogue
ret
return:
xor rax, rax
add rsp, 8
pop rbp
ret
1 Answer 1
If you’re writing x86_64 code, you should be following the System V x86_64 ABI. I’m not sure where your _ft_
versions of the system calls are coming from, but you should be able to call strlen
, malloc
and other standard-library functions with the arguments in registers, not using the stack except to save variables. (You appear to be doing this correctly.)
You need to reserve space for the terminating null byte, which is not counted in the length reported by strlen
. You also need to set this in the duplicate. Currently, you could return an unterminated string!
Since you already check the exact length of the string before you duplicate it, you are better off calling memcpy
, which should be more optimized than strcpy
. And is definitely safer from buffer overruns!
If you picked this exercise on your own, you should know to avoid the old-fashioned C library functions that don’t bounds-check the string. For new code, you always want a function like this to take an upper bound on the length (which could be SIZE_MAX
). This very code contains a bug which could cause an unterminated string to generate a buffer overrun!
You can save a few bytes by having the check to see if allocation failed jump over the code to copy, and writing the epilogue only once.
There is no reason for you to set up rbp
as a frame pointer. The fact that you mislabel what these lines do is going to make your grader realize that you copied them from code generated by a compiler.
You’re pretty much stuck calling malloc
, but you can replace the calls to strlen
and strcpy
with x86 string instructions (scasb
and movsb
), and in an exercise like this, you should.
-
\$\begingroup\$ Basically, I didn't use the prologue/epilogue because saving the value of my chain on my stack could simply be done with push. However I have noticed that compilers define a stack frame when I manually manipulate the stack to define local variables. Therefore in my case I did this for learning interests and would like to save the source string by manipulating the stack while respecting the 16 byte alignment. could you tell me how to achieve this correctly ? PS: The strlen etc functions are perfect copies of the glibc functions \$\endgroup\$knocknock– knocknock2023年05月16日 09:50:54 +00:00Commented May 16, 2023 at 9:50
-
\$\begingroup\$ @pharaoh2000 To maintain the 16-byte alignment of the stack pointer, if you push an odd number of 8-byte registers onto the stack, subtract another eight bytes from
rsp
. \$\endgroup\$Davislor– Davislor2023年05月16日 15:36:04 +00:00Commented May 16, 2023 at 15:36
sub rsp, 8
as "reserve space for string" when, in fact, its purpose is to maintain the required 16-bit alignment of the stack pointer in the System V ABI. The string is stored on the heap, and it is never safe to return a pointer to data on the current stack frame. No one who knew to subtract eight bytes after pushing eight bytes would think thatmalloc
stored the string in them. \$\endgroup\$rbp
with apush rdi
. \$\endgroup\$