3
\$\begingroup\$

I am posting a message here because I am new to assembly programming. My goal today was to re-code strdup in assembly, so to save my first parameter which is a string (const char*), I have doubts when handling RSP to reserve space for the string and how I reset it at the end.

However I'm not sure I did well, and I would like to have a review, and advice on what I could improve.

extern malloc extern _ft_strlen extern _ft_strcpy
section .text
 global _ft_strdup
 _ft_strdup:
 push rbp ; prologue 
 mov rbp, rsp ; prologue 
 sub rsp, 8 ; reserve space for string 
 mov qword [rbp - 8], rdi ; put first param on the stack
 call _ft_strlen ; strlen the first param
 mov rdi, rax ; put return value on the rdi register 
 call malloc ; malloc rdi bytes 
 cmp rax, 0 ; check if malloc failed 
 je return 
 mov rdi, rax ; put malloc address in rdi (DEST) 
 mov rsi, qword [rbp - 8] ; put source address in rsi (SRC)
 call _ft_strcpy ; copy 
 add rsp, 8 ; reset 
 pop rbp ; epilogue 
 ret
 
 return:
 xor rax, rax
 add rsp, 8 
 pop rbp 
 ret
Sep Roland
4,78317 silver badges28 bronze badges
asked May 16, 2023 at 2:08
\$\endgroup\$
4
  • \$\begingroup\$ Welcome to code review! Do you know if the code works as expected? \$\endgroup\$ Commented May 16, 2023 at 2:38
  • \$\begingroup\$ To be a little blunt, this looks a lot like you added some comments to code generated by a compiler. You repeatedly describe the instruction sub rsp, 8 as "reserve space for string" when, in fact, its purpose is to maintain the required 16-bit alignment of the stack pointer in the System V ABI. The string is stored on the heap, and it is never safe to return a pointer to data on the current stack frame. No one who knew to subtract eight bytes after pushing eight bytes would think that malloc stored the string in them. \$\endgroup\$ Commented May 16, 2023 at 6:27
  • \$\begingroup\$ You also use the internal names from one implementation of the C standard library, when any human writing from scratch would use the documented, human-readable names. \$\endgroup\$ Commented May 16, 2023 at 6:33
  • \$\begingroup\$ By the way, your compiler will generate better code if you tell it to omit the frame pointer and optimize for space. You could replace the lines with rbp with a push rdi. \$\endgroup\$ Commented May 16, 2023 at 6:39

1 Answer 1

2
\$\begingroup\$

If you’re writing x86_64 code, you should be following the System V x86_64 ABI. I’m not sure where your _ft_ versions of the system calls are coming from, but you should be able to call strlen, malloc and other standard-library functions with the arguments in registers, not using the stack except to save variables. (You appear to be doing this correctly.)

You need to reserve space for the terminating null byte, which is not counted in the length reported by strlen. You also need to set this in the duplicate. Currently, you could return an unterminated string!

Since you already check the exact length of the string before you duplicate it, you are better off calling memcpy, which should be more optimized than strcpy. And is definitely safer from buffer overruns!

If you picked this exercise on your own, you should know to avoid the old-fashioned C library functions that don’t bounds-check the string. For new code, you always want a function like this to take an upper bound on the length (which could be SIZE_MAX). This very code contains a bug which could cause an unterminated string to generate a buffer overrun!

You can save a few bytes by having the check to see if allocation failed jump over the code to copy, and writing the epilogue only once.

There is no reason for you to set up rbp as a frame pointer. The fact that you mislabel what these lines do is going to make your grader realize that you copied them from code generated by a compiler.

You’re pretty much stuck calling malloc, but you can replace the calls to strlen and strcpy with x86 string instructions (scasb and movsb), and in an exercise like this, you should.

answered May 16, 2023 at 2:52
\$\endgroup\$
2
  • \$\begingroup\$ Basically, I didn't use the prologue/epilogue because saving the value of my chain on my stack could simply be done with push. However I have noticed that compilers define a stack frame when I manually manipulate the stack to define local variables. Therefore in my case I did this for learning interests and would like to save the source string by manipulating the stack while respecting the 16 byte alignment. could you tell me how to achieve this correctly ? PS: The strlen etc functions are perfect copies of the glibc functions \$\endgroup\$ Commented May 16, 2023 at 9:50
  • \$\begingroup\$ @pharaoh2000 To maintain the 16-byte alignment of the stack pointer, if you push an odd number of 8-byte registers onto the stack, subtract another eight bytes from rsp. \$\endgroup\$ Commented May 16, 2023 at 15:36

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.