27
\$\begingroup\$

I wrote a stack implementation using a singly linked list in x86_64 assembly. This stack supports the usual push/pop operations as well as first/next for iterating over each element.

I'm looking for general feedback.

Here are the stack subroutines:

; Stack Structure
; Pointer Head
; Pointer Current
; Stack Node
; Pointer Next
; Pointer Value
; input
; void
; output
; stack or 0 on error
StackCreate:
 mov rdi, 24
 call malloc
 test rax,rax
 jz StackCreate_end
 xor rdi, rdi
 mov [rax], rdi
 mov [rax+8], rdi
 ret
StackCreate_end:
 add rsp, 8
 ret
; input
; stack
; output
; void
StackDestroy:
 call free
 ret
; input
; stack
; value
; output
; 0 on error
StackPush:
 push rdi
 push rsi
 mov rdi, 16
 call malloc
 test rax,rax
 jz StackPush_end
 pop rsi
 pop rdi
 mov rdx, [rdi]
 mov [rax], rdx
 mov [rax+8], rsi
 mov [rdi], rax
 ret
StackPush_end:
 add rsp, 16
 ret
; input
; stack
; output
; value or 0 if empty
StackPop:
 mov rsi, rdi
 mov rdi, [rdi]
 test rdi, rdi
 jz StackPop_end
 mov rax, [rdi]
 mov [rsi], rax
 mov rax, [rdi+8]
 push rax
 call free
 pop rax
 ret
StackPop_end:
 xor rax, rax
 ret
 nop;e
; input
; stack
; output
; value or 0 if empty
StackFirst:
 mov rax, [rdi]
 test rax, rax
 jz StackFirst_end
 mov rsi, [rax]
 mov [rdi+8], rsi
 mov rax, [rax+8]
StackFirst_end:
 ret
; input
; stack
; output
; value or 0 if end
StackNext:
 mov rsi, [rdi+8]
 test rsi, rsi
 jz StackNext_end
 mov rax, [rsi]
 mov [rdi+8], rax
 mov rax, [rsi+8]
 ret
StackNext_end:
 xor rax, rax
 ret

Here is a driver:

[extern puts]
[extern malloc]
[extern free]
[SECTION .data]
arguments:
 db "Incorrect arguments.",10,"Expected: stackme <string1> <string2> ...",0
[SECTION .text]
BITS 64
global main
wrong_arguments:
 mov rdi, arguments
 call puts
 jmp main_end
main:
 cmp rdi, 1
 jle wrong_arguments
 lea r12, [rsi+8]
 lea r13, [rdi-1]
 call StackCreate
 test rax, rax
 jz main_end
 mov r14, rax
pushing_loop:
 test r13, r13
 jz pushing_done
 mov rdi, r14
 mov rsi, [r12]
 call StackPush
 add r12, 8
 dec r13
 jmp pushing_loop
pushing_done:
 mov rdi, r14
 call StackFirst
print_loop:
 test rax, rax
 jz print_done
 mov rdi, rax
 call puts
 mov rdi, r14
 call StackNext
 jmp print_loop
print_done:
stack_delete:
 mov rdi, r14
 call StackPop
 test rax, rax
 jnz stack_delete
stack_delete_done:
 mov rdi, r14
 call StackDestroy
main_end:
 xor eax, eax
 ret

Assembling/Linking

nasm -f elf64 -l stackme.lst stackme.asm
gcc -o stackme stackme.o
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Apr 2, 2014 at 2:22
\$\endgroup\$

1 Answer 1

12
+100
\$\begingroup\$

The code is generally well written and easy to understand, but I have a few comments on it that could help improve it.

Eliminate "magic numbers"

In the StackCreate routine, the first instruction is mov rdi,24 but it's not clear what 24 signifies in this context. Either a comment or a named constant or both would help with that.

Add more meaningful comments

The stack structure is not very clear from the comments. We can infer that the stack consists of nodes, but it's not easy to tell from the code or the comments.

Also, consider the user of the stack code. There isn't currently enough within the code to understand the calling sequence, register usage or return values. All of those would be useful to have in the comments. If you're intending to use a standard ABI, it would be useful to say which one.

Use rep ret as appropriate

The gcc compiler emits rep ret when the ret could be the target of a jump. This seems weird (and it is!) but the reason is that the branch prediction logic on both AMD and Intel processors works better when the ret is not the target of branch. So this means the StackFirst_end label in your code should actually have a rep prefix just before the ret.

Use struc as appropriate

Because your stack uses two structures, it would benefit from using NASM's struc/endstruc macros. This would both make it more clear when the code is manipulating data structures and also eliminate quite a few of the "magic numbers" I mentioned earlier.

Consider refactoring StackFirst and StackNext

Other than minor differences in specific register selection, the StackFirst and StackNext routines are very similar. It may be possible to combine them to eliminate some code.

Reduce use of malloc and free

The malloc and free calls tend to be relatively computationally expensive, especially relative to the assembly code you've got. For that reason, it would probably make more sense to either allocate and manage a block rather than calling malloc for every stack push, or to replace calls to malloc and free with some other memory allocation scheme that might be user-replaceable for speed.

Rearrange the loop to avoid unconditional jumps

The current pushing_loop looks like this:

 pushing_loop:
 test r13, r13
 jz pushing_done
 mov rdi, r14
 mov rsi, [r12]
 call StackPush
 add r12, 8
 dec r13
 jmp pushing_loop
 pushing_done:

However, it could be slightly rearranged to eliminate one of the jumps:

 inc r13
 jmp push_test
 pushing_loop:
 mov rdi, r14
 mov rsi, [r12]
 call StackPush
 add r12, 8
 push_test:
 dec r13
 jnz pushing_loop

The unconditional jump is only made once at the top of the loop and all other iterations use only the conditional test at the bottom of the loop. There are a few other places in the code this could be applied.

answered May 5, 2014 at 1:50
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.