I'm writing a program to print the binary string of a hardcoded word. Here is how it currently looks:
main.asm
section .text
global _start
extern _print_binary_content
_start:
push word [word_to_print] ; pushing word. Can we push just one byte?
call _print_binary_content
mov rax, 60
mov rdi, 0
syscall
section .data
word_to_print: dw 0xAB0F
printer.asm
SYS_BRK_NUM equ 0x0C
BITS_IN_WORD equ 0x10
SYS_WRITE_NUM equ 0x01
STD_OUT_FD equ 0x01
FIRST_BIT_BIT_MASK equ 0x01
ASCII_NUMBER_OFFSET equ 0x30
section .text
global _print_binary_content
_print_binary_content:
pop rbp
xor ecx, ecx ;zeroing rcx
xor ebx, ebx ;zeroing rbx
pop bx ;the word to print the binary content of
;sys_brk for current location
mov rax, SYS_BRK_NUM
mov rdi, 0
syscall
;end sys_brk
mov r12, rax ;save the current brake location
;sys_brk for memory allocation 16 bytes
lea rdi, [rax + BITS_IN_WORD]
mov rax, SYS_BRK_NUM
syscall
;end sys_brk
xor ecx, ecx
mov cl, byte BITS_IN_WORD - 1; used as a counter in the loop below
loop:
mov dx, bx
and dx, FIRST_BIT_BIT_MASK
add dx, ASCII_NUMBER_OFFSET
mov [r12 + rcx], dl
shr bx, 0x01
dec cl
cmp cl, 0
jge loop
mov rsi, r12
mov rax, SYS_WRITE_NUM
mov rdi, STD_OUT_FD
mov rdx, BITS_IN_WORD
syscall
push rbp ; pushing return address back
ret
If I compile, link, and run this program, it works. But my question is about performance and perhaps the conventions for writing assembly programs. In the file printer.asm I cleared ecx twice, which looks suboptimal. Maybe some registers were not used for their intended purpose (I used the Intel manual).
Can you please help me to improve this very simple program?
3 Answers
In main.asm you use the magic number 60 while in printer.asm all syscalls have their numbers declared. That's inconsistent.
Instead of using SYS_BRK, you should just allocate the memory on the stack. The stack will surely have 16 bytes, so it's not a big deal to have them there. Plus, allocation and deallocation on the stack is much faster than using 3 syscalls.
Your code produces a memory leak. If you run print_binary_content repeatedly, your process will allocate more and more memory, 16 more bytes for each call. Allocating the memory on the stack will crash your program quickly if you forget to deallocate the memory, and this will be found early during initial testing.
There's no reason to pop the return address into rbp and push it back at the end of the function. This will confuse every debugger out there. Why don't you just leave it on the stack? Afraid of buffer overflows? Your buffer is currently not on the stack, and even if it were, you should just put two canary values around your buffer and check these before returning.
In English reading order, the bit with the mask 0x0001 is the last bit, not the first bit.
The UNIX convention for writing the syscall constants is SYS_write and SYS_exit, i.e. uppercase SYS and lowercase system call name.
The stdout file number is spelled STDOUT_FILENO in POSIX, which is the relevant standard for that constant.
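As a rough illustration of the points above, here is a minimal sketch of the routine with the buffer on the stack and the constants named per the UNIX/POSIX spellings. It assumes a hypothetical calling convention in which the word arrives in di (the original passes it on the stack), and the label names are made up for the example.

```nasm
; Sketch only. Assumption: the 16-bit word is passed in di, not on the stack.
SYS_write      equ 1
STDOUT_FILENO  equ 1
BITS_IN_WORD   equ 16

section .text
global print_binary_content

print_binary_content:
    sub     rsp, BITS_IN_WORD        ; reserve a 16-byte buffer on the stack
    mov     ecx, BITS_IN_WORD - 1    ; index of the last character
.next_bit:
    mov     eax, edi
    and     eax, 1                   ; isolate the lowest bit
    add     eax, '0'                 ; convert it to ASCII '0' or '1'
    mov     [rsp + rcx], al          ; store it, filling the buffer right to left
    shr     edi, 1                   ; move on to the next bit
    dec     ecx
    jns     .next_bit                ; repeat while the index is still >= 0

    mov     eax, SYS_write
    mov     edi, STDOUT_FILENO
    mov     rsi, rsp                 ; buffer address
    mov     edx, BITS_IN_WORD        ; buffer length
    syscall

    add     rsp, BITS_IN_WORD        ; release the buffer
    ret
```

Under that assumption the caller would load the argument with movzx edi, word [word_to_print] before the call. The return address stays on the stack the whole time, so nothing has to be popped and pushed back.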
Low-level programming is as much about optimization as it is about legibility. As you've noted, yours works, but I'd like to inject a few of my personal preferences here.
_start:
; Grab a buffer from stack suitably large enough to convert a 64 bit number
push rbp
mov rbp, rsp
sub rsp, 64 ; Always keep RSP at least DWORD aligned
I prefer to keep functions distinct. Incorporating the print into the conversion function is doable, but this conversion code might be used somewhere that printing isn't required.
mov esi, Value ; Establish pointer to 16 bit value
mov rdi, rbp ; Point to end of ASCII buffer
call Bin2Asc ; Return a pointer in RSI and characters in RDX
To accommodate SYSCALLs, have the function return its values in the register(s) the syscall will require.
mov eax, SYS_WRITE
syscall
leave ; Kill procedure frame
xor eax, eax
mov edi, eax
mov al, SYS_EXIT ; The syscall number goes in RAX
syscall
NOTE: I often use 32-bit registers because I'm taking advantage of the processor's sign extension mechanism, where some instructions sign-extend into 64 bits.
Bin2Asc:
xor eax, eax ; Clear accum
mov edx, eax ; EDX is what SYS_WRITE is going to need
lodsw
std ; Change direction so buffer is populated in reverse
.L0: push rax
and al, 1 ; Isolate bit
xor al, '0' ; Convert to "0" or "1"
stosb ; Write to buffer
inc dl ; Bump character count
pop rax
shr eax, 1 ; Until RAX = 0
jnz .L0
cld ; Reset direction flag
inc edi ; Point back to first character
ret
Might this code have ERRORS? It very well may, as I'm not doing this on a Linux box and couldn't check my work as I usually like to, but it does demonstrate a concept that is optimized at least a bit.
- Writing a 32-bit register zero extends to 64-bit. (Fun fact: MIPS64 shift instructions do sign-extend implicitly; most other 32-bit MIPS instructions require operands to be correctly sign-extended on a MIPS64) (Peter Cordes, Apr 11, 2020 at 12:48)
- inc edi - You just truncated a 64-bit pointer to 32-bit. Passing that to write() will return -EFAULT. And BTW, you might as well just dec rdi / mov [rdi], al inside the loop instead of messing around with std/cld and stosb, and a fixup inc at the end. And really, push/pop inside the loop? (Peter Cordes, Apr 11, 2020 at 12:51)
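For reference, a conversion routine along the lines suggested in the comments above might look like the sketch below. It is not the answer author's code: the name bin2asc and the register assignments (value in edi, buffer end in rsi) are assumptions chosen so that the results land where a following write syscall expects them.

```nasm
; Sketch only; the register interface is an assumption made for illustration.
; in:  edi = value to convert (zero-extended 16-bit word)
;      rsi = address one past the end of the output buffer
; out: rsi = address of the first character written
;      edx = number of characters written
section .text
global bin2asc

bin2asc:
    xor     edx, edx         ; character count; writing edx zero-extends into rdx
.next_bit:
    mov     eax, edi
    and     eax, 1           ; isolate the lowest bit
    add     eax, '0'         ; convert it to ASCII '0' or '1'
    dec     rsi              ; step back one byte using the full 64-bit pointer
    mov     [rsi], al        ; store the digit
    inc     edx              ; bump the character count
    shr     edi, 1           ; sets ZF once no bits remain
    jnz     .next_bit
    ret
```

Like the original, this emits no leading zeros and always produces at least one digit. The caller only has to set eax to the write syscall number and edi to the file descriptor before issuing syscall, and no direction flag, push/pop pair, or 32-bit pointer arithmetic is involved.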
Sorry I have to virtually repeat my answer to another question.
The only advantage assembler has over high-level languages is access to the flags. And a bit-to-ASCII conversion literally begs for it:
mov byte [r12 + rcx], ASCII_NUMBER_OFFSET ; Prepare the destination
shr dx, 1 ; The LSB lands in the carry flag (CF)
adc byte [r12 + rcx], 0 ; Add with carry!
Along the same lines, you don't really need to cmp cl, 0. A dec cl instruction conveniently sets the necessary flags when cl falls below 0, so dec cl followed by jge loop suffices.
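Put together, the two flag tricks could be used as in the sketch below. It is a standalone program, not the asker's two-file layout; the stack buffer and the UNIX-style constant names are assumptions borrowed from the other answers.

```nasm
; Sketch only: a single-file program built around shr/adc and dec/jge.
SYS_write      equ 1
SYS_exit       equ 60
STDOUT_FILENO  equ 1
BITS_IN_WORD   equ 16

section .data
word_to_print: dw 0xAB0F

section .text
global _start

_start:
    movzx   ebx, word [word_to_print]    ; load the word, zero-extended
    sub     rsp, BITS_IN_WORD            ; 16-byte buffer on the stack
    mov     ecx, BITS_IN_WORD - 1        ; index of the last character
.next_bit:
    mov     byte [rsp + rcx], '0'        ; prepare the destination (mov leaves flags alone)
    shr     ebx, 1                       ; the LSB lands in the carry flag
    adc     byte [rsp + rcx], 0          ; '0' + CF gives '0' or '1'
    dec     ecx                          ; sets the flags, so no cmp is needed
    jge     .next_bit                    ; loop until the index drops below 0

    mov     eax, SYS_write
    mov     edi, STDOUT_FILENO
    mov     rsi, rsp
    mov     edx, BITS_IN_WORD
    syscall

    mov     eax, SYS_exit
    xor     edi, edi                     ; exit status 0
    syscall
```

Assembled with nasm -f elf64 and linked with ld on x86-64 Linux, this should print 1010101100001111 for the word 0xAB0F.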
- brk)? And then not free it when you're done? I'm wondering if you had a specific reason for doing that instead of using stack memory for your small fixed-size buffer like in this integer->decimal string function. Also, "binary" is ambiguous in this context. I thought from the title you were going to call write(1, &word, 2), but you're actually converting the word to a base-2 string.
- push word [word_to_print] - I would expect this one even to fail to compile, but it works! In 64b common OSes there are often very stringent requirements for rsp modifications, like keeping it 16B aligned before calling other functions (if you want to respect the ABI calling convention; as in this case you are calling your own custom function, which is not obeying the ABI, you can misalign the stack without running into some crash). But it's not even clear why you put the word argument on the stack. While using a custom calling convention, why don't you use registers instead?
- pop/push of return address by using ebp or esp for addressing also arguments in memory), but first check 64b linux ABI examples, which pass arguments in registers. That is much easier to learn and understand and faster performance-wise; overall the 64b linux ABI is superior to the 32b ABI (but has a lot more requirements for the rsp value itself! That's tricky for people moving 32->64). Then check the 32b stack examples, to build your skills.