5
\$\begingroup\$

I'm writing a program to print the binary string of a hardcoded word. Here is what it looks like currently:

main.asm

section .text
 global _start
 extern _print_binary_content
_start:
 push word [word_to_print] ; pushing word. Can we push just one byte?
 call _print_binary_content
 mov rax, 60
 mov rdi, 0
 syscall
section .data
 word_to_print: dw 0xAB0F

printer.asm

SYS_BRK_NUM equ 0x0C
BITS_IN_WORD equ 0x10
SYS_WRITE_NUM equ 0x01
STD_OUT_FD equ 0x01
FIRST_BIT_BIT_MASK equ 0x01
ASCII_NUMBER_OFFSET equ 0x30
section .text
 global _print_binary_content
_print_binary_content:
 pop rbp
 xor ecx, ecx ;zeroing rcx
 xor ebx, ebx ;zeroing rbx
 pop bx ;the word to print the binary content of
 ;sys_brk for current location
 mov rax, SYS_BRK_NUM
 mov rdi, 0
 syscall
 ;end sys_brk
 mov r12, rax ;save the current brake location
 ;sys_brk for memory allocation 16 bytes
 lea rdi, [rax + BITS_IN_WORD]
 mov rax, SYS_BRK_NUM
 syscall
 ;end sys_brk
 xor ecx, ecx
 mov cl, byte BITS_IN_WORD - 1; used as a counter in the loop below
 loop:
 mov dx, bx
 and dx, FIRST_BIT_BIT_MASK
 add dx, ASCII_NUMBER_OFFSET
 mov [r12 + rcx], dl
 shr bx, 0x01
 dec cl
 cmp cl, 0
 jge loop
 mov rsi, r12
 mov rax, SYS_WRITE_NUM
 mov rdi, STD_OUT_FD
 mov rdx, BITS_IN_WORD
 syscall
 push rbp ; pushing return address back
 ret

If I assemble, link, and run this program, it works. My question is about performance and about the conventions for writing assembly programs. In printer.asm I zero ecx twice, which looks suboptimal. I may also have used some registers for something other than their intended purpose (I used the Intel manual).

Can you please help me to improve this very simple program?

asked Jan 11, 2018 at 12:45
\$\endgroup\$
5
  • \$\begingroup\$ Why did you dynamically allocate memory on the heap (with brk)? And then not free it when you're done? I'm wondering if you had a specific reason for doing that instead of using stack memory for your small fixed-size buffer like in this integer->decimal string function. Also, "binary" is ambiguous in this context. I thought from the title you were going to call write(1, &word, 2), but you're actually converting the word to a base-2 string. \$\endgroup\$ Commented Jan 11, 2018 at 19:39
  • 1
    \$\begingroup\$ push word [word_to_print] - I would have expected this to fail to assemble, but it works! On common 64-bit OSes there are often strict requirements on rsp, such as keeping it 16-byte aligned before calling other functions. That matters if you want to respect the ABI calling convention; here you are calling your own custom function, which does not obey the ABI, so you can misalign the stack without crashing. But it's not clear why you pass the word argument on the stack at all: since you are using a custom calling convention anyway, why not use registers instead? \$\endgroup\$ Commented Jan 29, 2018 at 0:09
  • \$\begingroup\$ @Ped7g That's why I was asking for a review. But now I more or less understand, thank you. \$\endgroup\$ Commented Jan 29, 2018 at 11:57
  • \$\begingroup\$ To get more practice with stack-based argument passing, look at examples of 32-bit ABI code, which passes arguments on the stack (and note how it avoids popping and re-pushing the return address by addressing the arguments in memory through ebp or esp). But first look at 64-bit Linux ABI examples, which pass arguments in registers: that is much easier to learn and understand, and faster. Overall the 64-bit Linux ABI is superior to the 32-bit one, but it places many more requirements on the rsp value itself, which is tricky for people moving from 32-bit to 64-bit. Then work through the 32-bit stack examples to build your skills. \$\endgroup\$ Commented Jan 29, 2018 at 12:02
  • 1
    \$\begingroup\$ @Ped7g Actually, later on I found the Intel manual's description of which register has which purpose. There is a good explanation there. I mean which registers' values have to be preserved across function calls. \$\endgroup\$ Commented Jan 29, 2018 at 12:08

3 Answers

3
\$\begingroup\$
  • In main.asm you use the magic number 60 while in printer.asm all syscalls have their numbers declared. That's inconsistent.

  • Instead of using SYS_BRK, you should just allocate the memory on the stack. The stack will surely have 16 bytes to spare, so it's not a big deal to keep the buffer there. Plus, stack allocation and deallocation are a single instruction each, which is much faster than going through brk syscalls.

  • Your code produces a memory leak. If you call _print_binary_content repeatedly, your process will allocate more and more memory, 16 more bytes for each call. By contrast, if you allocate the memory on the stack and forget to deallocate it, your program will crash quickly, so the mistake will be found early during initial testing.

  • There's no reason to pop the return address into rbp and push it back at the end of the function. This will confuse every debugger out there. Why don't you just leave it on the stack? Afraid of buffer overflows? Your buffer is currently not on the stack, and even if it were, you should just put two canary values around your buffer and check these before returning.

  • In English reading order, the bit with the mask 0x0001 is the last bit, not the first bit.

  • The UNIX convention for writing the syscall constants is SYS_write and SYS_exit, i.e. uppercase SYS and lowercase system call name.

  • The stdout file number is spelled STDOUT_FILENO in POSIX, which is the relevant standard for that constant.
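
As a rough illustration of the points above, here is a minimal sketch that keeps the buffer on the stack and leaves the return address where it is. It assumes a different calling convention from the question: the word arrives in DI (the first System V AMD64 argument register) instead of on the stack. The name print_binary_content and the constants are chosen for this sketch only.

```nasm
; Sketch only: prints the 16 bits of DI as ASCII '0'/'1' to stdout.
; Assumption (not in the question): the word is passed in DI.
SYS_write       equ 1
STDOUT_FILENO   equ 1
BITS_IN_WORD    equ 16

section .text
    global print_binary_content
print_binary_content:
    sub     rsp, BITS_IN_WORD       ; reserve 16 bytes; return address is untouched
    mov     ecx, BITS_IN_WORD - 1   ; index of the last buffer byte
.next_bit:
    mov     eax, edi
    and     eax, 1                  ; isolate the lowest bit
    add     eax, '0'                ; convert it to ASCII '0' or '1'
    mov     [rsp + rcx], al         ; fill the buffer from the end
    shr     edi, 1                  ; move on to the next bit
    dec     ecx
    jns     .next_bit               ; stop once ecx drops below zero
    mov     eax, SYS_write
    mov     edi, STDOUT_FILENO
    mov     rsi, rsp                ; address of the buffer
    mov     edx, BITS_IN_WORD       ; number of bytes to write
    syscall
    add     rsp, BITS_IN_WORD       ; release the buffer
    ret
```

A caller would load the argument with something like movzx edi, word [word_to_print] before the call. Nothing is allocated with brk, so there is nothing to leak.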

answered Jan 11, 2018 at 23:55
\$\endgroup\$
4
\$\begingroup\$

Low-level programming is as much about optimization as it is about legibility. As you've noted, yours works, but I'd like to inject a few of my personal preferences here.

_start:
 ; Grab a buffer from stack suitably large enough to convert a 64 bit number
 push rbp
 mov rbp, rsp
 sub rsp, 64 ; Always keep RSP at least DWORD aligned

I prefer to keep functions distinct. Incorporating the print into the conversion function is doable, but this conversion code might be useful somewhere that printing isn't required.

 mov esi, Value ; Establish pointer to 16 bit value
 mov rdi, rbp ; Point to end of ASCII buffer
 call Bin2Asc ; Return a pointer in RSI and characters in RDX

To accommodate SYSCALLs, have the function return its values in the register(s) that the syscall will require.

 mov eax, SYS_WRITE
 syscall
 leave ; Kill procedure frame
 xor eax, eax
 mov edi, eax
 mov al, SYS_EXIT ; Syscall number goes in RAX (upper bits already zero)
 syscall

NOTE: I often use 32-bit registers because I'm taking advantage of the processor's zero-extension mechanism: writing a 32-bit register clears the upper 32 bits of the corresponding 64-bit register.

Bin2Asc:
 xor eax, eax ; Clear accum
 mov edx, eax ; EDX is what SYS_WRITE is going to need
 lodsw
 std ; Change direction so buffer is populated in reverse
 .L0: push rax
 and al, 1 ; Isolate bit
 xor al, '0' ; Convert to "0" or "1"
 stosb ; Write to buffer
 inc dl ; Bump character count
 pop rax
 shr eax, 1 ; Until RAX = 0
 jnz .L0
 cld ; Reset direction flag
 inc edi ; Point back to first character
 ret

This code may well contain errors: I'm not on a Linux box, so I couldn't test it the way I usually like to. It does, however, demonstrate a concept that is optimized at least a bit.
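
For comparison, here is one possible self-contained variant of the same idea, separating conversion from printing. It follows the suggestion in the comments on this answer: decrement a full 64-bit pointer and store through it, instead of using std/stosb/cld and push/pop inside the loop. The register assignment (value in EDI, end-of-buffer pointer in RSI) and the names bin2asc and value are assumptions made for this sketch, not the interface used above.

```nasm
; Sketch only: convert a 16-bit value to ASCII binary, then write it out.
SYS_write   equ 1
SYS_exit    equ 60
STDOUT      equ 1
BUF_SIZE    equ 64

section .data
    value:  dw 0xAB0F

section .text
    global _start
_start:
    sub     rsp, BUF_SIZE           ; scratch buffer on the stack
    movzx   edi, word [value]       ; zero-extended 16-bit input
    lea     rsi, [rsp + BUF_SIZE]   ; one past the end of the buffer
    call    bin2asc                 ; RSI = first character, EDX = length
    mov     eax, SYS_write
    mov     edi, STDOUT
    syscall
    mov     eax, SYS_exit
    xor     edi, edi                ; exit status 0
    syscall

; In:  EDI = value, RSI = pointer one past the end of the buffer
; Out: RSI = pointer to the first character, EDX = character count
; Clobbers EAX and EDI. Leading zero bits are not emitted.
bin2asc:
    xor     edx, edx                ; character count
.next_bit:
    mov     eax, edi
    and     eax, 1                  ; isolate the lowest bit
    add     eax, '0'                ; convert it to ASCII '0' or '1'
    dec     rsi                     ; step back one byte (full 64-bit pointer)
    mov     [rsi], al
    inc     edx
    shr     edi, 1                  ; sets ZF once no bits remain
    jnz     .next_bit
    ret
```

Because the loop stops when the remaining value reaches zero, the output length varies with the input, and an input of zero still produces a single '0'.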

answered Jan 11, 2018 at 23:46
\$\endgroup\$
2
  • \$\begingroup\$ Writing a 32-bit register zero extends to 64-bit. (Fun fact: MIPS64 shift instructions do sign-extend implicitly; most other 32-bit MIPS instructions require operands to be correctly sign-extended on a MIPS64) \$\endgroup\$ Commented Apr 11, 2020 at 12:48
  • \$\begingroup\$ inc edi - You just truncated a 64-bit pointer to 32-bit. Passing that to write() will return -EFAULT. And BTW, you might as well just dec rdi / mov [rdi], al inside the loop instead of messing around with std/cld and stosb, and a fixup inc at the end. And really, push/pop inside the loop? \$\endgroup\$ Commented Apr 11, 2020 at 12:51
3
\$\begingroup\$

Sorry I have to virtually repeat my answer to another question.

  • The only advantage assembler has over high-level languages is access to the flags. And a bit-to-ASCII conversion literally begs for it:

     mov byte [r12 + rcx], ASCII_NUMBER_OFFSET ; Prepare the destination
     shr bx, 1 ; The LSB lands in the carry flag (CF)
     adc byte [r12 + rcx], 0 ; Add with carry!
    
  • Along the same lines, you don't really need cmp cl, 0. The dec cl instruction already sets the sign and overflow flags that jge tests (it leaves only CF untouched), so

     dec cl
     jge loop
    

    suffices.
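
Putting both points together, a loop along these lines might look like the sketch below. It is a variation, not the exact code above: the carry is added in a register instead of directly in memory, and the interface (buffer pointer in RDI, word in ESI) and the name fill_bits are assumptions made for this example.

```nasm
; Sketch only: fill a 16-byte buffer with the ASCII bits of a word.
BITS_IN_WORD equ 16

section .text
    global fill_bits
; In:  RDI = pointer to a 16-byte buffer, ESI = word to convert
; Clobbers EAX, ECX and ESI.
fill_bits:
    mov     ecx, BITS_IN_WORD - 1   ; start at the last buffer byte
.next_bit:
    shr     esi, 1                  ; the lowest bit moves into CF
    mov     eax, '0'                ; mov does not modify any flags
    adc     eax, 0                  ; '0' + CF gives '0' or '1'
    mov     [rdi + rcx], al
    dec     ecx                     ; updates SF/OF/ZF, leaves CF alone
    jge     .next_bit               ; taken while ecx is still >= 0
    ret
```

No cmp is needed, and no and/add pair is needed to isolate the bit: the shift and the decrement already leave everything required in the flags.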

answered Jan 11, 2018 at 23:27
\$\endgroup\$
