Assembler. Algorithm for incrementing a decimal number

Question 1

I share an algorithm that I have written myself. It is useful for incrementing any digit of a decimal number. That way, a conversion to decimal is not necessary.

Please excuse the mistakes. I am learning assembler and x86 architecture. I am sure that it is possible to improve and optimize it (thank you). This code obviously uses recursion. But it doesn't go through the unnecessary digits. Following the advice of Peter Cordes. I think the algorithm is very readable thanks to the labels. It is written using the NASM syntax. Tested on Intel i3 x86-64 CPU and Linux operating system.

section .data
 number db "////////////////////////////////////////////////////////////////";For 64-digits integers
 countN dd 0
section .bss
 index resb 1 
section .text
 global CMAIN
INCREMENT:
 ;xor rax, rax
 ;mov eax, [index]
 cmp byte[number+eax], "9" 
 jz EQUAL
 jmp checkNEWdigit
EQUAL:
 mov byte[number +eax],"0"
 dec eax
 jmp INCREMENT
NEWdigit:
 mov byte[number + eax],"1"
 inc byte[countN]
 jmp FINISH
checkNEWdigit:
 cmp byte[number + eax],"/" ;check if the position is empty
 jz NEWdigit
 inc byte[number+eax]
FINISH: 
 ret
 
CMAIN:
;Set the number to be increased- For example: 
mov byte[number+59],"9"; Ten thousands
mov byte[number+60],"9"; Thousands
mov byte[number+61],"9"; Hundreds
mov byte[number+62],"9"; Tens
mov byte[number+63],"9"; Ones
;Until 64 digits
mov byte[countN],5 ;Set the number of digits that the number has at the beginning.
mov [index],byte 63 ;Set the position to be increased(63-ones;62-Tens;61-Hundreds..etc..),
xor rax, rax
mov eax, [index];the use of the variable "index" is for readability only.
call INCREMENT
;the next instructions are needed to calculate the position to be printed
mov rbx, 64 ;total of digits
sub rbx, [countN] ;sub number of digits used 
add rbx,number;add the base 
;print
xor rax,rax
mov rax, 1 
mov rdi, 1
mov rsi, rbx
mov rdx, [countN]
syscall
;exit
mov rax,60
mov rdi, 2
syscall

Question 2

This code obviously uses recursion.

That's not the case. For recursion your INCREMENT routine would have to call itself which it doesn't. The INCREMENT subroutine simply iterates over the characters in the text.

You're right that there is room for improvement. I'll share these observations with you.

xor rax,rax
mov rax, 1

It's not useful to clear RAX right before loading it with a value that is going to overwrite the whole qword anyway.
Writing to the low dword of a 64-bit register will automatically zero out the high dword.

The above code simply becomes mov eax, 1. You can apply this several times using different registers.

mov byte[countN],5
sub rbx, [countN]
mov rdx, [countN]
mov eax, [index]

You have defined countN as a dword variable but the program is using it both as a byte and as a qword!
You have reserved a single byte for the index variable but your program is also using it as a dword!
Always use your variables for what size they really have. Don't count on the fact that your index variable happens to be the last item in the .BSS and that your program worked well because of this.

mov byte[number+59],"9"; Ten thousands
mov byte[number+60],"9"; Thousands
mov byte[number+61],"9"; Hundreds
mov byte[number+62],"9"; Tens
mov byte[number+63],"9"; Ones

You can initialize the integer number "99999" using fewer instructions:

mov rax, "///99999" (*)
mov [number+56], rax

(*) The x86-64 architecture only allows 64-bit immediates on register loads. See this SO answer for more info.

mov rbx, 64 ;total of digits
sub rbx, [countN] ;sub number of digits used 
add rbx,number;add the base

Calculating the address where the number starts only needs these 2 instructions, and because you need this in RSI, why not calculate it in RSI:

lea esi, [number+64]
sub esi, [countN]

 jz EQUAL
 jmp checkNEWdigit
EQUAL:

Code like this should use the opposite condition and shave off the extra jump:

 jnz checkNEWdigit
EQUAL:

I think the algorithm is very readable thanks to the labels.

You're right that it is readable, but then again it is also some kind of spaghetti code with all that jumping around!

If you can accept using another character like ":" for the non-occupied positions, then your INCREMENT subroutine would only need 1 comparison instead of the current 2. The trick is to choose a character that is above the character "9" in the ASCII set.
Below is my rewrite that reduces the number of instructions that are needed:

INCREMENT:
 cmp byte [number+rax], "9" ; "0123456789:"
 jb .Increment
 mov byte [number+rax], "0"
 ja .NewDigit
 dec eax
 jmp INCREMENT
.NewDigit:
 inc dword [countN]
.Increment:
 inc byte [number+rax]
 ret

Please notice that creating a new digit happens in 2 steps: first "0" is written over the ":" and then that zero is incremented. Because creating a new digit happens rarely, this can't possibly make the code any worse. It does however produce clean code.

You've done well to not include code that checks for an overflow after the 64th digit. Who would still be around to see that happen?

Question 3

Thank you. I will apply the changes. I thought I could name recursion. I only use the basic instructions because I have a lot to learn.Thank you very much also for this ":"

Question 4

I have needed to use this to get it to work, mov rax, ":::19999" mov qword [number+56],rax To what is due? Is it better to use this to maintain 32bit compatibility? mov eax, ":::9" mov dword [number+56],eax mov eax, "9999" mov dword [number+60],eax –

Question 5

By the way, now I'm using 32-bit registers, and initialized countN with dd. and reserving index with resd. It is right?

Question 6

@Sep: Writing to the low dword of a 64-bit register will automatically zero out the high dword if the value is at most 2GB-1. - You seem to be mixing up two things. Writing the low 32 bits, like mov eax, -1, will always zero-extend and clear the high bits of RAX, allowing 64-bit values of 0..4G-1. Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?. Writing the whole 64-bit register with a value between 0 and 2^31-1 will zero the high half. So NASM can optimize mov rax, 1 into mov eax, 1 for you.

Question 7

Also, [number+eax] should be [number+rax]. When you know EAX is already zero-extended into RAX, there's no advantage to override the addressing to 32-bit instead of the default 64-bit. (@FER31). Although I probably would have used a pointer, instead of hard-coding the address of static storage into the increment function, not least because then it's usable in a PIE executable or on MacOS, not dependent on number's absolute address fitting into a sign-extended disp32.

Sep Roland Sep Roland 4,78317 silver badges28 bronze badges · Accepted Answer · 2021-06-26 22:34:16Z

This code obviously uses recursion.

That's not the case. For recursion your INCREMENT routine would have to call itself which it doesn't. The INCREMENT subroutine simply iterates over the characters in the text.

You're right that there is room for improvement. I'll share these observations with you.

xor rax,rax
mov rax, 1

It's not useful to clear RAX right before loading it with a value that is going to overwrite the whole qword anyway.
Writing to the low dword of a 64-bit register will automatically zero out the high dword.

The above code simply becomes mov eax, 1. You can apply this several times using different registers.

mov byte[countN],5
sub rbx, [countN]
mov rdx, [countN]
mov eax, [index]

You have defined countN as a dword variable but the program is using it both as a byte and as a qword!
You have reserved a single byte for the index variable but your program is also using it as a dword!
Always use your variables for what size they really have. Don't count on the fact that your index variable happens to be the last item in the .BSS and that your program worked well because of this.

mov byte[number+59],"9"; Ten thousands
mov byte[number+60],"9"; Thousands
mov byte[number+61],"9"; Hundreds
mov byte[number+62],"9"; Tens
mov byte[number+63],"9"; Ones

You can initialize the integer number "99999" using fewer instructions:

mov rax, "///99999" (*)
mov [number+56], rax

(*) The x86-64 architecture only allows 64-bit immediates on register loads. See this SO answer for more info.

mov rbx, 64 ;total of digits
sub rbx, [countN] ;sub number of digits used 
add rbx,number;add the base

Calculating the address where the number starts only needs these 2 instructions, and because you need this in RSI, why not calculate it in RSI:

lea esi, [number+64]
sub esi, [countN]

 jz EQUAL
 jmp checkNEWdigit
EQUAL:

Code like this should use the opposite condition and shave off the extra jump:

 jnz checkNEWdigit
EQUAL:

I think the algorithm is very readable thanks to the labels.

You're right that it is readable, but then again it is also some kind of spaghetti code with all that jumping around!

If you can accept using another character like ":" for the non-occupied positions, then your INCREMENT subroutine would only need 1 comparison instead of the current 2. The trick is to choose a character that is above the character "9" in the ASCII set.
Below is my rewrite that reduces the number of instructions that are needed:

INCREMENT:
 cmp byte [number+rax], "9" ; "0123456789:"
 jb .Increment
 mov byte [number+rax], "0"
 ja .NewDigit
 dec eax
 jmp INCREMENT
.NewDigit:
 inc dword [countN]
.Increment:
 inc byte [number+rax]
 ret

Please notice that creating a new digit happens in 2 steps: first "0" is written over the ":" and then that zero is incremented. Because creating a new digit happens rarely, this can't possibly make the code any worse. It does however produce clean code.

You've done well to not include code that checks for an overflow after the 64th digit. Who would still be around to see that happen?

Thank you. I will apply the changes. I thought I could name recursion. I only use the basic instructions because I have a lot to learn.Thank you very much also for this ":"
I have needed to use this to get it to work, mov rax, ":::19999" mov qword [number+56],rax To what is due? Is it better to use this to maintain 32bit compatibility? mov eax, ":::9" mov dword [number+56],eax mov eax, "9999" mov dword [number+60],eax –
By the way, now I'm using 32-bit registers, and initialized countN with dd. and reserving index with resd. It is right?
@Sep: Writing to the low dword of a 64-bit register will automatically zero out the high dword if the value is at most 2GB-1. - You seem to be mixing up two things. Writing the low 32 bits, like mov eax, -1, will always zero-extend and clear the high bits of RAX, allowing 64-bit values of 0..4G-1. Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?. Writing the whole 64-bit register with a value between 0 and 2^31-1 will zero the high half. So NASM can optimize mov rax, 1 into mov eax, 1 for you.
Also, [number+eax] should be [number+rax]. When you know EAX is already zero-extended into RAX, there's no advantage to override the addressing to 32-bit instead of the default 64-bit. (@FER31). Although I probably would have used a pointer, instead of hard-coding the address of static storage into the increment function, not least because then it's usable in a PIE executable or on MacOS, not dependent on number's absolute address fitting into a sign-extended disp32.

Stack Exchange Network

Assembler. Algorithm for incrementing a decimal number

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Assembler. Algorithm for incrementing a decimal number

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions