Assembly: Sum up the single bytes of a 32bit-register to a checksum

Question 1

Exercise description:

"Write a program that takes a double word (4 bytes) as an argument, and then adds all the 4 bytes. It returns the sum as output. Note that all the bytes are considered to be of unsigned value.

Example: For the number 03ff0103 the program will calculate 0x03 + 0xff + 0x01 + 0x3 = 0x106, and the output will be 0x106

HINT: Use division to get to the values of the highest two bytes."

Full description on GitHub: xorpd

The code I've written:

format PE console
entry start
include 'win32a.inc' 
; ===============================================
section '.text' code readable executable
start:
 mov eax, 0x01020304 
 xor ebp, ebp
process_eax: 
 movzx ebx, al 
 add ecx, ebx
 movzx ebx, ah
 add ecx, ebx
 cmp ebp, 0x1
 je print_result
 xor edx, edx
 mov ebx, 0xffff
 div ebx
 mov ebp, 0x1
 jmp process_eax 
print_result:
 mov eax, ecx
 call print_eax ; Provided by the teacher. Prints eax to the console.
exitProgram: 
 ; Exit the process:
 push 0
 call [ExitProcess]
include 'training.inc'

I think it works. I've tried it with different values and the sums were correct.

[Screenshot: console output of the code above, with 0x01020304 as the hardcoded value]

But it's surely not the most efficient way to solve the exercise.

Question 2

"HINT: Use division to get to the values of the highest two bytes." What? Why not shift? A simple loop in which you add eax & 0xFF to the sum and shift eax right by 8 would suffice, wouldn't it?
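
The loop suggested in this comment can be sketched as follows. C is used instead of FASM only so that the arithmetic is easy to check; the function name sum_bytes_loop is made up for this illustration, and x stands in for eax.

```c
#include <stdint.h>

/* Hypothetical helper: x plays the role of eax, sum the role of the
   accumulator register. */
uint32_t sum_bytes_loop(uint32_t x)
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        sum += x & 0xFF;  /* add the lowest byte, treated as unsigned */
        x >>= 8;          /* shift the next byte down into position */
    }
    return sum;
}
```

For the exercise's example, sum_bytes_loop(0x03ff0103) gives 0x106.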

Question 3

Comment your code, regardless of language (but in keeping with the language's conventions). What is the pair mov ebx, 0xffff and div ebx supposed to do? Does your code work for 0xdeadface?

Question 4

@kyrill I thought about shifts too; I know them from Java. But because they will be the topic of the NEXT lecture, I thought I was expected to find a way without using them.

Question 5

@greybeard Yep, you are right. I admit I have neglected comments. I will take better care of that.

Question 6

Since you're still learning, I won't cheat you out of the opportunity to discover for yourself, but I will offer some words of advice on how you can improve your program.

Minimize register use

The current code uses eax, ebx, ecx, edx and ebp. One of the most important things for an assembly language programmer is to use registers efficiently and effectively. This particular task can easily be done with just two registers.

Prefer shift to division

As alluded to in a comment, shift instructions are typically much faster to execute than divide instructions. For that reason, in tasks like this, it's much more common to see a shift than a divide.

Avoid loops

Branching tends to be computationally disruptive for processors. While modern desktop machines tend to compensate for this via speculative execution and large cache sizes, code often runs faster if loops and branches are avoided entirely. This can confer other benefits such as more predictable running time which can be important for the scheduling of Real Time Operating Systems (RTOS) and in some kinds of cryptographic code to provide some resistance to side channel attacks.
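
To illustrate the last two points, here is a sketch in C rather than assembly, so the translation to registers is still left to the reader. It is not code from this answer, and the name sum_bytes_unrolled is invented here. It uses only shifts and masks, contains no loop or branch, and needs just two variables, which map naturally onto two registers.

```c
#include <stdint.h>

/* Hypothetical helper: x stands in for the input register, sum for the
   result register. */
uint32_t sum_bytes_unrolled(uint32_t x)
{
    uint32_t sum = x & 0xFF;   /* byte 0 */
    x >>= 8;
    sum += x & 0xFF;           /* byte 1 */
    x >>= 8;
    sum += x & 0xFF;           /* byte 2 */
    x >>= 8;
    sum += x;                  /* byte 3: nothing is left above it to mask */
    return sum;
}
```

Because every input takes exactly the same path through the code, the running time does not depend on the value being summed.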

Question 7

There are a few errors in this program.

You build the result in ECX but never clear it beforehand. If the results were correct, as you stated, it's because ECX happened to contain zero when the program started and you got lucky.
To bring the high word down to the low word, you need to divide by 65536, not by 65535 (0xffff) like you did.
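
The second error is easy to miss because the test value hides it. A small C sketch (the helper names are made up for this illustration) models the quotient that div ebx leaves in eax for each divisor, assuming edx is zero as in the program:

```c
#include <stdint.h>

/* Quotient left in eax by `div ebx` when edx = 0 and ebx = 0xffff. */
uint32_t high_word_div_ffff(uint32_t x)
{
    return x / 0xFFFFu;
}

/* Quotient for the intended divisor 65536 (0x10000). */
uint32_t high_word_div_10000(uint32_t x)
{
    return x / 0x10000u;
}
```

For 0x01020304 both functions return 0x0102, so the original program prints the right sum. For 0xdeadface the first returns 0xdeae instead of 0xdead, so the checksum comes out one too high.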

Optimizations.

Instead of dividing, a mere shift right by 16 would produce the same result.

Of course I noticed that the task hinted to use the division operation, but then again a hint is just a hint, not something mandatory!
The second instruction, movzx ebx, ah, could also be written as mov bl, ah, since the highest 24 bits of EBX are still zero from the preceding movzx ebx, al.
You're using EBP as a flag (values 0 and 1 only). You can replace cmp ebp, 0x1 by the shorter test ebp, ebp. Remember to jump on the opposite condition: jnz print_result.
You're using EBP as a flag (values 0 and 1 only). You can replace mov ebp, 0x1 by the shorter inc ebp.
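
Why mov bl, ah is a safe replacement at that point can be modelled in C. This is only a sketch: the function names are invented, and ebx and eax are ordinary variables standing in for the registers. Writing BL replaces just the low 8 bits of EBX, so the result equals that of movzx ebx, ah exactly when the upper 24 bits are already zero.

```c
#include <stdint.h>

/* movzx ebx, ah: the whole register becomes the zero-extended byte. */
uint32_t movzx_ebx_ah(uint32_t ebx, uint32_t eax)
{
    (void)ebx;                      /* the old contents are discarded */
    return (eax >> 8) & 0xFF;
}

/* mov bl, ah: only the low byte changes, the upper 24 bits are kept. */
uint32_t mov_bl_ah(uint32_t ebx, uint32_t eax)
{
    return (ebx & 0xFFFFFF00u) | ((eax >> 8) & 0xFF);
}
```

With ebx = 0x00000004 (the state right after movzx ebx, al for eax = 0x01020304) both models yield 0x03. With stale upper bits, for example ebx = 0x12345604, only the movzx model still yields 0x03.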

Your program but modified based on the above.

start:
 mov eax, 0x01020304
 xor ecx, ecx
 xor ebp, ebp
process_eax: 
 movzx ebx, al
 add ecx, ebx
 mov bl, ah
 add ecx, ebx
 test ebp, ebp
 jnz print_result
 xor edx, edx
 mov ebx, 0x10000
 div ebx
 inc ebp
 jmp process_eax
print_result:

Your program but modified more using 1 register less.

Only repeat the code when the quotient produced a non-zero AX.

start:
 mov eax, 0x01020304
 xor ecx, ecx
process_eax:
 movzx ebx, al
 add ecx, ebx
 mov bl, ah
 add ecx, ebx
 xor edx, edx
 mov ebx, 0x10000
 div ebx
 test ax, ax
 jnz process_eax ;At most 1 time

Your program but modified more using 2 registers less and preferring to use shift over divide.

Only repeat the code when the shift leaves a non-zero EAX.

start:
 mov eax, 0x01020304
 xor ecx, ecx
process_eax:
 movzx ebx, al
 add ecx, ebx
 mov bl, ah
 add ecx, ebx
 shr eax, 16
 jnz process_eax ;At most 1 time
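
The control flow of this last listing can be checked with a line-by-line model in C. It is a sketch for verification, not a replacement for the assembly; the function name sum_bytes_shr16 is invented, and the variables are named after the registers they stand in for.

```c
#include <stdint.h>

uint32_t sum_bytes_shr16(uint32_t eax)
{
    uint32_t ecx = 0;                /* xor ecx, ecx */
    do {
        ecx += eax & 0xFF;           /* movzx ebx, al / add ecx, ebx */
        ecx += (eax >> 8) & 0xFF;    /* mov bl, ah    / add ecx, ebx */
        eax >>= 16;                  /* shr eax, 16 sets ZF when eax becomes 0 */
    } while (eax != 0);              /* jnz process_eax */
    return ecx;
}
```

The body runs twice when the high word is non-zero and only once otherwise, which matches the "at most 1 time" remark on the jump.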

Accepted answer (the one beginning "Since you're still learning"): posted by Edward, 2017-04-17 17:05:57Z

