File dump in assembly

Question 1

This is a file dump program written in assembly that displays 320 bytes at a time in 20 lines. It shows the hex code for each character and also the string

At the start of the program it asks for the name of the file and starts dumping. After each dump it asks if they want to see more.

x86 64bit NASM

%macro input 1
 push rax
 push rdi
 push rsi
 push rdx
 mov eax, SYSREAD
 mov edi, 1
 mov esi, buffer2
 mov edx, 20
 syscall
 mov esi, buffer2
 mov edi, %1
 ; this loop makes sure the string is the correct size
 %%loop:
 cmp byte [esi], 0
 je %%exitinput
 cmp byte [esi], 10
 je %%exitinput
 movsb
 jmp %%loop
 %%exitinput:
 pop rdx
 pop rsi
 pop rdi
 pop rax
%endmacro
%macro print 2
 push rdi
 push rax
 push rdi
 push rdx
 push rsi
 push rcx
 mov esi, %1
 mov eax, SYSWRITE
 mov edi, 1
 mov edx, %2
 syscall
 pop rcx
 pop rsi
 pop rdx
 pop rdi
 pop rax
 pop rdi
%endmacro
%macro exit 0
 mov rax, 60
 mov rdi, 0
 syscall
%endmacro
section .bss 
 hex resb 2
 alert resb 16
 char resb 1
 filename resb 20
 buffer resb 320
SYSREAD equ 0
SYSWRITE equ 1
SYSOPEN equ 2
section .data
 TABLE db "0123456789ABCDEF", 0
 log db "Show more? (y/n)", 10, 0
 loglen equ $ -log
section .text
 global _start 
 _start:
 input filename
 
 mov eax, 2
 mov rdi, filename
 mov rsi, 0
 mov edx, 0777
 syscall
 mov ebx, TABLE
 push rax
 program: 
 mov eax, SYSREAD
 pop rdi
 push rdi
 mov esi, buffer
 mov edx, 320
 syscall
 ; check for end of file
 cmp eax, 0
 je eof
 mov edi, buffer
 ; print 20 lines
 mov ecx, 20
 printer: 
 call displayhex 
 print space, 1
 print alert, 16
 print newline, 1 
 loop printer
 print log, loglen
 input char
 cmp byte [char], "y"
 je program
 eof:
 exit
 displayhex:
 push rcx
 ; string characters stored here
 mov esi, alert
 
 ; get 16 characters from file
 mov ecx, 16
 loop:
 push rcx
 mov al, [edi]
 cmp al, 10
 je addspace
 cmp al, 13
 je continue
 mov [esi], al
 jmp continue
 addspace:
 mov byte [esi], 32
 continue:
 ; convert ascii decimal to ascii hex
 mov cx, 16
 cwd
 div cx
 push rax
 mov al, dl
 xlat
 mov [hex + 1], al
 pop rax
 cwd
 div cx
 mov al, dl
 xlat
 mov [hex], al
 
 print space, 1
 print hex, 2
 end:
 inc esi
 inc edi
 pop rcx
 dec ecx
 cmp ecx, 0
 jnz loop
 pop rcx
 ret

Question 2

As mentioned in Edward's review, please post complete code when asking for a review

Question 3

Here are some things that may help you improve your program.

Make sure your code is complete

If you're going to ask for a code review, it should be your complete, best code to be respectful of the time reviewers spend looking it over. This code was incomplete and was missing definitions for buffer2, space and newline. It was not too hard to figure those out and define them, but it's a sign that maybe this code isn't actually ready for review.

Avoid the use of "magic numbers"

This code is littered with "magic numbers," that is, unnamed constants such as 0777, 320, 20, etc. Generally it's better to avoid that and give such constants meaningful names. That way, if anything ever needs to be changed, you won't have to go hunting through the code for all instances of "20" and then trying to determine if this particular 20 means the length of the filename buffer or some other constant that happens to have the same value.

Add comments

There are very few comments in this program and lots of "magic numbers" making it much harder to read than it should be. The use of macros helps, however, and is a good practice I would encourage you to continue.

Fix the bugs

There are at least two bugs in this program. The first is that there is a wrong file descriptor number used in the input macro (0 is stdin, 1 is stdout). The second is that for files that are not a multiple of 320 bytes long, the last partial block is shown overlaid on top of the last full block which gives the user bad information.

Add error handling

If the user gives the name of a nonexistent file (or doesn't have read permissions, etc.) the file open call will fail and an error number will be returned in rax. You should check the return values generally to see if they have failed and handle the error appropriately.

Eliminate useless code

The input of the filename receives data into buffer2 and then copies it into filename. Why not just receive it into filename directly and eliminate the copy? Also, the code has these two lines:

pop rdi
push rdi

Rather than storing the handle on the stack and accessing it in this strange way, just put it in r9 and replace this with:

mov rdi, r9

Think of the user

The human interface for this program is not great. For instance, there is no prompt to indicate that the computer is waiting for a filename, and the user is asked whether or not to display more data even though the computer already knows there is no more data. Fixing these would make this a much more usable program. Also consider that a typical Linux command might be used in a script or with a human sitting at the keyboard. For that reason, I'd suggest getting the input filename optionally as a command line argument (otherwise read from stdin) and optionally eliminate the prompt-per-block.

Reduce system calls

The output code is rather slow and inefficient because it writes one or two characters at a time. Better would be to create an output buffer and then invoke the syscall to print an entire buffer at a time. It doesn't matter much for a program that is run at "human speed" like this is, but if you're writing in assembly language, efficiency is usually one of the goals.

Document register use

When you have things like pop rdi and push rdi keeping a variable on the stack instead of in an unused register, it tells me (and it should tell you!) that you could make better use of the registers available to you. Comments will help you; I often write comment blocks above labeled lines (jump targets) that tell what I'm expecting to be in each register and on the stack. It really helps debugging code like this. Ideally, you'd assign some purpose to each register and then use them only for that purpose in the entire rest of the code.

Document macro parameters

In what I believe was a design error, nasm's macros use numbered rather than named parameters. For that reason, we have to refer to parameters as %1 and %2 rather than, say, buffer and bufferlength. To make up for this deficit, documenting macro parameters in the form of comments is vital for others to be able to understand your program.

Edward Edward 67.2k4 gold badges120 silver badges284 bronze badges · Answer 1 · 2022-06-17 13:15:01Z

Here are some things that may help you improve your program.

Make sure your code is complete

If you're going to ask for a code review, it should be your complete, best code to be respectful of the time reviewers spend looking it over. This code was incomplete and was missing definitions for buffer2, space and newline. It was not too hard to figure those out and define them, but it's a sign that maybe this code isn't actually ready for review.

Avoid the use of "magic numbers"

This code is littered with "magic numbers," that is, unnamed constants such as 0777, 320, 20, etc. Generally it's better to avoid that and give such constants meaningful names. That way, if anything ever needs to be changed, you won't have to go hunting through the code for all instances of "20" and then trying to determine if this particular 20 means the length of the filename buffer or some other constant that happens to have the same value.

Add comments

There are very few comments in this program and lots of "magic numbers" making it much harder to read than it should be. The use of macros helps, however, and is a good practice I would encourage you to continue.

Fix the bugs

There are at least two bugs in this program. The first is that there is a wrong file descriptor number used in the input macro (0 is stdin, 1 is stdout). The second is that for files that are not a multiple of 320 bytes long, the last partial block is shown overlaid on top of the last full block which gives the user bad information.

Add error handling

If the user gives the name of a nonexistent file (or doesn't have read permissions, etc.) the file open call will fail and an error number will be returned in rax. You should check the return values generally to see if they have failed and handle the error appropriately.

Eliminate useless code

The input of the filename receives data into buffer2 and then copies it into filename. Why not just receive it into filename directly and eliminate the copy? Also, the code has these two lines:

pop rdi
push rdi

Rather than storing the handle on the stack and accessing it in this strange way, just put it in r9 and replace this with:

mov rdi, r9

Think of the user

The human interface for this program is not great. For instance, there is no prompt to indicate that the computer is waiting for a filename, and the user is asked whether or not to display more data even though the computer already knows there is no more data. Fixing these would make this a much more usable program. Also consider that a typical Linux command might be used in a script or with a human sitting at the keyboard. For that reason, I'd suggest getting the input filename optionally as a command line argument (otherwise read from stdin) and optionally eliminate the prompt-per-block.

Reduce system calls

The output code is rather slow and inefficient because it writes one or two characters at a time. Better would be to create an output buffer and then invoke the syscall to print an entire buffer at a time. It doesn't matter much for a program that is run at "human speed" like this is, but if you're writing in assembly language, efficiency is usually one of the goals.

Document register use

When you have things like pop rdi and push rdi keeping a variable on the stack instead of in an unused register, it tells me (and it should tell you!) that you could make better use of the registers available to you. Comments will help you; I often write comment blocks above labeled lines (jump targets) that tell what I'm expecting to be in each register and on the stack. It really helps debugging code like this. Ideally, you'd assign some purpose to each register and then use them only for that purpose in the entire rest of the code.

Document macro parameters

In what I believe was a design error, nasm's macros use numbered rather than named parameters. For that reason, we have to refer to parameters as %1 and %2 rather than, say, buffer and bufferlength. To make up for this deficit, documenting macro parameters in the form of comments is vital for others to be able to understand your program.

Stack Exchange Network

File dump in assembly

1 Answer 1

Make sure your code is complete

Avoid the use of "magic numbers"

Add comments

Fix the bugs

Add error handling

Eliminate useless code

Think of the user

Reduce system calls

Document register use

Document macro parameters

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

File dump in assembly

1 Answer 1

Make sure your code is complete

Avoid the use of "magic numbers"

Add comments

Fix the bugs

Add error handling

Eliminate useless code

Think of the user

Reduce system calls

Document register use

Document macro parameters

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions