This is a file dump program written in assembly that displays 320 bytes at a time in 20 lines. It shows the hex code for each character and also the string
At the start of the program it asks for the name of the file and starts dumping. After each dump it asks if they want to see more.
x86 64bit NASM
%macro input 1
push rax
push rdi
push rsi
push rdx
mov eax, SYSREAD
mov edi, 1
mov esi, buffer2
mov edx, 20
syscall
mov esi, buffer2
mov edi, %1
; this loop makes sure the string is the correct size
%%loop:
cmp byte [esi], 0
je %%exitinput
cmp byte [esi], 10
je %%exitinput
movsb
jmp %%loop
%%exitinput:
pop rdx
pop rsi
pop rdi
pop rax
%endmacro
%macro print 2
push rdi
push rax
push rdi
push rdx
push rsi
push rcx
mov esi, %1
mov eax, SYSWRITE
mov edi, 1
mov edx, %2
syscall
pop rcx
pop rsi
pop rdx
pop rdi
pop rax
pop rdi
%endmacro
%macro exit 0
mov rax, 60
mov rdi, 0
syscall
%endmacro
section .bss
hex resb 2
alert resb 16
char resb 1
filename resb 20
buffer resb 320
SYSREAD equ 0
SYSWRITE equ 1
SYSOPEN equ 2
section .data
TABLE db "0123456789ABCDEF", 0
log db "Show more? (y/n)", 10, 0
loglen equ $ -log
section .text
global _start
_start:
input filename
mov eax, 2
mov rdi, filename
mov rsi, 0
mov edx, 0777
syscall
mov ebx, TABLE
push rax
program:
mov eax, SYSREAD
pop rdi
push rdi
mov esi, buffer
mov edx, 320
syscall
; check for end of file
cmp eax, 0
je eof
mov edi, buffer
; print 20 lines
mov ecx, 20
printer:
call displayhex
print space, 1
print alert, 16
print newline, 1
loop printer
print log, loglen
input char
cmp byte [char], "y"
je program
eof:
exit
displayhex:
push rcx
; string characters stored here
mov esi, alert
; get 16 characters from file
mov ecx, 16
loop:
push rcx
mov al, [edi]
cmp al, 10
je addspace
cmp al, 13
je continue
mov [esi], al
jmp continue
addspace:
mov byte [esi], 32
continue:
; convert ascii decimal to ascii hex
mov cx, 16
cwd
div cx
push rax
mov al, dl
xlat
mov [hex + 1], al
pop rax
cwd
div cx
mov al, dl
xlat
mov [hex], al
print space, 1
print hex, 2
end:
inc esi
inc edi
pop rcx
dec ecx
cmp ecx, 0
jnz loop
pop rcx
ret
-
\$\begingroup\$ As mentioned in Edward's review, please post complete code when asking for a review \$\endgroup\$Zachary Vance– Zachary Vance2022年06月17日 16:04:16 +00:00Commented Jun 17, 2022 at 16:04
1 Answer 1
Here are some things that may help you improve your program.
Make sure your code is complete
If you're going to ask for a code review, it should be your complete, best code to be respectful of the time reviewers spend looking it over. This code was incomplete and was missing definitions for buffer2
, space
and newline
. It was not too hard to figure those out and define them, but it's a sign that maybe this code isn't actually ready for review.
Avoid the use of "magic numbers"
This code is littered with "magic numbers," that is, unnamed constants such as 0777, 320, 20, etc. Generally it's better to avoid that and give such constants meaningful names. That way, if anything ever needs to be changed, you won't have to go hunting through the code for all instances of "20" and then trying to determine if this particular 20 means the length of the filename
buffer or some other constant that happens to have the same value.
Add comments
There are very few comments in this program and lots of "magic numbers" making it much harder to read than it should be. The use of macros helps, however, and is a good practice I would encourage you to continue.
Fix the bugs
There are at least two bugs in this program. The first is that there is a wrong file descriptor number used in the input
macro (0 is stdin, 1 is stdout). The second is that for files that are not a multiple of 320 bytes long, the last partial block is shown overlaid on top of the last full block which gives the user bad information.
Add error handling
If the user gives the name of a nonexistent file (or doesn't have read permissions, etc.) the file open call will fail and an error number will be returned in rax. You should check the return values generally to see if they have failed and handle the error appropriately.
Eliminate useless code
The input of the filename receives data into buffer2
and then copies it into filename
. Why not just receive it into filename
directly and eliminate the copy? Also, the code has these two lines:
pop rdi
push rdi
Rather than storing the handle on the stack and accessing it in this strange way, just put it in r9
and replace this with:
mov rdi, r9
Think of the user
The human interface for this program is not great. For instance, there is no prompt to indicate that the computer is waiting for a filename, and the user is asked whether or not to display more data even though the computer already knows there is no more data. Fixing these would make this a much more usable program. Also consider that a typical Linux command might be used in a script or with a human sitting at the keyboard. For that reason, I'd suggest getting the input filename optionally as a command line argument (otherwise read from stdin) and optionally eliminate the prompt-per-block.
Reduce system calls
The output code is rather slow and inefficient because it writes one or two characters at a time. Better would be to create an output buffer and then invoke the syscall to print an entire buffer at a time. It doesn't matter much for a program that is run at "human speed" like this is, but if you're writing in assembly language, efficiency is usually one of the goals.
Document register use
When you have things like pop rdi
and push rdi
keeping a variable on the stack instead of in an unused register, it tells me (and it should tell you!) that you could make better use of the registers available to you. Comments will help you; I often write comment blocks above labeled lines (jump targets) that tell what I'm expecting to be in each register and on the stack. It really helps debugging code like this. Ideally, you'd assign some purpose to each register and then use them only for that purpose in the entire rest of the code.
Document macro parameters
In what I believe was a design error, nasm's macros use numbered rather than named parameters. For that reason, we have to refer to parameters as %1
and %2
rather than, say, buffer
and bufferlength
. To make up for this deficit, documenting macro parameters in the form of comments is vital for others to be able to understand your program.