These are a few string routines (just the mem* ones). I've tried to optimize them as best I can without letting them get too big, but I'm unsure whether I've done a good job. I'd prefer size over speed, unless the speed gain costs only a few extra bytes, in which case that would be fine. I'd also prefer not to sacrifice simplicity for speed.
memchr.S (related):
.globl memchr
memchr:
mov %rdx, %rcx
movzbl %sil, %eax
repne scasb
lea -1(%rdi), %rax
test %rcx, %rcx
cmove %rcx, %rax
ret
memcmp.S:
.globl memcmp
memcmp:
mov %rdx, %rcx
repe cmpsb
movzbl -1(%rdi), %eax
movzbl -1(%rsi), %edx
sub %edx, %eax
ret
memcpy.S:
.globl memcpy
memcpy:
mov %rdx, %rcx
mov %rdi, %rax
rep movsb
ret
memmove.S:
.globl memmove
memmove:
mov %rdx, %rcx
mov %rdi, %rax
cmp %rdi, %rsi
jge 0f
dec %rdx
add %rdx, %rdi
add %rdx, %rsi
std
0: rep movsb
cld
ret
memrchr.S:
.globl memrchr
memrchr:
mov %rdx, %rcx
add %rdx, %rdi
movzbl %sil, %eax
std
repne scasb
cld
lea 1(%rdi), %rax
test %rcx, %rcx
cmove %rcx, %rax
ret
memset.S:
.globl memset
memset:
mov %rdx, %rcx
mov %rdi, %rdx
movzbl %sil, %eax
rep stosb
mov %rdx, %rax
ret
As usual for Stack Exchange sites, this code is released under CC/by-sa 3.0, but any future changes can be accessed here.
3 Answers
The code looks straightforward and really optimized for size and simplicity.
There's a small detail that I would change, though: replace cmove with cmovz, to make the code more expressive. It's not that "being equal" is of any interest here; it's the zeroness of %rcx that is interesting.
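For example, the tail of memchr with the mnemonic swapped (cmove and cmovz assemble to the same opcode, so this only changes how the intent reads):
lea -1(%rdi), %rax
test %rcx, %rcx       # what we actually care about: did the count hit zero?
cmovz %rcx, %rax      # same encoding as cmove, clearer name
ret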
I like the omitted second jmp in memmove. It's obvious after thinking about it for a few seconds.
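Roughly, and only as a sketch of my reading of that remark, a version that keeps the two copy paths separate would need an extra jump, something like this (not code from the question):
cmp %rdi, %rsi
jge 0f
dec %rdx
add %rdx, %rdi
add %rdx, %rsi
std
rep movsb
cld
jmp 1f                # the extra jump the posted version saves
0: rep movsb
1: ret
The posted code instead lets both paths share the 0: rep movsb; cld; ret tail, the unconditional cld being harmless on the forward path.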
According to this quote, it's OK to rely on the direction flag always being cleared. I still suggest writing a few unit tests to be on the safe side.
- See my answer for a bug that I found on my own (found by writing unit tests). – S.S. Anne, Aug 16, 2019 at 22:57
There's a bug in your memchr: if it finds %sil in the very last byte of the buffer, %rcx has counted down to zero even though the byte was found, so the cmove incorrectly returns NULL.
To fix that, do something like this:
.globl memchr
memchr:
mov %rdx, %rcx
movzbl %sil, %eax
repne scasb
sete %cl              # remember whether scasb stopped on a match (ZF set)
lea -1(%rdi), %rax    # candidate pointer to the matching byte
test %cl, %cl
cmovz %rcx, %rax      # not found: %rcx has counted down to zero, so return NULL
ret
The same applies to memrchr.
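Untested, but applying the same pattern to the posted memrchr would presumably mean replacing everything from repne scasb down, keeping the setup above it as is:
repne scasb
cld                   # cld does not touch ZF, so the scasb result survives
sete %cl              # remember whether scasb stopped on a match
lea 1(%rdi), %rax     # candidate pointer to the matching byte
test %cl, %cl
cmovz %rcx, %rax      # not found: %rcx is zero here, so return NULL
ret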
In memmove you have the following:
cmp %rdi, %rsi
jge 0f
(cmp rsi, rdi in Intel syntax, I believe.) For rsi = 8000_0000_0000_0000h and rdi = 7FFF_FFFF_FFFF_FFFFh (a case where we want to jump and do a forward move), the signed conditional branch "jump if greater or equal" treats rsi as less than rdi, because rsi is a negative number in 64-bit two's complement while rdi is positive. So it doesn't jump and does a backwards move, which is incorrect. Pointers should be compared as unsigned values: use the equivalent unsigned branch "jump if above or equal", jae, instead.
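For reference, a minimal sketch of the routine with only that branch changed and everything else kept as posted:
.globl memmove
memmove:
mov %rdx, %rcx
mov %rdi, %rax        # return the original destination
cmp %rdi, %rsi
jae 0f                # unsigned: src >= dst, a plain forward copy is safe
dec %rdx
add %rdx, %rdi        # src < dst and possibly overlapping:
add %rdx, %rsi        # point at the last byte of each buffer
std                   # and copy backwards
0: rep movsb
cld
ret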
- Isn't this only an issue when a userspace address and a kernelspace address are mixed? – user555045, Aug 17, 2019 at 14:26
- How likely is this in reality, though? The least you could expect is a segfault. – S.S. Anne, Aug 17, 2019 at 14:27
- It may not be an issue depending on the operating system / address-space layout. However, if it does happen, then a wrong move direction (if the buffers are actually overlapping) will result in silently corrupting the destination buffer. – ecm, Aug 17, 2019 at 14:29
- This won't ever happen because addresses are only actually used up to 48 bits, much less than 0x8000000000000000. – S.S. Anne, Sep 25, 2019 at 21:40