Summary: memchr is faster, but the obvious implement seems to beat the builtin versions.
There are two related C functions for finding the next character in a string - strchr which assumes the string has a NUL character at the end, and memchr which takes the string length as an argument. For strings where you have the size and a NUL terminator, which is fastest? Using gcc 6.2.0 64bit MSYS2 on Windows 10, searching for a single byte 10M bytes along a string, the times were (fastest to slowest):
- 11.05ms
memchrimplemented the obvious way. - 14.82ms
strchrimplemented the obvious way. - 14.96ms
memchrprovided by GCC. - 19.63ms
strchrprovided by GCC.
Trying on 3 different Windows computers, the results are all similar (but scaled).
Given the choice, you should prefer memchr over strchr.
Surprise result
The optimised implementations shipped with GCC are slower than the obvious C implementations taken from a wiki. I have absolutely no idea why. From what I can tell, the builtin versions are coded in assembly, operating on multiple bytes at a time, using SSE instructions. In contrast, the C variants operate on a single byte at a time, and aren't vectorised by the optimiser according to Godbolt. If anyone has an explanation I'd be keen to hear it.
Benchmark Code
To benchmark the variants I wrote a Haskell program using criterion. The full code and build instructions are available in this gist. I compiled the C code with -O3, using the gcc shipped with GHC 8.2.1. I've reproduced the Haskell code below, with some comments:
-- Import all the necessary pieces
import qualified Data.ByteString as BS
import qualified Data.ByteString.Unsafe as BS
import Criterion.Main
import Foreign
import Foreign.C.Types
import Data.Monoid
-- Make all the C imports
foreign import ccall unsafe "string.h memchr" memchr_std :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)
foreign import ccall unsafe "string.h strchr" strchr_std :: Ptr Word8 -> CInt -> IO (Ptr Word8)
foreign import ccall unsafe memchr_c :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)
foreign import ccall unsafe strchr_c :: Ptr Word8 -> CInt -> IO (Ptr Word8)
-- Method for ignoring the size when using strchr
ignoreSize f a b _ = f a b
-- Build a suitable string with an interesting character i bytes along
cstr i = BS.replicate i 32 <> BS.singleton 64 <> BS.replicate i 32 <> BS.singleton 0
-- The functions to benchmark
funs =
[("memchr_std", memchr_std)
,("strchr_std", ignoreSize strchr_std)
,("memchr_c", memchr_c)
,("strchr_c", ignoreSize strchr_c)]
-- The main function, using Criterion
main = defaultMain
[ seq bs $ bench (show i ++ " " ++ name) $ whnfIO $ test fun bs
| i <- [1,10,100,1000,10000,100000,1000000,10000000]
, let bs = cstr i
, (name, fun) <- funs]
-- The function under test and input string
{-# NOINLINE test #-}
test fun bs =
BS.unsafeUseAsCStringLen bs $ \(ptr,len) ->
fun (castPtr ptr) 64 (fromIntegral len)