lua-users home
lua-l archive

Re: Fast C string comparison?

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Am 27.03.2013 22:54 schröbte Philipp Janda:
[...]
There are some options (with varying degrees of speed and safety)
mentioned in this[1] thread.
 [1]: http://lua-users.org/lists/lua-l/2012-01/msg00396.html
Another possibility (that I think isn't mentioned in the above thread,
but I only took a brief look) could be to put all your string constants
into upvalues and use the Lua API (lua_upvalueindex,
lua_compare/lua_rawequal) to compare those upvalues to the argument
strings on the stack.
But then again, libc implementers and compiler writers have had some
time to optimize strcmp.
Time for some micro-benchmarks ...
I've tested 6 possible implementations:
sample1: using memcmp
 size_t len = 0;
 char const* s = lua_tolstring( L, 1, &len );
#define KEY "mymethod"
 if( len == sizeof( KEY )-1 && !memcmp( s, KEY, sizeof( KEY )-1 ) ) )
#undef KEY
 /* ... */;
 else if( ... ) ...
sample2: using strcmp
 char const* s = lua_tostring( L, 1 );
 if( !strcmp( s, "mymethod" ) )
 /* ... */;
 else if( ... ) ...
sample3: using ragel to get a compiled state machine (see linked thread)
 size_t len = 0;
 char const* s = lua_tolstring( L, 1, &len );
 switch( str2id( s, len ) ) {
 case ID_MYMETHOD:
 /* ... */;
 break;
 case ...
sample4: using lua_rawequal and lua_upvalueindex
 if( lua_rawequal( L, 1, lua_upvalueindex( 1 ) ) )
 /* ... */;
 else if( ... ) ...
sample5: using gperf (see linked thread)
 switch( str2id( s, len ) ) {
 case 0:
 /* ... */;
 break;
 case ...
sample6: lua_upvalueindex, lua_tostring and pointer comparison
 char const* s = lua_tostring( L, 1 );
 if( s == lua_tostring( L, lua_upvalueindex( 1 ) ) )
 /* ... */;
 else if( ... ) ...
Some interesting facts before the results:
- sample1 does actually "call" memcmp (using -O3) which was surprising for me. I suspected inlining and loop unrolling, since two of the three arguments are compile-time constants
- sample2 uses a special assembler instruction for strcmp
 'if( strcmp( s, "mymethod" ) )' basically becomes:
 movl $.LC0, %edi # string constant "mymethod"
 movl 9,ドル %ecx # length of "mymethod"
 movq %rax, %rsi
 repz cmpsb
 je .L18
And here are the relative timings for
- 10 valid keys (mean length: 6.3)
- 2/3 lookups of valid keys, 1/3 invalid lookups:
sample1
time elapsed:	1.15
sample2
time elapsed:	2.19
sample3
time elapsed:	1.2
sample4
time elapsed:	2.58
sample5
time elapsed:	1.26
sample6
time elapsed:	2.2
Philipp
#!/usr/bin/lua5.1
local n = 1000000
local keys = {
 -- those are valid keys:
 "mymethod", "mykey", "foo", "bar", "baz", "helloworld", "terminal", "imoutofideas", "ubuntu", "linux",
 -- those are nonexistent:
 "abcedefg", "12343567", "blablabla", "non", "enough",
}
local funcs = {
 require( "sample1" ), -- memcmp
 require( "sample2" ), -- strcmp
 require( "sample3" ), -- ragel state machine
 require( "sample4" ), -- lua_rawequal, lua_upvalueindex
 require( "sample5" ), -- gperf perfect hashing function
 require( "sample6" ), -- s1 == s2, lua_upvalueindex
}
for i = 1, #funcs do
 local f = funcs[ i ]
 print( "sample" .. i )
 local start = os.clock()
 for j = 1, n do
 for k = 1, #keys do
 f( keys[ k ] )
 end
 end
 local stop = os.clock()
 print( "time elapsed:", stop-start )
end

AltStyle によって変換されたページ (->オリジナル) /