Functions with ref parameters can be up to 2x slower than functions which parameters are not passed by reference. It is especially visible for programs which are compiled with all optimizations on: -O -inline -release Test program: ------------------------ char peek_ref(ref char[] a) { return a[0]; } char peek_noref(char[] a) { return a[0]; } ulong rdtsc() { asm { naked; rdtsc; ret; } } void main() { char[] tst = "test string"; char c; long starttime; writef("Direct: "); starttime = rdtsc; for(int i=0; i<1000; i++) c = tst[0]; writefln(c, " - time: ", rdtsc - starttime); writef("With ref: "); starttime = rdtsc; for(int i=0; i<1000; i++) c = tst.peek_ref(); writefln(c, " - time: ", rdtsc - starttime); writef("Without ref: "); starttime = rdtsc; for(int i=0; i<1000; i++) c = tst.peek_noref(); writefln(c, " - time: ", rdtsc - starttime); } ----------------------
I believe the problem is that ref parameters disable inlining in DMD. Or so I think I heard somewhere before.
This is true, and also applies to D2. Apparently ref prevents inlining. It's somewhere in the comments in inline.c. Why, I don't know. The weird thing is that apparently DMD inlines stuff with pointer parameters, and ref parameters are just syntactic sugar for pointers. Here's a test program and the disassembly. import std.stdio; // Shouldn't this generate *exactly* the same code as ptrSwap? void swap(T)(ref T a, ref T b) { T temp = a; a = b; b = temp; } void ptrSwap(T)(T* a, T* b) { T temp = *a; *a = *b; *b = temp; } void main() { uint a, b; swap(a, b); ptrSwap(&a, &b); writeln(a); // Keep DMD from optimizing out ptrSwap entirely. } Here's the disassembly of the relevant portion: COMDEF __Dmain push eax ; push eax ; xor eax, eax ; push ebx ; lea ecx, [esp+4H] ; mov dword ptr [esp+4H], eax ; mov dword ptr [esp+8H], eax ; push ecx ; lea eax, [esp+0CH] ; call _D5test711__T4swapTkZ4swapFKkKkZv ; mov edx, dword ptr [esp+4H] ; mov ebx, dword ptr [esp+8H] ; mov dword ptr [esp+4H], ebx ; mov eax, offset FLAT:_main ; mov dword ptr [esp+8H], edx ; push ebx ; push 10 ; call _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv; xor eax, eax ; pop ebx ; add esp, 8 ; ret ; __Dmain ENDP
Recent change to dmd forced everybody to use opEquals with ref arguments. That leaves no other option than the poorly optimised one. Isn't it time to fix this bug?
(In reply to comment #3) > Recent change to dmd forced everybody to use opEquals with ref arguments. That > leaves no other option than the poorly optimised one. Isn't it time to fix this > bug? I think the more important fix is to start allowing opEquals with non-ref arguments again. Only allowing ref arguments is ridiculous because it forces the argument to be an lvalue, thus creating a horribly ugly inconsistency with builtin types.
(In reply to comment #4) > (In reply to comment #3) > > Recent change to dmd forced everybody to use opEquals with ref arguments. That > > leaves no other option than the poorly optimised one. Isn't it time to fix this > > bug? > > I think the more important fix is to start allowing opEquals with non-ref > arguments again. Only allowing ref arguments is ridiculous because it forces > the argument to be an lvalue, thus creating a horribly ugly inconsistency with > builtin types. That's true. Perhaps functions with signature (const ref T) could accept rvalues as well, as the semantics is equivalent to (T)?
Created attachment 626 [details] a potential patch (lots of debugging code left in for now) This diff likely needs to be used in conjunction with the patch in bug 4149. I haven't tested it separately to find out. I also haven't tested it with more than the simple test case in bug 4149.
Testing with the code from Marcin, using -O -inline and fixing several bugs due to changes since 2008: Direct: t - time: 4319 With ref: t - time: 4319 Without ref: t - time: 4382 The code changes (not perfect, but enough to get it to compile): 1) add .dup to the "test string" literal 2) s/writefln/writeln/
Testing with David's code, with -O -inline, here's the new _Dmain: _Dmain: push EBP mov EBP,ESP mov EAX,offset FLAT:_D3std5stdio6stdoutS3std5stdio4File@SYM32 push 0 push 0Ah call near ptr _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv@PC32 xor EAX,EAX pop EBP ret Without -O: _Dmain: push EBP mov EBP,ESP sub ESP,024h push EBX push ESI xor EAX,EAX mov -024h[EBP],EAX mov -020h[EBP],EAX lea ECX,-024h[EBP] mov -01Ch[EBP],ECX lea EDX,-020h[EBP] mov -018h[EBP],EDX mov EBX,[ECX] mov -014h[EBP],EBX mov ESI,[EDX] mov [ECX],ESI mov [EDX],EBX lea EAX,-024h[EBP] mov -010h[EBP],EAX lea ECX,-020h[EBP] mov -0Ch[EBP],ECX mov EDX,[EAX] mov -8[EBP],EDX mov EBX,[ECX] mov [EAX],EBX mov [ECX],EDX mov ESI,-024h[EBP] mov -4[EBP],ESI push ESI push 0Ah mov EAX,offset FLAT:_D3std5stdio6stdoutS3std5stdio4File@SYM32 call near ptr _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv@PC32 xor EAX,EAX pop ESI pop EBX leave ret
Created attachment 627 [details] Enable inlining of functions with ref parameters, v2 The previous patch didn't pass the dmd test suite (still need to understand why). This newer patch isolates the changes to the inlining code to generate AST closer to what would be manually written for hand inlined code. I also dropped all of the debugging code and unrelated code cleanup. This version DOES pass the dmd test suite. I've also confirmed that it is NOT dependent on the patch attached to bug 4149.
Created attachment 633 [details] add support for out parameters as well (so both ref and out) The previous patch only added support for ref parameters. This updated patch also includes adding support for inlining of functions that have out parameters.
changeset 497. Does static arrays too.
AltStyle によって変換されたページ (->オリジナル) / アドレス: モード: デフォルト 音声ブラウザ ルビ付き 配色反転 文字拡大 モバイル