2008 – Poor optimization of functions with ref parameters

D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 2008 - Poor optimization of functions with ref parameters
Summary: Poor optimization of functions with ref parameters
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D1 (retired)
Hardware: x86 Windows
: P2 enhancement
Assignee: Brad Roberts
URL:
Keywords: performance
Depends on:
Blocks:
Reported: 2008年04月18日 16:17 UTC by Marcin Kuszczak
Modified: 2014年02月24日 15:32 UTC (History)
4 users (show)

See Also:


Attachments
a potential patch (lots of debugging code left in for now) (5.44 KB, patch)
2010年05月09日 04:16 UTC, Brad Roberts
Details | Diff
Enable inlining of functions with ref parameters, v2 (811 bytes, patch)
2010年05月10日 00:07 UTC, Brad Roberts
Details | Diff
add support for out parameters as well (so both ref and out) (1.62 KB, patch)
2010年05月16日 18:03 UTC, Brad Roberts
Details | Diff
Show Obsolete (2) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.
Description Marcin Kuszczak 2008年04月18日 16:17:57 UTC
Functions with ref parameters can be up to 2x slower than functions which parameters are not passed by reference. It is especially visible for programs which are compiled with all optimizations on:
-O
-inline
-release
Test program:
------------------------
char peek_ref(ref char[] a) {
 return a[0];
}
char peek_noref(char[] a) {
 return a[0];
}
ulong rdtsc() {
 asm {
 naked;
 rdtsc;
 ret;
 }
}
void main() {
 char[] tst = "test string";
 char c;
 long starttime;
 writef("Direct: ");
 starttime = rdtsc;
 for(int i=0; i<1000; i++)
 c = tst[0];
 writefln(c, " - time: ", rdtsc - starttime);
 writef("With ref: ");
 starttime = rdtsc;
 for(int i=0; i<1000; i++)
 c = tst.peek_ref();
 writefln(c, " - time: ", rdtsc - starttime);
 writef("Without ref: ");
 starttime = rdtsc;
 for(int i=0; i<1000; i++)
 c = tst.peek_noref();
 writefln(c, " - time: ", rdtsc - starttime);
}
----------------------
Comment 1 Bill Baxter 2008年04月18日 18:22:39 UTC
I believe the problem is that ref parameters disable inlining in DMD. Or so I think I heard somewhere before.
Comment 2 David Simcha 2009年05月15日 21:06:18 UTC
This is true, and also applies to D2. Apparently ref prevents inlining. It's somewhere in the comments in inline.c. Why, I don't know. The weird thing is that apparently DMD inlines stuff with pointer parameters, and ref parameters are just syntactic sugar for pointers. Here's a test program and the disassembly.
import std.stdio;
// Shouldn't this generate *exactly* the same code as ptrSwap?
void swap(T)(ref T a, ref T b) {
 T temp = a;
 a = b;
 b = temp;
}
void ptrSwap(T)(T* a, T* b) {
 T temp = *a;
 *a = *b;
 *b = temp;
}
void main() {
 uint a, b;
 swap(a, b);
 ptrSwap(&a, &b);
 writeln(a); // Keep DMD from optimizing out ptrSwap entirely.
}
Here's the disassembly of the relevant portion:
 COMDEF __Dmain
 push eax ; 
 push eax ; 
 xor eax, eax ;
 push ebx ; 
 lea ecx, [esp+4H] ; 
 mov dword ptr [esp+4H], eax ; 
 mov dword ptr [esp+8H], eax ; 
 push ecx ; 
 lea eax, [esp+0CH] ; 
 call _D5test711__T4swapTkZ4swapFKkKkZv ; 
 mov edx, dword ptr [esp+4H] ; 
 mov ebx, dword ptr [esp+8H] ; 
 mov dword ptr [esp+4H], ebx ; 
 mov eax, offset FLAT:_main ; 
 mov dword ptr [esp+8H], edx ; 
 push ebx ;
 push 10 ;
 call _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv;
 xor eax, eax ; 
 pop ebx ; 
 add esp, 8 ;
 ret ; 
__Dmain ENDP
Comment 3 Eldar Insafutdinov 2010年01月24日 09:39:04 UTC
Recent change to dmd forced everybody to use opEquals with ref arguments. That leaves no other option than the poorly optimised one. Isn't it time to fix this bug?
Comment 4 David Simcha 2010年01月24日 16:00:12 UTC
(In reply to comment #3)
> Recent change to dmd forced everybody to use opEquals with ref arguments. That
> leaves no other option than the poorly optimised one. Isn't it time to fix this
> bug?
I think the more important fix is to start allowing opEquals with non-ref arguments again. Only allowing ref arguments is ridiculous because it forces the argument to be an lvalue, thus creating a horribly ugly inconsistency with builtin types.
Comment 5 Eldar Insafutdinov 2010年01月24日 16:06:02 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Recent change to dmd forced everybody to use opEquals with ref arguments. That
> > leaves no other option than the poorly optimised one. Isn't it time to fix this
> > bug?
> 
> I think the more important fix is to start allowing opEquals with non-ref
> arguments again. Only allowing ref arguments is ridiculous because it forces
> the argument to be an lvalue, thus creating a horribly ugly inconsistency with
> builtin types.
That's true. Perhaps functions with signature (const ref T) could accept rvalues as well, as the semantics is equivalent to (T)?
Comment 6 Brad Roberts 2010年05月09日 04:16:48 UTC
Created attachment 626 [details] 
a potential patch (lots of debugging code left in for now) 
This diff likely needs to be used in conjunction with the patch in bug 4149. I haven't tested it separately to find out.
I also haven't tested it with more than the simple test case in bug 4149.
Comment 7 Brad Roberts 2010年05月09日 04:35:43 UTC
Testing with the code from Marcin, using -O -inline and fixing several bugs due to changes since 2008:
Direct: t - time: 4319
With ref: t - time: 4319
Without ref: t - time: 4382
The code changes (not perfect, but enough to get it to compile):
 1) add .dup to the "test string" literal
 2) s/writefln/writeln/
Comment 8 Brad Roberts 2010年05月09日 04:41:38 UTC
Testing with David's code, with -O -inline, here's the new _Dmain:
_Dmain:
 push EBP
 mov EBP,ESP
 mov EAX,offset FLAT:_D3std5stdio6stdoutS3std5stdio4File@SYM32
 push 0
 push 0Ah
 call near ptr _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv@PC32
 xor EAX,EAX
 pop EBP
 ret
Without -O:
_Dmain:
 push EBP
 mov EBP,ESP
 sub ESP,024h
 push EBX
 push ESI
 xor EAX,EAX
 mov -024h[EBP],EAX
 mov -020h[EBP],EAX
 lea ECX,-024h[EBP]
 mov -01Ch[EBP],ECX
 lea EDX,-020h[EBP]
 mov -018h[EBP],EDX
 mov EBX,[ECX]
 mov -014h[EBP],EBX
 mov ESI,[EDX]
 mov [ECX],ESI
 mov [EDX],EBX
 lea EAX,-024h[EBP]
 mov -010h[EBP],EAX
 lea ECX,-020h[EBP]
 mov -0Ch[EBP],ECX
 mov EDX,[EAX]
 mov -8[EBP],EDX
 mov EBX,[ECX]
 mov [EAX],EBX
 mov [ECX],EDX
 mov ESI,-024h[EBP]
 mov -4[EBP],ESI
 push ESI
 push 0Ah
 mov EAX,offset FLAT:_D3std5stdio6stdoutS3std5stdio4File@SYM32
 call near ptr _D3std5stdio4File14__T5writeTkTaZ5writeMFkaZv@PC32
 xor EAX,EAX
 pop ESI
 pop EBX
 leave
 ret
Comment 9 Brad Roberts 2010年05月10日 00:07:11 UTC
Created attachment 627 [details] 
Enable inlining of functions with ref parameters, v2
The previous patch didn't pass the dmd test suite (still need to understand why).
This newer patch isolates the changes to the inlining code to generate AST closer to what would be manually written for hand inlined code. I also dropped all of the debugging code and unrelated code cleanup. This version DOES pass the dmd test suite.
I've also confirmed that it is NOT dependent on the patch attached to bug 4149.
Comment 10 Brad Roberts 2010年05月16日 18:03:11 UTC
Created attachment 633 [details] 
add support for out parameters as well (so both ref and out)
The previous patch only added support for ref parameters. This updated patch also includes adding support for inlining of functions that have out parameters.
Comment 11 Walter Bright 2010年05月21日 18:34:35 UTC
changeset 497. Does static arrays too.


AltStyle によって変換されたページ (->オリジナル) /