4443 – Optimizer produces wrong code for || or && with struct arrays

D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 4443 - Optimizer produces wrong code for || or && with struct arrays
Summary: Optimizer produces wrong code for || or && with struct arrays
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: dmd (show other issues)
Version: D2
Hardware: Other Windows
: P2 critical
Assignee: No Owner
URL:
Keywords: patch, wrong-code
: 3761 (view as issue list)
Depends on:
Blocks: 3761
Show dependency tree / graph
Reported: 2010年07月10日 01:27 UTC by Rainer Schuetze
Modified: 2015年06月09日 05:11 UTC (History)
2 users (show)

See Also:


Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.
Description Rainer Schuetze 2010年07月10日 01:27:42 UTC
The code compiled from this source crashes when compiled with "dmd -O -release test.d":
module test;
struct TokenInfo
{
	// seems important to have a "complicated" struct size (so a calculation is necessary)
	int type;
	int StartIndex;
	int EndIndex;
}
int GetTokenInfoAt(TokenInfo[] infoArray, int col, ref TokenInfo info)
{
	for (int i = 0; i < infoArray.length; i++)
	{
		int start = infoArray[i].StartIndex;
		int end = infoArray[i].EndIndex;
		if (i == 0 && start > col)
			return -1;
		if (col >= start && col < end)
		{
			info = infoArray[i];
			return i;
		}
	}
	if (infoArray.length > 0 && col == infoArray[$-1].EndIndex)
	{
		info = infoArray[$-1];
		return infoArray.length-1;
	/* code generated for the 2 statements above:
		ov	ECX,024h[ESP] // infoArray.ptr
		mov	EBX,020h[ESP] // infoArray.length
		mov	EDX,ECX
		lea	ESI,[ECX*2][ECX] // uses pointer instead of length!!!
		mov	EAX,EBX
		lea	ESI,-0Ch[ESI*4][EDX] // crashes here!
		mov	EDI,014h[ESP]
		movsd
		movsd
		movsd
		lea	EAX,-1[EBX]
	*/
	}
	return -1;
}
int main()
{
	TokenInfo[] arr = new TokenInfo[9];
	arr[8].EndIndex = 11;
	TokenInfo info;
	
	return GetTokenInfoAt(arr, 11, info);
}
The crash goes away if 
- one of the conditions in the loop are removed
- the struct size of TokenInfo is reduced to 8 bytes
- dmd is executed without "-O"
- dmd is executed without "-release"
happens for all dmd version back to 2.032 and also in dmd 1.056
Comment 1 Don 2010年07月12日 07:58:26 UTC
Reduced test case. The for loop never runs, and gets optimized away, yet it still corrupts the code. The struct must be large enough that it doesn't fit in a register. The bug can be disabled in gloop.c, intronvars(), by commenting out line 2873, but I'm not yet sure what the root cause is. Probably failing to copy all of the data into the newly created variable.
---
struct Struct4443
{
 int x;
 char[5] unused;
}
void foo4443(Struct4443[] arr, Struct4443 *dest)
{
 for (int i = 0; false; ) {
 int junk = arr[i].x;
 }
 if (dest || arr[$-1].x) {
 *dest = arr[$-1];
 }
}
void main()
{
 Struct4443[1] a;
 Struct4443 info;
 foo4443(a, &info);
}
Comment 2 Don 2010年07月12日 13:08:53 UTC
Further reduction shows that it is unrelated to for loops. Seems to require either || or && in the if statement.
This bug existed in prehistoric times (tested on DMD0.140).
--------------
struct Struct4443
{
 int x;
 char[5] unused;
}
void foo4443(Struct4443 *dest, Struct4443[] arr)
{
 int junk = arr[$-1].x;
 if (dest || arr[$-1].x) {
 *dest = arr[$-1];
 }
}
void main()
{
 Struct4443[1] a;
 Struct4443 info;
 foo4443(&info, a);
}
Comment 3 Don 2010年08月03日 21:56:32 UTC
This is a really subtle code generation bug. The buggy bit of code is below.
In this section of code, the expression size is one register. 
In the bit of code at L13, it checks to see if the addition factor is already in a register.
In this case, it is, BUT it's actually in a double register: the {ptr, length} pair is a ulong, and here it's been cast to uint to get the length.
So isregvar() returns BOTH registers in regm.
Then, at the end of this section of code, the register to use is selected with a call to findreg(). But it just returns "the first register" which is correct only if there is only one register.
In this patch, I just use the LSW returned from isregvar. It would also be possible to change isregvar() to only return the LSW in the case where a cast to smaller type has occured, but I don't know what other code expects from isregvar. 
Not sure if the tysize() check is necessary. Maybe it could just be retregs = 1 << reg1;
PATCH: cod2.c, cdorth(), line 362. (Does 1<<reg1 work if reg1 is 0?).
 L13:
 regm_t regm;
 if (e11->Eoper == OPvar && isregvar(e11,&regm,&reg1))
 {
+ if (tysize[tybasic(e11->Ety)]<= REGSIZE)
+ retregs = 1 << reg1; // Only want the LSW
+ else
 retregs = regm;
 c1 = NULL;
 freenode(e11);
 }
 else
 c1 = codelem(e11,&retregs,FALSE);
 }
 rretregs = ALLREGS & ~retregs;
 c2 = scodelem(ebase,&rretregs,retregs,TRUE);
 {
 regm_t sregs = *pretregs & ~rretregs;
 if (!sregs)
 sregs = ALLREGS & ~rretregs;
 c3 = allocreg(&sregs,&reg,ty);
 }
+ // BUG: We should ensure there is only register in retregs
 reg1 = findreg(retregs);
Also noticed that in cod4.c, cdmsw() line 2838, there's an incorrect comment.
 retregs &= mMSW; // want LSW only
This should of course be // want MSW only
Comment 4 Don 2010年08月04日 00:07:00 UTC
(In reply to comment #3)
> if (!sregs)
> sregs = ALLREGS & ~rretregs;
> c3 = allocreg(&sregs,&reg,ty);
> }
> + // BUG: We should ensure there is only register in retregs
> reg1 = findreg(retregs);
That should be:
 if (!sregs)
 sregs = ALLREGS & ~rretregs;
 c3 = allocreg(&sregs,&reg,ty);
 }
+ assert( (retregs & (retregs-1)) == 0); // Must be only one register
 reg1 = findreg(retregs);
There are probably contrived cases where this bug could occur in C++ code, but I don't think it would ever occur in practice.
Comment 5 Walter Bright 2010年08月05日 14:22:53 UTC
http://www.dsource.org/projects/dmd/changeset/601 
Comment 6 Don 2010年08月06日 01:24:53 UTC
*** Issue 3761 has been marked as a duplicate of this issue. ***


AltStyle によって変換されたページ (->オリジナル) /