-
Notifications
You must be signed in to change notification settings - Fork 1.1k
How to debug a flaky test failure? #11931
-
We have a flaky test failure on Windows:
I'm wondering what are the good tricks/practices for debugging such flaky failures?
(In this case, we ignore flaky failures related to concurrency, as it seems it's not related to concurrency).
Beta Was this translation helpful? Give feedback.
All reactions
Replies: 1 comment
-
After two days of debugging, now I figured out the root cause and proposed a fix #11960.
What's learned is that when there is non-determinism, concurrency in the code should be always be examined carefully.
Of course, the first step is always to reliably reproduce the problem. Luckily, the following changes is enough to reproduce the problem reliably (within 1 minute) with sbt > testOnly -- *.idempotency:
diff --git a/compiler/test/dotty/tools/dotc/IdempotencyTests.scala b/compiler/test/dotty/tools/dotc/IdempotencyTests.scala index 30dbb47475..187d382666 100644 --- a/compiler/test/dotty/tools/dotc/IdempotencyTests.scala +++ b/compiler/test/dotty/tools/dotc/IdempotencyTests.scala @@ -24,8 +24,8 @@ class IdempotencyTests { val opt = defaultOptions val posIdempotency = aggregateTests( - compileFilesInDir("tests/pos", opt)(TestGroup("idempotency/posIdempotency1")), - compileFilesInDir("tests/pos", opt)(TestGroup("idempotency/posIdempotency2")), + compileFile("tests/pos/i6507b.scala", opt)(TestGroup("idempotency/posIdempotency1")), + compileFile("tests/pos/i6507b.scala", opt)(TestGroup("idempotency/posIdempotency2")), ) val orderIdempotency = { @@ -55,9 +55,12 @@ class IdempotencyTests { val allTests = aggregateTests(orderIdempotency, posIdempotency) - val tests = allTests.keepOutput.checkCompile() - allChecks.checkRuns() - tests.delete() + while (true) do { + val tests = allTests.keepOutput.checkCompile() + allChecks.keepOutput.checkRuns() + } + // tests.delete() + } }
We can use the following script to compare why it failed:
sbt "scalac -decompile -print-tasty out/idempotency/posIdempotency1/i6507b/pos/i6507b/Test.tasty" > i6507b-1.txt
sbt "scalac -decompile -print-tasty out/idempotency/posIdempotency2/i6507b/pos/i6507b/Test.tasty" > i6507b-2.txt
The files can be found here, the diff is the following:
--- i6057b-1.txt 2021年03月30日 16:49:13.000000000 +0200 +++ i6057b-2.txt 2021年03月30日 16:49:19.000000000 +0200 @@ -256,7 +256,7 @@ 406: STRINGconst 56 [tests/pos/i6507b.scala] 408: - 292 position bytes: + 286 position bytes: lines: 11 line sizes: 26, 0, 13, 79, 36, 56, 3, 0, 28, 1, 0 positions: @@ -329,14 +329,14 @@ 319: 197 .. 215 321: 193 .. 193 327: 183 .. 196 - 330: 200 .. 215 + 330: 183 .. 183 333: 183 .. 196 337: 200 .. 215 341: 197 .. 215 344: 197 .. 215 346: 193 .. 193 352: 183 .. 196 - 355: 200 .. 215 + 355: 183 .. 183 358: 183 .. 196 362: 200 .. 215 366: 197 .. 215
The next step is to understand which tree gets the wrong position by reading the print closely. In the output, they are all related to the type of ValDef:
...
316: INTconst 98
...
327: VALDEF(8) 44 [x[Unique $ 1]]
330: SHAREDtype 316
...
341: INTconst 97
...
352: VALDEF(8) 44 [x[Unique $ 1]]
355: SHAREDtype 341
Compile the file with -Yshow-tree-ids will help us get the id of the trees that correspond to the above:
: scalac -d out -Xprint:typer -uniqid -Yshow-tree-ids tests/pos/i6507b.scala
We will get output like the following:
val x1ドル#8163: (96 : Int#40)#768 = 96#798#799 { val x1ドル#8172: (97 : Int#40)#860 = 97#879#880 ( { val x1ドル#8180: (98 : Int#40)#941 = 98#960#961 ( EmptyTuple#6744#61#998.*:#6759#1001[(98 : Int#40)#951, Tuple#406#906 ]#1002(x$1#8180#953)#1003 :Any#444 *: Tuple#406#696)#1017 }#1018#1019.*:#6759#1022[(97 : Int#40)#870, Tuple#406#825]#1023( x$1#8172#872 )#1024 :Any#444 *: Tuple#406#696)#1025 }#1026#1027.*:#6759#1030[(96 : Int#40)#778, Tuple#406#733]#1031( x$1#8163#780 )#1032:Any#444 *: Tuple#406#696#1033
The tree with id 941 and 860 are relevant. Now we can track how a tree evolves to 941 with the position in the tasty print:
--- a/compiler/src/dotty/tools/dotc/ast/Trees.scala +++ b/compiler/src/dotty/tools/dotc/ast/Trees.scala @@ -128,6 +128,10 @@ object Trees { (if (myTpe == null || (myTpe.asInstanceOf[AnyRef] eq tpe.asInstanceOf[AnyRef])) this else cloneIn(source)).asInstanceOf[Tree[Type]] + if tree.uniqueId == 941 then + Thread.dumpStack() + println("span = " + span) + println("tpe = " + tpe) + println("myTpe = " + myTpe) + println("this.uniqueId = " + this.uniqueId) tree overwriteType tpe tree.asInstanceOf[ThisTree[Type]] } --- a/compiler/src/dotty/tools/dotc/ast/Positioned.scala +++ b/compiler/src/dotty/tools/dotc/ast/Positioned.scala @@ -44,6 +44,9 @@ abstract class Positioned(implicit @constructorOnly src: SourceFile) extends Src def span: Span = mySpan def span_=(span: Span): Unit = + if uniqueId == 941 then + println("setting position from") + Thread.dumpStack()
By tracking where tree 941 comes from, we will notice that the span of the tree is pretty stable during Typer (<193 .. 193> on Mac, <183 .. 183> on Windows), but its span gets updated to [200..212..215] in the phase Inlining.
But the phase Pickler happens before the phase Inlining, which serializes the trees to bytes. How it is possible that a later phase influences the result of pickling? By reading the code in Pickler.scala, we find the usage of parallelism:
val pickledF = inContext(ctx.fresh) { Future { ... } def force(): Array[Byte] = val result = Await.result(pickledF, Duration.Inf) positionWarnings.foreach(report.warning(_)) result if !Pickler.ParallelPickling || ctx.settings.YtestPickler.value then force() unit.pickled += (cls -> force)
The code enables parallel pickling by default without waiting for pickling to finish. It means, if the future runs a little slowly, it may observe the mutation in the Inlining phase, thus non-determinism!
Beta Was this translation helpful? Give feedback.
All reactions
-
❤️ 1